Statistical Relational Artificial Intelligence. Luc De Raedt


A definite clause is an expression of the form h :- B1, …, Bm, where h and the Bi are atoms; h is the head of the clause, and B1, …, Bm is the body of the clause. If m = 0, the implication “:-” can be omitted and the clause is called a fact. A logic program is a set of definite clauses.

      Example 2.3 The definite clause

grandparent(X, Y) :- parent(X, Z), parent(Z, Y).

      can be read as: for all X, for all Y, and for all Z: X is a grandparent of Y if X is a parent of Z and Z is a parent of Y. grandparent(X, Y) is the head of the clause, and parent(X, Z), parent(Z, Y) is the body of the clause.

      Example 2.4 Figure 2.4 shows a logic program defining grandparent/2 with two parent/2 facts and Fig. 2.5 a program defining nat/1.

      Reasoning with logic involves manipulating formulae as data structures. An expression is a term, an atom, a conjunction, or a clause. The set of logical variables in an expression E is denoted as Var(E). For example, in the clause c defined in Example 2.3, Var(c) = {X, Y, Z}. An expression E is ground when no variable occurs in E, i.e., Var(E) = ∅. To connect variables and (structured) ground terms, we make use of substitutions.

      A substitution is of the form θ = {V1/t1, …, Vn/tn}, e.g., {Y/ann}, where the Vi are distinct logical variables and the ti are terms. Applying a substitution θ to an expression e yields the expression eθ in which all occurrences of the variables Vi are simultaneously replaced by the corresponding terms ti; e.g., if c is the definite clause from Example 2.3 above, c{Y/ann} is

grandparent(X, ann) :- parent(X, Z), parent(Z, ann).

      A substitution θ is called a unifier for two expressions e1 and e2 if e1θ = e2θ, that is, if the expressions become identical after applying the substitution. For instance, the substitution {X/jef, Y/ann} is a unifier for parent(X, ann) and parent(jef, Y).
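
      To make substitution and unification concrete, the following small Python sketch (ours, not the book's) encodes variables as capitalized strings, constants as lowercase strings, and atoms or compound terms as tuples (functor, arg1, …, argn); a substitution is a dict from variables to terms, and the unifier omits the occurs check for brevity.

      # Minimal sketch (not the book's code): variables are capitalized strings,
      # compound terms and atoms are tuples (functor, arg1, ..., argn),
      # constants are lowercase strings, and a substitution is a dict {Var: term}.

      def is_var(t):
          return isinstance(t, str) and t[:1].isupper()

      def variables(t):
          """Var(E): the set of logical variables occurring in expression t."""
          if is_var(t):
              return {t}
          if isinstance(t, tuple):
              return set().union(*(variables(a) for a in t[1:]))
          return set()

      def subst(theta, t):
          """Apply substitution theta to t, replacing all variables simultaneously."""
          if is_var(t):
              return theta.get(t, t)
          if isinstance(t, tuple):
              return (t[0],) + tuple(subst(theta, a) for a in t[1:])
          return t

      def unify(t1, t2, theta=None):
          """Return a unifier of t1 and t2 extending theta, or None if none exists
          (no occurs check, for brevity)."""
          theta = {} if theta is None else theta
          t1, t2 = subst(theta, t1), subst(theta, t2)
          if t1 == t2:
              return theta
          if is_var(t1):
              return {**{v: subst({t1: t2}, t) for v, t in theta.items()}, t1: t2}
          if is_var(t2):
              return unify(t2, t1, theta)
          if isinstance(t1, tuple) and isinstance(t2, tuple) \
                  and t1[0] == t2[0] and len(t1) == len(t2):
              for a, b in zip(t1[1:], t2[1:]):
                  theta = unify(a, b, theta)
                  if theta is None:
                      return None
              return theta
          return None

      head = ('grandparent', 'X', 'Y')
      print(variables(head))                    # {'X', 'Y'}
      print(subst({'Y': 'ann'}, head))          # ('grandparent', 'X', 'ann')
      print(unify(('parent', 'X', 'ann'),
                  ('parent', 'jef', 'Y')))      # {'X': 'jef', 'Y': 'ann'}

      Running it prints the variables of grandparent(X, Y), the result of applying {Y/ann}, and the unifier {X/jef, Y/ann} of parent(X, ann) and parent(jef, Y), mirroring the examples above.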

      It may seem that it is impossible to reason about clauses where variables can represent anything, e.g., numbers, people, or symbols. It turns out that if a set of clauses has a model where the variables can represent anything, it has a model where the variables denote symbolic objects in the Herbrand base. The Herbrand base of a set of clauses P, denoted as hb(P), is the set of all ground atoms constructed with the predicate, constant and function symbols in the alphabet of P (and inventing a new constant if P doesn’t contain any constants).

      Example 2.5 The Herbrand bases of the grandparent and nat logic programs in Figs. 2.4 and 2.5 are

hb(grandparent) = {parent(x, y) | x, y constants in Fig. 2.4} ∪ {grandparent(x, y) | x, y constants in Fig. 2.4}
hb(nat) = {nat(0), nat(s(0)), nat(s(s(0))), …}

      hb(grandparent) has 18 elements and hb(nat) is infinite.
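
      As a sketch of how a Herbrand base can be enumerated mechanically in the function-free case, the Python fragment below assumes the vocabulary of the grandparent program: the predicates parent/2 and grandparent/2 and three constants (called jef, paul, and ann here; these names are our assumption, since Fig. 2.4 is not reproduced in this excerpt). For nat, the function symbol s/1 makes the base infinite, so it can only be generated lazily.

      from itertools import islice, product

      # Assumed vocabulary of the grandparent program of Fig. 2.4:
      # two binary predicates and three constants (the names are assumptions).
      predicates = {'parent': 2, 'grandparent': 2}
      constants = ['jef', 'paul', 'ann']

      def herbrand_base(predicates, constants):
          """All ground atoms built from the given predicate and constant symbols
          (only possible to enumerate fully when there are no function symbols)."""
          return {(p,) + args
                  for p, arity in predicates.items()
                  for args in product(constants, repeat=arity)}

      print(len(herbrand_base(predicates, constants)))   # 18 = 2 predicates * 3**2 argument tuples

      def hb_nat():
          """Lazily enumerate nat(0), nat(s(0)), nat(s(s(0))), ...: the function
          symbol s/1 makes hb(nat) infinite, so it is generated on demand."""
          term = '0'
          while True:
              yield ('nat', term)
              term = ('s', term)

      print(list(islice(hb_nat(), 3)))
      # [('nat', '0'), ('nat', ('s', '0')), ('nat', ('s', ('s', '0')))]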

      When considering the constants a and b, the clausal theory for Example 2.2 has the following Herbrand base:

[Herbrand base of the clausal theory of Example 2.2 over the constants a and b]

      A Herbrand interpretation for a set of clauses P is a subset of hb(P). A Herbrand interpretation I represents the possible world where all of the elements in I are true and the elements of hb(P) \ I are false. A Herbrand interpretation I is a model of a clause c if and only if, for all substitutions θ that ground all variables in c such that body(c)θ ⊆ I holds, it also holds that head(c)θ ∈ I. An interpretation I is a model of a set of clauses P if I is a model of all clauses in P. A clausal theory P entails a clausal theory P′, denoted as P ⊨ P′, if and only if each model of P is also a model of P′.
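
      For function-free programs this model condition is easy to operationalize. The sketch below (again ours, using the same tuple encoding as before) checks whether a set of ground atoms is a Herbrand model of a list of (head, body) clauses by testing every grounding of each clause over a given set of constants; the parent facts used in the example are our assumption about Fig. 2.4.

      from itertools import product

      def ground(atom, theta):
          """Apply a variable-to-constant substitution theta to an atom (p, t1, ..., tn)."""
          return (atom[0],) + tuple(theta.get(t, t) for t in atom[1:])

      def is_model(interp, clauses, constants):
          """True iff the set of ground atoms interp is a Herbrand model of every
          clause (head, body): for every grounding theta, body(c)theta being a
          subset of interp implies head(c)theta is in interp (function-free case)."""
          for head, body in clauses:
              vars_ = sorted({t for atom in [head] + body
                              for t in atom[1:] if t[:1].isupper()})
              for values in product(constants, repeat=len(vars_)):
                  theta = dict(zip(vars_, values))
                  if all(ground(b, theta) in interp for b in body) \
                          and ground(head, theta) not in interp:
                      return False
          return True

      # The grandparent program with assumed facts parent(jef, paul), parent(paul, ann).
      clauses = [(('grandparent', 'X', 'Y'), [('parent', 'X', 'Z'), ('parent', 'Z', 'Y')]),
                 (('parent', 'jef', 'paul'), []),
                 (('parent', 'paul', 'ann'), [])]
      I = {('parent', 'jef', 'paul'), ('parent', 'paul', 'ann'), ('grandparent', 'jef', 'ann')}
      print(is_model(I, clauses, ['jef', 'paul', 'ann']))               # True
      print(is_model(I - {('grandparent', 'jef', 'ann')}, clauses,
                     ['jef', 'paul', 'ann']))                           # False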

      Example 2.6 The following two interpretations are models for Example 2.2, which shows that a set of clauses can have more than one model:

[two Herbrand models of the clausal theory of Example 2.2]

      These models are also minimal w.r.t. set inclusion.

      While clausal theories may have no, one, or more minimal models, a definite clause program has a unique least Herbrand model. The least Herbrand model LH(P) consists of all ground atoms f ∈ hb(P) such that P logically entails f, i.e., P ⊨ f. It is also a minimal model. All ground atoms in the least Herbrand model are provable.
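
      One natural way to compute the least Herbrand model of a function-free definite clause program is naive bottom-up evaluation: repeatedly add every ground head whose body atoms are already in the model (the immediate-consequence operator) until a fixpoint is reached. The sketch below illustrates this on our assumed encoding of the grandparent program; it is an illustration, not the book's algorithm.

      from itertools import product

      # Assumed encoding of the grandparent program of Fig. 2.4 (the facts are assumptions).
      facts = {('parent', 'jef', 'paul'), ('parent', 'paul', 'ann')}
      rules = [(('grandparent', 'X', 'Y'),
                [('parent', 'X', 'Z'), ('parent', 'Z', 'Y')])]
      constants = ['jef', 'paul', 'ann']

      def ground(atom, theta):
          return (atom[0],) + tuple(theta.get(t, t) for t in atom[1:])

      def least_herbrand_model(facts, rules, constants):
          """Naive bottom-up evaluation: iterate the immediate-consequence operator
          until a fixpoint is reached (terminates in the function-free case)."""
          model = set(facts)
          changed = True
          while changed:
              changed = False
              for head, body in rules:
                  vars_ = sorted({t for atom in [head] + body
                                  for t in atom[1:] if t[:1].isupper()})
                  for values in product(constants, repeat=len(vars_)):
                      theta = dict(zip(vars_, values))
                      derived = ground(head, theta)
                      if derived not in model and all(ground(b, theta) in model for b in body):
                          model.add(derived)
                          changed = True
          return model

      print(least_herbrand_model(facts, rules, constants))
      # {('parent', 'jef', 'paul'), ('parent', 'paul', 'ann'), ('grandparent', 'jef', 'ann')}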

      Example 2.7 The least Herbrand models of our programs from Example 2.5 are

LH(grandparent) = the two parent facts of Fig. 2.4 together with the grandparent atoms they entail; LH(nat) = hb(nat) = {nat(0), nat(s(0)), nat(s(s(0))), …}

      That is, every ground atom nat(a) in the Herbrand base of nat is provable.

      In Part II on inference, we will discuss how to reason with clausal logic, including the computation of the least Herbrand model of a logic program. For a detailed introduction to logic programming, see Lloyd [1989]; for a gentler introduction, see Flach [1994]; and for a detailed discussion of Prolog, see Sterling and Shapiro [1986].

      CHAPTER 3

       Relational Probabilistic Representations

      Probability theory can be seen as extending the propositional calculus to include uncertainty; we can ask for the probability of a proposition conditioned on a proposition. Likewise, the (first-order) predicate calculus can be seen as extending propositional calculus to include relations (represented by predicate symbols), individuals, and logical variables that can be universally or existentially quantified. Relational probabilistic representations can be seen as extending the predicate calculus to include probabilities, or extending probabilistic models to include relations, individuals, and variables.

      Probability has seen a resurgence in AI due mainly to exploiting the independence structure inherent in graphical models, and most relational probabilistic models are built from these graphical models. In relational probabilistic models, the random variables are built from individuals and relationships among individuals. Just as the first-order predicate calculus extends the propositional calculus to reason about individuals and relations, relational probabilistic models extend probabilistic models, such as Bayesian networks and Markov networks, to allow reasoning about individuals and relations.
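
      As a toy illustration of what building random variables from individuals and relations means, the fragment below (with hypothetical predicate and individual names, not taken from the book) grounds two parametrized relations over a small population, yielding one Boolean random variable per tuple of individuals; a Bayesian or Markov network would then be defined over these grounded variables.

      from itertools import product

      # Hypothetical example: the parametrized random variables smokes(X) and
      # friends(X, Y) grounded over a concrete population of individuals.
      population = ['alice', 'bob', 'carol']
      parametrized_rvs = {'smokes': 1, 'friends': 2}

      random_variables = [(name,) + args
                          for name, arity in parametrized_rvs.items()
                          for args in product(population, repeat=arity)]

      print(len(random_variables))   # 12 = 3 smokes(.) variables + 9 friends(.,.) variables
      print(random_variables[:4])    # e.g. [('smokes', 'alice'), ('smokes', 'bob'), ...]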

      With such models, a reasoning agent observes individuals, their properties, and relations among them; conditioning on these observations allows probabilistic predictions about individuals, properties, and relationships. We often want to build the models before we know which individuals exist in a population, so that the models can be applied to diverse populations. When we condition on observations of particular individuals and relations among particular individuals, we can learn something about those individuals, which can then be used for subsequent reasoning and learning. In the open-universe case, we can be unsure about whether two descriptions or names refer to the same individual, whether an individual that fits a description actually exists, or how many individuals there are.

      These are key properties of StarAI domains that we have encountered already in the examples used in the introduction.

      Example