
5.4. Clause Form

Clause Forms of Propositional Formulas

Which form is more "natural" - DNF, or CNF? Of course, CNF is more natural. Indeed, a DNF D1vD2v ... vDm asserts that one (or more) of the formulas Di is true. This is a very complicated assertion - sometimes D1 is true, sometimes D2 is true, etc. But, if we have a CNF instead - C1&C2& ... &Cn? It asserts that all the formulas Ci are true, i.e. we can replace the long formula C1&C2& ... &Cn by a set of shorter formulas C1, C2, ..., Cn. For human reading and for computer processing, a set of shorter formulas is much more convenient than a single long formula.

Let us return to our example formula ((A->B)->C)->(C->B) of Section 5.2, for which we obtained a DNF

(~A&~C)v(B&~C)v(~C)v(B).

and a CNF:

(~AvBv~C)&(Bv~C).

Without a transformation, the above DNF is hard for reading, understanding and analyzing. The CNF is more convenient - it says simply that ~AvBv~C is true and Bv~C is true.

As another step, making the formulas easier to understand, we could apply the following equivalencies:

[L1-L11, MP]: |- ~AvB <->A->B,

[L1-L11, MP]: |- ~Av~BvC <-> A&B->C,

[L1-L11, MP]: |- ~AvBvC <-> A->BvC,

[L1-L11, MP]: |- ~Av~BvCvD <-> A&B->CvD,

etc.

Exercise 5.4.1. Verify these equivalencies by proving that, generally,

[L1-L11, MP]: ~A1v~A2v ... v~AmvB1vB2v ... vBn <-> (A1&A2& ... &Am -> B1vB2v ... vBn).

Thus, we can replace our set of two formulas ~AvBv~C and Bv~C by

A&C->B,
C->B.

The conjunction of these two formulas is equivalent to the initial formula ((A->B)->C)->(C->B).

Note. Of course, in this particular case (not in general!), we could drop the first clause (because it is a consequence of the second one).

Formulas having the form

A1&A2& ... &Am -> B1vB2v ... vBn,

or, alternatively,

~A1v~A2v ... v~AmvB1vB2v ... vBn,

where A1, A2, ... , Am, B1, B2, ... , Bn are atoms, are called clauses. Clauses are very well suited for computer processing. Indeed, in the computer memory, we can represent the formula A1&A2& ... &Am -> B1vB2v ... vBn simply as a pair of sets of atoms - negative {A1, A2, ... , Am} and positive {B1, B2, ... , Bn}.

What if one (or both) of these sets is empty?

If, in the formula ~A1v~A2v ... v~AmvB1vB2v ... vBn, we have m = 0 and n > 0, then, of course, this formula asserts simply that B1vB2v ... vBn, i.e. "converting" it into the implication -> B1vB2v ... vBn (with empty premise) leads us to the following definition: the clause -> B1vB2v ... vBn means the same as B1vB2v ... vBn.

If, in the formula ~A1v~A2v ... v~AmvB1vB2v ... vBn, we have m > 0 and n = 0, then, of course, this formula asserts simply that ~A1v~A2v ... v~Am, i.e. "converting" it into the implication A1&A2& ... &Am -> (with empty consequence) leads us to the following definition: the clause A1&A2& ... &Am -> means the same as ~(A1&A2& ... &Am).

If, in the formula ~A1v~A2v ... v~AmvB1vB2v ... vBn, we have both m = 0 and n = 0, then we have an empty disjunction, i.e. an empty clause means "false", i.e. is equivalent to A&~A.

Note. Clauses are similar to sequents - pairs of sets of formulas (S1, S2), used in the proof of Theorem 4.4.5 (completeness of the constructive propositional logic) in Section 4.4. In a sequent (S1, S2), the sets S1, S2 could contain arbitrary formulas, but, in a clause, S1, S2 are sets of atoms.

Sets (i.e. conjunctions) of clauses are called clause forms (in some texts - clausal forms). By Theorem 5.2.1, every propositional formula can be reduced to a (possibly empty) CNF. Since every CNF can be converted into a clause form, we have established the following

Theorem 5.4.1. In the classical logic, every propositional formula can be reduced to a clause form. More precisely, assume, the formula F has been built of formulas B1, B2, ..., Bn by using propositional connectives only. Then there is a (possibly empty) clause form over B1, B2, ..., Bn such that in [L1-L11, MP] the formula F is equivalent to the conjunction of the clauses contained in the form.

Exercise 5.4.2. Obtain clause forms of the formulas mentioned in the Exercise 5.2.2.

Clause forms are very well suited for computer processing. In the computer memory, every clause form can be represented as "a set of pairs of sets of atoms" - i.e. it means less character string processing and less expression parsing!
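To make this concrete, here is a minimal Python sketch of the "set of pairs of sets of atoms" representation (the helper name make_clause is hypothetical, and atoms are represented simply as strings):

# A clause ~A1 v ... v ~Am v B1 v ... v Bn is stored as a pair of sets:
# the negative atoms {A1, ..., Am} and the positive atoms {B1, ..., Bn}.
def make_clause(negative, positive):
    return (frozenset(negative), frozenset(positive))

# The clause form of ((A->B)->C)->(C->B) obtained above:
clause_form = {
    make_clause({"A", "C"}, {"B"}),  # A&C->B, i.e. ~AvBv~C
    make_clause({"C"}, {"B"}),       # C->B, i.e. Bv~C
}

# The empty clause (m = n = 0) represents "false":
empty_clause = make_clause(set(), set())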

Clause Forms of Predicate Formulas

Of course (unfortunately), nothing comparable to clause forms can be obtained for predicate formulas, if we insist that the clause form must be equivalent to the initial formula. Still, reducing predicate formulas to "clause forms" becomes possible, if we drop this requirement, and replace it by the requirement that the "clause form" must be satisfiable, iff the initial formula is satisfiable. And - if we allow extending the language by adding new constant letters and new function letters.

Then, by Skolem's Theorem (Theorem 5.3.1), for each closed formula F, we can obtain a Skolem normal form Ax1Ax2...Axk G, where k>=0, the formula G does not contain quantifiers, and this form is satisfiable, iff so is F. By Theorem 5.4.1, let us convert G into a clause form G' (with atomic sub-formulas of G playing the role of atoms B1, B2, ..., Bn). Since G' is equivalent to G, the formula Ax1Ax2...Axk G' is satisfiable, iff so is F.

The set of clauses G' is called the clause form of the formula F. For predicate formulas, clauses consist of atomic formulas, i.e. formulas having the form p(t1, ..., tm), where p is a predicate letter, and t1, ..., tm are terms (possibly containing variables).

Thus, we have proved the following

Theorem 5.4.2. Let L be a first order language. There is an algorithm allowing to construct, for each closed formula F of this language, a finite set S of clauses (in a language L' obtained from L by adding a finite set of new constant letters and new function letters - depending on F) such that F is satisfiable, iff the (closed) formula Ax1Ax2...Axn &S is satisfiable, where &S is the conjunction of the clauses contained in S, and x1, x2, ..., xn are all the variables appearing in the clauses.

Note. In most texts, the closed formula Ax1Ax2...Axn &S (i.e. where all the variables appearing in &S are universally quantified) is called the universal closure of &S. Thus, we could define clause forms of predicate formulas as universal closures of sets of clauses.

As an example, let us consider the formula asserting that there are infinitely many prime numbers:

"x is a prime number": x>1 & ~EyEz (y>1 & z>1 & x=y*z),

AuEx (x>u & x is a prime number),

AuEx (x>u & x>1 & ~EyEz (y>1 & z>1 & x=y*z)) --------------- (*)

Convert it into a prenex normal form:

AuEx (x>u & x>1 & AyAz ~(y>1 & z>1 & x=y*z)),

AuExAyAz (x>u & x>1 & ~(y>1 & z>1 & x=y*z)).

Replace AuEx by Au - introduce a Skolem function g:

AuAyAz (g(u)>u & g(u)>1 & ~(y>1 & z>1 & g(u)=y*z)).

In this Skolem normal form, convert the quantifier free part into a conjunctive normal form:

AuAyAz (g(u)>u & g(u)>1 & (~(y>1) v ~(z>1) v ~(g(u)=y*z))).

This formula is satisfiable, iff so is the initial formula (*).

Thus, we have obtained a set of 3 clauses:

g(u)>u,
g(u)>1,
~(y>1) v ~(z>1) v ~(g(u)=y*z).

or, alternatively,

-> g(u)>u,
-> g(u)>1,
y>1, z>1, g(u)=y*z ->.

This set of 3 formulas is the clause form of the formula (*).

Exercise 5.4.3. Obtain clause forms of the formulas mentioned in the Exercise 5.1.4 (assume that B, C, D, F are predicate letters).

Horn Clauses

Alfred Horn (1918-2001) - biography (in Bulgarian) published by Dimiter Skordev.

http://wombat.doc.ic.ac.uk/foldoc/foldoc.cgi?Horn+clause

The name "Horn Clause" comes from the logician Alfred Horn, who first pointed out the significance of such clauses in 1951, in the article "On sentences which are true of direct unions of algebras", Journal of Symbolic Logic, 16, 14-21.

http://www.cs.ucsd.edu/users/goguen/courses/230/s6.html

As a footnote, Alfred Horn, for whom Horn clauses are named, had nothing to do with logic programming; he was a professor of logic at UCLA who in 1951 wrote a paper using the sentences that now bear his name for reasons having little to do with computer science. As a second footnote, it seems to me rather misleading to call Prolog a "logic programming" language, since it departs rather far from logic; I would rather have had it called a "relational programming" language, because it is the use and manipulation of relations that is most characteristic of its programming style.

http://www.cs.fit.edu/~ryan/study/bibliography.html

Horn, Alfred. ``On sentences which are true of direct unions of algebras.'' Journal of Symbolic Logic, volume 16, number 1, March 1951, pages 14-21.

This paper has very little to do with Horn clauses.

To be continued.

 

5.5. Resolution Method for Propositional Formulas

History

J.A.Robinson. Theorem-proving on the computer. "Jour. Assoc. Comput. Mach.", vol.10, N2, 1963, pp.163-174

J.A.Robinson. A machine-oriented logic based on the resolution principle, "Jour. Assoc. Comput. Mach.", vol.12, N1, January 1965, pp.23-41 (Russian translation available: "Kib. sbornik (novaya seriya)", 7, 1970, pp.194-218)

John Alan Robinson: "Born in Yorkshire in 1930, Robinson came to the United States in 1952 with a classics degree from Cambridge University. He studied philosophy at the University of Oregon before moving to Princeton where he received his PhD in philosophy in 1956. Temporarily "disillusioned with philosophy," he went to work as an operations research analyst for Du Pont, where he learnt programming and taught himself mathematics. Robinson moved to Rice University in 1961, spending his summers as a visiting researcher at the Argonne National Laboratory's Applied Mathematics Division. Its then Director, William F. Miller, pointed Robinson in the direction of theorem proving...

Miller showed Robinson a 1960 paper by Martin Davis and Hilary Putnam (coincidentally, the latter had been Robinson's PhD supervisor) proposing a predicate-calculus proof procedure that seemed potentially superior to Gilmore's, but which they had not yet turned into a practical computer program. Miller suggested that Robinson use his programming skills to implement Davis and Putnam's procedure on the Argonne IBM 704. Robinson quickly found that their procedure remained very inefficient. However, while implementing a different procedure also suggested in 1960 by Dag Prawitz, Robinson came to see how the two sets of ideas could be combined into a new, far more efficient, automated proof procedure for first-order predicate logic: "resolution"..." (According to Donald MacKenzie, The Automation of Proof: A Historical and Sociological Exploration, "IEEE Annals of the History of Computing", vol.17, N3, 1995, pp. 7-29, see also http://dream.dai.ed.ac.uk/papers/donald/donald.html).

"In retrospect, unification and resolution seem rather obvious ideas, which arise inevitably when one asks what must be syntactically true of a set of clauses which possesses the semantic property of having no Herbrand models." (J.A.Robinson, "Unification and Resolution in Retrospect", 1997, see at http://www.univ-orleans.fr/SCIENCES/LIFO/Manifestations/Jfplc_Unif_97/jfplc/invite-francais.html).

Note. Almost at the same time when J.A.Robinson invented the resolution method, Sergei Maslov invented his inverse method, which has a similar range of applications:

S.Yu.Maslov. An inverse method of establishing deducibilities in the classical predicate calculus, "Soviet Mathematics, Doklady", 1964, N5, pp.1420-1423.

About the early history of the problem see

M.Davis. The Early History of Automated Deduction. In: Handbook of Automated Reasoning, ed. by A.Robinson and A.Voronkov, Elsevier Science, 2001, vol. I, pp. 3-15 (online postscript)

Method

Assume that a set of clauses contains two clauses such that an atom C appears as a positive member in the first clause, and as a negative member in the second one:

~A1v~A2v ... v~AmvB1vB2v ... vBnvC, ------------------ (1)

~Cv~D1v~D2v ... v~DpvE1vE2v ... vEq, ------------------ (2)

or, simply,

FvC, ------- (1a)

~CvG. --------- (2a)

If C is false, then (1a) yields F, and, if C is true, then (2a) yields G. Thus, from (1a) and (2a) we have derived FvG. I.e. deriving FvG from FvC and ~CvG is "logically correct", and it is called Robinson's resolution rule (J.A.Robinson proposed it in the above 1965 paper):

FvC, ~CvG
----------------
FvG

If F is empty, then from C, ~CvG (i.e. C, C->G), the resolution rule derives G, i.e. it includes Modus Ponens as a special case.

If G is empty, then from ~FvC, ~C (i.e. F->C, ~C), the resolution rule derives ~F, i.e. it includes Modus Tollens as a special case.

Exercise 5.5.1. Derive this inference rule in the constructive logic, i.e. prove that [L1-L10, MP]: CvF, ~CvG |- FvG. Verify that it cannot be proved in the minimal logic [L1-L9, MP]. (Hint: in the positive part - use Theorem 2.5.1(b) [L1, L2, L8, L10, MP]: FvC, ~C |- F. In the negative part - note that in Section 2.5 we established that, in the minimal logic, the rule FvC, ~C |- F allows proving of L10).

Thus, from the clauses (1) and (2), Robinson's resolution rule allows deriving the following clause:

~A1v~A2v ... v~Amv~D1v~D2v ... v~Dp v B1vB2v ... vBnvE1vE2v ... vEq. ------------ (3)

At first glance, this approach leads to nothing, because (3) is "much longer" than (1) and (2). Still, this is not 100% true: we can reduce the repeating atoms in ~A1v~A2v ... v~Amv~D1v~D2v ... v~Dp and in B1vB2v ... vBnvE1vE2v ... vEq, the set of atoms used in a clause form is fixed, and the atom C does not appear in (3) at all! Thus, repeated application of the resolution rule cannot lead to an infinite growth of formulas.

The smart idea behind Robinson's resolution rule is: it is a universal tool for deriving contradictions from inconsistent sets of clauses! More precisely, it is universal, if used together with the following trivial inference rules:

FvCvDvG
----------------------------------- (Permutation),
FvDvCvG

FvCvCvG
---------------------------------- (Reduction).
FvCvG

The permutation rule allows arbitrary reordering of atoms in a clause (for example, moving C to right, and moving ~C to left). The reduction rule allows reduction of repeating identical atoms.

Exercise 5.5.2. Derive these inference rules in the minimal logic [L1-L9, MP].

Theorem 5.5.1 (J.A.Robinson). In the propositional classical logic [L1-L11, MP], a finite set of propositional clauses is inconsistent, iff Robinson's resolution rule (together with permutation and reduction rules) allows deriving a contradiction from it.

Note. In some other texts, this fact is called "the refutation-completeness of the resolution rule" for the propositional logic.

Proof. 1. As you have proved in Exercises 5.5.1 and 5.5.2, all the formulas derived from a set of formulas K1, K2, ... , Ks by using the permutation, resolution and reduction rules are consequences of K1, K2, ... , Ks. Hence, if these rules allow deriving a contradiction from this set of formulas, then it (the set) is inconsistent.

2. Now, let us assume that a set of propositional clauses K1, K2, ... , Ks is inconsistent, i.e. a contradiction A&~A can be derived from it:

[L1-L11, MP]: K1, K2, ... , Ks |- A&~A.

Then, under the classical truth tables, the conjunction K1&K2& ... &Ks takes only false values (verify!). Let us mark one of the atoms (the atom C) in it. Let us denote:

- By CvFi - the clauses containing C without negation,

- By ~CvGj - the clauses containing C with negation,

- By Hk - the clauses that do not contain C.

All the formulas Fi, Gj, Hk are disjunctions of atoms (with or without negations) that do not contain the atom C.

Thus K1&K2& ... &Ks is equivalent to

&(CvFi ) & &(~CvGj) & &Hk. ---------------------- (4)

Let us apply (the strange) one of the distribution rules (Theorem 2.3.1): [L1-L8, MP] |- (A&B)vC<->(AvC)&(BvC). Hence, K1&K2& ... &Ks is equivalent to

(C v &Fi ) & (~C v &Gj) & &Hk.

If C is false, then this formula is equivalent to &Fi & &Hk, i.e. &Fi & &Hk takes only false values. If C is true, then it is equivalent to &Gj & &Hk, i.e. &Gj & &Hk takes only false values. Thus the disjunction

(&Fi & &Hk) v (&Gj & &Hk) ---------------------- (5)

also takes only false values. Now, let us apply (the "normal") one of the distribution rules (Theorem 2.3.1): [L1-L8, MP] |- (AvB)&C<->(A&C)v(B&C), obtaining that (5) is equivalent to

(&Fi v &Gj) & &Hk. -------------------- (6)

I.e. this formula also takes only false values. And - important note! - it does not contain the atom C.

Finally, by applying, again, (the strange) one of the distribution rules (Theorem 2.3.1): [L1-L8, MP] |- (A&B)vC<->(AvC)&(BvC), we can conclude that (6) is equivalent to &&(Fi v Gj) & &Hk, i.e. to the set of clauses Fi v Gj and Hk (where i, j, k run over their initial ranges).

What does this achievement mean? If the set of propositional clauses K1, K2, ... , Ks is inconsistent, then there is a set of clauses Fi v Gj and Hk (where i, j, k run over their initial ranges), which is inconsistent as well, but which contains one atom less than K1, K2, ... , Ks.

Now, imagine, that, in the clause form (4), we have applied the resolution rule for the atom C in all the possible ways (before applying, apply the permutation rule to reorder atoms moving C to right, and ~C - to left):

FivC, ~CvGj
------------------
FivGj

After this, apply the permutation and reduction rules to reduce identical atoms. In this way we have obtained exactly the above-mentioned inconsistent set of clauses Fi v Gj and Hk (where i, j, k run over their initial ranges).

Thus, if some set of propositional formulas K1, K2, ... , Ks is inconsistent, then the resolution rule (together with the permutation and reduction rules) allows deriving from it another inconsistent set of propositional formulas, which contains one atom less.

By iterating this process, we will finally obtain an inconsistent set of propositional formulas built of a single atom B. In clause form, there is only one such set - the set B, ~B. This set represents a contradiction.

Q.E.D.

As an example, let us use Robinson's resolution rule to prove that

BvC, C->B, B->D |- B&D.

Let us add ~(B&D) to premises BvC, C->B, B->D. We must prove that this set of 4 formulas is inconsistent. First, let us obtain clause forms:

BvC in clause form is BvC,
C->B in clause form is ~CvB,
B->D in clause form is ~BvD,
~(B&D) is equivalent to ~Bv~D.

Now, let us apply resolution to derive a contradiction from this set of 4 clauses: BvC, ~CvB, ~BvD, ~Bv~D:

From BvC, ~CvB we derive B, and have now 5 clauses: BvC, ~CvB, ~BvD, ~Bv~D, B.

From ~BvD, ~Bv~D we derive ~B, and have now 6 clauses: BvC, ~CvB, ~BvD, ~Bv~D, B, ~B.

Thus, we have derived a contradiction: B, ~B. Q.E.D.

Exercise 5.5.3. Use the resolution rule to prove the following:

a) A->B, ~A->B |- B.
b) (A->B)->A |- A (Peirce's Law).
c) B->(C->D), B->C |- B->D (Axiom L2).
d) B->D, C->D |- BvC->D. (Axiom L8).
e) AvBvC, B->AvC, A->C |- C.

From a Programmer's Point of View

Of course, when implementing the resolution rule in a computer program, we do not need decorations like the permutation and reduction rules. In a program, we will represent each clause ~A1v~A2v ... v~AmvB1vB2v ... vBn as a pair of sets: negative atoms, N = {A1, A2, ... , Am}, and positive atoms, P = {B1, B2, ... , Bn}. Of course, the sets N, P do not intersect (if they do, then the clause contains ~CvCv..., i.e. it is a tautology, and can be dropped as "non-informative").

Resolution rule (non-refined version). If there are two clauses N1, P1 and N2, P2 such that P1 and N2 (or N1 and P2) contain a common atom C, then we can derive the clause N1UN2-{C}, P1UP2-{C}.

Of course, the set union operation includes reduction of identical members automatically.

The condition "P1 and N2 (or N1 and P2) contain a common atom C" can be expressed as "C in (N1^P2)U(P1^N2)", where ^ means set intersection.

If, in the resulting clause, the sets N1UN2-{C}, P1UP2-{C} intersect, then we should ignore such result. Fortunately, this can be detected in advance. Indeed,

(N1UN2)^(P1UP2) = (N1^P1)U(N1^P2)U(N2^P1)U(N2^P2) = (N1^P2)U(N2^P1),

because N1^P1 and N2^P2 are empty sets. The set (N1^P2)U(N2^P1) is exactly the set of all atoms C allowing application of the resolution rule to clauses N1, P1 and N2, P2. Hence, the sets N1UN2-{C}, P1UP2-{C} will not intersect, iff the set (N1^P2)U(N2^P1) contains exactly one atom C, i.e., iff there is exactly one atom allowing application of the resolution rule.

Resolution rule (refined version). If there are two clauses N1, P1 and N2, P2 such that the set (N1^P2)U(N2^P1) contains exactly one atom C, then we can derive the clause N1UN2-{C}, P1UP2-{C}.
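In the Python representation sketched in Section 5.4, the refined rule becomes a few lines (a sketch; the function name resolve is hypothetical, and None signals that the rule is not applicable):

def resolve(clause1, clause2):
    # Refined resolution rule: applicable, iff the set
    # (N1^P2)U(N2^P1) contains exactly one atom C.
    n1, p1 = clause1
    n2, p2 = clause2
    candidates = (n1 & p2) | (n2 & p1)
    if len(candidates) != 1:
        return None
    (c,) = candidates
    # The resolvent: N1UN2-{C}, P1UP2-{C}.
    return ((n1 | n2) - {c}, (p1 | p2) - {c})

For example, resolving BvC with ~CvB, i.e. ({}, {B, C}) with ({C}, {B}), yields ({}, {B}), i.e. the clause B - the set union has performed the reduction of the repeated atom B automatically.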

Now, let us try to design a program implementing the last step of "proving by resolution" - suppose, we have already the initial list of clauses, and we wish to apply the resolution rule trying to derive a contradiction.

The main data storage will be a growing list of clauses (the main list):

(N1, P1), (N2, P2), ..., (Nk, Pk), ...

It will start as the initial list, and each application of the resolution rule will append a new clause to it.

To guarantee success, we must apply the resolution rule in all the possible ways, i.e. we must scan all pairs of clauses (Ni, Pi)(Nj, Pj), where i = 1, 2, ...; j = i+1, i+2, ... To achieve this, let us use the following pair enumeration process:

(N1, P1)(N2, P2) - first, scan all pairs (i, j) with j=2, i<j.

(N1, P1)(N3, P3), (N2, P2)(N3, P3)- after this, scan all pairs (i, j) with j=3, i<j.

(N1, P1)(N4, P4), (N2, P2)(N4, P4), (N3, P3)(N4, P4) - after this, scan all pairs (i, j) with j=4, i<j.

Etc.

The process will stop, when we arrive at the level j, and the main list contains less than j (in fact, j-1) clauses. For a set of n atoms, there are only 3^n different clauses (each atom either occurs positively, occurs negatively, or does not occur at all). For example, for two atoms A, B, there are 9 different clauses: ~Av~B, ~AvB, Av~B, AvB, ~A, A, ~B, B, and the empty clause (contradiction). I.e., if we prohibit duplicate clauses in the main list, then our process will always stop.

Thus, the following pseudo-code will do (no string processing, no expression parsing necessary!):

function propositional resolution (initial list) { of clauses }
begin
if initial list contains contradiction then return TRUE { contradiction found }
main list = eliminate duplicates (initial list)
for j = 2 by 1
begin
- if count (main list) < j then return FALSE { no contradiction derived }
- else
- for i = 1 to j-1 by 1
- - { consider i-th and j-th clauses in the main list: (Ni, Pi), (Nj, Pj) }
- - if (Ni^Pj)U(Pi^Nj) contains exactly one element C then { ^ means set intersection }
- - begin
- - - {apply resolution}
- - - if (NiUNj-{C}, PiUPj-{C}) not in main list then
- - - begin
- - - - add it to main list
- - - - if main list contains contradiction then return TRUE { contradiction derived }
- - - end
- - end
end
end

Exercise 5.5.4. Develop a computer program implementing the above pseudocode.

Note. See my version of the program in C++: header file, implementation, download the entire Borland C++ project (200K zip).
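For comparison, here is a direct transcription of the pseudocode into Python (a sketch, not the author's C++ program); it reuses the make_clause and resolve helpers sketched above:

def propositional_resolution(initial_list):
    # Returns True, iff a contradiction (the empty clause, or
    # a pair of unit clauses B, ~B) is derivable by resolution.
    def contradiction(clauses):
        if (frozenset(), frozenset()) in clauses:
            return True
        # a unit clause B together with the unit clause ~B
        return any((p, n) in clauses
                   for (n, p) in clauses if len(n | p) == 1)

    main_list = list(dict.fromkeys(initial_list))  # eliminate duplicates
    if contradiction(main_list):
        return True
    j = 1                                 # 0-based indices, pairs i < j
    while j < len(main_list):             # stops: at most 3^n clauses
        for i in range(j):
            new = resolve(main_list[i], main_list[j])
            if new is not None and new not in main_list:
                main_list.append(new)
                if contradiction(main_list):
                    return True
        j += 1
    return False

# The worked example above: BvC, C->B, B->D, ~(B&D) is inconsistent.
clauses = [
    make_clause(set(), {"B", "C"}),   # BvC
    make_clause({"C"}, {"B"}),        # C->B, i.e. ~CvB
    make_clause({"B"}, {"D"}),        # B->D, i.e. ~BvD
    make_clause({"B", "D"}, set()),   # ~(B&D), i.e. ~Bv~D
]
print(propositional_resolution(clauses))  # True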

 

5.6. Herbrand's Theorem

Jacques Herbrand (1908-1931) "... After leaving Goettingen, Herbrand decided on a holiday in the Alps before his intended return to France. However he was never to complete his plans for he died in a mountaineering accident in the Alps only a few days after his holiday began. His death at the age of 23 was one of the tragic losses to mathematics." (according to MacTutor History of Mathematics archive).

Herbrand proved his famous theorem in 1929:

J.Herbrand. Recherches sur la théorie de la démonstration. Ph.D. Thesis, University of Paris, 1930 (approved in April 1929). (Buy the original at Simon Finch Rare Books Online).

Unlike the proof presented below, the original proof of Herbrand's Theorem does not depend on Goedel's Completeness Theorem (or Model Existence Theorem). Herbrand completed his Ph.D. thesis in 1929. In the same year, 1929, Goedel completed his doctoral dissertation about completeness (see Section 4.3). In fact, Herbrand's method allows proving Goedel's Completeness Theorem, but he (Herbrand) "did not notice it". Why? See

Samuel R. Buss. On Herbrand's Theorem. "Lecture Notes in Computer Science", Vol.960, 1995, Springer-Verlag, pp.195-209 (see online at http://math.ucsd.edu/~sbuss/ResearchWeb/herbrandtheorem/).

The flavour of this famous theorem can be best presented in its simplest version. In this version, F(x) is a quantifier free formula containing only one variable x. Herbrand's Theorem says:

The formula ExF(x) is logically valid, iff there is a finite set of constant (closed) terms t1, ..., tn such that the disjunction F(t1)v...vF(tn) is logically valid.

Or, equivalently (via Goedel's Completeness Theorem),

The formula ExF(x) is provable in the classical logic, iff there is a finite set of constant (closed) terms t1, ..., tn such that the disjunction F(t1)v...vF(tn) is provable in the classical logic.

As we will see in the proof, Herbrand's Theorem is "caused" by the simple "fact" that in any proof of ExF(x) only a finite set of terms could be used.

Now, more precisely.

Let L be a first order language, containing at least one constant letter, and let F be a quantifier free formula.

Idea #1 (Author?). The formula p(c1) & q(c2, f(x)) is quantifier free (c1, c2 are constant letters, f - a function letter, p, q - predicate letters). In a sense, any "closed" interpretation domain for this formula must contain objects denoted by the terms c1, c2, f(c1), f(c2), f(f(c1)), f(f(c2)),...

So, let us define the so-called Herbrand's universe of the formula F (let us denote it by HUF) as the minimum set of all constant (closed) terms such that:

a) If c is a constant letter occurring in F, then c is in HUF.

b) If F does not contain constant letters, then one of the constant letters of the language L is in HUF.

c) If terms t1, ..., tk are in HUF, and f is a k-ary function letter occurring in F, then the term f(t1, ..., tk) is in HUF.

Exercise 5.6.1. Verify that HUF is a non-empty finite or countable set (provide an algorithm generating the members of HUF).
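A possible generating algorithm (a Python sketch with hypothetical names): produce the terms of HUF level by level - level 0 contains the constants, and each next level adds all applications of the function letters to the terms generated so far:

from itertools import product

def herbrand_universe(constants, functions, max_level):
    # constants: list of constant letters, e.g. ["c1", "c2"];
    # functions: dict of function letters with arities, e.g. {"f": 1};
    # terms are represented as strings or nested tuples (f, t1, ..., tk).
    universe = set(constants) if constants else {"c"}  # case b) of the definition
    for _ in range(max_level):
        new_terms = set()
        for f, arity in functions.items():
            for args in product(universe, repeat=arity):
                new_terms.add((f,) + args)
        universe |= new_terms
    return universe

# For constants c1, c2 and a unary f, levels 0..2 give:
# c1, c2, f(c1), f(c2), f(f(c1)), f(f(c2))
print(herbrand_universe(["c1", "c2"], {"f": 1}, 2))

To enumerate the whole (generally infinite) set HUF, one simply lets max_level grow without bound.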

Theorem 5.6.1 (Herbrand's Theorem - the simplest case). Let L be a first order language, containing at least one constant letter, and let F(x) be a quantifier free formula containing only one free variable x. Then the formula ExF(x) is logically valid (i.e. provable in the classical predicate logic), iff there is a finite set of terms t1, ..., tn from HUF such that the disjunction F(t1)v...vF(tn) is logically valid (i.e. provable in the classical predicate logic).

Proof. Let us assume the contrary - that none of the disjunctions F(t1)v...vF(tn) is logically valid (ti-s are terms from HUF). Idea #2 - then the following theory T is consistent:

T = { ~F(t) | t is a term from HUF}.

Indeed, if T would be inconsistent, then there would be a T-proof of some formula B&~B. In this proof, only a finite set of the axioms ~F(t) would be used, i.e. for some terms t1, ..., tn from HUF:

[L1-L15, MP, Gen]: ~F(t1), ..., ~F(tn) |- B&~B.

Hence, by Deduction Theorem 2 (it is applicable here, because F(x) contains only one free variable, and ti-s are constant terms, i.e. every ~F(ti) is a closed formula):

[L1-L15, MP, Gen]: |- ~F(t1)&... &~F(tn) -> B&~B,

[L1-L15, MP, Gen]: |- ~(F(t1)v... vF(tn)) -> B&~B,

and thus,

[L1-L15, MP, Gen]: |- F(t1)v... vF(tn).

I.e., F(t1)v... vF(tn) is logically valid. This contradicts our assumption, that none of the disjunctions F(t1)v...vF(tn) is logically valid. Hence, T is a consistent theory.

Idea #3 - if T is consistent, then, by the Model Existence Theorem, there is a model J of T. In this model, all the axioms of T are true, i.e. so are all the formulas ~F(t) with t from the set HUF.

Idea #4 - let us restrict the domain of the model J to those elements of it, which are interpretations of terms from the set HUF, and let us restrict the entire interpretation correspondingly. Let us denote this new interpretation by J1. Then,

a) All the formulas ~F(t) (with t from the set HUF) are true in J1. Indeed, ~F(t) contains only constant terms from HUF (idea #1 working!), and all of them have the same interpretations in J1 that they had in J. Thus, if ~F(t) was true in J, it remains true in J1.

b) Hence, the formula Ax~F(x) is true in J1 (because the domain of J1 consists only of those elements, which are interpretations of terms from the set HUF).

c) Hence, the formula ExF(x) is false in J1.

This contradicts the logical validity of ExF(x).

Q.E.D.

Exercise 5.6.2. Repeat the above proof, proving a more general form of Herbrand's Theorem:

Theorem 5.6.2 (Herbrand's Theorem - a more general case). Let L be a first order language, containing at least one constant letter, and let F(x1, ..., xm) be a quantifier free formula containing only m free variables x1, ..., xm. The formula Ex1...Exm F(x1, ..., xm) is logically valid, iff there is a finite set of m-tuples tt1, ..., ttn of terms from HUF such that the disjunction F(tt1)v...vF(ttn) is logically valid.

As you verified it in the Exercise 4.1.3, any formula G is logically valid, iff ~G is unsatisfiable. Thus, Ex1...Exm F(x1, ..., xm) is logically valid, iff Ax1...Axm ~F(x1, ..., xm) is unsatisfiable. On the other hand, F(tt1)v...vF(ttn) is logically valid, iff ~F(tt1)&...&~F(ttn) is unsatisfiable. Now, let us replace F by ~F, and we have proved

Theorem 5.6.3 (Herbrand's Theorem - a more useful alternative form). Let L be a first order language, containing at least one constant letter, and let F(x1, ..., xm) be a quantifier free formula containing only m free variables x1, ..., xm. The formula Ax1...Axm F(x1, ..., xm) is unsatisfiable (i.e. inconsistent in the classical logic), iff there is a finite set of m-tuples tt1, ..., ttn of terms from HUF such that the conjunction F(tt1)&...&F(ttn) is unsatisfiable (i.e. inconsistent in the classical logic).

Note. As you verified it in the Exercise 4.3.6, a set of formulas is inconsistent in the classical logic, iff it is unsatisfiable.

Why is this form "more useful"? Let us try applying this form of Herbrand's Theorem to sets of formulas in clause form.

1) The "meaning" of any set of closed formulas F1, ... , Fk is represented by their conjunction F1& ... &Fk.

2) A clause is any disjunction of atomic formulas or their negations. For example, ~p(c1) v p(c2) v q(x, f(y)), or p(x) v ~q(y, f(z)). The "meaning" of a set of clauses is represented by their universally quantified conjunction. For example, AxAyAz([~p(c1) v p(c2) v q(x, f(y))] & [p(x) v ~q(y, f(z))]).

3) As we know from the previous Section 5.4, the set F1, ..., Fk can be reduced to a clause form, i.e. there is a set of clauses S such that F1, ..., Fk is unsatisfiable, iff S is unsatisfiable.

Now, let us apply the above form of Herbrand's Theorem (Theorem 5.6.3). If S contains m variables (of course, all of them are universally quantified), then S is unsatisfiable, iff there is a finite set of m-tuples tt1, ..., ttn of terms from HUS such that the conjunction S(tt1)&...&S(ttn) is unsatisfiable.

If we take a clause from S, and substitute some terms from HUS for all its variables, then we obtain a (so-called) ground clause of S. For example, if

S = { ~p(c1) v p(c2) v q(x, f(y)); p(x) v ~q(y, f(z)) },

then the substitutions { f(c1) / x; c2 / y } (for the first clause) and { c1 / x; c2 / y; f(c2) / z } (for the second clause) yield the following two ground clauses:

~p(c1) v p(c2) v q(f(c1), f(c2)),

p(c1) v ~q(c2, f(f(c2))).

Of course, the conjunction S(tt1)&...&S(ttn) is a set of ground clauses. Thus, if S is unsatisfiable, then there is an unsatisfiable finite set of ground clauses of S. And conversely?

If there is an unsatisfiable finite set C = {C1, ..., Cn} of ground clauses of S, then each Ci is generated by some substitution, which can be represented as an m-tuple tti of terms from HUS. If {C1, ..., Cn} is unsatisfiable, then {S(tt1), ..., S(ttn)} - as a super-set of the former - is unsatisfiable, too ("even more unsatisfiable").

Now, if S would be satisfiable, then (because all the variables of S are meant universally quantified) so would be the formula S(tt1)& ...&S(ttn). Contradiction.

Thus, we have proved another form of Herbrand's Theorem.

Theorem 5.6.4 (Herbrand's Theorem - the most useful form. Author - Herbert B.Enderton?). Let L be a first order language, containing at least one constant letter, and let F1, ..., Fk be a set of closed formulas in L. Then this set is unsatisfiable, iff its clause form allows an unsatisfiable finite set of ground clauses.

Why is this form "the most useful"? Because (let us ignore performance problems),

a) The clause form of F1, ..., Fk is a finite set S, generated by a simple (but a very slow) algorithm (see Sections 5.1-5.4).

b) Herbrand's universe HUS is a finite or infinite set of constant terms, generated by a simple algorithm (see Exercise 5.6.1).

c) Thus, all the possible finite sets of ground clauses of S can be generated by a simple combination of the above two algorithms.

d) Unsatisfiability of each finite set of ground clauses can be detected by a simple (but a very slow) algorithm (see Lemma 5.6.5 below).

Thus, we have here a simple (but a very slow) algorithm for checking provability in the classical predicate logic.

Lemma 5.6.5. A finite set of ground clauses is unsatisfiable, iff the conjunction of these clauses is unsatisfiable under the classical truth tables.

Proof. In the above example of ground clauses:

~p(c1) v p(c2) v q(f(c1), f(c2)),

p(c1) v ~q(c2, f(f(c2))),

we have 4 different atoms: p(c1), p(c2), q(f(c1), f(c2)), q(c2, f(f(c2))). Let us denote these atoms by Q1, Q2, Q3, Q4. Thus we obtain the following propositional formula

(~Q1 v Q2 v Q3) & (Q1 v ~Q4).

1. If this formula cannot be satisfied under the classical truth tables, then we cannot assign truth values to predicates p, q in a way making all the corresponding clauses true. I.e. then the corresponding set of ground clauses also cannot be satisfied. Q.E.D.

2. If this formula can be satisfied under the classical truth tables, then we can find a truth value assignment making it true, for example:

Q1=false (this makes the first disjunction true),
Q4=false (this makes the second disjunction true).

Now, we can define the following interpretation J making the corresponding ground clauses true:

DJ = { c1, c2, f(c1), f(c2), f(f(c2)) } (the set of all terms appearing in the clauses, i.e. a subset of the Herbrand universe);

p(c1)=false, q(c2, f(f(c2)))=false (these assignments make both ground clauses true).

All the other truth values are irrelevant, so, we can define them, for example, as follows:

p(c2)=true, p(f(c1))=true, p(f(c2))=true, p(f(f(c2)))=true;

q(x, y)=true, if x is not c2, or y is not f(f(c2)).

Q.E.D.
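The "simple (but very slow)" algorithm behind Lemma 5.6.5 can be sketched in Python as follows (ground clauses are pairs of sets of ground atoms, as in Section 5.5, the ground atoms playing the role of propositional variables; all names are hypothetical):

from itertools import product

def ground_clauses_satisfiable(clauses):
    # Brute force over all truth value assignments to the
    # distinct ground atoms (classical truth tables).
    atoms = sorted({a for n, p in clauses for a in n | p})
    for values in product([False, True], repeat=len(atoms)):
        v = dict(zip(atoms, values))
        # a clause is true, iff some negative atom is false,
        # or some positive atom is true
        if all(any(not v[a] for a in n) or any(v[b] for b in p)
               for n, p in clauses):
            return True
    return False

# The two ground clauses of the example, i.e. (~Q1vQ2vQ3)&(Q1v~Q4):
example = [
    ({"p(c1)"}, {"p(c2)", "q(f(c1), f(c2))"}),
    ({"q(c2, f(f(c2)))"}, {"p(c1)"}),
]
print(ground_clauses_satisfiable(example))  # True (e.g. Q1=Q4=false)

The running time is exponential in the number of distinct ground atoms, which is exactly why the algorithm is "very slow".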

...

To be continued.

...

Further reading:

Eric Evangelista. A Model-Theoretic Approach to Herbrand’s Theorem (see at http://www.sfu.ca/~eevangel/cs/docs/herbrand.pdf).

Michael Genesereth. Herbrand Method (see at http://logic.stanford.edu/classes/cs157/2003/lectures/lecture07.pdf).

 

5.7. Resolution Method for Predicate Formulas

If we are interested only in deriving contradictions from inconsistent sets of formulas, then we can note that a set of closed predicate formulas is inconsistent (i.e. allows deriving a contradiction in the classical logic), iff the conjunction of these formulas is unsatisfiable (Exercise 4.3.6). Thus, instead of the initial set, we can analyze the set of clause forms of these formulas. Indeed, if we derive a contradiction from the set of clause forms, then this set is unsatisfiable, i.e., by Theorem 5.4.2, so is the initial set, and hence, the initial set is inconsistent. And conversely, if the initial set of formulas is consistent, then it is satisfiable, i.e. so is the set of clause forms, i.e. we will not be able to derive a contradiction from it.

The next step forward - in clause forms, we can drop all the universal quantifiers. Indeed, if we derive a contradiction from a set of universally quantified clause forms, then we can derive it from the corresponding non-quantified set (we can apply the Gen inference rule F(x) |- AxF(x) to obtain the quantified forms from the non-quantified ones). And conversely, if we derive a contradiction from a set of non-quantified clause forms, then we can derive it from the corresponding universally quantified set (apply the Axiom L12: AxF(x) -> F(t) to obtain non-quantified forms from the quantified ones).

After dropping quantifiers, sets of clause forms become simply sets of clauses (conjunction of conjunctions is equivalent to a "joint" conjunction).

Thus, we can concentrate on sets of clauses that do not contain quantifiers, like the one obtained in Section 5.4:

-> g(u)>u,
-> g(u)>1,
y>1, z>1, g(u)=y*z ->.

Note that clauses consist of atomic formulas only.

Another step forward - in clause forms, we can rename variables in such a way that two clauses do not contain common variables. For example, we can replace the above set of clauses by the following one:

-> g(u1)>u1,
-> g(u2)>1,
y>1, z>1, g(u3)=y*z ->.

Indeed,

Exercise 5.7.1. Return to Section 3.3 and verify that, in the minimal logic

[L1-L9, MP]: |- Ax(B(x)&C(x)) <-> Ax1Ax2[B(x1)&C(x2)]. What does it mean?
[L1-L9, MP]: |- Ax(B(x)&C(x)&D(x)) <-> Ax1Ax2Ax3[B(x1)&C(x2)& D(x3)]. Etc.

At first glance, this step may seem "redundant". Still, note that, in fact, it allows complete separation of clauses "by the meaning", and this separation will greatly simplify processing of clauses by means of substitution (see below).

Will Robinson's resolution rule remain a universal tool for deriving contradictions also from inconsistent sets of predicate formulas (i.e. sets of non-quantified clauses, consisting of atomic formulas)?

Let us imagine, we have derived the following two formulas (p is a unary predicate letter, 0 - a constant letter):

p(x1) v F(x1, y1), ~p(0) v G(x2, y2).

To apply the Robinson's resolution rule, we must first, in p(x1), substitute 0 for x1:

p(0) v F(0, y1), ~p(0) v G(x2, y2).

Now, we can apply the resolution rule, obtaining the formula

F(0, y1) v G(x2, y2).

Surprisingly, this simple idea of "unification by substitution" appears to be sufficient to make Robinson's resolution rule a universal tool for deriving contradictions also from inconsistent sets of predicate formulas! And, in general, the necessary substitutions are not much more complicated than in the above simplest example.

Theorem 5.7.1 (J.A.Robinson). In the classical predicate logic [L1-L11, L12-L15, MP, Gen], a set of predicate clauses is inconsistent, iff Robinson's resolution rule (together with permutation, reduction and substitution rules) allows deriving a contradiction from it.

Note. In some other texts, this fact is called "the refutation-completeness of the resolution rule".

Proof. 1. All the formulas, derived from a set of clauses K1, K2, ... , Ks by using permutation, reduction, substitution and resolution rules, are consequences of K1, K2, ... , Ks. Hence, if these rules allow deriving a contradiction from this set of clauses, then it (the set) is inconsistent.

2. Now, let us assume that the set of clauses S = {K1, K2, ... , Ks} is inconsistent. Then it is unsatisfiable (Exercise 4.3.6). And then, by Herbrand's Theorem, it allows a finite unsatisfiable set of ground clauses C1, ..., Cn. Each Ci of these ground clauses is obtained from some clause in S by means of some substitution subi (of terms from the Herbrand universe HUS), i.e. by applying the substitution rule.

By Lemma 5.6.5, the set C1, ..., Cn is unsatisfiable, iff the conjunction C1& ... &Cn is unsatisfiable under the classical truth tables, i.e., iff the set C1, ..., Cn is inconsistent. And, by Theorem 5.5.1, a finite set of propositional clauses is inconsistent, iff Robinson's resolution rule (together with permutation and reduction rules) allows deriving a contradiction from it.

Q.E.D.

Refinements - Step 1 (First of the Two Smart Ideas)

Let us examine once more the part two of the proof of Theorem 5.7.1, where a specific (hopeless!) "proof strategy" is used.

First, since no two clauses Ki contain common variables, we can think that each of the substitutions subj is applied to a single clause, i.e. we can think, in fact, of a (finite) set of substitutions subij, where each subij is applied only to the clause Ki. Let us denote by F.sub the result of application of the substitution sub to the formula F.

Second, to derive a contradiction from {K1, K2, ... , Ks}, we may apply, first, all the necessary substitutions (stage 1 - substitutions only!), and, after this, all the necessary permutations, reductions and resolutions (stage 2 - no more substitutions!). This is exactly the above-mentioned specific (hopeless!) "proof strategy". Why hopeless? Because, before applying the substitutions subij, we must find them among all the possible substitutions of terms from the infinite set HUS. This is a performance problem that does not affect our above theoretical considerations, but could make their result useless. The smart ideas #1 and #2, introduced below, allow restricting the substitution search area considerably.

Imagine one of the resolutions of stage 2, where C1 is an atomic formula:

F1vC1, ~C1vG1
----------------,
F1vG1

If both premises F1vC1, ~C1vG1 are coming directly from stage 1, then they have been obtained from some initial clauses FvC, ~DvG by two substitutions sub1 and sub2 such that:

F1 is F.sub1, C1 is C.sub1, ~C1 is ~D.sub2, G1 is G.sub2.

We can call such a pair of substitutions a unifier, because C.sub1 and D.sub2 represent the same atomic formula (compare the example before the text of Theorem 5.7.1).

If one (or both) of the premises does not come directly from stage 1, then it is either an initial clause, or the result of a previous resolution. By putting an empty substitution (which does not change formulas) instead of sub1 or sub2 (or both), we can still think of the premises as obtained by a unification.

And, finally, if, to derive a contradiction B, ~B from K1, K2, ... , Ks, we do not need resolution at all, then we need, nevertheless, unifying substitutions, converting two clauses B1 and ~B2 into B and ~B.

Thus (smart idea #1), to derive contradictions, we can do with one specific kind of the substitution rule - the unification rule:

a) Take two clauses, mark a positive atom C in the first clause, and a negative atom ~D in the second one. Thus, we are considering two clauses: FvC and ~DvG.

b) Try to find two substitutions sub1 and sub2 such that C.sub1 and D.sub2 represent the same atom C1 (you do not need to introduce variables of other clauses!). If you succeed, you have obtained two clauses: F1vC1, ~C1vG1, where C1 is C.sub1 (=D.sub2), F1 is F.sub1 and G1 is G.sub2. Since the clauses do not contain common variables, the union sub1Usub2 is a substitution (a unifier of C and D).

c) Apply resolution, obtaining the clause F1vG1.

We have proved the following refined version of Theorem 5.7.1:

Theorem 5.7.2 (J.A.Robinson). In the classical predicate logic [L1-L11, L12-L15, MP, Gen], a set of predicate clauses is inconsistent, iff Robinson's resolution rule (together with permutation, reduction and unification rules) allows deriving a contradiction from it.

Why is this refinement important? Because now, instead of trying out all the possible substitutions (of terms from HUS for clause variables), we can concentrate on substitutions that unify two clauses. This restricts the substitution search area considerably.

Refinements - Step 2 (Second of the Two Smart Ideas)

Substitution "Algebra"

In general, each substitution involves a list of distinct variables x1, ..., xk and a list of terms t1, ..., tk. All occurrences of the variable xi are replaced by the term ti. Thus, this operation can be most naturally represented by the set of pairs { t1 / x1, ..., tk / xk }. The order of pairs ti / xi is irrelevant because of the following "anti-cascading" condition: the new occurrences of the variables x1, ..., xk created by the substitution are not replaced. The result of application of some substitution sub to some expression (term or formula) F is usually denoted by F.sub.

For example, if F is p(x, f(y)) and sub = { f(z) / x, z / y }, then F.sub is p(f(z), f(z)).

The empty set of pairs {} represents the so-called empty substitution. Of course, F.{} = F, for any expression F.

If the variable sets of two substitutions sub1 and sub2 do not intersect, and the terms of sub1 do not contain the variables of sub2, and the terms of sub2 do not contain the variables of sub1, then the union sub1Usub2 (of two sets of pairs) defines a substitution.

Still, the most important operation on substitutions is composition. If sub1 and sub2 are two substitutions, then sub1.sub2 denotes the composed substitution "apply first sub1, and after this, apply sub2". For example, if sub1 = { f(z) / x, z / y } and sub2 = { f(w) / z }, then

sub1.sub2 = { f(f(w)) / x, f(w) / y, f(w) / z }.
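As a sketch, application and composition of substitutions can be programmed as follows (Python; variables and constants are strings, composite terms are tuples (f, t1, ..., tk), and a substitution is a dict mapping variables to terms - all names are hypothetical):

def apply_sub(term, sub):
    # Apply a substitution to a term. The "anti-cascading" condition
    # holds automatically: the inserted terms are not rewritten again.
    if isinstance(term, tuple):
        return (term[0],) + tuple(apply_sub(t, sub) for t in term[1:])
    return sub.get(term, term)

def compose(sub1, sub2):
    # The composition sub1.sub2: "apply first sub1, then sub2".
    # (A simplified version: trivial pairs like z / z are not dropped.)
    result = {x: apply_sub(t, sub2) for x, t in sub1.items()}
    for x, t in sub2.items():
        result.setdefault(x, t)
    return result

# The example above: sub1 = { f(z)/x, z/y }, sub2 = { f(w)/z }.
sub1 = {"x": ("f", "z"), "y": "z"}
sub2 = {"z": ("f", "w")}
print(compose(sub1, sub2))
# {'x': ('f', ('f', 'w')), 'y': ('f', 'w'), 'z': ('f', 'w')}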

Exercise 5.7.2. a) Verify that the substitution composition is associative and non-commutative (provide a counter-example), and that the empty substitution is the only "unit element" (i.e. {}.sub = sub.{} = sub for any substitution sub). b) Is there any algebraic correlation between composition and union of substitutions?

Most General Unifiers

How do unifiers behave in the substitution "algebra"? Assume sub1 and sub2 are two different unifiers of the same pair of expressions F and G. I.e.

F.sub1 = G.sub1, F.sub2 = G.sub2.

If there is a substitution sub such that sub2=sub1.sub, then we say that sub1 is a no less general unifier than sub2. For example, let us try to unify the first members of the following two formulas:

p(x1) v F(x1, y1), ~p(f(x2)) v G(x2, y2).

It would be natural to use the substitution sub1 = { f(z) / x1, z / x2 }, obtaining

p(f(z)) v F(f(z), y1), ~p(f(z)) v G(z, y2).

But, in principle, one could use also the substitution sub2 = { f(f(z)) / x1, f(z) / x2 }, obtaining

p(f(f(z))) v F(f(f(z)), y1), ~p(f(f(z))) v G(f(z), y2).

Of course, sub1 is "better", because sub2 = sub1.{ f(z) / z }. Why? If our purpose was unifying p(x1) with p(f(x2)), then sub1 performs this (as well as sub2), but it "leaves more space" for subsequent substitutions (than sub2). Indeed, to continue after sub1, instead of sub2 = sub1.{ f(z) / z }, we can choose also sub3 = sub1.{ g(z) / z } etc. Thus, using a more general unifier is preferable.

So, let us call a unifier sub of two expressions F and G a most general unifier (mgu) of F and G, iff it is no less general than any other unifier of F and G (i.e. iff, for any other unifier sub' of F and G, there is a substitution sub'' such that sub' = sub.sub'').

Lemma 5.7.3. If two expressions lists FF and GG are unifiable, then there exists an mgu of FF and GG.

Proof (long, but easy). Let us define the total length of an expression list as follows: a) (atomic expressions) the total length of a constant or of a variable is 1, b) the total length of the expression list e1, ..., en is the sum of the total lengths of the members e1, ..., en, c) (composite expressions) the total length of the expression f(t1, ..., tn) (where f is function letter or predicate letter), is the total length of the expression list t1, ..., tn plus 1.

Let us prove our Lemma by induction using min(total_length(FF), total_length(GG)) as the induction parameter.

1) Induction base. The total length of FF or GG is 1. Let us assume total_length(FF)=1.

a) FF is a constant c. Then FF and GG are unifiable, iff GG is the same constant c. In this case, the empty substitution is the only possible mgu (verify).

b) FF is a variable x. Then, FF and GG are not unifiable, if: b1) GG is a list of more than one expression, or, b2) GG is a composite expression that contains x (then any substitution of t for x makes GG longer than t). And, FF and GG are unifiable, iff GG is a variable, or GG is a composite expression that does not contain x.

If GG is the variable x, then the empty substitution is the only possible mgu (verify).

If GG is a variable y (other than x), then all unifiers of FF and GG have the form { t / x, t / y, ... }, where t is any term. Among them, mgu-s are { z / x, z / y }, where z is any variable (verify).

If GG is a composite expression that does not contain x, then all unifiers of FF and GG have the form { GG.sub / x, ... } U sub, where sub is any substitution that does not substitute for x (verify). Among them, mgu-s are { GG.sub / x } U sub, where sub substitutes distinct variables for distinct variables (verify).

This completes the induction base.

2) Induction step. Assume, min(total_length(FF), total_length(GG))=n, where n>1. If FF and GG are unifiable, then, as lists, they contain the same number of members.

2a) FF and GG are single expressions. Since min(total_length(FF), total_length(GG))>1, both are composite expressions - suppose, FF is f(s1, ..., sm) (where f is a function letter or a predicate letter, and s1, ..., sm are terms), and GG is g(t1, ..., tk) (where g is a function letter or a predicate letter, and t1, ..., tk are terms). FF and GG are unifiable, iff a) f and g represent the same letter, and b) the lists s1, ..., sm and t1, ..., tk are unifiable. Thus, the unifiers of FF and GG coincide with the unifiers of the lists. Since min(total_length(s1, ..., sm), total_length(t1, ..., tk))<n, by the induction assumption, Lemma 5.7.3 holds for the lists, i.e. it holds also for FF and GG.

2b) FF and GG contain two or more members. If FF and GG are unifiable, then so are their first members ("heads") F1 and G1. Let us denote by FF2 and GG2 the rests of lists ("tails"). Since min(total_length(F1), total_length(G1))<n, by the induction assumption, there exists at least one mgu of F1 and G1. The same is true also for FF2 and GG2.

Let us denote by mgu1 an arbitrary mgu of F1 and G1.

Now, let us consider an arbitrary unifier u of FF and GG. It must unify also F1 with G1, and FF2 with GG2. Hence, u = mgu1.sub1, where sub1 is some substitution. We know that F1.mgu1 = G1.mgu1.

But what about FF2.mgu1 and GG2.mgu1? Let us apply sub1 to both:

FF2.mgu1.sub1 = FF2.u

GG2.mgu1.sub1 = GG2.u

Since u unifies FF2 with GG2,

FF2.mgu1.sub1 = GG2.mgu1.sub1,

i.e. sub1 unifies FF2.mgu1 with GG2.mgu1. Let us denote by mgu12 an arbitrary mgu of FF2.mgu1 and GG2.mgu1. Then, sub1 = mgu12.sub12, where sub12 is some substitution, and

mgu1.mgu12.sub12 = mgu1.sub1=u.

Thus, we have established that for an arbitrary unifier u of FF and GG there is a substitution sub12 such that mgu1.mgu12.sub12 = u. Of course, the composition mgu1.mgu12 unifies FF with GG (since it unifies F1 with G1, and FF2 with GG2). Hence, mgu1.mgu12 is an mgu of FF and GG.

Q.E.D.

Unification Algorithm

How can we determine whether two atomic formulas C and D can be unified? This problem can be solved by the following simple pseudo-code GetMostGeneralUnifier, which follows the above proof of Lemma 5.7.3, and where expression lists are defined in LISP style:

1) Each variable, constant, function letter and predicate letter is an expression list (consisting of a single member).

2) If s1, ..., sn are expression lists, then the list of s1, ..., sn is an expression list (consisting of members s1, ..., sn). The first member s1 is called the head of the list, and the list of s2, ..., sn - the tail of the list.

Thus, instead of, for example, f(t1, ..., tn), we use simply the (LISP style) list f, t1, ..., tn. This simplifies the recursion interface.

This program detects whether two expression lists are unifiable, and, if they are, it returns one of their most general unifiers.

function GetMostGeneralUnifier (expression_list1, expression_list2)
begin
if length(expression_list1) > 1 and length(expression_list2) > 1 then
begin
--- h1 = head(expression_list1);
--- h2 = head(expression_list2);
--- subH = GetMostGeneralUnifier(h1, h2);
--- if subH = false then return false; {unification impossible}
--- t1 = tail(expression_list1).subH;
--- t2 = tail(expression_list2).subH;
--- subT = GetMostGeneralUnifier(t1, t2);
--- if subT = false then return false; {unification impossible, note that subH is a mgu!}
--- return subH.subT; {this composition unifies expression_list1 and expression_list2}
end
{now, expression_list1, or expression_list2 consists of a single member: m1 or m2}
if length(expression_list1) = 1 and m1 is variable then
begin
--- if m1 = expression_list2 then return {}; {empty substitution}
--- if m1 occurs in expression_list2 then return false; {unification impossible - verify!}
--- return {expression_list2 / m1}; {substitute expression_list2 for m1}
end
if length(expression_list2) = 1 and m2 is variable then
begin
--- if m2 = expression_list1 then return {}; {empty substitution}
--- if m2 occurs in expression_list1 then return false; {unification impossible - verify!}
--- return {expression_list1 / m2}; {substitute expression_list1 for m2}
end
{now, expression_list1, or expression_list2 consists of a single member that is not variable}
if expression_list1 = expression_list2 then return {}; {empty substitution}
return false; {unification impossible - verify!}
end
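The same algorithm in runnable form might look like this (a Python sketch building on the apply_sub and compose helpers above; composite expressions are tuples whose head is the letter, and - a purely illustrative convention - variables are strings starting with u, ..., z):

def is_variable(term):
    return isinstance(term, str) and term[0] in "uvwxyz"

def occurs_in(var, term):
    if isinstance(term, tuple):
        return any(occurs_in(var, t) for t in term[1:])
    return term == var

def unify(e1, e2):
    # Returns an mgu of the two expressions, or None if they
    # are not unifiable.
    if e1 == e2:
        return {}                                 # empty substitution
    if is_variable(e1):
        return None if occurs_in(e1, e2) else {e1: e2}
    if is_variable(e2):
        return None if occurs_in(e2, e1) else {e2: e1}
    if (isinstance(e1, tuple) and isinstance(e2, tuple)
            and e1[0] == e2[0] and len(e1) == len(e2)):
        mgu = {}                                  # unify "heads", then "tails"
        for s1, s2 in zip(e1[1:], e2[1:]):
            sub = unify(apply_sub(s1, mgu), apply_sub(s2, mgu))
            if sub is None:
                return None
            mgu = compose(mgu, sub)
        return mgu
    return None                                   # e.g. different letters

# Unifying p(x1) with p(f(x2)) yields the mgu { f(x2)/x1 }:
print(unify(("p", "x1"), ("p", ("f", "x2"))))  # {'x1': ('f', 'x2')}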

Exercise 5.7.3. Verify that this program detects whether two expression lists are unifiable, and, if they are, that it returns one of their mgu-s. (Hint: repeat the proof of Lemma 5.7.3.)

Smart idea #2:

To derive contradictions, we can do with an even more specific kind of the unification rule - the mgu-rule:

a) Take two clauses, mark a positive atom C in the first clause, and a negative atom ~D in the second one. Thus, we are considering two clauses: FvC and ~DvG.

b) Try to find any mgu of C and D. If you succeed, you have obtained two clauses: F.mgu v C1, ~C1 v G.mgu, where C1 is C.mgu (=D.mgu).

c) Apply resolution, obtaining the clause F.mgu v G.mgu.
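Combining the pieces, one application of the mgu-rule could be sketched as follows (Python, building on the unify and apply_sub helpers above; predicate clauses are pairs of frozensets of atoms, the atoms being tuples, and - for simplicity of the illustration - F and G of the earlier example are treated as predicate letters):

def resolve_by_mgu(clause1, clause2):
    # Try steps a)-c): pick a positive atom C of clause1 and a
    # negative atom D of clause2, unify them, and resolve.
    # Returns the first resolvent found, or None.
    n1, p1 = clause1
    n2, p2 = clause2
    for c in p1:
        for d in n2:
            mgu = unify(c, d)
            if mgu is not None:
                def sub(atoms):
                    return frozenset(apply_sub(a, mgu) for a in atoms)
                cu = apply_sub(c, mgu)      # C.mgu = D.mgu
                return (sub(n1) | (sub(n2) - {cu}),
                        (sub(p1) - {cu}) | sub(p2))
    return None

# The example before Theorem 5.7.1: from p(x1) v F(x1, y1) and
# ~p(0) v G(x2, y2) we obtain F(0, y1) v G(x2, y2):
k1 = (frozenset(), frozenset({("p", "x1"), ("F", "x1", "y1")}))
k2 = (frozenset({("p", "0")}), frozenset({("G", "x2", "y2")}))
print(resolve_by_mgu(k1, k2))
# (frozenset(), frozenset({('F', '0', 'y1'), ('G', 'x2', 'y2')}))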

Theorem 5.7.4 (J.A.Robinson). In the classical predicate logic [L1-L11, L12-L15, MP, Gen], a set of predicate clauses is inconsistent, iff Robinson's resolution rule (together with permutation, reduction and mgu-rules) allows deriving a contradiction from it.

Why is this (second!) refinement important? Because now, instead of trying out all the possible unifications, we can concentrate on mgu-s. This restricts the substitution search area even more (compared with Theorem 5.7.2).

The hard part of the proof is inventing the following

Lemma 5.7.5. Any proof K1, K2, ... , Ks |- K (all K-s are clauses), where only permutation, reduction, substitution and resolution rules are used, can be converted into a proof K1, K2, ... , Ks |- K' such that: a) in the proof, only permutation, reduction, mgu and resolution rules are used; b) K can be obtained from K' by a single (possibly empty) substitution, followed by a chain of permutations and reductions.

Proof of Theorem 5.7.4. Assume, the set of clauses K1, K2, ... , Ks is inconsistent. Then, by Theorem 5.7.1, there are two proofs K1, K2, ... , Ks |- B, K1, K2, ... , Ks |- ~B, where only permutation, reduction, substitution and resolution rules are used. From clauses, these rules allow deriving only clauses. Hence, B is an atomic formula.

By Lemma 5.7.5, both proofs can be converted into proofs K1, K2, ... , Ks |- B1, K1, K2, ... , Ks |- ~B2 such that: a) in the proofs, only permutation, reduction, mgu and resolution rules are used; b1) B can be obtained from B1 by a single (possibly empty) substitution (permutations and reductions do not apply to atomic formulas), b2) B can be obtained from B2 by a single (possibly empty) substitution.

Thus, B1 and B2 are unifiable. Let us take their mgu, and apply it. As the result, we obtain a contradiction B', ~B', where B' is B1.mgu (= B2.mgu). And we have obtained this contradiction from the clauses K1, K2, ... , Ks by using only permutation, reduction, mgu and resolution rules. Q.E.D.

Proof of Lemma 5.7.5. Induction on the "height of the resolution tree" (see below).

1. Induction base - no resolutions applied in the proof K1, K2, ... , Ks |- K. Then K is obtained from some Ki by a chain of permutations, reductions and substitutions. Add to this fact an "empty" proof K1, K2, ... , Ks |- Ki. And let us compose all the substitutions into a single substitution. Q.E.D.

2. Induction step. Assume, we have the proof K1, K2, ... , Ks |- K, containing at least one resolution. Imagine the last resolution in this proof (C is an atomic formula):

FvC, ~CvG
----------------.
FvG

Then K is obtained from the formula FvG by a chain of permutations, reductions and substitutions.

The proofs of the formulas FvC, ~CvG possess a "height of the resolution tree" less than the one of the proof K1, K2, ... , Ks |- K. Thus, by induction assumption, we can convert these proofs into permutation-reduction-mgu-resolution proofs of some formulas F1vC1vF2 and G1v ~C2vG2 such that:

a) FvC can be obtained from F1vC1vF2 by a single (possibly empty) substitution sub1, followed by a chain of permutations and reductions. Under sub1, the atomic formula C1 is converted into C.

b) ~CvG can be obtained from G1v ~C2vG2 by a single (possibly empty) substitution sub2, followed by a chain of permutations and reductions. Under sub2, the atomic formula C2 is converted into C.

Since the clauses F1vC1vF2 and G1v ~C2vG2 do not contain common variables, the substitutions sub1 and sub2 do not intersect, hence, their union sub1Usub2 is a substitution sub (a unifier of C1 and C2) such that:

a1) F can be obtained from (F1vF2).sub by a chain of permutations and reductions.

b1) G can be obtained from (G1vG2).sub by a chain of permutations and reductions.

As we know from the above, the atomic formulas C1 and C2 are unifiable. Let us take their mgu, and apply it to the formulas F1vC1vF2 and G1v ~C2vG2. Let us denote by C' the formula C1.mgu (it is equal to C2.mgu). Thus, we have two formulas F1.mgu v C' v F2.mgu and G1.mgu v ~C' v G2.mgu, and, by permutation and resolution, we can obtain the formula (F1vF2).mgu v (G1vG2).mgu.

Thus, for the formula (F1vF2).mgu v (G1vG2).mgu, we have a permutation-reduction-mgu-resolution proof. It remains to show that, from this formula, FvG can be obtained by a single substitution, followed by a chain of permutations and reductions.

Since the substitution sub is a unifier of C1 and C2, then, by the definition of mgu, sub=mgu.sub', where sub' is some substitution. Hence,

a2) F can be obtained from (F1vF2).mgu by the substitution sub', followed by a chain of permutations and reductions.

b2) G can be obtained from (G1vG2).mgu by the substitution sub', followed by a chain of permutations and reductions.

Thus, FvG can be obtained from (F1vF2).mgu v (G1vG2).mgu by the substitution sub', followed by a chain of permutations and reductions. Q.E.D.

...

To be continued.

Further reading:

Resolution by Giorgio Ingargiola

Rajjan Shinghal. Formal Concepts in Artificial Intelligence. Fundamentals. Chapman&Hall, 1992, 666 pp.

Handbook of Automated Reasoning, ed. by A.Robinson and A.Voronkov, Elsevier Science, 2001, vol. I, II
