Model Theory. Skolem's Paradox. Ramsey's Theorem.

model theory, Skolem paradox, Ramsey theorem, Loewenheim, categorical, Ramsey, Skolem, Gödel, completeness theorem, categoricity, Goedel, theorem, completeness, Godel

Back to title page.

Appendix 1 About Model Theory

Some widespread Platonist superstitions were derived from other important results of mathematical logic (omitted in the main text of this book): Goedel's completeness theorem for predicate calculus, Loewenheim-Skolem theorem, the categoricity theorem of second order Peano axioms. In this short Appendix I will discuss these results and their methodological consequences (or lack of them).

All these results have been obtained by means of the so-called model theory. This is a very specific approach to investigation of formal theories as mathematical objects. Model theory is using the full power of set theory. Its results and proofs can be formalized in ZFC. Model theory is investigation of formal theories in the metatheory ZFC.

The main structures of model theory are interpretations. Let L be the language of some (first order) formal theory containing constant letters c₁, ..., c_k, function letters f₁, ..., f_m, and predicate letters p₁, ..., p_n. An interpretation J of the language L consists of the following objects:

a) a non-empty set D_J - the domain of interpretation (it will serve as the range of variables),

b) a mapping int_J that assigns:

- with each constant letter c_i - a member int_J(c_i) of the domain D_J,

- with each function letter f_i - a function int_J(f_i) from D_Jx ... x D_Jinto D_J (of course, int_J(f_i) has the same number of arguments as f_i),

- with each predicate letter p_i - a predicate int_J(p_i) on D_J, i.e. a subset of D_Jx ... x D_J (of course, int_J(p_i) has the same number of arguments as p_i).

As an example, let us consider the so-called standard interpretation S of Peano arithmetic PA:

a) The domain is D_S = {0, 1, 2, ...} (the set w in terms of ZF).

b) The mapping int_S assigns: with the constant 0 - the number 0 (the empty set o), with the constant 1 - the number 1 (the set {o}), with the function letter "+" - the function x+y (addition of natural numbers), with the function letter "*" - the function x*y (multiplication of natural numbers), with the predicate letter "=" - the predicate x=y (equality of natural numbers).

Having an interpretation J of the language L, we can define the notion of true formula (more precisely - the notion of formulas that are true under the interpretation J).

As the first step, terms of the language L are interpreted as members of D_Jor functions over D_J. Terms are defined as constant letters, or variable letters, or their combinations by means of function letters. The term c_i is interpreted as the member int_J(c_i) of D_J. The variable x_i is interpreted as the function X_i(x_i) = x_i. And, if t = f_i(t₁, ..., t_q), then int_J(t) is defined as the function obtained by substituting of functions int_J(t₁), ..., int_J(t_q) into the function int_J(f_i). For example, the standard interpretation of the term (x+y+1)*(x+y+1) is the function (x+y+1)².

As the next step, the notion of true atomic formulas is defined. Of course, if a formula contains variables (as, for example, the formula x=1), then its "truth value" must be defined for each combination of values of these variables. Thus, to obtain the truth value of the formula p_i(t₁, ..., t_q) for some fixed values of the variables contained in t₁, .., t_q, we must first "compute" the values of these terms, and then substitute these values into the predicate int_J(p_i).

Note. The equality letter "=" is always interpreted in standard way - as the equality of members of D_J.

And finally, we can "define" the notion of true compound formulas of the language L under the interpretation J (of course, for a fixed combination of values of their free variables):

a) Truth-values of the formulas ~B, B&C, BvC and B->C can be computed from the truth values of B and C.

b) The formula AxB(x) is true, iff B(c) is true for all members c of the domain D_J.

c) The formula ExB(x) is true, iff there is a member c of the domain D_Jsuch that B(c) is true.

Note that for an infinite domain D_J this notion of truth is extremely non-constructive: to establish, for example, truth-value of the formula AxB(x), we must check truth of B(c) for infinitely many values of c. The "degree of constructivity" of the formulas like AxEyC(x,y), AxEyAzD(x,y,z) etc. is even less... (compare my "critique" of the notion of true arithmetical formula in Section 3.1).

Let us say that a formula of the language L is true under the interpretation J, iff this formula is true for all combinations of values of its free variables.

Some formulas are true for all interpretations, for example:

(B->C) & (C->D) -> (B->D),

Ax(C->D(x)) -> (C->AxD(x)),

where C does not contain x. Such formulas are called logically valid (because they are true independently of the interpretation of their "meaning"). Note that the notion of logically valid formula is doubly non-constructive in the sense that the universal quantifier "for all interpretations" is added to the (already) non-constructive definition of (simply) true formula.

You could check easily that: a) all the logical axioms (see Appendix 3) are logically valid, b) the logical inference rules (see Appendix 3) allow to prove (from logically valid formulas) only logically valid formulas. Hence, in this way only logically valid formulas can be proved. Still, is our list of logical axioms complete in the sense that all logically valid formulas can be proved? The answer is "yes" - as Kurt Goedel established in 1929 (i.e. just a year BEFORE...):

K.Goedel. Die Vollstaendigkeit der Axiome des logischen Funktionenkalkuels. "Monatshefte fuer Mathematik und Physik", 1930, Vol.37, pp.349-360.

Goedel's Completeness Theorem. A formula (in any first order language) is logically valid, iff it can be proved by using the logical axioms and inference rules.

Unexpectedly, this theorem is an easy corollary of the so-called Model Existence theorem.

If T is a formal theory, and J is an interpretation of its language, then (traditionally) J is called a model of T, iff J makes true all axioms (and hence, all theorems) of T. Such notion of model seems somewhat strange: in "normal" branches of science theories serve as basis for building models of natural phenomena or technical devices, yet in mathematical logic we have turned this approach upside down.

Model Existence Theorem. If a (first order) formal theory is consistent (in the sense that it does not contain contradictions), then it has a finite or countable model (i.e. a model in which the domain is finite or countable).

This theorem solved a serious mental problem of anti-formalists. They thought that mere consistency of a theory (in the syntactic sense of the word - as the lack of contradictions) is not sufficient to regard theory as "meaningful". Model Existence theorem says that (syntactic) consistency of a theory is sufficient to regard it as "meaningful": if a theory does not contain contradictions, then it describes at least some kind of "mathematical reality".

See Mendelson [1997] for an elegant proof of Model Existence theorem (due to L.Henkin and G.Hasenjaeger, see also http://theory.lcs.mit.edu/~dmjones/hbp/bsyml/Authors/henkinleon.html). Or do the Exercise A.1.1 below.

Proof of the Completeness theorem. Of course, the only non-trivial part of the work is proving that each logically valid formula in the (first order) language L can be proved by using the logical axioms and inference rules (of the Appendix 3).

Let us assume that some formula F in the language L is logically valid, yet it cannot be proved by using our axioms and rules. Let us consider the theory T in the language L which has (besides the logical axioms) only one specific axiom - the formula ~F. Since F cannot be derived from logical axioms, T is a consistent theory. Hence, by Model Existence theorem, T has a model, i.e. an interpretation J that makes all its axioms true. Under this interpretation the formula ~F (as an axiom of T) is true. On the other hand, since F is logically valid, it is true under all interpretations, i.e. it is true also under the interpretation J. Hence, formulas F and ~F both are true under J. This is impossible, hence, F must be provable from logical axioms. Q.E.D.

Such a simple proof seems almost impossible! We are proving that the logical axioms and rules of inference are strong enough, but where come these axioms in? They come in - in the proof of Model Existence theorem: this theorem says that if some formal theory T does not have models, then the logical axioms and rules of inference are strong enough to derive a contradiction from the axioms of T.

Corollary. In any first order language the set of all logically valid formulas is effectively enumerable. I.e. given a language L, we can write a computer program that (working ad infinitum) prints out all logically valid formulas of L.

This makes Goedel's completeness theorem very significant: it shows that the "doubly non-constructive" notion of logically valid formula is at least 50% constructive!

Still, unfortunately, this notion is not 100% constructive. In 1936, Alonzo Church proved that at least some first order languages do not have algorithm determining, is a given formula logically valid or not (i.e. an algorithm solving the famous Entscheidungsproblem - decision problem):

A.Church. A note on the Entscheidungsproblem. "Journal of Symb. Logic", 1936, vol.1, pp.40-41

After this, L.Kalmar proved that none of serious first order languages can have such an algorithm (languages of PA and ZF included):

L.Kalmar. Die Zurueckfuehrung des Entscheidungsproblems auf den Fall von Formeln mit einer einzigen, binaeren Funktionsvariablen. "Compositio Math.", 1936, Vol.4, pp.137-144

For details, see Mendelson [1997].

Initially, Model Existence theorem was proved in a weaker form in 1915 (by Leopold Loewenheim) and 1919 (by Thoralf Skolem): if a first order theory has a model, then it has a finite or countable model (the famous Loewenheim-Skolem theorem). Proof (1929): if T has a model, then T is consistent, i.e. T has a finite or countable model.

L.Loewenheim. Ueber Moeglichkeiten im Relativkalkuel. "Mathematische Annalen", 1915, Vol.76, pp.447-470 (see also http://thoralf2.uwaterloo.ca/htdocs/scav/lowen/lowen.html)

T.Skolem. Logisch-kombinatorische Untersuchungen ueber Erfuellbarkeit oder Beweisbarkeit mathematischer Saetze nebst einem Theoreme ueber dichte Mengen. "Skrifter utgit av Videnskapsselskapet in Kristiania, I, Mat.-Nat. Kl.", 1919, N4, pp.1-36 (see also http://thoralf2.uwaterloo.ca/htdocs/scav/skolem/skolem.html)

Model Existence theorem is steadily provoking the so-called Skolem's paradox. Indeed, in ZF we can prove existence of uncountable sets. Still, according to Model Existence theorem, if ZF is consistent, then there is a countable model of ZF. I.e. ZF proves existence of uncountable sets, yet it has a countable model! Is this possible? Does it mean that ZF is inconsistent? Platonists could say even more: any consistent axiomatic set theory has countable models, hence, no axiom system can represent the "intended" set theory (i.e. "the" Platonist "world of sets") adequately.

For a formalist, Skolem's paradox is not a paradox at all. I would rather call it Skolem's effect - like as photo-effect, it is simply a striking phenomenon. Indeed, let J be a countable model of ZF. In ZF we can prove that the set r of all real numbers is uncountable, i.e.

~Ef (f is 1-1 function from r into w), -----------(1)

where w is the set of all natural numbers. What is the meaning of this theorem in the countable model J? Interpretations of r and w are subsets of the domain D_J, i.e. they both are countable sets, i.e.

Ef (f is 1-1 function from r_J into w_J). ----------(2)

Interpretation of (1) in J is

~Ef((f in D_J) & (f is 1-1 function from r_J into w_J)).

Hence, the mapping f of (2) does exist, yet it exists outside the model J!. Do you think that f of (2) "must" be located in the model? Why? If you are living (as an "internal observer") within the model J, the set r_J seems uncountable to you (because you cannot find a 1-1 function from r_J into w_J in your world J). Still, for me (an an "external observer") your uncountable r_J is countable - in my world I have a 1-1 function from r_J into w_J!

Hence, indeed, Skolem's paradox represents simply a striking phenomenon. It is worth of knowing, yet there is no danger in it.

Another Platonist superstition is connected with the so-called categoricity theorem of second order Peano axioms. By second order Peano axioms I mean the initial variant of axioms of arithmetic proposed by R.Dedekind. Modern version of this system is represented in the axioms P1, P2 and P3 of Section 3.1. The notion of models for these axioms can be discussed comfortably within ZF as a metatheory. Namely, any such model (according to our general definition above in this Appendix) is a triple (v, q, s), where v is a set (its members represent "natural numbers" of the model), q is a member of v (it represents the number 0), and s(x) is a function from v into v (it represents the function x+1). The triple (v, q, s) is a model of Peano axioms P1, P2, P3, iff

P1: ~(s(x)=q) for all x in v.

P2: If ~(x=y), then ~(s(x)=s(y)) for all x, y in v.

P3: If u is a subset of v, then

(q in u) & Ay(y in u -> s(y) in u) -> u=v.

Of course, the set w of all "set-theoretical" natural numbers (see Section 2.3), together with the empty set (representing the number 0) and the function xU{x} (representing s(x)) is a model of Peano axioms. This model is called traditionally the standard model of arithmetic. Let us say, that some other model (v, q, s) is isomorphic with the standard model, iff there is a 1-1 function f from w onto v ("onto v" means that range(f)=v) such that:

a) f(o) = q.

b) f(nU{n}) = s(f(n)) for all n in w.

The following theorem can be proved in ZF:

Categoricity Theorem. Any model of second order Peano axioms is isomorphic with the standard model.

This theorem was first proved by R.Dedekind (the author of Peano axioms, see Section 3.1) in his remarkable book:

R. Dedekind. Was sind und was sollen die Zahlen. Braunschweig, 1888 (see also http://thoralf2.uwaterloo.ca/htdocs/scav/dedek/dedek.html)

Proof. Assuming that the axioms P1, P2, P3 are true in the model (v, q, s), let us define by recursion the following function from w into v:

f(o) = q, f({o}) = s(q), f({o,{o}}) = s(s(q)), ..., f(nU{n}) = s(f(n)), ...

Let us prove that f is the required isomorphism.

a) f(o) = q by definition.

b) f(nU{n}) = s(f(n)) for all n in w - also by definition.

c) Let us show that range(f)=v. Of course, q is in range(f), and if some x is in range(f) (i.e. x=f(n) for some n), then s(x)=s(f(n))=f(nU{n}), i.e. s(x) also is in range(f). Hence, by P3 (this axiom is true in the model (v, q, s)) we obtain that range(f)=v.

d) Let us show that f is 1-1 function, i.e. let us prove that, if f(m)=f(n), then m=n. We must consider three cases:

d1) m=n=o. Q.E.D.

d2) m=o, n>o. Then f(m)=q, but f(n)=s(f(n-1)), i.e. f(n) is not q by the axiom P1. Q.E.D.

d3) m>o, n>o. Then f(m)=s(f(m-1))=f(n)=s(f(n-1)), and by the axiom P2 we obtain that f(m-1)=f(n-1). Let us repeat this argument enough times, and we will have the case d1) or d2 at the end. Q.E.D.

Q.E.D.

Thus, it seems that the second order Peano axioms contain an "unambiguous definition" of the structure of their models. For this reason, sometimes, Categoricity theorem is considered as an additional evidence in favor of the Platonist opinion that natural numbers exist as a specific "world" where each definite assertion "must be" either true or false. Still, note that Categoricity theorem is a theorem of ZF. How could a theorem of ZF have "super-natural" consequences?

Exercise A.1.1. (for smart students). Prove Model Existence theorem by using the following smart ideas (see Mendelson [1997]). Let T be a consistent theory. We must build a model of T, what kind of "bricks" should we use for this "building"? Idea #1: let us use language constant letters! So, let us add to the language of T an infinite set of new constant letters b₁, b₂, b₃, ... (and modify the logical axioms accordingly). Prove that this extended theory T₀ is consistent. The model we are building must contain all "objects" whose existence can be proved in T₀. Idea #2: for each formula F of T₀ having exactly one free variable (for example, x) let us add to the theory T₀ an axiom ExF(x)->F(b_i), where the constant b_i is unique for each F. If T₀ proves ExF(x), then this b_i will represent in our model the "object" x having the property F. Prove that this extended theory T₁ is consistent. Idea #3: prove the (non-constructive) Lindenbaum's lemma: any consistent theory has a consistent complete extension (the axiom set of the extension may not be effectively solvable). After this, extend T₁ to a consistent complete theory T₂. Idea #4: let us take as the domain of the interpretation M the set of all those terms of T₀ that do not contain variables. And let us interpret a function letter f as the "syntactic constructor function" f', i.e. let define the value f'(t₁, ..., t_n) simply as the character string "f(t₁, ..., t_n)". Finally, let us interpret a predicate letter p as the relation p' such that p'(t₁, ..., t_n) is true in M, iff T₂ proves p'(t₁, ..., t_n). To complete the proof, prove that an arbitrary formula G is true in M, iff T₂ proves G. Hence, all theorems of the initial theory T are true in M.

(Adolf Lindenbaum - logico, collaboratore e amico di Alfred Tarski - according to Famous mathematicians.)

Appendix 2 Around Ramsey's Theorem

The attitude of many working mathematicians to Goedel's incompleteness theorem is generally indifferent. Some methodological basis for such a position is given in the following quote from Parikh [1971]:

"... Thus exponentiation is not only a means for denoting "large numbers" but also the means for introducing "nonmathematical" questions into number theory. Why do we say "nonmathematical"? Because consider Goedel's formula A which says "I am not provable". Now this formula does express properties of N, since it can be written with quantifiers and connectives. However to see that A is true but not provable, we do not use properties of N, but properties of the intuitive notion "provable". Thus to say that A is a statement about numbers is like arguing that human behaviour is a problem of physics since human beings are physical entities. Even if such an assertion is true, it is very theoretical and not very useful."

Since human beings are physical entities, I do not believe that there are ghosts, I think that astrology is nonsense, etc. Hence, for me, this assertion is very practical and very useful.

So is Goedel's incompleteness theorem. For me, this theorem predicts that a fundamental mathematical theory cannot be perfect: while developing any such theory we will inevitably come either into contradictions, or into unsolvable problems. Is such a prediction practical? Some 30 years after Goedel's proof, in 1963 P.Cohen proved that if ZFC is a consistent theory, then this theory is not able to solve Cantor's continuum problem. If you prefer calling Goedel's theorem "very theoretical", then Cohen's result must be acknowledged as its "empirical confirmation". Since 1963 many classical problems of set theory were proved to be unsolvable, so we can speak about "massive empirical confirmation" of Goedel's theoretical prediction. Can we exclude that one day also some classical problems of number theory (for example, the twin prime conjecture) will be proved unsolvable?

Perhaps, the first steps in this direction were made before 1977 by Laurence Kirby, Jeff Paris and Leo Harrington. They proved that an extension of the so-called Finite Ramsey's theorem (a statement of discrete combinatorics that can be proved in set theory) cannot be proved in first order arithmetic (PA). Thus, for the first time, it was established that a relatively interesting assertion about natural numbers is unprovable when we are using the elementary (i.e. first order) notion of natural numbers. Still, using the extended "post-Cantor" (i.e. second order) notion of natural numbers we can prove this assertion.

Infinite Ramsey's Theorem

Frank P. Ramsey published this theorem in 1930:

F.P.Ramsey. On a problem of formal logic. "Proc. London Math. Soc.", 1930, vol.30, pp.264-285

Infinite Ramsey's theorem. Suppose, M is an infinite set, and e, r are positive integers. Imagine that each e-member subset of M is marked by one of r colors. Then there is an infinite subset of M such that all its e-member subsets are marked by the same color.

Proof (in ZFC, i.e. by using the axiom of choice, see Graham [1981]).

1. For e=1 the proof is obvious. Indeed, if M is an infinite set, and each member of it is marked by one of r colors, then at least one of the colors is assigned to an infinite subset of members. Q.E.D.

2. For e=2 the proof is straightforward. Here, M is an infinite set, and each pair of its members {a, b} is marked by one of r colors. Let us select a member b₀ of M, and consider all pairs {b₀, b} where b is in M-{b₀}. There is a color c₀ such that the set

M₁ = { b in M | {b₀, b} is marked by c₀}

is infinite. As the next step, let us a select member b₁ of M₁, and consider all pairs {b₁, b} where b is in M₁-{b₁}. There is a color c₁such that the set

M₂ = { b in M₁-{b₁} | {b₁, b} is marked by c₁}

is infinite. Etc.

As the result, we obtain three infinite sequences:

- the sequence of members of M: b₀, b₁, b₂, ...,

- the sequence of colors: c₀, c₁, c₂, …,

- the sequence of subsets: M = M₀ > M₁ > M₂ > ...,

where each b_i is in M_i- M_i+1, and all pairs {b_i, b} with b in M_i+1 are marked by the color c_i.

One of the colors (let us denote it by c) occurs an infinite number of times in this sequence: c = c_i for i = i₀, i₁, i₂, .... The set of corresponding members:

H = { b_i | i = i₀, i₁, i₂, ...}

is an infinite subset of M, and all pairs {a, b} with a, b in H are marked by the same color c. Indeed, if a = b_i and b = b_j, where i = i_k, j = i_m, and k<m, then a is in M_i- M_i+1, and b is in M_i+1, hence, {a, b} is marked by the color c_i, i.e. by c. Q.E.D.

3) For e>=3 the proof is by induction, i.e. by using the following

Lemma. Suppose, M is an infinite set, and e, r are positive integers. Imagine that each e-member subset of M is marked by one of r colors. If the Infinite Ramsey's theorem is true for e-1, then for each infinite subset M' of M and each member b' of M' there is an infinite subset H' of M' - {b'} such that all e-member sets consisting of b' and e-1 members of H' are marked by the same color.

Proof of the Lemma. Let us derive from the "r-coloring" of e-member subsets of M the following "r-coloring" of (e-1)-member subsets of M' - {b'}:

Mark {x₁, ..., x_e-1} by the color c, iff {b', x₁, ..., x_e-1} is marked by the color c.

From the Infinite Ramsey's theorem for e-1 we obtain an infinite subset H' of M' - {b'} such that all its (e-1)-member subsets are marked by the same color. Now add b' to each of these subsets. Q.E.D.

Having this Lemma we can derive Ramsey's theorem for e from Ramsey's theorem for e-1 by repeating the argument we used for e=2. Indeed, our Lemma allows obtaining an infinite subset M_i+1 of M_i - {b_i} such that all e-member subsets of M_i+1U{b_i} containing b_i are marked by the same color (denoted by c_i). Q.E.D.

Our formulation and proof of the Infinite Ramsey's theorem belong to the set theory ZFC. The language of the first order arithmetic (PA) does not allow discussing arbitrary infinite sets, i.e. in PA we cannot even formulate this version of Ramsey's theorem.

Open problem? Can the Infinite (or, at least, the Countable) Ramsey's theorem be proved in ZF, i.e. without the (countable?) axiom of choice (or DC)? Maybe, Ramsey's theorem is itself a kind of "axiom of choice"?

Finite Ramsey's Theorem

The following finite version of Ramsey's theorem can be both formulated and proved in PA. Having an infinite set M we were searching for an infinite "single color" subset H. Now, dealing with finite sets M we are interested in the following question: how large must be the set M to have "single color" subsets with at least k members?

Let us denote by |M| the cardinality (i.e. number of members) of M.

Finite Ramsey's theorem. There is a computable function R(e, r, k) such that for all positive integers e, r, k and each finite set M the following holds: if |M|>=R(e, r, k), and each e-member subset of M is marked by one of r colors, then there is a subset H of M such that |H|=k, and all e-member subsets of H are marked by the same color.

Proof (in PA). 1) For r=1 the proof is obvious: we can take R(e, 1, k) = k.

2) Now let us consider the case r=2, when e-member subsets of M are marked by two colors. Surprisingly, the following generalization of the theorem is easier to prove: there is a computable function R'(e, r, k₁, k₂) such that if |M|>=R'(e, 2, k₁, k₂), then there is either a subset H₁ of M such that |H₁|=k₁, and all e-member subsets of H₁ are marked by the first color, or a subset H₂ such that |H₂|=k₂, and all e-member subsets of H₂ are marked by the second color.

The proof is by induction from e-1, (e, k₁-1, k₂), and (e, k₁, k₂ -1) to (e, k₁, k₂).

Induction base. For e=1 we can take R'(1, 2, k₁, k₂) = k₁+k₂. Indeed, in this case the-members of M themselves are marked by using two colors.

For the minimum k₁, i.e. k₁=e we can take R'(e, 2, e, k₂) = k₂ (where k₂>=e). Indeed, if |M|>=k₂, and there is an e-member subset x marked by the first color, then we can take H₁ = x. If there are no such subsets, then all e-member subsets of M are marked by the second color, and we can take H₂ = M.

For the minimum k₂, i.e. k₂=e we can take R'(e, 2, k₁, e) = k₁. The proof is identical.

Induction step. Let k₁, k₂ >= e. Let us show that we can take

R'(e, 2, k₁, k₂) = 1+R'(e-1, 2, R'(e, 2, k₁-1, k₂), R'(e, 2, k₁, k₂-1)).

Indeed, suppose |M| >= R'(e, 2, k₁, k₂), select a member b of M, and consider all e-member subsets of M that contain b: {b, x₁, ..., x_e-1}. Each of these subsets is marked either by the first, or by the second color. Let us define the following "2-coloring" of (e-1)-member subsets of M -{b} (i = 1, 2):

{x₁, ..., x_e-1} is marked by i-th color, iff {b, x₁, ..., x_e-1} is marked by i-th color.

Since |M - {b}| >= R'(e-1, 2, T₁, T₂), where T₁ = R'(e, 2, k₁-1, k₂), and T₂ = R'(e, 2, k₁, k₂-1), then by our modified theorem for e-1 we obtain a subset M' of M - {b} such that:

a) Either |M'| = T₁ and all (e-1)-member subsets of M' are marked by the first color.

b) Or |M'| = T₂ and all (e-1)-member subsets of M' are marked by the second color.

In the case a), for all subsets {x₁, ..., x_e-1} of M' the e-member set {b, x₁, ..., x_e-1} is marked by the first color. Since |M'| = T1 = R'(e, 2, k₁-1, k₂), by our modified theorem for (e, k₁-1, k₂) we obtain a subset H' of M' such that:

a1) Either |H'| = k₁-1 and all e-member subsets of H' are marked by the first color. Then for the case (e, k₁, k₂) we can take H = H'U{b}.

a2) Or |H'| = k₂ and all e-member subsets of H' are marked by the second color. Then for the case (e, k₁, k₂) we can take H = H'.

The proof for the case b) is similar.

To complete the case r=2 of the Finite Ramsey's theorem we can take

R(e, 2, k) = R'(e, 2, k, k).

Q.E.D. for r=2.

3) The case r>2 we will prove by induction, namely by showing that we can take

R(e, r, k) = R(e, 2, R(e, r-1, k)).

Indeed, assume that |M| >= R(e, r, k) and all e-member subsets of M are marked by using r colors. To reduce the situation to the case r=2 let us "merge" the second and all the following colors (i.e. except the first one). Then, by the case r=2 we obtain a subset M' of M such that |M'| = R(e, r-1, k) and:

a) Either all e-member subsets of M' are marked by the first color.

b) Or all e-member subsets of M' are marked by the second (i.e. "merged") color.

In the case a), since R(e, r-1, k) >=k, we obtain immediately a subset H of M' such that |H| = k and all e-member subsets of H are marked by the first color. Q.E.D.

In the case b) we have an "(r-1)-coloring" of all e-member subsets of M'. Since |M'| = R(e, r-1, k), we have the case r-1 of the Finite Ramsey's theorem that is supposed to be true, i.e. there is a subset H of M' such that |H| = k and all e-member subsets of H are marked by the same color. Q.E.D.

Q.E.D. for the entire Finite Ramsey's theorem.

It may seem that the Finite Ramsey's theorem is discussing arbitrary finite sets, not natural numbers. Still, since this theorem does not involve properties of members of finite sets, we can simply replace these-members by natural numbers. Each finite set of natural numbers {n₁, ..., n_k} we can represent by two numbers (b, c) (we could use, for example, Goedel's beta-function, see Section 3.3):

beta(b, c, 0) = k, beta(b, c, 1) = n₁, ..., beta(b, c, k) = n_k.

In this way the Finite Ramsey's theorem can be formulated in the language of PA, and our very elementary proof of it - converted into a formal proof in PA.

Extended Finite Ramsey's Theorem

Now we have two extreme versions of Ramsey's theorem:

a) The infinite version that can be neither formulated, nor proved in PA, yet it can be both formulated and proved in ZFC.

b) The finite version that can be both formulated and proved in PA.

In 1977 an intermediate version of Ramsey's theorem was discovered (= invented) that can be formulated in PA, and can be proved in ZFC, yet not in PA (if PA is consistent).

L.Kirby, J.Paris. Initial segments of models of Peano's axioms. "Proceedings of the Bierutowice Conference 1976", Springer, Berlin, 1979

J.Paris. Independence results for Peano arithmetic. "J. Symbolic Logic", 1978, vol.43, N4, pp.725-731

J.Paris, L.Harrington. A mathematical incompleteness in Peano arithmetic. In Barwise [1977].

The authors are telling their story in the third of these papers:

"The first examples of strictly mathematical statements about natural numbers which are true but not provable in PA (Peano arithmetic) were due to the first author (see Paris [to appear], and grew out of the work in Paris and Kirby [to appear]. The second author's contribution was to show that Paris's proof could be carried through with the particularly simple extension of the Finite Ramsey Theorem..."

If the finite sets discussed in the Finite Ramsey's theorem would come from some fixed countable "universe" (for example, if we decided to consider only sets of natural numbers), then we could not restrict ourselves to counting of members of these sets. And we could consider also some other properties of them.

For example, let us call a property g of finite sets of some "universe" U a dense property, iff:

a) If a finite set H₁ possess the property g, and H₁ is a subset of a finite set H₂, then H₂ also possess the property g.

b) For each infinite set H₂ there is a finite subset H₁ that possess the property g (this is the "density" of g).

A simple example of a dense property of sets of natural numbers is the so-called property of being "relatively large": a set H of natural numbers is called relatively large, iff min(H) <= |H|.

Indeed:

a) If min(H₁)<=|H₁|, and H₁ is a subset of H₂, then min(H₂)<=min(H₁)<= |H₁|<=|H₂|, i.e. min(H₂)<=|H₂|.

b) If H₂ is an infinite set of natural numbers, take as H₁ the set of min(H₂) least members of H₂. Then min(H₁)=min(H₂)=|H₁|, i.e. H₁ is relatively large.

This is an example of a computable dense property: having all members of some finite set, we can effectively decide, possess this set the property or not.

Extended Finite Ramsey's theorem. Let us consider only sets of natural numbers. For each computable dense property g there is a computable function R_g(e, r, k) such that for all positive integers e, r, k and each finite set M the following holds: if |M|>=R_g(e, r, k), and each e-member subset of M is marked by one of r colors, then there is a subset H of M such that |H|>=k, H possess the property g, and all e-member subsets of H are marked by the same color.

This theorem differs from the Finite Ramsey's theorem only "a little" - additionally, the set H possess the property g.

Proof - part1 (in PA). Since properties of members of the set M do not affect the problem to be solved, we can replace M, for example, by the set of natural numbers {0, 1, ..., |M|-1}, i.e. in terms of Section 2.3 we can say that M is a natural number.

For M=e there is only one e-member subset {0, 1, ..., e-1}. The number of possible "r-colorings" of this subset is r.

Suppose, we have a pair (M, C), where C is some r-coloring of e-member subsets of M. Of course, for each triple (e, r, M) there is only a finite number of different r-colorings of e-member subsets of M. Hence, if we proceed from M to M+1, where

M+1 = {0, 1, ..., M-1, M} = M U {M},

then from the r-coloring C we can obtain only a finite number of r-colorings (of e-member subsets) of M+1 that are extensions of C. (Some coloring C' of M+1 is called an extension of C, iff each e-member subset of M is marked in C' by the same color as it is marked in C.)

|-------(e, P₀')------ ...

|-------(e, P₀'')------ ...

|------- ...

|-------(e, P₀⁽ⁱ⁾)------ ... --------(M, P)-------|-------(M+1, P^(j)) -----------

|------- ...

|-------(e, P₀^(r))------ ...

If we proceed in this way from e to e+1, after this - to e+2, e+3 etc., then we obtain an infinite tree of pairs (M, C) having at each of its nodes only a finite number of branches. Indeed, let us start from a fictive empty node O. At the next level (M=e) we have r branches to r different r-colorings of the only e-member subset of e. Etc., from each node (M, C), where C is an r-coloring of e-member subsets of M, a finite number of branches is starting to all the possible nodes (M+1. C') such that C' is an r-coloring of e-member subsets of M+1 that extends C.

Of course, (for fixed e and r) this tree contains all the possible pairs (M, C), where C is an r-coloring of e-member subsets of M, and it defines some natural ordering of them.

Let us say that (M, C) is a "good" node, if there is a subset H of M such that |H| >=k, H possess the property g, and all e-member subsets of H are marked (in the coloring C) by the same color.

Exercise A.2.1. Verify that if (M, C) is "good", and there is a branch from (M, C) to (M+1, P'), then (M+1, P') also is "good". I.e. if some node is "good", then then the entire subtree of it is "good".

Since each level of the tree contains only a finite number of nodes, the Extended Finite Ramsay's theorem would be proved, if we could prove that there is only a finite number of "bad" nodes. Indeed, then we could produce the following algorithm computing the function R_g(e, r, k). Let us scan all levels of the tree one by one, determining for each node, is it "good" or "bad". If there is only a finite number of "bad" nodes, then at some level all nodes will be "good". Let us take the level number M for the value of R_g(e, r, k).

Exercise A.2.2. Describe an algorithm determining, is a given tree node "good" or "bad". How much time is required to solve this task?

Of course, we do not need set theory to define the above algorithm. Indeed, let us repeat its definition once more:

Input: numbers e, r, k. Build the corresponding (M, C)-tree. Scan all levels of this tree one by one, determining for each node, is it "good" or "bad". If at some level all nodes are "good", take the level number M and output it as the value R_g(e, r, k).

Hence, no problem to write a computer program that takes numbers e, r, k as input, and either calculates the number R_g(e, r, k) as output, or ... does not halt (if there are no tree levels with "good" nodes only).

Surprisingly, we need set theory to prove that this program halts for all triples (e, r, k).

Proof - part2 (in ZFC). Let us assume the opposite - that there is an infinite number of "bad" nodes. If there is a branch from (M, C) to (M+1, C'), and the node (M+1, C') is "bad", then (M, C) also is "bad". Hence, the substructure of "bad" nodes in our tree is itself a tree - a finitely branching infinite tree.

Exercise A.2.3. Prove the following version of the so-called Koenig's lemma: if a finitely branching tree has infinite set of nodes, then this tree contains an infinite branch.

Hence, our tree contains an infinite branch B consisting of "bad" nodes only. This branch defines a single r-coloring C'' of all e-member sets of natural numbers. Indeed, if {x₁, ..., x_e} is a set of natural numbers, then take M = max{x₁, ..., x_e}, consider the node (M, C_M) of the branch B, and mark the set {x₁, ..., x_e} (in the coloring C'') by the color it is marked in the coloring C_M. This definition of C'' is "stable" in the sense that on the branch B the coloring C_M is the first one that assigns a color to the set {x₁, ..., x_e}, and all the following colorings C_M+1, C_M+2, ... cannot change this color, since they all are extensions of C_M. I.e. C'' is an extension of C_M for all M.

Let us apply the Infinite Ramsey's theorem to the set N of all natural numbers and the r-coloring C''. I.e. there is an infinite subset H'' of N such that all e-member subsets of H'' are marked by the same color. Since g is a dense property, there is a finite subset H of H'' that possess the property g. Let us add to H enough members of H'' to ensure that |H| >= k. This extended set also possess the property g. If we take M = max(H)+1, then H appears to be a subset of M such that: |H| >= k, H possess the property g, and all e-member subsets of H are marked by the same color in the coloring C_M. Hence, (M, P_M) is a "good" node - on the branch B consisting of "bad" nodes only!

I.e. our tree always contains only a finite number of "bad" nodes. And hence, our algorithm computing the function R_g(e, r, k) halts for all triples (e, r, k). Q.E.D.

Extended Finite Ramsey's theorem cannot be proved in PA

For any computable dense property g (such as min(H) <= |H|) the Extended Finite Ramsey's theorem can be formulated in PA. We have proved this theorem in ZFC, yet it cannot be proved in PA (if PA is consistent) even for any single property g.

Proof. See the above paper by Paris and Harrington.

Thus, since 1977 we know an example of a "strictly mathematical statement about natural numbers" that cannot be proved in first order arithmetic. And since 1977, some similar results were established: see Natural Independence Phenomenon in Eric Weisstein's World of Mathematics, and Goodstein's Amazing Sequence.

All this means that Greeks having only their first order notion of natural numbers could not prove the Extended Finite Ramsey's theorem and some other "strictly mathematical statements about natural numbers". These proofs became possible only in 1870s when Georg Cantor invented set theory. By introducing the notion of arbitrary infinite sets Cantor added new features also to the 2400 years old notion of natural numbers. Q.E.D.

Now let us return to the beginning of this Appendix where the problem of introducing "nonmathematical" questions into number theory was discussed. If you believe that formulas of first order arithmetic (PA) used in Goedel's proofs are not normal mathematical statements about natural numbers, then what would you say about the following theorem from the same famous paper by Paris and Harrington?

Traditionally, the so-called SIGMA1-formulas are defined as formulas of PA having the form Ex₁...Ex_nF(...), where F belongs to the class of the so-called "primitive recursive" formulas, and all quantifiers before F are existential. Still, as we have proved in Section 4, any such formula has a Diophantine representation

Ex₁..Ex_nEy₁...Ey_k P = 0

where P=0 is a Diophantine equation. Since our proof can be formalized in PA, we can define SIGMA1-formulas simply as Diophantine representations, i.e. as Diophantine equations preceeded by existential quantifiers.

The following statement can be formulated in the language of PA:

"For all SIGMA1-formulas F(x) having exactly one free variable x, if for each n PA proves F(n), then AxF(x)"

This statement is called the uniform SIGMA1 reflection principle for PA. This principle says that if PA proves all cases of some SIGMA1-formula F(x), then F(x) is true for all x, i.e. it asserts the soundness of PA in the area of SIGMA1-formulas. Of course, this principle can be proved in ZF by using the standard model of PA (see Appendix 1). Still, it cannot be proved in PA, moreover, it cannot be proved even in PA + Con(PA) (if this extended theory is consistent, see the chapter about incompleteness theorems in Barwise [1977]). Hence, the uniform SIGMA1 reflection principle for PA is a stronger hypothesis than the hypothesis "PA is consistent".

Is the uniform SIGMA1 reflection principle for PA (as a formula of first order arithmetic) an example of introducing "nonmathematical" questions into number theory? Of course, it is. Is the Extended Finite Ramsey's theorem an example of a "strictly mathematical statement about natural numbers"? Of course, it is. Still, one can prove the following

Theorem. It can be proved in PA, that the Extended Finite Ramsey's theorem (for any fixed property g) is equivalent to the uniform SIGMA1 reflection principle for PA!

Proof. See the above paper by Paris and Harrington.

A deadlock? Not for me. I find more interesting the conclusion that Extended Finite Ramsey's theorem cannot be proved not only in PA, but it cannot be proved also in PA+Con(PA).

The function R(e, r, k) from the Finite Ramsey's theorem is known as a very fast growing function (see Graham [1981]). Still, R_g(e, r, k) exceeds in this area any possible expectations. Namely, for any fixed dense property g the "diagonal" function R_g(k, k, k+1) is growing faster than any function f(k) such that

PA proves: AxEyF(x, y),

where formula F(x, y) represents f in PA (see Section 3.3). I.e. if you can prove in PA, that your algorithm for computing f(k) halts for all k, then f(k) as a function of k is growing slower than R_g(k, k, k+1).

Proof. See the above paper by Paris and Harrington.

Appendix 3. What Is Logic, Really?

Appendix 4. Descriptive Set Theory

model theory, Skolem paradox, Ramsey theorem, Loewenheim, categorical, Ramsey, Skolem, Gödel, completeness theorem, categoricity, Goedel, theorem, completeness, Godel

Back to title page.