\documentclass[10pt,a4paper,twoside]{article}
\usepackage[polutonikogreek,english]{babel}
%The following makes *everything* Greek!
%\usepackage{greek}
\renewcommand{\thetable}{\Roman{table}}
\usepackage{amsthm}
\swapnumbers
\input{../../math/abbreviations}
%\newcommand{\pref}[1]{(\ref{#1})}
\theoremstyle{definition}
\newtheorem{exercise}[theorem]{Exercise}
\theoremstyle{plain}
\newtheorem{Peano}[theorem]{Peano Axioms}
\newtheorem{Infinity}[theorem]{Axiom of Infinity}
\newtheorem{Choice}[theorem]{Axiom of Choice}
\input{../../math/format}
\newcommand{\todaysdate}{2004.2.20}
\title{Notes on Set-Theory}
\author{David Pierce}
\date{\todaysdate}
%\pagestyle{headings}
\pagestyle{myheadings}
\markboth{NOTES ON SET-THEORY}{\leftmark}
\newcommand{\sectbegin}{\S\ \thesection\quad }
% \usepackage{fancyheadings} \pagestyle{fancy}
% \lhead[\thepage]{\S~\thesection} \rhead[\S~\thesection]{\thepage}
% \chead[]{} \lfoot[]{} \rfoot[]{} \cfoot[]{}
% \addtolength{\voffset}{-2cm}
% \addtolength{\textheight}{4cm}
\newcommand{\Tur}[1]{\texttt{#1}} % for words in Turkish
\newcommand{\Eng}[1]{\textsl{#1}} %for words in English
%\newcommand{\Gk}[1]{\textsf{#1}}
\newcommand{\Lat}[1]{\textsc{#1}}
\newcommand{\Gk}[1]{\begin{greektext}{#1}\end{greektext}}
\newcommand{\lett}[1]{\textsf{#1}}
\renewcommand{\theenumi}{\fnsymbol{enumi}}
\renewcommand{\labelenumi}{\textnormal{(\theenumi)}}
\renewcommand{\theequation}{\fnsymbol{equation}}
\newcommand{\axz}{Axiom Z}
\newcommand{\axu}{Axiom U}
\newcommand{\axi}{Axiom I}
\begin{document}
\setcounter{section}{-1}
\maketitle\thispagestyle{empty}
\section{Introduction}\label{sect:intro}
\markright{\sectbegin Introduction}
\subsection{}
The book of Landau \cite{MR12:397m} that influences these notes
begins with two prefaces, one for the student and one for the
teacher. The first asks the student not to read the second.
Perhaps Landau hoped to \emph{induce} the student to read the Preface
for the Teacher, but not to worry about digesting its contents.
I have such a hope concerning \S~\ref{subsect:teacher} below.
An earlier version of these notes\footnote{I prepared the earlier
version for
the first-year course at METU called `Fundamentals of Mathematics'
(Math 111); but those notes contained much more than that course had
time for.} began immediately with a study of the natural numbers.
The set-theory in those notes was somewhat \emph{na\"\i ve}, that
is, non-axiomatic. Of the usual so-called Zermelo--Fraenkel
Axioms with Choice, the notes \emph{did} mention the Axioms of
Foundation, Infinity and Choice, but not (explicitly) the others.
The present notes \emph{do} give all of the axioms\footnote{In their
order of appearance here, they are: Extensionality
(p.~\pageref{ax:extensionality}), Pairing (p.~\pageref{ax:pairing}),
Comprehension (p.~\pageref{ax:comprehension}), Power-set
(p.~\pageref{ax:power-set}), Union (p.~\pageref{ax:union}),
Replacement (p.~\pageref{ax:replacement}), Infinity
(p.~\pageref{ax:infinity}), Choice (p.~\pageref{ax:choice}) and
Foundation (p.~\pageref{ax:foundation}).}
of $\zfc$.

What is a set?
First of all, a set is many things that can be considered as
one; it is a multitude
that is also a unity; it is something like a \tech{number}\footnote{I
may set technical terms in a slanted font thus, by way of
acknowledging that they \emph{are} technical terms.}.
Therefore, set-theory might be an
appropriate part of the education of the guardians of an ideal
city---namely, the city that
Plato's Socrates describes in the \emph{Republic}.

The following translation from Book VII (524d--525b) is mine, but
depends on the translations of Shorey \cite{Shorey} and Waterfield
\cite{Waterfield}. I have inserted some of the original Greek words,
especially\footnote{I have also included certain derivatives of the
present participle \Gk{>'ont-} corresponding to the English
\Eng{being}. Addition of the abstract-noun suffix \Gk{-'ia} yields
\Gk{o>us'ia}; the corresponding Turkish might be \Tur{olurluk}. The
Greek \Gk{o>us'ia} is sometimes translated as \Eng{substance}, and
indeed both words can connote wealth.
Putting the definite article in front of the nominative neuter form
of \Gk{>'ont-} creates \Gk{t`o >'on}.}
those that are origins of English words. (See Table \ref{table:Greek}
below for transliterations.)
\begin{table}[b]
\caption{The Greek alphabet}\label{table:Greek}
\begin{center}
\begin{tabular}{| c l | c l | c l | c l |} \hline
\Gk{A a} & \textbf alpha & \Gk{H h} & \textbf{\=e}ta & \Gk{N n} & \textbf nu
& \Gk{T t} & \textbf tau \\
\Gk{B b} & \textbf beta & \Gk{J j} & \textbf{th}eta & \Gk{X x} & \textbf xi
& \Gk{U u} & \textbf upsilon \\
\Gk{G g} & \textbf gamma & \Gk{I i} & \textbf iota & \Gk{O o} & \textbf
omicron& \Gk{F f} & \textbf{ph}i\\
\Gk{D d} & \textbf delta & \Gk{K k} & \textbf kappa & \Gk{P p} & \textbf pi
& \Gk{Q q} & \textbf{ch}i\\
\Gk{E e} & \textbf epsilon& \Gk{L l} & \textbf lambda& \Gk{R r} & \textbf{r}ho
& \Gk{Y y} & \textbf {ps}i\\
\Gk{Z z} & \textbf zeta & \Gk{M m} & \textbf mu & \Gk{S sv/s} & \textbf
sigma & \Gk{W w} & \textbf{\=o}mega\\ \hline
\end{tabular}
\end{center}
The first letter or two of the (Latin) name provides a transliteration
for the Greek letter. In texts, the rough-breathing mark \Gk{<} over
an initial
vowel (or \Gk r) corresponds to a preceding \lett h; the
smooth-breathing mark \Gk{>} and the three tonal accents can be
ignored.
\end{table}
\begin{quotation}
`So this is what I [Socrates] was just trying to explain: Some things
are thought-provoking, and some are not. Those things are called
\defn{thought-provoking} that strike our sense together with their
opposites. Those that do not, do not
tend to awaken reflection.'

`Ah, now I understand' he [Glaucon] said. `It seems that way to me,
too.'

`Okay then. Which of these do \emph{multiplicity}
(\Gk{>arijm'oc}) and \emph{unity} (\Gk{t`o <'en}) seem to be?'

`I can't imagine' he said.

`Well,' I said `reason it out from
what we said. If unity is fully grasped alone, in itself, by sight
or some other sense, then it must be [an object] like a finger, as we
were explaining: it does not draw us towards \emph{being-ness}
(\Gk{o>us'ia}). But if some discrepancy is always seen with it, so as
to appear no more \emph{one} (\Gk{<'en}) than its opposite, then a
decision is needed---indeed, the \emph{soul} (\Gk{yuq'h}) in itself is
compelled to be puzzled, and to cast about, arousing thought within
itself, and to ask:
What then is unity as such? And so the \emph{study}
(\Gk{m'ajhsic}) of unity must be among those that lead and guide
[the soul] to the sight of \emph{that which is} (\Gk{t`o >'on}).'

`But certainly' he
said `vision is especially like that. For, the same thing is seen
as one and as \emph{indefinite multitude}
(\Gk{>'apeira t`o pl\~hjoc}).'

`If it is so with unity,' I said `is it not so with every \emph{number}
(\Gk{>arijm'oc})?'

`How could it not be?'

`But \emph{calculation}
(\Gk{logistik'h}) and \emph{number-theory} (\Gk{>arijmhtik'h}) are
entirely about number.'

`Absolutely.'

`And these things appear to lead to truth.'

`Yes, and extremely well.'

`So it seems that these must be some of the \emph{studies}
(\Gk{majhm'ata}) that
we are looking for. Indeed, the \emph{military} (\Gk{polemik'on}) needs to
learn them for deployment [of troops]---and the philosopher,
because he has to rise out of [the world of] \emph{becoming}
(\Gk{g'enesic}) in order to take hold of being-ness, or else he will
never \emph{become a calculator} (\Gk{logistik\~w| gen'esjai}).'

`Just so' he said.

`And our guardian happens to be both military man and philosopher.'

`Of course.'

`So, Glaucon, it is appropriate to require this study by law and to
persuade those who intend to take part in the greatest affairs of
the city to go into calculation and to engage in it not \emph{as a pastime}
(\Gk{>idiwtik\~wc}), but until they have attained, by thought
itself, the vision of the nature of numbers, not [for the sake of]
buying and selling, as if they were preparing to be merchants or
shopkeepers, but for the sake of war and an easy turning of the soul
itself from becoming towards truth and being-ness.'

`You speak superbly' he said.
\end{quotation}
(In reading this passage from Plato, and in particular the comments on
war, one can hardly be sure
that Socrates is not pulling Glaucon's leg. Socrates previously
(369b--372c) described a primitive, peaceful,
vegetarian city, which Glaucon rejected (372c--d) as
being fit only for pigs.)

The reader of the present notes is not assumed to have much knowledge
`officially'. But the reader should have some awareness of the
Boolean connectives of propositional logic and their connexion with
the Boolean operations on sets. (A dictionary of the connectives is
in Table \ref{table:connectives} below.)
\begin{table}[b]
\caption{Boolean connectives}\label{table:connectives}
\begin{center}
\begin{tabular}{| c | l | l |}\hline
$\land$ & \Eng{and} & conjunction\\ \hline
$\lor$ & \Eng{or} & disjunction\\ \hline
$\lnot$ & \Eng{not} & negation \\ \hline
$\to$ & \Eng{implies} & implication\\ \hline
$\iff$ & \Eng{if and only if} & biconditional\\ \hline
\end{tabular}
\end{center}
\end{table}

One theme of these notes is the relation between \tech{definition by
recursion} and \tech{proof by induction}. The development of
propositional logic already requires recursion and
induction.\footnote{Here I use the words `recursion' and
`induction' in a more general sense than in the definitions on
pp.~\pageref{page:induction} and \pageref{page:recursion}.} For
example, \defn{propositional formulas}\footnote{Words in bold-face
in these notes are being defined.} are defined recursively:
\begin{enumerate}
\item
\tech{Propositional variables} and $0$ and $1$ are propositional formulas.
\item
If $F$ is a propositional formula, then so is $\lnot F$.
\item
If $F$ and $G$ are propositional formulas, then so is
$(F\Bcon G)$, where $\Bcon$ is $\land$, $\lor$, $\to$ or $\iff$.
\end{enumerate}
The \defn{sub-formulas} of a formula are also defined recursively:
\begin{enumerate}
\item
Every formula is a sub-formula of itself.
\item
Any sub-formula of a formula $F$ is a sub-formula of $\lnot F$.
\item
Any sub-formula of $F$ or $G$ is a sub-formula of $(F\Bcon G)$.
\end{enumerate}
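Both recursive definitions above lend themselves to direct implementation. The following sketch is illustrative only: it encodes formulas as nested tuples, with variables and the constants $0$ and $1$ as strings and compound formulas as tagged tuples, a representation chosen here for convenience.

```python
def is_formula(f):
    """Decide formula-hood by the three clauses of the definition."""
    if isinstance(f, str):                      # a variable, or "0" or "1"
        return True
    if f[0] == "not":                           # ("not", F) stands for  not-F
        return len(f) == 2 and is_formula(f[1])
    if f[0] in ("and", "or", "to", "iff"):      # ("and", F, G) for (F and G), etc.
        return len(f) == 3 and all(is_formula(g) for g in f[1:])
    return False

def subformulas(f):
    """The set of sub-formulas of f, computed by the same recursion."""
    if isinstance(f, str):
        return {f}
    return {f}.union(*(subformulas(g) for g in f[1:]))
```

For example, `subformulas(("to", "P", "Q"))` yields the formula itself together with `"P"` and `"Q"`.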
Now, two formulas are \defn{equivalent} if they have the same
\tech{truth-table}. For example, $(P\to Q)$ and $(\lnot P\lor Q)$ are
equivalent, because their truth-tables are, respectively:
\begin{center}
\begin{tabular}{c|c|c}
$P$ & $\to$ & $Q$ \\ \hline
$0$ & $1$ & $0$\\
$1$ & $0$ & $0$\\
$0$ & $1$ & $1$\\
$1$ & $1$ & $1$
\end{tabular}\qquad
\begin{tabular}{c|c|c|c}
$\lnot$ & $P$ & $\lor$ & $Q$\\ \hline
$1$&$0$&$1$&$0$\\
$0$&$1$&$0$&$0$\\
$1$&$0$&$1$&$1$\\
$0$&$1$&$1$&$1$
\end{tabular}
\end{center}
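Equivalence can also be checked mechanically, by tabulating truth-values over all assignments much as in the two tables above. A sketch, with formulas encoded as nested tuples of strings (an encoding assumed here purely for illustration):

```python
from itertools import product

def value(f, a):
    """Truth-value (0 or 1) of formula f under the assignment a."""
    if f == "0" or f == "1":
        return int(f)
    if isinstance(f, str):                       # a propositional variable
        return a[f]
    if f[0] == "not":
        return 1 - value(f[1], a)
    p, q = value(f[1], a), value(f[2], a)
    return {"and": p & q, "or": p | q,
            "to": (1 - p) | q, "iff": 1 - (p ^ q)}[f[0]]

def variables(f):
    """The variables occurring in f."""
    if isinstance(f, str):
        return set() if f in ("0", "1") else {f}
    return set().union(*(variables(g) for g in f[1:]))

def equivalent(f, g):
    """f ~ g: equal truth-values under every assignment."""
    vs = sorted(variables(f) | variables(g))
    return all(value(f, dict(zip(vs, row))) == value(g, dict(zip(vs, row)))
               for row in product((0, 1), repeat=len(vs)))
```

Here `equivalent(("to", "P", "Q"), ("or", ("not", "P"), "Q"))` reproduces the comparison of the two truth-tables.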
Suppose $F$ and $G$ are equivalent; this is denoted
\begin{equation*}
F\sim G.
\end{equation*}
Suppose also $F$ is a sub-formula of $H$, and $H'$ is the result of
replacing $F$ in $H$ with $G$. The \defn{Substitution Theorem} is
that
\begin{equation*}
H\sim H'.
\end{equation*}
Because of the recursive definition of propositional formulas, we can
prove the Substitution Theorem by induction as follows:
\begin{enumerate}
\item
The claim is trivially true when $H$ is a
propositional variable or $0$ or $1$, since then $F$ \emph{is} $H$, so
$H'$ is $G$.
\item
Suppose, as an inductive hypothesis, that the
claim is true when $H$ is $H_0$. Then we can
show that the claim is true when $H$ is $\lnot H_0$.
\item
Suppose, as an inductive hypothesis, that the
claim is true when $H$ is $H_0$ and when $H$ is $H_1$. Then we can
show that the claim is true when $H$ is $(H_0\Bcon H_1)$, where
$\Bcon$ is as above.
\end{enumerate}
Such a proof is sometimes said to be a `proof by induction on the
complexity of propositional formulas'.
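Particular instances of the Substitution Theorem can be spot-checked by machine. The sketch below (again with a hypothetical tuple encoding of formulas, and a small truth-value function) verifies a single instance exhaustively; of course this checks one case, not the theorem.

```python
from itertools import product

def replace(h, f, g):
    """h with every occurrence of the sub-formula f replaced by g."""
    if h == f:
        return g
    if isinstance(h, str):
        return h
    return (h[0],) + tuple(replace(part, f, g) for part in h[1:])

def value(h, a):
    """Truth-value of a formula (nested tuples) under the assignment a."""
    if h == "0" or h == "1":
        return int(h)
    if isinstance(h, str):
        return a[h]
    if h[0] == "not":
        return 1 - value(h[1], a)
    p, q = value(h[1], a), value(h[2], a)
    return {"and": p & q, "or": p | q,
            "to": (1 - p) | q, "iff": 1 - (p ^ q)}[h[0]]

F = ("to", "P", "Q")                 # F ~ G, as shown by the truth-tables
G = ("or", ("not", "P"), "Q")
H = ("iff", ("not", F), "R")         # F occurs as a sub-formula of H
H1 = replace(H, F, G)                # H' is H with F replaced by G

# H ~ H' on every assignment to P, Q, R:
assert all(value(H, dict(zip("PQR", row))) == value(H1, dict(zip("PQR", row)))
           for row in product((0, 1), repeat=3))
```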
A conjunction corresponds to an
\tech{intersection} of sets, and so forth, but this is spelled out in
\S~\ref{sect:sets} below. I shall also use formulas of
\tech{first-order} logic, and in particular the \tech{quantifiers}
(given in Table \ref{table:quantifiers} below).
\begin{table}[b]
\caption{Quantifiers}\label{table:quantifiers}
\begin{center}
\begin{tabular}{| c | l | l |}\hline
$\forall$ & \Eng{for all} & universal\\ \hline
$\exists$ & \Eng{there exists\dots such that} & existential\\ \hline
\end{tabular}
\end{center}
\end{table}
For emphasis, instead of $\to$ and $\iff$,
I may use the arrows $\implies$ and $\Iff$ between formulas.

My own research-interests lie more in
model-theory than in set-theory. I aim here just to
set down some established mathematics as precisely as
possible, without much discussion. (There is a textbook that has been in use
for over two thousand years, but that contains no discussion at all, only
axioms, definitions, theorems and proofs. This is Euclid's
\emph{Elements}~\cite{MR17:814b}.) I do think that explicit reference
to models can elucidate some points. The reader should
also consult texts by people who \emph{are} set-theorists, for other
points of view, for historical references, and to see how the field
has developed beyond what is
given in these notes. Also, the reader should remember that these
notes are still a rough draft. There are not yet many exercises, and
some of them are difficult or lacking in clear answers.
\subsection{}\label{subsect:teacher}
Any text on axiomatic set-theory will introduce the set $\varN$, which
is the
smallest set that contains $\emptyset$ and that contains $x\cup\{x\}$
whenever it contains $x$. The text \emph{may} (but need not) mention
that $\varN$ is a
model of the Peano axioms for the natural numbers. The present notes
differ from some published texts in two ways:
\begin{itemize}
\item
I prove facts about the natural numbers \emph{from the Peano
axioms}, not just \emph{in $\varN$}.
\item
I mention structures that are models of some, but not all, of the
Peano axioms.
\end{itemize}
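The construction of $\varN$ described above can be imitated in miniature. In the sketch below (illustrative only), Python frozensets stand in for hereditary sets: $\emptyset$ is the empty frozenset, and the successor of $x$ is $x\cup\{x\}$.

```python
def successor(x):
    """x union {x}, with frozensets standing in for (hereditary) sets."""
    return x | frozenset({x})

zero = frozenset()           # the empty set
one = successor(zero)        # {zero}
two = successor(one)         # {zero, one}
three = successor(two)       # {zero, one, two}

# Each of these von Neumann ordinals is the set of its predecessors,
assert three == frozenset({zero, one, two})
# and each is transitive: every element of three is also a subset of it.
assert all(x <= three for x in three)
```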
I set out a minimum of set-theory in \S~\ref{sect:sets}, enough so
that the properties of natural numbers can be derived from the Peano
axioms, starting in \S~\ref{sect:Peano}. Some set-theory books, such
as Ciesielski \cite[\S~3.1]{MR99c:04001},
will immediately give $\varN$ as a model of these axioms.
Certain properties of natural numbers are
easier to prove \emph{in this model} than \emph{by the Peano axioms}.
I prefer to follow the axiomatic approach for several
reasons:
One reason is practice. It is worthwhile to have experience with
the Peano axioms as well as $\zfc$, especially since, unlike $\zfc$,
the Peano axioms include a second-order statement. (It may be that
some writers assume that the reader has already had
sufficient practice with the Peano axioms; I do not make such an
assumption.)
The Peano axioms are more natural than their specific model $\varN$.
The elements of $\varN$ (as well as $\varN$ itself) are so-called
\tech{von Neumann ordinals}, that is, \tech{transitive} sets
that are \tech{well-ordered} by \tech{containment}.
In a slightly different
context, the model-theorist Poizat \cite[\S~8.1]{MR2001a:03072} observes:
\begin{quote}
We meet some students who are allergic to ordinals as
`well-ordering types' and who find the notion of von Neumann
ordinals easier to digest; that is a singular consequence of
dogmatic teaching, which confuses formalism with rigor, and which
favors technical craft to the detriment of the fundamental idea: It
takes a strangely warped mind to find the notion of a transitive set
natural!
\end{quote}
A third reason for taking the axiomatic approach to the natural
numbers is that it can bring out a distinction that is often ignored. The
structure of the natural numbers admits \tech{proof by induction} and
\tech{definition by recursion}. Vaught \cite[ch.~2,
\S~4]{MR95k:03001}, for example, says that
recursion \emph{is} `the same thing as definition by induction'.
Since it is just about terminology, the statement is not wrong. But
definition by `induction' or recursion\footnote{In the sense defined
on p.~\pageref{page:recursion} below.} works \emph{only} in models of
the Peano axioms, while there are other structures in which
\emph{proof} by induction\footnote{In the sense defined on
p.~\pageref{page:induction}.} works.
There are `strong' versions of induction and recursion. There is
proof by strong induction, and definition by strong recursion.
Admission of either of \emph{these} is equivalent to admission of the
other; the structures that admit them are precisely the well-ordered
sets. Some basic undergraduate texts suggest confusion on this point.
For example, in talking about the integers, one book\footnote{Namely,
Epp \cite[\S~4.4, p.~213]{Epp}, used sometimes in Math 111 and 112.}
says:
\begin{quote}
It is apparent that if the principle of strong mathematical induction
is true, then so is the principle of ordinary mathematical
induction\dots
It can also be shown that if the principle of ordinary mathematical
induction is true, then so is the principle of strong mathematical
induction. A proof of this fact is sketched in the exercises\dots
\end{quote}
Both statements about induction here are literally false.
The second statement is correct if it is understood to
mean simply that the natural numbers satisfy the principle of strong
induction. The `proof' that is
offered for the first statement uses implicitly
that every integer is a \tech{successor}, something that does not
follow from strong induction.
Finally, by emphasizing the axiomatic development of the natural
numbers, I hope to encourage the reader to watch out for unexamined
assumptions, in these notes and elsewhere. The Hajnal text
\cite{MR2000m:03001} defines
$\varN$ on the first page of \S~1 as `the set of nonnegative
integers'. Then come a hundred pages of the set-theory covered in
the present notes, and more. The Preface says that this work `is
carried out on a quite precise, but intuitive level'; only after
\emph{this} does the reader get, in an appendix, on p.~127, a
rigorous definition of $\varN$. To my mind,
the precise but intuitive way to treat the natural numbers is by means
of the Peano axioms. Perhaps the reader of Hajnal is supposed to have
seen such a treatment before, since, according to the index, the term
`Peano' appears only once, on p.~133, and there is no definition.
Devlin \cite{MR94e:03001} seems never to mention the natural numbers
as such at all, though early on (p.~6), he asserts the existence of sets
$\{a_1,\dots,a_n\}$. (Later he defines the symbol $\varN$, na\"\i
vely on p.~24, rigorously on p.~66.) Like Hajnal, Moschovakis
\cite{MR95a:04001}
\emph{names} the set of natural numbers on the first page of text; but
then he discusses set-theory for only fifty pages before devoting a
chapter to a rigorous treatment of the natural numbers.
\section{Sets and classes}\label{sect:sets}
\markright{\sectbegin Sets and classes}
A set has \defn{members}, or
\defn{elements}. A set \defn{contains} its elements, and the elements
\defn{compose} the set. To say that a set $A$ has an element $b$, we
may write
\begin{equation*}
b\in A,
\end{equation*}
using between $b$ and $A$ a symbol derived from the Greek minuscule
epsilon, which can be understood as standing for the Latin word
\Lat{elementvm}. A set is not \emph{distinct} from its
elements in the way that a box is distinct from its contents. A set
may be distinct from any \emph{particular} element. But I propose to
say that a set \emph{is} its elements, and the elements \emph{are} the
set.

This is a paradoxical statement. How can one thing be many, and many,
one? The
difficulty of answering this is perhaps reflected in the difficulties
of set-theory. In any case, if a set is its elements, then the
elements \emph{uniquely determine} the set. This is something whose
meaning we can express mathematically; it is perhaps the most
fundamental axiom of set-theory:
\begin{axiom}[Extensionality]\label{ax:extensionality}
If two sets $A$ and $B$ have the same members, then $A=B$.
\end{axiom}
The converse of this axiom is trivially true: If two sets have
different members, then of course the sets themselves are different.
A set is also the sort of thing that can \emph{be} an element: If $A$ and
$B$ are sets, then the statement $A\in B$ is meaningful, and the
statement
\begin{equation*}
A\in B \lor A\notin B
\end{equation*}
is true.
Are all elements sets themselves? We do not answer this question; we
avoid it:
\begin{definition}
A property $P$ of sets is \defn{hereditary}, provided that, if $A$
is a set with property $P$, then all elements of $A$ are \emph{sets}
with property $P$. A \emph{set} is \defn{hereditary} if it has a
hereditary property.
\end{definition}
We shall ultimately restrict our attention to hereditary
sets.\footnote{See also Kunen
\cite[ch.~1, \S~4]{MR85e:03003} for discussion of this point.} Now,
we shall not assert, as an axiom, that all sets are hereditary. We
cannot now formulate such an axiom precisely, since we do not yet have a
definition of a `property' of sets. The \emph{language} with which we
talk about sets will end up ensuring that our sets are hereditary:
Everything that we shall say about sets can be said with the symbol
$\in$, along with $=$ and the logical symbols given in Tables
\ref{table:connectives} and \ref{table:quantifiers} of
\S~\ref{sect:intro}, and with variables and names \emph{for
individual sets}.
\begin{definition}\label{defn:formula}
The \defn{$\in$-formulas} are
recursively defined\footnote{This definition uses also brackets
(parentheses) in the formulas, but the brackets do not carry meaning
in the way that the
other symbols do. The brackets are meaningful in the way that the
\emph{order} of the symbols in a formula is meaningful. Indeed, we
could dispense with the brackets by using the so-called Polish or \L
ukasiewicz notation, writing, say, $\mathord{\land}\phi\psi$ instead
of $\phi\land\psi$. I shall use the infix notation instead,
but shall omit brackets where they are not needed.} as follows:
\begin{enumerate}
\item
If $x$ and $y$ are variables, and $A$ and $B$ are names, then
$x\bigcirc y$, $x\bigcirc A$, $A\bigcirc x$ and $A\bigcirc B$ are
\defn{atomic} $\in$-formulas, where $\bigcirc$ is $\in$ or $=$.
\item
If $\phi$ and $\psi$ are $\in$-formulas, then so are $\lnot\phi$ and
$(\phi\Bcon\psi)$, where $\Bcon$ is one of $\land$, $\lor$, $\to$ and $\iff$.
\item
If $\phi$ is an $\in$-formula and $x$ is a variable, then $(\Quant
x\phi)$ is an $\in$-formula, where $\Quant$ is $\forall$ or $\exists$.
\end{enumerate}
\end{definition}
The $\in$-formulas are the \tech{first-order} formulas in the
\tech{signature} consisting of $\in$ alone. Other signatures are
discussed later (see Definition \ref{defn:arb-formula}). In any
signature, the first-order formulas are
defined recursively as $\in$-formulas are, but the atomic formulas
will be different. In a first-order formula, only variables can follow
quantifiers; otherwise, the distinction between a variable and a name
is not always clear (see Exercise \ref{exer:var-name}). Also, in a
first-order formula, variables and
names refer only to individual objects, rather than, say, sets of
objects. In set-theory, our objects \emph{are} sets, so it would not
make much sense to have more than one kind of variable.\footnote{See also the
comments of Levy \cite[ch.~1, \S~1, p.~4]{MR80k:04001}.}
Variables and names in $\in$-formulas are also called \defn{terms}.
(In other signatures, there will be a more general definition of
\tech{term}.) Names used in formulas may be called \defn{parameters}.
In an $\in$-formula, instead of a sub-formula
$\mathrel{\lnot} x\in y$, we can write
\begin{equation*}
x\notin y;
\end{equation*}
and instead of $\mathrel{\lnot} x=y$, we can write
\begin{equation*}
x\neq y;
\end{equation*}
here, $x$ and $y$ are terms.
If a first-order formula contains no quantifiers,
then its variables are
\defn{free} variables. The free variables of $\Exists x\phi$ and
$\Forall x\phi$ are those of $\phi$, \emph{except} $x$. The free
variables of $\lnot \phi$ are just those of $\phi$. Finally, the free
variables of $\phi\Bcon\psi$ (where $\Bcon$ is one of $\land$, $\lor$, $\to$
and $\iff$) are those of $\phi$ or $\psi$.
A \defn{sentence} is a formula with no free variables.
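The recursive computation of free variables can be sketched as follows. The encoding is hypothetical: formulas are nested tuples, lower-case strings are variables, and upper-case strings are names (which contribute no free variables).

```python
def free_variables(phi):
    """The free variables of an in-formula, by recursion on the formula."""
    if phi[0] in ("in", "eq"):                      # atomic: x in y, x = A, ...
        return {t for t in phi[1:] if t.islower()}  # names are not variables
    if phi[0] == "not":
        return free_variables(phi[1])
    if phi[0] in ("forall", "exists"):              # ("forall", x, body)
        return free_variables(phi[2]) - {phi[1]}    # x is no longer free
    return free_variables(phi[1]) | free_variables(phi[2])  # binary connective

# In  forall x (x in y -> x in A),  only y is free:
phi = ("forall", "x", ("to", ("in", "x", "y"), ("in", "x", "A")))
assert free_variables(phi) == {"y"}
```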
We can now attempt to write the Extensionality Axiom as the sentence
\begin{equation}\label{eqn:extensionality}
\Forall x\Forall y(\Forall z(z\in x\iff z\in y)\to x=y).
\end{equation}
Now, if the variables $x$, $y$ and $z$ can refer to any sets at all,
and if some sets contain objects that are not sets, then
\eqref{eqn:extensionality} is actually stronger than Axiom
\ref{ax:extensionality}. Indeed, if $A$ is a set, and $b$ is an
object that is not a set, then there might be a set $\{A,b\}$
containing $A$ and $b$ and nothing else, and a set $\{A\}$ containing
$A$ and nothing else. Then for all \emph{sets} $z$, we have
\begin{equation*}
z\in\{A,b\}\Iff z\in \{A\}.
\end{equation*}
From this, \eqref{eqn:extensionality} seems to imply $\{A,b\}=\{A\}$,
which is evidently false. Our solution to this problem will be to restrict
the variables in formulas like \eqref{eqn:extensionality} to
\emph{hereditary} sets. In this way, \eqref{eqn:extensionality}
becomes merely a special case of the Extensionality Axiom. In
particular, since the set $\{A,b\}$ is not hereditary,
\eqref{eqn:extensionality} says nothing about it.
\begin{exercise}
Alternatively, we might let our variables range over all (mathematical)
objects, even if some of these might not be sets. If $a$ is not a
set, then we should require $\Forall x x\notin a$. In this case, if
\eqref{eqn:extensionality} is still true, prove that there is at most
one object that is not a set.
\end{exercise}
In the
`Platonic' view of set-theory, when the logical symbols in an
$\in$-sentence are interpreted as in Tables \ref{table:connectives}
and \ref{table:quantifiers} of \S~\ref{sect:intro}, and when terms are
understood to refer to hereditary sets, then
the sentence is either true or
false. (A `relative' notion of truth is given in Definition
\ref{defn:truth}.) Then we are looking for the true $\in$-sentences; in
particular, we
are looking for some `obviously' true sentences---\tech{axioms}---from which
all other true sentences about hereditary sets follow
logically.\footnote{This project must fail. By G\"odel's
Incompleteness Theorem, we cannot define a list of axioms from
which all truths of set-theory follow. We can still hope to
identify axioms from which \emph{some} interesting truths follow.
One purpose of these notes is to develop some of these
interesting truths.}
A first-order formula $\phi$ with at most one free variable is called
\defn{unary}; if that free variable is $x$, then the formula
might be written
\begin{equation*}
\phi(x).
\end{equation*}
If this is an $\in$-formula, it expresses a \defn{property} that sets
might have. If $A$ has that property, then we can assert
\begin{equation*}
\phi(A).
\end{equation*}
Formally, we obtain the sentence $\phi(A)$ from $\phi$ by replacing
each \tech{free occurrence} of $x$ with $A$. A precise recursive definition
of $\phi(A)$ is possible. Here it is, for thoroughness, although we
shall not spend time with it:
\begin{definition}
For any first-order formula $\phi$, variable
$x$ and term $t$, the formula
\begin{equation*}
\phi_t^x
\end{equation*}
is the result of \tech{freely} replacing each free occurrence of $x$ in
$\phi$ with $t$; it is determined recursively as follows:
\begin{enumerate}
\item
If $\phi$ is atomic, then $\phi_t^x$ is the result of replacing
\emph{each} instance of $x$ in $\phi$ with $t$.
\item
$(\lnot\phi)_t^x$ is $\lnot(\phi_t^x)$, and $(\phi\Bcon\psi)_t^x$ is
$\phi_t^x\Bcon\psi_t^x$.
\item
$(\Quant x\phi)_t^x$ is $\Quant x\phi$.
\item
If $y$ is not $x$ and does not appear in $t$, then $(\Quant
y\phi)_t^x$ is $\Quant y\phi_t^x$.
\item
If $y$ is not $x$, but $y$ does appear in $t$, then $(\Quant
y\phi)_t^x$ is $\Quant
z(\phi_z^y)_t^x$, where $z$ is a variable that does not appear in
$t$ or $\phi$.
\end{enumerate}
If $\phi$ is $\phi(x)$, then $\phi_t^x$
can be denoted
\begin{equation*}
\phi(t).
\end{equation*}
\end{definition}
The
point is that if, for example, $\phi$ is $\psi(x)\land\Exists
x\chi(x)$, then $\phi(A)$ is $\psi(A)\land\Exists x\chi(x)$.
Alternatively, $\phi$ might be $\Exists y(\psi(x)\land\chi(y))$, in
which case $\phi(y)$ is $\Exists z(\psi(y)\land\chi(z))$, not $\Exists
y(\psi(y)\land\chi(y))$.
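The five clauses of the definition translate directly into a recursive procedure. In the sketch below, formulas are nested tuples and terms are single tokens, so the condition of clause (5), that $y$ appear in $t$, reduces to $y=t$; the fresh variables `z0', `z1', \dots are a convention of the sketch, not of the definition.

```python
from itertools import count

def vars_of(phi):
    """All variable and name tokens occurring anywhere in a formula."""
    if phi[0] in ("in", "eq"):
        return set(phi[1:])
    if phi[0] == "not":
        return vars_of(phi[1])
    if phi[0] in ("forall", "exists"):
        return {phi[1]} | vars_of(phi[2])
    return vars_of(phi[1]) | vars_of(phi[2])

def subst(phi, x, t):
    """phi with t freely substituted for x: the five clauses in turn."""
    if phi[0] in ("in", "eq"):                          # (1) atomic
        return (phi[0],) + tuple(t if s == x else s for s in phi[1:])
    if phi[0] == "not":                                 # (2) negation
        return ("not", subst(phi[1], x, t))
    if phi[0] in ("forall", "exists"):
        q, y, body = phi
        if y == x:                                      # (3) x is bound here
            return phi
        if y != t:                                      # (4) no capture
            return (q, y, subst(body, x, t))
        z = next(v for v in ("z%d" % i for i in count())
                 if v not in vars_of(phi) and v != t)   # (5) rename to avoid capture
        return (q, z, subst(subst(body, y, z), x, t))
    return (phi[0], subst(phi[1], x, t), subst(phi[2], x, t))

# The example above: from  exists y (psi(x) and chi(y)),  substituting y for x
# forces the bound y to be renamed, giving  exists z (psi(y) and chi(z)).
phi = ("exists", "y", ("and", ("in", "x", "B"), ("in", "y", "C")))
assert subst(phi, "x", "y") == \
    ("exists", "z0", ("and", ("in", "y", "B"), ("in", "z0", "C")))
```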
The sets with the property given by $\phi(x)$ compose a
\defn{class}, denoted
\begin{equation}%\label{eqn:class}
\{x: \phi(x)\}.
\end{equation}
This is the class of sets that \defn{satisfy} $\phi$, the class of $x$
such that $\phi(x)$. But not every class is a set; not every class
is a unity to the extent that it can be considered as a member of
sets:
\begin{theorem}[Russell Paradox]
The class $\{x:x\notin x\}$ is not a set.
\end{theorem}
\begin{proof}
Suppose $A$ is a set such that
\begin{equation}\label{eqn:Russell}
x\in A\implies x\notin x
\end{equation}
for all sets $x$. Either $A\notin A$ or $A\in A$, but in the
latter case, by \eqref{eqn:Russell}, we still have $A\notin A$.
Therefore $A$ is a member of $\{x:x\notin x\}$, but not of $A$ itself; so
\begin{equation*}
A\neq\{x:x\notin x\}.
\end{equation*}
Hence $\{x:x\notin x\}$ cannot be a set.
\end{proof}
\begin{remark}
The Russell Paradox is often established by contradiction: If the
class $\{x:x\notin x\}$ is a set $A$, then both $A\in A$ and $A\notin A$,
which is absurd. However, the proof given above shows that a false
assumption is not needed.
\end{remark}
\begin{exercise}
The sets that we are considering compose the class $\{x:x=x\}$. It is
logically true that
\begin{equation*}
\Forall x\Forall y(y\in x\to y=y).
\end{equation*}
Explain how this is a proof that all of our sets are hereditary.
\end{exercise}
Not every class is a set; but every set $A$ is the class $\{x:x\in
A\}$. A
\tech{disjunction} of formulas gives us the \defn{union} of
corresponding classes:
\begin{equation*}
\{x:\phi(x)\lor\psi(x)\}=\{x:\phi(x)\}\cup\{x:\psi(x)\}.
\end{equation*}
Likewise, a \tech{conjunction} gives an \defn{intersection}:
\begin{equation*}
\{x:\phi(x)\land\psi(x)\}=\{x:\phi(x)\}\cap\{x:\psi(x)\};
\end{equation*}
and a \tech{negation} gives a \defn{complement}:
\begin{equation*}
\{x:\lnot\phi(x)\}=\{x:\phi(x)\}\comp.
\end{equation*}
Finally, we can form a \defn{difference} of classes, not corresponding
to a single Boolean connective from our list:
\begin{equation*}
\{x:\phi(x)\land\lnot\psi(x)\}=\{x:\phi(x)\}\setminus\{x:\psi(x)\}.
\end{equation*}
If $\Forall x(\phi(x)\to\psi(x))$, then $\{x:\phi(x)\}$ is a
\defn{sub-class} of $\{x:\psi(x)\}$. We can write
\begin{equation*}
\Forall
x(\phi(x)\to\psi(x))\Iff\{x:\phi(x)\}\included\{x:\psi(x)\}.
\end{equation*}
We now have several abbreviations to use in writing $\in$-formulas:
\begin{align*}
x\in A\cup B&\Iff x\in A\lor x\in B;\\
x\in A\cap B&\Iff x\in A\land x\in B;\\
x\in A\setminus B&\Iff x\in A \land x\notin B;\\
A\included B&\Iff \Forall x(x\in A\to x\in B).
\end{align*}
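These correspondences can be illustrated with finite sets standing in for classes (an illustration only; the classes above need not be sets at all). Here $\phi(x)$ is `$x$ is even' and $\psi(x)$ is `$x<5$', over a small universe:

```python
universe = set(range(10))
A = {x for x in universe if x % 2 == 0}      # phi(x): x is even
B = {x for x in universe if x < 5}           # psi(x): x < 5

# disjunction ~ union, conjunction ~ intersection,
# negation ~ complement, and the difference of classes:
assert {x for x in universe if x % 2 == 0 or x < 5} == A | B
assert {x for x in universe if x % 2 == 0 and x < 5} == A & B
assert {x for x in universe if not x % 2 == 0} == universe - A
assert {x for x in universe if x % 2 == 0 and not x < 5} == A - B
# and the conjunction defines a sub-class of each conjunct's class:
assert A & B <= A
```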
If sets exist at all, then any two sets ought to be members of some
set:
\begin{axiom}[Pairing]\label{ax:pairing}
For any two sets, there is a set that contains them:
\begin{equation*}
\Forall x\Forall y\Exists z(x\in z\land y\in z).
\end{equation*}
\end{axiom}
The set given by the axiom might have elements other than those two
sets; we can cast them out by means of:
\begin{axiom}[Comprehension]\label{ax:comprehension}
A sub-class of a set is a set: For any $\in$-formula $\phi(x)$,
\begin{equation*}
\Forall x\Exists y\Forall z(z\in y\iff z\in x\land \phi(z)).
\end{equation*}
\end{axiom}
Note that this axiom is not a single $\in$-sentence, but a
\tech{scheme} of $\in$-sentences.
A sub-class of a set $A$ can now be called a \defn{subset} of
$A$. A set \defn{includes} its subsets. A subset $B$ of $A$ that is
distinct from $A$ is a \defn{proper} subset of $A$, and we may then write
\begin{equation*}
B\pincluded A.
\end{equation*}
We now have that, for any $x$ and $y$, there is a set
\begin{equation*}
\{x,y\}
\end{equation*}
whose members are \emph{just} $x$ and $y$; if $x=y$, then this set is
\begin{equation*}
\{x\},
\end{equation*}
which is sometimes called a \defn{singleton}.
\begin{exercise}
Prove that the class of all sets is not a set.
\end{exercise}
If $A$ is a set and $\phi$ is a (unary) formula, then the set
$A\cap\{x:\phi(x)\}$ can be written
\begin{equation*}
\{x\in A:\phi(x)\}.
\end{equation*}
In particular, if $B$ is also a set, then
\begin{equation*}
A\cap B=\{x\in A:x\in B\}.
\end{equation*}
As long as \emph{some} set $A$ exists,
we have the \defn{empty set},
\begin{equation*}
\emptyset,
\end{equation*}
which can be defined as $\{x\in A:x\neq
x\}$. Does some set exist? I take this as a logical axiom:
\begin{equation}\label{eqn:existence}
\Exists x x=x.
\end{equation}
Indeed, \emph{something} exists, as we might argue along with
Descartes \cite[II, \P~3]{Descartes:Med}:
\begin{quote}
Therefore I will suppose that all I see is false\dots But certainly
I should exist, if I were to persuade myself of something\dots Thus
it must be granted that, after weighing everything carefully and
sufficiently, one must come to the considered judgment that the
statement `\emph{I am, I exist} (\Lat{ego svm, ego existo})' is
necessarily true every time it is
uttered by me or conceived in my mind \cite[p.~17]{Cress}.
\end{quote}
Of course, we are claiming that \emph{hereditary sets} exist. But I take
\eqref{eqn:existence} to be implicit in the assertion of any sentence,
such as \eqref{eqn:extensionality}.
For any $x$ and $y$, the \defn{ordered pair} $(x,y)$ is the set
\begin{equation*}
\{\{x\},\{x,y\}\}.
\end{equation*}
All that we require of this definition is that it allow us to prove the
following:
\begin{theorem}
$(x,y)=(u,v)\Iff x=u\land y=v$.
\end{theorem}
\begin{exercise}
Prove the theorem.
\end{exercise}
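The definition can also be explored computationally. In the following Python sketch (an illustration only), \texttt{frozenset} plays the role of a hereditary set, and \texttt{pair} implements $\{\{x\},\{x,y\}\}$:

```python
def pair(x, y):
    # Kuratowski's ordered pair (x,y) = {{x},{x,y}}
    return frozenset({frozenset({x}), frozenset({x, y})})
```

The theorem's property can then be sampled: \texttt{pair(1, 2) != pair(2, 1)}, although $\{1,2\}=\{2,1\}$ as unordered sets; and \texttt{pair(x, x)} collapses to the singleton $\{\{x\}\}$.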
Given two classes $\class C$ and $\class D$, we can now form their
\defn{cartesian product}:
\begin{equation*}
\class C\times\class D=\{(x,y):x\in\class C\land y\in\class D\}.
\end{equation*}
\begin{lemma}
A cartesian product of classes is a well-defined class, that is, can
be written as $\{x:\phi(x)\}$ for some $\in$-formula $\phi$.
\end{lemma}
\begin{exercise}
Prove the lemma.
\end{exercise}
To prove that the cartesian product of \emph{sets} is a set, we can
use:
\begin{axiom}[Power-set]\label{ax:power-set}
If $A$ is a set, then there is a set $B$ such that
\begin{equation*}
x\included A\implies x\in B
\end{equation*}
for all sets $x$. That is,
\begin{equation*}
\Forall x\Exists y\Forall z(\Forall w(w\in z\to w\in x)\to z\in y).
\end{equation*}
\end{axiom}
Hence, for any set $A$, its \defn{power-set}
$\{x:x\included A\}$
is a set; this is denoted
\begin{equation*}
\pow A.
\end{equation*}
In particular,
$(x,y)\in\pow{\pow{\{x,y\}}}$, so
$A\times B\included\pow{\pow{A\cup B}}$.
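That inclusion can likewise be verified on small samples. In this Python sketch (the sets are arbitrary, and \texttt{powerset} and \texttt{pair} are ad hoc helpers, not notation from the text), every ordered pair in $A\times B$ is found inside $\pow{\pow{A\cup B}}$:

```python
from itertools import combinations, product

def powerset(s):
    # all subsets of s, as frozensets
    return {frozenset(c) for r in range(len(s) + 1) for c in combinations(s, r)}

def pair(x, y):
    # Kuratowski ordered pair (x,y) = {{x},{x,y}}
    return frozenset({frozenset({x}), frozenset({x, y})})

A, B = {1, 2}, {2, 3}
PP = powerset(powerset(A | B))                     # P(P(A union B))
ok = all(pair(x, y) in PP for x, y in product(A, B))
```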
If $A$ is a set, its \defn{union} is
$\{x:\Exists y(y\in A\land x\in y)\}$,
denoted
\begin{equation*}
\bigcup A.
\end{equation*}
In particular, for any sets $A$ and $B$,
\begin{equation*}
A\cup B=\bigcup\{A,B\}.
\end{equation*}
\begin{exercise}
What are $\bigcup\emptyset$ and $\bigcup\{\emptyset\}$?
\end{exercise}
\begin{axiom}[Union]\label{ax:union}
The union of a set is a set:
\begin{equation*}
\Forall x\Exists y\Forall z(\Exists w(z\in w\land w\in x)\to z\in
y).
\end{equation*}
\end{axiom}
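The union of a set has a direct computational reading: gather the members of members. A Python sketch (the samples are arbitrary):

```python
def big_union(A):
    # union of A = {x : there is y with y in A and x in y}
    return {x for y in A for x in y}

A, B = frozenset({1, 2}), frozenset({2, 3})
U = big_union({A, B})      # should coincide with A union B
```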
The union of a set $A$ might be denoted also
\begin{equation*}
\bigcup_{x\in A}x.
\end{equation*}
Suppose that, for each $x$ in $A$, there is a set $B_x$. We shall
soon be able to define a union
\begin{equation}\label{eqn:indexed-union}
\bigcup_{x\in A}B_x.
\end{equation}
This will be the union of $\{B_x:x\in A\}$. But for now, we don't
even know that this thing is a well-defined \emph{class}, much less a
set.
\begin{theorem}
The cartesian product of sets is a set.
\end{theorem}
\begin{exercise}
Prove the theorem.
\end{exercise}
If $A$ is a set, its \defn{intersection} is
$\{x:\Forall y(y\in A\to x\in y)\}$,
denoted
\begin{equation*}
\bigcap A.
\end{equation*}
If $A$ contains a set $B$ (that is, if $B\in A$), then $\bigcap
A\included B$, so $\bigcap A$ is a
set. Also, for any sets $A$ and $B$,
\begin{equation*}
A\cap B=\bigcap\{A,B\}.
\end{equation*}
\begin{exercise}
What are $\bigcap\emptyset$ and $\bigcap\{\emptyset\}$?
\end{exercise}
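Dually, for a non-empty finite set the intersection can be computed by filtering any one member. Here is a Python sketch (samples arbitrary), which also checks the remark that $\bigcap A\included B$ whenever $B\in A$:

```python
def big_inter(A):
    # intersection of A = {x : for all y, y in A implies x in y},
    # for non-empty finite A
    first, *rest = A
    return {x for x in first if all(x in y for y in rest)}

A, B = frozenset({1, 2, 3}), frozenset({2, 3, 4})
I = big_inter({A, B})
```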
A \defn{relation} between $A$ and $B$ is a subset of $A \times B$. If
$R\included A\times B$, then
\begin{equation*}
R\inv =\{(y,x):(x,y)\in R\},
\end{equation*}
a relation between $B$ and $A$. If also $S\included B\times C$, then
\begin{equation*}
S\circ R=\{(x,z):\Exists y((x,y)\in R\land (y,z)\in S)\},
\end{equation*}
a relation between $A$ and $C$.
A relation between $A$ and itself is a \defn{binary} relation on $A$.
The set
\begin{equation*}
\{(x,x):x\in A\}
\end{equation*}
is the \defn{diagonal} $\Delta_A$ on $A$. A binary
relation $R$ on $A$ is:
\begin{itemize}
\item
\defn{reflexive}, if $\Delta_A\included R$;
\item
\defn{irreflexive}, if $\Delta_A\cap R=\emptyset$;
\item
\defn{symmetric}, if $R\inv=R$;
\item
\defn{anti-symmetric}, if $R\cap R\inv \included \Delta_A$;
\item
\defn{transitive}, if $R\circ R\included R$.
\end{itemize}
Then $R$ is:
\begin{itemize}
\item
an \defn{equivalence-relation}, if it is reflexive, symmetric and
transitive;
\item
a \defn{partial ordering}, if it is anti-symmetric and transitive and
either reflexive or irreflexive;
\item
a \defn{strict} partial ordering, if it is an irreflexive partial
ordering;
\item
a \defn{total ordering}, if it is a partial ordering and
$R\cup\Delta_A\cup R\inv=A\times A$.
\end{itemize}
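All of these conditions are decidable for finite relations, so they can be tested directly. In the following Python sketch (the relation \texttt{R}, the usual ordering of $\{0,1,2\}$, is a sample), \texttt{inverse} and \texttt{compose} implement $R\inv$ and $S\circ R$ as defined above:

```python
def inverse(R):
    return {(y, x) for (x, y) in R}

def compose(S, R):
    # S composed with R = {(x,z) : there is y with (x,y) in R and (y,z) in S}
    return {(x, z) for (x, y) in R for (u, z) in S if y == u}

A = {0, 1, 2}
diag = {(x, x) for x in A}                     # the diagonal on A
R = {(x, y) for x in A for y in A if x <= y}   # the usual ordering of A

reflexive     = diag <= R
antisymmetric = (R & inverse(R)) <= diag
transitive    = compose(R, R) <= R
total         = (R | diag | inverse(R)) == {(x, y) for x in A for y in A}
```

All four conditions hold, so this $R$ is a (reflexive) total ordering of $A$.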
A relation $f$ between $A$ and $B$ is a \defn{function}, or
\defn{map}, from $A$ to $B$ if
\begin{equation*}
f\circ f\inv\included\Delta_B\land \Delta_A\included f\inv\circ f.
\end{equation*}
Suppose $f$ is thus. We may refer to the function $f:A\to B$. For
each $x$ in $A$, there is a unique element $f(x)$ of $B$ such that
$(x,f(x))\in f$. Here $f(x)$ is the \defn{value} of $f$ at $x$. We
may refer to $f$ as
\begin{equation*}
x\longmapsto f(x):A\To B.
\end{equation*}
The \defn{domain} of $f$ is $A$, and $f$ is a function \defn{on} $A$.
The \defn{range} of $f$ is the set $\{y\in B:\Exists x(x,y)\in f\}$,
that is, $\{f(x):x\in A\}$, which is denoted
\begin{equation*}\label{eqn:setim}
f\setim A.
\end{equation*}
If
$C\included A$, then $f\cap(C\times B)$ is a function on $C$, denoted
$f\rest C$ and having range $f\setim C$. This set is also the
\defn{image} of $C$ under $f$.
The function $f:A\to B$ is:
\begin{itemize}
\item
\defn{surjective} or \defn{onto}, if $\Delta_B\included f\circ f\inv$;
\item
\defn{injective} or \defn{one-to-one}, if $f\inv\circ f\included\Delta_A$;
\item
\defn{bijective}, if surjective and injective (that is, one-to-one and
onto).
\end{itemize}
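These composition criteria can also be machine-checked on a finite sample. In this Python sketch (\texttt{f} is an arbitrary example, chosen to be surjective but not injective), the conditions above are evaluated literally:

```python
def inverse(R):
    return {(y, x) for (x, y) in R}

def compose(S, R):
    # S composed with R, in the text's order of composition
    return {(x, z) for (x, y) in R for (u, z) in S if y == u}

A, B = {0, 1, 2}, {'a', 'b'}
dA = {(x, x) for x in A}
dB = {(y, y) for y in B}
f = {(0, 'a'), (1, 'a'), (2, 'b')}             # a sample relation

is_function   = compose(f, inverse(f)) <= dB and dA <= compose(inverse(f), f)
is_surjective = dB <= compose(f, inverse(f))
is_injective  = compose(inverse(f), f) <= dA
```

Since $0$ and $1$ have the same value, \texttt{is\_injective} comes out false, while the other two conditions hold.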
All of the foregoing definitions involving relations make sense even
if $A$ and $B$ are merely classes.
To discuss functions in the most
general sense, it is convenient to introduce a new quantifier,
\begin{equation*}
\existsunique,
\end{equation*}
read `there exists a unique\dots such that'; this
quantifier is defined by
\begin{equation*}
\Existsunique x\phi(x)\Iff \Exists x(\phi(x)\land\Forall y(\phi(y)\to
y=x)).
\end{equation*}
A formula $\psi$ with free variables $x$ and $y$ at most can be
written
\begin{equation*}
\psi(x,y);
\end{equation*}
it is a \defn{binary} formula.
Then a function is a class
$\{(x,y):\psi(x,y)\}$
such that
\begin{equation*}
\Forall x(\Exists y\psi(x,y)\to\Existsunique y\psi(x,y)).
\end{equation*}
The domain of this function is $\{x:\Exists y\psi(x,y)\}$. If the
function itself is called $f$, and if its domain includes a set $A$, then
the image $f\setim A$ or $\{f(x):x\in A\}$ is the class
\begin{equation*}
\{y:\Exists x(x\in A\land \psi(x,y))\}.
\end{equation*}
That this class is a \emph{set} is the following:
\begin{axiom}[Replacement]\label{ax:replacement}
The image of a set under a function is a set: For all classes
$\{(x,y):\psi(x,y)\}$ that are \emph{functions},
\begin{equation*}
\Forall x\Exists y\Forall z\Forall w(z\in x\land\psi(z,w)\to w\in
y).
\end{equation*}
\end{axiom}
Like Comprehension, the Replacement Axiom is a scheme of
$\in$-sentences. Indeed, for each binary formula $\psi(x,y)$, we have
\begin{equation*}
\Forall x(\Exists y\psi(x,y)\to\Existsunique y\psi(x,y))\to
\Forall x\Exists y\Forall z\Forall w(z\in x\land\psi(z,w)\to w\in
y).
\end{equation*}
If we have a function $x\mapsto B_x$ on a set $A$, then the
union \eqref{eqn:indexed-union} above is now well-defined.
Other set-theoretic axioms will arise in the course of the ensuing
discussion.
\section{Model-theory}
\markright{\sectbegin Model-theory}
A \defn{unary} relation on a set is just a subset.
A unary \defn{operation} on a set is a function from the set to
itself. A \defn{binary} operation on a set $A$ is a function from
$A\times A$ to $A$. We can continue. A \tech{ternary} relation on
$A$ is a subset of
\begin{equation*}
A\times A\times A,
\end{equation*}
that is, $(A\times A)\times A$, also denoted $A^3$. A ternary
operation on $A$ is a function from $A^3$ to $A$. More generally:
\begin{definition}
The \defn{cartesian powers} of a set $A$ are defined recursively:
\begin{enumerate}
\item
$A$ is a cartesian power of $A$.
\item
If $B$ is a cartesian power of $A$, then so is $B\times A$.
\end{enumerate}
A \defn{relation} on $A$ is a subset of a cartesian power of $A$. An
\defn{operation} on $A$ is a function from a cartesian power of $A$
into $A$.
\end{definition}
Note that we do not (yet) assert the existence of a \emph{set}
containing the cartesian powers of $A$.
\begin{exercise}
Is there a \emph{class} containing the cartesian powers of a given
set and nothing else?
\end{exercise}
\begin{definition}\label{defn:arb-formula}
A \defn{structure} is an ordered pair
\begin{equation*}
(A,T),
\end{equation*}
where $A$ is a non-empty set, and $T$ is a set (possibly empty) whose
elements are
operations and relations on $A$ and elements of $A$. The set $A$ is
the \defn{universe} of the structure. The structure itself can then
be denoted
\begin{equation*}
\str A
\end{equation*}
(or just $A$ again). A \defn{signature} of $\str A$ is a set $S$ of
symbols for the elements of $T$. This means:
\begin{enumerate}
\item
There is a bijection $s\mapsto s^{\str A}:S\to T$.
\item
Different structures can have the same signature.
\end{enumerate}
The element $s^{\str A}$ of $T$ is the \defn{interpretation} in $\str
A$ of the symbol $s$. Usually one doesn't bother to write the
superscript for an interpretation, so $s$ might really mean $s^{\str
A}$.
\end{definition}
In the next section, we shall assert as an axiom---the Peano
Axiom---the existence of a structure
\begin{equation*}
(\N,\{{}\scr{},0\})
\end{equation*}
having certain properties. The universe $\N$ will be the set of
\tech{natural numbers}, and $\scr{}$ will be the unary operation
$x\mapsto x+1$. The structure is more conveniently written as
\begin{equation*}
(\N,{}\scr{},0);
\end{equation*}
we shall also look at structures $(\N,{}\scr{},0,P)$, where $P$ is a
unary relation on $\N$.
An $\in$-sentence is supposed to be a statement about the world
of (hereditary) sets. Structures live in this world. The signature
of a structure allows us to write sentences that are true or false
\emph{in the structure}. The Peano Axiom will be that certain
sentences of the signatures $\{{}\scr{},0\}$ and $\{{}\scr{},0,P\}$
are \tech{true in} $(\N,{}\scr{},0)$ and $(\N,{}\scr{},0,P)$.
\begin{definition}
The \defn{terms} of $\{{}\scr{},0\}$ and $\{{}\scr{},0,P\}$ are
defined recursively:
\begin{enumerate}
\item
Variables and names and $0$ are terms.
\item
If $t$ is a term, then so is $\scr t$.
\end{enumerate}
The \defn{atomic} formulas of $\{{}\scr{},0\}$ are equations $t=u$ of
terms; the signature $\{{}\scr{},0,P\}$ also has atomic formulas
\begin{equation*}
P(t),
\end{equation*}
where $t$ is a term.
From the atomic formulas, formulas are built up as in Definition
\ref{defn:formula}.
\end{definition}
The definition can be generalized to other signatures. If for example
the signature has a binary operation-symbol $+$, and $t$ and $u$ are
terms of the signature, then so is $(t+u)$.
\begin{definition}\label{defn:truth}
An atomic \emph{sentence} $\sigma$ becomes \defn{true} or
\defn{false in} a
structure $\str A$, once interpretations $c^{\str A}$ are chosen for
any names $c$ appearing in $\sigma$; if $\sigma$ is true in $\str A$,
then we write
\begin{equation}\label{eqn:models}
\str A\models\sigma,
\end{equation}
and we say that $\str A$ is a \defn{model} of $\sigma$. Note that
\eqref{eqn:models} could be written out as an $\in$-sentence. For
arbitrary sentences, we define:
\begin{align*}
\str A\models\lnot\sigma&\Iff \lnot(\str A\models \sigma),\\
\str A\models\sigma\Bcon\tau&\Iff \str A\models\sigma\Bcon \str
A\models\tau,
\end{align*}
where $\Bcon$ is $\land$, $\lor$, $\to$ or $\iff$. Finally,
\begin{equation}\label{eqn:models-forall}
\str A\models\Forall x\phi(x)
\end{equation}
if and only if $\str A\models\phi(a)$ for all $a$ in $A$; and
\begin{equation*}
\str A\models\Forall x\phi(x)\Iff \str A\models\lnot\Exists x\lnot
\phi(x).
\end{equation*}
\end{definition}
\begin{exercise}\label{exer:var-name}
In the definition of \eqref{eqn:models-forall}, is $a$ a variable or
a name?
\end{exercise}
\setcounter{equation}{0}\section{The Peano axioms}\label{sect:Peano}
\markright{\sectbegin The Peano axioms}
The five so-called Peano axioms amount to the following five-part
assertion:
\begin{axiom}[Peano]
There is a set $\N$,
\begin{enumerate}
\item
containing a distinguished element $0$ (called \defn{zero}), and
\item
equipped with a unary operation
$x\mapsto \scr x$ (the \defn{successor-operation}), such that
\item
%\textnormal{\textbf{(\axz)}}
$(\N,{}\scr{},0)\models\Forall x\scr x\neq0$;
\item
%\textnormal{\textbf{(\axu)}}
$(\N,{}\scr{})\models\Forall x\Forall y(\scr x=\scr y\to x=y)$;
\item
%\textnormal{\textbf{(\axi)}}
$(\N,{}\scr{},0,P)\models P(0)\land \Forall x(P(x)\to P(\scr x))\to
\Forall x P(x)$, for every unary relation $P$ on $\N$.
\end{enumerate}
\end{axiom}
Thus, in one sense, there is a single `Peano axiom', asserting that a
structure $(\N,{}\scr{},0)$ exists with certain properties.
Its properties are that it satisfies the following three
\tech{axioms}---where now `axiom' is used in a slightly different sense:
\begin{description}
\item[\axz]
$\forall x\qsep \scr x\neq0$;
\item[\axu]
$\forall x\qsep\forall y\qsep(\scr x=\scr y\to x=y)$;
\item[\axi]
$0\in X\land \forall x\qsep(x\in X\to \scr x\in
X)\to \forall x\qsep x\in X$, for every subset $X$.
\end{description}
The set-theoretic axioms given in \S~\ref{sect:sets} are supposed to be
true in the mathematical world. The three axioms just above are
supposed to be true \emph{in a particular structure} in the
mathematical world. Note that \axi{}, considered as a single
sentence, is not a first-order
sentence, but is \tech{second-order}, since the variable $X$ refers to
sub\emph{sets} of a model, and not to elements. (\axz{} and \axu{}
are first-order.)
\begin{remark}
In first-order logic, \axi{} is replaced by a \tech{scheme} of axioms,
consisting of one sentence
\begin{equation}\label{eqn:pa}
\phi(0)\land \Forall x(\phi(x)\to \phi(\scr x))\to
\Forall x \phi(x)
\end{equation}
for each unary first-order formula $\phi$ in the signature
$\{{}\scr{},0\}$ with parameters. This scheme of axioms is
\emph{weaker} than \axi, because not every subset of $\N$ is defined
by a first-order formula. (Later we shall be able to prove this:
There are \tech{countably} many formulas $\phi(x)$, but
$\N$ has \tech{uncountably} many subsets.) This scheme of axioms
\eqref{eqn:pa}, together with \axz{} and
\axu, might be denoted $\pa$. It is a consequence of G\"odel's
Incompleteness Theorem that $\pa$ is an \tech{incomplete} theory.
This means that some first-order sentences are true in
$(\N,{}\scr{},0)$, but are not logical consequences of $\pa$. In
fact, there are models of $\pa$ that are not models of \axi.
\end{remark}
To talk more about the Peano Axioms, we make the following:
\begin{definition}
A natural number is called a \defn{successor} if it is $\scr x$ for
some $x$ in $\N$. We have special names for certain successors:
\begin{center}
\begin{tabular}{c||c|c|c|c|c|c|c|c|c}
$x$ & 0&1&2&3&4&5&6&7&8\\ \hline
$\scr x$ &1&2&3&4&5&6&7&8&9
\end{tabular}
\end{center}
A natural number $x$ is an \defn{immediate predecessor of} $y$ if
$\scr x=y$.
\end{definition}
Later we shall define the binary operation $(x,y)\mapsto x+y$ so that
$\scr x=x+1$.
Our names for the Peano Axioms are tied to their meanings (although
these names are not in general use):
\begin{itemize}
\item
\axz\ is that \emph{Z}ero is not a successor.
\item
\axu\ is that immediate predecessors are \emph{U}nique when
they exist.
\item
\axi\ is the Axiom of \defn{Induction}:\ a set contains all natural
numbers, provided that it contains $0$ and contains the successor of
each natural number that it contains.
\end{itemize}
Also, \axz\ is that the immediate
predecessor of $0$ does \emph{not} exist.\footnote{Peano did not count
$0$ as a natural number, so
his original axioms included the assertion that $1$ had no immediate
predecessor.} \axu{} is that the successor-operation is injective.
We may henceforth write $\N$ instead of $(\N,{}\scr{},0)$.
As first examples of the Induction Axiom in action, we have:
\begin{lemma}\label{lem:zero-succ}
Every non-zero natural number is a successor. Symbolically,
\begin{equation*}\N\models\Forall x (x=0\lor\Exists y \scr
y=x).\end{equation*}
\end{lemma}
\begin{proof}
Let $A$ be the set of natural numbers comprising $0$ and the
successors. That is, $A=\{0\}\cup\{x\in\N:\Exists y\scr y=x\}$.
Then $0\in A$ by definition. Also, if $x\in A$, then
$\scr x$ is a successor, so $\scr x\in A$. By induction, $A=\N$.
\end{proof}
\begin{theorem}
The successor-operation is a bijection between $\N$ and
$\N\setminus\{0\}$.
\end{theorem}
\begin{exercise}
Prove the theorem.
\end{exercise}
\begin{lemma}Every natural number is distinct from its successor:
\begin{equation*}\N\models\Forall x \scr x\neq x.\end{equation*}
\end{lemma}
\begin{proof}
Let $A=\{x\in\N:\scr x\neq x\}$. Now, $\scr 0$ is a successor and is
therefore distinct from $0$ by \axz. Hence $0\in A$. Suppose
$x\in A$. Then $\scr x\neq x$. Therefore $\scr{(\scr x)}\neq \scr x$ by the
contrapositive of \axu; so $\scr x\in A$. By induction, $A=\N$.
\end{proof}
We can spell out \axi{} more elaborately thus: \emph{For every unary
relation
$P$ on $\N$, in order to prove $\N\models\Forall x
P(x)$, it is enough to prove two things:
\begin{enumerate}
\item
$\N\models P(0)$ (the \defn{base step});
\item
$\N\models\Forall x
(P(x)\to P(\scr x))$ (the \defn{inductive step}), that is,
$P(\scr x)$ is true under the assumption that $x$ is a natural number
and $P(x)$ is true.
\end{enumerate}
}
In the inductive step of a proof, the assumption that $x\in\N$ and
$\N\models P(x)$ is called the \defn{inductive hypothesis}. In the
proof of Lemma \ref{lem:zero-succ}, the full inductive hypothesis was
not needed; only $x\in\N$ was needed.
\setcounter{equation}{0}\section{Binary operations on natural numbers}
\markright{\sectbegin Binary operations on natural numbers}
To be able to say much more about the natural numbers, we should
introduce the usual arithmetic operations. But how? We do not need new
axioms; the axioms that we already have are enough to enable us to
\emph{define} the arithmetic operations.
Let's start with \defn{addition}. This is a binary operation $+$ on
$\N$ whose values can be arranged in an (infinite) matrix as follows,
in which $m+n$ is the entry $(m,n)$, that is, the entry in row $m$
and column $n$, with rows and columns counted from $0$:
\begin{equation*}
\begin{matrix}
0 & 1 & 2 & 3 & \cdots\\
1 & 2 & 3 & 4 & \\
2 & 3 & 4 & 5 &\\
3 & 4 & 5 & 6 &\\
\vdots &&&&\ddots
\end{matrix}
\end{equation*}
Then row $m$ of this matrix is the sequence of values of a unary
operation $f_m$ on $\N$ such that $f_m(0)=m$ and $f_m(\scr
n)=\scr{f_m(n)}$ for all $n$ in $\N$. So we can \emph{define} $m+n$
as $f_m(n)$. To do this rigorously, we need to know two facts:
\begin{enumerate}
\item
that the functions $f_m$ exist (so that an addition can be defined);
and
\item
that the $f_m$ are unique (so that there is only one addition).
\end{enumerate}
Each of these facts is established by induction, as follows:
\begin{theorem}\label{thm:addition}
There is a unique binary operation $+$ on $\N$ such
that $x+0=x$ and
\begin{equation*}
x+\scr y=\scr{(x+y)}
\end{equation*}
for all $x$ and $y$ in $\N$.
\end{theorem}
\begin{proof}
Let $A$ be the set of natural numbers $x$ for which there is a unary
operation $f_x$ on $\N$ such that $f_x(0)=x$ and
\begin{equation*}
f_x(\scr y)=\scr{f_x(y)}
\end{equation*}
for all $y$ in $\N$. We can define $f_0$ by
\begin{equation*}
f_0(y)=y.
\end{equation*}
So $0\in A$. Suppose $x\in A$. Define $f_{\scr x}$ by
\begin{equation*}
f_{\scr x}(y)=\scr{f_x(y)}.
\end{equation*}
Then $f_{\scr x}(0)=\scr{f_x(0)}=\scr x$, and
\begin{equation*}
f_{\scr x}(\scr y)=\scr{f_x(\scr y)}=\scr{(\scr{f_x(y)})}=\scr{f_{\scr
x}(y)};
\end{equation*}
so $\scr x\in A$. By induction, $A=\N$. This establishes the
\emph{existence} of the desired operation $+$, since
we can define $x+y=f_x(y)$.
For the uniqueness of $+$, it is enough to note the uniqueness of the
functions $f_x$. If $f'_x$ has the properties of $f_x$, then
$f'_x(0)=x=f_x(0)$, and if $f'_x(y)=f_x(y)$, then $f'_x(\scr
y)=\scr{f'_x(y)}= \scr{f_x(y)}=f_x(\scr y)$. By induction,
$f'_x=f_x$.
\end{proof}
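The construction in the proof can be mimicked computationally. In the following Python sketch (an illustration only: Python's integers stand in for $\N$, and \texttt{succ} for the successor-operation), \texttt{f(x)} builds the unary operation $f_x$ exactly as in the inductive step:

```python
def succ(n):
    return n + 1

def f(x):
    # f_0 is the identity; f_{succ x}(y) = succ(f_x(y)), as in the proof
    if x == 0:
        return lambda y: y
    g = f(x - 1)
    return lambda y: succ(g(y))

def add(x, y):
    # x + y defined as f_x(y)
    return f(x)(y)
```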
\begin{lemma}\label{lem:add}
$\N$ satisfies
\begin{enumerate}
\item
$\Forall x 0+x=x$,
\item\label{succ-add}
$\Forall x \Forall y \scr y+x=\scr{(y+x)}$.
\end{enumerate}
\end{lemma}
\begin{exercise}
Prove the lemma. (For part (\ref{succ-add}), this can be
done by showing $\N=\{x:\Forall y
\scr y+x=\scr{(y+x)}\}$.)
\end{exercise}
\begin{theorem}
$\N$ satisfies
\begin{enumerate}\setcounter{enumi}{2}
\item
$\Forall x \scr x=x+1$,
\item
$\Forall x \Forall y x+y=y+x$ [that is, $+$ is \defn{commutative}],
\item
$\Forall x \Forall y \Forall z (x+y)+z=x+(y+z)$ [that
is, $+$ is \defn{associative}].
\end{enumerate}
\end{theorem}
\begin{exercise}
Prove the theorem.
\end{exercise}
We can uniquely define \defn{multiplication} on $\N$ just as we did
addition: We can show that the multiplication-table
\begin{equation*}
\begin{matrix}
0 & 0 & 0 & 0 & \cdots\\
0 & 1 & 2 & 3 & \\
0 & 2 & 4 & 6 & \\
0 & 3 & 6 & 9 & \\
\vdots &&&&\ddots
\end{matrix}
\end{equation*}
can be written in exactly one way:
\begin{theorem}\label{thm:multiplication}
There is a unique binary operation $\cdot$ on $\N$ such that $x\cdot
0=0$ and
\begin{equation*}
x\cdot\scr y=x\cdot y+x
\end{equation*}
for all $x$ and $y$ in $\N$.
\end{theorem}
\begin{exercise}
Prove the theorem.
\end{exercise}
Multiplication is also indicated by juxtaposition, so that $x\cdot y$
is $xy$.
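The two defining equations of multiplication translate directly into a recursive program. A Python sketch (again with built-in integers standing in for $\N$; \texttt{add} is the recursively defined addition):

```python
def add(x, y):
    # x + 0 = x;  x + scr(y) = scr(x + y)
    return x if y == 0 else add(x, y - 1) + 1

def mul(x, y):
    # x . 0 = 0;  x . scr(y) = x . y + x
    return 0 if y == 0 else add(mul(x, y - 1), x)
```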
\begin{lemma}
$\N$ satisfies
\begin{enumerate}
\item
$\Forall x 0x=0$,
\item
$\Forall x \Forall y \scr yx=yx+x$.
\end{enumerate}
\end{lemma}
\begin{exercise}
Prove the lemma.
\end{exercise}
\begin{theorem}
$\N$ satisfies
\begin{enumerate}\setcounter{enumi}{2}
\item
$\Forall x 1x=x$,
\item
$\Forall x \Forall y xy=yx$ [that is, $\cdot$ is commutative],
\item
$\Forall x \Forall y \Forall z (x+y)z=xz+yz$ [that
is, $\cdot$ \defn{distributes} over $+$],
\item
$\Forall x \Forall y \Forall z (xy)z=x(yz)$ [that is,
$\cdot$ is associative].
\end{enumerate}
\end{theorem}
\begin{exercise}
Prove the theorem.
\end{exercise}
In establishing addition and multiplication as operations with the
familiar properties, we used only that $\N$ satisfies the
Induction Axiom. Other structures satisfy this axiom as well, so they
too have addition and multiplication:
\begin{example}\label{example:3}
Let $A=\{0,1,2\}$, and define $s:A\to A$ by
\begin{center}
\begin{tabular}{c || c | c | c}
$x$ & $0$ & $1$ & $2$\\ \hline
$s(x)$ & $1$ & $2$ & $0$
\end{tabular}
\end{center}
Then $(A,s,0)$ satisfies \axi, so it must have addition and
multiplication---which in fact are given by the matrices
\begin{equation*}
\begin{matrix}
0 & 1 & 2\\
1 & 2 & 0\\
2 & 0 & 1
\end{matrix}\quad\text{ and }\quad
\begin{matrix}
0 & 0 & 0\\
0 & 1 & 2\\
0 & 2 & 1
\end{matrix}
\end{equation*}
But $(A,s,0)$ does not satisfy \axz.
\end{example}
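The tables of the example can be recovered by unwinding the defining equations in $(A,s,0)$; since each element of $A$ is $s^y(0)$ for a unique $y$ in $\{0,1,2\}$, the recursion can be unrolled as a loop. A Python sketch (an illustration only):

```python
A = [0, 1, 2]

def s(x):
    # the successor-operation of the example
    return (x + 1) % 3

def add(x, y):
    # x + 0 = x;  x + s(y) = s(x + y): unroll y applications of s
    r = x
    for _ in range(y):
        r = s(r)
    return r

def mul(x, y):
    # x . 0 = 0;  x . s(y) = x . y + x
    r = 0
    for _ in range(y):
        r = add(r, x)
    return r

add_table = [[add(x, y) for y in A] for x in A]
mul_table = [[mul(x, y) for y in A] for x in A]
```

The computed tables agree with the matrices displayed in the example.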
If a structure satisfies \axi, we may say that the structure
\defn{admits (proof by) induction}.\label{page:induction}
So all structures that admit induction have unique operations of addition
and multiplication with the properties given above.
Exponentiation on $\N$ is a binary operation $(x,y)\mapsto x^y$ whose
values compose the matrix
\begin{equation*}
\begin{matrix}
1 & 0 & 0 & 0& \cdots\\
1 & 1 & 1 & 1 & \\
1 & 2 & 4 & 8 & \\
1 & 3 & 9 & 27 &\\
\vdots &&&&\ddots
\end{matrix}
\end{equation*}
The formal properties are that $x^0=1$ and
\begin{equation*}
x^{\scr y}=x^y\cdot x
\end{equation*}
for all $x$ and $y$ in $\N$. By induction, there can be no more than
one such operation:
\begin{exercise}
Prove this.
\end{exercise}
Nonetheless, we shall need more than induction to prove that such an
operation exists at all:
\begin{example}
In the induction-admitting structure $(A,s,0)$ of Example \ref{example:3}, if
we try to
define exponentiation, we get $2^0=1$, $2^1=2$, $2^2=1$,
$2^{s(2)}=2^2\cdot 2=2$; but $s(2)=0$, so
$2^{s(2)}=2^0=1$. Since $1\neq 2$, our attempt fails.
\end{example}
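The clash can be computed step by step. In this Python sketch of the failed attempt (with the example's successor and multiplication hard-coded):

```python
def s(x):
    return (x + 1) % 3          # the successor of the example

def mul(x, y):
    return (x * y) % 3          # agrees with the example's table

p0 = 1                          # 2^0 = 1
p1 = mul(p0, 2)                 # 2^{s(0)} = 2^0 . 2
p2 = mul(p1, 2)                 # 2^{s(1)} = 2^1 . 2
p3 = mul(p2, 2)                 # 2^{s(2)} = 2^2 . 2 ...
clash = (s(2) == 0) and (p3 != p0)   # ... but s(2) = 0, and 2^0 = 1
```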
For any $x$ in $\N$, we want to define $y\mapsto x^y$ as an operation
$g$ such that $g(0)=1$, and $g(\scr n)=g(n)\cdot x$. We have just
seen that induction is \emph{not} enough to allow us to do this. In
the next section, we shall see that \tech{recursion} is enough, and
that this is equivalent to \axz, \axu{} and \axi{} together.
\section{Recursion}
\markright{\sectbegin Recursion}
We want to be able to define functions $g$ on $\N$ by specifying
$g(0)$ and by specifying how to obtain $g(\scr n)$ from $g(n)$. The
next theorem is that we can do this. The proof is difficult, but the
result is powerful:
\begin{theorem}[Recursion]\label{thm:recursion}
Suppose $B$ is a set with an element $c$. Suppose $f$ is a unary operation
on $B$. Then there is a \emph{unique} function
$g:\N\to B$ such that $g(0)=c$ and
\begin{equation}\label{eqn:recursion}
g(\scr x)=f(g(x))
\end{equation}
for all $x$ in $\N$.
\end{theorem}
\begin{proof}
Let $\family S$ be the set whose members
are the subsets $R$ of $\N\times B$ that have the following two
properties:
\begin{enumerate}\setcounter{enumi}{1}
\item\label{item:base}
$(0,c)\in R$;
\item\label{item:ind}
$(x,t)\in R\implies(\scr x,f(t))\in R$, for all $(x,t)$ in $\N\times B$.
\end{enumerate}
So the members of $\family S$ have the properties required of $g$,
except perhaps the property of being a function on $\N$.
The set $\family S$ is non-empty,
since $\N\times B$ itself is in $\family S$. Let $g$ be the
intersection $\bigcap\family S$. Then $g\in\family S$ (why?).
We shall show that $g$ is a function with domain
$\N$. To do this, we shall show by induction that, for all $x$ in
$\N$, there is a unique $t$ in $B$ such that $(x,t)\in g$.
For the base step of our induction, we note first that $(0,c)\in g$.
To finish the base step, we shall show that, for every $t$ in $B$, if
$(0,t)\in g$, then $t=c$. Suppose $t\neq c$. Then neither property
(\ref{item:base}) nor property (\ref{item:ind}) requires $(0,t)$ to be
in a given member of $\family S$. That is, if $R\in\family S$, then
$R\setminus\{(0,t)\}$ still has these two properties; so, this
set is in $\family S$. In particular,
$g\setminus\{(0,t)\}\in\family S$.
But $g$ is the smallest member of $\family S$, so
\begin{equation*}g\included g\setminus\{(0,t)\},\end{equation*}
which means $(0,t)\notin g$. By contraposition, the base step is
complete.
As an inductive hypothesis, let us suppose that $x\in \N$ and that
there is a unique $t$ in $B$ such that
$(x,t)\in g$. Then $(\scr x,f(t))\in g$. To complete our inductive
step, we shall show that, for every $u$ in $B$, if $(\scr
x,u)\in g$, then $u=f(t)$.
There are two possibilities for $u$:
\begin{enumerate}\setcounter{enumi}{3}
\item
If $(\scr x,u)=(\scr y,f(v))$
for some $(y,v)$ in $g$, then $\scr x=\scr y$, so $x=y$ by \axu; this
means $(x,v)\in g$, so $v=t$ by inductive hypothesis, and therefore
$u=f(v)=f(t)$.
\item
If $(\scr x,u)\neq(\scr y,f(v))$
for any $(y,v)$ in $g$, then (as in the base step) $g\setminus\{(\scr
x,u)\}\in\family S$, so $g\included g\setminus\{(\scr x,u)\}$, which
means $(\scr x,u)\notin g$.
\end{enumerate}
Therefore, if $(\scr x,u)\in g$, then $(\scr x,u)=(\scr y,f(v))$
for some $(y,v)$ in $g$, in which case $u=f(t)$. Therefore
$f(t)$ is unique such that $(\scr x,f(t))\in g$.
Our induction is now complete; by \axi, we may conclude that
$g$ is a function on $\N$ with the
required properties (\ref{item:base}) and (\ref{item:ind}). If $h$
is also such a function, then $h\in\family S$, so
$g\included h$, which means $g=h$ since both are functions on
$\N$. So $g$ is unique.
\end{proof}
\begin{exercise}
If $g$ and $\family S$ are as in the proof of the Recursion Theorem,
prove that $g\in\family S$.
\end{exercise}
Equation
(\ref{eqn:recursion}) in the statement of Theorem \ref{thm:recursion}
is depicted in the following diagram:
\begin{equation*}
\begin{CD}
\N @>{\scr{}}>> \N\\
@V{g}VV @VV{g}V\\
B @>>{f}> B
\end{CD}
\end{equation*}
From the $\N$ on the left to the $B$ on the right, there are two
different routes, but each one yields the
same result.
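Computationally, the content of the theorem is that iterating $f$ from $c$ is well-defined and unique: $g(x)=f^x(c)$. A Python sketch (the data \texttt{c} and \texttt{f} in the sample are arbitrary, not from the proof):

```python
def g(x, c, f):
    # the unique g with g(0) = c and g(scr x) = f(g(x)):
    # apply f to c exactly x times, since x = scr^x(0)
    r = c
    for _ in range(x):
        r = f(r)
    return r

sample = g(3, '', lambda t: t + '|')    # three applications of f to ''
```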
A \defn{definition by recursion}\label{page:recursion} is a definition
of a function on
$\N$ that is justified by Theorem \ref{thm:recursion}. Informally,
we can define such a function $g$ by specifying $g(0)$ and by
specifying how $g(\scr x)$ is obtained from $g(x)$.
\begin{remark}
Sections \ref{sect:arith-ops} and \ref{sect:recursion-gen} will
provide several important examples of recursive definitions.
\end{remark}
\begin{theorem}
The Induction Axiom is a logical consequence of the Recursion
Theorem.
\end{theorem}
\begin{proof}
Suppose $A\included\N$, and $0\in A$, and $\scr x\in A$ whenever
$x\in A$. Using the Recursion Theorem alone, we shall show $A=\N$.
Let $\B=\{0,1\}$, and define a function $g_0:\N\to\B$ by the rule
\begin{equation*}g_0(x)=
\begin{cases}
0,&\text{ if }x\in A;\\
1,&\text{ if }x\in \N\setminus A.
\end{cases}
\end{equation*}
Then $g_0$ is a function $g:\N\to\B$ such that $g(0)=0$ and $g(\scr
x)=g(x)$ for all $x$ in $\N$ (why?). But the function $g_1$ such that
$g_1(x)=0$ for all $x$ in $\N$ is also such a function $g$. By the
Recursion Theorem, there is only one such function $g$. Therefore
$g_0=g_1$, so $g_0(x)$ is never $1$, which means $A=\N$.
\end{proof}
\begin{exercise}
Supply the missing detail in the proof.
\end{exercise}
However, there are models of the Induction Axiom which do not satisfy
the Recursion Theorem:
\begin{example}\label{exam:ind-not-imp-rec}
Again let $\B=\{0,1\}$, and let $\lnot$ be the unary operation on
$\B$ such that $\lnot 0=1$ and $\lnot 1=0$. Then
$(\B,\lnot,0)$ admits induction, but there is \emph{no}
function $g:\B\to\N$ such that $g(0)=0$ and $g(\lnot x)={(g(x))}+1$
for all $x$ in $\B$.
\end{example}
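The non-existence claimed in the example can be confirmed by exhausting the finitely many relevant candidates: any such $g$ would satisfy $g(0)=g(\lnot 1)=g(1)+1=g(\lnot 0)+1=g(0)+2$, which is impossible. A Python sketch of the search (candidate values are capped at a small bound, which suffices, since the constraints force the contradiction within two steps):

```python
def neg(x):
    return 1 - x                 # the operation "not" on B = {0, 1}

found = False
for g0 in range(4):              # candidate values for g(0) and g(1)
    for g1 in range(4):
        g = {0: g0, 1: g1}
        if g[0] == 0 and all(g[neg(x)] == g[x] + 1 for x in (0, 1)):
            found = True
```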
\begin{remark}
Apparently Peano
himself did not recognize the distinction between proof by induction
and definition by recursion; see the discussion in Landau
\cite[p.~x]{MR12:397m}.
Burris \cite[p.~391]{Burris} does not acknowledge the distinction.
Stoll \cite[p.~72]{MR83e:04002} uses the term `definition by weak
recursion', although he
observes that the validity of such a definition does \emph{not
obviously} follow from the Induction Axiom. However, Stoll does not
\emph{prove} (as we have done in Example \ref{exam:ind-not-imp-rec})
that the Induction Axiom is consistent with the negation of the
Recursion Theorem.
\end{remark}
\begin{remark}
The structure $(\B,\lnot,0)$ in Example \ref{exam:ind-not-imp-rec} also
satisfies \axu, but not \axz. If we define $t:\B\to \B$ so that
$t(x)=1$ for each $x$ in $\B$, then $(\B,t,0)$ satisfies the
Induction Axiom and \axz, but not \axu. Later (see Remark
\ref{rem:lim-ord}) we shall have natural
examples of structures satisfying \axz\ and \axu, but not
Induction. We shall also observe (in Remark \ref{rem:u-from-rec})
that \axu\ is a consequence of the Recursion Theorem.
\end{remark}
\begin{exercise}
Prove that \axz\ is a consequence of the Recursion Theorem.
\end{exercise}
\setcounter{equation}{0}
\section{Binary operations by recursion}\label{sect:arith-ops}
\markright{\sectbegin Binary operations by recursion}
The Recursion Theorem guarantees the existence of certain \emph{unary}
functions on $\N$. As in Theorem \ref{thm:addition}, we can obtain the
binary operation of addition from the unary operations $y\mapsto
x+y$. By recursion, we can define addition as the unique operation
such that
\begin{equation*}
x+0=x\land x+\scr y=\scr{(x+y)}
\end{equation*}
for all $x$ and $y$ in $\N$. In the same way, we can define
multiplication by
\begin{equation*}
x\cdot 0=0
\land x\cdot \scr y=x\cdot y+x.
\end{equation*}
The definition of exponentiation can follow this pattern:
\begin{definition}
The binary operation
$(x,y)\mapsto x^y$
on $\N$ is given by:
\begin{equation}\label{eqn:exp}
x^0=1 \land
x^{\scr y}=x^y\cdot x.
\end{equation}
\end{definition}
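These three recursions can be run directly. In the following Python sketch (illustrative only, with Python's built-in integers standing in for $\N$), each operation is defined from the successor $x\mapsto x+1$ alone, following the displayed clauses.

```python
# Sketch: addition, multiplication, and exponentiation on N defined only
# from the successor operation, following the recursive clauses above.
def succ(x):
    return x + 1

def add(x, y):           # x + 0 = x,  x + succ(y) = succ(x + y)
    return x if y == 0 else succ(add(x, y - 1))

def mul(x, y):           # x * 0 = 0,  x * succ(y) = x * y + x
    return 0 if y == 0 else add(mul(x, y - 1), x)

def power(x, y):         # x^0 = 1,  x^succ(y) = x^y * x
    return 1 if y == 0 else mul(power(x, y - 1), x)
```

Note that each definition uses only the operation defined before it, exactly as in the text.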
In fact, we have something a bit more general. A \defn{monoid} is a
structure $(A,\cdot,1)$ in which $\cdot$ is associative, and $a\cdot
1=a=1\cdot a$ for all $a$ in $A$. The monoid is \defn{commutative} if
$\cdot$ is commutative.
\begin{theorem}
Suppose $\str A$ is a monoid. For every
$y$ in $\N$, there is a unique
operation $x\mapsto x^y$ on $A$ such that \eqref{eqn:exp}
holds for all $x$ in $A$ and all $y$ in $\N$.
\end{theorem}
\begin{proof}
Let $c$ be the
operation $x\mapsto 1$ on $A$, let $B$ be the
set of unary operations on $A$,
and let $f$ be the operation
\begin{equation*}
h\longmapsto(x\mapsto h(x)\cdot x)
\end{equation*}
on $B$. By recursion, there is a function $g:\N\to B$
such that $g(0)=c$ and $g(\scr y)=f(g(y))$ for all $y$ in $\N$. Now
define $x^y=g(y)(x)$.
\end{proof}
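The construction in the proof can be carried out literally: $g(y)$ is an element of $B$, that is, a unary operation on $A$. The following Python sketch (our own illustration, using Python strings under concatenation as a sample monoid with unit the empty string) builds $g(y)$ by iterating $f$ and then applies it to $x$.

```python
# Sketch of the proof's construction: g(0) = c (the operation x -> 1),
# and g(succ y) = f(g(y)), where f sends h to the operation x -> h(x) * x.
# The monoid here is Python strings under concatenation, with unit "".
def monoid_power(x, y):
    c = lambda a: ""                     # the operation x -> 1
    f = lambda h: (lambda a: h(a) + a)   # h -> (x -> h(x) * x)
    g = c
    for _ in range(y):                   # compute g(y) by primitive recursion
        g = f(g)
    return g(x)                          # x^y = g(y)(x)
```

With a commutative monoid in place of strings, the same construction gives ordinary exponentiation.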
\begin{theorem}
For all $x$ and $w$ in a commutative monoid, and for all
$y$ and $z$ in $\N$, the following hold:
\begin{enumerate}
\item
$x^{y+z}=x^yx^z$;
\item
$(x^y)^z=x^{yz}$;
\item
$(xw)^z=x^zw^z$.
\end{enumerate}
\end{theorem}
\begin{exercise}
Prove the theorem.
\end{exercise}
The \defn{binomial coefficient} $\binom mn$ is entry $(m,n)$ in the
following matrix:
\begin{equation*}
\begin{matrix}
1 & 0 & 0 & 0 & 0 & \cdots\\
1 & 1 & 0 & 0 & 0 &\\
1 & 2 & 1 & 0 & 0 &\\
1 & 3 & 3 & 1 & 0 &\\
1 & 4 & 6 & 4 & 1 &\\
\vdots &&&&&\ddots
\end{matrix}
\end{equation*}
We can give a formal definition by recursion:
\begin{definition}
The binary operation $(x,y)\mapsto \binom xy$ on $\N$ is given by:
\begin{equation}\label{eqn:binom}
\textstyle \binom x0=1\land \binom 0{\scr y}=0\land \binom{\scr
x}{\scr y}=\binom xy+\binom x{\scr y}
\end{equation}
\end{definition}
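The recursion \eqref{eqn:binom} can be checked against the matrix above. The following Python sketch (illustrative only) transcribes the three clauses.

```python
# Sketch: binomial coefficients by the recursion
# C(x, 0) = 1,  C(0, succ y) = 0,  C(succ x, succ y) = C(x, y) + C(x, succ y).
def binom(x, y):
    if y == 0:
        return 1
    if x == 0:
        return 0
    return binom(x - 1, y - 1) + binom(x - 1, y)
```

The row for $x=4$ reproduces the fifth row of the matrix.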
\begin{exercise}
Show precisely that this is a valid definition by recursion.
\end{exercise}
As with exponentiation, we can define the binomial coefficients in
a more general setting. The proof uses the same technique as the
proof of the Recursion Theorem:
\begin{theorem}
For any structure $(A,{}\scr{},0)$ that satisfies \axu{} and admits
induction, for every $y$ in $\N$, there is a unique
operation $x\mapsto \binom xy$ on $A$ such that \eqref{eqn:binom} holds
for all $x$ in $A$ and all $y$ in $\N$.
\end{theorem}
\begin{proof}
Let $c$ be the
operation $x\mapsto 1$ on $A$, and let $B$ be the
set of unary operations on $A$. We first prove that, for
every $h$ in $B$, there is a unique operation $f(h)$ in $B$ given by
\begin{equation*}
f(h)(0)=0\land f(h)(\scr x)=h(x)+f(h)(x).
\end{equation*}
Say $h\in B$, and let $\family S$ be the set whose members are the
subsets $R$ of $A\times A$ such that:
\begin{enumerate}
\item
$(0,0)\in R$;
\item
$(x,t)\in R\implies (\scr x,h(x)+t)\in R$, for all $(x,t)$ in $A\times
A$.
\end{enumerate}
Then $\bigcap \family S$ is the desired operation $f(h)$. (Why?) By
recursion, there is a function $g:\N\to B$ such that $g(0)=c$ and
$g(\scr y)=f(g(y))$ for all $y$ in $\N$. Now
define $\binom xy=g(y)(x)$.
\end{proof}
\begin{exercise}
Supply the missing detail in the proof.
\end{exercise}
\begin{exercise}
Prove that $\binom x1=x$ for all $x$ in $\N$.
\end{exercise}
See also Exercises \ref{exer:binom} and \ref{exer:bin-thm}.
In the proof of the last theorem, it was essential that the
successor-operation on $A$ be injective:
\begin{example}
Let $A=\{0,1,2\}$, and define $s$ on $A$ by
\begin{center}
\begin{tabular}{ c || c | c | c}
$x$ & $0$ & $1$ & $2$\\ \hline
$s(x)$ & $1$ & $2$ & $1$
\end{tabular}
\end{center}
If we attempt to define $(x,y)\mapsto\binom xy$ on $A\times\N$, we get a
matrix
\begin{equation*}
\begin{matrix}
1 & 0 & 0\\
1 & 1 & 0\\
1 & 2 & 1\\
1 & 1 & 1
\end{matrix}
\end{equation*}
That is, $\binom 12$ should be both $0$ and $1$. So our attempt fails.
\end{example}
\setcounter{equation}{0}
\section{The integers and the rational numbers}\label{sect:int&rat}
\markright{\sectbegin The integers and the rational numbers}
Arithmetic on the integers is determined by arithmetic on the natural
numbers. Given $\N$, we could just \emph{define} the negative
integers by somehow attaching minus-signs. A neater approach is
motivated as follows.
For each natural number $a$, we want there to be an integer $x$ such
that
\begin{equation*}0=a+x.\end{equation*}
Then for each $b$ in $\N$, we should have
\begin{equation*}b=a+b+x.\end{equation*}
By these equations, the pairs $(0,a)$ and $(b,a+b)$ determine the same
integer; so we can define integers to be equivalence-classes of such
pairs.
\begin{lemma}\label{lem:eq}
On $\N\times\N$, let $\sim$ be the relation given by
\begin{equation*}(a,b)\sim(c,d)\Iff a+d=b+c.\end{equation*}
Then $\sim$ is an equivalence-relation. If $(a_0,b_0)\sim(a_1,b_1)$ and
$(c_0,d_0)\sim(c_1,d_1)$, then
\begin{enumerate}
\item
$(a_0+c_0,b_0+d_0)\sim(a_1+c_1,b_1+d_1)$;
\item
$(b_0,a_0)\sim(b_1,a_1)$;
\item\label{item:prod}
$(a_0c_0+b_0d_0,b_0c_0+a_0d_0)\sim(a_1c_1+b_1d_1,b_1c_1+a_1d_1)$.
\end{enumerate}
\end{lemma}
\begin{exercise}
Prove the lemma. (For part (\ref{item:prod}), show that each
member is equivalent to $(a_1c_0+b_1d_0,b_1c_0+a_1d_0)$.)
\end{exercise}
\begin{definition}
Let $\sim$ be as in Lemma \ref{lem:eq}. We define
$\Z$ to be $\N\times\N\modsim$.
Let the $\mathord{\sim}$-class of $(a,b)$ be denoted
\begin{equation*}a-b.\end{equation*}
By Lemma \ref{lem:eq}, we can define the operations $+$, $-$ and
$\cdot$ on $\Z$ by the following rules, where $a,b,c,d\in\N$:
\begin{enumerate}
\item\label{item:ambig-sum}
$(a-b)+^{\Z}(c-d)=(a+^{\N}c)-(b+^{\N}d)$;
\item
$-^{\Z}(a-b)=b-a$;
\item
$(a-b)\cdot^{\Z}(c-d)=
(a\cdot^{\N}c+^{\N}b\cdot^{\N}d)-(b\cdot^{\N}c+^{\N}a\cdot^{\N}d)$.
\end{enumerate}
\end{definition}
Note that, by the definition, an integer like $5-3$ is \emph{not} the
natural number $2$;
it is not a natural number at all; it is the equivalence-class
\begin{equation*}\{(2,0),(3,1),(4,2),(5,3),\dots\},\end{equation*}
which is $\{(x,y)\in\N^2:x=y+2\}$.
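The arithmetic of these equivalence-classes can be sketched concretely. In the following Python illustration (ours, not part of the formal development), a class is represented by the unique pair $(a,b)$ in it with $\min(a,b)=0$; the three rules of the definition then operate on representatives.

```python
# Sketch: an integer as the ~-class of a pair (a, b) of naturals, where
# (a, b) ~ (c, d) iff a + d = b + c.  We normalize so min(a, b) = 0,
# which selects one representative from each class.
def norm(a, b):
    m = min(a, b)
    return (a - m, b - m)

def z_add(p, q):                 # (a-b) + (c-d) = (a+c) - (b+d)
    (a, b), (c, d) = p, q
    return norm(a + c, b + d)

def z_neg(p):                    # -(a-b) = b - a
    a, b = p
    return norm(b, a)

def z_mul(p, q):                 # (a-b)(c-d) = (ac+bd) - (bc+ad)
    (a, b), (c, d) = p, q
    return norm(a * c + b * d, b * c + a * d)
```

For example, the class of $(5,3)$ normalizes to $(2,0)$, the integer we usually call $2$.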
\begin{theorem}
The function
$x\mapsto x-0:\N\to\Z$ is injective and preserves $+$ and $\cdot$,
that is,
\begin{enumerate}
\item
$(x+^{\N}y)-0=(x-0)+^{\Z}(y-0)$;
\item
$(x\cdot^{\N}y)-0=(x-0)\cdot^{\Z}(y-0)$
\end{enumerate}
for all $x$ and $y$ in $\N$.
On $\Z$, addition and multiplication are commutative and associative,
and multiplication distributes over addition. Finally,
\begin{equation*}x+^{\Z}(-^{\Z}x)=0-0\end{equation*}
for all $x$ in $\Z$.
\end{theorem}
\begin{exercise}
Prove the theorem.
\end{exercise}
\begin{definition}
On $\Z$, define the binary operation $-$ by
\begin{equation*}x-^{\Z}y=x+^{\Z}(-^{\Z}y).\end{equation*}
\end{definition}
\begin{lemma}
If $x,y\in\N$, then the integer
$x-y$ is $(x-0)-^{\Z}(y-0)$.
\end{lemma}
Now we can identify the natural numbers with their images in $\Z$,
considering the natural number $x$ to be equal to the integer $x-0$.
We can define the \defn{rational numbers} similarly:
\begin{lemma}\label{lem:Qeq}
On $\Z\times(\Z\setminus\{0\})$, let $\sim$ be the relation given by
\begin{equation*}
(a,b)\sim(c,d)\Iff ad=bc.
\end{equation*}
Then $\sim$ is an equivalence-relation. If $(a_0,b_0)\sim(a_1,b_1)$ and
$(c_0,d_0)\sim(c_1,d_1)$, then
\begin{enumerate}
\item
$(a_0d_0\pm b_0c_0,b_0d_0)\sim(a_1d_1\pm b_1c_1,b_1d_1)$;
\item
$(a_0c_0,b_0d_0)\sim(a_1c_1,b_1d_1)$;
\item
$(b_0,a_0)\sim(b_1,a_1)$ and $(0,a_0)\sim(0,1)$ if $a_0\neq0$.
\end{enumerate}
\end{lemma}
\begin{exercise}
Prove the lemma.
\end{exercise}
\begin{definition}
Let $\sim$ be as in Lemma \ref{lem:Qeq}. We define
$\Q$ to be $\Z\times(\Z\setminus\{0\})\modsim$.
Let the $\mathord{\sim}$-class of $(a,b)$ be denoted
\begin{equation*}
\frac ab
\end{equation*}
or $a/b$.
By Lemma \ref{lem:Qeq}, we can define the operations $+$, $-$ and
$\cdot$ on $\Q$, and $x\mapsto x\inv$ on $\Q\setminus\{0/1\}$, by the
following rules, where $a,b,c,d\in\Z$:
\begin{enumerate}
\item
$a/b\pm c/d=(ad\pm bc)/bd$;
\item
$(a/b)(c/d)=ac/bd$;
\item
$(a/b)\inv=b/a$ if $a\neq0$.
\end{enumerate}
\end{definition}
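As with $\Z$, the rules can be transcribed directly. The following Python sketch (illustrative only, with Python's built-in integers standing in for $\Z$) represents a rational number by a pair and tests equality through the relation $\sim$ rather than by normalizing.

```python
# Sketch: a rational as a pair (a, b) with b != 0, where
# (a, b) ~ (c, d) iff a*d = b*c; the operations follow the rules above.
def q_eq(p, q):                  # the relation ~
    (a, b), (c, d) = p, q
    return a * d == b * c

def q_add(p, q):                 # a/b + c/d = (ad + bc)/bd
    (a, b), (c, d) = p, q
    return (a * d + b * c, b * d)

def q_mul(p, q):                 # (a/b)(c/d) = ac/bd
    (a, b), (c, d) = p, q
    return (a * c, b * d)

def q_inv(p):                    # (a/b)^{-1} = b/a, for a != 0
    a, b = p
    return (b, a)
```

Lemma \ref{lem:Qeq} is what guarantees that these operations respect $\sim$, so that they descend to $\Q$.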
\begin{theorem}
The function $x\mapsto x/1:\Z\to\Q$ is injective and preserves $+$,
$-$ and $\cdot$. On $\Q$, addition and multiplication are
commutative and associative,
and multiplication distributes over addition. Finally,
\begin{equation*}x\cdot x\inv=\frac 11
\end{equation*}
for all $x$ in $\Q\setminus\{0/1\}$.
\end{theorem}
\begin{exercise}
Prove the theorem.
\end{exercise}
Now we can identify the integers with their images in $\Q$,
considering the integer $x$ to be equal to the rational number $x/1$.
\setcounter{equation}{0}
\section{Recursion generalized}\label{sect:recursion-gen}
\markright{\sectbegin Recursion generalized}
How can we define $n$\defn{-factorial}, $(n!)$? Informally, we write
\begin{equation*}
n!=1\cdot 2\cdot 3\cdots(n-1)\cdot n.
\end{equation*}
For a formal recursive definition, we can try
\begin{equation}\label{eqn:factorial}
0!=1\land (\scr x)!=\scr x\cdot x!
\end{equation}
---but for this to be valid by the Recursion Theorem, we need an
operation $f$ on $\N$ so that $f(x!)=(\scr x\cdot x!)$. Such an
operation exists, but it is not clear how we can define it before
we have defined $(x!)$.
The definition \eqref{eqn:factorial} is valid by the following:
\begin{theorem}[Recursion with parameter]\label{thm:recursion-p}
Suppose $B$ is a set with an element $c$. Suppose $F$ is a function
from $\N\times B$ to $B$. Then there is a \emph{unique} function
$G:\N\to B$ such that $G(0)=c$ and
\begin{equation}\label{eqn:str-rec}
G(\scr x)=F(x,G(x))
\end{equation}
for all $x$ in $\N$.
\end{theorem}
\begin{proof}
Let $f$ be the function
\begin{equation*}
(x,b)\longmapsto(\scr x,F(x,b)):
\N\times B\longrightarrow\N\times B.
\end{equation*}
By recursion, there is a unique function $g$ from $\N$ to
$\N\times B$ such that $g(0)=(0,c)$ and
\begin{equation*}
g(\scr x)=f(g(x))
\end{equation*}
for all
$x$ in $\N$. Now let $G$ be $\pi\circ g$, where $\pi$ is the
function
\begin{equation*}
(x,b)\longmapsto b:\N\times B\longrightarrow B.
\end{equation*}
Then for each $x$ in $\N$ we have $g(x)=(y,G(x))$ for some $y$ in
$\N$. We can prove by induction that $y=x$. Indeed, this is the
case when $x=0$, since $g(0)=(0,c)$. Suppose $g(x)=(x,G(x))$ for some
$x$ in $\N$. Then
\begin{equation}\label{eqn:strong-rec}
g(\scr x)=f(x,G(x))=(\scr x,F(x,G(x))).
\end{equation}
In particular, the first entry in the value of $g(\scr x)$ is $\scr x$. This
completes our induction.
We now know that $g(x)=(x,G(x))$ for all $x$ in $\N$. Hence in
particular $g(\scr x)=(\scr x,G(\scr x))$. But we also have
(\ref{eqn:strong-rec}). Therefore we have (\ref{eqn:str-rec}),
as desired. Finally, each of $g$ and $G$ determines the other. Since
$g$ is unique, so is $G$.
\end{proof}
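The proof is constructive: iterate $f$ starting from $(0,c)$ and project onto the second coordinate. The following Python sketch (our illustration) does exactly this, and then obtains the factorial and the predecessor function as instances.

```python
# Sketch of recursion with parameter: given c in B and F: N x B -> B,
# build G with G(0) = c and G(succ x) = F(x, G(x)) by iterating
# f(x, b) = (succ x, F(x, b)) and projecting.
def recurse_with_parameter(c, F, n):
    pair = (0, c)                    # g(0) = (0, c)
    for _ in range(n):
        x, b = pair
        pair = (x + 1, F(x, b))      # g(succ x) = f(g(x))
    return pair[1]                   # G = pi o g

def factorial(n):                    # 0! = 1,  (succ x)! = succ x * x!
    return recurse_with_parameter(1, lambda x, b: (x + 1) * b, n)

def predecessor(n):                  # f(0) = 0,  f(succ x) = x
    return recurse_with_parameter(0, lambda x, b: x, n)
```

The predecessor function here is the one discussed in the example that follows.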
\begin{example}
We can define a function $f$ on $\N$ by requiring $f(0)=0$ and
$f(\scr x)=x$. This is a valid recursive definition, by Theorem
\ref{thm:recursion-p}. Note that $f$ picks out the immediate
predecessor of a natural number, when this exists.
\end{example}
\begin{remark}\label{rem:u-from-rec}
In the example, since $f$ is
unique, we see that \axu\ follows from the Recursion Theorem.
\end{remark}
\begin{definition}
For any function $f:\N\to M$, where $M$ is a set equipped with addition and
multiplication, we define the sum
$\sum_{k=0}^nf(k)$ and the product $\prod_{k=0}^nf(k)$ recursively
as follows:
\begin{itemize}
\item
$\displaystyle\sum_{k=0}^0f(k)=f(0)$ and
$\displaystyle\sum_{k=0}^{\scr
n}f(k)=\displaystyle\sum_{k=0}^nf(k)+f(\scr n)$;
\item
$\displaystyle\prod_{k=0}^0f(k)=f(0)$ and
$\displaystyle\prod_{k=0}^{\scr
n}f(k)=\left(\displaystyle\prod_{k=0}^nf(k)\right)f(\scr n)$.
\end{itemize}
\end{definition}
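Note that the base case of each recursion is $f(0)$, not an empty sum or product. The following Python sketch (illustrative only) transcribes the two clauses, and the identities of the next exercise can then be checked numerically (which is of course not a proof).

```python
# Sketch: the recursive sum and product over k = 0, ..., n, exactly as
# in the definition above (base case f(0), not an empty sum or product).
def rec_sum(f, n):
    return f(0) if n == 0 else rec_sum(f, n - 1) + f(n)

def rec_prod(f, n):
    return f(0) if n == 0 else rec_prod(f, n - 1) * f(n)
```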
\begin{exercise}
Prove the following.
\begin{enumerate}
\item
$\sum_{k=0}^n(k+1)=(n^2+3n+2)/2$
\item
$\sum_{k=0}^n(k+1)^2=(2n^3+9n^2+13n+6)/6$
\item
$\sum_{k=0}^nb^k=(b^{n+1}-1)/(b-1)$
\item
$\sum_{k=0}^n(2k+1)=(n+1)^2$
\item
$\prod_{k=0}^n((k+1)/(k+2))=1/(n+2)$
\end{enumerate}
\end{exercise}
\setcounter{equation}{0}
\section{The ordering of the natural numbers}\label{order}
\markright{\sectbegin The ordering of the natural numbers}
In $\N$, if $\scr x=y$, then $x$ is an immediate predecessor of $y$,
and we know that $x$ is unique. More generally, we should like to say
that $z$ is a \tech{predecessor} of $y$ if $y$ is $\scr z$, or
$\scr{(\scr z)}$, or $\scr{(\scr{(\scr z)})}$, or
$\scr{(\scr{(\scr{(\scr z)})})}$,
or \dots. We can take care of the dots using
recursion.
\begin{definition}\label{dfn:preds}
Let the function $x\mapsto \pis x:\N\to\pow{\N}$ be given by the
rule:
\begin{equation*}
\pis 0=\emptyset\land\Forall x\pis{\scr x}=\pis x\cup\{x\}.
\end{equation*}
The elements of $\pis x$ are the \defn{predecessors of $x$}.
\end{definition}
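The recursion generates the sets directly: $\pis 0=\emptyset$, and each successor step adjoins one new element. The following Python sketch (our illustration, with Python sets standing in for subsets of $\N$) makes the resulting picture $\pis x=\{0,1,\dots\}$ up to, but not including, $x$ concrete.

```python
# Sketch: the predecessor-sets of the definition above.
# pis(0) is empty; pis(succ x) = pis(x) union {x}.
def pis(x):
    preds = set()
    for k in range(x):        # apply the successor clause x times
        preds = preds | {k}   # pis(k+1) = pis(k) union {k}
    return preds
```

The relation $x\in\pis y$ then behaves exactly like the strict ordering this section goes on to establish.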
We shall prove in this section that the binary relation
\begin{equation*}\{(x,y):x\in\pis y\}\end{equation*}
on $\N$ is a strict total ordering. It will be important in
\S\ssp\ref{model} that everything proved in this section is a
consequence of just two facts:
\begin{itemize}
\item
$\N$ admits induction.
\item
A function $x\mapsto \pis x:\N\to\pow{\N}$ exists, as given by Definition
\ref{dfn:preds}.
\end{itemize}
We shall have to be precise with the relations $\in$ and $\included$,
which are
\Eng{containment} and \Eng{inclusion} respectively. The relation
$\pincluded$ is \emph{proper} inclusion (the intersection of
$\included$ and $\neq$). We have:
\begin{center}
\begin{tabular}{|c@{$\Iff$}c@{$\Iff$}c|}\hline
$x\in A$ & $x$ is an element of $A$ & $A$ contains $x$\\ \hline
$x\included A$ & $x$ is a subset of $A$ & $A$ includes $x$\\ \hline
\end{tabular}
\end{center}
We shall show first that $y\in\pis x\Iff\pis y\pincluded \pis x$ for all $x$
and $y$ in $\N$.
\begin{lemma}\label{lem:el-inc}
$\N$ satisfies
\begin{equation}\label{eqn:in-pinc}
\Forall y (y\in \pis x\to \pis y\pincluded \pis x\land
\pis{\scr y}\included \pis x)
\end{equation}
whenever $x\in\N$. Hence $\Forall x x\notin\pis x$; also, the
map $x\mapsto \pis x$ is injective.
\end{lemma}
\begin{proof}
The formula \eqref{eqn:in-pinc} is satisfied when $x=0$.
Suppose it is satisfied when $x=z$. By contraposition, this means that, since
$\pis z\not\pincluded \pis z$, we have $z\notin \pis z$. Therefore
$\pis z\pincluded \pis{\scr z}$.
Say $y\in \pis{\scr z}$. Then either $y\in \pis z$ or $y=z$. In the
former case,
$\pis y\pincluded \pis z$ by inductive hypothesis. Hence in either case,
$\pis y\included \pis z$. Therefore $\pis y\pincluded \pis{\scr z}$;
also, $\{y\}\included
\pis{\scr z}$, so $\pis{\scr y}\included \pis{\scr z}$. So \eqref{eqn:in-pinc}
holds when $x=\scr z$. By induction, \eqref{eqn:in-pinc} holds for all $x$
in $\N$.
Since $\pis x\not\pincluded\pis x$, we have $x\notin\pis x$, again by the
contrapositive of \eqref{eqn:in-pinc}. For the injectivity of
$x\mapsto\pis x$, note first that $\pis x=\pis 0\Iff x=0$. Suppose
$\pis{\scr x}=\pis{\scr y}$, that is, $\pis x\cup\{x\}=\pis
y\cup\{y\}$. Then either $y\in\pis x$ or $y=x$. In the first case,
$\pis{\scr y}\included\pis x\pincluded\pis{\scr x}$ (since $x\notin\pis
x$), contradicting $\pis{\scr x}=\pis{\scr y}$; therefore $y=x$.
By Lemma \ref{lem:zero-succ} (whose proof uses only induction), we are
done.
\end{proof}
\begin{lemma}\label{lem:inc-el}
$\N$ satisfies
\begin{equation}\label{eqn:pinc-in}
\Forall y (\pis y\pincluded \pis x\to y\in \pis x)
\end{equation}
for each natural number $x$.
\end{lemma}
\begin{proof}
The formula \eqref{eqn:pinc-in} holds when $x=0$. Suppose
\eqref{eqn:pinc-in} is true when $x=z$. Say $y\in\N$ and
$\pis y\pincluded \pis{\scr z}$. Then
$\pis{\scr z}\not\included \pis y$, so $z\notin \pis y$ by Lemma
\ref{lem:el-inc}. Hence
$\pis y\included \pis z$. If $\pis y\pincluded \pis z$, then $y\in
\pis z$ by
inductive hypothesis. If $\pis y=\pis z$, then $y=z$, so $y\in\{z\}$.
In either case, $y\in
\pis{\scr z}$. Thus \eqref{eqn:pinc-in} holds when $x=\scr z$.
\end{proof}
\begin{definition}\label{defn:<}
If $x,y\in\N$, we write
\begin{equation*}x