\documentclass[%
version=last,%
a5paper,
10pt,%
headings=small,%
bibliography=totoc,%
twoside,%
reqno,%
cleardoublepage=empty,%
parskip=half,%
draft=true,%
%DIV=classic,%
DIV=12,%
headinclude=false,%
pagesize]
{scrartcl}
%\usepackage[notcite,notref]{showkeys}
\usepackage[greek,english]{babel}
\usepackage[neverdecrease]{paralist}
\usepackage{hfoldsty,url,verbatim}
\usepackage{relsize} % Here \smaller scales by 1/1.2; \relscale{X} scales by X
\renewenvironment{quote}{\begin{list}{}
{\relscale{.90}\setlength{\leftmargin}{0.05\textwidth}
\setlength{\rightmargin}{\leftmargin}}
\item[]}
{\end{list}}
\usepackage{pstricks,pst-node,pst-tree}
\usepackage[all]{xy}
%\usepackage{auto-pst-pdf} % This is so that pdflatex works; but then one must run pdflatex -shell-escape (or pdflatex --enable-write18 in MikTeX). See http://tug.org/PSTricks/main.cgi?file=pdf/pdfoutput . I get an error message when I run latex on the file without "-shell-escape".
\newcommand{\Treebox}[1]{\Tr{\psframebox{#1}}}
\usepackage{amsmath,amssymb,amsthm,upgreek}
\newcommand{\stnd}[1]{\mathbb#1}
\newcommand{\R}{\stnd R}
\newcommand{\N}{\stnd N}
\newcommand{\Z}{\stnd Z}
%\newcommand{\lto}{\Rightarrow}
\newcommand{\lto}{\rightarrow}
\newcommand{\liff}{\Leftrightarrow}
\usepackage{mathrsfs}
\newcommand{\sig}{\mathscr S}
\newcommand{\str}[1]{\mathfrak#1}
\newcommand{\Forall}[1]{\forall#1\;}
\newcommand{\Exists}[1]{\exists#1\;}
\renewcommand{\leq}{\leqslant}
\newcommand{\included}{\subseteq}
\newcommand{\pincluded}{\subset}
\newcommand{\nincluded}{\nsubseteq}
\usepackage{bm}
%\newcommand{\class}[1]{\bm#1}
\newcommand{\gn}[1]{\ulcorner#1\urcorner}
\renewcommand{\phi}{\varphi}
\renewcommand{\theta}{\vartheta}
\newtheorem{theorem}{Theorem}
\newtheorem*{axiom}{Axiom}
\newtheorem*{lemma}{Lemma}
\newtheorem*{axsch}{Axiom Scheme}
\newtheorem*{corollary}{Corollary}
\renewcommand{\theequation}{\fnsymbol{equation}}
\begin{document}
\title{Induction and Recursion}
\author{David Pierce}
\date{\today}
\maketitle
\tableofcontents
\section{Introduction}
In mathematics we use repeated activity in several ways:
\begin{compactenum}[1)]
\item
to define sets;
\item
to prove that all elements of those sets have certain properties;
\item
to define functions on those sets.
\end{compactenum}
These three techniques are often confused, but they should not be.
Clarity here can prevent mathematical mistakes; it can also
highlight important concepts and results such as Fermat's (Little)
Theorem, freeness in a category, and G\"odel's Incompleteness Theorem.
The main purpose of the present article is to show this.
In the `Preface for the Teacher' of his
\emph{Foundations of Analysis} of 1929 \cite{MR12:397m},
Landau discusses the confusion just mentioned, but without full
attention to the logic of the situation. The present article
may be considered as a sketch of how Landau's book might be updated.
It is indeed a sketch, and that only. I make a number of historical
references, when I have been able to consult the cited articles; but
the article is not a thorough-going review of the history of the
mathematical ideas discussed.
\section{Number theory}
For an example of the three techniques, suppose we are given the field
$\R$ of real numbers.
\begin{asparaenum}
\item
We may define the subset $\N$ of natural numbers by requiring that it
contain $1$ and that it contain $x+1$ for each of its elements $x$. Thus if a real number cannot be shown to be in $\N$ by application of these rules, repeated as needed, then that number is \emph{not} in $\N$, by definition.
\item
We may show that $\N$ is closed under addition, since for each element $x$ of $\N$, we have by definition $x+1\in\N$, and moreover if
$y\in\N$ and $x+y\in\N$, then
\begin{equation}\label{eqn:+}
x+(y+1)=(x+y)+1,
\end{equation}
which again is in
$\N$ by definition.
Thus if $A$ is the set of elements $y$ of $\N$ such that, for all $x$ in $\N$, the sum $x+y$ is in $\N$, then $A$ contains $1$ and is closed under adding $1$. Therefore $A=\N$, by definition of the latter.
Similarly, $\N$ is closed under multiplication,
since for all $x$ in $\N$ we have $x\cdot1=x$, and if $x\cdot y\in\N$, then
\begin{equation}\label{eqn:mult}
x\cdot(y+1)=x\cdot
y+x,
\end{equation}
which we now know to be in $\N$.
\item
Similarly, if we have the Gamma function
\begin{equation*}
x\mapsto\int_0^{\infty}\frac{t^{x-1}}{\mathrm e^{t}}\,\mathrm dt
\end{equation*}
on $(0,\infty)$,
then we can show that $\N$ is closed under it, using integration by parts. Alternatively, we can just define this operation on $\N$ by requiring $\Gamma(1)=1$ and $\Gamma(x+1)=\Gamma(x)\cdot x$, so that in general $\Gamma(x)$ is what is usually denoted by $(x-1)!$.
\end{asparaenum}
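The recursive definition of $\Gamma$ on $\N$ is easy to carry out mechanically. The following Python sketch (with the illustrative name \texttt{gamma}) implements exactly the two clauses $\Gamma(1)=1$ and $\Gamma(x+1)=\Gamma(x)\cdot x$:

```python
def gamma(n):
    # Recursive definition on the positive integers:
    #   gamma(1) = 1,  gamma(x + 1) = gamma(x) * x,
    # so that gamma(n) is what is usually written (n - 1)!.
    if n == 1:
        return 1
    return gamma(n - 1) * (n - 1)
```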
The definition of $\N$ here is \textbf{inductive}; the proof that $\N$
is closed under addition and multiplication is \textbf{inductive}; the second
definition of $\Gamma$ on $\N$ is \textbf{recursive.}
Alternatively, the definition of $\N$ can also be called recursive; or the second definition of $\Gamma$ can be called inductive.
But I shall argue that either of these alternatives is misleading.
Our inductive definition of $\N$ may be considered as informal.
Formally, we may define $\N$ as the \emph{intersection} of the set of all
subsets of $\R$ that contain $1$ and are closed under adding $1$.
Alternatively, $\N$ is the \emph{union} of the set of subsets $A$ of $[1,\infty)$ such that $x-1\in A$ for every $x$ in $A\smallsetminus\{1\}$.
In any case $\N$ is the \emph{smallest} subset of $\R$ that contains $1$ and
is closed under adding $1$. That is to say, $\N$ admits \textbf{proof
by induction:} every subset $B$ of $\N$ is equal to $\N$, provided
we can show that $B$ contains $1$ and contains $x+1$ whenever it
contains $x$. The whole point of the inductive definition of $\N$ is
to ensure that $\N$ admits inductive proofs.
To avoid taking $\R$ for granted,
we may try a direct axiomatic approach to $\N$.
(This is the point of Landau's book.)
We can just declare
that $\N$ is a set that
\begin{compactenum}[1)]
\item
has an element $1$,
\item
is closed under an operation $x\mapsto x+1$, and
\item
admits inductive proofs.
\end{compactenum}
Then we obtain the operations of addition and multiplication that we obtained before.
Indeed, we already know how to add $1$ to an element $x$ of $\N$: the result is simply $x+1$. If we know what $x+y$ is, then we can use \eqref{eqn:+} as a definition of $x+(y+1)$. So we have addition on $\N$. We go on to define $x\cdot 1$ as $x$, and if we know what $x\cdot y$ is, we define $x\cdot(y+1)$ as in \eqref{eqn:mult}.
This is all true; and yet, in saying it this way, we have cheated. For, it would be \emph{false} to say by analogy that we can make the definition
\begin{equation}\label{eqn:x^1}
x^1=x,
\end{equation}
and if we know what $x^y$ is, then we can make the definition
\begin{equation}\label{eqn:exp+1}
x^{y+1}=x^y\cdot x.
\end{equation}
How can this be?
Though the reader may not yet be fully in the know, s/he may have observed that our axiomatic treatment of $\N$ has omitted two of Peano's axioms of 1889 \cite{Peano}:
\begin{compactenum}[1)]\setcounter{enumi}3
\item
the operation $x\mapsto x+1$ is injective, but
\item
its range does not contain $1$.
\end{compactenum}
These axioms turn out not to be needed for the definitions of addition and multiplication on $\N$; but they, or at least \emph{something} more, are needed for exponentiation on $\N$.
Again, how can this be? Let us first observe that it \emph{is} so, by
noting that the Induction Axiom is available in modular arithmetic, although exponentiation as a binary operation is \emph{not} generally definable there.
Indeed, in the \emph{Disquisitiones Arithmeticae} of 1801
\cite[\P50]{Gauss},
which is apparently the origin of our notion of modular arithmetic,
Gauss reports that Euler's first proof of Fermat's
Theorem was as follows. Let $p$ be a prime modulus. Trivially
$1^p\equiv 1$ (with respect to $p$ or indeed any modulus). If $a^p\equiv a$ (\emph{modulo} $p$) for some $a$, then, since
$(a+1)^p\equiv a^p+1$, we conclude $(a+1)^p\equiv a+1$. This can be
understood as a perfectly valid proof by induction in the ring
with $p$ elements that we denote by $\Z/p\Z$: we have then proved
$a^p=a$ for all $a$ in this ring.
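Both observations can be checked by brute force (a Python sketch; the names \texttt{least\_inductive\_subset} and \texttt{euler\_induction} are illustrative): the smallest subset of $\Z/p\Z$ containing $1$ and closed under $x\mapsto x+1$ is the whole ring, so proof by induction is valid there; and Euler's induction itself can be replayed element by element.

```python
def least_inductive_subset(n):
    # Smallest subset of Z/nZ containing 1 and closed under x -> x + 1;
    # it is the whole ring, so Z/nZ admits proof by induction.
    closure = {1 % n}
    while True:
        new = {(x + 1) % n for x in closure} - closure
        if not new:
            return closure
        closure |= new

def euler_induction(p):
    # Replay Euler's induction modulo a prime p: 1^p = 1, and the step
    # (a + 1)^p = a^p + 1 carries a^p = a from each a to a + 1.
    assert pow(1, p, p) == 1 % p                           # base case
    for a in range(p):
        assert pow(a + 1, p, p) == (pow(a, p, p) + 1) % p  # inductive step
        assert pow(a, p, p) == a % p                       # Fermat's Theorem
    return True
```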
However, Dyer-Bennet showed in 1940 \cite{MR0001234} that, with
respect to a modulus $n$, all congruences $a\equiv b$ and $c\equiv d$
entail the congruence $a^c\equiv b^d$ if and only if $n$ is one\footnote{Dyer-Bennet names G. Birkhoff as having suggested the problem of finding these $n$ and as having found them independently. I found Dyer-Bennet's article through \emph{The on-line encyclopedia of integer sequences.}} of
$1$, $2$, $6$, $42$, and $1806$. We conclude:
\begin{theorem}\label{thm:mod}
For all $n$ in $\N$,
the finite ring $\Z/n\Z$ has a binary operation $(x,y)\mapsto
x^y$ satisfying the identities \eqref{eqn:x^1} and \eqref{eqn:exp+1}
if and only if $n\in\{1,2,6,42,1806\}$.
\end{theorem}
Let us observe in passing that the sequence of moduli here arises from
what Mazur \cite{Mazur-Th} calls the \emph{self-proving}
formulation of Euclid's Proposition IX.20 in the \emph{Elements}
\cite{MR1932864}: give me some primes, and I'll give you another one
by multiplying yours together, adding $1$, and finding a prime divisor
of the result. Indeed,
the product of \emph{no} primes should be considered as $1$, and then:
\begin{align*}
&\phantom{{}={}}1+1=2,\text{ prime;}\\
&\phantom{{}={}}2+1=3,\text{ prime;}\\
2\cdot 3+1&=6+1=7,\text{ prime;}\\
2\cdot 3\cdot 7+1&=42+1=43,\text{ prime;}\\
2\cdot 3\cdot 7\cdot43+1&=1806+1=1807.
\end{align*}
It turns out that since $1807=13\cdot 139$, the sequence of moduli in the theorem stops,
although of course the set of primes continues to grow.
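Both the theorem and the sequence can be verified by brute force. On $\Z/n\Z$ the identities force $x^{y+n}=x^y$, which reduces to the single condition $a^{n+1}\equiv a$ for all $a$; the following Python sketch relies on that reduction (the function names are illustrative).

```python
def has_exponentiation(n):
    # Z/nZ carries a binary x^y with x^1 = x and x^(y+1) = x^y * x
    # iff exponents can be reduced modulo n, i.e. a^(n+1) = a for all a.
    return all(pow(a, n + 1, n) == a % n for a in range(n))

def sylvester(k):
    # 2, 3, 7, 43, 1807, ...: each term is one more than the product
    # of all earlier terms, the empty product being 1.
    terms, product = [], 1
    for _ in range(k):
        terms.append(product + 1)
        product *= terms[-1]
    return terms
```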
Little discoveries like Theorem \ref{thm:mod} are a reason to begin the
natural numbers with $1$ rather than $0$. When Henkin in 1960
\cite{MR0120156} made some of the observations of the present
article, he started the natural numbers with $0$ and noted in effect that
on $\Z/2\Z$, if \eqref{eqn:exp+1} holds, then $0^y=0$ for all $y$, since $y=z+1$ for some $z$; in
particular $0^0\neq1$, so the equation $x^0=1$ fails.
Of course Henkin's argument works in $\Z/n\Z$ for every $n$ that exceeds $1$.
Still, $\Z/n\Z$ always has an addition given by the identity
\eqref{eqn:+} above (namely $x+(y+1)=(x+y)+1$).
At the beginning of the \emph{Disquisitiones,} Gauss notes that
addition of integers respects congruence; but apparently he does not feel the need to prove it.
However, we may establish the identity \eqref{eqn:+} on $\Z/n\Z$ as
follows.
We assume that we are given the operation $x\mapsto x+1$.
As an inductive hypothesis, we assume too that we `know' $x+b$ for some $b$; that is, we assume there is an operation $x\mapsto x+b$. But this is not just any operation; it satisfies the identities
\begin{align}\label{eqn:ih+}
1+y&=y+1,&(x+1)+y&=(x+y)+1
\end{align}
when $y=b$.
Note that these equations are trivially true when $y=1$.
If we now use \eqref{eqn:+} when $y=b$ to define $x\mapsto x+(b+1)$,
then as a special case we have
\begin{equation*}
1+(b+1)=(1+b)+1,
\end{equation*}
so by the inductive hypothesis \eqref{eqn:ih+} we have $1+(b+1)=(b+1)+1$. Similarly
\begin{align*}
(x+1)+(b+1)
&=((x+1)+b)+1&&\text{[by \eqref{eqn:+} when $y=b$]}\\
&=((x+b)+1)+1&&\text{[by \eqref{eqn:ih+} when $y=b$]}\\
&=(x+(b+1))+1.&&\text{[by \eqref{eqn:+} when $y=b$]}
\end{align*}
Thus \eqref{eqn:ih+} holds when $y=b+1$.
Therefore on $\Z/n\Z$, as on $\N$, for every $y$, there is at least one operation $x\mapsto x+y$ satisfying \eqref{eqn:ih+}.
All we have used to establish this is induction (along with the element $1$ and the operation $x\mapsto x+1$; but the Induction Axiom assumes that these exist).
By induction also, each of the operations $x\mapsto x+y$ satisfying \eqref{eqn:ih+} is unique. Indeed, suppose when $y=b$ there is one such function, but $f$ is another, that is, $f(1)=b+1$ and $f(x+1)=f(x)+1$. Then by \eqref{eqn:ih+} when $y=b$ we have $1+b=f(1)$, and if $f(a)=a+b$, then $(a+1)+b=(a+b)+1=f(a)+1=f(a+1)$. Thus $x+b=f(x)$ for all $x$; that is, $f$ is the function $x\mapsto x+b$.
Now we have a unique operation $(x,y)\mapsto x+y$ satisfying \eqref{eqn:ih+}. By looking back at the proof, we conclude that \eqref{eqn:+} is an identity. Indeed, we used this equation to define an operation $x\mapsto x+(b+1)$ from $x\mapsto x+b$, and since these operations are now known to exist uniquely, \eqref{eqn:+} must hold.
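The construction is effective: starting from the successor alone, one defines the operations $x\mapsto x+y$ one at a time by \eqref{eqn:+}, and the identities \eqref{eqn:ih+} can then be checked. A Python sketch on an initial segment of $\N$ (the name \texttt{build\_addition} is illustrative):

```python
def build_addition(limit):
    # Define x + y on {1, ..., limit} from succ(x) = x + 1 alone,
    # one operation x -> x + y at a time.
    succ = lambda x: x + 1
    add = {(x, 1): succ(x) for x in range(1, limit + 1)}  # x + 1 = succ(x)
    for b in range(1, limit):
        for x in range(1, limit + 1):
            add[x, b + 1] = succ(add[x, b])  # x + (b + 1) = (x + b) + 1
    return add
```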
However, Peano \cite[p.\ 95]{Peano} uses this equation by itself as a definition of addition, writing:\footnote{Peano has \eqref{eqn:+} in the form $a+(b+1)=(a+b)+1$.}
\begin{quote}
This definition has to be read as follows: if $a$ and $b$ are numbers, and if $(a+b)+1$ has a meaning (that is, if $a+b$ is a number) but $a+(b+1)$ has not yet been defined, then $a+(b+1)$ means the number that follows $a+b$.
\end{quote}
Is Peano correct? Can we take \eqref{eqn:+} as a definition in his sense? If so, then we should be able to take the equations \eqref{eqn:x^1} and \eqref{eqn:exp+1} as a definition of exponentiation on, say, $\Z/3\Z$. When Peano makes his remark, he has stated all of his axioms, and $\Z/3\Z$ does not satisfy all of them; still, it satisfies the Induction Axiom, and Peano does not appeal to any other axioms, or a lemma derived from them, to justify his remark.
Following Peano's procedure in $\Z/3\Z$ then, we get the successive rows of the following table:
\begin{equation*}
\begin{array}{l|ccc}
x&1&2&3\\\hline
x^1&1&2&3\\
x^2&1&1&3\\
x^3&1&2&3
\end{array}
\end{equation*}
We make no new row for $x^4$, since $4=1$ in $\Z/3\Z$, so $x^4$ has already been defined. If we did try to make a row for $x^4$, using \eqref{eqn:exp+1}, then it would not agree with the row for $x^1$. Thus, although we can use equations \eqref{eqn:x^1} and \eqref{eqn:exp+1} to give a definition, in Peano's sense, of exponentiation in $\Z/3\Z$, those equations are not identities under the definition.
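The failure can be replayed mechanically. Representing $\Z/3\Z$ by $\{1,2,3\}$ (with $3$ in the role of zero), the following Python sketch builds the rows of the table by the rule \eqref{eqn:exp+1} and lets us compare the row that the rule would force for $x^4$ with the row for $x^1$:

```python
def power_rows(num_rows, n=3):
    # Rows x^1, x^2, ... over Z/nZ represented by {1, ..., n},
    # with n playing the role of zero; built by x^(y+1) = x^y * x.
    mul = lambda a, b: (a * b - 1) % n + 1  # product, reduced into {1, ..., n}
    row = list(range(1, n + 1))             # the row for x^1 = x
    rows = [row]
    for _ in range(num_rows - 1):
        row = [mul(r, x) for r, x in zip(row, range(1, n + 1))]
        rows.append(row)
    return rows
```

Here \texttt{rows[3]}, the row forced for $x^4$, differs from \texttt{rows[0]}, the row for $x^1$, although $4=1$ in $\Z/3\Z$.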
Logically then, although we can use the rule \eqref{eqn:+} by itself to build up an addition table for $\N$ or $\Z/n\Z$, it does not follow that \eqref{eqn:+} is an identity. This needs an additional argument.
Somewhat modernized, Peano's thinking seems to be this. Let $A$ be the set of all $y$ such that an operation $x\mapsto x+y$ is defined. Then $1\in A$. Moreover if $b\in A$ then, since we can define $x+(b+1)$ by \eqref{eqn:+} when $y=b$, it follows that $b+1\in A$. By induction, $A$ contains all $y$. But this gives us no unique operation of addition. Indeed, assuming $b\in A$, we can show $b+1\in A$ by defining $x+(b+1)$ as $x+b$ or even $1$. What we must do is something like what we did: let $A$ be the set of all $y$ such that an operation $x\mapsto x+y$ is defined \emph{so as to satisfy \eqref{eqn:ih+}.}
Now I claim to have shown what I said at the beginning, that the definition of addition by means of \eqref{eqn:+} should not be called inductive, because such definitions are not generally justified by induction alone. The underlying observation here is not original; again, Henkin makes it, and before him,
Landau. (Landau in turn credits Kalm\'ar with the special proof that addition can indeed be established by induction alone. Landau does not mention that only induction is used; nor does he give an example like exponentiation, where induction is definitely not enough.) Using $y'$ for $y+1$, Landau writes in his `Preface for the Teacher':
\begin{quote}
On the basis of his five axioms, Peano defines $x+y$ for fixed $x$ and all $y$ as follows:
\begin{gather*}
x+1=x'\\
x+y'=(x+y)',
\end{gather*}
and he and his successors then think that $x+y$ is defined generally; for, the set of $y$'s for which it is defined contains $1$, and contains $y'$ if it contains $y$.
But $x+y$ has \emph{not} been defined.
\end{quote}
Landau once shared the confusion of Peano and his successors; the fact of this earlier confusion is a reason for publishing his book.
Nevertheless, despite the warnings of Landau, Henkin, and
others,\footnote{Dedekind was perhaps the first to give such a warning; see below.} confusion about these basic matters persists.
I suggest that Landau himself is a bit confused about what an
axiom is; at least, he fails to make a distinction that we find it worthwhile to make today. Peano himself gives \emph{nine} axioms for $\N$, but three of them are the
reflexive, symmetric, and transitive properties of equality of
numbers, and another is that something equal to a number is a number. Landau rightly sets these aside as being purely logical properties,\footnote{Perhaps Peano himself, or one of the followers mentioned by Landau, had already done this setting aside.} not specific to elements of $\N$.
Peano's remaining five axioms are those mentioned by Landau and also given
earlier in the present article. However, two of those, namely that $\N$
contains $1$ and
is closed under adding $1$,
are simply features of the structure\footnote{That is, they are formally entailed by considering $\N$ as a structure in a signature with a constant for $1$ and a singulary operation-symbol for $x\mapsto x+1$. See \S\ref{sect:alg} below.} of $\N$, features whose properties
are fixed by the remaining three axioms.
Burris gives these three axioms at the head of `The Dedekind--Peano
Number System', an appendix to his \emph{Logic for Mathematics and
Computer Science} \cite{Burris}, an excellent book from which I have
learned a lot. After stating the axioms, Burris defines addition as
Peano does. As we have seen, the definition \emph{is} justified; but
it is not \emph{obviously} justified. The student may come away from
that appendix with the wrong impression.
A similarly wrong impression may be got from Mac Lane and Birkhoff's
\emph{Algebra} \cite[p.\ 15]{MR941522}, where right after the Peano axioms are
given, it is said that the natural numbers can serve to index iterates
of singulary (`unary') operations. If this is so, then one might expect elements of $\Z/3\Z$ to serve as indices of iterated products---that is, as exponents of powers---in $\Z/3\Z$ (or in any $\Z/n\Z$, or $\Z$ itself); but as we have seen, this is not possible.
Also in his `Preface for the Teacher', Landau warns,
\begin{quote}
My book is
written, as befits such easy material, in merciless telegram
style (\textbf{`Axiom', `Definition', `Theorem', `Proof',} occasionally \textbf{`Preliminary Remark',} rarely words which do not belong to one of these five categories).
\end{quote}
But the material is \emph{not} easy. Perhaps it is the
assumption that the material \emph{is} easy that led Landau and
others to be confused about it in the first place. Such confusion
could have real mathematical consequences: it might lead one to
replace Fermat's Theorem with a false belief that $a^{p+1}\equiv a$ is an
identity \emph{modulo} $p$.
Landau is not concerned with noting what can be proved by induction
alone; the point of his book is just to construct the fields of real
and complex numbers, so the analyst can use them in good conscience.
Nonetheless, it seems worthwhile to note that, after defining addition on $\N$
as we have done,
we can go on to establish, again by induction alone, that addition is
commutative, associative, and cancellative (in the sense that
$x+z=y+z$ implies $x=y$). Also by induction, there is a unique
operation of multiplication, which is commutative and associative, and
which distributes over addition, although it need not be
cancellative. Thus we have $\N$ as a commutative semi-ring; but then we also have the set $\{1,\dots,n\}$ as a commutative semi-ring when we define $x\mapsto x+1$ on this set as in the following table, so that proof by induction is available.\footnote{It may be said that we do not know what $\{1,\dots,n\}$ means unless we have defined the ordering of $\N$, so that $\{1,\dots,n\}=\{x\in\N\colon 1\leq x\leq n\}$. The theorem that $\N$ can indeed be linearly ordered in the usual way does require all of the Peano axioms. Without proving this theorem though, we can still define $\{1,2\}$, then $\{1,2,3\}$, and so on, as far as we like.}
\begin{equation*}
\begin{array}{c|cccc}
x &1&\dots&n-1&n\\\hline
x+1&2&\dots&n &1
\end{array}
\end{equation*}
Here $n+1=1$, and then by induction $n+x=x$, so $n$ is neutral for addition; also then, every element has an additive inverse, so the set $\{1,\dots,n\}$ becomes the ring $\Z/n\Z$.
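The same bootstrapping can be carried out mechanically for the table above: with the successor wrapping $n$ around to $1$, the addition built from \eqref{eqn:+} makes $n$ neutral and gives every element an additive inverse. A Python sketch (\texttt{cyclic\_addition} is an illustrative name):

```python
def cyclic_addition(n):
    # {1, ..., n} with succ(n) = 1, as in the table; addition is then
    # built from succ by x + 1 = succ(x) and x + (b + 1) = succ(x + b).
    succ = lambda x: 1 if x == n else x + 1
    add = {(x, 1): succ(x) for x in range(1, n + 1)}
    for b in range(1, n):
        for x in range(1, n + 1):
            add[x, b + 1] = succ(add[x, b])
    return add
```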
Of course, once one has the ring-structure of $\Z$, derived perhaps from the semi-ring structure of $\N$, then it is easy to show that congruence \emph{modulo} $n$ respects this structure, so that $\Z/n\Z$ is a ring. Still, it seems worthwhile to note that most of this \emph{has already been shown} in establishing the semi-ring structure of $\N$, because the very same arguments work for $\Z/n\Z$.
Again, the attempt to define exponentiation by induction alone breaks
down almost completely. For every $n$ in $\N$, for every element $x$ of
$\Z/n\Z$, there is of course a function $y\mapsto x^y$ from $\N$ to
$\Z/n\Z$ satisfying the identities~\eqref{eqn:x^1} and \eqref{eqn:exp+1}; but this needs
more than induction. The full result is the following.
\begin{theorem}\label{thm:rec}
For every set $A$ that has an element $1$ and is closed under an
operation $x\mapsto x+1$, the following are equivalent conditions.
\begin{compactenum}
\item
The operation $x\mapsto x+1$ is injective,
but its range does not contain $1$, and
no proper subset of $A$ contains $1$ and is closed under the operation.
\item
For every set $B$ that has an element $c$ and is closed under an
operation $x\mapsto f(x)$, there is a unique function $g$ from $A$ to
$B$ satisfying the identities
\begin{align*}
g(1)&=c,&g(x+1)&=f(g(x)).
\end{align*}
\end{compactenum}
\end{theorem}
Dedekind's work on the natural numbers predates Peano's, and his
mathematical understanding seems to be more profound. He gives the
forward direction of Theorem~\ref{thm:rec} in his \emph{Nature and
Meaning of Numbers} of 1887 \cite[II(126), p.\ 85]{MR0159773}. It
is an accident of history that the Peano axioms are usually so called.
Peano does give us some notation, which has perhaps helped solidify
the ideas behind it. Russell and Whitehead may have helped spread the
notation through the \emph{Principia Mathematica}: their sign
$\supset$ for implication is derived from Peano's reversed C, and they
use Peano's sign $\in$ for membership of an individual in a class
(originally the sign is an epsilon, for the Greek
\foreignlanguage{greek}{>est'i} `is' \cite[pp.\ 25--26]{PM}).
Dedekind\label{Ded-con} himself does not distinguish between this
membership relation and the subset relation: he used the same sign for
both, looking something like a $3$ or perhaps a \emph{reversed} epsilon
(\cite[II(3), p.\ 46]{MR0159773} or \cite[p.\ 3]{Dedekind-German}).
Henkin
\cite[p.\ 337]{MR0120156} proves the reverse direction of Theorem~\ref{thm:rec}, but does not explicitly mention any earlier proof. If the theorem is difficult, the difficulty lies in recognizing that such a theorem \emph{might} be true; once one can recognize this possibility, proving the theorem is not that hard.\footnote{\label{note:rec}One can obtain the function $g$ as the union of the set of all sets $\{(1,c),(2,f(c)),(3,f^2(c)),\dots,(n,f^{n-1}(c))\}$; this is Dedekind's approach. Alternatively, $g$ is the intersection of the set of all relations $R$ from $A$ to $B$ such that $1\mathrel Rc$ and $(x+1)\mathrel R(f(y))$ whenever $x\mathrel Ry$. In the words of Enderton \cite[p.\ 35]{MR1801397}, these are the bottom-up and top-down approaches, respectively.} Similarly, it is not so hard to prove the independence of Euclid's Parallel Postulate from his others; but it seems to have taken more than two thousand years to recognize the possibility of such a proof.
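The forward direction of Theorem \ref{thm:rec} is also effective in practice: $g$ can be computed bottom-up, as the union of its finite approximations $\{(1,c),(2,f(c)),\dots\}$. A Python sketch over an initial segment of $\N$ (the name \texttt{recurse} is illustrative):

```python
def recurse(c, f, n):
    # g(1) = c and g(x + 1) = f(g(x)), computed bottom-up:
    # the finite approximations {(1, c), (2, f(c)), ...} are accumulated.
    g = {1: c}
    for x in range(1, n):
        g[x + 1] = f(g[x])
    return g

# Exponentiation y -> x^y arises this way: take c = x and f(z) = z * x.
powers_of_2 = recurse(2, lambda z: z * 2, 6)
```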
\section{Algebra}\label{sect:alg}
Theorem~\ref{thm:rec} can be understood as being about \emph{algebras} in a \emph{signature} with one constant and one singulary operation-symbol. In the most general sense, an \textbf{algebra} is a set with some distinguished operations and elements; those operations and elements are given symbols, which compose the \textbf{signature} of the algebra.
Theorem~\ref{thm:rec} is about an algebra $(A,1,x\mapsto x+1)$, or more simply $\str A$. In a word, the second condition in the theorem is that $\str A$ admits \textbf{recursion}; more elaborately, the algebra admits \textbf{recursive definition of homomorphisms.} Another way to say this is that the algebra is \textbf{absolutely free:} that is, it is a free object in the category of \emph{all} algebras of its signature.
In the first condition, the part about not having certain proper subsets is that $\str A$ has no proper \emph{subalgebras}; in a word, $\str A$ is \textbf{minimal.}
Again, such minimality is not enough to establish recursion. Dedekind
\cite[\P130, p.\ 89]{MR0159773} notes this, observing in effect that
there is no homomorphism from $\Z/2\Z$ to $\Z/3\Z$, even though the
former admits induction. (Nonetheless, Dedekind does refer to definition by
recursion as definition by induction.)
If we understand an element of a set as a nullary operation on the set, then Theorem \ref{thm:rec} can be understood as a special case of the following.
\begin{theorem}\label{thm:gen}
An arbitrary algebra is absolutely free if and only if:
\begin{compactenum}[1)]
\item
the algebra is minimal,
\item
each of its operations is injective,
\item
the ranges of any two of those operations are disjoint.
\end{compactenum}
\end{theorem}
To establish the sufficiency of the three conditions,\footnote{See note \ref{note:rec} above. Enderton \cite[p.~39]{MR1801397} establishes sufficiency in case the signature has one binary, one singulary, and arbitrarily many nullary operation-symbols. We shall look at a similar case in \S\ref{sect:logic} below.} one proceeds as one would for the corresponding implication in Theorem~\ref{thm:rec}: given an algebra $\str A$ meeting those conditions and an arbitrary algebra $\str B$ of the same signature, one obtains a unique homomorphism from $\str A$ to $\str B$, either as the intersection of the set of all \emph{relations} from $A$ to $B$ with the appropriate properties, or as the union of the set of certain \emph{partial} functions from $A$ to $B$. For the necessity of the three conditions, one observes that all absolutely free algebras of a given signature are isomorphic to one another; then it is enough to observe that one example meets the conditions. In the situation of Theorem~\ref{thm:rec}, one already has such an example, or rather, one \emph{assumes} that one has an example, namely $(\N,1,x\mapsto x+1)$. In the case of a signature with no constants, not only are all absolutely free algebras isomorphic to one another, but they are identical with one another: they are empty.
In case there \emph{are} constants, the natural example of a free algebra would seem to be the \emph{term algebra,}\label{term-alg} as described for example by Hodges \cite[\S1.3, p.~14]{MR94e:03002}. I want to work out one case, by way of arguing that it is indeed mathematically worthwhile to be aware of Theorem~\ref{thm:gen} in its generality, and not only Theorem~\ref{thm:rec}.
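By way of illustration (a Python sketch, with illustrative names): terms over a signature with constants and one binary operation-symbol can be coded as nested tuples, and the unique homomorphism from the term algebra into any algebra of the signature is then defined by recursion on terms.

```python
def evaluate(term, constants, op):
    # The unique homomorphism from the term algebra into an algebra
    # interpreting the constants (a dict) and the binary symbol '*' (op);
    # it exists and is unique because the term algebra is absolutely free.
    if term in constants:
        return constants[term]
    _, s, t = term            # term = ('*', s, t)
    return op(evaluate(s, constants, op), evaluate(t, constants, op))

# Interpreting the constants as numbers and '*' as addition:
t = ("*", ("*", "a", "b"), "a")
value = evaluate(t, {"a": 2, "b": 3}, lambda x, y: x + y)
```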
\section{Propositional logic}\label{sect:logic}
In his `automathography' \cite[p.~206]{MR961440}, Halmos writes:
\begin{quote}
An exposition of what logicians call the propositional calculus can annoy and mystify mathematicians. It looks like a lot of fuss about the use of parentheses, it puts much emphasis on the alphabet, and it gives detailed consideration to `variables' (which do not in any sense vary). Despite (or because of?) all the pedantic machinery, it is hard to see what genuine content the subject has. Insofar as it talks about implications between propositions, everything it says looks obvious. Does it really have any mathematical value?
Yes it does\dots Question: what is the propositional calculus? Answer: the theory of free Boolean algebras with a countably infinite set of generators.
\end{quote}
It is good that Halmos found a way to understand logic that satisfied him; but he seems to have missed the point. For one thing, propositional logic is not just about free \emph{Boolean} algebras: it is about certain \emph{absolutely free} algebras, in the sense above, from which Boolean algebras are obtained as \emph{quotients.}
This is how the fuss and pedantry arises that Halmos decries; but I think it is inevitable, and I hope to make it a little more interesting below.
Meanwhile, logic should be understood as \emph{foundational} for mathematics. One \emph{can,} generally \emph{does,} and probably \emph{must} learn mathematics before symbolic logic; but if one wants to formalize one's work, at least by way of checking for mistakes, then one will reduce one's mathematics to logic, and not the other way around as Halmos does.
I shall describe here a version of the propositional calculus
that Church \cite[ch.\ I]{MR18:631a} develops from that of \L ukasiewicz.
We first choose a set $V$ of propositional `variables'. In the algebraic sense above, these variables will indeed be constants. One does generally want $V$ to be countably infinite, but this will make little difference for us. Actually, it is perhaps philosophically best to consider $V$ as finite, as long as it can be made as large as necessary to cover any particular situation.
We define a set of propositional formulas by three rules:
\begin{compactenum}[1)]
\item
every variable is a formula,
\item
$0$ is a formula, and
\item
if $F$ and $G$ are formulas, then so is the \emph{implication} $(F\lto G)$.
\end{compactenum}
Thus we obtain a \emph{term algebra}\label{term-alg2} in the signature $V\cup\{0,\lto\}$.
We may understand this definition to pick out the formulas from the set of all \emph{strings} of our symbols, just as our original definition of $\N$ picked out its elements from $\R$. This set of strings might be formalized as the set of functions from sets $\{1,\dots,n\}$ or $\{0,\dots,n-1\}$ into our set of symbols. Again though, this way of thinking is backwards or anachronistic, if we think of symbolic logic as being developed for the sake of formalizing the notions of sets and functions and numbers in the first place.
One should understand a string of symbols as something that can be written down, on paper, left to right in the usual way.
From the given definition of formula, it is easy enough to show that any particular written string is a formula; and that is all we need.
We want to know that the algebra of formulas is \emph{absolutely free.} That is, we want formulas to have \textbf{unique readability.} If we accept Theorem~\ref{thm:gen} then, since by definition the algebra of formulas is minimal, we need only show that
\begin{inparaenum}[(1)]
\item
the operation $(F,G)\mapsto(F\lto G)$ is injective and
\item
its range contains neither $0$ nor a variable.
\end{inparaenum}
These facts correspond to the two Peano axioms other than Induction; but in the present case they are theorems. One of the theorems is trivial, since all implications have more than one symbol. The other theorem is that if $(F\lto G)$ and $(H\lto K)$ are the same formula, then $F$ and $H$ are the same formula (and hence also $G$ and $K$ are the same), assuming $F$, $G$, $H$, and $K$ are formulas in the first place. This follows from the lemma that no proper initial segment of a formula is a formula. One can prove this by induction on the \emph{lengths} of formulas: that is, one can prove it as a theorem about the algebra $\N$.
Again this might be anachronistic, if one is developing logic in order to formalize $\N$.
Alternatively, one can prove simultaneously by induction in the algebra of formulas that every formula
neither
\begin{inparaenum}[(1)]
\item
has a formula as a proper initial segment, nor
\item
is itself a proper initial segment of another formula.
\end{inparaenum}
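The lemma can at least be tested mechanically. In the following Python sketch, formulas are strings over the variables, $0$, and an ASCII arrow \texttt{->}; the illustrative function \texttt{is\_formula} decides formulahood by locating an implication arrow at bracket-depth zero:

```python
def is_formula(s, variables=("p", "q", "r")):
    # A formula is a variable, '0', or '(F->G)' for formulas F and G.
    if s in variables or s == "0":
        return True
    if not (s.startswith("(") and s.endswith(")")):
        return False
    depth = 0
    for i in range(1, len(s) - 1):
        if s[i] == "(":
            depth += 1
        elif s[i] == ")":
            depth -= 1
        elif s[i:i + 2] == "->" and depth == 0:
            # a candidate principal implication sign
            return is_formula(s[1:i]) and is_formula(s[i + 2:-1])
    return False
```

One can then check on examples that no proper initial segment of a formula is a formula.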
It may be legitimate to consider unique readability of formulas as
being obvious. From school arithmetic, we have the sense that we can
always put in enough brackets to make a given expression unambiguous.
In our present system, \emph{every} implication is surrounded by
brackets; is it not obvious that this is enough to ensure unique
readability? Church seems to think so. He first notes in passing
\cite[p.\ 38, n.\ 91]{MR18:631a} that brackets can be dispensed with
entirely by using the prefix notation of \L ukasiewicz, but does not dwell on the issue. Later \cite[\S10, pp.\ 70--71]{MR18:631a} he gives an algorithm for determining whether a given string is a formula. Given a string beginning and ending with brackets, the algorithm aims to detect (by counting brackets) an arrow of implication in the string such that, if the string \emph{is} a formula, then the two strings between the arrow and the outer brackets must be formulas. Then the algorithm is applied in turn to those two strings, and so on. Church calls the arrow found by the algorithm the \emph{principal implication sign} of the formula. Now, by definition an implication must have \emph{a} principal implication sign; but Church does not exactly \emph{prove} that his algorithm finds such a sign in every well-formed implication. Even if this is granted, there remains the question of whether the sign is unique.\footnote{Burris proves
unique readability for the \L ukasiewicz notation only, but seems to take unique
readability of bracketed infix notation as obvious \cite[pp.\ 39, 142,
197]{Burris}. Enderton notes in effect \cite[p.\ 30]{MR1801397} that unique readability is immediate if the formula $(F\lto G)$ is understood to be the ordered triple $(F,\lto,G)$. Again this is anachronistic: it begs the question of whether we \emph{can} write down formulas in such a way that an arbitrary implication can be unambiguously analyzed as an ordered triple $(F,\lto,G)$. If writing the formula as a \emph{string} $(F,\lto,G)$ is enough, then it is enough to write it without the commas, as $(F\lto G)$. But the brackets \emph{are} needed; why is that? Here one may start to feel the need to \emph{prove} that the brackets are enough; and Enderton does in fact supply a proof.}
Whether one proves it or not, unique
readability should be stated clearly, in order to be able to emphasize, as we
shall do below, what might be called the `non-unique readability' of
\emph{theorems.}
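Church's bracket-counting procedure is easily made concrete. The following Python sketch is my own rendering of it (again with `>` for the arrow and single-letter variables): the principal implication sign of a string is the arrow encountered at bracket depth $1$, and the algorithm recurses on the two flanking strings.

```python
def principal_sign(s):
    # Count brackets left to right; in an implication (F>G), the
    # principal implication sign is the arrow found at bracket depth 1.
    depth = 0
    for i, c in enumerate(s):
        if c == '(':
            depth += 1
        elif c == ')':
            depth -= 1
        elif c == '>' and depth == 1:
            return i
    return None  # no arrow at depth 1: s is not an implication

def parse(s):
    # Apply the algorithm recursively, as Church describes: split the
    # string at its principal sign and analyze the two parts in turn.
    if len(s) == 1:  # a variable or the constant 0
        return s
    i = principal_sign(s)
    return (parse(s[1:i]), parse(s[i + 1:-1]))
```

Of course the code settles nothing by itself: that the arrow at depth $1$ exists and is unique in every well-formed implication is essentially what must be proved.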
Meanwhile, we need unique readability of formulas to define
\textbf{interpretations,} namely functions $h$ from the algebra of formulas into
$\{0,1\}$ such that $h(0)=0$ and
\begin{equation*}
h((F\lto G))=1\text{ if and only if }h(F)=0\text{ or }h(G)=1.
\end{equation*}
(So one thinks of $0$ as \emph{falsehood,} and $1$ as \emph{truth.})
An interpretation then is determined by its restriction to the set $V$ of variables.
In the spirit of Halmos, we may note that an interpretation is a homomorphism into the two-element field, if we there understand $x\lto y$ to be $x\cdot y+x+1$. Moreover, if two formulas are understood to be \textbf{equivalent} if their images are equal under every interpretation, then the set of equivalence-classes of formulas can indeed be understood as a Boolean algebra, at least when we make the usual definitions: the \emph{negation} $\lnot F$ is $(F\lto0)$, the \emph{disjunction} $(F\lor G)$ is $(\lnot F\lto G)$, and the \emph{conjunction} $(F\land G)$ is $\lnot(\lnot F\lor\lnot G)$. But again, the point of logic is to show how such a Boolean algebra can be obtained in the first place.
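The defining clause for interpretations and Halmos's field-theoretic reading of it can be checked against each other in a few lines. The sketch below is my own (a formula is `'0'`, a variable name, or a pair standing for an implication): it computes interpretations by the field identity, and it agrees with the condition that $h((F\lto G))=1$ if and only if $h(F)=0$ or $h(G)=1$.

```python
def interpret(F, h):
    # A formula is '0', a variable name, or a pair (F, G) standing
    # for the implication (F > G); h assigns 0 or 1 to each variable.
    if F == '0':
        return 0
    if isinstance(F, str):
        return h[F]
    x, y = interpret(F[0], h), interpret(F[1], h)
    # Implication in the two-element field: x > y is x*y + x + 1.
    return (x * y + x + 1) % 2
```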
The Boolean algebra of equivalence-classes of propositional formulas in two variables can be depicted in the Hasse diagram in Figure~\ref{fig:B};
\begin{figure}[ht]
\begin{equation*}
\xymatrix@!C0@R=22mm{
&&&&&
(P\lor\lnot P)\ar@{-}[dllll]\ar@{-}[dl]\ar@{-}[dr]\ar@{-}[drrr]&&&&\\
&
(P\lto Q)\ar@{-}[dl]\ar@{-}[dr]\ar@{-}[drrr]&&&
(P\lor Q)\ar@{-}[dllll]\ar@{-}[dr]\ar@{-}[drrr]&& (Q\lto P)\ar@{-}[dllll]\ar@{-}[dl]\ar@{-}[drrr]&&
\lnot(P\land Q)\ar@{-}[dllll]\ar@{-}[dl]\ar@{-}[dr]&\\
Q\ar@{-}[dr]\ar@{-}[drrr]&&
(P\liff Q)\ar@{-}[dl]\ar@{-}[drrr]&&
\lnot P\ar@{-}[dl]\ar@{-}[dr]&
P\ar@{-}[dllll]\ar@{-}[drrr]&&
\lnot(P\liff Q)\ar@{-}[dllll]\ar@{-}[dr]&&
\lnot Q\ar@{-}[dllll]\ar@{-}[dl]\\
&
(P\land Q)\ar@{-}[drrr]&&
\lnot(Q\lto P)\ar@{-}[dr]&&
\lnot(P\lor Q)\ar@{-}[dl]&&&
\lnot(P\lto Q)\ar@{-}[dllll]&\\
&&&&(P\land\lnot P)&&&&&
}
\end{equation*}
\caption{Hasse diagram for an algebra of equivalence-classes of formulas}\label{fig:B}
\end{figure}
but a question that logic must take up is how a node in the diagram can be written down in isolation so that its position in the diagram can be inferred.
In the foreword of his \emph{Algebra,} Lang writes \cite[p.\ v]{Lang-alg}:
\begin{quote}
Unfortunately, a book must be projected in a totally ordered way on the page axis, but that's not the way mathematics `is', so readers have to make choices how to reset certain topics in parallel for themselves, rather than in succession.
\end{quote}
Logic might be said to investigate how this projecting can be done so that readers are indeed able to recover what they want.
Again, to define an interpretation of formulas, we use the unique readability of formulas.
We use this also to
define a certain binary operation on the set of formulas. Let us use the symbol $*$ for this operation, so that we can define it by
\begin{equation*}
F*G=\begin{cases}
H,&\text{ if $G$ has the form $(F\lto H)$;}\\
G,&\text{ otherwise.}
\end{cases}
\end{equation*}
Such an operation is a \textbf{rule of inference} (this one being called \emph{Modus Ponens} or Detachment).
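As a total binary operation on formulas, Detachment is immediate to implement. Here is a sketch, where (as a convention of my own) an implication is a pair consisting of its antecedent and its consequent:

```python
def mp(F, G):
    # F * G: if G has the form (F > H), return H; otherwise return G.
    # Implications are pairs (antecedent, consequent); atoms are strings.
    if isinstance(G, tuple) and G[0] == F:
        return G[1]
    return G
```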
We do not actually need unique readability in order to define, as \textbf{logical axioms,}
all formulas of one of the forms
\begin{gather*}
(F\lto(G\lto F)),\\
((F\lto(G\lto H))\lto((F\lto G)\lto(F\lto H))),\\
(((F\lto0)\lto0)\lto F).
\end{gather*}
We obtain the set of \textbf{theorems} by taking the set of logical axioms and
closing under $*$ within the set of formulas. The set of theorems can then be understood as a
minimal algebra in a signature with $*$ and with a constant
for each logical axiom. However, unlike the algebra of formulas, the algebra
of theorems is not free.
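The closure can be computed for any finite sample of axiom instances, though the set of logical axioms itself is infinite. In the sketch below (my own illustration, with implications as pairs), closing three axiom instances under $*$ yields the theorem $(P\lto P)$.

```python
def close_under_mp(axioms):
    # Close a finite set of formulas under the operation *:
    # whenever F and (F > H) are both present, add H, until nothing changes.
    theorems = set(axioms)
    changed = True
    while changed:
        changed = False
        for F in list(theorems):
            for G in list(theorems):
                if isinstance(G, tuple) and G[0] == F and G[1] not in theorems:
                    theorems.add(G[1])
                    changed = True
    return theorems

g = ('G', 'P')                       # the formula (G > P)
ax1 = ('P', (g, 'P'))                # instance of the first axiom scheme
ax2 = (ax1, (('P', g), ('P', 'P')))  # instance of the second scheme
ax3 = ('P', g)                       # another instance of the first scheme
```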
Because of the freeness of the algebra of
formulas, each formula has a unique \emph{parsing tree}; for example,
one of the logical axioms has the parsing tree in Figure~\ref{fig:parse}, where $P$, $Q$, and $R$ are propositional variables.
\begin{figure}[ht]
\begin{center}
\pstree[levelsep=30pt,treemode=U]
{\Treebox{$((P\lto(Q\lto R))\lto((P\lto Q)\lto(P\lto R)))$}}
{\pstree{\Treebox{$(P\lto(Q\lto R))$}}
{\Treebox{$P$}
{\pstree{\Treebox{$(Q\lto R)$}}
{\Treebox{$Q$}
\Treebox{$R$}}}}
\pstree{\Treebox{$((P\lto Q)\lto(P\lto R))$}}
{\pstree{\Treebox{$(P\lto Q)$}}
{\Treebox{$P$}
\Treebox{$Q$}}
\pstree{\Treebox{$(P\lto R)$}}
{\Treebox{$P$}
\Treebox{$R$}}}}
\end{center}
\caption{a parsing tree}\label{fig:parse}
\end{figure}
Strictly the tree is an \emph{ordered} tree, in the sense that left branches must be distinguished from right branches.
The minimality of the algebra of theorems means that each
theorem has a \emph{proof,} which also can be understood as a tree; but this proof
is not unique. For example,
the theorem $(P\lto P)$ has the proof shown in Figure~\ref{fig:proof},
\begin{figure}[ht]
\begin{center}
\pstree[levelsep=30pt,treemode=U]
{\Treebox{$(P\lto P)$}}
{\Treebox{$(P\lto F)$}
\pstree{\Treebox{$((P\lto F)\lto(P\lto P))$}}
{\Treebox{$(P\lto(F\lto P))$}
\Treebox{$((P\lto(F\lto P))\lto((P\lto F)\lto(P\lto P)))$}}}
\end{center}
\caption{a proof, where $F$ is $(G\lto P)$}\label{fig:proof}
\end{figure}
where $F$ stands for a formula $(G\lto P)$, in which $G$ can be \emph{any} formula.
We may debate whether a proof is `really' a tree, as opposed to a linearly ordered refinement of the tree, such as
\begin{gather*}
((P\lto((G\lto P)\lto P))\lto((P\lto (G\lto P))\lto(P\lto P))),\\
(P\lto((G\lto P)\lto P)),\\
((P\lto (G\lto P))\lto(P\lto P)),\\
(P\lto (G\lto P)),\\
(P\lto P).
\end{gather*}
Normally a proof is something that can be \emph{read,} and reading is done linearly;
in the language of Lang, a proof is a totally ordered projection on the page axis.
But to \emph{understand} a proof is to understand the relation of each of its steps to other steps; and this relation places those steps as nodes of a tree.
As there is an algorithm to determine whether an arbitrary string of symbols is a formula, so there is an algorithm to determine whether an arbitrary formula is a theorem in the present sense.
Indeed, by
induction in the algebra of theorems, if $F$ is a theorem, then $1$ is its only interpretation, that is, the formula is \textbf{logically true.} The converse of this implication is also true, although the proof is not obvious.\footnote{One approach is the following. For every formula $F$, if $h$ is an interpretation, define $F^h$ as $F$ itself if $h(F)=1$, and otherwise let $F^h$ be $(F\lto0)$. If the variables of $F$ are among $P_0$, \dots, $P_{n-1}$, then one shows, by induction in the algebra of formulas, that the formula $(P_0{}^h\lto\cdots\lto (P_{n-1}{}^h\lto F^h)\cdots)$ is a theorem. In particular, if $F$ is logically true, then $(P_0{}^h\lto\cdots\lto (P_{n-1}{}^h\lto F)\cdots)$ is always a theorem. Then one can in turn eliminate each $P_i{}^h$, using the theorem $((P\lto G)\lto((\lnot P\lto G)\lto G))$.} And there is an algorithm to determine whether a formula is logically true: just check its value under all of the (finitely numerous) interpretations of its variables: that is, check its \emph{truth table.} If the formula is indeed logically true, then we can find a proof of it.
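The truth-table algorithm is a few lines of code. In the following sketch (representation and names are my own), a formula is logically true if its value is $1$ under every assignment to its variables.

```python
from itertools import product

def variables(F):
    # The set of variables of a formula ('0', a variable name, or a pair).
    if F == '0':
        return set()
    if isinstance(F, str):
        return {F}
    return variables(F[0]) | variables(F[1])

def value(F, h):
    # The defining condition: (F > G) is true iff F is false or G is true.
    if F == '0':
        return 0
    if isinstance(F, str):
        return h[F]
    return 1 if value(F[0], h) == 0 or value(F[1], h) == 1 else 0

def logically_true(F):
    # Check F under every interpretation of its variables: its truth table.
    vs = sorted(variables(F))
    return all(value(F, dict(zip(vs, row)))
               for row in product((0, 1), repeat=len(vs)))
```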
But this is a special feature of propositional logic; it fails in \emph{first-order logic.}\footnote{More precisely first-order \emph{predicate} logic, or perhaps first-order \emph{quantifier} logic.} I am going to try to describe this logic as tersely as possible for present purposes; but the logic is less neat than propositional logic. Still, we should appreciate that it is one solution found, after decades if not millennia of labor, to the problem of how mathematics can be expressed.
\section{First-order logic}
In first-order logic, the role of propositional variables is taken by \textbf{atomic formulas,} which express equality or other relations between \emph{individuals}; these individuals are denoted by \textbf{terms,} that is, members of the appropriate \emph{term algebra.}\footnote{See pages \pageref{term-alg} and \pageref{term-alg2}.}
Again this algebra has a signature, comprising $n$-ary operation-symbols for various $n$, including $0$; but there are also \textbf{individual variables,} which---as far as constructing terms is concerned---play the role of constants, that is, nullary operation-symbols. Terms can be understood as \emph{polynomials.}
First-order logic also introduces new operations on formulas: $F\mapsto\Exists xF$ and $F\mapsto\Forall xF$, where $x$ is an individual variable.
(One can understand $\Forall x\phi$ as an abbreviation of $\lnot\Exists x\lnot\phi$.)
Every occurrence of $x$ in $\Exists xF$ or $\Forall xF$ is \emph{bound.} A formula in which all occurrences of variables are bound is a \textbf{sentence.} The set of sentences is not defined inductively; rather, the function that assigns to each formula its set of \emph{free} variables is defined recursively, and then the set of sentences is the inverse image of the empty set of variables under this function. Thus the definition of sentences does require unique readability of formulas.
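The recursive definition of the free-variable function might be sketched as follows, with atomic formulas simply listing their variables (a simplification of my own, as is the tagged-tuple representation):

```python
def free_vars(phi):
    # ('atom', v1, ..., vn): an atomic formula with the listed variables;
    # ('>', F, G): an implication; ('E', x, F): existential quantification.
    tag = phi[0]
    if tag == 'atom':
        return set(phi[1:])
    if tag == '>':
        return free_vars(phi[1]) | free_vars(phi[2])
    if tag == 'E':
        return free_vars(phi[2]) - {phi[1]}
    raise ValueError('not a formula')

def is_sentence(phi):
    # A sentence is a formula sent to the empty set by free_vars.
    return free_vars(phi) == set()
```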
An \textbf{interpretation} is still a function on the set of formulas. Its codomain is no longer $\{0,1\}$, but is instead the family of subsets of finite Cartesian powers of some set $M$. The interpretation may then be denoted by $\str M$, or more precisely $\phi\mapsto\phi^{\str M}$. Here $\phi^{\str M}$ can be understood as the \emph{solution set} of the formula $\phi$ in $\str M$, so that if $\phi$ has $n$ free variables, then $\phi^{\str M}$ is a subset of $M^n$. The interpretation is \emph{not} determined by its restriction to the set of variables; indeed, variables are no longer formulas. The interpretation is determined by its restriction to the set of atomic formulas. But now further analysis is possible. Each atomic formula is a combination of a relation-symbol (possibly the equals sign) and the appropriate number of terms. Each term $t$ then has an interpretation $t^{\str M}$, which is an operation on $M$. In particular, if $t$ has $n$ variables, then $t^{\str M}$ is a function from $M^n$ to $M$. The map $t\mapsto t^{\str M}$ is defined recursively from certain operations $S^{\str M}$ on $M$ that are assigned to the operation-symbols $S$ in the logic. Then the interpretation of an atomic formula is determined by the relations $S^{\str M}$ on $M$ assigned to the relation-symbols $S$ in the logic. The whole interpretation $\str M$ is now determined by the set $M$ and the operations and relations $S^{\str M}$ on $M$ for the various symbols $S$. The interpretation is a \textbf{structure} on $M$.
If $\sigma$ is a sentence, then by the foregoing account $\sigma^{\str M}$ should be a subset of $M^0$. But $M^n$ can be understood as the set of functions from $\{0,\dots,n-1\}$ to $M$; in particular, $M^0$ is the set of functions from the empty set to $M$. There is only one such function, the empty set, which can be denoted by $0$; and then $M^0=\{0\}$, which can be denoted by $1$. Thus $\sigma^{\str M}$ is either $0$ or $1$: it is $0$, if $\sigma$ is \textbf{false} in $\str M$, and $1$ if $\sigma$ is \textbf{true.} More generally a formula $\phi$ with $n$ free variables can be considered as true in $\str M$ if $\phi^{\str M}=M^n$; but then a formula that is not true is not necessarily false.
A structure in which every formula in a given set is true is a
\textbf{model} of that set. That set then \textbf{entails} every
formula that is true in every model of the set. A formula entailed by
the empty set is \textbf{logically true.} A set of formulas that is
closed under entailment is a \textbf{theory.} By this definition, a
theory has no obvious algebraic structure whereby the theory is
\emph{minimal} in the sense discussed earlier.
Nonetheless, a theory can be given such an algebraic structure: this
is the import of \textbf{G\"odel's Completeness Theorem} of 1930
\cite{Goedel-compl}.
An algebraic structure on theories is a \textbf{proof system.} A proof system then is a set of \emph{logical axioms}---certain formulas---and \emph{rules of inference}---certain operations on formulas; the logical axioms can be considered as nullary operations. We require the logical axioms to be logically true, and the rules of inference to yield only formulas that are entailed by their arguments.
G\"odel's theorem is that there is a proof system\footnote{As logical axioms of this proof system, we can take those of propositional logic, along with the formulas
$((\phi\lto\psi)\lto(\phi\lto\Forall x\psi))$
where $x$ does not occur freely in $\phi$.
The remaining axioms are
$(\Forall x\theta\lto\theta^x_t)$,
where $t$ is a constant or variable, and $\theta^x_t$ is the result of replacing all free occurrences of $x$ in $\theta$ with $t$; if $t$ is a variable, then $x$ must not occur freely in a subformula of $\theta$ in which $t$ is bound. As rules of inference, we can take $*$ as before, along with the operations
$\phi\mapsto\Forall x\phi$.}
that is
\textbf{complete} in the sense that every algebra of formulas with
respect to this system is already a
theory.\footnote{G\"odel's proof requires introduction of new relation-symbols; Henkin's improvement \cite{MR0033781} uses new constants. Church \cite{MR18:631a} presents both proofs.}
The set $T$ of formulas entailed by a set $\Gamma$ of formulas is also the smallest algebra of formulas (with respect to a complete proof system) that includes $\Gamma$. Then $\Gamma$ is a set of \textbf{axioms} for $T$. This set $T$ is a theory, but it need not be \textbf{complete} as a theory; that is, there may be a
sentence such that neither itself nor its negation is entailed by $\Gamma$.
(For a formula with
free variables, neither it nor its negation need belong to a given
complete theory.) Presburger showed in 1930 that we can write down a set $\Gamma$ of axioms such that
\begin{inparaenum}[(1)]
\item
the semigroup $(\N,1,+)$ is a model of $\Gamma$, and
\item
the theory entailed by $\Gamma$ is complete.
\end{inparaenum}
More precisely, although $\Gamma$ is infinite, we can write down as much of it as we need. Indeed, $\Gamma$ contains, for each $n$ in $\N$, the axiom
\begin{equation*}
\bigvee_{k=1}^n(x=k\lor\Exists yx=ny+k),
\end{equation*}
where $k$ as a term stands for $1+\dotsb+1$ (with $k$ occurrences of $1$) and $ny$ stands for $y+\dotsb+y$ (with $n$ occurrences of $y$); and $\Gamma$ contains finitely many other axioms.\footnote{The other axioms are
$x+y=y+x$,
$x+(y+z)=(x+y)+z$,
$x\neq y\lto\Exists z(x+z=y\lor y+z=x)$,
$x+y\neq x$, and
$x+y\neq 1$.
See Hodges \cite[pp.\ 85 \&\ 128]{MR94e:03002}; I have not consulted Presburger's original paper. One way to prove the theorem is to introduce a relation-symbol $<$ so that the atomic formula $x