## Introduction

Every so often, the question comes up (either here or elsewhere) of why induction is a valid proof technique. And this is of course a very natural question. Induction is after all rather mysterious compared to the other usual proof techniques. At the same time, it is a very useful one, so it is important that people can be given a satisfactory answer. The question is more precisely “why can we do induction on the natural numbers”, but I am not going to answer that question here. For one thing, the answer depends entirely on how one defines the natural numbers, and for another, induction has nothing to do with the natural numbers. “But wait” you might say. “Did he really just claim that induction has nothing to do with the natural numbers. How can that be? Is induction not something like, we prove something holds for \({0}\) and that if it holds for \({n}\) then it holds for \({n+1}\). How does that even make sense if we are not talking about the natural numbers?” And indeed, the “nothing” was a bit of an overstatement to get your attention. But it turns out that induction is a proof technique that can be used in a much more general setting than that of the natural numbers. This is the viewpoint I will try to explain in detail in this post: Given some set \({X}\), what do we need to be able to use induction to prove that something holds for all elements of \({X}\)?

## The usual case

To get started, let us look at the way induction is usually formulated (I will formulate this as a theorem, but as mentioned, I will not give a proof).

TheoremLet \({A\subseteq {\mathbb N}}\) such that \({0\in A}\) and for all \({n\in {\mathbb N}}\) we have \({n\in A\implies n+1\in A}\).Then\({A = {\mathbb N}}\).

At first glance, it seems like this uses some special properties of the natural numbers, namely the facts that we have a \({0}\) and a way to add \({1}\) to any element. But let us look at a different version of induction (often called “strong” induction). Often, we are told that this version is “equivalent” to the usual induction. But as will be seen later, this is either trivial (as both are true statements about the natural numbers), or false (as the theorems do not hold for the same sets when we start to generalize).

TheoremLet \({A\subseteq {\mathbb N}}\) such that for all \({n\in {\mathbb N}}\) we have \({\left(m < n \implies m\in A\right)\implies n\in A}\).Then\({A = {\mathbb N}}\).

(Note that I have the “baseless” version here. But clearly any set satisfying the above will contain \({0}\) since if \({n=0}\) then \({m < n}\) is always false and hence the first implication becomes true regardless of whether \({m\in A}\), which means that the second implication can only be true if \({n\in A}\)). So now we have a version of induction that only uses the ordering of the natural numbers, and this seems like it might be easier to generalize. So let us take a set \({X}\). What sort of ordering must we put on \({X}\) in order to be able to prove that something holds for all \({x\in X}\) by induction? The common answer is “a well-order”, but this is not quite correct, mainly because a well-order is a total order, and it turns out that it is possible to use induction with just a partial order. But let us still start with the more specific question: What sort of total order on \({X}\) allows us to do induction?

## Total orders

In this case, the answer is indeed “a well-order”. So let us first look at the definition.

DefinitionA total order on a non-empty set \({X}\) is called a well-order if any non-empty subset of \({X}\) has a smallest element.

So my claim is that if \({X}\) is a totally ordered set, then we can do induction on \({X}\) if and only if \({X}\) is well-ordered. But this then refers to the “strong” induction. What happened to the other kind? Can we make sense of the other kind if we just have a total order? The answer to the last question is: Almost. If we have a well-ordered set (with a small additional condition), then we can in fact make sense of the usual kind of induction. But it need no longer be a true statement, unless we change some details. Before we move on to the first kind of induction, let us prove the above claim that being well-ordered is both necessary and sufficient. First, we prove that if is sufficient.

TheoremLet \({X}\) be a well-ordered set and \({A\subseteq X}\) be such that for all \({x\in X}\) we have \({\left(y < x \implies y\in A\right)\implies x\in A}\).Then\({A = X}\).

*Proof:* Let \({B = X\setminus A}\) and assume for the purpose of contradiction that \({B}\) is not empty. Since \({X}\) is well-ordered this means that \({B}\) has a smallest element, call it \({b}\). But now, if \({x\in X}\) with \({x < b}\) then \({x\not\in B}\) since \({b}\) was smallest in \({B}\). Hence for all \({x\in X}\) with \({x < b}\) we have \({x\in A}\), and thus by the assumption on \({A}\), this means that \({b\in A}\) contradicting the choice of \({b}\). \(\Box\)

And then that it is necessary.

TheoremLet \({X}\) be a totally ordered non-empty set such that whenever a subset \({A\subseteq X}\) satisfies \({\left(y < x \implies y\in A\right)\implies x\in A}\) for all \({x\in X}\) then \({A = X}\). Then\({X}\)is well-ordered.

*Proof:* Let \({B\subseteq X}\) be a subset and assume that \({B}\) does not have a smallest element. Let \({A = X\setminus B}\). We need to show that \({A = X}\) and thus by assumption it is enough to show that if \({x\in X}\) and \({y\in A}\) for all \({y < x}\) then \({x\in A}\). But if \({y\in A}\) for all \({y < x}\) then \({x}\) cannot be in \({B}\), since it would then be the smallest element in \({B}\) (since all strictly smaller elements are not in \({B}\)), and thus \({x\in A}\) as we needed. \(\Box\)

So now the question remains how we can make sense of the first kind of induction, where we needed to have a \({0}\) and needed to be able to add \({1}\) to any element. But there is a very natural way to make sense of these things in a well-ordered set \({X}\), if we just assume one extra thing: \({X}\) does not have a largest element. Namely, we can take \({0}\) to be the smallest element of the set (which exists by assumption). As for adding \({1}\), this really just means “take the next element”, and this makes sense since for any \({x\in X}\) we know that the set \({\{y\in X\mid y > x\}}\) is non-empty (this is where we need \({X}\) to not have a largest element), and thus it has a smallest element, which is the “next” element after \({x}\) (called the immediate successor of \({x}\)). So with these definitions in hand, given a well-ordered set \({X}\) which does not have a largest element, can we do induction on \({X}\)? The answer turns out to be “no”, as the following example demonstrates. The example also illustrates what we need to change to make things work.

ExampleLet \({X = \{0,1\}\times {\mathbb N}}\) and order \({X}\) lexicographically (so \({(m,n) \leq (m’,n’)}\) if \({m < m’}\) or \({m = m’}\) and \({n\leq n’}\)). It is then easy to check that \({(0,0)}\) is the smallest element of \({X}\) (i.e. we denote \({0 = (0,0)}\)). It is also easy to check that with “\({+1}\)” defined as explained above, we have \({(m,n)+1 = (m,n+1)}\). Let \({A = \{(0,n)\mid n\in {\mathbb N}\}\subseteq X}\). Then the above observations show that \({0\in A}\) and \({a\in A\implies a+1\in A}\) but \({A\neq X}\). But \({X}\) is indeed well-ordered (this is a standard exercise that I leave to the reader), so this is a well-ordered set without a largest element where we cannot use the first kind of induction.

So what is it that goes wrong in the above example? If we look at the usual proof of why induction works for the natural numbers, it is very close to the proof I have presented here that “strong” induction works on any well-ordered set, except that at one point, one will need that if \({n\neq 0}\) then \({n-1}\) makes sense (one considers the complement of the given set, takes the smallest element \({n}\) and notices that this cannot be \({0}\) and then uses the inductive assumption on \({n-1}\)). But how would we define “subtract \({1}\)” in a general well-ordered set? This turns out to not be doable, and this is precisely what goes wrong in the example. More precisely, “subtract \({1}\)” should mean “take the immediate predecessor”, and such need not exist. In the given example, the element \({(1,0)}\) does not have any immediate predecessor, since the elements smaller than \({(1,0)}\) are precisely those in the chosen set \({A}\), and an immediate predecessor would be a largest element in \({S}\), which clearly does not exist. The above suggests how we can remedy the situation: Replace “\({0\in A}\)” by “\({x\in A}\) for any \({x\in X}\) which does not have an immediate predecessor”. And indeed, with this version, we can do induction (I leave the proof as an exercise to the reader. It is just a small change compared to the other proof given).

## Partial orders

Back to the more general case: Let \({X}\) be a partially ordered set. What do we need to assume about this partial order in order to do induction on \({X}\) (from now on, by induction I will mean “strong” induction since we cannot really make sense of the other kind in this larger generality). Here the answer turns out to be that we need \({X}\) to be well-founded (actually we do not need an ordering, just any well-founded relation, but I will stick with an ordering). So let us define this.

DefinitionA partially ordered non-empty set \({X}\) is said to be well-founded if any non-empty subset of \({X}\) has a minimal element.

Let us as previously prove that this condition is both sufficient and necessary to do induction. First, sufficient.

TheoremLet \({X}\) be a well-founded partially ordered set and \({A\subseteq X}\) such that for all \({x\in X}\) we have \({\left(y < x \implies y\in A\right)\implies x\in A}\).Then\({A = X}\).

*Proof:* Let \({B = X\setminus A}\). Assume for the purpose of contradiction that \({B}\) is not empty, and let \({b\in B}\) be a minimal element. Now it is clear that if \({x\in X}\) with \({x < b}\) then \({x\in A}\) since otherwise \({b}\) would not be minimal. But then \({b\in A}\) contradicting the choice of \({b}\). \(\Box\)

And necessary.

TheoremLet \({X}\) be a partially ordered non-empty set such that whenever a subset \({A\subseteq X}\) satisfies \({\left(y < x \implies y\in A\right)\implies x\in A}\) for all \({x\in X}\) then \({A = X}\).Then\({X}\)iswell-founded.

*Proof:* Let \({B\subseteq X}\) and assume that \({B}\) does not have a minimal element. Let \({A = X\setminus B}\). Now if \({x\in X}\) and \({y\in A}\) for all \({y < x}\) then also \({x\in A}\) as otherwise \({x}\) would be a minimal element of \({B}\). But then by assumption, \({A = X}\) so \({B}\) is empty. \(\Box\)

So now we know when we can do induction. But is this ever useful? And it sure is! For a concrete example of a well-founded set with a non-total order, we can take \({{\mathbb Z}}\) with the ordering \({m\leq n}\) if \({m = n}\) or \({|m| < |n|}\) (the “closer to zero” ordering). Sure, we could also mess a bit with the ordering to make it total, but this makes for a simpler description (for an example where this ordering is used, see my answer here). A useful and easy result for proving that certain sets are well-founded is the following.

PropositionLet \({X}\) be a partially ordered non-empty set such that for all \({x\in X}\), the set \({ \{y\in X\mid y\leq x \} }\) is finite.Then\({X}\)is well-founded.

*Proof:* Let \({A\subseteq X}\) be a non-empty subset and let \({a\in A}\). Now the set \({ \{x\in X\mid x\leq a \} \cap A}\) is finite and non-empty and thus has a minimal element, which is a minimal element in \({A}\). \(\Box\)

## An exercise for the readers

Finally, to end the post, I have an exercise for the readers. Go and find good examples of proofs by induction where the set in question is not the natural numbers. Even better, find ones where the set in question is not the natural numbers, but where the proof ended up being more complicated because the author has taken pains to only do induction on the natural numbers (this need not reflect poorly on the author. If the proof is meant for inexperienced people, then it is often better to give the less direct proof which uses the sort of induction the readers will be familiar with). Go ahead and fill the comments with such examples.

Filed under Advanced High School Undergraduate

Tagged: Elementary-set-theory, Induction

The second boxed theorem should make it clear that the $m$ is universally quantified within the antecedent, i.e. $(\forall m (m < n \implies m\in A\right))\implies n\in A$. Otherwise it could be interpreted as a $\exists m$ and this leads to something definitely false.

Thank you for the comment. I agree that your version is technically more correct, but I disagree that it could be interpreted differently. Implications will generally be interpreted in the way I intend here. This does mean that I have been a bit inconsistent, as I could have left out a “for all” in the box preceding it. I felt that adding the extra “for all” in the second box (and those others with similar statements) would make the formulation overly clunky to read, but I will consider if it might be possible without disturbing the readability too much.

To be more precise about the reason I suggest an “exists” interpretation, the way I originally read it (with my “formalist” hat on) was to “repair” the free variable m by universally quantifying the whole formula to get “for all m, ((m < n -> m in A) -> n in A)”, which is propositionally equivalent to “(exists m:(m < n -> m in A)) -> n in A” and is not what you intended. The first boxed equation is stated correctly, and “strong” induction is called that because you are allowed to make a stronger assumption in your induction proof, hence the presence of an additional quantifier by comparison to the first theorem. I would write the boxed equation as:

Theorem Let $A\subseteq\Bbb N$ such that for all $n\in\Bbb N$ we have $(\forall m<n,\ m\in A)\implies n\in A$. Then $A=\Bbb N$.

From a computer science perspective, induction on the natural numbers is just a very special case of structural induction – if we define a type

`Nat = Zero | Succ of Nat`

(borrowing syntax from Standard ML) to mimic the set-theoretic construction then what we’re really saying that for a predicate`p`

on`Nat`

, if`p(Zero)`

holds and for any`n`

in`Nat`

we hage`p(n) => p(Succ(n))`

then`p`

must be true over all of`Nat`

. Of course this misses the mathematically interesting point that all such “recursively defined structures” can be ordered nicely, by definition. But it gives a host of examples on which one can do induction: lists, binary trees, in fact pretty much any useful data structure.Great Blog!!