Archive for the 225 book tag

Sets




I don’t want to belong to any club that will accept people like me as a member.

Groucho Marx


In our quest to reason about quantities and relationships, we’d like to handle quantities that are a bit more general than integers (real numbers, for example). We’d also like to handle relationships. To do this we will develop a language that is essentially equivalent to logic, but is a bit more intuitive to use. That language is the language of sets.

Overview of Sets

What is a set?

The intuitive definition of a set is just a collection of things. This is, of course, not much of a definition. Many mathematicians in the late nineteenth century actually thought the precise definition of set did not much matter. But the philosopher and mathematician Bertrand Russell, writing in 1901, discovered that we have to be more careful. He considered the following:

Definition. We call a collection ordinary if it does not contain itself. Otherwise we call the collection extraordinary.

Most collections are ordinary, for example the collection of all Bernese mountain dogs is not itself a Bernese mountain dog. The set of all natural numbers is not itself a natural number. But some collections are extraordinary, for example:

  • The collection of abstract ideas is an abstract idea.
  • The collection of mathematical objects is a mathematical object.
  • The collection of all collections is a collection.

Russell therefore asked the question: Is the collection of ordinary collections ordinary? To make this easier, let’s write $E$ for the collection of extraordinary collections, $O$ for the collection of ordinary collections, and write $\in E$ and $\in O$ for “is extraordinary” and “is ordinary”, respectively. Thus a collection $C$ satisfies $C\in O$ exactly when $C\notin C$ and $C\in E$ exactly when $C\in C$.

There are two possibilities. If the collection of ordinary collections is ordinary ($O\in O$), then, being ordinary, $O$ cannot contain itself, i.e. $O\notin O$, which makes $O$ extraordinary. If the collection of ordinary collections is extraordinary ($O\in E$), then $O$ must contain itself, i.e. $O\in O$, which makes $O$ ordinary. Either way, $O$ is both ordinary and extraordinary. But a collection cannot be both ordinary and extraordinary.

So such a collection leads to a paradox, precisely as the sentence this sentence is false did. Russell’s paradox shows that we need to be very careful about the words “is in”.

Definition. A set $S$ is an object for which the question “Is $x$ in $S$?” has an unambiguous answer. We write $x\in S$ (read “$x$ is an element of $S$”) if the answer is yes and $x\notin S$ if the answer is no.

Another way to phrase this is: $S$ is a set exactly when $x\in S$ is an open sentence.
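
If it helps to see this definition operationally, here is a minimal Python sketch (an illustration of my own, with predicates chosen arbitrarily): a set is modeled by its membership test, i.e. by the open sentence $x\in S$.

```python
# On this definition, a set is anything with an unambiguous membership test.
# Here we model two sets of integers by their membership predicates.

def in_evens(x):
    """The open sentence 'x is in the set of even integers'."""
    return isinstance(x, int) and x % 2 == 0

def in_naturals(x):
    """The open sentence 'x is in N' (in this course, N starts at 1)."""
    return isinstance(x, int) and x >= 1

print(in_evens(14))     # True: the question has an unambiguous answer (yes)
print(in_evens(3))      # False: an unambiguous answer (no)
print(in_naturals(-2))  # False
```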

The Well-Ordering Principle



Here’s a nice fact about the natural numbers:

Well-Ordering Principle. Every nonempty collection of natural numbers has a least element.

Observe, before we prove this, that a similar statement is not true of many sets of numbers. The interval $(0,1)$, for example, has no least element. The set of even integers has no least element. The set of natural numbers has no greatest element.

Proof. We’ll prove the contrapositive, namely, that if a collection of natural numbers has no least element, then it must be empty.

Let $T$ be a collection of natural numbers with no least element, and consider $S$, the set of natural numbers that aren’t in $T$ (in set notation, $S=\mathbb{N}\setminus T$). We’ll show that, for all $n\in\mathbb{N}$, $n$ is in $S$; this means $T$ is empty.

base case $n=1$: If $1\in T$, then $1$ would be a least element of $T$, since $1$ is the least natural number. So $1\notin T$, i.e. $1\in S$.

inductive step: Suppose $n\in S$. If any of the numbers $1,\ldots,n$ were in $T$, then one of them would be a least element of $T$ (there are only finitely many such numbers, and we proved that every finite set of numbers has a least element), contradicting our assumption about $T$. So none of the numbers $1,\ldots,n$ is in $T$, and every element of $T$ is at least $n+1$. If $n+1\in T$, then $n+1$ would be least. So $n+1\notin T$, i.e. $n+1\in S$. This completes the inductive step.

So we’ve shown that every natural number is in $S$, which means that $T$ must be empty. This establishes the contrapositive of the theorem.$\Box$

The Well-Ordering Principle can be used to prove all sorts of theorems about natural numbers, usually by assuming some set $T$ is nonempty, finding a least element $n$ of $T$, and “inducting backwards” to find an element of $T$ less than $n$, thus yielding a contradiction and proving that $T$ is empty. In this sense, the Well-Ordering Principle and the Principle of Mathematical Induction are just two ways of looking at the same thing.

Indeed, one can prove that WOP, PCI, and PMI are all logically equivalent, so we could have taken any one of them as our fifth axiom for the natural numbers.

Fundamental Theorem of Arithmetic. Any natural number greater than 1 can be written as the product of primes.

Proof. Let $S$ be the set of natural numbers greater than 1 which cannot be written as the product of primes. Suppose, for the sake of contradiction, that $S$ is nonempty. By WOP, $S$ has a least element $n$.

Clearly $n$ cannot be prime (a prime is itself a one-term product of primes), so $n$ is composite. Then we can write $n=ab$, where neither $a$ nor $b$ is 1 or $n$. So $a<n$ and $b<n$. If both of $a$ and $b$ had prime factorizations, then so would $n$. Therefore at least one of $a$ and $b$ does not have a prime factorization (by relabeling, we can assume it’s $a$). But then $a<n$ and $a\in S$, so $n$ was not least in $S$.

This contradiction establishes that $S$ is empty. So every natural number greater than 1 can be written as the product of primes.$\Box$

Observe that this proof has more or less the same “juicy bits” as the proof by PCI.

The Division Algorithm. Let $a,b$ be natural numbers. Then there are nonnegative integers $q,r$ so that $0\leq r<b$ and $a=bq+r$. Moreover, $q$ and $r$ are unique. We call $q$ the quotient and $r$ the remainder.

This theorem is called the Division Algorithm because it asserts that any natural number can be divided, with remainder, by any other natural number.

Proof. Given $a$ and $b$, if $a<b$, then set $q=0$ and $r=a$. Then $0\leq r<b$, and $a=qb+r$ as required.

If $a=b$, set $q=1$ and $r=0$.

We are left with the case $a>b$. Consider the set $S=\left\{n\in\mathbb{N}\middle| a-nb< 0\right\}$. Since $b\geq 1$, $kb\geq k$ for all $k\in\mathbb{N}$. In particular, $ab\geq a$, or equivalently $a-ab\leq 0$. So $a-(a+1)b=(a-ab)-b\leq -b<0$, hence $a+1\in S$.

Since $S$ is nonempty, by WOP, $S$ has a least element, which we’ll call $s$. Note that $s\geq 2$: since $a>b$, we have $a-b>0$, so $1\notin S$. Set $q=s-1$ and $r=a-qb$. By choice of $r$, we have $a=qb+r$, but we need to show that $0\leq r<b$.

Since $s$ is least in $S$, $q=s-1$ cannot be an element of $S$. So $r=a-qb\geq 0$.

We need to show that $r<b$. Suppose, to the contrary, that $r\geq b$. Then\begin{align*}
r&\geq b\\
a-qb&\geq b\\
a-qb-b&\geq 0\\
a-(q+1)b&\geq 0
\end{align*}

so that $q+1\notin S$. But $q+1=s$, which is an element of $S$. This contradiction establishes that $r< b$ as required.

Now that we have established the existence of $q$ and $r$, let us prove their uniqueness. Consider two pairs $(q,r)$, $(q',r')$ so that\begin{align*}
a&=qb+r\\
a&=q'b+r'
\end{align*}

Without loss of generality, assume $r\geq r'$. We have\begin{align*}
q'b+r'&=qb+r\\
q'b-qb&=r-r'\\
(q'-q)b&=r-r'
\end{align*}

Since $b\geq 1$, this means that $r\geq r'$ guarantees $q\leq q'$, and that $q=q'$ iff $r=r'$. We will show that $q=q'$, hence $r=r'$, i.e. the quotient-remainder pair is unique.

Consider the set $S$ as above, with least element $s$. If $q'>s-1$, then $q'\geq s$, so $q'\in S$, i.e. $r'=a-q'b<0$. But a remainder cannot be negative, so $q'$ cannot be a quotient. Hence $q'\leq s-1$. On the other hand, if $q'<s-1$, then $q'\leq s-2$, so\begin{align*}
b> r'&=a-q'b\\
&\geq a-(s-2)b\\
&=\left(a-(s-1)b\right)+b
\end{align*}

and subtracting $b$ from both sides we have $0>a-(s-1)b$, i.e. $s-1\in S$. But $s$ is least in $S$, so $s-1\notin S$. This contradiction establishes that $q'\geq s-1$.

So we see that $q'=s-1$. But the same argument applies to $q$, hence $q=q'$.$\Box$
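
The existence half of this proof is secretly an algorithm: start at $n=1$ and climb until $a-nb$ first goes negative; that first $n$ is the least element $s$ of $S$, and then $q=s-1$ and $r=a-qb$. Here is a minimal Python sketch of that procedure (an illustration only; it is deliberately naive, and it happens to cover the cases $a\leq b$ as well):

```python
def divide(a, b):
    """Return (q, r) with a = q*b + r and 0 <= r < b, for natural numbers a, b.

    Mirrors the proof: s is the least element of S = {n : a - n*b < 0},
    and we take q = s - 1, r = a - q*b.
    """
    s = 1
    while a - s * b >= 0:  # climb until we enter S
        s += 1
    q = s - 1
    r = a - q * b
    return q, r

print(divide(23, 5))  # (4, 3), since 23 = 4*5 + 3
```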

Corollary. Any integer is either even or odd.

Proof. Given an integer $n$, if $n$ is even we’re done. So suppose $n$ is not even, i.e. we know that we cannot write $n=2k$ for any $k$.

The division algorithm says we must be able to write $n=2k+r$ for some $0\leq r<2$. Since we ruled out $r=0$, we must have $r=1$, i.e. $n=2k+1$ is odd. $\Box$

Note that we established previously that no integer could be both even and odd. Now we know that “odd” and “not even” are, in fact, synonyms.

“Disguised” Induction Proofs

We can use the WOP to give a kind of induction proof in disguise. Consider:

Claim. The sum of the first $n$ natural numbers is $\frac{n(n+1)}{2}$.

Ordinarily, we’d prove this by induction.

Exercise. Write a proof of this claim by ordinary induction.

But we can set up a proof that uses the Well-Ordering Principle, like this:

Proof. Suppose not. Then there is some natural number (call it $p$) so that the sum of the first $p$ natural numbers isn’t $\frac{p(p+1)}{2}$. Consider the set of all such numbers:

$$A=\left\{k\middle\vert 1+\cdots+k\neq \frac{k(k+1)}{2}\right\}$$

Our assumption is that $p\in A$; in particular $A$ is not empty. (We believe in our hearts that there are no such numbers, i.e. that $A$ is empty. But this is a reductio ad absurdum proof, so we’ll have to roll with it.)

By the Well-Ordering Principle, $A$ has a least element, which we’ll call $N$.

Maybe $N=1$? No, because $1=\frac{1\cdot (1+1)}{2}$. So $N>1$. But then $N$ has a predecessor, $n=N-1$.

Since $n<N$, we know $n\notin A$ ($N$, after all, was least among the elements of $A$).

That is, $1+2+\cdots+n=\frac{n(n+1)}{2}$.

Now, consider $1+2+\cdots+N$:

\begin{align*}
1+2+\cdots+N&=1+2+\cdots+n+(n+1)\\
&=\frac{n(n+1)}{2}+n+1\\
&=\frac{n(n+1)}{2}+\frac{2(n+1)}{2}\\
&=\frac{n^2+n+2(n+1)}{2}\\
&=\frac{n^2+3n+2}{2}\\
&=\frac{(n+1)(n+2)}{2}=\frac{N(N+1)}{2}
\end{align*}

which shows that actually, $N\notin A$. But this is a contradiction! So our initial supposition was incorrect, and thus the claim is true. $\Box$.

Now, two parts of this proof deserve special attention. The check that $N\neq 1$ is what would be the base case if we wrote an ordinary induction proof. The computation of $1+2+\cdots+N$ is the inductive step (from $n=N-1$ to $n+1=N$).

In fact any proof by ordinary induction can be gussied up this way, if you desire.
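
If you like, you can also probe the set $A$ computationally. The following minimal Python sketch (an illustration only) searches the first thousand natural numbers for a least element of $A$; as the proof predicts, it finds none.

```python
# Search for the least element of A among the first 1,000 natural numbers.
total = 0
least = None
for k in range(1, 1001):
    total += k                      # total is now 1 + 2 + ... + k
    if total != k * (k + 1) // 2:   # is k in A?
        least = k
        break

print(least)  # None: no element of A was found, as the claim predicts
```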

Complete Induction




Travel isn’t always pretty. It isn’t always comfortable. Sometimes it hurts, it even breaks your heart. But that’s okay. The journey changes you; it should change you. It leaves marks on your memory, on your consciousness, on your heart, and on your body. You take something with you.

Anthony Bourdain, No Reservations: Around the World on an Empty Stomach


Induction is like climbing a ladder. But there are other ways to climb besides ladders. Rock climbers don’t just stand on one step. To get up the rock, a rock climber needs to use all four limbs (not just, like the ladder climber, the previous step).

Like the rock climber, we are allowed to remember not just that we are only one step away from where we’re trying to get, but also how we got to where we are. Why not look back as many steps as we need?

Definition. We call an open sentence $P(n)$ completely inductive if it has the property that $\forall n, \left[P(1)\wedge P(2)\wedge\cdots\wedge P(n)\right]\Rightarrow P(n+1)$.

If inductive sentences represent hereditary properties, completely inductive sentences represent properties which pass to the next generation but possibly only if the entirety of the lineage had the property.

Principle of Complete Induction. Suppose $P(n)$ is completely inductive and $P(1)$ is true. Then $\forall n, P(n)$.

How might we use a statement like the Principle of Complete Induction? Let’s see an example, and then come back to the justification that PCI works.

Fundamental Theorem of Arithmetic. Every natural number greater than 1 has a prime factorization; that is, given any $n\in\mathbb{N}$ with $n\neq 1$, there are prime numbers $p_1,p_2,\ldots, p_k$ so that we can write $n=p_1\cdot p_2\cdot \cdots \cdot p_k$.

The statement of this theorem contains two new terms: prime number and prime factorization. In the statement itself, we’ve explained what prime factorization means; but we still need to know what a prime number is.

Definition. A natural number $p>1$ is prime if the only natural numbers which divide $p$ are 1 and $p$.

Let’s get an example of a prime number.

Lemma. 2 is a prime number.

Proof. We attempt to factor 2; that is, to write $2=ab$ for some natural numbers $a$ and $b$; to show 2 is prime means we’ll show either $a=1$ or $a=2$. Now the product of any two natural numbers is at least as large as either factor, so we have $a\leq 2$ and $b\leq 2$. Since $a$ and $b$ are natural numbers, $1\leq a$ and $1\leq b$. So we have $1\leq a\leq 2$ and $1\leq b\leq 2$. But the only natural numbers between 1 and 2 (inclusive) are 1 and 2. So either $a=1$ or $a=2$. $\Box$.

Exercise. Adapt this proof to show that 3 is a prime number.

Exercise. Show that 5 is a prime number.

Now we’re ready to prove the Fundamental Theorem of Arithmetic.

Proof of the Fundamental Theorem of Arithmetic. We’ll prove the claim $\forall n, \left(n>1\Rightarrow n\text{ has a prime factorization}\right)$ by complete induction. We’ll refer to $n>1\Rightarrow n\text{ has a prime factorization}$ as $P(n)$.

(base case: $n=1$.) $P(1)$ is a conditional with a false antecedent; so $P(1)$ is true.

(base case: $n=2$.) $P(2)$ is “If 2>1 then 2 has a prime factorization.” 2 is prime, so there’s the prime factorization.

(inductive step.) Consider some natural number $K>2$. Assume that $P(1), P(2),\ldots, P(K-1)$ are all true; that is, assume that any natural number $n$ with $1<n<K$ has a prime factorization. We’ll use this assumption to show that $K$ has a prime factorization.

If $K$ is prime, then there’s our prime factorization.

If $K$ is not prime, then $K$ has a factor other than itself and 1. Call this number $a$. We have $K=ab$ for some natural number $b$. Because $a\neq 1$, we have $b\neq K$; because $a\neq K$, we have $b\neq 1$. Moreover, the product of any two natural numbers is at least as large as either factor, so we know $1\leq a\leq K$ and $1\leq b\leq K$. All in all, we have$$1<a<K,\qquad 1<b<K$$

By our inductive assumption, $a$ has a prime factorization, say $a=p_1\ldots p_k$. The inductive assumption also applies to $b$ to give some primes $q_1,\ldots,q_\ell$ with $b=q_1\cdots q_\ell$.

Then $$K=ab=p_1\ldots p_k\cdot q_1\cdots q_\ell$$

so $K$ has a prime factorization in this case, too.

In either case, $K$ has a prime factorization; this completes the inductive step.

By the Principle of Complete Induction, we must have $P(n)$ for all $n$, i.e. any natural number greater than 1 has a prime factorization.$\Box$

A few things to note about this proof:

  • This use of the Principle of Complete Induction makes it look much more powerful than the Principle of Mathematical Induction. Instead of “looking back” one step, we got to “look back” as far as we needed. If we take $K=60$, for example, then we could use $a=5$ and $b=12$. Instead of relying on the previous case ($K=59$), we’re relying on $K=5$ and $K=12$, which are quite far away from $K=60$.
  • This proof actually provides something of an algorithm for finding prime factorizations, probably the same one you were taught in grade school.
  • Just like ordinary inductive proofs, complete induction proofs have a base case and an inductive step.
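
The observation in the second bullet is worth making concrete. Here is a minimal Python sketch (an illustration only, not an efficient factoring routine) of the algorithm implicit in the proof: look for a factor $a$ with $1<a<K$; if there is one, factor $a$ and $b=K/a$ in turn, and if there isn’t, $K$ is prime.

```python
def prime_factorization(K):
    """Return a list of primes whose product is K (for K > 1).

    Mirrors the inductive step: find a factor a with 1 < a < K; then
    K = a*b with both factors smaller than K, and we factor each in turn.
    If no such factor exists, K is prime and is its own factorization.
    """
    for a in range(2, K):
        if K % a == 0:              # K is composite: K = a*b
            b = K // a
            return prime_factorization(a) + prime_factorization(b)
    return [K]                      # K is prime

print(prime_factorization(60))  # [2, 2, 3, 5]
```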

One large class of examples of PCI proofs involves taking just a few steps back. (If you think about it, this is how stairs, ladders, and walking really work.) Here’s a fun definition.

Definition. The Fibonacci numbers are defined as follows: $f_1=1$ and $f_2=1$. For any $k>2$, $f_k=f_{k-1}+f_{k-2}$.

We call definitions like this completely inductive definitions because they look back more than one step.

Exercise. Compute the first 10 Fibonacci numbers.
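
A completely inductive definition translates directly into a computation that looks back more than one step. Here is a minimal Python sketch (an illustration; you can use it to check your answer to the exercise):

```python
def fibonacci(n):
    """Return the list [f_1, ..., f_n] of the first n Fibonacci numbers."""
    fs = [1, 1]                     # f_1 and f_2
    while len(fs) < n:
        fs.append(fs[-1] + fs[-2])  # f_k = f_{k-1} + f_{k-2}: look back two steps
    return fs[:n]

print(fibonacci(10))  # [1, 1, 2, 3, 5, 8, 13, 21, 34, 55]
```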

Typically, proofs involving the Fibonacci numbers require a proof by complete induction. For example:

Claim. For any $n$, $f_n+2f_{n+1}=f_{n+3}$.

Proof. For the inductive step, assume that for all $k\leq n$, $f_k+2f_{k+1}=f_{k+3}$. We’ll show that $$f_{n+1}+2f_{(n+1)+1}=f_{(n+1)+3}$$

To this end, consider the left-hand side. \begin{align*}f_{n+1}+2f_{(n+1)+1}&=f_{n}+f_{n-1}+2\left(f_{n+1}+f_n\right)\\&=f_n+2f_{n+1}+f_{n-1}+2f_n\end{align*}

Now we observe that $n\leq n$ and $n-1\leq n$, so we can apply the inductive assumption with $k=n$ and $k=n-1$, to continue:\begin{align*}f_n+2f_{n+1}+f_{n-1}+2f_n&=f_{n+3}+f_{n+2}\\&=f_{n+4}\end{align*} by the definition of the Fibonacci numbers. This completes the inductive step.

Now for the base case. Observe that since we had to take two steps back in our inductive step (to prove the claim at $n+1$, we had to look at $n$ and $n-1$), our inductive step only starts working once we have two steps to rely on. So we need

base case $n=1$: The statement to prove is $f_1+2f_2=f_4$. Since $f_1=1$, $f_2=1$, and $f_4=3$, we have the base case.

base case $n=2$: The statement to prove is $f_2+2f_3=f_5$. Since $f_2=1$, $f_3=2$, and $f_5=5$, we have the base case.

By the Principle of Complete Induction, we have established the claim. $\Box$

Here’s another example:

Claim. For any $n$, and any $a$, $f_af_n+f_{a+1}f_{n+1}=f_{a+n+1}$.

Proof. Fix $a$.

For the inductive step, assume that for all $k\leq n$, $f_af_k+f_{a+1}f_{k+1}=f_{a+k+1}$. We’ll show that $$f_af_{n+1}+f_{a+1}f_{(n+1)+1}=f_{a+(n+1)+1}$$

To this end, consider the left-hand side. \begin{align*}f_af_{n+1}+f_{a+1}f_{n+2}&=f_a\left(f_{n}+f_{n-1}\right)+f_{a+1}\left(f_{n+1}+f_n\right)\\ &=f_af_n+f_af_{n-1}+f_{a+1}f_{n+1}+f_{a+1}f_n\\&=f_af_n+f_{a+1}f_{n+1}+f_af_{n-1}+f_{a+1}f_n\end{align*}

Now we observe that $n\leq n$ and $n-1\leq n$, so we can apply the inductive assumption with $k=n$ and $k=n-1$, to continue:\begin{align*}\overbrace{f_af_n+f_{a+1}f_{n+1}}+\overbrace{f_af_{n-1}+f_{a+1}f_n}&=f_{a+n+1}+f_{a+(n-1)+1}\\&=f_{a+n+1}+f_{a+n}=f_{a+n+2}\end{align*}

This completes the inductive step.

Now for the base case. Again we need two base cases, because our inductive step looked back two steps:

base case $n=1$: The claim is $f_{a}f_1+f_{a+1}f_2=f_{a+2}$. Since $f_1=f_2=1$, this claim is $f_a+f_{a+1}=f_{a+2}$, which is the definition of the Fibonacci numbers.

base case $n=2$: The claim is $f_{a}f_2+f_{a+1}f_3=f_{a+3}$. Since $f_2=1$ and $f_3=2$, we need to establish that $f_a+2f_{a+1}=f_{a+3}$. But we just proved that above.

By the Principle of Complete Induction, we have established the claim. $\Box$

Now that we’ve seen how complete induction works, let’s explain why complete induction works:

Proof of the Principle of Complete Induction. We are given an open sentence $P(n)$ with the properties: $P(1)$ and $P$ is completely inductive.

Consider the sentence $Q(n)=P(1)\wedge P(2)\wedge\cdots\wedge P(n)$. We’ll show first that $Q(n)$ is always true.

First observe that $Q(1)=P(1)$, which by assumption is true.

Now I claim $Q(n)\Rightarrow Q(n+1)$, that is, $Q$ is inductive. Assume $Q(n)$ is true. That is, $P(1)\wedge P(2)\wedge\cdots\wedge P(n)$. Because $P$ is completely inductive, we know $P(n+1)$ is true. Then $Q(n+1)=Q(n)\wedge P(n+1)$ is true.

Thus we have proved, by (ordinary) induction, that $\forall n,Q(n)$.

Why does this establish that $\forall n,P(n)$? Because $Q(n)=P(1)\wedge\cdots\wedge P(n) \Rightarrow P(n)$ by definition.$\Box$

The converse is also true: we could have taken the Principle of Complete Induction as an axiom, and used it to prove the Principle of Mathematical Induction. In fact you’ll do that for homework.

Here’s another example of a proof by complete induction, which shows we might need to go back quite a few steps (hence, have quite a number of base cases to build on):

Claim. If $n\geq 12$, then there are nonnegative integers $k$ and $l$ so that we can write $n=4k+5l$.

Proof.

base case: $n=12$. Let $k=3$ and $l=0$.

base case: $n=13$. Let $k=2$ and $l=1$.

base case: $n=14$. Let $k=1$ and $l=2$.

base case: $n=15$. Let $k=0$ and $l=3$.

Now suppose we’ve established that we can write any number from 12 up to and including $n$ as a sum of 4s and 5s. We may as well assume $n\geq 15$, since we handled the others already. We’ll try to write $n+1$ as a sum of 4s and 5s.

$(n+1)-4=n-3\geq 12$, so we know we can write $(n+1)-4$ as a sum of 4s and 5s, say $(n+1)-4=4k+5l$. But then $n+1=4k+4+5l=4(k+1)+5l$, so we’ve written $n+1$ as the sum of 4s and 5s. This completes the inductive step, hence the proof. $\Box$
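
Once again the proof is an algorithm in disguise: peel off a 4 at a time until you land in one of the four base cases. A minimal Python sketch of this idea (an illustration only):

```python
def as_fours_and_fives(n):
    """Return (k, l) with n = 4*k + 5*l, for n >= 12.

    The base cases n = 12, 13, 14, 15 are handled directly; otherwise
    we represent n - 4 and add one more 4, as in the inductive step.
    """
    base = {12: (3, 0), 13: (2, 1), 14: (1, 2), 15: (0, 3)}
    if n in base:
        return base[n]
    k, l = as_fours_and_fives(n - 4)
    return k + 1, l

print(as_fours_and_fives(23))  # (2, 3): 23 = 4*2 + 5*3
```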

 

Mathematical Induction




千里之行,始於足下

The journey of a thousand miles starts with a single step.

Lao Tzu


Many statements in mathematics are true for any natural number. For example:

Fundamental Theorem of Arithmetic. Every natural number greater than 1 has a unique prime decomposition.

Theorem. Every natural number is either even or odd.

Power Rule for Differentiation. For any natural number $n$, $\frac{d}{dx}\left[x^n\right]=nx^{n-1}$.

Our first use of the definition of $\mathbb{N}$ (other than to prove facts about arithmetic) will be to establish a way to prove statements like this. But to do that, we need to make sure our definition of $\mathbb{N}$ is formal enough to serve as the backbone of a proof. To do this, we’ll focus on the structure of the open sentences involved. Since we are interested only in statements that involve the natural numbers somehow, we’re going to assume that the variables we’re dealing with represent natural numbers unless otherwise specified. That is, we’re taking the universe for everything to be $\mathbb{N}$ until further notice.

Definition. We call an open sentence $P(n)$ inductive if it has the property: $\forall n, P(n)\Rightarrow P(n+1)$.

We can think of inductive sentences as representing hereditary properties — that is, they pass down to successors. Being royal, for example, is an inductive property.

Equipped with this definition, we can state axiom 5 of the natural numbers as a formal symbolic sentence. (If you recall, we stated the first four axioms as formal symbolic sentences, but left the fifth axiom just in terms of words.) Here goes:

The Inductive Axiom. Let $P(n)$ be an open sentence whose variable $n$ represents a natural number. Suppose that:

  • $P(1)$ is true, and
  • $P$ is inductive.

Then $\forall n, P(n)$.

This version still has a few words in it, but we could make it purely symbolic:$$\left(P(1)\wedge \left[\forall n, P(n)\Rightarrow P(n+1)\right]\right)\Rightarrow \forall k,P(k)$$

To see that this version is the same as the wordier version we gave before, let’s take $P(n)$ to be “$n$ can be reached from 1 by a sequence of successions“. It’s clear that this $P$ is inductive: if $n$ can be reached from 1 by successions, so can its successor. Clearly 1 can be reached from 1 by a(n extremely short) sequence of successions. So $P$ meets both criteria — the Inductive Axiom says “any natural number can be reached from 1 by a sequence of successions“.

The Inductive Axiom is also known as the Principle of Mathematical Induction, or PMI for short. It’s the engine that will let us prove lots of statements of the form $\forall n, P(n)$, because that’s the conclusion of PMI.

To use PMI in this way, you need to do a few things:

  • identify what $P(n)$ is
  • verify that $P(1)$ is true (we call this the “base case”)
  • show that $P(n)$ is inductive, which means to prove $\forall n, P(n)\Rightarrow P(n+1)$ (we call this the “inductive step”)

Luckily, we already know how to prove universal claims, and we already know how to prove conditionals. So a general outline of a proof by induction goes like this:

Proof. (by induction)
base case: $n=1$. [Establish $P(1)$]

inductive step: Let $n$ be a natural number. Assume $P(n)$ is true.
[. . . some work goes here . . .]
Conclude $P(n+1)$.

$\Box$

Let’s write an example proof by induction to show how this outline works. I’ll put my commentary in parentheses.

Claim. I can reach any rung of any ladder which is standing on the ground.

Proof. Given a ladder, we’ll show that for any $n$, I can reach the $n^{th}$ rung of that ladder. (Here $P(n)$ is the statement “I can reach the $n^{th}$ rung of the ladder.”)

I can reach the first rung by taking a step — after all, the ladder is standing on the ground. (We’ve just proved $P(1)$.)

Now suppose I’m standing on the $n^{th}$ rung. (This is assuming $P(n)$.) Then I can take a step (These are the “juicy bits” of the proof.), and get to the $(n+1)^{th}$ step. (We’ve concluded $P(n+1)$.)

By the Principle of Mathematical Induction, this shows we can reach any rung of the ladder.  $\Box$

Here’s an example where $P(n)$ is “more mathematical”.

Claim. For any natural number $n$, $1+4+7+\cdots+(3n+1)=\frac{3}{2}n^2+\frac{5}{2}n+1$.

Proof. (base case) When $n=1$, the left-hand side is $1+4=5$. The right-hand side is $\frac{3}{2}\cdot 1^2+\frac{5}{2}\cdot 1+1=\frac{3}{2}+\frac{5}{2}+1=4+1=5$. So the equality is true when $n=1$.

(inductive step) Let $n$ be a natural number. Assume $1+4+7+\cdots+(3n+1)=\frac{3}{2}n^2+\frac{5}{2}n+1$. Consider the left-hand side of $P(n+1)$:\begin{align*} 1+4+7+\cdots+(3(n+1)+1)&=1+4+7+\cdots+(3n+1)+(3(n+1)+1)\\ &=\frac{3}{2}n^2+\frac{5}{2}n+1+3(n+1)+1\\&=\frac{3}{2}n^2+\frac{5}{2}n+3n+5\end{align*}Here we have used the assumption $P(n)$.

Now we work on the right-hand side of $P(n+1)$:\begin{align*} \frac{3}{2}(n+1)^2+\frac{5}{2}(n+1)+1&=\frac{3}{2}(n^2+2n+1)+\frac{5}{2}(n+1)+1\\&=\frac{3}{2}n^2+\frac{5}{2}n+3n+\frac{3}{2}+\frac{5}{2}+1\\&=\frac{3}{2}n^2+\frac{5}{2}n+3n+5\end{align*}

Since the left-hand side of $P(n+1)$ and the right-hand side of $P(n+1)$ are both equal to $\frac{3}{2}n^2+\frac{5}{2}n+3n+5$, they are equal to each other. That is, $P(n+1)$ is true. This completes the inductive step.

By the Principle of Mathematical Induction, we have established the claim. $\Box$.

A few things to note here. First, the base case is usually pretty obvious. Second, the fun (i.e. hard) part of the proof is figuring out how to use $P(n)$ to get $P(n+1)$.

It’s not always obvious what the natural number (or inductive variable) is. Often it’s the size of something (since the natural numbers are for counting!). If your inductive variable isn’t obvious from the problem statement, it’s good form to let the reader know what you’ll be inducting on.

Claim. Every finite set of real numbers has a least element.

Proof. We will proceed by induction on the size of the set.

(base case) Suppose the set has exactly one element. Then that element is least.

(inductive step) Assume we’ve proved that any set of real numbers with exactly $n$ distinct elements has a least element. We’ll show that any set of real numbers with exactly $n+1$ distinct elements has a least element.

Let $S$ be such a set, and label its elements $a_1,a_2,\ldots, a_n,a_{n+1}$. There are two possibilities for how $a_{n+1}$ is related to the other elements of the set.

case: $a_{n+1}<a_k$ for all $k\leq n$. Then $a_{n+1}$ is least.

case: There is some $k\leq n$ with $a_k<a_{n+1}$. Then consider the set $S'$ which consists of all of the $a$s except $a_{n+1}$. $S'$ has exactly $n$ elements, so by our inductive assumption, there is a least element of $S'$ (call it $b$). Then $b\leq a_k<a_{n+1}$, so $b$ is also least in $S$.

In either case, $S$ has a least element, thus completing the inductive step, hence (by PMI) the proof. $\Box$
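
The inductive structure of this proof is exactly the structure of a recursive program: a least element of a set with $n+1$ elements is found by comparing one element against a least element of the remaining $n$. A minimal Python sketch (an illustration only):

```python
def least(numbers):
    """Return a least element of a nonempty list of real numbers.

    Base case: a one-element list; its element is least.
    Inductive step: compare the last element with a least element
    of the rest, mirroring the two cases in the proof.
    """
    if len(numbers) == 1:
        return numbers[0]
    rest_least = least(numbers[:-1])  # least element of the n-element set
    last = numbers[-1]
    return last if last < rest_least else rest_least

print(least([3.5, -2, 7, 0]))  # -2
```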

“Generalized” induction

Now consider the following:

Claim. If I am standing in a second story window, I can get to any rung of any ladder standing on the ground, provided that rung is at least the $15^{th}$ rung.

The form of this claim is: $\forall n, \left(n\geq 15\Rightarrow P(n)\right)$, or $\forall n\geq 15,P(n)$.

Proof. Since I’m in the second story window, I can step out onto the $15^{th}$ rung. (Prove $P(15)$.)

Now if I’m on any rung, I can take a step to get to the next rung. (Prove that $P(n)$ guarantees $P(n+1)$.)

So no matter what rung (above the $15^{th}$) I want to get to, I can. $\Box$.

This technique of starting someplace other than 1 is sometimes called generalized induction, but it really doesn’t deserve such a fancy name. It’s just regular induction, but starting from some number other than 1.

The Natural Numbers and Induction




Die ganzen Zahlen hat der liebe Gott gemacht; alles andere ist Menschenwerk.

The dear God made the integers; everything else is the work of man.

Leopold Kronecker


Most informal definitions of the natural numbers call them the “counting numbers”. This is correct, but it’s not a mathematical definition. To make a mathematical definition, we need to think a bit more carefully about how counting works.

Activity. Think about how you count. Really think about it. If someone gives you a stack of bills, how do you go about counting it?

 

Overview of the Natural Numbers and Induction

  • The Natural Numbers (this page).  Natural numbers are the numbers used for counting.
  • Proof by Induction. The structure of $\mathbb{N}$ provides a handy way to prove statements of the form $\forall n\in\mathbb{N}, P(n)$.
  • Other Uses of Induction.
  • Complete Induction. The inductive idea can be pushed into some interesting places.
  • The Well-Ordering Principle. Induction, in disguise!
  • The Integers and Induction. Technically induction only applies to claims quantified over $\mathbb{N}$, but we can usually use it for claims about the integers, too.

 

What are the Natural Numbers?

Definition. The natural numbers (denoted $\mathbb{N}$) satisfy the following properties:

  1. 1 is a natural number.
  2. Each natural number $n$ has a unique successor $S(n)$, which is a natural number.
  3. If two natural numbers have the same successor, they are the same.
  4. 1 is nobody’s successor.
  5. Any natural number can be reached from 1 by a chain of successions.

We call these the Peano axioms, after Giuseppe Peano. They were first published in 1889.

Exercise. Write clauses 2, 3, and 4 as symbolic sentences. Then compute their symbolic and English denials.

We normally represent the natural numbers as dots on a line:
\begin{tikzpicture}
\filldraw (0,0) circle (3pt) node[align=left, below]{$1$} (2,0) circle (3pt) node[align=left, above]{$S(1)$}  (4,0) circle (3pt) node[align=left, below]{$S(S(1))$}  (6,0) circle (3pt) node[align=left, above]{$S(S(S(1)))$}  (8,0) circle (3pt) node[align=left, below]{$S(S(S(S(1))))$};
\draw[->, line width=2pt] (0,0) -- (2,0);
\draw[->, line width=2pt] (2,0) -- (4,0);
\draw[->, line width=2pt] (4,0) -- (6,0);
\draw[->, line width=2pt] (6,0) -- (8,0);
\draw[->, line width=2pt] (8,0) -- (10,0);
\end{tikzpicture}

where each dot is a natural number, and the arrow points from a number to its successor. We also adopt the symbols 2 for $S(1)$, 3 for $S(S(1))$, 4 for $S(S(S(1)))$, etc.
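
To get a feel for how little the axioms assume, we can model the numerals themselves in code (a toy illustration of my own; the representation of $S(n)$ as a nested pair is an arbitrary choice). A numeral is either 1 or the successor of a numeral, and ordinary integers appear only when we count applications of $S$:

```python
# A toy model of Peano numerals: ("S", x) is the successor of x.
ONE = "1"

def S(n):
    """The successor of n."""
    return ("S", n)

def to_int(n):
    """Count how many times S was applied, recovering an ordinary integer."""
    count = 1
    while n != ONE:
        _, n = n
        count += 1
    return count

four = S(S(S(ONE)))   # S(S(S(1))), the numeral we abbreviate as 4
print(to_int(four))   # 4
```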

Each clause of this definition is necessary in order for counting to work. (1) is obvious, but the others are also necessary. Without the others, we might have “number lines” that look very different.

For example, we might have number systems like:

\begin{tikzpicture}
\filldraw (0,0) circle (3pt) node[align=left, below]{$1$} (2,0) circle (3pt) node[align=left, above]{$S(1)$} ;
\draw[->, line width=2pt] (0,0) -- (2,0);
\end{tikzpicture}

which satisfies axioms 1, 3, 4, and 5, but not axiom 2.

Or this:

\begin{tikzpicture}
\filldraw (0,1) circle (3pt) node[align=left, below]{$1$} (2,2) circle (3pt) node[align=left, above]{$S(1)$}  (4,2) circle (3pt) node[align=left, below]{$S(S(1))$}  (6,2) circle (3pt) node[align=left, above]{$S(S(S(1)))$}  (8,2) circle (3pt) node[align=left, below]{$S(S(S(S(1))))$};
\draw[->, line width=2pt] (0,1) -- (2,2);
\draw[->, line width=2pt] (2,2) -- (4,2);
\draw[->, line width=2pt] (4,2) -- (6,2);
\draw[->, line width=2pt] (6,2) -- (8,2);
\draw[->, line width=2pt] (8,2) -- (10,2);
\filldraw (0,1) circle (3pt) node[align=left, above]{$1$} (2,0) circle (3pt) node[align=left, below]{$S(1)$}  (4,0) circle (3pt) node[align=left, above]{$S(S(1))$}  (6,0) circle (3pt) node[align=left, below]{$S(S(S(1)))$}  (8,0) circle (3pt) node[align=left, above]{$S(S(S(S(1))))$};
\draw[->, line width=2pt] (0,1) -- (2,0);
\draw[->, line width=2pt] (2,0) -- (4,0);
\draw[->, line width=2pt] (4,0) -- (6,0);
\draw[->, line width=2pt] (6,0) -- (8,0);
\draw[->, line width=2pt] (8,0) -- (10,0);
\end{tikzpicture}

which also satisfies axioms 1, 3, 4, and 5, but not axiom 2.

or

\begin{tikzpicture}
\filldraw (0,0) circle (3pt) node[align=left, below]{$1$} (2,0) circle (3pt) node[align=left, right]{$S(1)$} ;
\draw[->, line width=2pt] (0,0) -- (2,0);
\draw[->, line width=2pt] plot [smooth cycle, tension=1] coordinates {(2,0) (3,1) (4,0) (3,-1)};
\end{tikzpicture}

or

\begin{tikzpicture}

\filldraw (0,0) circle (3pt) node[align=left, below]{$1$} (2,0) circle (3pt) node[align=left, right]{$S(1)$} ;

\draw[->, line width=2pt] (0,0) .. controls (.5,.5) and  (1.5,.5) ..(2,0);

\draw[->, line width=2pt] (2,0) ..controls (1.5,-.5) and  (.5,-.5) .. (0,0);

\end{tikzpicture}

Exercise. Identify which Peano axioms each of the “number lines” above satisfies, and which axioms they do not satisfy.

Exercise. Draw a “number line” which satisfies axioms 1, 2, 3, and 4 but not axiom 5.

Peano’s definition of the natural numbers is called an axiomatic definition because it’s really just a list of axioms, or desirable properties. When we make an axiomatic definition, we specify some of the properties we want the object we’re defining to have. This is somewhat of a balancing act: if we list too many such properties, it might be that our list is too much to ask for, and no such object exists. If we ask for too few, on the other hand, then maybe there are many different objects which satisfy our definition.

For now, we’ll take it on faith that there’s a unique system satisfying axioms 1 – 5, which we’ll refer to as the natural numbers.

 

 

Logic: Proofs with Quantifiers



Now we’re at the stage where you’re probably antsy to start actually writing proofs. Excellent! Let’s get started.


The main message (and it’s the reason we spend so much time discussing logic) is this: The logical form of a statement indicates the structure of its proof. That is to say: each bit of logic that makes up a statement (quantifiers and connectives) has a corresponding move in writing the proof.

That’s not to say that the logical structure is the only thing you need to understand to write a proof. Statements have both form (the logic) and content. The content will come into play eventually, and you’ll often need a bit of creativity — an “aha!” moment. The secret is that the logical form of the statement, and the structure of the proof it indicates, gives you a scaffold to support that creative moment and put it to good use when inspiration strikes.


 

There’s another secret, which is that writing a proof (like a lot of writing) happens in a dramatically different order from the order in which a proof is read. There are a lot of drafts and revisions. There’s a lot of goal-setting and question-asking. The important thing to remember here is that you don’t need to know exactly how the whole thing is going to go before you get started.

 

 

Let’s do some examples with a familiar idea, divisibility. We’ll start with the definition:

Definition. We say that the integer $p$ divides the integer $q$ if there is some integer $k$ with $q=pk$; we refer to $k$ as a quotient of $q$ by $p$. In symbols, $p$ divides $q$ means $\exists k: q=pk$.

Notation. We write $p|q$ for $p$ divides $q$.

Notice that in this definition, we’ve specified the universe for both $p$ and $q$. The definition itself is an open sentence with two variables; a third variable is required to state the definition. Finally, observe that the quotient is not assumed to be unique.
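
The definition is easy to model as an open sentence in two variables. A minimal Python sketch (an illustration of my own; it uses the remainder operator as a shortcut for searching over all possible $k$):

```python
def divides(p, q):
    """The open sentence 'p divides q': is there an integer k with q == p*k?"""
    if p == 0:
        return q == 0    # 0 = 0*k for every k, and nothing else is 0*k
    return q % p == 0    # for p != 0, the only candidate is k = q/p

print(divides(4, 12))    # True: k = 3 works
print(divides(5, 12))    # False
```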

Proposition. 3 divides 12.

Let’s think about how we’d prove this. What do we need to do? We need to show that it’s possible to express 12 as a product of 3 and something. The easiest way to do this is to find that something. This one’s easy.

Proof. Let $k=4$. Then $12=3\cdot 4=3\cdot k$. This shows that 3 divides 12. $\Box$

The claim we wanted to prove was existential: $\exists k: 12=3\cdot k$. This example shows that there is essentially just one way to prove an existential claim, and that way has two steps:

  1. Propose a candidate. In this case, our candidate was $k=4$.
    How you come up with the candidate is where the fun/excitement/headaches of an existential proof are.
  2. Verify that the candidate works. In this case, the verification step was simple arithmetic.
    Usually, this step will be more straightforward than the proposal step, but that’s usually because in the proposal step you’ll have picked the candidate to make the verification step easy.

Let’s do a more interesting example.

Proposition. For any integers $a$, $b$, and $c$, if $a$ divides $b$ and $a$ divides $c$, then $a$ divides $b+c$.

Let’s translate this into symbols. It’s best to go in steps. First, we have the key word any, which means this starts with a universal quantifier. In fact there are three of them: one for each of the variables $a$, $b$, and $c$. Then we’ve got an and and an if… then. So a good start is:

\[\forall a,\forall b,\forall c, \left[(a|b)\wedge(a|c)\right]\Rightarrow a|(b+c)\]

We can then substitute in what $a|b$ means, namely $\exists k: b=ak$, and what $a|c$ means, namely $\exists \ell: c=a\ell$. I’ve used different letters for the quotients, because they might be different. (If the quotients in $a|b$ and $a|c$ end up being the same, then we can just declare $k=\ell$ at the end, so we don’t lose anything by using more variables. Always err on the side of using more variables.) We should also substitute $a|(b+c)$: $\exists m: b+c=am$. Then our statement reads:

\[\forall a,\forall b,\forall c, \left[\left(\exists k: b=ak\right)\wedge\left(\exists \ell: c=a\ell\right)\right]\Rightarrow \left(\exists m: b+c=am\right)\]

Now let’s think about how to start our proof. We want to prove something about all possible triples of integers $a,b,c$. How would you prove that all cows are brown? You’d need to test every cow with a brownometer. It would be much more convenient if you had an assistant to bring each cow to a testing barn, so you didn’t have to carry that heavy brownometer around with you. So you’d tell your assistant: bring me the first cow. Then, having checked that cow was brown, you’d say: bring me the second cow. You’d check that cow. Then you’d ask your assistant to bring in the third cow, etc. We’ll do the same thing.

When you need to prove a universal claim, you start out with the word let, which is the rhetorical equivalent of telling your assistant to bring the next cow into the brownometry barn:

Proof. Let $a$, $b$, and $c$ be integers.

So far, so good. Now we’ve handled the 3 universal quantifiers; the next logic we encounter is the $\Rightarrow$. How do we handle that? By assuming the antecedent, and trying to prove the consequent is true.

Assume that $a$ divides $b$ and $a$ divides $c$. We’ll show that $a$ divides $b+c$.

Now what does it mean to prove $a$ divides $b+c$? $a|(b+c)$ is the same as $\exists m: b+c=am$. It’s an existential claim! So we have to  propose some $m$ and verify that it satisfies $b+c=am$. That is, we need to solve the equation $b+c=am$. Now we could just write $m=\frac{b+c}{a}$, but that’s no good because we need $m$ to be an integer, and there’s no guarantee that any random fraction is going to be an integer.

Instead, we’ll use our assumptions.

Because $a$ divides $b$, there is a $k$ with $b=ak$. Because $a$ divides $c$, there is $\ell$ with $c=a\ell$. Set $m=k+\ell$.

That’s the proposal step. Now to verify. We need to verify two things: first, that $m$ is an integer,

Because $k$ and $\ell$ are both integers, and the sum of any two integers is an integer, $m$ is an integer.

and second that $m$ satisfies the equation $b+c=am$.

We also have \[b+c=ak+a\ell=a(k+\ell)=am\]

which shows that $a|(b+c)$. $\Box$

Notice that my annotations were a lot longer than the actual text of the proof. To drive this point home, let me reproduce the proof below, without the annotations, so you can see how short it is:

Proof. Let $a$, $b$, and $c$ be integers. Assume that $a$ divides $b$ and $a$ divides $c$. We’ll show that $a$ divides $b+c$. Because $a$ divides $b$, there is a $k$ with $b=ak$. Because $a$ divides $c$, there is $\ell$ with $c=a\ell$. Set $m=k+\ell$.

Because $k$ and $\ell$ are both integers, and the sum of any two integers is an integer, $m$ is an integer. We also have \[b+c=ak+a\ell=a(k+\ell)=am\]

which shows that $a|(b+c)$. $\Box$


I want to point something out in this proof: the commentary focuses primarily on the consequent $a|(b+c)$. This is an important but counterintuitive lesson: the structure of a conditional proof is drawn from the consequent. That is, when we prove $P\Rightarrow Q$, we need to look at the logical structure of $Q$ to guide our thinking. Think of $P$ as your set of tools, and $Q$ as the goal you want to achieve. Anyone who’s played baseball or softball has been told: keep your eye on the ball. $P$ is the bat; $Q$ is the ball.

 

Here’s another example.

Proposition. Quotients are unique. That is, for any integers $a$ and $b$ with $b\neq 0$, if $a|b$, there is exactly one $k$ so that $b=ak$.

Proof. Let $a$ and $b$ be integers so that $b\neq 0$. Assume $a|b$.

We’ve just taken care of the universal quantifier and stated the antecedent. Because we’re going to focus on the consequent, we haven’t gone to the trouble to unpack the antecedent yet. Let’s state our goal.

We need to show that there is a unique $k$ with $b=ak$. That is, we need to show there is such a $k$, and there is only one such $k$.

Because we’re proving an and, we prove each conjunct separately.

Such a $k$ exists because $a|b$.

Nicely done. How about the second conjunct?

We now need to show: $\forall k,k', \left(b=ak\wedge b=ak'\right)\Rightarrow k=k'$.

This is a universally quantified conditional, so we let, assume the antecedent, and try to prove the consequent. Let’s strategize a moment. How do we prove two numbers (here, $k$ and $k'$) are equal? One strategy is to prove that their difference $k-k'$ is zero. Another strategy would be to prove that $\frac{k}{k'}=1$. Another strategy would be to prove that $k\leq k'$ and $k\geq k'$.

Let $k$ and $k'$ be integers. Assume $b=ak$ and $b=ak'$. Then \[0=b-b=ak-ak'=a(k-k')\]

By the zero-product principle, either $a=0$ or $k-k'=0$.

When you know an or, you can break the proof down into cases. The important thing is that the cases cover all possibilities. Here we have:

case $a=0$.  Then $b=ak=0\cdot k=0$. But we assumed $b\neq 0$. So this case doesn’t actually happen, so we must really be in the other case.

case $k-k'=0$. Then $k=k'$. This is what we wanted to show.

When you do a proof by cases, you need to end each case by either showing that that case doesn’t occur, or that it ends with what you’re trying to prove.

Because we’ve established the claim in each case, we are done. $\Box$.

You might have been tempted to do the following:

\begin{align*}
b&=b\\
ak&=ak'\\
k&=k'
\end{align*}

That is, to divide through by $a$. But we don’t know we can do that yet. Divisibility and proofs involving divisibility are about multiplication, not about division. In fact, division is about divisibility, rather than the other way around.

Logic: Uniqueness



There is another quantifier, besides existential and universal: the unique existential quantifier.

Definition. The sentence $\exists!x:S(x)$ is true if there is exactly one $x$ in the universe so that $S(x)$ is true. We read There is a unique $x$ such that $S(x)$.

Notice that mathematically, unique means something rather different from what it means in ordinary English. Mathematically, unique always appears attached to some property. If we said Maxine’s a unique person, mathematically we’d mean There is only one person in all the universe, and that one person is Maxine.

Proposition. $\exists!x:S(x)$ is equivalent to each of the following:

  1. $\left[\exists x:S(x)\right]\wedge \left[\forall y,z,\left(S(y)\wedge S(z)\right)\Rightarrow y=z\right]$
  2. $\exists x:\left[S(x)\wedge \forall y,\left(S(y)\Rightarrow y=x\right)\right]$

Proof.

First we’ll show that each of (1) and (2) is enough to guarantee $\exists!x:S(x)$. (1) guarantees that there is at least one $x$ with $S(x)$. If we had two distinct members of the universe, say, $y$ and $z$, satisfying $S(y)$ and $S(z)$, then we’d have $y=z$, so $y$ and $z$ wouldn’t actually be distinct after all. Thus there is at most one $x$ with $S(x)$.

Now assume (2) is true. We know that there is at least one $x$ with $S(x)$. Suppose we had another, say $x'$. Then applying $\forall y, S(y)\Rightarrow y=x$ with $y=x'$, we see that $x=x'$. So $x'$ was $x$ to begin with. So there is at most one $x$ with $S(x)$.

Now we’ll show that $\exists!x:S(x)$ guarantees (1) and (2).

First, examine (1). Since we are proving a statement of the form $A\wedge B$, we must establish both conjuncts $A$ and $B$. The first conjunct $\exists x:S(x)$ is clearly true; there is a unique $x$ with $S(x)$, therefore there is some $x$ with $S(x)$. Now let’s prove the second conjunct, $\forall y,z,\left(S(y)\wedge S(z)\right)\Rightarrow y=z$. If this were false, we’d have $\exists y:\exists z:\left(S(y)\wedge S(z)\wedge y\neq z\right)$. Thus we have two distinct values of $x$ for which $S(x)$ is true. But this contradicts $\exists!x:S(x)$.

Now consider (2). We need to find the special $x$; let’s use the $x$ given by $\exists!x:S(x)$. Such an $x$ has $S(x)$. Now given $y$ with $S(y)$, we see that since there is exactly one value $z$ with $S(z)$, and $S(y)$ and $S(x)$, it must have been that $y=x$.

Thus we’ve shown that $\exists!x:S(x)$ guarantees (1) and (2) and each of (1) and (2) guarantees $\exists!x:S(x)$. This completes the proof.$\Box$

We can describe (1) above as saying: there is at least one $x$ with $S(x)$, and there is at most one $x$ with $S(x)$.

We can describe (2) as saying: there is a special $x$ which has both $S(x)$ and the property that whenever $S(y)$ is true, $y$ must be the same as $x$.
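
Over a finite universe, the proposition can be checked by brute force. Here is a minimal Python sketch (an illustration of my own, with a toy universe and a toy predicate):

```python
universe = range(-10, 11)            # a toy finite universe
S = lambda x: x * x == 9 and x > 0   # S(x): "x is a positive square root of 9"

# Exactly one x with S(x):
unique = sum(1 for x in universe if S(x)) == 1

# Formulation (1): at least one x, and any two satisfying S are equal.
form1 = any(S(x) for x in universe) and all(
    y == z for y in universe for z in universe if S(y) and S(z))

# Formulation (2): some x satisfies S, and anything satisfying S equals it.
form2 = any(S(x) and all(not S(y) or y == x for y in universe)
            for x in universe)

print(unique, form1, form2)  # True True True
```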

Here we could say some words about how to prove a statement of the form $\exists!x:S(x)$, but we’ll postpone that a bit.

Exercising our powers of what-if-not thinking, let’s ask how $\exists!x:S(x)$ could be false. Consider the following statements:

  1. There is a unique President of the United States.
  2. There is a unique US Senator from Florida.
  3. There is a unique US Senator from Washington, DC.

(1) is true. (2) and (3) are false, but false for different reasons. Let’s see why:

Exercise. Compute the denial of $\exists!x:S(x)$, using the fact that $\exists!x:S(x)$ can be expressed

\begin{equation*}
\left[\exists x:S(x)\right]\wedge \left[\forall y,z,\left(S(y)\wedge S(z)\right)\Rightarrow y=z\right]
\end{equation*}

Explain the relevance of this to the problem of uniqueness of Senators from Florida and from Washington, DC.

Logic: Quantifiers



Some sentences feel an awful lot like statements but aren’t. For example,

n is even

This is not a statement because it doesn’t have a truth value; unless we know what n is, we can’t really do much.

Definition. A sentence with one or more variables, so that supplying values for the variables yields a statement, is called an open sentence.

You can think of an open sentence as a function whose values are statements. In fact we will use function notation to name open sentences.

S(n)= n is even

Just as with ordinary functions, this notation works by substitution. With S(n) defined as above,

S(3)= 3 is even

which happens to be a false statement. S(14), on the other hand, is a true statement.

Along with an open sentence, we have to provide some kind of indication of what sort of thing the variable might be. For our example S(n), it makes most sense to let n be a natural number or possibly an integer. We call possible values for the variable of an open sentence the universe of that sentence.

One thing that cannot be emphasized enough is that variables can represent any type of thing, not just numbers or other mathematical objects. So we could think about the open sentence

A(r,t)= r is t years old

and say that the universe for r is everyone in this class and the universe for t is any whole number between 15 and 60.

Definition. Given an open sentence with one variable $S(x)$, the statement $\forall x, S(x)$ is true when, no matter what value of $x$ we use, $S(x)$ is true; otherwise $\forall x, S(x)$ is false.
Given an open sentence with one variable $S(x)$, the statement $\exists x: S(x)$ is true when there is some value of $x$ for which $S(x)$ is true; otherwise $\exists x:S(x)$ is false.

Terminology. We call $\forall$ the universal quantifier, and we read $\forall x, S(x)$ as for all $x$, $S(x)$. We call $\exists$ the existential quantifier, and we read $\exists x:S(x)$ as there exists $x$ such that $S(x)$.

Observe that if there are only two possible values in the universe for $S(x)$ (let’s call them $\alpha$ and $\beta$), then $\forall x,S(x)$ is true when both $S(\alpha)$ and $S(\beta)$ are true. But this is the same as $S(\alpha)\wedge S(\beta)$ being true. Similarly, $\exists x:S(x)$ is true when one of $S(\alpha)$ or $S(\beta)$ is true. But this is the same as $S(\alpha)\vee S(\beta)$. So we see that the quantifiers are in some sense a generalization of $\vee$ and $\wedge$. So the following makes sense:

De Morgan’s Laws, quantifier version: For any open sentence $S(x)$ with variable $x$,

  1. $\sim \left[\forall x,S(x)\right] \equiv \exists x:\left[\sim S(x)\right]$
  2. $\sim \left[\exists x:S(x)\right] \equiv \forall x,\left[\sim S(x)\right]$
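
Over a finite universe this observation is literal: $\forall$ behaves like Python’s all (a big conjunction) and $\exists$ like any (a big disjunction), and De Morgan’s laws can be spot-checked by brute force. A minimal sketch (an illustration of my own, with an arbitrary finite universe):

```python
universe = range(1, 21)          # a finite universe to quantify over
S = lambda n: n % 2 == 0         # the open sentence "n is even"

forall_S = all(S(n) for n in universe)  # ∀n, S(n): conjunction over the universe
exists_S = any(S(n) for n in universe)  # ∃n: S(n): disjunction over the universe

# De Morgan, quantifier version:
print((not forall_S) == any(not S(n) for n in universe))  # True
print((not exists_S) == all(not S(n) for n in universe))  # True
```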

For example, a denial of the statement

There is a china teapot floating halfway between the earth and the sun.

is

Every china teapot is not floating halfway between the earth and the sun.

the universal quantifier, conditionals, and the universe

Quantifiers are most interesting when they interact with other logical connectives. For example, consider the following (true) statement:

Every multiple of 4 is even.

We could choose to take our universe to be all multiples of 4, and consider the open sentence

E(n)= n is even

and translate the statement as $\forall n,E(n)$. But that isn’t very interesting. A much more natural universe for the sentence n is even is the integers. But then we have to do something clever, because if our universe for $n$ is the integers, then $\forall n,E(n)$ is false. The solution is to create another open sentence

F(k)= k is a multiple of 4

How do we use $F$ and $E$ to translate our true statement? We can think of an open sentence as a test: if we plug in a value for its variable(s), we see whether that value passes the test. Here we have two tests: $E$, a test for evenness, and $F$, a test for multiple-of-4-ness. The statement we are trying to translate says that passing the $F$ test is enough to guarantee passing the $E$ test. That sounds like a conditional. Indeed the correct translation for Every multiple of 4 is even is:

$\forall n,F(n)\Rightarrow E(n)$

Try translating this statement back into English using some of the various translations for $\Rightarrow$ to see that it really does mean the same thing as Every multiple of 4 is even.

Exercise. Translate $\forall n, F(n)\wedge E(n)$ and $\forall n, F(n)\vee E(n)$ into English. Explain why these are false statements. Assume the universe for both $E$ and $F$ is the integers.

In fact, we can always expand the universe by putting in another conditional. If we let $Z(s)$ be the sentence s is an integer and expand our universe to include all mathematical objects encountered in this course, we could translate Every multiple of 4 is even as $\forall s, \left(Z(s)\Rightarrow\left(F(s)\Rightarrow E(s)\right)\right)$.

A Note about Notation. It’s important to keep in mind that, just as for the functions you’ve encountered in calculus and before, the particular symbol we use for a variable is not relevant to the meaning of that variable. The statements

$\forall s,\left(Z(s)\Rightarrow\left(F(s)\Rightarrow E(s)\right)\right)$

and

$\forall p,\left(Z(p)\Rightarrow\left(F(p)\Rightarrow E(p)\right)\right)$

both say the same thing, namely, Every integer which is a multiple of 4 is even. The fact that we called the variable $s$ when we defined $Z$ and $n$ when we defined $E$ does not require us to always use those variables. Notice that in the English translation, no variables appear at all! We could equally well have written

$\forall \odot, \left(Z(\odot)\Rightarrow\left(F(\odot)\Rightarrow E(\odot)\right)\right)$

except that that’s a bit difficult to pronounce.

The existential quantifier and the universe

We just saw that generally speaking, a universal quantifier should be followed by a conditional. What should an existential quantifier be followed by? Consider the following true statement

There is a multiple of 7 which is even.

How can we represent this symbolically? We could take the universe to be all multiples of 7 and write $\exists n: E(n)$. But as before, that’s not very interesting. So let’s keep our universe as it should be: the integers. As before, we’ll need a test for multiple-of-7-ness: denote by $S(k)$ the sentence k is a multiple of 7.

Now think about what the statement There is a multiple of 7 which is even means. What is the relationship between multiple-of-7-ness and evenness? Just that some number happens to be both. Therefore we can translate:

$\exists n: S(n)\wedge E(n)$

Notice that because $\wedge$ is commutative, our symbolic statement is equivalent to $\exists n:E(n)\wedge S(n)$. But this is just fine, because our statement and the statement

There is an even number which is a multiple of 7

pretty clearly mean the same thing.

Let’s lock in the connection between $\exists$ and $\wedge$ with another example. This time we’ll use De Morgan’s laws and consider the statement

Every multiple of 7 is even.

which happens to be false. Therefore its negation is true. We compute that negation:

\begin{align*}\sim\left[\forall n,\left(S(n)\Rightarrow E(n)\right)\right] &\equiv \exists n:\sim\left(S(n)\Rightarrow E(n)\right)\\
&\equiv \exists n:\left(S(n)\wedge \sim E(n)\right)\end{align*}

which we could phrase in English as There is an integer which is a multiple of 7 and not even. Thus we see that the existential quantifier pairs naturally with the connective $\wedge$.

Exercise. Let $E(k)$ stand for k is even, $S(p)$ stand for p is a multiple of 7, and $Z(m)$ stand for m is an integer. Let the universe for all three sentences be the set of all mathematical objects encountered in this course. Write a symbolic translation of There is a multiple of 7 which is even using these open sentences.

Exercise. Translate $\exists n: S(n)\Rightarrow E(n)$ into English. Explain why this is a true statement. Give a useful denial. Assume the universe for both $E$ and $S$ is the integers.

quantifier order

Many interesting open sentences have more than one variable, such as:

A(r,t)=r is t years old

Since there are two variables, we are entitled to ask the question which one? twice. And we may have a different answer each time. There are eight possibilities, of which four are

  1. $\forall r,\forall t, A(r,t)$ For any person $r$, for any age $t$, that person is that age.
  2. $\forall t,\forall r, A(r,t)$ For any age $t$, for any person $r$, that person is that age.
  3. $\exists r: \exists t: A(r,t)$ There is a person $r$ who is some age $t$.
  4. $\exists t: \exists r: A(r,t)$ There is an age $t$ so that some person $r$ is that age.

A moment’s thought should make clear that statements 1 and 2 mean the same thing (in our universe, both are false), and statements 3 and 4 mean the same thing (in our universe, both are true, if woefully uninformative). This says that we can move existential quantifiers past one another, and move universal quantifiers past one another.

The other four possibilities are

  5. $\forall r,\exists t: A(r,t)$ For any person $r$, there is an age $t$ so that that person is that age.
  6. $\exists t: \forall r, A(r,t)$ There is an age $t$ so that every person $r$ is that age.
  7. $\forall t,\exists r: A(r,t)$ For any age $t$, there is a person who is that age.
  8. $\exists r: \forall t, A(r,t)$ There is a person $r$ who is every age.

Quantifier order matters.
Mixed quantifiers do not commute.

Notice that statement 5 is true (in our universe): everyone has an age. But statement 6 says that everyone is the same age, which is false in our universe. So statement 5 and statement 6 mean different things. Similarly, statement 7 is likely true in our universe, whereas statement 8 is false. The lesson is that quantifiers of different flavors do not commute!
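If you like, you can watch this failure of commutativity happen in a toy universe. In the Python sketch below, the people and their ages are invented for illustration; the built-ins all and any play the roles of $\forall$ and $\exists$:

    # A hypothetical toy universe of people and their ages.
    ages = {"alice": 31, "bob": 31, "carol": 64}
    possible_ages = range(120)

    def A(r, t):   # person r is t years old
        return ages[r] == t

    # Statement 5: forall r, exists t: A(r,t) -- everyone has some age.
    print(all(any(A(r, t) for t in possible_ages) for r in ages))   # True

    # Statement 6: exists t: forall r, A(r,t) -- everyone is the same age.
    print(any(all(A(r, t) for r in ages) for t in possible_ages))   # False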

Logic in Proofs

Standard post by ancoop on July 10, 2018
Comments are off for this post


[callout headingicon=”noicon” textalign=”textright” url=”https://www.youtube.com/watch?v=OncDlwNH9F0″ target=”true” type=”basic”]

If man is five, then the devil is six.

And if the devil is six, then god is seven.

This monkey’s gone to heaven.

Black Francis, Monkey Gone to Heaven

[/callout]

Logic as a game of pushing symbols is fun, but our purpose with it is to construct arguments. In fact, classically an advanced education was divided into two curricula. The lower division, the trivium or three-fold path, consisted of

  1. grammar
  2. logic
  3. rhetoric

These three are in fact intimately related. We’ve already seen some interaction between grammar and logic; let’s see some ways that logic interacts with rhetoric, that is, the construction of compelling arguments.

proving $P\Rightarrow Q$

Many of the statements we want to prove in mathematics are conditionals. How could we go about doing this? Actually it’s easy. Recall the truth table for $\Rightarrow$:

$P$ $Q$ $P\Rightarrow Q$
T T T 1
T F F 2
F T T 3
F F T 4

We want to establish that $P\Rightarrow Q$ is true. That is, we want to make sure that we end up with a T in the third column of the truth table above. Phrased this way, it looks like our job is pretty easy (because there are so many Ts in that column!). If $P$ is false, we’re good to go. So we may as well assume $P$ is true. Then we only have to worry about $Q$ being false. Since $Q$ is a statement, it’s either true or false. So ruling out $Q$ being false means showing $Q$ is true.

Thus: we need to assume $P$ is true and prove $Q$ is true. The shorthand for this is: assume $P$; prove $Q$. In fact many primers on proof will belabor this point, putting the above maxim in a box in some bold color and telling you that now you know how to prove conditionals! But of course it’s how one gets from the assumption of $P$ to the conclusion of $Q$ that is where the fun lies, and we haven’t really said anything about that (yet). This is called a direct proof of $P\Rightarrow Q$.
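For example, here is a direct proof of the conditional If $n$ is even, then $n^2$ is even. Assume the antecedent: $n$ is even, so $n=2k$ for some integer $k$. Then $n^2=4k^2=2(2k^2)$, which is even, so the consequent holds. That’s the whole direct-proof template in action: assume $P$; derive $Q$.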

other uses of the truth table for $P\Rightarrow Q$

modus ponens

Now let’s imagine that we know $P\Rightarrow Q$ is true; in other words, that we’re in line 1, line 3, or line 4; but we don’t know which one. If we had some additional information, like we happened to know that $P$ was true, then we’d know for sure we’re in line 1. But then we’d also know that $Q$ is true.

So we have the following rule: if we know $P$ and we know $P\Rightarrow Q$, then we know $Q$ also.

If you think about it for a moment, this is more or less the definition of the word if. But classical and medieval logicians gave it a name: modus ponens.

modus tollens

What if we happened to know that $Q$ was false? Then we’ve got to be in line 4, so we know that $P$ is also false. So we have the following rule:

If we know $Q$ is false and we know $P\Rightarrow Q$, then we know $P$ is false also.

This is known to the medieval logicians as modus tollens.

If modus tollens seems a bit odd to you, just consider the following statement:

When you see ice in my scotch, that’ll be the day I die!

How does this work? Logically, the speaker is asserting $\text{ice is in my scotch } \Rightarrow\text{ it is the day I die}$. But since they’re around to have the conversation with us, we know the consequent is false. So the intended meaning must be that the antecedent was false, i.e. there’s no ice in their scotch. In fact we use this sort of discourse all the time in ordinary conversation.
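Both rules amount to claims that certain formulas are tautologies, which can be checked by brute force over the four truth assignments. Here is a small Python sketch; the helper implies just encodes the truth table for $\Rightarrow$:

    from itertools import product

    def implies(p, q):
        return (not p) or q

    rows = list(product([True, False], repeat=2))

    # modus ponens: (P and (P => Q)) => Q is a tautology
    print(all(implies(P and implies(P, Q), Q) for P, Q in rows))           # True

    # modus tollens: ((P => Q) and not Q) => not P is a tautology
    print(all(implies(implies(P, Q) and not Q, not P) for P, Q in rows))   # True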

reductio ad absurdum

A similar mode of reasoning also relies on the fact that only a false statement can imply a false statement. The idea is to establish a claim (we’ll call it $P$) by showing that $P$ couldn’t possibly fail to be true. So we assume $\sim P$, and make valid logical steps until we get to some egregious statement $Q$ which we all agree is false. Since our reasoning was sound, this must have meant that $\sim P$ was false, i.e. that $P$ was true. For example:

Consider what would happen if lowering the income tax rate always lowered income tax revenues. If we set the income tax rate at 100%, we’d bring in more income tax revenue than at any other rate. But at a 100% income tax rate, nobody will have any reason to work, hence there will be no income tax revenue at all. Thus we should expect to get no income tax revenue at any other rate. On the other hand, the current income tax rate does generate income tax revenue. So we see that sometimes, lowering the income tax rate may increase income tax revenues.

Exercise. In the above argument, identify: $P$, the goal we set to prove; $\sim P$, the thing we assumed; $Q$, the truly egregious statement we all agree is false.

[panel heading=”what’s with these names?” headingtype=”p” type=”panel-info” state=”closed”]”modus ponens” is often translated as “the method of affirming”, but that’s a bad translation. In Latin, “pono” means “to put” or “to place”. So you might call this “the placement method”. Why? If you’ve ever watched a legal drama, the lawyers say things like “I put it to you, the jury, that…”. That’s sort of what’s going on here.

“modus tollens” is often translated as “the method of denying”, but again that’s bad Latin. “tollo” means “to snatch”. My preferred translation of this would be something like “the thief’s method”.

Actually both modus ponens and modus tollens are less mysterious than they are made to seem by their fancy names. Perhaps the best name for both would be “modus manifestissimus” — “the really obvious method”.

“reductio ad absurdum” actually gets translated correctly, as “reduction to the absurd”. The idea is to take the denial of what you want to prove, and turn it into something... absurd.[/panel]

some other useful logical facts

Below is a list of logical facts. You should establish each one in one or more of the following ways: making a truth table for the formulae involved; comparing truth tables; making algebraic manipulations of the formulae. You should also convince yourself that the statements make sense without resorting to these formal, abstract methods.

  1. If $P\wedge Q$ is true, then $P$ is true.
  2. If $P$ is true, then $P\vee Q$ is true.
  3. If $P\Rightarrow Q$ and $Q\Rightarrow R$ are both true, then $P\Rightarrow R$ is true.
  4. $P\Rightarrow (Q\vee R)$ is equivalent to $\left(P\wedge (\sim Q)\right)\Rightarrow R$ and $\left(P\wedge (\sim R)\right)\Rightarrow Q$.
  5. $P\Rightarrow (Q\wedge R)$ is equivalent to $\left(P\Rightarrow Q\right)\wedge \left(P\Rightarrow R\right)$.
  6. If $P\Rightarrow R$ is true and $Q\Rightarrow R$ is true, then $\left(P\vee Q\right)\Rightarrow R$ is true.
  7. If $\left(P\vee Q\right)\Rightarrow R$ is true, then $P\Rightarrow R$ is true and $Q\Rightarrow R$ is true.

We can restate each of the above facts as the claim that a certain logical formula is a tautology:

  1. $\left(P\wedge Q\right)\Rightarrow P$ is a tautology.
  2. $P\Rightarrow\left(P\vee Q\right)$ is a tautology.
  3. $\left[\left(P\Rightarrow Q\right)\wedge\left(Q\Rightarrow R\right)\right]\Rightarrow \left(P\Rightarrow R\right)$ is a tautology.
  4. $\left[P\Rightarrow (Q\vee R)\right]\Leftrightarrow\left[\left(P\wedge (\sim Q)\right)\Rightarrow R\right]$ is a tautology.
  5. $\left[P\Rightarrow (Q\wedge R) \right]\Leftrightarrow\left[\left(P\Rightarrow Q\right)\wedge \left(P\Rightarrow R\right)\right]$ is a tautology.
  6. $\left[\left(P\Rightarrow R\right)\wedge\left(Q\Rightarrow R\right)\right]\Rightarrow\left[\left(P\vee Q\right)\Rightarrow R\right]$ is a tautology.
  7. $\left[\left(P\vee Q\right)\Rightarrow R\right]\Rightarrow\left[\left(P\Rightarrow R\right)\wedge\left(Q\Rightarrow R\right)\right]$ is a tautology.
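If you’d rather not draw seven truth tables by hand, here is a brute-force Python sketch that checks each of the formulas above on all eight assignments to $P$, $Q$, $R$; the helpers implies and iff encode the truth tables for $\Rightarrow$ and $\Leftrightarrow$:

    from itertools import product

    def implies(p, q):
        return (not p) or q

    def iff(p, q):
        return p == q

    facts = [
        lambda P, Q, R: implies(P and Q, P),                                       # 1
        lambda P, Q, R: implies(P, P or Q),                                        # 2
        lambda P, Q, R: implies(implies(P, Q) and implies(Q, R), implies(P, R)),   # 3
        lambda P, Q, R: iff(implies(P, Q or R), implies(P and not Q, R)),          # 4
        lambda P, Q, R: iff(implies(P, Q and R),
                            implies(P, Q) and implies(P, R)),                      # 5
        lambda P, Q, R: implies(implies(P, R) and implies(Q, R),
                                implies(P or Q, R)),                               # 6
        lambda P, Q, R: implies(implies(P or Q, R),
                                implies(P, R) and implies(Q, R)),                  # 7
    ]

    for i, f in enumerate(facts, start=1):
        assert all(f(P, Q, R) for P, Q, R in product([True, False], repeat=3))
        print(f"fact {i} is a tautology")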

 

Logic: Conditionals

Standard post by ancoop on July 9, 2018
Comments are off for this post


 

So far our statements haven’t been very interesting. In fact most mathematical statements of interest are things like

If a function is differentiable, then it is continuous.

These statements are known as conditionals, because they depend on some criteria, or conditions, being satisfied.

Consider the following statement (first, convince yourself that it is a statement!):

If it rains, the laundry will be ruined.

What is this statement about? It’s not really about whether or not it rained; it’s not really about whether or not the laundry was ruined. Rather it’s about the relationship between rain and laundry.  That’s what makes conditionals so interesting.

Definition. Given two propositions $P$ and $Q$, the proposition $P\Rightarrow Q$ (read if $P$ then $Q$) is given by the truth table:

$P$ $Q$ $P\Rightarrow Q$
T T T 1
T F F 2
F T T 3
F F T 4

 

Exercise. Is $\Rightarrow$ commutative or associative?

A word of caution: this truth table causes consternation among most people when they first encounter it. In particular, lines 3 and 4 sometimes rub people the wrong way. Statements like

If $1=2$, then $3=4$.

don’t exactly feel true. But we’ll go over each line of the truth table to justify why these are the right choices. To make things easier, we’ll label the two inputs to the conditional so we can refer to them. There are several labeling schemes:

$P\Rightarrow Q$
$P$ $Q$
antecedent consequent
hypothesis thesis
assumption conclusion

 

Now let’s go through each line of the truth table for $\Rightarrow$. There are two analogies that will help us understand this: the analogy of science, and the analogy of a contract.

The Science Analogy

Let’s consider the conditional

If it rains, the laundry will be ruined.

as a scientific claim. As good laundrologists, we want to put this to the test. So we conduct an experiment. Line 1 of the truth table is the situation when our experimental variable (the rain) is present, and so is our measured variable (the ruin of the laundry). These particular experimental data support the claim. We’ll record that as a T (though good experimental scientists know that a scientific claim is only ever provisionally established).

Line 2 is what would happen if the experimental variable were there, but the expected outcome (laundry being ruined) didn’t happen. Then we would know for sure that the proposed theory is definitely false. We record an F.

What about lines 3 and 4? That’s what scientists call the control condition; the experimental variable isn’t present. So what do we record in our lab notebook? No matter what happens with the laundry, the control data don’t tell us one way or the other about the claim. In particular, they leave open the possibility that the claim is true. So we record a provisional T in our lab book, just the same as the provisional T on line 1.

The Contract Analogy

Now let’s consider a conditional like

If you do all the work in this course, you’ll get an A.

Treat this as a contract or an agreement between us. Line 1 of the truth table is: you did the work, and you got the A. No problem: the agreement was carried out just fine. Line 2 of the truth table is: you did the work, but you didn’t get the A. That’s a clear violation of the agreement; in other words the agreement didn’t hold. Hence the truth value of F.

What about lines 3 and 4? That’s when you didn’t do all of the work. I could still give you an A (if I’m feeling kind) or not give you an A, but either way — you couldn’t really say that I’d violated the agreement. The agreement is still in force the whole time. That’s why we assign the truth value T to these lines.

 

In both analogies, a really solid way to think about this is what-if-not thinking: what would it mean for the conditional to be false? How could you know for sure that If it rains, the laundry will be ruined is false? Only if it rains but the laundry isn’t ruined. How could you know for sure that If you do all the work in this course, you’ll get an A is false? Only if you do all the work in this course, but nevertheless you don’t get the A. Let’s record that as a proposition:

Proposition. $\sim (P\Rightarrow Q)$ is equivalent to $P\wedge (\sim Q)$

We also have the equivalence:

Proposition. $P\Rightarrow Q$ is equivalent to $(\sim P)\vee Q$

Exercise. Prove this proposition in two ways: first, from the proposition above it using one of De Morgan’s Laws; and second, by comparing truth tables.
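If you want a machine to double-check your pencil-and-paper work on this exercise, both equivalences can be confirmed by comparing truth tables row by row; a short Python sketch:

    from itertools import product

    def implies(p, q):
        return (not p) or q

    for P, Q in product([True, False], repeat=2):
        assert (not implies(P, Q)) == (P and not Q)   # ~(P => Q)  ==  P and (~Q)
        assert implies(P, Q) == ((not P) or Q)        # (P => Q)   ==  (~P) or Q
    print("both equivalences hold on all four rows")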

Notice that I used the words but and nevertheless to translate the logical symbol $\wedge$. Normally $\wedge$ is translated and, but the word but can be very useful when you want to deny a conditional. When it comes to truth values, the following words all mean the same thing:

and, but, yet, nevertheless

When it comes to $\Rightarrow$, there are even more possibilities for reasonable English translations; so many in fact that it’s impossible to list them all. Here are a few that you will likely encounter. The following all mean the same thing, namely $\text{the King of France is bald } \Rightarrow \text{ roses are red}$:

If the King of France is bald, then roses are red.
Roses are red if the King of France is bald.
The baldness of the King of France guarantees that roses are red.
The baldness of the King of France suffices for roses to be red.
For roses to be red, it is sufficient that the King of France be bald.
For the King of France to be bald, it is necessary that roses be red.
Roses’ redness is necessary to the King of France’s baldness.
Should the King of France be bald, roses will be red.
Supposing the King of France is bald, roses are red.
The King of France is bald only if roses are red.

I’ve departed from my normal typographical convention and put the keywords in italics, the antecedent in blue bold, and the consequent in green underline.

Some things to notice about these translations, in no particular order:

  • Sometimes the antecedent occurs first in the sentence, sometimes the consequent does.
  • Sometimes the antecedent occurs first in time, sometimes the consequent does.
  • The grammatical form of the antecedent and the consequent have to be adapted to read nicely.
  • The conditional keyword is sometimes split into more than one word.
  • If is different from only if.

Exercise. Create English translations of the conditional above, using the conditional keywords when and whenever.

How to keep it all straight? Use what-if-not-thinking. That is, ask yourself how you’d know the sentence was false. For example: what would it mean for Roses’ redness is necessary to the King of France’s baldness to be false? If the redness isn’t truly necessary, that means there’s a circumstance in which the King is bald but the roses aren’t red.

biconditionals

There is also a kind of conditional called the biconditional, which we denote $P\Leftrightarrow Q$:

$P$ $Q$ $P\Leftrightarrow Q$
T T T 1
T F F 2
F T F 3
F F T 4

 

Proposition. $P\Leftrightarrow Q$ is equivalent to $(P\Rightarrow Q)\wedge (Q\Rightarrow P)$

We therefore write and read $P\Leftrightarrow Q$ as $P$ if and only if $Q$, or sometimes as $P$ iff $Q$ or $P$ exactly when $Q$.
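As with the earlier propositions, this one can be confirmed by comparing truth tables on every row; a short Python sketch:

    from itertools import product

    def implies(p, q):
        return (not p) or q

    for P, Q in product([True, False], repeat=2):
        assert (P == Q) == (implies(P, Q) and implies(Q, P))
    print("P <=> Q matches (P => Q) and (Q => P) on every row")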

Exercise. Is $\Leftrightarrow$ commutative or associative?
