4  Axiomatic Derivations

We now present an axiom system for propositional logic, which will consist of an infinite set of axioms and a single rule of inference. A derivation of a well-formed formula from a set of formulas will consist of a finite sequence of formulas each of whose members will either be an axiom or a premise or the output of an application of the rule of inference to prior members of the sequence.

There is one ingredient we will use for the specification of the infinite set of axioms. If \(\varphi\) is a propositional formula whose propositional variables are \(p_1, ..., p_n\) and \(\psi_1\), …, \(\psi_n\) are propositional formulas, we write \[ \varphi[\psi_1/p_1, ..., \psi_n/p_n] \]

for the well-formed formula that results from \(\varphi\) when each propositional variable \(p_i\) is substituted for the well-formed formula \(\psi_i\). We will call \(\varphi[\psi_1/p_1, ..., \psi_n/p_n]\) a substitution instance of \(\varphi\).

Uniform Substitution

Given a formula \(\varphi\) whose propositional variables are among \(p_1, \dots, p_n\) and formulas \(\psi_1, \dots, \psi_n\), define \(\varphi[\psi_1/p_1, ..., \psi_n/p_n]\) recursively:

If \(\varphi\) is \(p_i\), \(\varphi[\psi_1/p_1, ..., \psi_n/p_n]\) is \(\psi_i\).

If \(\varphi\) is \(\neg \chi\), \(\varphi[\psi_1/p_1, ..., \psi_n/p_m]\) is \(\neg \chi[\psi_1/p_1, ..., \psi_n/p_m]\).

If \(\varphi\) is \((\chi \to \rho)\), then \(\varphi[\psi_1/p_1, ..., \psi_n/p_n]\) is \((\chi[\psi_1/p_1, ..., \psi_n/p_n]\to \rho[\psi_1/p_1, ..., \psi_n/p_n])\).

Example

\(p[\neg q/p] = \neg q\)

\(\neg (p \to \neg q)[\neg p/p, p \to p/q] = \neg(\neg p \to \neg (p \to p))\)

\((p \to q)[\neg q/p, \neg p/q] = \neg q \to \neg p\)

Exercise

Unpack each of the following:

\(p[(p \to (q \to p))/p]\)

\((p \to (q \to p))[\neg p/p, \neg p/q]\)

\(\neg \neg (p \to \neg q)[(p \to q)/p, (q \to p)/q]\)

We are now in a position to define the set of axioms of the system:

Axioms

A well-formed formula \(\varphi\) is an axiom if, and only if, it is a substitution instance of one of the well-formed formulas below: \[ \begin{array}{lll} \textsf{A1} & & p \to (q \to p)\\ \textsf{A2} & & (p \to (q \to r)) \to ((p \to q) \to (p \to r))\\ \textsf{A3} & & (\neg p \to \neg q) \to ((\neg p \to q) \to p) \end{array} \]

We will use a single rule of inference, which we call Modus Ponens (\(\textsf{MP}\)). The rule in question will enable us to move from two formulas of the form \(\varphi\) and \((\varphi \to \psi)\) to the formula \(\psi\).

Rule of Inference

\[ \begin{array}{lll} \textsf{MP} & & \varphi, \ (\varphi \to \psi)/\psi\\ \end{array} \]

We now explain what is for a formula to be derivable from a set of formulas:

Derivability

A well-formed formula \(\varphi\) is derivable from a set of well-formed formulas \(\Gamma\), written \(\Gamma \vdash \varphi\), if, and only if, there is a finite sequence of well-formed formulas \((\chi_1, \dots , \chi_n)\) such that

\(\chi_n = \varphi\)

for each \(m \leq n\), either

\(\chi_m\) is an axiom, or

\(\chi_m \in \Gamma\), or

\(\chi_k = (\chi_l \to \chi_m)\), for some \(k, l <m\).

Such a finite sequence \(( \chi_1, ..., \chi_n )\) is called a derivation (or a proof) of \(\varphi\) from \(\Gamma\).

So, a formula appears in a derivation if, and only if, it is either an axiom or a premise or the output of an application of modus ponens to earlier members of the sequence.

Theorem

A well-formed formula \(\varphi\) is a theorem, written \(\vdash \varphi\), if, and only if, \(\emptyset \vdash \varphi\).

We write \(\nvdash \varphi\) to indicate that \(\varphi\) is not a theorem.

Here is our first example of a derivation:

Example

\(\vdash p \to p\) \[ \begin{array}{llll} 1 & & p \to ((p \to p) \to p) & \textsf{A1}[p/p, (p \to p)/q]\\ 2 & & (p \to ((p \to p) \to p))\to & \\ & & ((p \to (p \to p)) \to (p \to p)) & \textsf{A2}[p/p, (p \to p)/q, p/r]\\ 3 & & (p \to (p \to p)) \to (p \to p) & \textsf{MP} \ 1, 2\\ 4 & & p \to (p \to p) & \textsf{A1}[p/p, p/q]\\ 5 & & p \to p & \textsf{MP} \ 3, 4\\ \end{array} \]

Note

Notice that the derivation is far from intuitive and the system requires one to proceed through trial and error until we find appropriate substitution instances in order to derive \(p \to p\) from available axioms. On the other hand, it is simple to generalize the argument above into a schematic derivation of \(\varphi \to \varphi\) from no premises: \[ \vdash \varphi \to \varphi \]

The axiomatic system is not particularly user-centered and certainly not preferable to natural deduction systems in which there are much simpler proofs of \(p \to p\). Curiously, the axiom system is able to prove every logical truth, and, moreover, it allows one to prove a formula from a set of formulas if, and only if, the former is a logical consequence of the latter.

But before we establish these results, we set out to develop a library of subroutines, which will facilitate the discussion of the question of whether a given formula is derivable from a set of formulas.

4.1 The Deduction Theorem

We will rely on a very general result. First, some notational conventions.

Notation

We write \(\Gamma, \varphi\) to abbreviate: \(\Gamma \cup \{\varphi\}\). And we will similarly write \(\varphi \vdash \psi\) to abbreviate: \(\{\varphi\} \vdash \psi\).

The Deduction Theorem

Given a set of formulas \(\Gamma\) and formulas \(\varphi\) and \(\psi\), \(\Gamma, \varphi \vdash \psi\) if, and only if, \(\Gamma \vdash \varphi \to \psi\).

One direction is simple:

Theorem 4.1 Given a set of formulas \(\Gamma\) and formulas \(\varphi\) and \(\psi\), if \(\Gamma \vdash \varphi \to \psi\), then \(\Gamma, \varphi \vdash \psi\)

Proof. Suppose \(\varphi \to \psi\) is derivable from \(\Gamma\). To obtain a derivation of \(\psi\) from \(\Gamma, \varphi\), we may extend a derivation of \(\varphi \to \psi\) from \(\Gamma\) with two additional steps: \(\varphi\), which being a member of \(\Gamma, \varphi\), is available as a premise, and \(\psi\) which is the output of \(\textsf{MP}\) when applied to \(\varphi \to \psi\) and \(\varphi\).

\(\blacksquare\)

The other direction is the deduction theorem, which requires a more involved justification:

Theorem

Given a set of formulas \(\Gamma\) and formulas \(\varphi\) and \(\psi\), if \(\Gamma, \varphi \vdash \psi\), then \(\Gamma \vdash \varphi \to \psi\).

Choose a set of formulas \(\Gamma\) and a formula \(\varphi\). We will use a complete induction (without a base case) on positive integers to prove that for every positive integer \(n\):

  • for all sets of formulas \(\Gamma\) and for all formulas \(\chi_1 \dots \chi_n\) and \(\varphi\) and \(\psi\), if \((\chi_1, ..., \chi_n)\) is a derivation of \(\psi\) from \(\Gamma, \varphi\) of length \(n\), then \(\Gamma \vdash \varphi \to \psi\).

That is, we argue that

  • if that is true for every positive integer \(m < n\), then it is true for \(n\) as well,

So, we begin with the assumption:

  • for every positive integer \(m < n\), for all sets of formulas \(\Gamma\) and for all formulas \(\chi_1 \dots \chi_m\) and \(\varphi\) and \(\psi\), if \((\chi_1, ..., \chi_m )\) is a derivation of a formula \(\psi\) from \(\Gamma, \varphi\) of length \(m\), then \(\Gamma \vdash \varphi \to \psi\)

We want to prove that

  • for all sets of formulas \(\Gamma\) and for all formulas \(\chi_1 \dots \chi_n\) and \(\varphi\) and \(\psi\), if \((\chi_1, ..., \chi_n)\) is a derivation of a formula \(\psi\) from \(\Gamma, \varphi\) of length \(n\), then \(\Gamma \vdash \varphi \to \psi\).

Consider a derivation of \(\psi\) from \(\Gamma, \varphi\) of the form: \[ ( \chi_1, ..., \chi_{n}) \] That means that \(\chi_{n}= \psi\).

There are four cases to consider:

Case 1. \(\psi\) is a logical axiom.

Here is a derivation of \(\varphi \to \psi\) from \(\Gamma\): \[ \begin{array}{llll} 1 & & \psi \to (\varphi \to \psi) & \textsf{A1}[\psi/p, \varphi/q]\\ 2 & & \psi & \\ 3 & & \varphi \to \psi & \textsf{MP} \ 1, 2\\ \end{array} \]

Case 2. \(\psi \in \Gamma\)

Here is a similar derivation of \(\varphi \to \psi\) from \(\Gamma\): \[ \begin{array}{llll} 1 & & \psi \to (\varphi \to \psi) & \textsf{A1}[\psi/p, \varphi/q]\\ 2 & & \psi & \\ 3 & & \varphi \to \psi & \textsf{MP} \ 1, 2\\ \end{array} \] Case 3. \(\psi = \varphi\)

Since \(\vdash \varphi \to \varphi\), a derivation of \(\varphi \to \varphi\) is a derivation of \(\varphi \to \psi\) from \(\Gamma\). Therefore \(\Gamma \vdash \varphi \to \psi\).

Case 4. \(\chi_k = (\chi_l \to \psi)\) for some \(k, l < n\). That is, \(\chi_n = \psi\) is the output of an application of \(\textsf{MP}\) to earlier lines \(\chi_l\) and \(\chi_k\).

That means that \((\chi_l \to \psi)\) and \(\chi_l\) are each in the sequence, which means that there are derivations \((\chi_1 \dots, \chi_l \to \psi)\) and \((\chi_1, \dots, \chi_l)\) of each formula \(\chi_l \to \psi\) and \(\chi_l\) from \(\Gamma, \varphi\) of length \(< n\).

Since both derivations have length strictly less than \(n\), by inductive hypothesis, \(\Gamma \vdash \varphi \to (\chi_l \to \psi)\) and \(\Gamma \vdash \varphi \to \chi_l\).

To build a derivation of \(\varphi \to \psi\) from \(\Gamma\), we may concatenate derivations of \(\varphi \to (\chi_1 \to \psi)\) and \(\varphi \to \chi_l\), respectively, from \(\Gamma\): \[ \begin{array}{llll} & & ... & \\ & & \vdots & \\ k & & \varphi \to (\chi_l \to \psi) & \Gamma \vdash \varphi \to (\chi_l \to \psi) \\ & & \vdots & \\ m & & \varphi \to \chi_l & \Gamma \vdash \varphi \to \chi_l\\ & & \vdots & \\ n& & ((\varphi \to (\chi_1 \to \psi)) \to & \textsf{A2}[\varphi/p,\chi_l/q,\psi/r]\\ & & ((\varphi \to \chi_l)\to (\varphi \to \psi)) & \\ n+1& & (\varphi \to \chi_l)\to (\varphi \to \psi) & \textsf{MP} \ k, n\\ n+2 & & \varphi \to \psi & \textsf{MP} \ m, n+1\\ \end{array} \] So, given \(\Gamma\) and \(\varphi\), if \(( \chi_1, ..., \chi_n)\) is a derivation of a formula \(\psi\) from \(\Gamma, \varphi\), then \(\Gamma \vdash \varphi \to \psi\).

\(\blacksquare\)

The deduction theorem encodes a common method of proof in mathematics. To establish a conditional \(\varphi \to \psi\) in a mathematical context, you may simply adjoin \(\varphi\) to your assumptions \(\Gamma\) and set out to derive \(\psi\) from them. That entitles you to conclude that \(\varphi \to \psi\) is a deductive consequence of the original assumptions.

We will now use of the Deduction Theorem to prove a variety of basic facts, which will facilitate the use of the axiom system.

Proposition 4.1 \(\vdash (\varphi \to \psi) \to ((\psi \to \chi)\to (\varphi \to \chi))\)

Given the Deduction Theorem, it suffices to note: \[ \{\varphi \to \psi \} \vdash (\psi \to \chi) \to (\varphi \to \chi). \] But given the Deduction Theorem again, it is enough to show: \[ \{ \varphi \to \psi, \psi \to \chi\} \vdash \varphi \to \chi, \] and, in particular, \[ \{\varphi \to \psi, \psi \to \chi, \varphi\} \vdash \chi. \] But it is simple to provide a derivation of \(\chi\) from \(\{\varphi \to \psi, \psi \to \chi, \varphi\}\): \[ \begin{array}{llll} 1 & & \varphi \to \psi & \\ 2 & & \varphi & \\ 3 & & \psi & \textsf{MP} \ 1, 2 \\ 4 & & \psi \to \chi & \\ 5 & & \chi & \textsf{MP} \ 3, 4\\ \end{array} \]

Proposition 4.2 \(\vdash (\neg \psi \to \neg \varphi) \to (\varphi \to \psi)\)

Given the Deduction Theorem, it suffices to note: \[ \{\neg \psi \to \neg \varphi\}\vdash \varphi \to \psi, \] which would in turn follow from the observation: \[ \{\neg \psi \to \neg \varphi, \varphi\} \vdash \psi. \] We now derive \(\psi\) from \(\{\neg \psi \to \neg \varphi, \varphi\}\): \[ \begin{array}{llll} 1 & & \neg \psi \to \neg \varphi & \\ 2 & & \varphi \to (\neg \psi \to \varphi) & \textsf{A1}[\varphi/p,\neg \psi/q]\\ 3 & & \varphi & \\ 4 & & \neg \psi \to \varphi & \textsf{MP} \ 2, 3\\ 5 & & (\neg \psi \to \neg \varphi) \to ((\neg \psi \to \varphi) \to \psi) & \textsf{A3}[\psi/p, \varphi/q]\\ 6 & & (\neg \psi \to \varphi) \to \psi & \textsf{MP} \ 1, 5\\ 7 & & \psi & \textsf{MP} \ 4, 6 \end{array} \]

Proposition 4.3 \(\vdash \neg \neg \varphi \to \varphi\)

Given the Deduction Theorem, it suffices to note: \[ \neg \neg \varphi \vdash \varphi \] Here is a derivation of \(\varphi\) from \(\neg \neg \varphi\): \[ \begin{array}{llll} 1 & & \neg \neg \varphi \to (\neg \varphi \to \neg \neg \varphi) & \textsf{A1}[\neg\neg\varphi/p, \neg \varphi/q]\\ 2 & & \neg \neg \varphi & \\ 3 & & \neg \varphi \to \neg \neg \varphi & \textsf{MP} \ 1, 2 \\ 4 & & (\neg \varphi \to \neg \neg \varphi) \to ((\neg \varphi \to \neg \varphi) \to \varphi) & \textsf{A3}[\varphi/p, \neg \varphi/q]\\ 5 & & (\neg \varphi \to \neg \varphi) \to \varphi & \textsf{MP} \ 3, 4\\ 6 & & \neg \varphi \to \neg \varphi & \vdash \neg \varphi \to \neg \varphi\\ 7 & & \varphi & \textsf{MP} \ 5, 6 \end{array} \]

Here is another helpfpul observation:

Proposition 4.4 \(\vdash \varphi \to \neg \neg \varphi\)

We exploit the last two observations to produce a simple derivation of \(\varphi \to \neg \neg \varphi\): \[ \begin{array}{llll} 1 & & \neg \neg \neg \varphi \to \neg \varphi & \textsf{Prop} \ 4.3\\ 2 & & (\neg \neg \neg \varphi \to \neg \varphi) \to (\varphi \to \neg \neg \varphi) & \textsf{Prop} \ 4.2\\ 3 & & \varphi \to \neg \neg \varphi & \textsf{MP} \ 1, 2 \end{array} \]

Proposition 4.5 \(\vdash (\varphi \to \psi) \to (\neg \psi \to \neg \varphi)\)

Given the Deduction Theorem, it suffices to establish: \[ \{\varphi \to \psi \} \vdash \neg \psi \to \neg \varphi, \]

Here is a derivation of \(\neg \psi \to \neg \varphi\) from \(\{\varphi \to \psi\}\): \[ \begin{array}{llll} 1 & & \varphi \to \psi & \\ 2 & & \psi \to \neg \neg \psi & \textsf{Prop} \ 4.4\\ 3 & & \neg \neg \varphi \to \varphi & \textsf{Prop} \ 4.3\\ 4 & & (\neg \neg \varphi \to \varphi) \to ((\varphi \to \psi) \to (\neg \neg \varphi \to \psi)) & \textsf{Prop} \ 4.1 \\ 5 & & (\varphi \to \psi) \to (\neg \neg \varphi \to \psi) & \textsf{MP} \ 3, 4 \\ 6 & & \neg \neg \varphi \to \psi & \textsf{MP} \ 1, 4 \\ 7 & & (\neg \neg \varphi \to \psi) \to ((\psi \to \neg \neg \psi) \to (\neg \neg \varphi \to \neg \neg \psi)) & \textsf{Prop} \ 4.1 \\ 8 & & (\psi \to \neg \neg \psi) \to (\neg \neg \varphi \to \neg \neg \psi) & \textsf{MP} \ 6, 7\\ 9 & & \neg \neg \varphi \to \neg \neg \psi & \textsf{MP} \ 2, 8\\ 10 & & (\neg \neg \varphi \to \neg \neg \psi) \to (\neg \psi \to \neg \varphi) & \textsf{Prop} \ 4.2\\ 11 & & \neg \psi \to \neg \varphi & \textsf{MP} \ 9, 10 \end{array} \]

Exercise

Use the Deduction Theorem in order to justify: \[ \vdash (\varphi \to \neg \varphi) \to \neg \varphi \]

Hint. By the Deduction Theorem, we need only show \(\{\varphi \to \neg \varphi\}\vdash \neg \varphi\). One way to do that exploits a substitution instance of \(\textsf{A3}\) and \(\textsf{Prop}\) 3.3 and 3.5.

Proposition 4.6 \(\vdash \neg \varphi \to (\varphi \to \psi)\)

Given the Deduction Theorem, we need only verify:

\[ \{\neg \varphi \} \vdash \varphi \to \psi \]

By another application of the Deduction Theorem, it is sufficient to check:
\[ \{\neg \varphi, \varphi \}\vdash \psi \]

Here is an outline of a derivation of \(\psi\) from \(\{\neg \varphi, \varphi\}\): \[ \begin{array}{llll} 1 & & \neg \varphi \to (\neg \psi \to \neg \varphi) & \textsf{A1}[\neg \varphi/p, \neg \psi/q] \\ 2 & & \neg \varphi & \\ 3 & & (\neg \psi \to \neg \varphi) & \textsf{MP} \ 1, 2\\ 4 & & (\neg \psi \to \neg \varphi) \to (\varphi \to \psi) & \textsf{Prop} \ 4.2\\ 5 & & \varphi \to \psi & \textsf{MP} \ 3, 4\\ 6 & & \varphi & \\ 7 & & \psi & \textsf{MP} \ 5, 6\\ \end{array} \]

Exercise

Explain how to adapt some of the considerations given above in order to justify:

\[ \vdash \varphi \to (\neg \varphi \to \psi) \]