The goal of this post is to outline a loophole of sorts in Gödel’s Second Incompleteness theorem (GSI) (and, by corollary, in Löb’s theorem) and give a somewhat sketchy description of how it might be exploited.

I’ll begin by describing the loophole. Essentially, it’s merely the observation that Second Incompleteness is a syntactic property rather than a semantic one. Assume that \text{Bew} is a provability predicate which is subject to GSI. GSI does not prohibit the existence of a predicate \text{Bew}_2, not subject to GSI, such that [[\text{Bew}]] = [[\text{Bew}_2]], where [[P]] is the set corresponding to the predicate P in the standard semantics. That is to say, it’s possible for two predicates to be the same, semantically, while one is subject to GSI and the other is not. Explicitly constructing an example is not hard; take the following definition, for example:

\text{Bew}_2(X) := \text{Bew}(X) ∧ X ≠ ⌜⊥⌝

Meta-theoretically, we know that ⊥ is not provable, and so we can prove in some ambient theory that \text{Bew}_2 and \text{Bew} are, really, the same predicate. Internal to PA, or some similar system, however, GSI holds for \text{Bew} but fails rather trivially for \text{Bew}_2. \text{Bew}_2 is called a relativized definition of \text{Bew}.

This demonstrates the loophole, but this particular "exploitation" of it is obviously not very satisfying. Ideally we’d not have to rely on meta-theoretic reasoning to justify the alternative definition. Of course, there are limitations to what we can do. If GSI holds for one definition and not the other, then we cannot prove a material equivalence between the two without causing an inconsistency. Instead, we need a weaker notion of internal equivalence. Specifically, we can use the following conditions:

1. If P(X) → Q(X), then P is at most Q.
2. If \Phi[Q] is a definition for Q, and \Phi[P] holds, then P is at least Q.

If both 1. and 2. hold, then we’ve internally demonstrated that P and Q are semantically equivalent. Both conditions together are still syntactically weaker than material equivalence, so we can’t use them for the same purposes in general. They are, however, sufficient to treat P as if it were Q for many purposes.

To demonstrate these conditions, I want to switch to a weaker system for a bit. Robinson’s Q is basically just PA without induction. This prevents us from directly proving almost any interesting theorem. We can’t, for example, directly prove that addition is associative. We can, however, do the following. Firstly, we can observe the following definition of the natural numbers:

\mathbb{N}(0) ∧ (∀x. \mathbb{N}(x) → \mathbb{N}(s(x)))

Next, consider the following definition:

\mathbb{N}_a(X) := \mathbb{N}(X) ∧ ∀y z. y + (z + X) = (y + z) + X

We can prove that:
1. \mathbb{N}_a(X) → \mathbb{N}(X)
2. \mathbb{N}_a(0) ∧ (∀x. \mathbb{N}_a(x) → \mathbb{N}_a(s(x)))

1. is trivial. 2. relies on the definition of addition used in Robinson’s Q:
- X + 0 = X
- X + s(Y) = s(X + Y)

but is ultimately fairly straightforward. 1. and 2. are special cases of our previous conditions for internally verifying the semantic equivalence between two predicates, so we’ve essentially just demonstrated that \mathbb{N} and \mathbb{N}_a are, really, the same predicate. We can also prove, rather easily, that

∀x y z. \mathbb{N}_a(z) → x + (y + z) = (x + y) + z

giving us, not a proof of associativity exactly, but a relativized proof of it. This particular technique was heavily emphasized in Predicative Arithmetic by Edward Nelson, which is all about doing mathematics in Robinson’s Q; it’s where I learned this technique. That book goes on to prove that we can relativize the definition of \mathbb{N} so that the totality of, and all the standard algebraic properties of, addition and multiplication are relativizable. Near the end, it proves that one can’t relativize the definition of \mathbb{N} so that the totality of exponentiation relativizes. However, there’s a different, rather enlightening, construction not considered in that book which at least gives us closure, though not full totality. Assume \mathbb{N}_m is a predicate defining numbers which are closed under multiplication. In other words, we know that:
- \mathbb{N}_m(0)
- ∀x. \mathbb{N}_m(x) → \mathbb{N}_m(s(x))
- ∀x y. \mathbb{N}_m(x) ∧ \mathbb{N}_m(y) → \mathbb{N}_m(x \times y)

Now observe that exponentiation, as a ternary relation, can be defined with:

(∀x. {\uparrow}(x, 0, 1)) ∧ (∀x y z. {\uparrow}(x, y, z) → {\uparrow}(x, s(y), x \times z))

Define a new ternary relation:

{\uparrow}_2(x, y, z) := {\uparrow}(x, y, z) ∧ (\mathbb{N}_m(x) → \mathbb{N}_m(z))

We can observe that:
- ∀x y z. {\uparrow}_2(x, y, z) → {\uparrow}(x, y, z)
- (∀x. {\uparrow}_2(x, 0, 1)) ∧ (∀x y z. {\uparrow}_2(x, y, z) → {\uparrow}_2(x, s(y), x \times z))
- ∀x y z. \mathbb{N}_m(x) ∧ \mathbb{N}_m(y) ∧ {\uparrow}_2(x, y, z) → \mathbb{N}_m(z)

All of these are fairly easy to prove. This demonstrates that {\uparrow} and {\uparrow}_2 are, really, the same predicate, and further that \mathbb{N}_m is closed under exponentiation using the {\uparrow}_2 definition.

One more example before going back to PA. Fix a binary predicate which I’ll suggestively call \in. I will not make any additional assumptions about \in. Next, consider the following predicate:

\mathbb{N}_i(X) := \mathbb{N}(X) ∧ ∀p. {\in}(p, 0) → (∀n. {\in}(p, n) → {\in}(p, s(n))) → {\in}(p, X)

We can easily prove:
- \mathbb{N}_i(X) → \mathbb{N}(X)
- \mathbb{N}_i(0) ∧ (∀x. \mathbb{N}_i(x) → \mathbb{N}_i(s(x)))

demonstrating that \mathbb{N}_i, which effectively relativizes induction over \in in Robinson’s Q, is semantically the same predicate as \mathbb{N}. Furthermore, we can easily prove that:

∀p. {\in}(p, 0) → (∀n. {\in}(p, n) → {\in}(p, s(n))) → ∀n. \mathbb{N}_i(n) → {\in}(p, n)

If we were working in a system with proper second-order quantification, I would say that induction, period, relativizes, but in a purely first-order system we are stuck with a fixed, but arbitrary, binary predicate.

Back to PA; you can hopefully see where we’re going. I want to sketch out a proof that transfinite induction up to ε_0 relativizes in PA, and further sketch how to get a relativizing proof predicate not subject to GSI. I’ve not formalized this, so there may be some technical errors ahead, but I wanted to get some feedback before dedicating the effort to perfecting what’s beyond this point.

Firstly, we can observe the following definition of Cantor normal form ordinals and their ordering relation:

ε_0(0) ∧ (∀x y. ε_0(x) ∧ ε_0(y) ∧ (\text{fst}(y) < x ∨ \text{fst}(y) = x) → ε_0(ω^x + y))

where

\text{fst}(0) = 0 ∧ ∀x y. \text{fst}(ω^x + y) = x

and:
- ∀x y. 0 < ω^x + y
- ∀x y z w. x < z → ω^x + y < ω^z + w
- ∀x y z w. x = z → y < w → ω^x + y < ω^z + w

We can’t directly relativize transfinite induction. Instead, we relativize structural induction over Cantor normal form terms:

ε_0'(X) := ε_0(X) ∧ ∀p. {\in}(p, 0) → (∀x y. {\in}(p, x) → {\in}(p, y) → \text{fst}(y) ≤ x → {\in}(p, ω^{x} + y)) → {\in}(p, X)

I don’t think we need to relativize the ordering relation, but I’m not certain, so I’ll leave it as a possibility here. We can easily prove:
- ε_0'(x) → ε_0(x)
- ε_0'(0) ∧ (∀x y. ε_0'(x) ∧ ε_0'(y) ∧ (\text{fst}(y) < x ∨ \text{fst}(y) = x) → ε_0'(ω^x + y))

demonstrating that ε_0 and ε_0' are, really, the same predicate. We can also easily prove structural induction:

∀p. {\in}(p, 0) → (∀x y. {\in}(p, x) → {\in}(p, y) → \text{fst}(y) ≤ x → {\in}(p, ω^x + y)) → ∀x. ε_0'(x) → {\in}(p, x)

and from this we can prove transfinite induction:

∀p. {\in}(p, 0) → (∀o. (∀e. e < o → {\in}(p, e)) → {\in}(p, o)) → ∀x. ε_0'(x) → {\in}(p, x)

This derivation is not so easy, but I don’t think it’s overly hard either. A derivation of transfinite induction from structural induction can be found in lines 76-150 of this Agda file, though that derivation relies on a somewhat sophisticated mutual induction-recursion gadget that I’ve never seen formalized in PA; I don’t think it will be a problem, though. There are also other derivations out there, but this is the simplest presentation I know of.

Once we have this, we should be able to internally reproduce a proof by transfinite induction that PA is consistent. Essentially, we’d have a function O which assigns each proof term an element of ε_0. Then, assuming we already have a proof predicate \text{Pr}(x, y), where y is an encoding of a proof of x, we’d relativize \text{Pr} with:

\text{Pr}'(x, y) := \text{Pr}(x, y) ∧ ε_0'(O(y))

Proving relativization for this would be quite involved, mostly because of how complicated the proof predicate is. I’ve tried sketching out a presentation of a proof predicate which is as simple as possible, coming up with the following, based on Krivine realizability:
- \text{PrC}(∆, Γ, \text{var}(a), A) ← (a, A) \in Γ
- \text{PrC}(∆, Γ, λa.M, A → B) ← \text{PrC}(∆, (a, A) :: Γ, M, B)
- \text{PrC}(∆, Γ, M@N, B) ← ∃A. \text{PrC}(∆, Γ, N, A) ∧ \text{PrC}(∆, Γ, M, A → B)
- \text{PrC}(∆, Γ, \star, ⊤)
- \text{PrC}(∆, Γ, (M, N), A ∧ B) ← \text{PrC}(∆, Γ, M, A) ∧ \text{PrC}(∆, Γ, N, B)
- \text{PrC}(∆, Γ, \text{proj}_1(P), A) ← ∃B. \text{PrC}(∆, Γ, P, A ∧ B)
- \text{PrC}(∆, Γ, \text{proj}_2(P), B) ← ∃A. \text{PrC}(∆, Γ, P, A ∧ B)
- \text{PrC}(∆, Γ, [α]M, ⊥) ← ∃A. (α, A) ∈ ∆ ∧ \text{PrC}(∆, Γ, M, A)
- \text{PrC}(∆, Γ, μα.M, A) ← \text{PrC}((α, A) :: ∆, Γ, M, ⊥)
- \text{PrC}(∆, Γ, Λz.M, ∀x.A) ← \text{fresh } y. \text{PrC}(∆, Γ, M[\text{var}(y)/z], A[\text{var}(y)/x])
- \text{PrC}(∆, Γ, M\{t\}, A[t/x]) ← \text{PrC}(∆, Γ, M, ∀x.A)
- \text{PrC}(∆, Γ, \text{refl}, a = a)
- \text{PrC}(∆, Γ, \text{sym}(f), a = b) ← \text{PrC}(∆, Γ, f, b = a)
- \text{PrC}(∆, Γ, \text{trans}(f, g), a = c) ← ∃b. \text{PrC}(∆, Γ, f, a = b) ∧ \text{PrC}(∆, Γ, g, b = c)
- \text{PrC}(∆, Γ, \text{sapp}(p), s(a) = s(b)) ← \text{PrC}(∆, Γ, p, a = b)
- \text{PrC}(∆, Γ, \text{sinj}(p), a = b) ← \text{PrC}(∆, Γ, p, s(a) = s(b))
- \text{PrC}(∆, Γ, \text{con}(p), ⊥) ← ∃n. \text{PrC}(∆, Γ, p, 0 = s(n))
- \text{PrC}(∆, Γ, \text{add0}, x + 0 = x)
- \text{PrC}(∆, Γ, \text{adds}, x + s(y) = s(x + y))
- \text{PrC}(∆, Γ, \text{mul0}, x \times 0 = 0)
- \text{PrC}(∆, Γ, \text{muls}, x \times s(y) = x + (x \times y))
- \text{PrC}(∆, Γ, \text{ind}(pz, ps), ∀x.P) ← \text{PrC}(∆, Γ, pz, P[0/x]) ∧ \text{PrC}(∆, Γ, ps, ∀n. P[n/x] → P[s(n)/x])

Note that quotations are left implicit for the sake of succinctness. I’ve also not checked that this is completely correct, but it’s 95% of the way there at least, which is enough for demonstration purposes. As you can see, it’s quite a beast.

Thinking back to the relativized definition from the beginning, attempting something similar with this predicate would result in:

\text{PrC}'(∆, Γ, a, A) := \text{PrC}(∆, Γ, a, A) ∧ (∆ = ∅ ∧ Γ = ∅ → A ≠ ⊥)

Why can’t we prove that this relativizes? Most clauses actually are provable, since either the conclusion is syntactically not ⊥ or the contexts are assumed to be nonempty. One of the exceptions is

\text{PrC}(∆, Γ, \text{con}(p), ⊥) ← ∃n. \text{PrC}(∆, Γ, p, 0 = s(n))

To prove this clause relativizes, we’d need to prove that 0 = s(n) isn’t derivable in the empty context for any n, which is as hard as proving consistency in the first place. So we ultimately get nowhere. However, if we do our ordinal relativization

\text{PrC}'(∆, Γ, a, A) := \text{PrC}(∆, Γ, a, A) ∧ ε_0'(O(a))

then this should relativize fine for all clauses, as far as I can tell. Finally, we should be able to prove (with an appropriately selected \in) that

¬ ∃a. \text{PrC}'(∅, ∅, a, ⊥)

demonstrating that \text{PrC}' is not subject to GSI (or Löb’s theorem) despite the fact that we can internally verify that it’s the same predicate as \text{PrC}.

Stepping back, what’s the ultimate point of all this? Traditional wisdom states that GSI and Löb’s theorem put fundamental limitations on various formal logics, preventing them from reasoning about themselves in a useful way. I no longer believe this to be the case. We can, with some cleverness, get these systems to do just that. I’m not really sure where to go from here, so I hope to get helpful feedback from you readers. I could go through the effort of trying to formalize this completely.
I tried doing that but ran into the hell that is doing real mathematics rigorously within PA, so I quit early on. Honestly, someone would have to pay me to complete that project. I’d rather try a different system, such as a weak higher-order logic, or a first-order logic where the data are lambda terms or something, so one doesn’t have to Gödel-encode everything. That might be better since it naturally represents programs. Anyway, thanks for reading. I hope this ends up being useful.
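As a quick postscript, the first few PrC clauses are easy to sanity-check mechanically. Below is a minimal Python sketch of a checker for just the implicational and conjunctive fragment (var, λ, application, \star, and pairing). The tuple encodings and names are my own illustrative choices, not the actual Gödel-coded formalization, and to avoid the existential search in the application clause I annotate the application node with the cut formula A.

```python
# Toy checker for a small fragment of the PrC predicate: var, lambda,
# application, star, and pairing. Formulas and proof terms are nested
# tuples; this encoding is an illustrative assumption, not the actual
# (Goedel-coded) formalization sketched in the post.

TOP = ("top",)

def imp(a, b):
    return ("imp", a, b)

def conj(a, b):
    return ("and", a, b)

def check(gamma, term, formula):
    """PrC(Gamma, term, formula): does `term` prove `formula` under the
    hypothesis context `gamma` (a tuple of (name, formula) pairs)?"""
    tag = term[0]
    if tag == "var":
        # PrC(Gamma, var(a), A) <- (a, A) in Gamma
        return (term[1], formula) in gamma
    if tag == "lam" and formula[0] == "imp":
        # PrC(Gamma, \a.M, A -> B) <- PrC((a, A) :: Gamma, M, B)
        _, a, body = term
        _, A, B = formula
        return check(gamma + ((a, A),), body, B)
    if tag == "app":
        # PrC(Gamma, M@N, B) <- exists A. PrC(Gamma, N, A) /\ PrC(Gamma, M, A -> B)
        # The cut formula A is carried on the node to avoid the existential search.
        _, M, N, A = term
        return check(gamma, N, A) and check(gamma, M, imp(A, formula))
    if tag == "star":
        # PrC(Gamma, star, T)
        return formula == TOP
    if tag == "pair" and formula[0] == "and":
        # PrC(Gamma, (M, N), A /\ B) <- PrC(Gamma, M, A) /\ PrC(Gamma, N, B)
        _, A, B = formula
        return check(gamma, term[1], A) and check(gamma, term[2], B)
    return False
```

For instance, `check((), ("lam", "a", ("var", "a")), imp(A, A))` accepts the identity proof of A → A, while the same term is rejected at A → B. Nothing about relativization is captured here; it only illustrates how little machinery the raw clauses themselves demand.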
Comment
I do worry that there’s some technical subtlety, not obvious until the endgame, that makes this particular construction impossible. I’ve been trying to spot some places where something might go wrong, and there are a few parts I don’t know how to fill in. I think there might be a problem relativizing a proof of the totality of ordinal multiplication, which may prevent either the proof predicate relativization or the transfinite induction derivation from working. Or there may be some more subtle reason the steps I outlined won’t work. I was hoping that someone in the community might be able to say "I’ve seen this before, and it didn’t work because of XYZ", but that hasn’t happened yet, which means I’ll probably have to attempt to formalize the whole argument to get a definitive answer, which I’m not confident is worth the effort. Regardless, I don’t think my idea relies on this specific construct. The example from the beginning of the post that I find unsatisfying already sidesteps Löb’s theorem, and just because I’m unsatisfied with it doesn’t mean it, or something equally simple, isn’t useful to someone. Even if there’s a mistake here, I think the broad idea has legs.
The end of this post got too advanced for me to quickly understand. But near the beginning you say:
Comment
I think you’ve misunderstood what I was saying. I am not claiming that we can prove a material equivalence, an internal formal equivalence, between N and Na; that would require induction. The only things which that statement claimed were being formally proved within Q were 1. and 2. Neither of those statements requires induction. 1. allows us to treat Na as, at most, N. 2. allows us to treat Na as, at least, N. Both together allow us to treat Na as N (within limits). Only 1. and 2. are internal; there are no other formal statements. That 1. and 2. are sufficient for demonstrating that N and Na are semantically the same is a metatheoretic observation which I think is intuitively obvious. The point is that we can see, through internal theorems, that N and Na are the same, not with a material equivalence, but with some weaker conditions. This justifies treating Na as an alternative definition of N, one which we can use to formally prove more than we could with N. This is a different perspective than is usually taken with formal mathematics, where only one definition is typically considered, but incompleteness opens the possibility of formally distinct, semantically equivalent definitions. I think that’s exploitable but historically hasn’t been exploited much.