Can Mathematics
Prove Its Own Sanity?
A landmark 2020 paper by logician Sergei Artemov claims to do the impossible — prove that arithmetic cannot contradict itself, using arithmetic alone. Here’s what that means, why it matters, and why it’s so controversial.
In this article
The Ancient Problem of Self-Trust
“`
Imagine you are building a bridge. Before you trust the bridge to hold traffic, you want to prove it is safe. But here’s a strange twist: what if the only tools available to you for testing the bridge are the bridge’s own beams and bolts? Can the bridge, in some sense, certify its own structural integrity?
This is not a perfect analogy, but it gestures at one of the deepest questions in all of mathematics: can a formal mathematical system prove that it will never produce a contradiction? Can mathematics, in other words, prove its own sanity?
For nearly a century, the consensus answer has been a resounding no — a verdict seemingly handed down by one of the most celebrated theorems in intellectual history, Gödel’s Second Incompleteness Theorem. But in a careful and technically dense 2020 paper, logician Sergei Artemov of the City University of New York argues that this consensus rests on a subtle but crucial mistake — and that, properly understood, a proof of consistency is achievable from within the system itself.
This article will walk you through the full story, from the ground up, with no prior knowledge of mathematical logic required.
Can Peano Arithmetic — the standard formal system for reasoning about whole numbers — prove that it will never derive a contradiction? Gödel said (seemingly) no. Artemov says: it depends on how you define “prove” and how you represent “consistency.”
“`
Hilbert’s Ambitious Dream
“`
To understand Artemov’s paper, we need to go back to the 1920s and the vision of the great German mathematician David Hilbert.
Hilbert was one of the most influential mathematicians of his era — a man of towering ambition and systematic thinking. He had watched mathematics grow increasingly abstract in the 19th century, piling axiom upon axiom, proof upon proof. And he noticed something unsettling: nobody had actually confirmed that all these elaborate mathematical towers weren’t quietly resting on hidden contradictions.
What if someone, deep in the machinery of set theory or arithmetic, could prove both that a statement is true and that it is false? That would be catastrophic — a single contradiction poisons an entire logical system, because from a contradiction you can prove literally anything. The whole edifice of mathematics would collapse.
The Formalist Program
Hilbert’s solution was bold. He wanted to put mathematics on an absolutely secure footing by doing two things:
First, formalize everything. Write all of mathematics as a precise formal language, with explicit axioms and mechanical rules of inference. No handwaving, no appeals to intuition — just symbolic manipulation according to fixed rules.
Second, prove that the system is consistent, using only the most basic, indubitable methods — what Hilbert called finitary reasoning. These are methods so simple and concrete that no one could doubt them: counting, finite combinatorics, straightforward calculation.
The crown jewel of this program would be to prove the consistency of Peano Arithmetic (PA) — the formal system that captures how we reason about natural numbers (0, 1, 2, 3, …) — using only the resources of that very system. Prove that arithmetic cannot contradict itself, using arithmetic.
Think of it like an accounting firm auditing its own books. Hilbert wanted mathematics to look at its own logical procedures and declare, “I have checked every step — there is no error here, no hidden flaw that could make me say both ‘yes’ and ‘no’ to the same question.”
This program — known as Hilbert’s Program — electrified the mathematical world. It promised that mathematics could be made permanently and provably safe. For about a decade, it seemed like it might actually work.
“`
Gödel’s Bombshell
“`
In 1931, a 25-year-old Austrian mathematician named Kurt Gödel published one of the most astonishing papers in the history of human thought. In it, he proved two theorems — his famous Incompleteness Theorems — that seemed to shatter Hilbert’s dream.
The First Incompleteness Theorem
Gödel’s First Theorem says, roughly: any sufficiently powerful, consistent formal system contains statements that are true but unprovable within the system. There will always be mathematical truths that the system cannot reach from its own axioms.
This was already shocking — it meant no formal system could capture all mathematical truth. But it was the Second Theorem that directly targeted Hilbert’s consistency program.
The Second Incompleteness Theorem
Gödel’s Second Theorem states: a consistent formal system (of sufficient power) cannot prove its own consistency.
More precisely: Gödel showed how to translate the statement “Peano Arithmetic is consistent” into a specific mathematical formula — call it Con_PA — written in the language of arithmetic itself. He then showed that PA cannot prove this formula. The system cannot certify its own integrity.
There exists no consistency proof of a system that can be formalized in the system itself.
— Encyclopædia Britannica, summarizing the received view of Gödel’s Theorem
This was interpreted as the death knell of Hilbert’s Program. If arithmetic cannot prove its own consistency, how can we ever achieve the kind of absolute certainty Hilbert sought?
How did Gödel do it?
Gödel’s key insight was an extraordinary trick called Gödel numbering. He found a way to encode every formula, every proof, every statement in the language of arithmetic as a unique natural number. Suddenly, statements about the formal system could themselves be written inside the formal system — mathematics could talk about itself.
Imagine every sentence in English were assigned a unique number. Then a sentence like “Sentence #4,201,847 is false” would refer to itself if it happened to be sentence #4,201,847. Gödel did exactly this to arithmetic — he made arithmetic sentences that talk about arithmetic sentences. This allowed him to construct mathematical analogues of “This statement cannot be proved,” which leads to paradox and unprovability.
Using this mechanism, Gödel expressed the consistency of PA — the claim that PA will never prove a contradiction — as a specific arithmetical formula Con_PA. He then proved that PA cannot prove Con_PA.
“`
The Problem with Con_PA
“`
Now we reach the heart of Artemov’s argument. To understand it, we need to slow down and ask a question that almost everyone has failed to ask carefully: is Con_PA actually a faithful representation of what “consistency” means?
What does consistency really mean?
At its most basic, the consistency of PA is a statement about finite sequences of formulas. A proof in PA is a finite list of formulas where each line follows from the previous ones by the rules of the system. PA is consistent if no such finite list can end in a contradiction.
In everyday language: no matter how long a valid PA-proof is, it will never conclude with a statement like “0 = 1” (which is the symbol for a contradiction, often written ⊥).
This is a very concrete, combinatorial statement. You could in principle check it by going through proofs one by one. The consistency statement (1) is:
For any finite sequence S of formulas,
S is not a PA-derivation of ⊥.
“`
The translation problem
When Gödel encoded consistency as an arithmetical formula, he wrote it like this:
“`
“For all x, x is not a proof of contradiction.”
“`
Here, x ranges over all natural numbers, since proofs are encoded as numbers. And this is where something subtle — and, Artemov argues, something crucial — goes wrong.
In formal logic, when you write “for all x,” the variable x ranges not just over the “standard” natural numbers (0, 1, 2, 3, …) but also over so-called nonstandard numbers that exist in certain strange mathematical models of PA. These nonstandard numbers are infinitely large — they don’t correspond to any actual finite object.
Real proofs are finite objects. They have standard, finite Gödel numbers. There are no proofs with nonstandard (infinite) Gödel numbers, because infinite sequences of formulas aren’t proofs.
Here is the consequence: when PA fails to prove Con_PA, it fails not because there is any genuine doubt about whether real, finite proofs can derive contradictions. It fails because Con_PA, as a universally quantified formula, also says something about these bizarre nonstandard “proofs” — and PA’s language is too weak to rule those out. The failure is about nonstandard arithmetic, not about real proofs at all.
In fact, Artemov points out something remarkable: the individual statements
“`
¬(1 is a proof of ⊥)
¬(2 is a proof of ⊥)
¬(3 is a proof of ⊥)
…
“`
are all provable in PA — every single one. PA has no trouble confirming, for any specific number you name, that it is not a proof of contradiction. What PA cannot do is bundle all these confirmations into a single formula with an unbounded universal quantifier.
This suggests the problem with Con_PA is not a deep foundational barrier — it’s a mismatch between the original notion of consistency and the particular formula chosen to represent it.
“`
A Hidden Crack in the Received Argument
“`
Artemov identifies a critical gap in the chain of reasoning that leads from Gödel’s Theorem to the conclusion “PA cannot prove its own consistency.” The argument rests on an implicit assumption he calls the “Con_PA as Consistency Principle”:
Any contentual proof of consistency of PA within the postulates of PA can be internalized as a formal PA-derivation of the specific formula Con_PA.
This principle says: if you find a real, honest proof that PA is consistent — using only the resources available within PA — then that proof must translate into a formal derivation of the specific formula Con_PA. But why should this be true?
Artemov argues it need not be. A real proof of consistency is a proof about finite sequences of formulas. Translating that proof into a formal derivation of Con_PA requires encoding it as a proof about all natural numbers x — including nonstandard ones. That step is not automatic. It smuggles in extra baggage that wasn’t in the original proof.
Crucially, both Hilbert and Gödel himself seemed to sense this. Hilbert, upon hearing of Gödel’s result, wrote that it only showed one “must exploit the finitary standpoint in a sharper way.” Gödel wrote that “it is conceivable that there exist finitary proofs that cannot be expressed in the formalism.” According to logician Gerald Sacks, Gödel in 1961–62 believed Hilbert’s program remained “very much alive and even more interesting than it initially was.”
These are not the words of men who believed the impossibility interpretation was settled. Artemov is, in a sense, following through on their hesitation.
“`
The Complete Induction Analogy
“`
Before presenting his proof, Artemov introduces a clarifying example that illuminates the core idea beautifully. It concerns a principle called Complete Induction.
What is Complete Induction?
You may be familiar with ordinary mathematical induction: to prove something is true for all natural numbers, you prove it’s true for 0, and you prove that if it’s true for any number n, it’s true for n+1. Done — it’s true for all numbers.
Complete Induction (also called Strong Induction) is a variation: to prove something is true for all x, you prove that if it’s true for all numbers smaller than x, then it’s true for x itself. This is an extremely useful proof technique.
Now here’s the key point: Complete Induction, as a property, is not a single formula. It’s a scheme — a pattern that applies to every possible formula ψ. It says:
For any given formula ψ: if “for all x, whenever ψ holds for all y < x, ψ holds for x” is true, then ψ holds for all x.
Can we prove this? Absolutely — it’s a standard textbook argument. You pick an arbitrary ψ and apply ordinary induction to a related formula. Easy.
But notice: there is no single formula of PA that represents Complete Induction. Because PA is not finitely axiomatizable, there is no way to wrap the entire scheme into one sentence. Nevertheless, we can give a single finite proof that covers all instances of it simultaneously, by describing a procedure (a function) that, given any ψ, produces a proof of CI(ψ).
A property can be completely and rigorously provable in PA — in a meaningful, formalizable sense — even if it cannot be represented by any single arithmetical formula. Looking only at single-formula representations misses an entire class of valid, formalizable mathematical reasoning.
Artemov argues that Consistency is exactly like this. It cannot be captured by a single formula in a way that works perfectly. But it can be proved — as a scheme — by a single finite procedure. And that is exactly what his paper does.
“`
Artemov’s Proof — Step by Step
“`
Now for the centerpiece. Artemov presents a genuine proof of PA-consistency in what he calls informal arithmetic — the ordinary, everyday mathematical reasoning we all do when we think about numbers, which corresponds precisely to the resources available in PA. The proof then gets fully formalized within PA.
The proof uses a crucial mathematical tool: partial truth definitions.
Partial Truth Definitions — What Are They?
In general, no formal system can define truth for all of its own sentences (this is Tarski’s theorem). But a weaker, useful thing is possible: for each level of logical complexity n, you can write down a formula Tr_n(x, y) that correctly captures what it means for a formula of complexity ≤ n to be true.
Think of these as truth-checkers for bounded complexity. Tr_n is a formula in PA’s language that acts like a truth test for all formulas that aren’t too complicated. And the key properties of Tr_n are provable in PA — Artemov cites this as a well-known, established result going back decades.
Imagine a building inspector who can’t inspect every building in the world. But given any specific building, she can inspect that building and certify it as safe or unsafe. Similarly, Tr_n isn’t a universal truth predicate — but given any specific formula of bounded complexity, it can evaluate its truth. And importantly, the rules it uses to evaluate are provably correct within PA.
The Proof, Step by Step
Fix an Arbitrary Proof S
Take any finite sequence of formulas S that claims to be a valid PA-derivation. We want to show it cannot end in a contradiction (⊥). Since S is finite, all formulas in it have bounded logical complexity — say, all formulas are of complexity at most level n.
Calculate the Complexity Level n
From the code of S, we can compute the right level n so that the partial truth definition Tr_n covers all formulas in S. This is a straightforward computation — it’s a primitive recursive function of S’s Gödel number.
Tr_n Serves as an Invariant
We claim: for every formula φ appearing in S, Tr_n(φ) holds — i.e., every formula in S is true (in the sense of Tr_n). We verify this by going through S step by step. The first formulas are axioms of PA — and it’s provable that all PA axioms satisfy Tr_n. Each subsequent formula follows from earlier ones by a rule of inference — and it’s provable that Tr_n is preserved by every rule of inference (Modus Ponens, etc.).
⊥ Fails Tr_n
A separate, provable fact about Tr_n: the contradiction symbol ⊥ does not satisfy Tr_n. In other words, Tr_n(⊥) is false, and this is provable in PA. This is essentially just saying that “0 = 1 is false” — something we obviously know.
Therefore ⊥ Is Not in S
Every formula in S satisfies Tr_n (step 3). But ⊥ does not satisfy Tr_n (step 4). Therefore ⊥ is not in S. The proof S does not contain a contradiction. Since S was an arbitrary PA-derivation, PA is consistent. ∎
Formalizing the Proof in PA
The argument above is given in informal arithmetic — the kind of plain mathematical reasoning we all do. Artemov then shows that this proof can be completely formalized inside PA itself. The key mechanism is a selector function.
Define a primitive recursive function p(d) that takes any number d (intended to be the Gödel number of a PA-derivation) and outputs a formal PA-proof that d is not a proof of ⊥. PA then proves the statement:
“For every x, p(x) is a PA-proof that x is not a proof of contradiction.”
“`
This is a universally quantified statement — but crucially, it does not say “for all x, ¬x:⊥.” It says “for all x, here is an explicit proof that ¬x:⊥.” The selector p is a concrete, computable function that hands you a certificate for each specific derivation.
Gödel’s theorem blocks a single proof of Con_PA — a single proof that covers all x at once, uniformly, including nonstandard x. Artemov’s result provides, for each standard x, an explicit proof that x is not a proof of ⊥. The selector function p is total and computable — for every real, finite derivation, it produces a real, finite certificate of non-contradiction. Gödel’s theorem does not block this.
Why Doesn’t Gödel’s Theorem Block This?
Artemov addresses this directly. Gödel’s theorem would block a proof like this: suppose there were a single formula I(x,y) such that PA proved both “if x is a proof of y, then I(x,y)” and “I(x,⊥) is never true.” Then PA could conclude “nothing is a proof of ⊥” — that is, Con_PA — and that’s impossible.
But Artemov’s invariant Tr_n is not a single formula uniform for all derivations. It depends on n, which depends on the specific derivation S. There is no single, universal I(x,y) — there’s a different invariant for each derivation. Gödel’s theorem prohibits a uniform invariant; Artemov uses a family of derivation-specific invariants. This is the gap, and it’s real.
“`
The Concept of Schemes
“`
Artemov’s proof motivates a broader theoretical framework — a general account of how properties expressed as schemes (infinite families of statements) can be finitely proved in PA. This is perhaps the most conceptually important contribution of the paper.
Three Kinds of Provability
Artemov defines three different notions of “provability” for a scheme — a family of statements of the form ϕ(0), ϕ(1), ϕ(2), …, indexed by the natural numbers:
| Kind | What it Means | Example |
|---|---|---|
| Strongly Provable | PA proves the single formula ∀xϕ(x) — the universal quantification over all x, including nonstandard ones | Simple arithmetic identities |
| Provable (as a scheme) | There is a computable function t such that PA proves ∀x[t(x) is a proof of ϕ(x)] — a selector that, for each x, hands you a proof | Consistency {¬x:⊥}; Complete Induction |
| Weakly Provable | For each specific n, PA proves ϕ(n) — but there’s no systematic proof procedure | True Σ₁-sentences provable one by one |
Artemov shows these three notions are genuinely distinct:
Provable ≠ Strongly Provable
The consistency scheme {¬x:⊥} is provable but not strongly provable — precisely because Gödel’s theorem blocks ∀x¬x:⊥.
Weakly Provable ≠ Provable
The “Unprovability of Inconsistency” scheme is weakly provable (each instance is provable) but not provable as a scheme — as shown by Kurahashi and Sinclaire independently.
This establishes a rich, three-tiered hierarchy of provability for schemes, with consistency sitting comfortably in the middle tier — provable but not strongly provable — and this is exactly the right tier for a concept that Gödel’s theorem targets only in its stronger form.
What Makes a Proof of a Scheme Legitimate?
Artemov is careful to note that not every formal object that satisfies the definition of a “scheme proof” actually constitutes a genuine consistency proof in the intended sense. He gives a subtle example: one can construct a trivial selector function that “proves” non-contradiction by cheating — it only works if the derivation in question is already inconsistent, in which case it gives a fake proof of ¬x:⊥ from ⊥ itself. This satisfies the formal definition of a scheme proof but is obviously not a real consistency proof.
His actual proof doesn’t have this defect — it uses the partial truth definition to provide genuine, independent verification. But the point is that the notion of “scheme proof” is a formal framework, and contentual judgment is still needed to evaluate whether a particular proof of a scheme actually establishes the property it claims to. This is no different from how logicians already use contentual judgment to evaluate which formulas count as faithful representations of consistency.
“`
What This Cannot Do
“`
Artemov is careful and honest about the limits of his approach. It is worth dwelling on these, because they show the result is not a magical escape from all of Gödel’s consequences.
PA + ConPA Cannot Prove Its Own Consistency This Way
The method works for PA. But if you extend PA by adding Con_PA as an additional axiom — creating the theory PA + Con_PA — the same method cannot establish its consistency.
This is shown by a result independently noticed by logicians Sinclaire and Kurahashi: PA cannot prove the constructive consistency formula CCon_T for any theory T that extends PA + ConPA. The method has a natural ceiling. It can prove consistency one level up, but not two — which is exactly what one should expect from a foundational tool of this power.
This does not violate Gödel’s Second Theorem. Gödel’s theorem remains completely valid. What Artemov has done is show that the theorem’s scope is narrower than commonly thought — it blocks a specific kind of consistency proof (one that derives the formula Con_PA) but does not block the scheme-based proof presented here.
Gödel’s Theorems Are Untouched
Artemov is emphatic: “By no means are we casting doubt upon Gödel’s Incompleteness Theorems.” These remain among the most important results in all of mathematics. The First Incompleteness Theorem — that PA contains true but unprovable statements — is not challenged. The Second — that PA cannot prove Con_PA — is not challenged. What is challenged is the interpretation of the Second Theorem as establishing that “PA cannot prove its own consistency.” Artemov argues this interpretation relies on identifying consistency with the specific formula Con_PA, which is a choice, not a necessity.
The Formalization Principle Survives for Sentences
Artemov also clarifies: his argument only refutes what he calls the “Con_PA as Consistency Principle” — the idea that any proof of PA consistency must produce the formula Con_PA. The broader Formalization Principle — that any ordinary mathematical argument about sentences can be formalized in PA — remains valid and unchallenged.
“`
Why This Matters
“`
If Artemov’s argument is correct, it has significant implications at several levels — philosophical, foundational, and practical.
For Philosophy of Mathematics
For decades, the standard story has been: Gödel killed Hilbert’s Program. You cannot have certainty from within — mathematics must always rely on something outside itself to certify its consistency. This story shaped how philosophers and logicians thought about the nature of mathematical knowledge.
Artemov suggests this story is too simple. The impossibility was partly an artifact of how consistency was formalized — of translating a finite, combinatorial statement into an arithmetical formula involving unbounded universal quantifiers and nonstandard models. When you avoid that translation and prove consistency directly, as a property of finite proof objects, the impossibility dissolves.
This rehabilitates, to some extent, Hilbert’s intuition that mathematics could be trusted on its own grounds. Not Hilbert’s full program — but the core aspiration: that we can know, by means formalizable within arithmetic, that arithmetic will not contradict itself.
For Foundations of Mathematics
The result suggests that the notion of “what can be proved in PA” is richer than the standard picture allows. There is provability of sentences, and there is also provability of schemes — and these are genuinely different concepts, with different power and different scope.
This opens up a research program: developing a systematic theory of scheme-provability, understanding which properties can be established this way, and clarifying the relationship between informal mathematical reasoning and formal derivations.
For Formal Verification
There is also a practical implication. When we use formal proof assistants to verify computer programs or mathematical results, we are using formal systems like PA. If those systems can certify their own consistency by the methods Artemov describes, then formal verification becomes a more self-contained activity — it does not need to appeal to external consistency assumptions to justify its own reliability.
Artemov’s paper does not overturn Gödel or solve all foundational problems. What it does is identify a genuinely overlooked gap in a near-century-old argument — a gap between what Gödel’s theorem actually proves and what people concluded from it. It fills that gap with a new proof technique (scheme-provability) and a concrete mathematical result (the consistency of PA is provable as a scheme). Whether the mathematical and philosophical community ultimately accepts this as a proof of “consistency” in the full Hilbertian sense will be debated for years. But the mathematical argument is careful, rigorous, and — at the very least — forces us to think much more precisely about what we mean when we say a system “proves its own consistency.”
The Reaction
The paper has been met with a mixture of serious engagement and skepticism in the mathematical logic community. Many logicians feel that “provability as a scheme” is a weaker notion than what Hilbert really wanted — that Hilbert wanted a single, unified certificate of consistency, not a family of certificates. Others find the argument compelling and believe the single-formula requirement was never a principled constraint but an artifact of the mathematical tools available in the 1930s.
What is not in dispute is that Artemov has identified something real and important: the notion of “proof of a scheme” is a legitimate, formalizable, and unexplored class of mathematical reasoning, and studying it is worthwhile regardless of how one interprets the result for Hilbert’s Program.
Conclusion: A Question of Definitions
Mathematics, at its deepest levels, is often a question of getting the definitions right. The Pythagorean theorem is not hard once you have defined distance, angle, and right triangle precisely. Similarly, whether PA can prove its own consistency is not a question with a fixed, eternal answer — it depends on what you mean by “prove” and what you mean by “consistency.”
Gödel showed, with breathtaking ingenuity, that a particular formulation of consistency — encoded as the formula Con_PA — is not derivable in PA. Artemov shows, with equal care, that a different, equally legitimate formulation — consistency as a property of finite proof sequences, proved via a selector function — is achievable within PA.
The lesson is not that Gödel was wrong. The lesson is that the question is subtler than the popular summary suggests. And in mathematics, recognizing that a question is subtler than you thought is always the beginning of genuine understanding.
As Gödel himself apparently believed, even in 1962: Hilbert’s program is “very much alive and even more interesting than it initially was.”
“`




























