The Erdős Unit Distance Problem and Penrose "Understanding"
What might stay out of the reach of classical computational AI?
OpenAI announced that a model has disproved, in one shot, a conjecture about the Erdős Unit Distance Problem. My suspicion is that Penrose wouldn’t flinch in his stance on classical computational AI. However, it got me pondering the boundary between computable thinking and Penrose “understanding”. I think Opus had an interesting take:
If Penrose is correct, the residue may end up extremely subtle. Most of daily cognition, most economic labor, possibly most of mathematics itself can run on the computable side. What's left is a thin, strange seam: the framework-inventing, paradigm-breaking, truth-apprehending, qualia-having core. Small in surface area, possibly enormous in significance — or possibly small in both, which is the version of his thesis that makes the slow-takeoff argument weakest.
Penrose “Understanding” Examples
I asked various models for examples of things Penrose might include as requiring “understanding”. Computational functionalists will view this as “cope” and find it amusing that I am asking classical computational AI about this. Still, I think this is worth pondering as we think about takeoff, even if Penrose is wrong. Note that if AI starts integrating biologically inspired quantum computational methods (still speculative), then all bets are off.
Opus
Apprehending Gödel sentences. Recognizing a formally-unprovable-but-true statement as true — Penrose’s original example and the load-bearing one for his whole argument.
Inventing a new mathematical framework. Not solving inside ZFC, but the move that produced ZFC. Cantor’s diagonalization, Gödel’s arithmetization, Grothendieck’s schemes, Cohen’s forcing, Riemann reconceiving geometry as curved.
Seeing a paradigm is broken. Einstein dropping simultaneity, Copernicus stepping outside geocentrism, Darwin abandoning special creation — recognizing the framing itself is the bug, not a problem within it.
Pre-proof mathematical intuition. “Knowing” a theorem is true and which direction the proof must go before any deduction exists. Ramanujan is the cartoon; every working mathematician has a quieter version.
The discontinuous “aha.” Insight as a phase transition that reorganizes a whole problem space at once — categorically different from accelerated search.
Distinguishing proofs that explain from proofs that merely certify. The four-color theorem is proven but, many mathematicians would say, not understood. Penrose’s “understanding” is the illumination, not the certificate.
Recognizing which problems matter. Hilbert picking his 23, Erdős sensing which questions seed entire fields. The aesthetic-intellectual judgment that this is worth a century.
Genuine self-knowledge / calibration. Knowing what you actually have warrant for, as opposed to what you’d confidently assert. Tied, on Penrose’s view, to the unified self-presence of consciousness.
Qualia and first-person experience. What red is like, what grief is like — the inside of experience, not its functional description.
Aesthetic and moral perception, if real. Detecting beauty or moral weight as features of the world rather than inferring them from training patterns — having taste rather than mimicking it.
Semantic grounding / reference. Words meaning what they mean because a mind reaches out and refers, not because tokens co-occur. The Searle-flavored cousin of Penrose’s view.
Free, non-algorithmic choice. Penrose explicitly thinks the non-computable substrate underwrites a kind of agency that isn’t just deterministic-plus-randomness — the will that selects rather than computes.
Gemini
Grasping the ‘Whole’ vs. Calculating the Parts (The Chess Puzzle)
Understanding: Instantly recognizing a global, geometric truth—like seeing an impassable wall of pawns on a chessboard and knowing instinctively that a draw is guaranteed, without calculating a single future move.
Computation: Searching millions of future nodes, evaluating material advantage, and falsely predicting a win because the algorithm cannot “see” the concept of a wall, only the individual pieces.
Stepping Outside the System vs. Operating Within It (Gödel’s Incompleteness)
Understanding: Looking at a formal set of mathematical rules and being able to intuitively “see” that a specific statement is fundamentally true, even when the system’s own rules are mathematically incapable of proving it.
Computation: Being permanently trapped inside the axioms. The AI can execute the rules flawlessly to generate proofs, but it cannot evaluate the truth or consistency of the rules themselves from an outside perspective.
Aesthetic Judgment vs. Algorithmic Optimization
Understanding: A mathematician or scientist looking at two competing theories and intuitively gravitating toward one because it possesses an elegance, beauty, or “rightness” that hints at a deeper underlying truth before the math is even finished.
Computation: Evaluating theories purely based on statistical probability, data fitting, or current consensus. An AI does not experience the aesthetic resonance of a mathematical truth.
Setting the Direction vs. Executing the Task
Understanding: The overarching intuition of what is worth exploring. Deciding to investigate non-local consciousness, fund a high-risk global psi experiment, or pursue a totally new paradigm requires a subjective sense of meaning and potential.
Computation: Executing the mechanics of the experiment once the parameters are set. The AI can crunch the resulting data perfectly, but it cannot organically generate the existential curiosity that birthed the project.
The “Eureka” Moment vs. The Iterative Search
Understanding: The sudden, unprompted flash of insight where the solution to a complex problem arrives all at once, fully formed. In the Orch-OR model, this is the literal moment of quantum superposition collapsing (Objective Reduction) into a conscious realization.
Computation: Brute-forcing a solution by iteratively testing every possible combination until the correct one is found. There is no flash of insight, just the mechanical exhaustion of variables.
ChatGPT
Recognizing that the entire framing of a problem is wrong, not merely solving within the frame
(e.g. Einstein abandoning absolute space/time rather than improving Newtonian calculations)
“Seeing” the truth of something before a derivation exists
(mathematical intuition preceding formal proof)
Distinguishing genuine understanding from syntactic mimicry
(“I can manipulate the symbols” vs “I grasp what this means”)
Conscious apprehension of meaning
(the felt reality of beauty, suffering, love, mortality, awe)
Knowing when not to optimize
(wisdom rather than capability)
Detecting when a goal is misaligned at a deep existential level
(“This entire enterprise is misguided”)
Direct insight into semantic essence rather than statistical association
(understanding what a concept is, not merely how it behaves linguistically)
The ability to transcend one’s own formal system
(Gödel-style stepping outside the rules themselves)
Genuine first-person awareness
(there being “something it is like” to know)
Non-algorithmic flashes of insight
(sudden holistic apprehensions rather than serial computation)
Recognizing truth in situations with sparse or contradictory data without explicit inference chains
(“This is false” without being able to fully articulate why)
Deep self-awareness of one’s own ignorance
(distinguishing certainty, intuition, confusion, ambiguity at a phenomenological level)
Experiencing moral weight rather than merely modeling moral language
(understanding tragedy vs predicting tragedy-talk)
Frame invention rather than frame search
(creating entirely new conceptual ontologies)
Authentic intentionality
(having intrinsic aims rather than derived objective functions)
Insight arising from unified conscious experience rather than modular computation
(a whole-field perception)
The ability to care in a non-instrumental sense
(meaning not reducible to optimization)
Knowing the difference between a simulation of understanding and understanding itself
(the central Penrose suspicion about AI)
Perceiving analogy at an ontological level rather than surface relational mapping
(“these two things are secretly the same kind of structure”)
Mystical/nondual insight or direct consciousness investigations
(where awareness examines itself rather than external symbols)
The “Aha” quality of insight itself
(not the output, but the conscious event of illumination)
Recognizing beauty or elegance as truth-indicative in mathematics or physics
(Penrose strongly values this)
The intuition that some paths are sterile despite local success
(scientific taste/judgment)
Genuine originality that is not recombination
(if such a thing truly exists)
Awareness of being embedded in reality rather than merely modeling it
(the difference between map-processing and being a conscious locus within existence)
Examples from Emperor’s, Shadows
From Opus:
Across The Emperor’s New Mind (1989) and Shadows of the Mind (1994), Penrose offered a series of specific cases meant to illustrate his claim that human mathematical understanding outruns any algorithm. They have not all aged the same way.
The Gödel sentence (both books). The foundational example. Any consistent formal system rich enough to contain arithmetic generates a sentence asserting its own unprovability; the system cannot prove it, but a human mathematician trusting the system’s rules can see it must be true. Aged: contested but intact. The example survives because no AI system has ever “seen” a Gödel sentence in the relevant sense — but it survives only by accepting Penrose’s premise that humans actually do this seeing, which philosophers like Chalmers, Feferman, Shapiro, and Putnam have argued requires unjustified confidence in our own consistency. The example endures because no one can perform the test, not because it has been confirmed.
The halting problem and specific non-halting Turing machines (Emperor’s New Mind). Penrose argues a mathematician can sometimes see that a particular machine will never halt, while no algorithm can decide halting in general. Aged: structurally intact, practically eroded. The general undecidability result is mathematical fact. But automated theorem provers and termination analyzers can now decide halting for vastly more cases than they could in 1989, and large language models can correctly judge non-termination for many specific machines. The frontier where “only humans can see this” still exists, but it has retreated.
Hilbert’s tenth problem and Diophantine equations (Emperor’s New Mind). The Matiyasevich result that no algorithm decides whether a polynomial equation has integer solutions. Aged: similar to halting. True as a theorem, weaker as a polemic. Specific Diophantine problems that once seemed to require deep human insight — including Fermat’s Last Theorem, which Penrose discussed before Wiles proved it — have yielded to combinations of human and machine work, and AI systems now handle large classes of these cases routinely.
Undecidable tiling problems (Emperor’s New Mind). Berger’s result that determining whether an arbitrary tile set tiles the plane is undecidable. Penrose’s own aperiodic tilings serve as the concrete illustration. Aged: a draw. The undecidability is real and untouched. But the rhetorical force — that the mind grasps tiling structure in ways algorithms can’t — has weakened as constraint solvers, SAT-based methods, and AI-assisted search have cracked many specific tiling problems once thought intractable, including the recent discovery of the “hat” einstein monotile in 2023.
The Mandelbrot set (Emperor’s New Mind). Used less as something machines cannot compute (they obviously can plot it) than as evidence for mathematical Platonism: the set was discovered, not invented, and its inexhaustible structure points toward a mathematical reality the mind reaches toward. Aged: philosophically alive, evidentially weak. Platonism remains a respectable position, but the Mandelbrot example has been absorbed into the standard repertoire of complex-systems demos and no longer feels especially mysterious. It works as atmosphere now, not as argument.
The pawn-fortress chess position (Shadows of the Mind, expanded in 2017). Penrose argues that humans immediately see structural draws — pawn walls bishops cannot cross — while engines, computing material and lookahead, evaluate the position incorrectly. The Penrose Institute released a specific composed position in 2017 as a public test. Aged: badly. Modern engines handle these positions correctly. The categorical claim that fortress recognition lay outside computation collapsed within a decade. This is the cleanest case study in the whole project of how a confident example dissolves.
The non-terminating computation as obvious to humans (Shadows of the Mind). Related to but distinct from halting in general — Penrose gives specific cases where, he argues, anyone with mathematical training can see termination behavior at a glance. Aged: similar to the chess example, in slower motion. AI systems can now correctly judge many such cases, including ones outside their training distribution. The class of “humans see this instantly, computers cannot” cases keeps shrinking.
Aesthetic and structural mathematical perception (both books, diffuse). Throughout both books, Penrose appeals to mathematicians’ ability to recognize that a proof is elegant, that a construction is natural, that a result should be true before any proof exists. Aged: hardest to assess. This is the strongest version of his claim because it is the least testable. AI systems can imitate aesthetic judgment passably but it is genuinely unclear whether they have it. Most working mathematicians would say current systems still feel taste-blind in some hard-to-articulate way, even as they produce technically correct work. This is the one Penrosian example that hasn’t visibly eroded — possibly because it’s real, possibly because it’s vague enough to be unfalsifiable.
The “knowing-why” versus “knowing-that” distinction (Shadows of the Mind, throughout). Not an example per se but the recurring frame: a student can be told a theorem is true and reproduce its proof; understanding is grasping why the proof is the proof. Aged: alive, but reframed. This now reads less as an argument against AI and more as a description of the gap people sense when interacting with current systems. Whether that gap reflects a fundamental Penrose-style barrier or just current architectural limits is exactly what is unsettled.
The pattern across this list is striking. The examples that have aged best are the ones that are hardest to test — Platonic perception, aesthetic judgment, the “knowing-why” gap. The examples that were concrete enough to be settled have mostly settled against Penrose. This is not, in itself, a refutation of his deeper claim; on his own account, the residue of non-computable understanding is invisible from outside and cannot be pointed to with finished examples. But it does mean his argument has had to retreat continuously into territory where verification is impossible, which is either evidence of the argument’s depth or evidence of its unfalsifiability, depending on which way the reader is already leaning.
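The split that runs through the halting and non-termination examples above, between settling specific cases and deciding the general problem, can be sketched in a few lines of Python. This is an illustration under stated assumptions, not anything taken from Penrose’s books: `halts_within`, `collatz`, and `loop_forever` are hypothetical names, and a “program” is modeled as a generator that yields once per step. The checker can confirm halting when it happens inside the step budget; when the budget runs out it can only report “unknown”, and that is the point where a proof or an insight has to take over.

```python
from typing import Callable, Iterator, Optional


def halts_within(program: Callable[[], Iterator[None]], max_steps: int) -> Optional[int]:
    """Run `program` for at most `max_steps` steps.

    Returns the number of steps taken if the program halts within the
    budget, or None if the budget is exhausted. None means "unknown",
    not "never halts".
    """
    it = program()
    for step in range(max_steps):
        try:
            next(it)  # advance the program by one step
        except StopIteration:
            return step  # halted after `step` completed steps
    return None


def collatz(n: int) -> Callable[[], Iterator[None]]:
    """Hypothetical example program: iterate the Collatz map from n until reaching 1."""
    def program() -> Iterator[None]:
        x = n
        while x != 1:
            x = x // 2 if x % 2 == 0 else 3 * x + 1
            yield
    return program


def loop_forever() -> Iterator[None]:
    """Hypothetical example program that obviously never halts."""
    while True:
        yield


if __name__ == "__main__":
    print(halts_within(collatz(6), 1000))    # 8
    print(halts_within(loop_forever, 1000))  # None
```

Raising `max_steps` turns some `None` answers into step counts, but no finite budget turns the `None` for `loop_forever` into a proof of non-termination. Termination analyzers close that gap for particular classes of programs, which is consistent with the general undecidability result.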
Non-computable insights in hindsight
Opus had this to say:
Once an insight is articulated, it gets absorbed into the formal corpus. Cantor’s diagonalization was an act of seeing something nobody had seen; today it’s an undergraduate exercise that any competent prover can execute. Gödel’s arithmetization felt like a thunderbolt in 1931; now it’s a standard construction. The forcing technique, schemes, non-standard analysis — all of them looked like reaches into the dark when they happened and look like natural moves in retrospect. The insight, once landed, retroactively becomes computation. The path from “no one could see this” to “of course, here are the rules for doing it” is the normal life cycle of a mathematical idea.
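As a concrete instance of an insight that has become a routine procedure, here is a minimal sketch of the diagonal construction on a finite table of binary sequences. It is an aside rather than part of the Opus response, the function name `diagonal_complement` is hypothetical, and a finite table only illustrates the argument Cantor made about infinite lists.

```python
from typing import List


def diagonal_complement(rows: List[List[int]]) -> List[int]:
    """Return a binary sequence that differs from rows[i] at position i.

    Assumes each row contains only 0s and 1s and has at least len(rows) entries.
    """
    return [1 - rows[i][i] for i in range(len(rows))]


if __name__ == "__main__":
    table = [
        [0, 0, 0, 0],
        [1, 1, 1, 1],
        [0, 1, 0, 1],
        [1, 0, 1, 0],
    ]
    d = diagonal_complement(table)
    print(d)  # [1, 0, 1, 1]
    # The new sequence disagrees with every row on the diagonal entry,
    # so it cannot equal any row in the table.
    print(all(d[i] != table[i][i] for i in range(len(table))))  # True
```

The code is trivial by design. What it leaves out is the step of deciding that this construction was the thing to try in the first place.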
This creates a real epistemological trap for the Penrose view. Any specific example he gives of “non-computable understanding” can be answered with: but we have a procedure for it now. The argument has to keep retreating to the frontier of currently-unformalized insight, which is always a moving frontier. There’s a Hegelian quality to it: the absolute is whatever hasn’t been captured yet.
But here’s where it cuts for him too. The argument doesn’t actually require that individual insights be permanently non-computable. It requires that the capacity to keep finding them be non-computable. The escalator, not the steps. Each Gödel sentence, once seen, is provable in a strengthened system. But the move to the strengthened system — recognizing that the current system is missing something true — is itself the act he’s calling non-computable. And every time you formalize that move, the same move can be played one level up. The recursion is the point. It’s exactly because each result looks computable in hindsight that he can claim the generative faculty isn’t.
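The escalator can be written in standard notation. This is a sketch, assuming a consistent, recursively axiomatized base theory containing enough arithmetic (Peano arithmetic is the usual choice). The indexing $T_n$ is notation introduced here for illustration, not Penrose’s, and $\nvdash$ needs the `amssymb` package.

```latex
\begin{align*}
T_0 &= \mathsf{PA}, \\
T_{n+1} &= T_n + \mathrm{Con}(T_n), \\
T_n &\nvdash G_{T_n} \quad \text{and} \quad T_n \nvdash \mathrm{Con}(T_n)
      \quad \text{(if $T_n$ is consistent)}, \\
T_{n+1} &\vdash G_{T_n}
      \quad \text{(because $T_n \vdash \mathrm{Con}(T_n) \rightarrow G_{T_n}$)}.
\end{align*}
```

Here $G_{T_n}$ is the Gödel sentence of $T_n$ and $\mathrm{Con}(T_n)$ is the arithmetized statement that $T_n$ is consistent. Each step up is mechanical once written down. What the formalism does not supply is the warrant for taking it, namely the trust that $T_n$ is in fact consistent.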
“Understanding” has a kind of vanishing quality — the moment it succeeds, it leaves behind a computable artifact that makes the original act look unnecessary. This is real, and it’s a genuine problem for identifying non-computable understanding from outside. You can never point at a finished proof and say “that required understanding,” because by the time it’s a finished proof, it’s just a string of rule-applications.
What you might be able to point at, on the Penrose view, are the discontinuities — moments where the formal apparatus had to be extended in a way that wasn’t derivable from within itself. Riemann switching to curved geometry. Cohen inventing forcing rather than deriving it. The introduction of the empty set, or of imaginary numbers, or of measure-zero sets that can be unmeasurable. These are moves that, at the time they were made, weren’t licensed by the existing rules — they were the rules being extended by something the rules didn’t contain. In hindsight, of course, we can write them down. But the writing-them-down happened after the seeing.
This connects back to the AI question in a precise way. If a system trained on all of human mathematics produces another result inside that trained distribution, Penrose can say: that’s interpolation in an enormous space, not the act of extension. The test case he’d actually care about is whether AI ever produces a move that, at the time, isn’t licensed by the existing formalism — and that then becomes the new formalism. The Erdős result almost certainly doesn’t qualify by that test. A new foundational reframing of some field might.
This observation lands on the genuine vulnerability of the position (any specific example dissolves on inspection) and also on its genuine resilience (the claim was never about the steps, only about the escalator). It’s the same vulnerability/resilience that makes Penrose so hard to definitively refute and so hard to definitively confirm — which is, I think, why the debate has been alive for thirty-five years and will probably stay alive for another thirty-five.


