A veteran researcher in formal methods contends that Lean's current prominence in mathematical formalization overlooks foundational work from the 1960s onward, arguing that key milestones were achieved decades ago by systems like AUTOMATH, Boyer-Moore, and the LCF family, while acknowledging Lean's strengths in community engagement and accessibility.
The frequent insistence that proponents of mathematical formalization must justify not using Lean has become a notable feature of contemporary discourse. This expectation, while reflecting Lean's undeniable recent successes, prompts a necessary historical reflection on the field's evolution. The core argument is not that Lean lacks merit—its robust library (mathlib), active community, and tooling are significant—but that the narrative of its uniqueness risks obscuring a longer trajectory of achievement.
Formalizing mathematics is not a recent endeavor enabled solely by dependent type theory. In 1968, N.G. de Bruijn's AUTOMATH system already embodied the essential components of rigorous formalization. By 1977, L.S. van Benthem Jutting had used it to formalize Landau's Foundations of Analysis, constructing the real numbers from first principles via equivalence classes and Dedekind cuts—a feat not replicated for two decades despite exponential growth in computing power. This work demonstrated that the fundamental machinery for building core mathematical structures existed long before modern proof assistants.
Parallel developments further underscore this point. The Boyer-Moore theorem prover (later ACL2), first implemented in 1973 around a logic of pure Lisp, successfully formalized profound results including Gödel's first incompleteness theorem and the Church-Rosser theorem, albeit with a computational-logic orientation less suited to abstract mathematics but highly effective in specific domains. Simultaneously, the LCF paradigm pioneered by Robin Milner at Edinburgh introduced the revolutionary concept of using a functional programming language (ML) as a metalanguage for proof checking. This approach, which separates the logical kernel from user-facing tactics via abstract data types, directly enabled systems like HOL, Isabelle, and early Coq (now Rocq). John Harrison, for instance, formalized the prime number theorem in HOL Light using complex analysis—a milestone often attributed to newer systems.
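That separation of a small, trusted kernel from everything else is easy to make concrete. Below is a minimal sketch in OCaml (a descendant of LCF's original ML), assuming a toy logic with only implication; the names and the rule set are illustrative rather than drawn from any actual system. Because thm is an abstract type, the only way for code outside the module to obtain a theorem is to apply the kernel's inference rules, and no proof object is ever stored.

    (* Formulas of a toy logic: propositional variables and implication. *)
    type formula =
      | Var of string
      | Imp of formula * formula

    (* The trusted kernel: [thm] is abstract, so theorems can only be made
       by the inference rules exported here. *)
    module type KERNEL = sig
      type thm
      val assume : formula -> thm              (* A |- A *)
      val intro : formula -> thm -> thm        (* discharge a hypothesis *)
      val modus_ponens : thm -> thm -> thm     (* from A -> B and A, infer B *)
      val concl : thm -> formula
      val hyps : thm -> formula list
    end

    module Kernel : KERNEL = struct
      (* A theorem is just its hypotheses and conclusion; outside this
         module, values of type [thm] cannot be forged. *)
      type thm = formula list * formula

      let assume a = ([ a ], a)

      let intro a (hs, c) = (List.filter (fun h -> h <> a) hs, Imp (a, c))

      let modus_ponens (hs1, c1) (hs2, c2) =
        match c1 with
        | Imp (a, b) when a = c2 -> (hs1 @ hs2, b)
        | _ -> failwith "modus_ponens: conclusions do not match"

      let concl (_, c) = c
      let hyps (hs, _) = hs
    end

    (* Example: derive |- A -> A for any formula A, using only kernel rules. *)
    let identity_theorem a = Kernel.intro a (Kernel.assume a)

Tactics, decision procedures, and machine-generated proof scripts all live outside this boundary, so they can be arbitrarily elaborate, or arbitrarily unreliable, without compromising soundness.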
A critical technical distinction often overlooked concerns the "propositions as types" paradigm. While central to systems like Coq and Agda, it is not universal. AUTOMATH, for example, kept types and propositions distinct—a choice de Bruijn justified decades ago to avoid complications like proof-dependent division (where writing x/y would require supplying a proof of y≠0). The LCF insight that proof objects are unnecessary for verification—because an ML abstract data type guarantees that theorems can only be constructed by the kernel's inference rules—means that carrying around large proof terms, as some type-theoretic systems do, can be computationally wasteful without adding verifiable value. This is not merely theoretical: in an era of large language models generating verbose proofs, systems that exploit this LCF-style economy (such as Isabelle with sledgehammer) often yield more readable output after automation.
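De Bruijn's worry about proof-dependent division is easy to reproduce in a dependently typed setting. The following Lean sketch uses a hypothetical safeDiv, not mathlib's division (mathlib instead totalizes the operation so that x / 0 = 0), to show how the side condition propagates to every call site:

    -- A division that demands a proof that the divisor is nonzero, in the
    -- style de Bruijn objected to. `safeDiv` is illustrative, not mathlib's `/`.
    def safeDiv (x y : Nat) (_h : y ≠ 0) : Nat := x / y

    -- Every call site must now discharge the side condition:
    #eval safeDiv 10 2 (by decide)   -- 5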
The rise of Lean, particularly through Tom Hales' library-building initiative and Kevin Buzzard's pedagogical advocacy, undeniably broadened access. Its success in projects like the Liquid Tensor Experiment showcases its capability for cutting-edge work. However, characterizing this as the first time such feats were possible overlooks that HOL Light and Isabelle/HOL had already formalized the Jordan curve theorem and the prime number theorem by the mid-2000s. Lean's key innovation may lie less in novel technical foundations and more in cultivating an exceptionally welcoming environment for mathematicians, reducing barriers to entry through superior documentation, IDE integration, and community norms.
This historical perspective carries practical implications. Over-indexing on Lean risks neglecting valuable alternatives: Isabelle's unmatched automation via sledgehammer (which integrates external solvers seamlessly), Mizar's extraordinarily readable declarative language (which inspired Isabelle's own Isar), or even the simplicity of ACL2 for industrial verification. Moreover, the insistence on dependent types everywhere—despite mathlib's own tendency to avoid them for many mathematical structures—can introduce unnecessary complexity through universe levels and definitional equality challenges, where T(N+1) and T(1+N) become distinct types, complicating proofs without enhancing mathematical expressiveness for much of mainstream analysis; the short Lean sketch below illustrates the point.
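Consider a minimal Lean sketch with a length-indexed vector; Vec and shift are hypothetical names used for illustration, not mathlib definitions. Because addition on Nat recurses on its second argument, n + 1 reduces to Nat.succ n while 1 + n does not, so the two index expressions give types that are only propositionally equal, and an explicit cast is needed to move between them:

    -- An illustrative length-indexed vector (not a mathlib definition).
    inductive Vec (α : Type) : Nat → Type where
      | nil  : Vec α 0
      | cons : {n : Nat} → α → Vec α n → Vec α (n + 1)

    -- `Vec α (n + 1)` and `Vec α (1 + n)` are not definitionally equal, so even
    -- the identity coercion needs a cast along a proof of `n + 1 = 1 + n`:
    def shift {α : Type} {n : Nat} (v : Vec α (n + 1)) : Vec α (1 + n) :=
      cast (congrArg (Vec α) (Nat.add_comm n 1)) v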
The emergence of AI-assisted proof generation further highlights the importance of readability and structural clarity. Legible, well-organized proofs—whether produced by humans or refined from AI output via tools like sledgehammer—are more amenable to inspection, simplification, and cross-system translation. Here, the LCF-derived practice of not storing proof objects, together with structured proof languages such as Isabelle's Isar, may offer advantages over proof-term-heavy approaches when the goal is human understanding rather than mere machine verification.
Acknowledging Lean's genuine contributions does not require denying the field's deep roots. The formalization of mathematics has always progressed through diverse approaches: AUTOMATH's logical foundations, Boyer-Moore's computational verifiability, LCF's tactical frameworks, and Mizar's natural language-inspired syntax each addressed different facets of the challenge. Recognizing this pluralism fosters a healthier ecosystem where the choice of tool aligns with specific project needs—whether prioritizing automation, accessibility, library coverage, or verification trust—rather than conforming to a single prevailing trend.