Terence Tao reports AI systems autonomously solved Erdős problem #728 while demonstrating unprecedented capacity for iterative proof refinement and exposition generation, signaling a potential shift in mathematical research workflows.
A quiet revolution in mathematical research methodology is unfolding as artificial intelligence systems demonstrate increasingly sophisticated problem-solving capabilities. The latest milestone comes from Fields Medalist Terence Tao, who documented how AI tools autonomously resolved Erdős problem #728—a combinatorial number theory puzzle concerning divisibility conditions in binomial coefficients—after decades of human effort. This achievement extends beyond mere computation: AI demonstrated emergent abilities to rewrite mathematical expositions iteratively, fundamentally altering how proofs might be communicated and refined in the future.
The specific problem (#728) originated in a 1975 paper by Erdős, Graham, Ruzsa, and Strauss but contained ambiguities in its original formulation. Earlier this year, AI tools identified trivial solutions that violated the problem's intent, prompting mathematicians to refine constraints. On January 4, ChatGPT generated a proof under adjusted parameters, which was formalized using the Lean theorem prover by the AI system Aristotle. Subsequent iterations saw ChatGPT rewrite the proof into progressively clearer expositions (example), filling logical gaps and incorporating contextual references absent in initial attempts. Tao notes this output approaches publishable quality despite retaining some unnatural phrasing.
What distinguishes this case is the AI's role in dynamic exposition generation. Unlike traditional mathematical writing—where revising structure demands laborious manual effort—AI allowed rapid recreation of the proof's narrative across multiple versions. Participants could request tailored variants emphasizing different aspects: technical rigor, pedagogical clarity, or connections to prior literature. This capability, paired with Lean's automated verification, enables what Tao describes as a "high-multiplicity conception" of mathematical communication. Researchers might maintain a canonical paper while generating supplementary explanations for specific audiences, like simplified versions for students or field-specific adaptations using discipline-specific terminology.
Nevertheless, significant counterpoints temper enthusiasm. Tao emphasizes this solution isn't wholly novel: similar methods exist in literature, and the problem's obscurity contributed to its unresolved status. The AI's proof required human intervention to correct misinterpretations of the original problem's intent and repair logical errors. Broader concerns emerge from the mathematical community. Some question whether AI-generated expositions dilute mathematical craftsmanship, reducing researchers to prompt engineers rather than creators. As participant Peter H. Fröhlich observes, AI's tendency toward wholesale regeneration rather than localized edits necessitates complete revalidation, introducing unpredictability. Others highlight practical limitations: Noah Snyder notes the solution alone lacks broader context for standalone publication, while Tao suggests it might only merit inclusion in a survey paper alongside related results.
Despite caveats outlined in Tao's GitHub documentation, this case signals tangible progress. The system's ability to synthesize new proofs from ambiguous prompts and dynamically improve their communication reveals a path toward augmenting human expertise. For mathematical research—where refining arguments consumes substantial time—this could accelerate discovery. Yet as with all tools, its value hinges on mindful application: preserving human oversight for critical insight while automating mechanical tasks. The era of AI as a collaborative mathematical partner appears closer than many anticipated.

Comments
Please log in or register to join the discussion