John D. Cook used Claude to test quaternion rotation formulas against Lean, and the model found a mismatch between LaTeX output and Python code before producing a proof that Lean accepted.
John D. Cook received a typo report about his quaternion rotation matrix post, then used Claude Sonnet 4.6 Medium to test whether a proof assistant workflow could catch the same error.
Cook asked Claude to write Lean code for the two theorems in his post on converting between quaternions and rotation matrices. The post had Python code for numeric checks and LaTeX output for formulas. One subscript differed between them.
{{IMAGE:1}}
Claude compared the sources and treated the Python implementation as the ground truth. That mattered because Cook had asked for a proof of the SVG formulas, and the SVG came from the LaTeX with the typo. Claude flagged the row 1, column 2 entry and chose 2*(q2*q3 - q0*q1) from the Python code instead of the alt text’s 2(q_1 q_3 - q_0 q_1).
That choice shows the practical value of mixing code, symbolic math, and proof tools. A blog post can contain prose, rendered equations, alt text, and executable code. A reader might trust the equation image because it looks like math. A model that compares the sources can catch a conflict that a single representation hides.
Cook still had to steer the process. Claude’s first Lean 4 attempt failed. Cook pasted error messages back into the model, and after four rounds, Claude produced code that worked with Mathlib. The final proof covered column norms, column orthogonality, row norms, row orthogonality, four Chiaverini-Siciliano square-root arguments, and three sign-correction identities.
The Lean proof did more than repeat the Python test. Python checked sample values and gave Cook confidence that the formulas behaved as expected. Lean forced each algebraic identity to pass under the stated unit-quaternion assumption, q0^2 + q1^2 + q2^2 + q3^2 = 1. Cook used nlinarith, linear_combination, linarith, and ring to discharge the polynomial equalities.
The episode points to a useful workflow for technical writing. Authors who publish formulas in LaTeX and implement them in Python risk drift between the two versions. Cook already reduces that risk by generating one representation from another, such as Mathematica’s TeXForm[]. Lean adds a stricter check: the author must state the theorem, name the assumptions, and prove that the rendered identities follow.
The counterpoint concerns trust. Claude helped identify the typo, but Cook still verified the result through Lean. The model supplied candidate code and interpretations. The proof assistant checked the algebra. That split gives the workflow its strength: the model helps with translation and repair, while Lean enforces the final standard.
Comments
Please log in or register to join the discussion