Literate programming, once a niche approach to software development, may find new relevance as AI coding agents eliminate its traditional maintenance burdens and transform how we document and understand code.
Literate programming, conceived by Donald Knuth in the 1980s, represents a radical reimagining of how we write and understand software. The core idea is simple yet profound: instead of writing code that humans must decipher, we write prose that explains the code, with the code itself embedded as examples. The narrative flows naturally, and the code blocks serve as illustrations of the concepts being explained. For decades, this approach has remained largely theoretical, practiced by enthusiasts but rarely adopted in mainstream software development. The fundamental challenge has always been the same: maintaining two parallel narratives—the prose explanation and the actual code—creates a maintenance burden that most developers find untenable. Every change to the code requires corresponding updates to the prose, and vice versa, leading to a synchronization problem that grows with the size and complexity of the codebase.
Historically, literate programming has found its most practical applications in data science through Jupyter notebooks, where explanations live alongside calculations and their results in a web browser. Within the Emacs ecosystem, Org Mode's org-babel package has enabled polyglot literate programming, allowing execution of arbitrary languages with results captured back into the document. However, even for enthusiasts like myself, using Org Mode as the source of truth for larger software projects has proven cumbersome. The source code becomes a compiled output, requiring extraction after every edit—a process known as "tangling" in Org Mode parlance. While automation is possible, it's easy to get into situations where manual edits to the real source get overwritten on the next tangle operation.
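For readers unfamiliar with the mechanics, a minimal literate Org file might look something like this (a sketch; the file name and contents are illustrative):

```org
* Greeting script
The block below is written out to =hello.sh= whenever the
document is tangled.

#+begin_src shell :tangle hello.sh :shebang "#!/bin/sh"
echo "Hello from a literate program"
#+end_src
```

Running `org-babel-tangle` extracts the block into `hello.sh`. The hazard described above follows directly: any edit made to `hello.sh` itself is overwritten on the next tangle unless it is carried back into the Org source.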
Yet I've had enough success using literate programming for personal configuration and documentation that I've never been able to fully abandon the idea. Before the advent of AI coding agents, I had been adapting Org Mode patterns for manual testing and note-taking. Instead of working on the command line, I would write commands directly into my editor and execute them there, editing them in place until each step was correct. When I was done, I had a document recording exactly the steps that were taken, with no separate note-taking required: running the tests gave me the documentation for free.
This is where the emergence of AI coding agents transforms the equation entirely. Claude, Kimi, and other large language models have demonstrated remarkable proficiency with Org Mode syntax, a forgiving markup language that these models handle exceptionally well. All the documentation is available online and was likely included in their training data. While Org Mode's extensive syntax is a significant downside for humans, it poses no problem for a language model. Now, when I want to test a feature, I ask my coding agent to write me a runbook in Org Mode. I can review it: the prose captures the model's understanding of the intent behind each step, and once I've finished reviewing, the code blocks are interactively executable, either one at a time or as a whole file, like a script. The results are stored in the document under the code, just like in a Jupyter notebook.
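A runbook in this style might look like the following sketch (the step, command, and result are hypothetical):

```org
* Verify the service responds
After deployment, the health endpoint should return =ok=.

#+begin_src shell :results output
curl -s https://example.com/health
#+end_src

#+RESULTS:
: ok
```

Pressing C-c C-c on a block executes it and refreshes its #+RESULTS: section in place, which is what gives the document its Jupyter-like record of what actually happened.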
The transformative aspect is that I can edit the prose and ask the model to update the code, or edit the code and have the model reflect the meaning in the text. I can even ask the agent to change both simultaneously. The problem of maintaining parallel narratives disappears, and because the agent handles tangling automatically, the extraction problem goes away with it. By instructing the agent with an AGENTS.md file to treat the Org Mode file as the source of truth, to always explain in prose what is going on, and to tangle before execution, we delegate the tedious aspects of literate programming to tireless machines. The agent excels at all of these tasks and never gets tired of re-explaining something in prose after a tweak to the code.
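The instructions given to the agent can be quite short. Something along these lines works as a starting point (a sketch of an AGENTS.md, not a verbatim file):

```markdown
# Literate workflow

- The Org file (e.g. `runbook.org`) is the source of truth; never
  edit tangled output directly.
- Every source block must be preceded by prose explaining its intent.
- Tangle the Org file before executing any extracted code.
- After changing a code block, update the surrounding prose to match,
  and vice versa.
```

The last rule is the one that matters most: it makes keeping the two narratives in sync the agent's standing obligation rather than the human's.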
The fundamental extra labor that has historically limited literate programming's adoption is eliminated by AI agents, and this approach leverages capabilities that large language models are best at: translation and summarization. As a benefit, the codebase can now be exported into many formats for comfortable reading. This becomes especially important if the primary role of engineers is indeed shifting from writing to reading code.
I suspect that literate programming will also improve the quality of generated code, because the prose explaining the intent of each code block appears in context alongside the code itself. When an AI must explain its reasoning in natural language before producing code, it forces a level of clarity and intentionality that might otherwise be absent. The act of writing prose about what the code should do serves as a form of design documentation that emerges organically from the development process.
So far, I've only been using this workflow for testing and documenting manual processes, but I'm thrilled by its application there. I recognize that the Org format is a limiting factor due to its tight integration with Emacs. However, I've long believed that Org should escape Emacs. Markdown would be a natural alternative, but it lacks the ability to include metadata. Org Mode has concepts like properties that allow acting on the document programmatically from Emacs Lisp. In the past, I was often tempted to fumble around with Lisp to build some imagined interactive feature into a document, and in practice I never found the time. Now the LLM will happily insert some Emacs Lisp into the file's local variables section, adding bespoke functionality for that particular interactive document.
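As a concrete illustration, the kind of snippet an agent can drop into a document's local-variables section might look like this (tangle-on-save is a common Org idiom; treat the exact hook as a sketch):

```org
# Local Variables:
# eval: (add-hook 'after-save-hook #'org-babel-tangle nil t)
# End:
```

With this in place, saving the document re-tangles it automatically, so the extracted code never drifts behind the prose.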
The lack of metadata in Markdown also means there's nowhere to store information about the code blocks that would be extracted from a literate document. Org Mode provides header arguments that can be applied to source code blocks, instructing the machine about execution details such as where the code should run, which might even be a remote machine.
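Header arguments make such details explicit in the document itself. For example, org-babel can run a block on a remote host over TRAMP via the =:dir= argument (the hostname here is illustrative):

```org
#+begin_src shell :dir /ssh:build-server: :results output
uname -a
#+end_src
```

The same mechanism covers tangle targets, result handling, session reuse, and more, all recorded as metadata right next to the code it governs.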
As usual in my posts about Emacs, it's not the specific implementation that excites me, in this case Org's take on literate programming. It is the idea itself that is exciting, not the tool. With AI agents, does it become practical to have large codebases that can be read like a narrative, whose prose is kept in sync with the code by tireless machines? I think that's a compelling question worth exploring. The convergence of literate programming principles with AI capabilities might finally deliver on the promise that Knuth envisioned nearly four decades ago: code that can be read and understood as a coherent narrative, not just executed as a series of instructions.
