Antirez shares his experience using Claude Code to build a Z80 and ZX Spectrum emulator from scratch, demonstrating how AI coding agents can create original implementations when properly guided with documentation and clear rules.
Antirez recently conducted an interesting experiment in "clean room" software development, using Claude Code to build a Z80 CPU emulator and ZX Spectrum computer emulator from scratch. His approach offers valuable insights into how AI coding agents can be effectively used for complex programming tasks.
The experiment was partly inspired by Anthropic's blog post about using their Opus 4.6 model to write a C compiler in Rust. Antirez raised several questions about that approach: why use Rust for a C compiler (which is essentially graph manipulation), why not provide ISA documentation, and why so little steering? He believed a better "clean room" setup would give the agent all relevant documentation while still prohibiting internet access and the copying of existing source code.
The Process
Antirez's process had four distinct phases:
Specification Phase: He created detailed markdown files describing what he wanted to build, including specific requirements like executing whole instructions at a time (for embedded compatibility), tracking clock cycles, providing memory access callbacks, and emulating all official and unofficial Z80 instructions. For the Spectrum implementation, he included extensive details about rendering, memory usage constraints, and I/O interactions.
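To make the specification requirements concrete, here is a minimal sketch of what a core with whole-instruction stepping, cycle tracking, and memory access callbacks could look like. All names (`z80`, `z80_step`, `flat_read`, `flat_write`) are hypothetical and are not taken from Antirez's repository; only three opcodes are handled, purely for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical API sketch; not the interface of the actual project. */
typedef uint8_t (*z80_read_fn)(void *ctx, uint16_t addr);
typedef void (*z80_write_fn)(void *ctx, uint16_t addr, uint8_t value);

typedef struct {
    uint16_t pc;            /* program counter */
    uint8_t a;              /* accumulator (other registers omitted) */
    uint64_t cycles;        /* running total of T-states */
    z80_read_fn mem_read;   /* host-supplied memory read callback */
    z80_write_fn mem_write; /* host-supplied memory write callback */
    void *ctx;              /* opaque pointer passed back to callbacks */
} z80;

/* Execute exactly one whole instruction and return the T-states it took. */
int z80_step(z80 *cpu) {
    assert(cpu && cpu->mem_read && cpu->mem_write);
    uint8_t op = cpu->mem_read(cpu->ctx, cpu->pc++);
    int t;
    switch (op) {
    case 0x00: /* NOP */
        t = 4;
        break;
    case 0x3E: /* LD A,n */
        cpu->a = cpu->mem_read(cpu->ctx, cpu->pc++);
        t = 7;
        break;
    case 0x32: { /* LD (nn),A */
        uint16_t lo = cpu->mem_read(cpu->ctx, cpu->pc++);
        uint16_t hi = cpu->mem_read(cpu->ctx, cpu->pc++);
        cpu->mem_write(cpu->ctx, (uint16_t)(lo | (hi << 8)), cpu->a);
        t = 13;
        break;
    }
    default: /* not implemented in this sketch: treated like NOP */
        t = 4;
        break;
    }
    cpu->cycles += (uint64_t)t;
    return t;
}

/* Example callbacks for a flat 64 KB RAM, where ctx points at the array. */
static uint8_t flat_read(void *ctx, uint16_t addr) {
    return ((uint8_t *)ctx)[addr];
}

static void flat_write(void *ctx, uint16_t addr, uint8_t value) {
    ((uint8_t *)ctx)[addr] = value;
}
```

Because the step function never stops mid-instruction, a host on a small device can call it in a simple loop and use the returned T-state count to schedule video and audio work, while the callbacks let the host decide how memory is mapped.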
Documentation Gathering: He started a Claude Code session to fetch all useful documentation about the Z80 and Spectrum from the internet, extracting only factual information into markdown files. He also provided binary test vectors and ROMs. Crucially, he then completely removed this session to prevent any contamination from source code seen during the search.
Implementation Phase: Starting fresh, he asked Claude Code to implement the emulator following strict rules: no internet access, no searching for similar source code, continuous testing, detailed commenting, and maintaining a work-in-progress log.
Verification: As a final step, he copied the repository to a clean environment and asked both Claude Code and Codex to check for copyright issues by comparing it against major Z80 implementations. Neither agent found evidence of copying: the implementation used established emulation patterns but was distinct from existing codebases.
Results
The Z80 emulator was completed in 20-30 minutes, producing 1,200 lines of readable C code (1,800 counting comments and whitespace) that passed both the ZEXDOC and ZEXALL test suites. The agent worked autonomously, without any prompting during implementation, and developed incrementally: it implemented one class of instructions at a time, tested, and fixed bugs through integration tests and debugging sessions.
For the ZX Spectrum implementation, the process was similar but included more steering, particularly for TAP file loading. The agent demonstrated impressive versatility, writing instrumentation code to observe the Z80's behavior, implementing SDL integration, and handling complex timing issues for cassette loading. The final emulator could run games like Jetpac with working sound and minimal CPU usage.
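The cassette timing problem can be illustrated with a short sketch. It assumes the publicly documented TAP container layout (a 2-byte little-endian length followed by flag byte, payload, and XOR checksum) and the standard ROM loader pulse lengths in T-states; whether the actual emulator uses these exact constants or this structure is an assumption, and all function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Standard ROM loader timings, in T-states per pulse (assumed here). */
enum {
    PILOT_PULSE = 2168,
    SYNC1_PULSE = 667,
    SYNC2_PULSE = 735,
    BIT0_PULSE = 855,          /* a 0 bit is two pulses of this length */
    BIT1_PULSE = 1710,         /* a 1 bit is two pulses of this length */
    PILOT_COUNT_HEADER = 8063, /* pilot pulses when flag byte < 128 */
    PILOT_COUNT_DATA = 3223    /* pilot pulses otherwise */
};

typedef struct {
    const uint8_t *data; /* flag byte, payload, checksum */
    size_t len;
} tap_block;

/* Read the block starting at *off. Returns 1 and advances *off on success,
 * 0 when no complete block remains. */
int tap_next_block(const uint8_t *tap, size_t tap_len, size_t *off,
                   tap_block *out) {
    assert(tap && off && out);
    if (*off > tap_len || tap_len - *off < 2) return 0;
    size_t n = (size_t)tap[*off] | ((size_t)tap[*off + 1] << 8);
    if (tap_len - *off - 2 < n) return 0;
    out->data = tap + *off + 2;
    out->len = n;
    *off += 2 + n;
    return 1;
}

/* XOR over flag, payload, and checksum is zero for a valid block. */
int tap_checksum_ok(const tap_block *b) {
    uint8_t x = 0;
    for (size_t i = 0; i < b->len; i++) x ^= b->data[i];
    return b->len >= 2 && x == 0;
}

/* Total T-states needed to play one block as pilot, sync, and data pulses. */
uint64_t tap_block_tstates(const tap_block *b) {
    if (b->len == 0) return 0;
    uint64_t pilot =
        (b->data[0] < 128) ? PILOT_COUNT_HEADER : PILOT_COUNT_DATA;
    uint64_t t = pilot * PILOT_PULSE + SYNC1_PULSE + SYNC2_PULSE;
    for (size_t i = 0; i < b->len; i++) {
        for (int bit = 7; bit >= 0; bit--) { /* most significant bit first */
            int one = (b->data[i] >> bit) & 1;
            t += 2u * (one ? BIT1_PULSE : BIT0_PULSE);
        }
    }
    return t;
}
```

An emulator that feeds these pulses to the EAR input in step with the CPU's T-state counter lets the unmodified ROM loading routine run as it would on real hardware, which is why cycle-accurate instruction timing matters for tape loading.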
Key Insights
Antirez draws several important conclusions from his experiment:
Documentation is crucial: Always provide agents with design hints and extensive documentation about the task. Such documentation can be obtained by the agent itself, but should be provided upfront for the implementation phase.
Clear rules matter: A markdown file with coding rules and a work-in-progress log that's frequently updated and read helps keep the agent on track and prevents forgetting important details during context compaction.
Think like a human programmer: The agent's process of incremental development, testing, and debugging mirrors how human programmers work. It's not about uncompressing a complete implementation from training data, but rather assembling different knowledge domains to create something new.
LLMs don't memorize and copy: The experiment contradicts the idea that LLMs simply memorize and uncompress training data. While they can reproduce over-represented documents when prompted to, in normal operation they write new code from known techniques and patterns, and the result is not a copy of existing code.
Human coding isn't "clean room" either: Humans often download and study multiple implementations before creating their own, taking inspiration without copying verbatim. This cross-pollination effect has been crucial to software development's rapid evolution.
Technical Details
The implementation includes several thoughtful design choices:
- Memory efficiency: No large lookup tables for ULA/Z80 contention, ROM referenced rather than copied to save 16KB
- Embedded focus: Optional framebuffer rendering, designed for devices like RP2350
- Comprehensive testing: Integration with ZEXDOC/ZEXALL test suites, custom test binaries
- CP/M support: Automatic detection and implementation of CP/M syscalls from test binaries
- TAP loading: Accurate emulation of Spectrum cassette loading with PWM encoding
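The CP/M support in the list above can be sketched as a small host-side trap. ZEXDOC and ZEXALL are CP/M programs that report progress by calling the BDOS entry point at address 0x0005 with a function number in register C: function 2 prints the character in E, and function 9 prints the string at DE up to a `$` terminator. The sketch below assumes the host intercepts that call and passes in the register values; the names `console` and `cpm_bdos_call` are hypothetical, and how the actual project detects and handles these calls is not shown in the article.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical capture buffer standing in for the terminal. */
typedef struct {
    char buf[256];
    size_t len;
} console;

static void console_put(console *con, char ch) {
    if (con->len + 1 < sizeof con->buf) {
        con->buf[con->len++] = ch;
        con->buf[con->len] = '\0';
    }
}

/* Handle a BDOS call given register C, register pair DE, and a pointer to
 * the full 64 KB address space. Returns 1 if the function was handled,
 * 0 for functions this sketch does not implement. */
int cpm_bdos_call(uint8_t c, uint16_t de, const uint8_t *mem, console *con) {
    assert(mem && con);
    switch (c) {
    case 2: /* console output: character in E (low byte of DE) */
        console_put(con, (char)(de & 0xFF));
        return 1;
    case 9: /* print string at DE, terminated by '$' */
        for (uint32_t n = 0; n < 65536 && mem[de] != '$'; n++, de++)
            console_put(con, (char)mem[de]);
        return 1;
    default:
        return 0;
    }
}
```

With a trap like this, the test binaries can be loaded at 0x0100 and run on the bare CPU core without emulating the rest of CP/M.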
Future Directions
Antirez suggests an interesting follow-up experiment: implementing the same emulators without providing any documentation to the agent, then comparing the results. This could provide further insights into how much documentation improves AI-generated code quality.
The complete implementation is available on GitHub under an MIT license, which Antirez feels comfortable releasing because the code is original and represents quality training data for future LLMs, including open-weight models.
This experiment demonstrates that AI coding agents, when properly guided with documentation and clear rules, can produce high-quality, original implementations of complex systems. It's not about replacing human programmers but rather augmenting their capabilities with tools that can rapidly assemble and test different programming techniques and knowledge domains.