Drew Breunig's 'whenwords' experiment challenges traditional software distribution by shipping only tests and specifications, letting coding agents generate implementations on demand.
Software libraries have always followed a predictable pattern: write code, package it, distribute it. Drew Breunig's recent experiment flips this model entirely: his time-formatting library, whenwords, contains zero lines of implementation code.
Instead, it ships with three artifacts: a detailed specification document, an AGENTS.md file that instructs coding agents on how to build the library, and a comprehensive YAML file of conformance tests. That's it. When you need the library, you hand these files to a coding agent like Cursor, Aider, or Claude Code, specify your target language, and the agent writes the entire implementation for you.
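Concretely, the distributable might look something like the sketch below (AGENTS.md is confirmed by Breunig's description; the other file names are illustrative, not necessarily those in his repo):

```
whenwords/
├── SPEC.md       # prose specification: behavior, formats, design rationale
├── AGENTS.md     # context and instructions for the coding agent
└── tests.yaml    # conformance cases any implementation must pass
```

The prompt to the agent then amounts to: read the spec and AGENTS.md, implement the library in the target language, and iterate until every case in the YAML file passes.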
What's Actually New Here
The core innovation isn't the time-formatting concept itself—converting timestamps to human-readable strings like "3 hours ago" is a solved problem with dozens of existing libraries across every language. What's novel is the distribution mechanism: a library as a specification rather than code.
Breunig's approach treats the specification as the source of truth. The YAML test suite defines expected behavior across dozens of edge cases: different time intervals, timezone handling, locale variations, and boundary conditions. The AGENTS.md file provides context about the library's purpose, design philosophy, and constraints. Together, these artifacts give an AI coding agent enough information to generate a correct implementation.
This represents a shift from "write once, run anywhere" to "specify once, generate anywhere." The same specification can produce Python, JavaScript, Rust, or Go implementations without manual porting.
The Conformance Test Breakthrough
The key enabler is the quality of the test suite. Breunig's YAML file doesn't just contain simple unit tests—it captures nuanced behavior that would be difficult to infer from requirements alone. For example:
- How should the library handle ambiguous time periods? (Is "1 month" 30 days or a calendar month?)
- What precision is appropriate for different intervals? (Should "just now" apply to 5 seconds or 30?)
- How does it handle edge cases like leap years, daylight saving transitions, or irregular calendar systems?
By encoding these decisions in test cases, the specification becomes executable documentation. The coding agent doesn't need to guess what "correct" means—it has a concrete target to hit.
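A hypothetical slice of such a suite (field names invented for illustration; Breunig's actual schema may differ) shows how those judgment calls get pinned down:

```yaml
# Hypothetical conformance cases; field names are illustrative, not Breunig's schema.
- name: just_now_upper_bound
  input_seconds: 9
  expect: "just now"

- name: ten_seconds_is_not_just_now
  input_seconds: 10
  expect: "10 seconds ago"

- name: month_is_thirty_days
  input_seconds: 2592000    # 30 * 86400: "1 month" means 30 days here
  expect: "1 month ago"
```

Whether ten seconds still counts as "just now" stops being a matter of taste and becomes a test that passes or fails.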
This approach mirrors how modern AI systems learn: show them examples of desired behavior rather than telling them what to do. The test suite functions as a few-shot learning prompt for the coding agent.
Limitations and Trade-offs
The zero-code model has clear constraints:
Language maturity matters. The approach works best for languages with well-established libraries and patterns that agents have seen frequently in training data. Generating a correct Python implementation is easier than producing idiomatic Haskell or cutting-edge Zig code.
Specification quality is everything. A vague or incomplete spec leads to wrong implementations. Breunig's success depends on having tests that cover real-world usage patterns, not just happy paths. The spec needs to capture architectural decisions, performance expectations, and error handling philosophy.
Debugging gets harder. When the agent generates buggy code, you're debugging both the implementation and the specification. You need strong test coverage to distinguish between "wrong spec" and "wrong implementation."
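One way to keep those two failure modes separable is a language-agnostic runner that executes the same YAML suite against whichever implementation the agent produced. A minimal sketch, assuming the hypothetical schema above and a generated module exposing a `timeago(seconds)` function (both names are assumptions, not Breunig's API):

```python
import importlib
import sys

import yaml  # PyYAML


def run_suite(suite_path: str, module_name: str) -> int:
    """Run every conformance case against a generated implementation."""
    with open(suite_path) as f:
        cases = yaml.safe_load(f)

    impl = importlib.import_module(module_name)  # e.g. the agent's output
    failures = 0
    for case in cases:
        got = impl.timeago(case["input_seconds"])
        if got != case["expect"]:
            failures += 1
            print(f"FAIL {case['name']}: expected {case['expect']!r}, got {got!r}")
    print(f"{len(cases) - failures}/{len(cases)} cases passed")
    return failures


if __name__ == "__main__":
    # usage: python check.py <generated_module_name>
    sys.exit(1 if run_suite("tests.yaml", sys.argv[1]) else 0)
```

If a case fails but reads as obviously correct, the generated code is at fault; if the case itself looks wrong on inspection, the spec is the bug. The suite stays fixed while implementations are regenerated.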
Version control changes. Instead of tracking code changes, you're tracking spec changes. This requires different tooling and workflows. The spec becomes the API you maintain.
Practical Applications
This pattern extends beyond time formatting. Consider:
- API clients: Specify endpoints, authentication, and response schemas; generate clients in any language
- Data validation: Define schemas and constraints; generate validators for different runtimes (sketched after this list)
- Configuration parsers: Specify config file formats; generate parsers for various languages
- Protocol implementations: Define wire protocols; generate serialization/deserialization code
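To make the data-validation case concrete, the same shape of suite works there too. A hedged sketch, with the schema and field names invented for illustration:

```yaml
# Hypothetical conformance cases for a generated email-field validator.
- name: accepts_plain_address
  field: email
  value: "ada@example.com"
  expect: valid

- name: rejects_missing_domain
  field: email
  value: "ada@"
  expect: invalid
```

One such spec could drive a Python validator on the backend and a TypeScript one in the browser, with a single suite keeping them in agreement.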
The model works best for libraries that are:
- Well-understood with established patterns
- Primarily glue code or data transformation
- Not performance-critical at the implementation level
- Used across multiple languages
The Broader Pattern
Breunig's experiment aligns with a larger trend we're seeing in AI-assisted development: moving up the abstraction ladder. Instead of writing functions, developers write specifications. Instead of debugging code, they validate tests.
This isn't about replacing developers—it's about changing what we optimize for. Writing good specifications requires deep domain knowledge and careful thinking about edge cases. Writing code requires attention to syntax and language-specific patterns. The former is harder to automate than the latter.
The conformance suite approach also creates a valuable byproduct: language-agnostic behavioral documentation. Even if you never use a coding agent, the YAML tests serve as comprehensive requirements that any developer can read and understand.
What Comes Next
The real test is whether this pattern scales beyond experiments. We need:
- Better tooling for writing and maintaining specification files
- Standardized formats for conformance tests that agents can reliably parse
- Verification systems that can validate generated code against specs automatically
- Community practices for sharing and evolving specifications
Breunig's whenwords library is available on GitHub as a proof of concept. It's worth studying the actual YAML test file to see how much detail is required to generate reliable code. The specification is the hard part—everything else is becoming infrastructure.
This experiment suggests a future where libraries are distributed as executable specifications, and the "implementation" is just a generated artifact. For certain types of problems, that future might already be here.
