Alex Self outlines a multi‑phase process that uses Claude sub‑agents to inject systematic doubt into AI‑generated specs, code, and documentation. By front‑loading scrutiny with specialized agents (such as Assumption Excavator, Gap Analyzer, and Security Analyst), the workflow turns mistrust of LLM output into a repeatable quality gate, while acknowledging token costs and the need to tailor depth to project scope.
Overview
Alex Self’s “automated doubt” workflow is a response to the erosion of confidence that can happen when large language models (LLMs) are given free rein over code, specs, and docs. The core idea is simple: treat every AI‑produced artifact as a hypothesis that must be interrogated from several angles before it is accepted. The approach is organized into three phases—Design, Development, and Wrap‑up—each powered by a suite of Claude sub‑agents that act as specialized auditors.
Phase 1 – Design (Spec First)
- Spec generation – The main Claude instance drafts a specification (PRD, design doc, etc.). The human reviewer skims for 2‑5 minutes to confirm that the high‑level intent is captured.
- Pre‑implementation doubt – A slash‑command in Claude Code launches three agents:
  - Pre‑Implementation Architect – Checks design cohesion, scope, and architectural concerns.
  - Documentation Validator – Flags missing or ambiguous documentation sections.
  - Assumption Excavator – Surfaces hidden assumptions (e.g., mismatched data‑shape expectations).
- Iterative enrichment – Findings are folded back into the spec. For larger scopes, three additional agents run:
  - Gap Analyzer – Finds omitted error‑handling, edge‑cases, or integration points.
  - Implied Completeness Detector – Highlights implicit requirements that are not explicitly stated.
  - Ambiguity Mapper – Spots language that could be interpreted in multiple ways.
- Checklist creation – Once the spec stabilizes, Claude produces a companion checklist that can be used to track progress across sessions.
Typical output: 10‑25 findings for a small feature, up to 35 for a medium‑sized component.
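The design-phase loop above (run the auditors, then fold their findings back into the spec) can be sketched roughly as follows. This is a minimal illustration, not the author's implementation: the `Finding` shape, the `Auditor` function type, and the helper names `runDoubtPass` and `enrichSpec` are all assumptions, and each auditor is modeled as a plain async function rather than a real Claude sub-agent call.

```typescript
// Hypothetical shapes: the article does not say how agents report findings.
type Severity = "low" | "medium" | "high";

interface Finding {
  agent: string; // which auditor raised it
  severity: Severity;
  note: string; // what was flagged
}

// An auditor is modeled as any async function from spec text to findings.
type Auditor = (spec: string) => Promise<Finding[]>;

// Run each auditor against the same spec, one at a time, and merge the results.
// A failing auditor becomes a finding instead of aborting the whole pass.
async function runDoubtPass(
  spec: string,
  auditors: Record<string, Auditor>,
): Promise<Finding[]> {
  const findings: Finding[] = [];
  for (const [name, audit] of Object.entries(auditors)) {
    try {
      findings.push(...(await audit(spec)));
    } catch (err) {
      findings.push({ agent: name, severity: "high", note: `agent failed: ${String(err)}` });
    }
  }
  return findings;
}

// Fold findings back into the spec as a list the author must resolve.
function enrichSpec(spec: string, findings: Finding[]): string {
  if (findings.length === 0) return spec;
  const lines = findings.map((f) => `- [${f.severity}] ${f.agent}: ${f.note}`);
  return `${spec}\n\nOpen findings\n${lines.join("\n")}`;
}
```

Recording a failed auditor as a high-severity finding keeps one timeout from silently shrinking the review. Whether that is the right default depends on how much each re-run costs.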
Phase 2 – Development
- Spec‑driven coding – Claude consumes the final spec and checklist, then writes the implementation. If a session is resumed, a Chain Tracer or Deep Explore sub‑agent reconstructs the current state before proceeding.
- Write‑avoidance policy – Sub‑agents are deliberately not used for direct file writes; a single Claude Code terminal instance performs all mutations. This mitigates the risk of rogue agents corrupting the codebase.
- Post‑implementation doubt – After a build succeeds, a second wave of agents audits the output:
  - Code Validator – Looks for logic errors, missing tracking calls, etc.
  - Type Safety Validator – Enforces static‑type guarantees.
  - Test Architect – Generates or validates test suites.
  - Code Optimizer – Suggests performance or readability improvements.
  - Public Interface Validator – Checks API contracts and documentation.
  - Security Analyst – Flags potential injection points, information leaks, and other security concerns.
- Iterative triage – The first run typically yields 15‑35 findings, with the most severe addressed before a second pass. The loop repeats until the defect count falls below a personal quality threshold.
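The triage loop in the last step might look something like the sketch below. Everything here is hypothetical scaffolding: `audit` stands in for a full post-implementation agent sweep, `fix` stands in for the operator or the main Claude instance resolving a batch, and `maxPasses` and `batchSize` are added assumptions to keep the loop (and token spend) bounded.

```typescript
type Severity = "low" | "medium" | "high";

interface Finding {
  id: string;
  severity: Severity;
  note: string;
}

// Hypothetical hooks supplied by the caller.
interface TriageHooks {
  audit: () => Promise<Finding[]>; // one sweep of the post-implementation agents
  fix: (batch: Finding[]) => Promise<void>; // resolve the given findings
}

// Lower rank sorts first, so high-severity findings are addressed first.
const rank: Record<Severity, number> = { high: 0, medium: 1, low: 2 };

async function triageUntilAcceptable(
  hooks: TriageHooks,
  threshold: number, // stop once fewer than this many findings remain
  maxPasses: number, // hard stop on the number of audit sweeps
  batchSize = 5, // how many of the most severe findings to fix per pass
): Promise<{ passes: number; remaining: Finding[] }> {
  let remaining = await hooks.audit();
  let passes = 1;
  while (remaining.length >= threshold && passes < maxPasses) {
    const worstFirst = [...remaining].sort((a, b) => rank[a.severity] - rank[b.severity]);
    await hooks.fix(worstFirst.slice(0, batchSize));
    remaining = await hooks.audit();
    passes += 1;
  }
  return { passes, remaining };
}
```

The article describes the stopping point as a personal quality threshold. A plain count is the simplest stand-in for that, and a severity-weighted score would be a reasonable alternative.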
Example finding: “Promise.allSettled fires all agents simultaneously with no concurrency limit, risking resource exhaustion and API rate limits.”
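One common way to address a finding like this is to cap how many tasks are in flight at once. The sketch below is a generic pattern, not the fix the author applied: `allSettledWithLimit` is a hypothetical helper that returns results in the same shape as `Promise.allSettled`, but starts at most `limit` tasks at a time.

```typescript
// Run async tasks with at most `limit` in flight. Results come back in input
// order, in the same shape Promise.allSettled would produce.
async function allSettledWithLimit<T>(
  tasks: ReadonlyArray<() => Promise<T>>,
  limit: number,
): Promise<PromiseSettledResult<T>[]> {
  if (!Number.isInteger(limit) || limit < 1) {
    throw new RangeError("limit must be a positive integer");
  }
  const results = new Array<PromiseSettledResult<T>>(tasks.length);
  // A single iterator shared by all workers: each worker pulls the next
  // unclaimed task, so no task runs twice and none is skipped.
  const queue = tasks.entries();

  async function worker(): Promise<void> {
    for (const [index, task] of queue) {
      try {
        results[index] = { status: "fulfilled", value: await task() };
      } catch (reason) {
        results[index] = { status: "rejected", reason };
      }
    }
  }

  const workerCount = Math.min(limit, tasks.length);
  await Promise.all(Array.from({ length: workerCount }, () => worker()));
  return results;
}
```

Tasks are passed as functions rather than promises because a promise has already started by the time it exists. Deferring the call is what lets the helper control when each agent actually fires.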
Phase 3 – Wrap‑up & Ship
When the post‑implementation audit is satisfactory, a Ship workflow runs a final checklist of agents, adding a few that focus on release readiness:
- Anxiety Reader – Detects lingering runtime anxieties such as uncontrolled concurrency.
- API Contract Validator – Ensures the live API matches the spec.
- Release Readiness Validator – Verifies versioning, changelog completeness, and deployment scripts.
The goal is not a perfect, immutable artifact but a state in which the operator feels comfortable releasing, backed by documented evidence from multiple perspectives.
When to Scale the Process
- Small scope – Run only the pre‑implementation agents.
- Medium scope – Add Gap, Implied Completeness, and Ambiguity agents.
- Large scope – Perform a full sweep, possibly looping through the entire suite multiple times.
Token consumption is a practical constraint; each agent invocation burns Claude credits. Projects with tight budgets may limit themselves to a Code Validator and Test Architect, while mission‑critical systems might employ 40+ agents.
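The scaling rules above could be captured in a small lookup like the following. The agent names come from the article, but the grouping into arrays, the `ProjectScope` type, and both function names are illustrative assumptions. "Large" is read here as every agent named in the article, which is only one interpretation of a full sweep.

```typescript
type ProjectScope = "small" | "medium" | "large";

// Agent names are taken from the article; the grouping is illustrative.
const PRE_IMPLEMENTATION = [
  "Pre-Implementation Architect",
  "Documentation Validator",
  "Assumption Excavator",
];
const SPEC_ENRICHMENT = ["Gap Analyzer", "Implied Completeness Detector", "Ambiguity Mapper"];
const POST_IMPLEMENTATION = [
  "Code Validator",
  "Type Safety Validator",
  "Test Architect",
  "Code Optimizer",
  "Public Interface Validator",
  "Security Analyst",
];
const SHIP = ["Anxiety Reader", "API Contract Validator", "Release Readiness Validator"];

// Map a project scope to the agents that would run for it.
function agentsForScope(scope: ProjectScope): string[] {
  switch (scope) {
    case "small":
      return [...PRE_IMPLEMENTATION];
    case "medium":
      return [...PRE_IMPLEMENTATION, ...SPEC_ENRICHMENT];
    case "large":
      return [...PRE_IMPLEMENTATION, ...SPEC_ENRICHMENT, ...POST_IMPLEMENTATION, ...SHIP];
  }
}

// Rough cost proxy: one invocation per agent per pass. Token use per
// invocation is not given in the article, so it is deliberately left out.
function estimateInvocations(scope: ProjectScope, passes: number): number {
  return agentsForScope(scope).length * passes;
}
```

Counting invocations rather than tokens keeps the estimate honest about what is known. A team with real usage data could multiply in a measured per-agent average.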
Key Takeaways
- Automated doubt is a trust‑building signal – Forcing the LLM to justify its output from several angles helps the human operator regain confidence.
- Multiple viewpoints catch different defects – Just as two eyes provide depth perception, diverse agents surface distinct classes of bugs.
- The process is modular – Teams can add or drop agents based on risk tolerance and token budgets.
- Assumption Excavator is universally valuable – Even a single pass with this agent can surface hidden premises that would otherwise slip through.
- All pipelines are open‑source – The full set of agents and Claude Code commands is available at the GitHub repository.
Final Thought
Quality in AI‑assisted development remains a negotiation between the artifact, the agents, and the operator. By front‑loading scrutiny and iterating until the collective sense of “good enough” aligns, developers can turn the uncertainty of LLM output into a repeatable, auditable process.