AI Coding Agents: More Capable, More Expensive, More Dangerous
#AI

DevOps Reporter
4 min read

QCon London keynote reveals how AI coding has evolved from vibe coding to autonomous agents, with rising costs, security risks, and new operational challenges.

In her QCon London keynote, Birgitta Böckeler, Distinguished Engineer for AI-assisted Software Delivery at Thoughtworks, reflected on the changes in the AI coding space over the past year. She emphasised a shift from vibe coding to using autonomous coding agents or swarms of agents. According to her, two major concerns in the field are the worsening security landscape and the rising costs of agent-based development.

From Vibe Coding to Autonomous Agents

Böckeler reminded the audience about the state of AI coding just a year ago: "Vibe coding was just two months old," "MCP was all the rage," and "Claude Code was not even generally available yet." The most significant development over the past year has been context engineering: the curated information a model or agent reads to improve its results.

Last spring, context was limited to a single rules file (such as AGENTS.md or CLAUDE.md) loaded at the start of each session to capture coding conventions and recurring pitfalls. Anthropic has since broken this "monolithic" file down into smaller skills, giving a more granular way to describe coding capabilities. This enables a more pragmatic approach known as "lazy loading," in which different sets of rules are loaded depending on the task at hand. This not only improves organisation but also ensures that the limited context window fills more slowly.
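As a toy illustration of the idea (the keyword matching and skill names here are hypothetical, not Anthropic's actual skill mechanism), lazy loading amounts to pulling in only the rule snippets relevant to the current task:

```python
def load_skills(task: str, skills: dict[str, str]) -> str:
    """Return only the skill snippets whose keyword appears in the task,
    so the limited context window fills more slowly ("lazy loading")."""
    return "\n\n".join(text for kw, text in skills.items() if kw in task.lower())

# Hypothetical skill library: one small snippet per topic
# instead of one monolithic rules file.
skills = {
    "sql": "Use parameterised queries; never interpolate user input.",
    "react": "Prefer function components and hooks.",
}

print(load_skills("Fix the SQL injection in the login form", skills))
# prints only the SQL rule; the React rule stays out of context
```

A real agent would match on richer metadata than substrings, but the effect is the same: unused rules never consume context tokens.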

However, Böckeler pointed out that a "fresh" Claude Code session had already consumed 15% of its context window before any prompt was given. And the tokens are starting to count: code generation that cost around 12 US cents per 100 lines in 2024 now costs around $380 per day in 2026. Annualised, that is $91,200, "a solid developer salary in Germany."
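The annualised figure is simple arithmetic; assuming roughly 240 working days per year (an assumption on our part, though it matches the quoted total):

```python
# Back-of-the-envelope check of the figures quoted in the talk
daily_cost_usd = 380   # per-day cost of agent-based development (2026)
working_days = 240     # assumed working days per year
annual_usd = daily_cost_usd * working_days
print(annual_usd)  # 91200
```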

The Rise of Hands-Off Coding

We are moving closer to "hands-off" coding, as these coding agents can now run unsupervised for up to 20 minutes. Headless CLI modes can directly connect to CI/CD pipelines via GitHub Actions. Some practitioners, following Steve Yegge's "eight stages of dev evolution to AI", run three or more local sessions in parallel; however, Böckeler noted her experience of "typing the wrong thing into the wrong session."

An even more advanced approach involves coding agent swarms. However, she argued that experiments from Cursor or Anthropic, in which C compilers or web browsers were built in a few days by a "team" of coding agents, are somewhat skewed: those tasks are well defined and have extensive public test suites, which is usually not true for enterprise software.

A more accessible entry point is Claude Code's Agent Teams feature, which orchestrates a small number of agents with a clear coordination model.

Security: The Lethal Trifecta

To ensure the appropriate level of supervision, Böckeler proposed a risk framework based on three variables: the probability that the AI will make a mistake, the impact of that mistake, and the detectability of the error. Only the first variable is genuinely novel: developing intuition for how well a tool can handle a given task. The other two are engineering judgments that experienced developers should already possess.
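One way to make the framework concrete, purely as an illustrative sketch (the multiplicative scoring formula below is our assumption, not Böckeler's), is to combine the three variables into a single supervision score:

```python
from dataclasses import dataclass

@dataclass
class TaskRisk:
    p_mistake: float      # probability the agent errs on this task (0..1), the novel variable
    impact: float         # cost if the mistake ships (e.g. 1..10)
    detectability: float  # chance the mistake is caught before it matters (0..1)

    def supervision_score(self) -> float:
        # Higher score -> keep a human closer in the loop.
        return self.p_mistake * self.impact * (1 - self.detectability)

# A risky refactor in poorly tested code scores high:
print(round(TaskRisk(0.4, 8, 0.2).supervision_score(), 2))  # 2.56
# A well-tested, low-impact chore scores low:
print(round(TaskRisk(0.2, 2, 0.9).supervision_score(), 2))  # 0.04
```

The point of her framework is not the formula but the habit: only the first input demands new intuition about the tool; the other two are judgments experienced engineers already make.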

Beyond simply generating functionally incorrect code, security incidents involving coding agents are now occurring weekly, with most rooted in prompt injection. Eleven days before the talk, an attacker used a crafted GitHub issue to extract secrets and upload malicious packages to an NPM registry. This was a direct result of an unsupervised agent operating without sufficient sandboxing.

Simon Willison's "lethal trifecta" frames the problem more precisely: when an agent combines exposure to untrusted content, access to private data, and the ability to communicate externally, the risk becomes significant. For example, an email integration with read-and-send permissions satisfies all three conditions.
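The trifecta is easy to state as a predicate; this sketch (names are illustrative) flags a configuration only when all three legs co-occur, which is why removing any one of them is the standard mitigation:

```python
def lethal_trifecta(untrusted_content: bool, private_data: bool, external_comms: bool) -> bool:
    """True only when all three of Willison's conditions co-occur."""
    return untrusted_content and private_data and external_comms

# An email agent with read-and-send permissions: it reads untrusted mail,
# sees the private inbox, and can send messages out.
print(lethal_trifecta(True, True, True))   # True
# Dropping any one leg, e.g. revoking send permissions, breaks the trifecta:
print(lethal_trifecta(True, True, False))  # False
```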

Böckeler: "Security is not a technical problem; it's a conceptual problem."

The Cost of Progress

In her conclusion, Böckeler noted that while model improvements are real, they are the least interesting developments compared to the shifts in tooling and practices surrounding them. An OpenAI team running a five-month autonomous greenfield project still reported entropy creeping in, despite custom linters and garbage-collection agents.

The main question she posed to the audience was, "What practices will you enforce on your coding agent?" Whether these practices are good or bad, AI coding will amplify them.

Key Takeaways

  • Context engineering has evolved from monolithic rules files to granular, lazy-loaded skills
  • Daily costs for AI coding have risen dramatically, reaching $380/day or $91,200/year
  • Autonomous agents can now run unsupervised for up to 20 minutes
  • Agent swarms show promise but work best on well-defined problems with extensive test suites
  • Security risks have escalated, with weekly incidents involving prompt injection
  • The lethal trifecta (untrusted content + private data + external communication) creates significant vulnerabilities
  • Model improvements matter less than changes in tooling and practices
  • AI coding amplifies existing practices—good or bad

The evolution from vibe coding to autonomous agents represents a fundamental shift in software development, bringing both unprecedented capabilities and new risks that teams must carefully manage.
