Critical Billing Bypass Found in VS Code Copilot Chat
#Vulnerabilities

Startups Reporter

Microsoft's VS Code Copilot Chat contains a severe billing bypass vulnerability allowing unlimited free premium model usage through subagent manipulation.

A critical vulnerability has been discovered in Microsoft's VS Code Copilot Chat extension that allows users to bypass billing entirely and access premium AI models for free through a clever combination of subagents and tool calls.

The Vulnerability Explained

The issue, documented in GitHub issue #292452, exploits several key behaviors in Copilot's billing system:

  1. Subagents and tool calls don't consume premium requests - These operations are treated as free infrastructure
  2. Request cost calculated on initial model only - The billing system only looks at the first model used
  3. Free models available in Copilot - GPT-5 mini, GPT-4.1, and others are included at no cost
  4. Agent definitions can specify models - Users can define which model a subagent should use

By combining these elements correctly, users can effectively get unlimited free usage of expensive premium models like Claude Opus 4.5, which normally costs "3 premium requests" per use.

How the Attack Works

Here's the step-by-step process:

  1. Start a new chat session using a free model like GPT-5 mini
  2. Create an agent configuration that specifies a premium model (e.g., Claude Opus 4.5)
  3. Set the chat mode to "agent"
  4. In the initial message, instruct the agent to launch a subagent using the runSubagent tool
  5. The free model handles the initial request (no cost)
  6. The free model creates a subagent (still free)
  7. The subagent launches with the premium model configuration
  8. The premium model processes the request, but no premium requests are consumed

The billing system only sees the initial free model usage and completely ignores the expensive premium model that actually processed the user's query.
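
The steps above can be modeled with a small toy simulation. This is only a sketch of the behavior the article describes, not Copilot's actual billing code: the `Session` class, the `flawed_bill` and `actual_cost` functions, and the model identifier strings are all hypothetical. The only values taken from the article are the costs (0 for a free model, 3 premium requests for Opus 4.5).

```python
from dataclasses import dataclass, field

# Costs from the article: free models cost 0, Opus 4.5 costs 3 premium requests.
# The model identifier strings are illustrative, not real API names.
PREMIUM_COST = {"gpt-5-mini": 0, "claude-opus-4.5": 3}


@dataclass
class Session:
    """Hypothetical chat session that records every model that actually ran."""
    initial_model: str
    inferences: list[str] = field(default_factory=list)

    def run(self, model: str) -> None:
        """Record one inference, whether from the main chat or a subagent."""
        self.inferences.append(model)


def flawed_bill(session: Session) -> int:
    """Charge for the initial model only, as the article describes."""
    return PREMIUM_COST[session.initial_model]


def actual_cost(session: Session) -> int:
    """What the session would cost if every inference were metered."""
    return sum(PREMIUM_COST[m] for m in session.inferences)


session = Session(initial_model="gpt-5-mini")
session.run("gpt-5-mini")        # free model handles the first message
session.run("claude-opus-4.5")   # subagent runs on the premium model

print(flawed_bill(session))  # 0
print(actual_cost(session))  # 3
```

The gap between the two numbers is the bypass: the charge is keyed to the session's starting model, while the cost is driven by the models that actually ran.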

Real-World Impact

In testing, this vulnerability allowed a single message to trigger a 3+ hour process that launched hundreds of Opus 4.5 subagents to process hundreds of files, consuming only 3 premium credits total. Without intervention, the process would have continued indefinitely.
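
To give a rough sense of scale, the arithmetic can be worked through under stated assumptions. The article says only "hundreds" of subagents, so the count of 300 below is a hypothetical stand-in, and the sketch further assumes that each subagent run would normally count as one Opus 4.5 use and that a "premium credit" equals a "premium request".

```python
OPUS_COST = 3        # premium requests per Opus 4.5 use, per the article
CHARGED = 3          # premium credits actually consumed, per the article
SUBAGENT_RUNS = 300  # ASSUMPTION: illustrative stand-in for "hundreds"

expected = SUBAGENT_RUNS * OPUS_COST  # what per-inference metering would charge
unbilled = expected - CHARGED

print(expected, unbilled)                              # 900 897
print(f"{unbilled / expected:.1%} of usage unbilled")  # 99.7% of usage unbilled
```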

Additional Abuse Vectors

The reporter also identified a second, more complex attack vector:

  • Set chat.agent.maxRequests to a very high value
  • Use a premium model as the initial chat model
  • Create a custom script that the model is instructed to call repeatedly
  • Craft prompts that create infinite loops

This approach requires more effort but could theoretically allow unlimited premium model invocations beyond the initial message cost.

Security Implications

The vulnerability represents a significant security and business risk for Microsoft:

  • Revenue loss: Users can access expensive premium models without paying
  • Resource exhaustion: Attackers could potentially consume substantial compute resources
  • System abuse: The ability to create infinite loops could be used for denial-of-service attacks
  • Trust erosion: Billing bypass vulnerabilities severely damage user confidence

Microsoft's Response

The Microsoft Security Response Center (MSRC) initially declined to treat this as a security issue, stating that billing bypasses fall outside its scope and instructing the reporter to file a public bug report instead. This decision has drawn criticism from the security community.

Microsoft has since marked the issue as "not planned" for resolution, though community members have disputed this classification and called for the issue to be reopened.

Technical Analysis

The vulnerability appears to stem from gaps in server-side metering and entitlement enforcement. The billing system tracks only the initial model selection rather than following the actual inference path through subagents and tool calls.

Key technical observations:

  • Message "types" are declared on the client side, suggesting the API does not validate them
  • The system trusts client-provided fields like agent configuration model selection
  • There's no server-side enforcement of entitlement checks on every inference
  • No caps exist on per-message tool calls or per-session requests

The community has proposed several mitigation strategies:

  1. Meter per actual inference - Track billing based on the resolved model used at dispatch time
  2. Treat subagent calls as billable - Any path that results in an LLM call should be metered
  3. Move enforcement to backend - Don't trust client-provided configuration
  4. Implement server-side caps - Limit per-message tool calls, per-session requests, and wall-clock time
  5. Add integration tests - Verify billing occurs correctly when subagents use premium models
  6. Add loop detection - Prevent single messages from triggering unlimited repeated calls
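
A minimal sketch of how the first few proposals might fit together is shown below. It is a toy model under stated assumptions, not a description of Copilot's backend: the `Meter` class, the two exception types, and the cap of 50 inferences per message are all hypothetical. The idea it illustrates is charging at dispatch time against the resolved model, so a subagent call is billed like any other inference.

```python
from dataclasses import dataclass

# Costs from the article; the model identifier strings are illustrative.
PREMIUM_COST = {"gpt-5-mini": 0, "claude-opus-4.5": 3}
MAX_INFERENCES_PER_MESSAGE = 50  # ASSUMPTION: hypothetical server-side cap


class QuotaExceeded(Exception):
    """Raised when the user's premium allowance cannot cover an inference."""


class CapExceeded(Exception):
    """Raised when one message has triggered too many inferences."""


@dataclass
class Meter:
    remaining: int       # premium requests left in the user's allowance
    inferences: int = 0  # inferences dispatched for the current message

    def dispatch(self, resolved_model: str) -> None:
        """Check and charge at dispatch time, using the model that will actually run."""
        if self.inferences >= MAX_INFERENCES_PER_MESSAGE:
            raise CapExceeded("per-message inference cap reached")
        cost = PREMIUM_COST[resolved_model]  # unknown models raise KeyError
        if cost > self.remaining:
            raise QuotaExceeded(f"{resolved_model} needs {cost}, {self.remaining} left")
        self.remaining -= cost
        self.inferences += 1


meter = Meter(remaining=7)
meter.dispatch("gpt-5-mini")       # main chat, free
meter.dispatch("claude-opus-4.5")  # subagent, charged 3
meter.dispatch("claude-opus-4.5")  # subagent, charged 3
print(meter.remaining)  # 1

try:
    meter.dispatch("claude-opus-4.5")
except QuotaExceeded as exc:
    print("blocked:", exc)  # blocked: claude-opus-4.5 needs 3, 1 left
```

Because the check runs on the server for every dispatch, the model named in a client-supplied agent configuration only matters once it has been resolved, and the per-message cap bounds runaway loops even when every call is free.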

Industry Context

This vulnerability highlights the complex security challenges in AI-powered development tools. As AI assistants become more sophisticated with agent-based architectures, billing systems must evolve to track usage across complex execution paths rather than simple linear conversations.

For developers and organizations using VS Code Copilot Chat, this vulnerability represents both a potential cost-saving opportunity and a significant security concern that could lead to unexpected resource consumption or service disruption.

The incident also raises questions about how technology companies classify and prioritize billing-related vulnerabilities, particularly when they involve AI services where the cost structure is more complex than traditional SaaS billing.

Conclusion

This billing bypass vulnerability in VS Code Copilot Chat represents a serious flaw that could allow unlimited access to premium AI models at no cost. Although Microsoft has marked the issue as "not planned," the security community has highlighted the significant risks and proposed concrete mitigation strategies.

The case also illustrates the challenges of securing AI-powered development tools and the importance of comprehensive billing enforcement that tracks actual resource usage rather than just initial configuration settings.

For now, organizations using VS Code Copilot Chat should monitor their usage patterns closely and be aware that this vulnerability could potentially be exploited to bypass their premium model entitlements.
