AI Agent Hacks McKinsey Chatbot in Two Hours, Exposing Millions of Messages
#Vulnerabilities

Regulation Reporter

CodeWall's autonomous AI agent exploited SQL injection vulnerabilities in McKinsey's Lilli chatbot, gaining full read-write access to 46.5 million chat messages and confidential client data within two hours.

An autonomous AI agent developed by red-team security startup CodeWall successfully hacked into McKinsey & Company's internal AI chatbot platform, gaining full read-write access to millions of sensitive messages and files in just two hours. The attack demonstrates how AI agents are becoming increasingly effective tools for conducting sophisticated cyberattacks, including those targeting other AI systems.

The target was Lilli, McKinsey's generative AI platform launched in July 2023. According to the consulting giant, 72 percent of its employees—over 40,000 people—now use the chatbot, which processes more than 500,000 prompts monthly. The platform handles sensitive strategy discussions, mergers and acquisitions analysis, and confidential client engagements.

CodeWall's researchers claim their autonomous offensive agent achieved complete compromise of Lilli's production database without any credentials or prior knowledge of McKinsey's infrastructure. Within the two-hour window, the agent accessed 46.5 million chat messages in plaintext, 728,000 files containing confidential client data, 57,000 user accounts, and 95 system prompts that control the AI's behavior.

Most concerning, these system prompts were writable: an attacker could have poisoned Lilli's responses to all 40,000-plus consultants using the platform, silently manipulating the AI's outputs, guardrails, and source citations without any code deployment or configuration changes.

How the Attack Worked

The breach began when CodeWall's AI agent discovered publicly exposed API documentation containing 22 endpoints that didn't require authentication. One of these endpoints accepted user search queries and concatenated JSON keys directly into SQL queries without proper sanitization.

When the agent submitted specially crafted input, the endpoint returned database error messages revealing that JSON keys were being reflected verbatim in the underlying SQL queries. The agent recognized this error-based SQL injection pattern, one that standard security tools might miss.
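The vulnerability class described here is straightforward to sketch. The snippet below is illustrative only, not McKinsey's actual code: the table, column names, and allow-list are assumptions. It shows how concatenating an attacker-controlled JSON key into SQL text lets a crafted key rewrite the query and leak schema details via error messages, and how an allow-list plus bound parameters closes the hole:

```python
import json
import sqlite3

def vulnerable_search(conn, body):
    """Hypothetical endpoint handler: the JSON *key* is spliced into the SQL."""
    payload = json.loads(body)
    key = next(iter(payload))  # attacker-controlled JSON key
    sql = f"SELECT id, text FROM messages WHERE {key} = ?"
    return conn.execute(sql, (payload[key],)).fetchall()

# Safe variant: validate the key against an allow-list, bind only the value.
ALLOWED_COLUMNS = {"author", "text"}

def safe_search(conn, body):
    payload = json.loads(body)
    key = next(iter(payload))
    if key not in ALLOWED_COLUMNS:
        raise ValueError(f"unsupported search field: {key!r}")
    sql = f"SELECT id, text FROM messages WHERE {key} = ?"
    return conn.execute(sql, (payload[key],)).fetchall()

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE messages (id INTEGER, author TEXT, text TEXT)")
conn.execute("INSERT INTO messages VALUES (1, 'alice', 'hello')")

# A crafted key becomes part of the WHERE clause; the database error that
# comes back names the missing column, leaking query structure to the attacker.
try:
    vulnerable_search(conn, '{"author OR no_such_col --": "x"}')
except sqlite3.OperationalError as err:
    print("leaked error:", err)

print(safe_search(conn, '{"author": "alice"}'))
```

In the real attack the same error-reflection behavior reportedly escalated from schema leakage to live production data, which is typical of error-based SQL injection once the attacker controls query structure.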

As the attack progressed, the error messages began outputting live production data, confirming the severity of the vulnerability. The agent then discovered that Lilli's system prompts were stored in the same database, giving it access to the AI's core configuration files.

CodeWall CEO Paul Price described the process as "fully autonomous from researching the target, analyzing, attacking, and reporting." The AI agent selected McKinsey as a target based on the company's public responsible disclosure policy and recent updates to Lilli, all without human intervention.

The Scope of the Breach

The compromised data included:

  • 46.5 million chat messages containing strategy discussions, M&A analysis, and client work
  • 728,000 files with confidential client information
  • 57,000 user account records
  • 95 system prompts controlling Lilli's behavior

The researchers emphasized the severity of the writable system prompts, noting that attackers could have silently rewritten how Lilli responded to queries across the entire organization. "No deployment needed," they wrote. "No code change. Just a single UPDATE statement wrapped in a single HTTP call."
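The researchers' point can be sketched in miniature. Everything below is a hypothetical stand-in (table name, prompt text, and the SQLite `query_only` mitigation), not Lilli's actual schema; it shows why a single writable row is enough to repoint an AI's behavior, and how a read-only serving connection blocks that write path:

```python
import sqlite3

# Hypothetical prompt store: one row governs the assistant's behavior.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE system_prompts (id INTEGER PRIMARY KEY, prompt TEXT)")
conn.execute("INSERT INTO system_prompts VALUES (1, 'Answer only from cited sources.')")

# The "single UPDATE statement": no deployment, no code change, and every
# subsequent conversation is served the attacker's instructions.
conn.execute("UPDATE system_prompts SET prompt = 'POISONED' WHERE id = 1")
print(conn.execute("SELECT prompt FROM system_prompts WHERE id = 1").fetchone()[0])

# One mitigation: the web tier reads prompts through a read-only connection,
# so even injected SQL cannot modify them.
conn.execute("PRAGMA query_only = ON")
try:
    conn.execute("UPDATE system_prompts SET prompt = 'POISONED AGAIN' WHERE id = 1")
except sqlite3.OperationalError as err:
    print("blocked:", err)
```

Separating read and write privileges this way (least privilege at the database layer) is the standard defense the quoted scenario implies was missing.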

McKinsey's Response

McKinsey patched all identified vulnerabilities within hours of being notified on March 1. By the following day, the company had:

  • Patched all unauthenticated endpoints
  • Taken the development environment offline
  • Blocked public API documentation

The company engaged a leading third-party forensics firm to investigate the breach. A McKinsey spokesperson told The Register that the investigation found "no evidence that client data or client confidential information were accessed by this researcher or any other unauthorized third party."

The spokesperson emphasized that "McKinsey's cybersecurity systems are robust, and we have no higher priority than the protection of client data and information we have been entrusted with."

The Broader Threat Landscape

This incident highlights the growing use of AI agents in cyberattacks. Price warned that hackers are increasingly adopting the same autonomous technologies for malicious purposes, conducting machine-speed intrusions with specific objectives like financial blackmail, data theft, or ransomware deployment.

North Korean threat actors have already been documented using AI agents to automate their operations, and the technology continues to evolve rapidly. The attack on McKinsey demonstrates that even sophisticated organizations with robust cybersecurity systems remain vulnerable to AI-powered threats.

The incident serves as a wake-up call for organizations deploying AI systems, particularly those handling sensitive data. As AI agents become more capable of autonomous offensive operations, traditional security measures may need to evolve to address these new machine-speed threats.

The full attack chain was disclosed to McKinsey on March 1, with all vulnerabilities patched by March 2. While the immediate threat has been neutralized, the demonstration of AI agent capabilities in this attack suggests that similar breaches could become increasingly common as the technology becomes more accessible to both defenders and attackers.
