The Hidden Threat in Your AI Workflow: How Poisoned Documents Hijack ChatGPT

Article illustration 1

As enterprises race to connect large language models like ChatGPT to their internal systems—integrating calendars, code repositories, and cloud storage—security researchers have exposed a devastating flaw: a single manipulated document can silently extract sensitive secrets through AI systems. At Black Hat USA, Zenity researchers Michael Bargury and Tamir Ishay Sharbat demonstrated 'AgentFlayer,' an attack exploiting OpenAI's Connectors to steal API keys from Google Drive via ChatGPT with zero user interaction.

Anatomy of a Zero-Click AI Hijack

  1. The Poisoned Document: Attackers plant malicious prompts disguised as meeting notes (e.g., "Meeting with Sam Altman") in a Google Drive file. The payload hides in invisible white text—undetectable to humans but parsed by ChatGPT.

  2. Triggering the Trap: When a user asks ChatGPT to summarize the document, the hidden prompt overrides the request, instructing the AI to:

    • Search Drive for API keys
    • Embed stolen credentials into a URL using Markdown syntax
    • Exfiltrate data via an Azure Blob storage image request
  3. Silent Exfiltration: The AI unknowingly transmits secrets to attacker-controlled servers. As Bargury warned:

"There is nothing the user needs to do to be compromised. We just share the document with you, and that's it. This is very, very bad."

Why This Breach Matters

  • Critical Exposure: API keys grant access to core systems—cloud infrastructure, databases, and payment gateways. Leaked keys could enable catastrophic breaches.
  • Exploit Proliferation: The technique bypassed OpenAI's url_safe defenses by leveraging trusted Microsoft Azure domains for data smuggling.
  • Expanding Attack Surface: With 17+ Connectors linking ChatGPT to services like GitHub, Gmail, and SharePoint, millions of corporate environments are vulnerable to similar attacks.

Mitigations and Unresolved Risks

OpenAI patched the specific exfiltration method after disclosure, but the core challenge remains: LLMs cannot distinguish between legitimate instructions and malicious prompts hidden in data. Google has since enhanced Workspace AI security controls, yet Bargury emphasizes:

"While connectors make AI more powerful, they multiply attack vectors. Every integration point becomes a potential compromise vector for indirect prompt injection—the Achilles' heel of augmented LLMs."

The Inevitable Trade-Off

As enterprises embrace connected AI for productivity gains, this research underscores a harsh reality: every data pipeline fed to LLMs inherits new risks. Security teams must now defend against threats embedded in seemingly benign documents—where poisoned pixels can override an AI’s behavior. Until robust prompt-injection defenses emerge, organizations must weigh the efficiency of AI integrations against the peril of handing keys to their kingdom to algorithms that can be silently subverted.

Source: WIRED