A deep dive into ChatGPT Agent's HTTP headers reveals how it authenticates itself through cryptographic signatures, while an accidental Cloudflare setting created false alarms about search engine exposure. This case study highlights both the sophistication of modern bot identification and the pitfalls of misconfigured infrastructure.

Unmasking ChatGPT Agent: HTTP Signatures and Cloudflare Pitfalls

When OpenAI launched ChatGPT Agent—its browser automation tool replacing the Operator research preview—developer Simon Willison decided to reverse-engineer its behavior. What followed was a revealing journey through HTTP headers, cryptographic signatures, and a humbling lesson about Cloudflare configurations.

The Identity Game

Willison created a debug endpoint to capture headers when ChatGPT Agent accessed his page. The initial request showed a familiar user-agent string:

Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/138.0.0.0 Safari/537.36
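A debug endpoint of this kind can be very small. The sketch below is not Willison's actual code; it is a minimal illustration using only Python's standard library, and the names extract_headers, debug_app, and serve (as well as the port number) are hypothetical. It echoes every request header back as JSON so the headers sent by a visiting agent can be inspected.

```python
import json
from wsgiref.simple_server import make_server


def extract_headers(environ):
    """Rebuild HTTP header names from a WSGI environ (HTTP_USER_AGENT -> User-Agent)."""
    headers = {}
    for key, value in environ.items():
        if key.startswith("HTTP_"):
            name = key[5:].replace("_", "-").title()
            headers[name] = value
    return headers


def debug_app(environ, start_response):
    """WSGI app that returns the request headers as a JSON document."""
    body = json.dumps(extract_headers(environ), indent=2, sort_keys=True).encode("utf-8")
    start_response(
        "200 OK",
        [("Content-Type", "application/json"), ("Content-Length", str(len(body)))],
    )
    return [body]


def serve(port=8000):
    """Run the debug endpoint locally; the port is an arbitrary choice."""
    with make_server("127.0.0.1", port, debug_app) as server:
        server.serve_forever()
```

Calling serve() and pointing a client at the resulting URL returns the headers that client sent, which is enough to compare the User-Agent string against other headers in the same request.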

But contradictions emerged: while the user-agent claimed macOS, the Sec-Ch-Ua-Platform: "Linux" header revealed the underlying Linux infrastructure. More importantly, three signature-related headers stood out:

Signature-Agent: "https://chatgpt.com"
Signature-Input: sig1=...
Signature: sig1=...

These implement RFC 9421 HTTP Message Signatures, a standard published in 2024 that lets a recipient verify who sent a request and detect whether its signed parts were altered in transit. ChatGPT Agent cryptographically signs requests using a private key, with the corresponding public key available at:

https://chatgpt.com/.well-known/http-message-signatures-directory

This provides a verifiable method to identify ChatGPT Agent traffic—a significant improvement over easily spoofed user-agent strings.
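To make the mechanism concrete, here is a hedged sketch of the first half of verification under RFC 9421: parsing a Signature-Input value and rebuilding the "signature base", the exact string the sender signed. It is a simplification, not a full implementation: it assumes a single signature label, covered components without parameters, and only the @method, @authority, and @path derived components. The function names are hypothetical. The final cryptographic check against the published public key needs a third-party signature library and is deliberately left out.

```python
import re

# Simplified pattern: label=(covered components)parameters
_SIG_INPUT_RE = re.compile(r'^\s*([a-z*][a-z0-9_\-.*]*)=(\((.*?)\)(.*))$')


def parse_signature_input(header_value):
    """Split a Signature-Input value into label, covered components, raw params."""
    match = _SIG_INPUT_RE.match(header_value)
    if match is None:
        raise ValueError("unsupported Signature-Input value")
    label, params_raw, inner_list, _ = match.groups()
    components = re.findall(r'"([^"]*)"', inner_list)
    return label, components, params_raw.strip()


def build_signature_base(method, authority, path, headers, components, params_raw):
    """Recreate the string that was signed, one line per covered component."""
    derived = {
        "@method": method.upper(),
        "@authority": authority.lower(),
        "@path": path,
    }
    lowered = {name.lower(): value.strip() for name, value in headers.items()}
    lines = []
    for name in components:
        if name.startswith("@"):
            if name not in derived:
                raise ValueError(f"derived component not supported here: {name}")
            value = derived[name]
        else:
            if name not in lowered:
                raise ValueError(f"covered header missing from request: {name}")
            value = lowered[name]
        lines.append(f'"{name}": {value}')
    # The final line reuses the parameters exactly as received (an assumption
    # that the sender serialized them canonically).
    lines.append(f'"@signature-params": {params_raw}')
    return "\n".join(lines)
```

A verifier would feed the resulting string, the decoded bytes from the Signature header, and the key fetched from the Signature-Agent origin into a signature-verification routine; only if that check passes should the request be treated as coming from ChatGPT Agent.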

The Crawler Red Herring

Minutes later, Willison's endpoint received requests from Bingbot and Yandex, seemingly accessing the same private URLs shared with ChatGPT Agent. Initial analysis suggested alarming exposure:

207.46.13.9 → Verified Bingbot IP
77.88.5.113 → Yandex reverse DNS: 77-88-5-113.spider.yandex.com
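Checks like these usually rest on forward-confirmed reverse DNS: resolve the IP to a hostname, confirm the hostname belongs to the crawler's operator, then resolve the hostname back and confirm it yields the same IP. The sketch below illustrates that technique; it is not the tooling Willison used. The domain suffixes are assumptions based on the operators' published guidance, and verify_crawler is a hypothetical helper whose lookup functions can be swapped out for testing.

```python
import socket

# Assumed hostname suffixes per crawler; check each operator's documentation.
CRAWLER_DOMAINS = {
    "bingbot": (".search.msn.com",),
    "yandex": (".yandex.ru", ".yandex.net", ".yandex.com"),
}


def verify_crawler(ip, crawler,
                   reverse_lookup=socket.gethostbyaddr,
                   forward_lookup=socket.gethostbyname_ex):
    """Return True if the IP passes forward-confirmed reverse DNS for the crawler."""
    try:
        hostname = reverse_lookup(ip)[0]
    except OSError:
        return False
    # Step 1: the reverse DNS name must sit under one of the operator's domains.
    if not hostname.lower().rstrip(".").endswith(CRAWLER_DOMAINS[crawler]):
        return False
    # Step 2: the hostname must resolve back to the original IP address.
    try:
        addresses = forward_lookup(hostname)[2]
    except OSError:
        return False
    return ip in addresses
```

The forward lookup matters because anyone who controls the reverse DNS zone for their own IP range can make it claim any hostname; only the operator can make that hostname resolve back to the IP.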

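But the mystery unraveled when Jatan Loya asked: "Do you have crawler hints enabled in CF?" Willison discovered that Cloudflare's Crawler Hints setting was active. The feature proactively notifies participating search engines about URLs whose content has changed so they can time their crawls, which means the Bingbot and Yandex visits were triggered by Cloudflare rather than by ChatGPT Agent passing the URLs along.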

Lessons in Infrastructure Transparency

Modern bots authenticate, not impersonate: ChatGPT Agent uses cryptographic signatures rather than hiding in generic user-agent strings
Cloudflare features have side effects: Enabling Crawler Hints unintentionally exposed private URLs
Verification is critical: Microsoft/Yandex validation tools confirmed crawler identities, but configuration audits proved equally important

As Willison noted: "I deleted my posts... didn’t want misinformation to spread." The incident underscores how complex infrastructure layers—from OpenAI’s Azure-hosted proxies to Cloudflare’s edge network—can create misleading diagnostics. For developers building agent-based tools, HTTP Message Signatures offer a robust identification framework, while infrastructure teams must vigilantly audit feature flags that might inadvertently expose internal systems.

Source: Simon Willison's investigation (Original Post)