
The Watchers: How OpenAI, the US Government, and Persona Built an Identity Surveillance Machine


Investigative report revealing how OpenAI and Persona operate a comprehensive identity verification system that collects biometric data, screens users against multiple watchlists, and files reports directly to federal financial intelligence agencies—all while operating with minimal transparency and user recourse.

In the digital panopticon of 2026, the line between convenience and surveillance has blurred to the point of invisibility. What began as a simple Shodan search—an exploration of exposed infrastructure—unveiled a comprehensive identity surveillance apparatus operating with the quiet efficiency of a well-oiled machine. This machine, built through the collaboration of OpenAI, the US government, and identity verification company Persona, systematically collects biometric data, screens users against multiple watchlists, and files reports directly to federal financial intelligence agencies, all while operating with minimal transparency and user recourse.

The Discovery: An Unlocked Door to Government Surveillance

The investigation began with a single IP address: 34.49.93.177, a Google Cloud instance in Kansas City hosting two revealing hostnames: openai-watchlistdb.withpersona.com and openai-watchlistdb-testing.withpersona.com. These names told a story that was never meant to be read—a database dedicated to watchlist screening, operating since November 2023, long before OpenAI publicly announced any identity verification requirements.
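
The article's discovery path ran through a Shodan search, but hostnames like these also surface in public certificate transparency logs, the same records in which the ONYX subdomain later appeared. Below is a minimal sketch in TypeScript, assuming Node 18+ (built-in fetch) and crt.sh's public JSON endpoint; the "watchlist" filter string is an illustration of the approach, not a reproduction of the original query.

```typescript
// Sketch: enumerate subdomains of a target domain from public certificate
// transparency logs via crt.sh, then filter for watchlist-related names.
// Assumes Node 18+ (global fetch). The crt.sh endpoint is public but
// rate-limited and slow at times; treat this as an illustration only.

interface CrtShEntry {
  name_value: string; // newline-separated hostnames covered by one certificate
}

async function findMatchingHosts(domain: string, needle: string): Promise<string[]> {
  const url = `https://crt.sh/?q=%25.${domain}&output=json`;
  const res = await fetch(url);
  if (!res.ok) throw new Error(`crt.sh returned ${res.status}`);
  const entries: CrtShEntry[] = await res.json();

  const hosts = new Set<string>();
  for (const entry of entries) {
    for (const name of entry.name_value.split("\n")) {
      if (name.includes(needle)) hosts.add(name.trim());
    }
  }
  return [...hosts].sort();
}

findMatchingHosts("withpersona.com", "watchlist")
  .then((hosts) => hosts.forEach((h) => console.log(h)))
  .catch(console.error);
```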

What researchers found was beyond concerning: dedicated Google Cloud infrastructure, separate from Persona's standard Cloudflare-protected services, suggesting purpose-built isolation for data that demands heightened compliance measures. This wasn't a simple "check this name against a list" API call; it was infrastructure designed to compartmentalize sensitive data whose damage potential was severe enough to warrant dedicated resources.

The most damning revelation came from the government platform (withpersona-gov.com), which achieved FedRAMP Authorized status in October 2025. On this platform, 53 megabytes of unprotected TypeScript source code sat exposed, revealing the entire architecture of a system designed to monitor, evaluate, and report on individuals. The source maps, typically used for debugging, contained the complete original source code through the sourcesContent array, allowing anyone with a browser to reconstruct the entire project tree.
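
The exposure works because a source map's sources and sourcesContent arrays are parallel: each sourcesContent entry holds the full original text of the corresponding file. A minimal sketch of the reconstruction follows, assuming the .js.map files have already been saved locally; the file and directory names are hypothetical.

```typescript
// Sketch: reconstruct original source files from a downloaded source map.
// sources[i] is the original path and sourcesContent[i] its full text.
// Paths shown are hypothetical placeholders, not the actual endpoint layout.

import { readFileSync, writeFileSync, mkdirSync } from "node:fs";
import { dirname, join, normalize } from "node:path";

interface SourceMap {
  sources: string[];
  sourcesContent?: (string | null)[];
}

function extractSources(mapFile: string, outDir: string): void {
  const map: SourceMap = JSON.parse(readFileSync(mapFile, "utf8"));
  if (!map.sourcesContent) {
    console.warn(`${mapFile}: no embedded sourcesContent`);
    return;
  }
  map.sources.forEach((source, i) => {
    const content = map.sourcesContent![i];
    if (content == null) return;
    // Strip bundler prefixes (e.g. webpack://) and leading ../ segments.
    const cleaned = normalize(source.replace(/^webpack:\/\//, "")).replace(/^(\.\.\/)+/, "");
    const outPath = join(outDir, cleaned);
    mkdirSync(dirname(outPath), { recursive: true });
    writeFileSync(outPath, content);
  });
}

extractSources("app.bundle.js.map", "reconstructed"); // hypothetical file name
```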

The Architecture: From User Verification to Government Reporting

The surveillance system operates through a carefully orchestrated pipeline that transforms a simple identity verification into a comprehensive dossier that can ultimately be filed with federal authorities; a simplified sketch of the pipeline follows the list:

  1. Initial Capture: When a user signs up for OpenAI's services, they're prompted to verify their identity through Persona's platform. This involves submitting a government ID document, capturing a selfie, and often a video recording—all processed through document scanning SDKs like Microblink.

  2. Comprehensive Screening: The system performs 269 distinct verification checks across multiple categories:

    • 23 selfie checks including liveness detection, public figure detection, and what the code explicitly labels as "SelfieSuspiciousEntityDetection"
    • 43 government ID checks including AAMVA database lookups (US driver's license database), physical tamper detection, and NFC chip reading with PKI validation
    • 27 database checks including deceased detection (SSA death master file), social security number comparison, and international identity database lookups
    • 29 document checks including JPEG original image detection, PDF editor detection, and synthetic content detection
  3. Watchlist Matching: The verification data is then screened against multiple watchlists:

    • OFAC SDN list (US sanctions)
    • 200+ global sanctions and warning lists
    • Politically Exposed Persons (PEP) classes 1-4 with facial similarity scoring
    • Adverse media across 14 categories from terrorism to cybercrime
    • Custom FinCEN screening lists that can be uploaded by operators
  4. Government Reporting: Based on screening results, the system can:

    • File Suspicious Activity Reports (SARs) directly with FinCEN
    • File Suspicious Transaction Reports (STRs) with FINTRAC (Canada)
    • Tag reports with intelligence program codenames like Project SHADOW, Project LEGION, and Project GUARDIAN
    • Maintain biometric face databases with 3-year retention periods
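
Based only on the check names, list categories, and report types quoted above, the pipeline can be modeled roughly as follows. This is a reconstruction for illustration; the type names, structure, and escalation rule are assumptions, not Persona's actual schema.

```typescript
// Simplified model of the verification-to-reporting pipeline described above.
// Names and structure are illustrative reconstructions from the check and
// report names quoted in this article, not Persona's actual code.

type CheckCategory = "selfie" | "government_id" | "database" | "document";

interface CheckResult {
  name: string;            // e.g. "SelfieSuspiciousEntityDetection"
  category: CheckCategory;
  passed: boolean;
}

interface WatchlistHit {
  list: "ofac_sdn" | "global_sanctions" | "pep" | "adverse_media" | "custom_fincen";
  similarity?: "Low" | "Medium" | "High"; // facial similarity for PEP matches
}

interface ScreeningOutcome {
  userId: string;
  checks: CheckResult[];
  hits: WatchlistHit[];
  fileSar: boolean;   // Suspicious Activity Report to FinCEN
  fileStr: boolean;   // Suspicious Transaction Report to FINTRAC
}

// Hypothetical decision step: any sanctions hit, or a high-similarity PEP
// match combined with a failed check, escalates to a government filing.
function decideReporting(userId: string, checks: CheckResult[], hits: WatchlistHit[]): ScreeningOutcome {
  const sanctioned = hits.some((h) => h.list === "ofac_sdn" || h.list === "global_sanctions");
  const riskyPep =
    hits.some((h) => h.list === "pep" && h.similarity === "High") &&
    checks.some((c) => !c.passed);
  const escalate = sanctioned || riskyPep;
  return { userId, checks, hits, fileSar: escalate, fileStr: escalate };
}
```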

The Biometric Surveillance: Your Face as a Dossier

Perhaps the most unsettling aspect of this system is its extensive use of biometric data, particularly facial recognition. When a user submits a selfie for OpenAI verification, that image enters a complex screening process (a rough sketch of the similarity scoring follows the list):

  • Facial Similarity Scoring: The system compares the user's selfie against a database of political figures, heads of state, and their extended family, assigning similarity scores of "Low," "Medium," or "High." The interface displays side-by-side comparisons with reference photos sourced from Wikidata.

  • Biometric Database Creation: The platform maintains 13 types of tracking lists, with "ListFace" and "ListSelfieBackground" designated as "Enhanced" list types. Government operators can add verification selfies to these face lists, creating biometric databases retained for up to three years.

  • Continuous Monitoring: The system doesn't perform one-time checks but implements ongoing re-screening on configurable intervals. Once added to a watchlist, a user is subject to periodic re-evaluation without their knowledge or consent.
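
The article describes only the three labels shown to reviewers; how underlying scores map onto them is not disclosed. A minimal sketch of one plausible implementation, assuming face embeddings compared by cosine similarity; the numeric thresholds are invented for illustration.

```typescript
// Sketch: bucket a raw face-similarity score into the Low/Medium/High labels
// the reviewer interface reportedly displays. Thresholds are invented for
// illustration; the article does not disclose the real ones.

type SimilarityLabel = "Low" | "Medium" | "High";

interface ReferencePhoto {
  subject: string;     // e.g. a politically exposed person
  wikidataId: string;  // reference photos are reportedly sourced from Wikidata
  embedding: number[];
}

// Cosine similarity between two face embeddings of equal length.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}

function labelSimilarity(score: number): SimilarityLabel {
  if (score >= 0.85) return "High";    // hypothetical threshold
  if (score >= 0.65) return "Medium";  // hypothetical threshold
  return "Low";
}

function screenSelfie(selfieEmbedding: number[], references: ReferencePhoto[]) {
  return references.map((ref) => ({
    subject: ref.subject,
    label: labelSimilarity(cosineSimilarity(selfieEmbedding, ref.embedding)),
  }));
}
```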

The Government Connection: FedRAMP and Intelligence Programs

Persona's government platform (withpersona-gov.com) operates under FedRAMP authorization, a certification that ensures systems meet rigorous security standards for federal use. The source code reveals several concerning connections between this commercial identity verification service and government intelligence operations (a sketch of the filing permission model follows the list):

  • Direct Filing Capabilities: The platform contains a complete SAR module for filing directly with FinCEN, handling the full lifecycle from creation to government acceptance or rejection. The code includes explicit permissions like SARCreate, SARFile, and SARExport.

  • Intelligence Program Tagging: When filing STRs with FINTRAC, operators can tag reports with specific intelligence program codenames including Project ANTON, Project ATHENA, Project CHAMELEON, Project GUARDIAN, Project LEGION, Project PROTECT, and Project SHADOW.

  • ONYX Deployment: Twelve days before publication, a new subdomain appeared in certificate transparency logs: onyx.withpersona-gov.com. This dedicated Google Cloud instance operates under its own Kubernetes namespace (persona-onyx) and bears the same name as ICE's $4.2 million AI surveillance tool, Fivecast ONYX, which performs automated collection of multimedia data from social media and dark web sources.
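
Only the permission names SARCreate, SARFile, and SARExport, the program codenames, and the "creation to government acceptance or rejection" lifecycle come from the exposed source as described above; everything else in this sketch of a permission-gated filing flow is an assumption.

```typescript
// Sketch of a permission-gated SAR filing flow. The permission names
// SARCreate, SARFile, and SARExport are quoted from the exposed source as
// described in this article; the surrounding types and states are assumed.

type SarPermission = "SARCreate" | "SARFile" | "SARExport";

type SarStatus = "draft" | "filed" | "accepted" | "rejected"; // assumed lifecycle

interface Operator {
  id: string;
  permissions: Set<SarPermission>;
}

interface SuspiciousActivityReport {
  id: string;
  subjectUserId: string;
  status: SarStatus;
  programTags: string[]; // e.g. "Project SHADOW" when tagging a FINTRAC STR
}

function requirePermission(op: Operator, needed: SarPermission): void {
  if (!op.permissions.has(needed)) {
    throw new Error(`operator ${op.id} lacks ${needed}`);
  }
}

function fileSar(op: Operator, report: SuspiciousActivityReport): SuspiciousActivityReport {
  requirePermission(op, "SARFile");
  // In the real system this would submit to FinCEN and await acceptance or
  // rejection; here we only model the local status transition.
  return { ...report, status: "filed" };
}
```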

The Legal and Ethical Stakes

The operation of this surveillance system raises profound legal and ethical questions:

  • Biometric Data Retention: OpenAI's disclosures reference biometric data stored "up to a year," while the source code shows face list retention capped at 3 years. Government IDs are retained "permanently" according to Persona's practices. This discrepancy creates uncertainty about data lifecycle management.

  • Lack of Transparency and Recourse: Community reports document users passing verification only to be locked out with no explanation, no human support, and no appeal mechanism. After surrendering passport photos, facial biometrics, and personal information, users receive no insight into why they were denied access.

  • BIPA Exposure: The Illinois Biometric Information Privacy Act requires informed written consent before collecting biometric data, disclosure of purpose and storage length, and a publicly available retention schedule. With "millions" of monthly screenings, the statutory damages exposure could be significant.

  • Geopolitical Implications: The system blocks Ukraine alongside countries under OFAC sanctions like Afghanistan, Belarus, Iran, North Korea, Russia, Syria, and Venezuela—despite Ukraine not being subject to US sanctions. This appears to be a policy choice rather than a legal requirement.

The Corporate Ecosystem: Compliance or Surveillance?

The vendor ecosystem integrated with Persona's government platform reveals a network of specialized services that extend the surveillance capabilities:

  • Chainalysis: Cryptocurrency address screening with recursive cluster analysis that continuously monitors wallet addresses against sanctioned entities and risk profiles.
  • Equifax: Credit and identity data services that enrich verification dossiers.
  • SentiLink: Synthetic identity fraud detection that analyzes patterns in identity data.
  • OpenAI: Present not as a surveillance data pipeline but as an "AI copilot" for government operators, categorized under "productivity" alongside Slack and Zendesk.

Notably absent from the vendor ecosystem are direct references to known surveillance vendors like Palantir, Clearview, or NEC. However, the absence of explicit references doesn't preclude data sharing or access through other channels not visible in the source code.

The Questions That Demand Answers

The source code, while comprehensive, leaves critical questions unanswered:

  1. What was OpenAI screening against in November 2023, 18 months before disclosing any identity verification requirements?

  2. What criteria determine inclusion in custom watchlists, and who controls these lists?

  3. What defines a "suspicious entity" in SelfieSuspiciousEntityDetection, and what facial characteristics trigger this flag?

  4. What do the experimental model detection checks (SelfieExperimentalModelDetection, IdExperimentalModelDetection) do, and why are they running unnamed ML models on live biometric data?

  5. What happens to the data of users who are screened and denied? Is it retained, and can law enforcement access it?

  6. What is the relationship between Persona's "onyx" deployment and Fivecast ONYX, ICE's $4.2M surveillance tool?

  7. How did 53 MB of unprotected source maps end up on a FedRAMP-authorized government endpoint, and was this reviewed in the security assessment?

The Paradox of Convenience

The architects of this surveillance system would argue they're building necessary safeguards for powerful AI technology. They would point to regulatory requirements for financial institutions to monitor and report suspicious activities. They might even claim that enhanced identity verification prevents misuse by malicious actors.

But these justifications exist in tension with the fundamental principles of privacy, transparency, and due process. When a user surrenders their biometric data simply to access a chatbot, they enter a system that compares their face to political figures, monitors their digital footprints, and may generate reports filed with federal authorities—all without meaningful consent, transparency, or recourse.

The source code reveals a system designed for comprehensive monitoring, not merely identity verification. It maintains biometric databases, implements continuous re-screening, facilitates direct government reporting, and operates with the apparent blessing of federal agencies through FedRAMP authorization.

As we stand at the precipice of increasingly powerful AI systems, we must confront uncomfortable questions about the trade-offs we're willing to make. When convenience requires surrendering our biometric data and subjecting ourselves to undisclosed screening, we must ask: who benefits from this surveillance, and at what cost to our fundamental rights?

The information wants to be free, but in this case, it reveals a system designed to monitor and control. The code doesn't lie—it shows a surveillance apparatus built by companies we trust, operating with government sanction, and deployed with minimal public awareness or consent.

In the end, the most unsettling revelation isn't that such a system exists—it's how quietly it has been integrated into the digital infrastructure we rely on daily. The watchers have been watching all along, and now we know their names.
