Confidential computing was designed to protect data in use through hardware-enforced isolation, but its repurposing from single-tenant desktop scenarios to multi-tenant cloud environments has created fundamental security gaps that persist despite continuous investment in mitigations.
The technology that promised to revolutionize data privacy in the cloud is failing to deliver on its core promise, not because of implementation bugs, but because of fundamental architectural mismatches between how these systems were designed and how they are being deployed.
The Promise That Never Materialized
Confidential computing emerged as the solution to a critical problem: how to run sensitive workloads on infrastructure you don't control. The concept is elegant—create hardware-enforced isolation boundaries that even the cloud provider cannot penetrate. Apple's Private Cloud Compute demonstrates what's possible when the technology works correctly: every production build published to transparency logs, user devices communicating only with nodes whose attested measurements match the log, and a virtual research environment for independent verification.
But these success stories mask a deeper structural problem. The vulnerability record grows every year, attestation infrastructure doesn't work at scale, and the hardware root of trust has a demonstrated shelf life. The explanation isn't buggy implementations; it's that the technologies were designed for threat models that don't match their current deployment scenarios.
A Pattern of Repurposing Failure
The history of confidential computing follows a predictable pattern. Build technology X for threat model Y, then repurpose X for threat model Z because X already exists and deploying it is cheaper than building something new. This pattern has played out repeatedly in security technology, and confidential computing is just the latest example.
Smart cards, introduced in the late 1960s, were special-purpose computers in tamper-resistant packages. By the 1980s, they were executing cryptographic operations in banking and government ID systems. Hardware Security Modules like IBM's 4758, commercially available in the late 1990s, provided tamper-responding enclosures with their own processors and secure boot chains. These were discrete devices with well-defined physical boundaries.
Intel SGX, introduced with Skylake processors in 2015, brought the enclave concept to general-purpose computing. But SGX was designed for the desktop, specifically for single-tenant scenarios like content protection and DRM key management. The threat model was clear: one machine, one user, and the enclave protects the content owner's code from that user.
When cloud providers adopted SGX for multi-tenant use, they deployed a single-tenant design in a multi-tenant environment. The architectural mismatch opened attack surfaces the original designs didn't anticipate. Cache-timing attacks that were theoretical on a desktop became practical in the cloud because attackers could run arbitrary code on the same physical core.
The Vulnerability Catalog Grows
The side-channel attacks didn't stop with SGX's partial deprecation. They followed the technology into the cloud. TDXRay, published in 2026, reconstructs LLM user prompts word-for-word from encrypted TDX VMs by monitoring tokenizer cache access patterns. No cryptography was broken—the attack works because standard LLM tokenizers traverse hash maps to find token IDs, and that traversal creates memory access patterns observable at 64-byte cache-line resolution.
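The mechanism can be sketched in a few lines. This is a toy model, not TDXRay's actual code: the vocabulary, bucket count, and function names are all illustrative. The point is that a hash-map tokenizer touches a different bucket (and hence a different cache line) for each input word, so an observer who sees only which lines are accessed can map the trace back to candidate words.

```python
import hashlib

# Toy vocabulary: token string -> token ID. Purely illustrative.
VOCAB = {w: i for i, w in enumerate(["the", "secret", "merger", "closes", "friday"])}
NUM_BUCKETS = 16  # stand-in for hash-map buckets, each assumed to occupy its own cache line

def bucket_of(token: str) -> int:
    # Deterministic hash to a bucket index; models which cache line a lookup touches.
    return int(hashlib.sha256(token.encode()).hexdigest(), 16) % NUM_BUCKETS

def tokenize(prompt: str):
    ids, trace = [], []
    for word in prompt.split():
        trace.append(bucket_of(word))  # observable: the cache-line access pattern
        ids.append(VOCAB[word])        # secret: the token IDs the TEE should protect
    return ids, trace

# The attacker precomputes the bucket of every vocabulary word offline...
candidates = {}
for w in VOCAB:
    candidates.setdefault(bucket_of(w), []).append(w)

# ...then maps an observed access trace back to candidate words per position.
_, trace = tokenize("secret merger closes friday")
recovered = [candidates[b] for b in trace]
print(recovered)
```

With a realistic vocabulary of tens of thousands of tokens, collisions leave a handful of candidates per access rather than a unique word, but language-model statistics resolve the ambiguity quickly.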
TEE.Fail, published in 2025, showed that a roughly $1,000 physical interposer monitoring the DDR5 memory bus could extract ECDSA attestation keys from Intel's Provisioning Certification Enclave. Attestation can be forged. The attack requires physical access, which limits applicability—but cloud providers have physical access to every server they operate.
On March 31, 2026, Mark Ermolov announced the extraction of the SGX Global Wrapping Key from Intel Gemini Lake. This isn't a side-channel leak—it's extraction of the root cryptographic key that protects SGX sealing operations. The key wraps Fuse Key 0, which means the entire key hierarchy rooted in hardware fuses is compromised for that platform generation. No microcode update can change fuses.
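To see why a wrapping-key extraction is unrecoverable, consider a deliberately simplified model of such a hierarchy. This is not Intel's actual scheme: the labels, derivation function, and XOR-based wrapping are stand-ins chosen to make the dependency structure visible. Everything derives from the fuse key, the fuse key is protected only by the wrapping key, and fuses cannot be rotated.

```python
import hashlib
import hmac

def kdf(key: bytes, label: bytes) -> bytes:
    # HMAC-SHA256 as a stand-in key-derivation function.
    return hmac.new(key, label, hashlib.sha256).digest()

def xor(a: bytes, b: bytes) -> bytes:
    # Toy "wrap" primitive; real designs use an authenticated cipher.
    return bytes(x ^ y for x, y in zip(a, b))

# A per-chip fuse key (immutable) and a per-generation global wrapping key.
fuse_key_0 = hashlib.sha256(b"burned-into-fuses-at-fab").digest()
global_wrapping_key = hashlib.sha256(b"shared-across-generation").digest()

# The fuse key is stored only in wrapped form; unwrapping requires the GWK.
wrapped_fuse_key = xor(fuse_key_0, kdf(global_wrapping_key, b"wrap"))

# Every sealing key on the platform derives from the fuse key.
sealing_key = kdf(fuse_key_0, b"seal:enclave-measurement")

# An attacker holding the GWK unwraps the root, then re-derives every descendant.
recovered_root = xor(wrapped_fuse_key, kdf(global_wrapping_key, b"wrap"))
print(kdf(recovered_root, b"seal:enclave-measurement") == sealing_key)  # True
```

No software update helps here: rotating the wrapping key cannot retroactively protect keys already derived from the exposed root, and the root itself is in fuses.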
Five Broken Design Assumptions
These vulnerabilities are not a collection of unrelated bugs. They are the predictable result of specific design assumptions that held in the original use cases but fail in the cloud and AI contexts where the technology is now deployed.
First, the assumption that the attacker does not share physical hardware with the victim. SGX was designed for a desktop where one user runs one workload. In the cloud, co-tenants share CPU cores, caches, branch predictors, TLBs, execution ports, memory controllers, and power delivery. CacheWarp, StackWarp, and TDXRay all exploit resources that remain shared because complete resource partitioning would make the hardware unusable for general-purpose computing.
Second, the assumption that the platform owner is not the adversary. TPMs and early SGX assumed the platform owner was the user or a trusted IT department. In the cloud, the provider controls the hypervisor, firmware, BMC, physical facility, and scheduling. The interfaces between the TEE and the provider-controlled environment become the attack surface.
Third, the assumption that the hardware root of trust is immutable. The attestation model depends on root keys being beyond the reach of software attacks. This assumption has been violated repeatedly. Ermolov reached fuse-based keys through microcode. Google researchers disclosed CVE-2024-56161, an insecure hash function in AMD's microcode signature validation.
Fourth, the assumption that attestation verification is someone else's problem. The specifications define how to produce attestation evidence but not how to verify it at scale. In the desktop DRM case, one binary produced one hash. In the cloud, PCR values are combinatorial across firmware, bootloader, kernel, and boot configuration.
Fifth, the assumption that performance and security tradeoffs are invisible. On a desktop running DRM playback, a 5% performance hit is imperceptible. On a cloud server running AI inference at scale, every percentage point is cost. Organizations are pressured to disable countermeasures for performance, reopening the attack surface.
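The fourth assumption's combinatorial problem is easy to make concrete. The sketch below uses TPM-style PCR extension, where each boot component's hash is folded into a running digest; the component names and version lists are hypothetical. Because the final value depends on every component and on ordering, a verifier must precompute one golden value per allowed combination, and the set multiplies with every layer.

```python
import hashlib
from itertools import product

def extend(pcr: bytes, measurement: bytes) -> bytes:
    # TPM-style extend: new PCR = SHA-256(old PCR || SHA-256(component))
    return hashlib.sha256(pcr + hashlib.sha256(measurement).digest()).digest()

def boot_pcr(components) -> bytes:
    pcr = b"\x00" * 32  # PCRs start zeroed at reset
    for c in components:
        pcr = extend(pcr, c)
    return pcr

# Hypothetical version sets a verifier must enumerate to build reference values.
firmwares   = [b"fw-1.2", b"fw-1.3"]
bootloaders = [b"grub-2.06", b"grub-2.12"]
kernels     = [b"linux-6.6", b"linux-6.8", b"linux-6.9"]
configs     = [b"cfg-a", b"cfg-b"]

golden = {boot_pcr(stack) for stack in product(firmwares, bootloaders, kernels, configs)}
print(len(golden))  # 2 * 2 * 3 * 2 = 24 acceptable values from just four layers
```

Four layers with a handful of versions each already yields 24 distinct acceptable values; a real fleet with dozens of firmware revisions, kernel builds, and boot configurations pushes this into the thousands, which is why "one binary, one hash" verification does not transfer from the desktop DRM case to the cloud.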
AI Changes the Calculus—But Doesn't Fix the Problems
All of the problems described above are real and unresolved. None of them are stopping adoption, because AI changed the calculus. Model weights represent billions of dollars in training investment. A leaked foundation model is a competitive catastrophe. Running inference on shared cloud infrastructure means trusting the cloud provider not to inspect memory, which is the exact problem TEEs solve.
Multi-party AI scenarios require environments where no single party sees the complete dataset. TEEs provide the isolation boundary. This is why every major hyperscaler is building on confidential computing despite its known limitations.
But AI workloads amplify every weakness. GPU TEEs are new and their attestation models are immature. The attestation chain now spans CPU TEE, GPU TEE, and potentially TPM, each with different measurement schemes. AI workloads run on heterogeneous infrastructure across multiple cloud providers.
And AI workloads are the most valuable targets for the attacks TEEs are vulnerable to. An attacker who extracts model weights via a side channel gets a multi-billion-dollar asset.
The market treats the different TEE designs as interchangeable. They are not. Each draws the isolation boundary in a different place and offers different security guarantees. Pretending otherwise is how organizations end up deploying against a threat model their chosen TEE was not designed to address.
The Trust Model Gap
The deeper issue is the gap between what is marketed and what is engineered. Confidential computing marketing says "even the infrastructure provider cannot access your data." The engineering reality is different.
The infrastructure provider cannot access your data through the software stack, but the hardware has known side-channel leakages that a sufficiently motivated attacker with privileged access can exploit. The attestation infrastructure that proves the TEE is genuine has structural limitations that make verification at scale dependent on each organization building its own reference value databases.
And the hardware root of trust that anchors the entire system has a demonstrated shelf life.
This is a reasonable tradeoff for many threat models. Most organizations are defending against curious administrators and software-level compromise, or satisfying regulatory compliance requirements. Side-channel attacks require significant expertise and often physical access.
But the market does not present it as a tradeoff.
What Needs to Happen
Closing the gap between the market narrative and the engineering reality requires work that is less exciting than launching new AI services.
Firmware and OS vendors need to publish reference measurements. The standards exist. CoRIM provides the format. RFC 9683 provides the framework. What is missing is the operational commitment to publish signed measurement values for every release.
The industry needs honest threat modeling that acknowledges what TEEs protect against and what they do not. TEE.Fail requires physical access, but cloud providers have physical access to every server. TDXdown requires a malicious hypervisor, which is precisely the threat TDX is designed to defend against.
Attestation verification needs to become a commodity. Organizations should not need to build their own reference value databases, write their own event log parsers, and maintain their own golden image registries. This infrastructure should be as standardized and available as Certificate Transparency logs are for the web PKI.
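What commodity verification could look like can be sketched as a signed reference-value lookup. This is a hypothetical shape, not an existing service: in practice entries would be CoRIM-formatted and vendor-signed with public-key signatures, and the HMAC key, component names, and function names here are all stand-ins for illustration.

```python
import hashlib
import hmac

# Stand-in for a vendor signing key; real databases would use asymmetric signatures.
SIGNING_KEY = b"vendor-signing-key"

def sign(digest: bytes) -> bytes:
    return hmac.new(SIGNING_KEY, digest, hashlib.sha256).digest()

# Hypothetical published reference values: component digest -> signed entry.
reference_db = {}
for name, blob in [("fw-1.3", b"firmware image bytes"),
                   ("linux-6.8", b"kernel image bytes")]:
    digest = hashlib.sha256(blob).digest()
    reference_db[digest] = {"component": name, "sig": sign(digest)}

def verify_evidence(measured) -> bool:
    # Accept only if every attested measurement matches a signed reference value.
    for digest in measured:
        entry = reference_db.get(digest)
        if entry is None or not hmac.compare_digest(entry["sig"], sign(digest)):
            return False
    return True

good = [hashlib.sha256(b"firmware image bytes").digest(),
        hashlib.sha256(b"kernel image bytes").digest()]
bad = good + [hashlib.sha256(b"tampered kernel").digest()]
print(verify_evidence(good), verify_evidence(bad))  # True False
```

The hard part is not this lookup; it is populating the database, which is exactly the operational commitment vendors have not made.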
And the security research community's findings need to be incorporated into the market narrative rather than treated as exceptions. The pattern of continuous vulnerability discovery and mitigation is the normal state of the technology, not an aberration.
Confidential computing is directionally correct. The ability to verify what code is running on hardware you do not control, rather than simply trusting the operator, is a fundamental improvement in how we build systems. Signal proved the model works.
The challenge is closing the gap between that promise and the current engineering reality. The organizations deploying confidential computing for AI workloads today should understand what they are buying.
Against the threats they are most likely to face—curious administrators, software-level compromise, regulatory compliance gaps, and unauthorized data access by the infrastructure operator—confidential computing is a significant improvement. Against a well-resourced attacker with physical access to the hardware, side-channel expertise, or the ability to exploit a hardware root-of-trust vulnerability, it is a partial mitigation, not an absolute guarantee.
That is a defensible position. It is just not the one being marketed.