The future of Siri, or: why private inference isn't private enough

Apple's plan to give Siri real AI capabilities through Google Gemini models raises a fundamental question: what happens when private inference meets the real world? A cryptographer's analysis reveals that protecting data in transit and during inference is only half the battle when agents need to act.

Apple's announcement yesterday about deploying Google Gemini models to power Siri's AI capabilities marks a significant moment in the evolution of personal assistants. The company will leverage Google's Confidential Inference alongside Apple's own Private Cloud Compute to process queries that combine your voice commands with the deep context your phone already holds about your life. On the surface, this seems like the right approach: keep the data encrypted, process it on trusted hardware, and ensure nothing lingers after the response returns.

But Matthew Green, a cryptographer at Johns Hopkins University who has spent decades designing and analyzing cryptographic systems, argues that this picture is incomplete. In his analysis, the technical protections Apple and Google are building address only a narrow slice of the real privacy challenge. The harder problems emerge when agents need to actually do things in the world.

The Privacy Paradox of Useful Agents

Consider what a truly helpful personal assistant requires. To schedule a business dinner for six people, the agent needs to scan your messages for availability, understand dietary restrictions from past conversations, search for restaurants that fit the constraints, and then execute bookings and calendar invites. This workflow demands access to your most intimate data: the years of messages, emails, and notes that reveal not just logistics but relationships, preferences, and secrets.

Apple's Private Cloud Compute was designed to solve part of this problem. The system processes data on Apple Silicon in Apple's datacenters, with cryptographic guarantees that the information is encrypted from your phone to a dedicated server and wiped immediately after inference. The stateless architecture means your data doesn't persist, and even Apple cannot access the inputs. When Apple extended PCC to encompass Google's Confidential Inference infrastructure, it aimed to maintain these guarantees while leveraging Google's more powerful models.

Green acknowledges this design works well for a specific use case: pure inference where data stays within the trusted boundary. "As long as you never plan to do anything beyond inference," he writes, "I find this to be a compelling story." The problem is that useful agents cannot stay within that boundary.

When Private Data Meets Public Networks

An AI that performs only inference resembles a human assistant locked in a windowless room with no internet access and no phone. Your data remains perfectly safe, but your assistant can only handle the simplest tasks: summarizing messages you've already received, or helping you draft responses that stay on your device. This describes what Apple Intelligence does today.

The moment an agent needs to accomplish real work, the privacy guarantees dissolve. To find a restaurant for your dinner, the agent must query a search engine or non-private LLM. To send calendar invites, it must communicate with external services. Each of these interactions potentially leaks information about your private data.

The nature of this leakage depends on how the agent is designed. A straightforward implementation might collect all relevant facts about your dinner attendees and upload them to a more capable public model: "Here are thirty detailed facts about my guests and our meeting requirements, find me a suitable restaurant." This approach is efficient and natural, since the non-private model will likely be more powerful than the private one. It also reveals an extraordinary amount about your private conversations, including details that may not be necessary for the task at hand.

As Green asks, "Is Mike's affair relevant to the seating chart?" The agent may not know, and the prompting design may not be careful enough to filter it out.
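One way to picture the alternative to the "upload everything" design is a minimization step that runs inside the private boundary before any external query is built. The following is a minimal sketch, not a description of how Apple or Google build their agents: the Fact type, the category labels, and the allow-list are all hypothetical, and the sketch assumes facts arrive already labeled, which is exactly the hard part the question above points at.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fact:
    """One fact the agent pulled from private data (hypothetical type)."""
    subject: str
    category: str  # assumed labels, e.g. "dietary", "availability", "relationship"
    text: str


# Hypothetical allow-list: categories a given external task plausibly needs.
TASK_ALLOWED_CATEGORIES = {
    "restaurant_search": {"dietary", "availability", "location"},
}


def minimize(facts, task):
    """Keep only facts whose category is allow-listed for this task."""
    allowed = TASK_ALLOWED_CATEGORIES.get(task, set())
    return [f for f in facts if f.category in allowed]


def build_external_query(facts, task):
    """Build the text sent to the non-private service, omitting subject names."""
    kept = minimize(facts, task)
    constraints = sorted({f.text for f in kept})
    return "Find a restaurant for 6 people. Constraints: " + "; ".join(constraints)


facts = [
    Fact("Mike", "dietary", "one guest is vegetarian"),
    Fact("Mike", "relationship", "Mike is having an affair"),
    Fact("Dana", "availability", "Thursday after 7pm"),
]

query = build_external_query(facts, "restaurant_search")
print(query)
```

Even in this toy form the filter is only as good as the labels. A fact mislabeled as "dietary" would flow straight out, and nothing in the private inference layer would notice.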

The Advertising Problem

From a corporate perspective, this data flow represents something valuable. Consider the position of someone running an advertising business at a major tech company. You have billions of users with deeply useful data stored on their phones, data that is extremely valuable for targeted advertising. This data is currently inaccessible because users resist having their private conversations scanned. An agent that accesses this data and then queries your search engine or LLM creates a new channel for monetization.

Whoever operates the search infrastructure learns a vast amount about user desires, some originating from the most intimate private conversations. If the agent designer and the search operator are the same entity, the data monetization opportunity is significant. Green suggests it is difficult to believe that major tech executives are unaware of this dynamic.

The technical architecture of private inference does nothing to prevent this. The data leaves your device encrypted, is processed securely on trusted hardware, and then some portion of it flows outward through the agent's necessary interactions. The cryptographic protections ensure Apple and Google cannot see your raw data during processing, but they do not constrain what the agent communicates to external services.

The Lethal Trifecta

Simon Willison, a developer and security researcher, describes a condition he calls the "lethal trifecta" for AI systems: the combination of (a) access to private data, (b) untrusted content the LLM must parse, and (c) the ability to send external communications. These three elements create a perfect storm for data exfiltration attacks.

Apple's planned Siri agent embodies all three characteristics. It ingests data from your messages and documents, many of which originate from untrusted sources (emails, texts, web content). It has access to everything on your system. And to be useful, it must handle actions with external effects: calendar invites, messages, searches. This makes it a prime target for prompt injection attacks, where malicious instructions embedded in data cause the agent to reveal information it should protect.
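The three conditions can be made concrete with a small taint-tracking sketch. This is an illustration under stated assumptions, not Willison's proposal or anything Apple has described: the AgentSession class and its method names are hypothetical. It records whether a session has touched private data and untrusted content, and refuses the third leg, external sending, once both are present.

```python
from dataclasses import dataclass


@dataclass
class AgentSession:
    """Tracks which legs of the trifecta a session has touched (hypothetical)."""
    saw_private_data: bool = False
    saw_untrusted_content: bool = False

    def ingest(self, *, private: bool, untrusted: bool) -> None:
        """Record the properties of a piece of content the model has read."""
        self.saw_private_data |= private
        self.saw_untrusted_content |= untrusted

    def may_send_externally(self) -> bool:
        """Allow external sends only while at least one other leg is absent."""
        return not (self.saw_private_data and self.saw_untrusted_content)


session = AgentSession()

# The user's own notes: private, but written by the user.
session.ingest(private=True, untrusted=False)
allowed_before = session.may_send_externally()

# An inbound email: private context and attacker-controllable text.
session.ingest(private=True, untrusted=True)
allowed_after = session.may_send_externally()

print(allowed_before, allowed_after)
```

A gate this strict would block most of the work that makes an agent useful, since nearly every real task mixes inbound mail or web content with private context. That is the tension the trifecta describes.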

The problem is not theoretical. OpenAI recently introduced a "lockdown mode" for ChatGPT, restricting web searches due to the risk that the model might upload sensitive documents. This suggests that even frontier LLMs remain vulnerable to manipulation when processing untrusted input alongside private data.

Green frames the threat starkly: "If you think spam directed at humans is bad, wait until it's spam directed at agents." The combination of private data access and external communication capabilities creates attack surfaces that cryptographic protections cannot address.

The Government Question

A third adversary enters when we consider government surveillance. If an agent has access to all your data, messages, and actions, it has the ability to detect criminal activity, whether that involves CSAM, terrorism, tax fraud, or other crimes. These agents become one-stop shops for crime detection, capable of identifying patterns and reporting them.

This is not hypothetical. The UK's Ofcom has published requirements for encrypted messengers that follow similar logic. The European Commission has proposed comparable measures. The UK maintains Technical Capability Notices that allow it to demand system modifications from providers, potentially affecting devices worldwide. Apple is currently in a legal battle with the UK over its encrypted services.

In the United States, constitutional protections have traditionally limited such approaches. However, the Fourth Amendment applies only to government actors. A private company could configure its agents to report suspected crimes, then forward serious cases to authorities. This mirrors Apple's 2021 proposal to monitor photos for CSAM, which it later abandoned after significant backlash.

Green's key insight is that the difference between a helpful assistant, a corporate advertising tool, and a government surveillance mechanism comes down to prompting and model fine-tuning. Once you combine private data access with the ability to send messages, no technical protection from private inference alone can prevent misuse.

What Cryptography Cannot Solve

For decades, cryptography has aimed to replace trust with mathematical certainty: changing "I promise not to look" into "I cannot look." Private inference represents the most ambitious version of this promise, and against the adversary it was designed for (the provider performing the inference), it probably delivers.

The limitation Green identifies is that this adversary represents only a small piece of any agentic system. The adversaries that matter most are those who interact with the model directly or who designed its technical specifications. There is no cryptographic primitive that prevents an agent from uploading search facts to Google or reporting suspicious activity to the government. Those protections, if they exist at all, live in law, politics, and corporate incentives: the messy human institutions that cryptography was invented to let us stop trusting.

This does not mean the technical work is wasted. Private Cloud Compute and Confidential Inference solve real problems, and the industry would be worse off without them. But they are necessary conditions, not sufficient ones. Building truly private AI agents will require solving problems that extend well beyond cryptography: questions of policy, design, and governance that determine what agents actually do with the access they are granted.

The future of Siri, and of personal AI more broadly, depends not just on how well we protect data during inference, but on what we allow agents to do once they have it. That is a question no amount of technical sophistication can answer alone.
