Siri AI Runs on Gemini Models, But It Isn't Gemini: Untangling Apple's New Foundation Model Stack

Mobile Reporter

Apple Intelligence and the new Siri are built on Google's Gemini models, yet Apple insists this isn't Gemini Assistant in disguise. For developers shipping on both platforms, the distinction matters more than the marketing suggests, especially around where inference happens and what privacy guarantees carry over.

Apple confirmed that Siri AI and the broader Apple Intelligence stack now run on Google's Gemini models, and the framing has caused predictable confusion. The short version: the models underneath are Gemini-derived, but what users interact with is not Gemini Assistant, Google's Android-native helper. If you build apps that touch system intelligence APIs on both iOS and Android, the difference shapes how you reason about capabilities, privacy, and where your users' data physically goes.

The naming problem developers keep tripping over

Google uses "Gemini" to mean two distinct things, and that overlap is the root of most misunderstanding. Gemini is a family of foundation models. Gemini Assistant is the conversational product on Android that replaced Google Assistant. Google routinely drops the "Assistant" suffix in its own marketing, so the same word ends up describing both the engine and the car.

Apple's Siri AI and Google's Gemini Assistant both sit on top of Gemini models, but they are separate products with separate behaviors. Siri AI is not a reskinned Gemini Assistant. For anyone integrating with App Intents or the assistant frameworks, this means you are still targeting Apple's surface area and Apple's semantics, not Google's. The model lineage doesn't change the API contract.

What "custom-built in collaboration with Google" actually means

Apple describes its third generation of Apple Foundation Models (AFM) as a family of five models, "custom-built in collaboration with Google." Macworld's Jason Snell read the available statements and landed on a useful breakdown: four of the five models are customized Gemini variants compiled and tuned to run on Apple Silicon, while the fifth and most capable model is closer to a standard Google model running on Google's own servers, trained on a different data mix.

Craig Federighi's own language is worth parsing carefully for developers who care about model behavior. He says the four on-Apple-Silicon models are "trained using proprietary data with reinforcement learning and refined using outputs from Gemini frontier models." That is not the same as saying Apple wrote a new architecture from scratch. The practical interpretation: Apple took Gemini foundation models, rebuilt and quantized them to the sizes its hardware needs, then retrained them on its own data, yielding its own weights and guardrails.

The consequence at the application layer is that Siri AI pulls from Apple's own knowledge sources, not Google web search or Google's knowledge graph. If you were expecting Google-quality web grounding to leak through, it won't. You get Apple's data boundaries.

Where inference happens, tier by tier

This is the part that should drive your privacy documentation and your threat modeling, because the five models are spread across three different execution environments:

  • Two models run fully on-device. No data leaves the iPhone, iPad, or Mac. This is the same posture developers already plan around for on-device ML, and it carries the strongest guarantee.
  • Two models run on Apple Silicon inside Apple's Private Cloud Compute (PCC). Data leaves the device but enters an environment Apple says retains nothing and exposes nothing to either Apple or Google. The PCC design is meant to be independently verifiable by security researchers, which is the key claim: you are not asked to trust a policy; you are pointed at an architecture that outside experts can audit.
  • The largest model runs on Google servers, on NVIDIA GPUs rather than Apple Silicon, but on hardware dedicated to Apple's use. Apple states the PCC architecture still applies here.
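
For a privacy review, the three tiers above can be written down as a small lookup table. The sketch below uses Python purely for illustration. The names `Tier`, `TierProfile`, and `TIER_PROFILES` are hypothetical and do not correspond to any Apple or Google API, and the values are simply transcribed from the list above. It is a documentation aid, not a way to select a tier at runtime.

```python
from dataclasses import dataclass
from enum import Enum


class Tier(Enum):
    """Hypothetical labels for the three execution environments listed above."""
    ON_DEVICE = "on-device"
    APPLE_PCC = "apple-pcc"
    GOOGLE_HOSTED = "google-hosted"


@dataclass(frozen=True)
class TierProfile:
    model_count: int          # how many of the five models sit in this tier
    data_leaves_device: bool  # whether a request leaves the user's device
    host: str                 # where the hardware lives
    hardware: str             # what the models run on


# Values transcribed from the tier list; this is not an official schema.
TIER_PROFILES = {
    Tier.ON_DEVICE: TierProfile(2, False, "user device", "Apple Silicon"),
    Tier.APPLE_PCC: TierProfile(2, True, "Apple servers", "Apple Silicon"),
    Tier.GOOGLE_HOSTED: TierProfile(1, True, "Google servers", "NVIDIA GPUs"),
}


def tiers_leaving_device():
    """Return the tiers a data-flow diagram has to show crossing the network."""
    return [tier for tier, profile in TIER_PROFILES.items()
            if profile.data_leaves_device]


if __name__ == "__main__":
    for tier in tiers_leaving_device():
        profile = TIER_PROFILES[tier]
        print(f"{tier.value}: {profile.host}, {profile.hardware}")
```

Keeping the table in one place makes it easier to update a privacy document if Apple revises how the models are distributed.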

Apple's security blog lists the requirements the company says hold across all tiers: stateless computation, enforceable guarantees, no privileged runtime access, non-targetability, and verifiable transparency. Those are strong words, and on Apple's own silicon they are backed by a hardware story Apple controls end to end.

The honest caveat for the Google-hosted tier

The one place where the guarantee gets softer is the biggest model running on Google's NVIDIA hardware. PCC on Google infrastructure is not structurally identical to PCC on Apple's own servers. Apple says the same protections apply and appears confident in that claim, but this is newer ground. Ben Lovejoy's framing in the original 9to5Mac piece is fair: this is not an accusation that Apple is being misleading but an acknowledgment that a PCC implementation on third-party silicon is uncharted territory where undiscovered vulnerabilities are at least possible.

For developers, that translates into a concrete recommendation. If your app handles regulated or sensitive data and you route any of it through system intelligence features, treat the on-device and Apple-hosted PCC tiers as the well-understood baseline, and treat the Google-hosted frontier model as the tier that warrants the most scrutiny in your own privacy review.
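
One way to make that recommendation repeatable is to encode it as a small check in your review tooling. The following is a minimal sketch in Python, not anything Apple provides: the tier labels, the two-level `Scrutiny` scale, and the `review_level` function are all hypothetical names chosen for illustration.

```python
from __future__ import annotations

from enum import IntEnum


class Scrutiny(IntEnum):
    """Hypothetical two-level scale for an internal privacy review."""
    BASELINE = 1
    ELEVATED = 2


# Hypothetical tier labels, grouped the way the recommendation above groups them.
WELL_UNDERSTOOD = {"on-device", "apple-pcc"}
NEEDS_SCRUTINY = {"google-hosted"}


def review_level(handles_sensitive_data: bool, reachable_tiers: set[str]) -> Scrutiny:
    """Return how closely a feature's data flow should be reviewed.

    reachable_tiers is whatever your own analysis concludes a feature
    may touch; this function does not discover that for you.
    """
    unknown = reachable_tiers - WELL_UNDERSTOOD - NEEDS_SCRUTINY
    if unknown:
        raise ValueError(f"unrecognized tiers: {sorted(unknown)}")
    # Sensitive data that may reach the Google-hosted tier gets the closest look.
    if handles_sensitive_data and reachable_tiers & NEEDS_SCRUTINY:
        return Scrutiny.ELEVATED
    return Scrutiny.BASELINE


if __name__ == "__main__":
    level = review_level(True, {"on-device", "google-hosted"})
    print(level.name)
```

Rejecting unrecognized tier labels is a deliberate choice here: a typo in a review checklist should fail loudly rather than silently fall back to the baseline.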

Cross-platform takeaway

If you maintain apps on both iOS and Android, the cleanest mental model is this: the foundation models converge on Gemini, but the products, data boundaries, and trust stories diverge sharply. On Android, Gemini Assistant is Google's product end to end, grounded in Google's services. On iOS, Siri AI is Apple's product, grounded in Apple's data, with a tiered execution model that mostly keeps data inside Apple-controlled hardware and only reaches Google's servers for the heaviest workloads.

The shared model lineage might tempt you to assume feature parity or behavioral parity across platforms. Don't. Build and test against each platform's actual assistant surface, document where user data travels on each, and keep the Google-hosted Apple tier flagged as the area still proving itself. The marketing convergence is real, but your integration and privacy work still lives on two separate platforms with two separate sets of guarantees.
