Apple’s next Siri may carry Apple branding on the surface while sending some hard queries through Google Cloud on Nvidia AI chips underneath.
That is the sharper implication of new reporting from The Information, summarized by 9to5Mac. The report says Apple is preparing to use a version of Google’s Gemini model for upcoming Apple Intelligence features while still presenting on-device processing and Private Cloud Compute as central to its privacy pitch.
Apple’s Siri Problem Now Looks Like an Infrastructure Problem
The headline is not simply that Apple is working with Google. That was already known from the companies’ January announcement of a “multi-year collaboration,” according to the BBC. The more revealing detail is how Apple may be implementing that relationship.
The Information’s report says Apple is using a version of Google’s large Gemini model to train a smaller model that can run locally on Apple devices. The process is called distillation: a large model transfers some of its behavior into a smaller one that can run with less compute.
“Apple is using a version of Google’s large Gemini model to train a smaller version of the model that can run locally on Apple devices, a process known as distillation.”
MLXIO analysis: this is Apple trying to preserve its device-first story without pretending every useful AI task can live on an iPhone. The company can still design the Siri interface, set privacy rules, and control the user experience. But the intelligence layer appears to lean on Google’s model work and, for some requests, Google’s cloud capacity.
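For readers unfamiliar with distillation, the sketch below shows the textbook form of the idea: a student model is trained to match a teacher's temperature-softened output distribution. The report does not say which distillation method Apple uses, so the loss function, the temperature value, and the example logits here are all illustrative assumptions, not a description of Apple's or Google's pipeline.

```python
import math


def softmax(logits, temperature=1.0):
    """Convert raw scores to probabilities, softened by a temperature."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL(teacher || student) on softened distributions, scaled by T^2.

    This is the classic knowledge-distillation objective; it is an
    assumption that anything like it is used in the reported setup.
    """
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return kl * temperature ** 2


# Hypothetical logits over three candidate answers.
teacher = [4.0, 1.0, 0.5]
close_student = [3.8, 1.1, 0.4]  # mostly agrees with the teacher
far_student = [0.5, 1.0, 4.0]    # prefers a different answer

print(distillation_loss(teacher, close_student))
print(distillation_loss(teacher, far_student))
```

A student that agrees with the teacher gets a loss near zero, while one that disagrees gets a larger loss. Training pushes the smaller model toward the larger model's behavior without copying its size.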
That context matters heading into WWDC, where Apple is expected to show iOS 27, the new Siri, and the next phase of Apple Intelligence. As we covered in iOS 27 Siri Redesign Reveals Apple’s AI Reset Button, Siri is no longer just a voice assistant update. It is the public test of whether Apple can make AI feel native to the iPhone rather than bolted on late.
The New Siri Stack May Split Work Across Device, Apple Cloud, and Google Cloud
The report points to a layered system rather than one clean “Apple AI” engine. Some work stays on device. Some runs through Private Cloud Compute. Some user queries to the new Siri reportedly run in Google Cloud on a licensed Gemini model.
That split creates the central engineering and trust challenge: Apple has to make several execution paths feel like one assistant.
| Layer | Reported or implied role | Strategic tension |
|---|---|---|
| On-device AI | Smaller distilled models running locally | Best fit for Apple’s privacy narrative, but limited by device compute |
| Private Cloud Compute | Apple’s privacy-branded cloud AI layer | The brand may remain even if some processing runs outside Apple’s own servers |
| Google Cloud + Gemini | Some Siri queries using licensed Gemini | More capability, but more dependence on outside infrastructure |
| Nvidia confidential compute | Encryption for data and models during processing | Helps privacy claims, but cloud AI still introduces trust questions |
The Nvidia detail is especially important. The Information says Apple recently approved the use of Nvidia privacy technology in this Google Cloud setting, which suggests Apple will use Nvidia AI chips for at least some compute needs.
The specific technology is confidential compute, a security feature inside Nvidia GPUs that encrypts data and AI models while they are being processed. The report says enabling it slightly slows AI query processing in the cloud, but could help Apple maintain its privacy commitments.
That trade-off is classic Apple: accept some friction if it protects the trust layer. The risk is that users do not judge AI on architecture diagrams. They judge it on whether Siri answers quickly, acts correctly, and does not feel evasive about where their data goes.
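The layered split described in this section can be pictured as a routing decision. The sketch below is purely hypothetical: the reporting does not say how Apple decides where a request runs, so the `complexity` score, the two thresholds, and every name here are assumptions made up for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class Tier(Enum):
    ON_DEVICE = "on-device distilled model"
    PRIVATE_CLOUD = "Private Cloud Compute"
    PARTNER_CLOUD = "Google Cloud + Gemini with confidential compute"


@dataclass
class Request:
    text: str
    complexity: float  # hypothetical score from 0.0 (trivial) to 1.0 (hard)


# Hypothetical cut-offs; real routing criteria are not public.
ON_DEVICE_MAX = 0.3
PRIVATE_CLOUD_MAX = 0.7


def route(request: Request) -> Tier:
    """Pick the smallest tier assumed capable of handling the request."""
    if not 0.0 <= request.complexity <= 1.0:
        raise ValueError("complexity must be between 0.0 and 1.0")
    if request.complexity <= ON_DEVICE_MAX:
        return Tier.ON_DEVICE
    if request.complexity <= PRIVATE_CLOUD_MAX:
        return Tier.PRIVATE_CLOUD
    return Tier.PARTNER_CLOUD


for req in (
    Request("set a timer for ten minutes", 0.1),
    Request("summarize my unread email", 0.5),
    Request("plan a trip using my messages and calendar", 0.9),
):
    print(f"{req.text!r} -> {route(req).value}")
```

Preferring the smallest capable tier mirrors the device-first framing: a request leaves the phone only when it has to. Whatever the real criteria are, the user-facing challenge is the same, because all three paths have to feel like one assistant.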
“Trillions of Parameters” Explains Why Apple Cannot Keep Everything Local
The report says the full Gemini model provided by Google has “trillions of parameters” and needs so much computing power that Apple has struggled to run it on its own Private Cloud Compute infrastructure.
That single phrase explains the deal better than any partnership language. Apple can shrink models for local use, but frontier-scale model inference still demands huge server capacity. If Siri is expected to handle more complex requests, Apple either builds that capacity, rents it, or partners for it.
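A rough back-of-the-envelope calculation shows why parameter count translates into server demand. The figures below are assumptions for illustration only: the report says "trillions of parameters" without giving an exact count, and neither the numeric precision nor the size of any distilled on-device model is stated.

```python
def weights_memory_gb(num_parameters, bytes_per_parameter):
    """Memory needed just to hold model weights, in decimal gigabytes.

    Ignores activations, caches, and serving overhead, so real
    requirements would be higher.
    """
    return num_parameters * bytes_per_parameter / 1e9


# Assumed: a 1-trillion-parameter model stored at 16-bit precision (2 bytes).
large_model_gb = weights_memory_gb(1_000_000_000_000, 2)

# Assumed: a 3-billion-parameter local model quantized to 4 bits (0.5 bytes).
small_model_gb = weights_memory_gb(3_000_000_000, 0.5)

print(f"Large model weights: {large_model_gb:,.0f} GB")  # 2,000 GB
print(f"Small model weights: {small_model_gb:,.1f} GB")  # 1.5 GB
```

Under these assumptions the large model's weights alone need about 2,000 GB, far beyond any phone and enough to require many accelerators working together, while the small model fits in a couple of gigabytes.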
Apple is also reportedly looking at acquisitions that could help shrink AI models for local execution. The Information says one company Apple has considered acquiring is Liquid AI, a Cambridge, Mass.-based startup focused on running AI locally on devices.
MLXIO analysis: Apple is trying to solve both sides of the same equation. Distillation and possible local-AI acquisitions reduce cloud dependence over time. Google Cloud and Nvidia chips cover the near-term gap. That makes the Google arrangement look less like a single vendor deal and more like a bridge while Apple works on smaller, device-friendly models.
The scale question is not academic. If Apple ships a meaningfully better Siri across eligible devices, even a partial rollout could become one of the largest consumer AI deployments almost overnight. But the source material does not establish the exact device list, rollout timing, or feature set. Those are WWDC questions, not facts yet.
Apple and Google’s AI Tie-Up Revives an Old Antitrust Shadow
Apple and Google have long mixed rivalry with dependence. The BBC notes that previous Google software deals on Apple devices have been valued in the billions. It also reported that Google paid more than $26bn in 2021 to firms including Apple to make Google Search the default option on their devices, including iPhones. That figure was revealed before a U.S. judge ruled in August 2024 that Google had operated an illegal online search monopoly.
That history matters because AI access on mobile devices may become as strategically sensitive as search placement. The BBC quoted IDC analyst Francisco Jeronimo saying the new AI arrangement was “likely to be a red flag” for regulators.
Apple and Google said in January that Apple Intelligence would continue to run on Apple devices and Private Cloud Compute while maintaining Apple’s privacy standards. The new reporting complicates that message. 9to5Mac says Apple is expected to keep using the Private Cloud Compute branding even though the next wave of Apple Intelligence features will no longer run exclusively on Apple’s own servers.
That does not mean Apple is breaking its privacy promise. It means the wording will matter. If Apple uses third-party cloud infrastructure, it will need to explain where the privacy boundary sits.
Developers and Users Will Care About Different Failure Modes
Investors may read the arrangement as pragmatic. Apple gets a faster path to a better Siri. Google gets validation for Gemini. Nvidia’s role reinforces how central its chips remain to high-end AI inference.
Developers will ask narrower questions. If Siri becomes more capable, what new app actions become possible? What permissions will users approve? How will latency work when a request moves from device to Apple cloud to Google Cloud? Will Apple expose enough APIs, or reserve the richest experiences for its own system apps?
Users will be less interested in the vendor map. They will care whether Siri finally stops feeling brittle.
The BBC reported that improvements to Apple services, including a more personalized Siri, are to be powered with Google AI. The Information’s technical details now suggest how that could happen: smaller local models for device-side tasks, larger Gemini-backed processing for harder prompts, and Nvidia confidential compute to reduce the privacy risk of cloud inference.
For anyone planning to follow the reveal live, the timing is tight. Apple’s WWDC keynote is just over a week away from the report date, and we have a viewing guide here: New Siri Grabs the Mic: How to Watch WWDC 2026 Live.
WWDC Will Show Whether This Is a Bridge or a Dependency
The strongest version of Apple’s strategy is clear: use Google and Nvidia-backed cloud infrastructure now, distill more intelligence onto devices over time, and keep the final experience unmistakably Apple.
The weaker version is also clear: Apple has to rely on Google for the most important part of its AI assistant while asking users and regulators to accept that the privacy model still holds.
At WWDC, watch for three signals.
- Branding: Does Apple mention Google or Gemini clearly, or keep the focus on Apple Intelligence and Private Cloud Compute?
- Routing: Does Apple explain when Siri runs locally, when it uses Apple cloud systems, and when outside cloud infrastructure enters the loop?
- Capability: Does the new Siri demonstrate harder, personalized tasks, or mostly polished demos around limited use cases?
The thesis to test is simple: Apple is not abandoning its device-first AI strategy. It is buying time for it. The evidence that would confirm that thesis is a Siri that feels faster and more capable while Apple gives credible detail on privacy and local processing. The evidence that would weaken it is vagueness: broad AI promises, limited availability, and no clear answer on how much of Apple Intelligence now depends on Google’s servers.
The Bottom Line
- Apple’s next Siri may depend more on Google’s AI infrastructure than its consumer-facing branding suggests.
- The reported use of Gemini distillation shows Apple is trying to balance on-device AI with more powerful cloud-backed intelligence.
- The deal could reshape expectations for Apple Intelligence ahead of WWDC and iOS 27.










