MLXIO
AI / ML · May 28, 2026 · 8 min read · By MLXIO Insights Team

Apple Google AI Deal Sends Siri to Nvidia Cloud Chips

MLXIO Intelligence

Analysis Snapshot

MLXIO Impact: 71 (High)
Confidence: Medium · Trend: 10 · Freshness: 95 · Source Trust: 100 · Factual Grounding: 93 · Signal Cluster: 20

High MLXIO Impact based on trend velocity, freshness, source trust, and factual grounding.

Thesis

High Confidence

Apple’s next Siri appears set to keep an Apple-branded, privacy-first surface while relying on Google’s Gemini, Google Cloud, and Nvidia confidential-compute AI chips for some underlying AI workloads.

Evidence

  • The Information says Apple is using a version of Google’s Gemini model to train a smaller model that can run locally on Apple devices through distillation.
  • The report says some user queries to the new Siri will run in Google Cloud on a licensed Gemini model.
  • Apple recently approved Nvidia privacy technology in the Google Cloud setting, with confidential compute encrypting data and AI models during processing.
  • 9to5Mac says Apple’s WWDC unveiling of iOS 27, the new Siri, and other reveals is little more than one week away.

Uncertainty

  • The report does not specify which Siri queries will run on device, through Private Cloud Compute, or in Google Cloud.
  • The scale of Apple’s use of Nvidia AI chips is not defined.
  • Apple’s exact WWDC messaging around Google, Gemini, and cloud processing remains unknown.

What To Watch

  • WWDC disclosures on the new Siri, iOS 27, Apple Intelligence, and Private Cloud Compute.
  • Apple privacy or security documentation explaining Google Cloud and Nvidia confidential compute usage.
  • Post-launch evidence on Siri latency and behavior for requests routed to cloud AI systems.

Verified Claims

Apple is reportedly preparing to use a version of Google’s Gemini model for upcoming Apple Intelligence features.
📎 The article says reporting from The Information, summarized by 9to5Mac, says Apple is preparing to use a version of Google’s Gemini model for upcoming Apple Intelligence features. (High)
Apple is reportedly using Gemini distillation to train a smaller model that can run locally on Apple devices.
📎 The article states that Apple is using a version of Google’s large Gemini model to train a smaller version that can run locally on Apple devices, a process known as distillation. (High)
The next Siri may use a layered AI system that includes on-device processing, Apple Private Cloud Compute, and Google Cloud for some queries.
📎 The article says some work stays on device, some runs through Private Cloud Compute, and some user queries to the new Siri reportedly run in Google Cloud on a licensed Gemini model. (High)
Apple reportedly approved the use of Nvidia confidential compute technology in the Google Cloud setting.
📎 The article says The Information reported Apple recently approved the use of Nvidia privacy technology in this Google Cloud setting. (High)
Nvidia confidential compute encrypts data and AI models while they are being processed, but may slightly slow cloud AI query processing.
📎 The article describes confidential compute as a security feature inside Nvidia GPUs that encrypts data and AI models during processing and says enabling it slightly slows AI query processing in the cloud. (High)

Frequently Asked

How is Apple reportedly using Google Gemini for Siri?

Apple is reportedly using a version of Google’s Gemini model to train a smaller model for local device use, and some Siri queries may run through Google Cloud on a licensed Gemini model.

Will the new Siri run entirely on Apple devices?

Not entirely, according to the article. The reported system splits work across on-device AI, Apple Private Cloud Compute, and Google Cloud for some Siri queries.

Why are Nvidia chips involved in Apple’s reported AI setup?

The article says Apple reportedly approved Nvidia privacy technology in the Google Cloud setting, suggesting Nvidia AI chips will be used for at least some cloud compute needs.

What is distillation in Apple’s reported Gemini implementation?

Distillation is described as training a smaller model from a larger model so it can run with less compute, such as locally on Apple devices.

What privacy feature is Apple reportedly using with Nvidia GPUs?

Apple is reportedly using Nvidia confidential compute, which encrypts data and AI models while they are being processed in the cloud.

Updated on May 28, 2026

Apple’s next Siri may carry Apple branding on the surface while sending some hard queries through Google Cloud on Nvidia AI chips underneath.

That is the sharper implication of the new reporting from The Information, summarized by 9to5Mac, which says Apple is preparing to use a version of Google’s Gemini model for upcoming Apple Intelligence features while still presenting on-device processing and Private Cloud Compute as central to its privacy pitch.

Apple’s Siri Problem Now Looks Like an Infrastructure Problem

The headline is not simply that Apple is working with Google. That was already known from the companies’ January announcement of a “multi-year collaboration,” according to the BBC. The more revealing detail is how Apple may be implementing that relationship.

The Information’s report says Apple is using a version of Google’s large Gemini model to train a smaller model that can run locally on Apple devices. The process is called distillation: a large model transfers some of its behavior into a smaller one that can run with less compute.

“Apple is using a version of Google’s large Gemini model to train a smaller version of the model that can run locally on Apple devices, a process known as distillation.”
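
For readers who want a concrete picture of distillation, here is a minimal sketch of the textbook objective: the student is penalized when its softened output distribution drifts from the teacher's. This is the generic technique only. Nothing in the reporting describes Apple's or Google's actual training setup, and the function names, the temperature value, and the toy logits below are all assumptions made for illustration.

```python
import math

def softmax(logits, temperature=1.0):
    """Turn raw scores into probabilities; a higher temperature flattens them."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL(teacher || student) on softened distributions, scaled by T^2.

    The T^2 factor is a common convention in the distillation literature,
    not something taken from the report.
    """
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return kl * temperature ** 2

# Toy logits over three candidate next tokens (made-up numbers).
teacher = [4.0, 1.0, 0.5]
close_student = [3.5, 1.2, 0.4]   # roughly agrees with the teacher
far_student = [0.5, 1.0, 4.0]     # prefers a different token

loss_close = distillation_loss(teacher, close_student)
loss_far = distillation_loss(teacher, far_student)
print(f"close student loss: {loss_close:.4f}")
print(f"far student loss:   {loss_far:.4f}")
```

A training loop would adjust the student's weights to push this loss down, which is how a small model can inherit some of a large model's behavior without matching its size.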

MLXIO analysis: this is Apple trying to preserve its device-first story without pretending every useful AI task can live on an iPhone. The company can still design the Siri interface, set privacy rules, and control the user experience. But the intelligence layer appears to lean on Google’s model work and, for some requests, Google’s cloud capacity.

That context matters heading into WWDC, where Apple is expected to show iOS 27, the new Siri, and the next phase of Apple Intelligence. As we covered in iOS 27 Siri Redesign Reveals Apple’s AI Reset Button, Siri is no longer just a voice assistant update. It is the public test of whether Apple can make AI feel native to the iPhone rather than bolted on late.


The New Siri Stack May Split Work Across Device, Apple Cloud, and Google Cloud

The report points to a layered system rather than one clean “Apple AI” engine. Some work stays on device. Some runs through Private Cloud Compute. Some user queries to the new Siri reportedly run in Google Cloud on a licensed Gemini model.

That split creates the central engineering and trust challenge: Apple has to make several execution paths feel like one assistant.

Layer | Reported or implied role | Strategic tension
On-device AI | Smaller distilled models running locally | Best fit for Apple’s privacy narrative, but limited by device compute
Private Cloud Compute | Apple’s privacy-branded cloud AI layer | The brand may remain even if not all processing runs only on Apple servers
Google Cloud + Gemini | Some Siri queries using licensed Gemini | More capability, but more dependence on outside infrastructure
Nvidia confidential compute | Encryption for data and models during processing | Helps privacy claims, but cloud AI still introduces trust questions
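
The layered split can be pictured as a routing decision made per request. The sketch below is purely illustrative: the report does not say how Siri decides where a query runs, so the `complexity` score, both thresholds, and every name here are hypothetical stand-ins rather than anything Apple has described.

```python
from dataclasses import dataclass
from enum import Enum

class Tier(Enum):
    ON_DEVICE = "on-device distilled model"
    PRIVATE_CLOUD = "Private Cloud Compute"
    GOOGLE_CLOUD = "Google Cloud (licensed Gemini)"

@dataclass
class Query:
    text: str
    complexity: float  # hypothetical difficulty estimate from 0.0 to 1.0

# Hypothetical cutoffs; the reporting gives no routing criteria.
DEVICE_MAX = 0.3
PRIVATE_CLOUD_MAX = 0.7

def route(query: Query) -> Tier:
    """Pick the smallest tier assumed able to handle the query."""
    if query.complexity <= DEVICE_MAX:
        return Tier.ON_DEVICE
    if query.complexity <= PRIVATE_CLOUD_MAX:
        return Tier.PRIVATE_CLOUD
    return Tier.GOOGLE_CLOUD

examples = [
    Query("set a timer for ten minutes", 0.1),
    Query("summarize my unread messages", 0.5),
    Query("plan a trip using my calendar and email", 0.9),
]
for q in examples:
    print(f"{q.text!r} -> {route(q).value}")
```

Even in a toy version like this, the user-facing problem is visible: three execution paths with different latency and trust properties have to feel like one assistant.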

The Nvidia detail is especially important. The Information says Apple recently approved the use of Nvidia privacy technology in this Google Cloud setting, which suggests Apple will use Nvidia AI chips for at least some compute needs.

The specific technology is confidential compute, a security feature inside Nvidia GPUs that encrypts data and AI models while they are being processed. The report says enabling it slightly slows AI query processing in the cloud, but could help Apple maintain its privacy commitments.

That trade-off is classic Apple: accept some friction if it protects the trust layer. The risk is that users do not judge AI on architecture diagrams. They judge it on whether Siri answers quickly, acts correctly, and does not feel evasive about where their data goes.

“Trillions of Parameters” Explains Why Apple Cannot Keep Everything Local

The report says the full Gemini model provided by Google has “trillions of parameters” and needs so much computing power that Apple has struggled to run it on its own Private Cloud Compute infrastructure.

That single phrase explains the deal better than any partnership language. Apple can shrink models for local use, but frontier-scale model inference still demands huge server capacity. If Siri is expected to handle more complex requests, Apple either builds that capacity, rents it, or partners for it.
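
A rough back-of-the-envelope calculation shows why parameter count translates into server capacity. Every figure below is an assumption chosen for illustration: the report says only "trillions of parameters," so the sketch uses a single trillion as a floor, 2 bytes per parameter (16-bit weights), a hypothetical accelerator with 80 GB of memory, and a hypothetical 3-billion-parameter distilled model for contrast. It counts weight storage only and ignores activations, caches, and redundancy.

```python
import math

def weight_memory_gb(parameters: float, bytes_per_parameter: int) -> float:
    """Memory needed just to hold the weights, in gigabytes (1 GB = 1e9 bytes)."""
    return parameters * bytes_per_parameter / 1e9

def accelerators_for_weights(parameters: float,
                             bytes_per_parameter: int,
                             memory_per_accelerator_gb: float) -> int:
    """Minimum accelerator count whose combined memory fits the weights."""
    needed = weight_memory_gb(parameters, bytes_per_parameter)
    return math.ceil(needed / memory_per_accelerator_gb)

# All values are illustrative assumptions, not figures from the report.
ASSUMED_LARGE_MODEL_PARAMETERS = 1e12      # one trillion, used as a floor
ASSUMED_SMALL_MODEL_PARAMETERS = 3e9       # hypothetical distilled model
ASSUMED_BYTES_PER_PARAMETER = 2            # 16-bit weights
ASSUMED_ACCELERATOR_MEMORY_GB = 80         # hypothetical data-center GPU

large_gb = weight_memory_gb(ASSUMED_LARGE_MODEL_PARAMETERS, ASSUMED_BYTES_PER_PARAMETER)
small_gb = weight_memory_gb(ASSUMED_SMALL_MODEL_PARAMETERS, ASSUMED_BYTES_PER_PARAMETER)
gpus = accelerators_for_weights(ASSUMED_LARGE_MODEL_PARAMETERS,
                                ASSUMED_BYTES_PER_PARAMETER,
                                ASSUMED_ACCELERATOR_MEMORY_GB)

print(f"large model weights: {large_gb:.0f} GB")   # 2000 GB under these assumptions
print(f"small model weights: {small_gb:.0f} GB")   # 6 GB under these assumptions
print(f"accelerators for large model weights alone: {gpus}")  # 25
```

Under these assumptions, one copy of a trillion-parameter model needs about 2,000 GB for weights alone, spread across at least 25 accelerators, while the hypothetical distilled model fits in single-digit gigabytes. That gap is the practical reason distillation and rented cloud capacity show up in the same report.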

Apple is also reportedly looking at acquisitions that could help shrink AI models for local execution. The Information says one company Apple has considered acquiring is Liquid AI, a Cambridge, Mass.-based startup focused on running AI locally on devices.

MLXIO analysis: Apple is trying to solve both sides of the same equation. Distillation and possible local-AI acquisitions reduce cloud dependence over time. Google Cloud and Nvidia chips cover the near-term gap. That makes the Google arrangement look less like a single vendor deal and more like a bridge while Apple works on smaller, device-friendly models.

The scale question is not academic. If Apple ships a meaningfully better Siri across eligible devices, even a partial rollout could become one of the largest consumer AI deployments almost overnight. But the source material does not establish the exact device list, rollout timing, or feature set. Those are WWDC questions, not facts yet.

Apple and Google’s AI Tie-Up Revives an Old Antitrust Shadow

Apple and Google have long mixed rivalry with dependence. The BBC notes that previous Google software deals on Apple devices have been valued in the billions. It also reported that Google was revealed to have paid more than $26bn in 2021 to firms including Apple to make Google Search the default option on iPhones, a disclosure that came before a U.S. judge ruled in August 2024 that Google had operated an illegal online search monopoly.

That history matters because AI access on mobile devices may become as strategically sensitive as search placement. The BBC quoted IDC analyst Francisco Jeronimo saying the new AI arrangement was “likely to be a red flag” for regulators.

Apple and Google said in January that Apple Intelligence would continue to run on Apple devices and Private Cloud Compute while maintaining Apple’s privacy standards. The new reporting complicates that message. 9to5Mac says Apple is expected to keep using the Private Cloud Compute branding even though the next wave of Apple Intelligence features will no longer run exclusively on Apple’s own servers.

That does not mean Apple is breaking its privacy promise. It means the wording will matter. If Apple uses third-party cloud infrastructure, it will need to explain where the privacy boundary sits.

Developers and Users Will Care About Different Failure Modes

Investors may read the arrangement as pragmatic. Apple gets a faster path to a better Siri. Google gets validation for Gemini. Nvidia’s role reinforces how central its chips remain to high-end AI inference.

Developers will ask narrower questions. If Siri becomes more capable, what new app actions become possible? What permissions will users approve? How will latency work when a request moves from device to Apple cloud to Google Cloud? Will Apple expose enough APIs, or reserve the richest experiences for its own system apps?

Users will be less interested in the vendor map. They will care whether Siri finally stops feeling brittle.

The BBC reported that improvements to Apple services, including a more personalized Siri, are to be powered with Google AI. The Information’s technical details now suggest how that could happen: smaller local models for device-side tasks, larger Gemini-backed processing for harder prompts, and Nvidia confidential compute to reduce the privacy risk of cloud inference.

For anyone planning to follow the reveal live, the timing is tight. Apple’s WWDC keynote is just over a week away from the report date, and we have a viewing guide here: New Siri Grabs the Mic: How to Watch WWDC 2026 Live.


WWDC Will Show Whether This Is a Bridge or a Dependency

The strongest version of Apple’s strategy is clear: use Google and Nvidia-backed cloud infrastructure now, distill more intelligence onto devices over time, and keep the final experience unmistakably Apple.

The weaker version is also clear: Apple has to rely on Google for the most important part of its AI assistant while asking users and regulators to accept that the privacy model still holds.

At WWDC, watch for three signals.

  • Branding: Does Apple mention Google or Gemini clearly, or keep the focus on Apple Intelligence and Private Cloud Compute?
  • Routing: Does Apple explain when Siri runs locally, when it uses Apple cloud systems, and when outside cloud infrastructure enters the loop?
  • Capability: Does the new Siri demonstrate harder, personalized tasks, or mostly polished demos around limited use cases?

The thesis to test is simple: Apple is not abandoning its device-first AI strategy. It is buying time for it. The evidence that would confirm that thesis is a Siri that feels faster and more capable while Apple gives credible detail on privacy and local processing. The evidence that would weaken it is vagueness: broad AI promises, limited availability, and no clear answer on how much of Apple Intelligence now depends on Google’s servers.

The Bottom Line

  • Apple’s next Siri may depend more on Google’s AI infrastructure than its consumer-facing branding suggests.
  • The reported use of Gemini distillation shows Apple is trying to balance on-device AI with more powerful cloud-backed intelligence.
  • The deal could reshape expectations for Apple Intelligence ahead of WWDC and iOS 27.

Roles in Apple’s Reported AI Setup

Layer | Primary Player | Role
User experience | Apple | Controls Siri branding, interface, privacy rules, and device-first positioning
Model intelligence | Google | Provides Gemini model technology reportedly used for distillation and some advanced AI tasks
Compute infrastructure | Google Cloud and Nvidia | Handles cloud-based AI processing using Nvidia AI chips for harder requests
