What is Gemini 3.5 Flash?

Gemini 3.5 Flash is Google’s AI model introduced at Google I/O for coding and autonomous AI agent workflows.

How fast is Gemini 3.5 Flash?

Google says Gemini 3.5 Flash delivers 4 times the output tokens per second of other frontier models, and an optimized version is 12 times faster at the same quality.

Why is Gemini 3.5 Flash focused on agents instead of chatbots?

Google is positioning Gemini 3.5 Flash for delegated digital work: planning, executing, testing, revising, calling tools, and continuing through multi-step tasks.

What can Gemini 3.5 Flash do for coding?

Google says Gemini 3.5 Flash can independently execute coding pipelines and, in internal tests, build an operating system from scratch.

What is Antigravity in relation to Gemini 3.5 Flash?

Antigravity is Google’s agentic development platform and IDE. Google says Gemini 3.5 Flash was co-developed with it so that agents have a native environment in which to work and execute.

12x Faster Gemini 3.5 Flash Ditches Chatbots for Agents

Google says Gemini 3.5 Flash delivers 4x the output tokens per second of other frontier models, and that an optimized version runs 12x faster at the same quality. That speed claim explains why this launch is about agents, not chat.

Google introduced Gemini 3.5 Flash on Tuesday at Google I/O, positioning it as its strongest model yet for coding and autonomous AI agents, according to TechCrunch. Google says the model can independently execute coding pipelines, manage research projects, and, in internal tests, build an operating system from scratch.

That is the strategic signal. Google is moving the Gemini pitch away from “ask a question, get an answer” and toward delegated digital work: plan, execute, test, revise, and ask for help only when judgment or permission is needed. For developers and enterprises, the value proposition shifts from conversational polish to completed workflows.

“3.5 Flash offers an incredible combination of quality and low latency. It outperforms our latest frontier model, 3.1 Pro, on nearly all the benchmarks,” Koray Kavukcuoglu, DeepMind’s chief technologist, told reporters.

Gemini 3.5 Flash turns prompts into long-running work

The most important part of Gemini 3.5 Flash is not that it can chat better. It is that Google says it can run autonomously for multiple hours.

That matters because agentic systems behave differently from chatbots. A chatbot answers. An agent breaks work into steps, calls tools, checks outputs, and keeps going. In Google’s framing, Flash is designed for that loop.
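The loop described above can be sketched in a few lines of Python. This is a minimal illustration, not Google's implementation: `run_agent`, `flaky_tool`, the step names, and the retry budget are all hypothetical, and the "tool" is a stub rather than a real model or API call.

```python
from typing import Callable

def run_agent(
    steps: list[str],
    tool: Callable[[str], str],
    check: Callable[[str], bool],
    max_retries: int = 2,
) -> list[str]:
    """Run each step, check its output, retry on failure, escalate when stuck."""
    log = []
    for step in steps:
        for attempt in range(1, max_retries + 2):
            output = tool(step)          # call a tool for this step
            if check(output):            # inspect the result before moving on
                log.append(f"{step}: ok after {attempt} attempt(s)")
                break
        else:
            # Retry budget exhausted: stop and hand the task back to a human.
            log.append(f"{step}: escalate to human")
            break
    return log

# Hypothetical stub tool: fails the first time it is asked to run tests.
attempts: dict[str, int] = {}

def flaky_tool(step: str) -> str:
    attempts[step] = attempts.get(step, 0) + 1
    if step == "run tests" and attempts[step] == 1:
        return "FAIL"
    return "PASS"

log = run_agent(
    ["write code", "run tests", "open pull request"],
    flaky_tool,
    lambda out: out == "PASS",
)
for line in log:
    print(line)
```

The point of the sketch is the shape of the work: a chatbot makes one call and returns, while an agent makes many calls, checks each one, and only escalates when its retry budget runs out.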

At I/O, Google engineer Varun Mohan demonstrated agents splitting off to work on separate components before coming together to build a full operating system inside Antigravity, Google’s agentic development platform and IDE. Kavukcuoglu said 3.5 Flash was co-developed with Antigravity so agents would have a “native environment where they can live, work, and execute.”

Google also released Antigravity 2.0, a stand-alone desktop application built around agent-first development. That ties the model launch to tooling, not just model access. It also matches the direction we covered in Google’s I/O developer push around Gemini and AI agents.

Speed is the product feature, not just a benchmark flex

Google’s Flash branding has always implied a tradeoff: faster, lighter, cheaper-to-run models for high-volume use cases. With 3.5 Flash, Google is arguing that tradeoff has narrowed.

In a Google blog post, the company said 3.5 Flash outperforms Gemini 3.1 Pro on coding and agentic benchmarks including:

Benchmark	Google-reported result for Gemini 3.5 Flash
Terminal-Bench 2.1	76.2%
GDPval-AA	1656 Elo
MCP Atlas	83.6%
CharXiv Reasoning	84.2%

Google also said the model delivers 4x the output tokens per second of other frontier models, and that an optimized version is 12x faster at the same quality.

For agentic AI, latency is not cosmetic. If one agent performs ten steps, tests code, spins off subagents, reads files, and retries failures, small delays compound. A model that is merely impressive in a chat window can become expensive and sluggish when asked to run a multi-hour software task.
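A back-of-the-envelope sketch shows how that compounding works. Every number here is an illustrative assumption, not a Google figure: the step count, calls per step, tokens per call, and both throughput values are invented, and only the 4x ratio between the two throughputs mirrors Google's claim. The function `run_time_seconds` is hypothetical and ignores tool and network latency.

```python
def run_time_seconds(
    steps: int,
    calls_per_step: int,
    tokens_per_call: int,
    tokens_per_second: float,
    retry_rate: float = 0.0,
) -> float:
    """Estimate generation time for a sequential agent run.

    retry_rate is the assumed fraction of extra calls caused by retries.
    """
    total_calls = steps * calls_per_step * (1 + retry_rate)
    return total_calls * tokens_per_call / tokens_per_second

# Assumed workload: 10 steps, 5 model calls per step, 800 output tokens per call.
baseline = run_time_seconds(10, 5, 800, tokens_per_second=50)
faster = run_time_seconds(10, 5, 800, tokens_per_second=200)  # 4x the throughput

print(f"baseline: {baseline:.0f} s, faster: {faster:.0f} s")
```

Under these assumptions the same run drops from 800 seconds of generation time to 200. The gap widens further once retries and subagents multiply the number of calls.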

That is why the economics matter. Google says 3.5 Flash can complete some long-horizon tasks “often at less than half the cost of other frontier models.” The materials cited here do not include public pricing, so buyers still lack the key metric: cost per completed task. But Google is clearly trying to make Flash the workhorse model for high-volume agent execution, not the trophy model for demos.

Coding is the proving ground because failure is easier to see

Software development is a natural test bed for agents because the outputs can be inspected. Code compiles or it does not. Tests pass or fail. Dependencies exist or they do not. That does not eliminate hallucination or bad architecture, but it gives teams more measurable feedback than many knowledge-work tasks.

Google’s examples lean into this. The company says 3.5 Flash can develop new applications, maintain codebases, prepare financial documents, rename and categorize unstructured assets, transform legacy code into Next.js, and generate different UX approaches for a checkout flow in 60 seconds.

The enterprise examples are more revealing than the stage demo:

Shopify is running subagents in parallel to analyze complex data for merchant growth forecasts.
Macquarie Bank is piloting the model for customer onboarding by reasoning over 100+ page documents.
Salesforce is integrating 3.5 Flash into Agentforce for multi-subagent enterprise tasks.
Ramp is using it for OCR on complex invoices and reasoning over historical patterns.
Xero is deploying agents for multi-week workflows tied to 1099 tax forms.
Databricks is using agentic workflows to diagnose issues and propose fixes for data scientists.

MLXIO analysis: these examples show Google aiming Flash at boring, costly work rather than flashy consumer prompts. That is where agentic AI has a clearer business case: document-heavy onboarding, internal data analysis, invoice processing, code migration, and workflow automation.

For more on the cost side of this strategy, see our prior analysis of Cheap AI Agents: Google’s Gemini 3.5 Flash Bets Big.

Gemini 3.5 Pro becomes the planner, Flash becomes the labor layer

Google’s next move is already telegraphed. Gemini 3.5 Pro is coming, and the company says Pro and Flash are designed to work together.

Tulsee Doshi, Google’s senior director and head of product, described the split to TechCrunch:

“3.5 Pro becomes your orchestrator, your planner, and then it actually can leverage Flash to be the various sub-agents. I think it really comes down to where do you really want that reasoning power, where you actually want that larger model that can really push on the reasoning side versus where do you have tasks that really do merit good brute force tool use capabilities?”

That architecture is the real product story. Google is not pitching one model to do everything. It is building a hierarchy: a stronger reasoning model plans, while faster Flash agents execute subtasks.
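In code, that hierarchy looks roughly like a planner fanning work out to parallel workers. The sketch below is an assumption-laden illustration, not Google's architecture or API: `plan_with_pro` and `execute_with_flash` are hypothetical stubs standing in for model calls, and the three-way split of the goal is invented.

```python
from concurrent.futures import ThreadPoolExecutor

def plan_with_pro(goal: str) -> list[str]:
    # Hypothetical stand-in for a larger reasoning model that splits a goal
    # into subtasks. A real planner would call a model here.
    return [f"{goal}: {part}" for part in ("design", "implement", "test")]

def execute_with_flash(subtask: str) -> str:
    # Hypothetical stand-in for a fast sub-agent executing one subtask.
    return f"done({subtask})"

def orchestrate(goal: str, max_workers: int = 3) -> list[str]:
    """Plan once, then run the subtasks in parallel and collect the results."""
    subtasks = plan_with_pro(goal)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map returns results in the same order as the subtasks.
        return list(pool.map(execute_with_flash, subtasks))

results = orchestrate("migrate checkout page")
for result in results:
    print(result)
```

The design choice worth noting is the asymmetry: one expensive planning call, many cheap execution calls. That is why per-call speed and cost on the worker model dominate the economics.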

If that works, the unit of AI consumption changes. Users will not simply choose a model. They will launch a swarm of specialized agents inside tools such as Antigravity, the Gemini API, Gemini Enterprise, the Gemini app, and AI Mode in Search.

Search and Gemini Spark bring agent risk to consumers

The launch is not confined to developers. 3.5 Flash is now the default model in the Gemini app and AI Mode in Search globally. Google also announced agentic capabilities coming to Search, letting users create, customize, and manage AI agents directly on the platform.

The model will also power Gemini Spark, Google’s personal AI agent designed to run 24/7 to help consumers manage their digital life. That expands the stakes beyond enterprise automation.

The risk profile changes when an AI system moves from advice to action. Google is also facing a lawsuit in which plaintiffs allege that weeks of Gemini chats preceded a user’s severe crisis and suicide, according to TechCrunch; those claims have not been independently established here. Broader access to autonomous agents raises harder questions around escalation, permissions, sensitive content, and user dependency.

Google says Gemini 3.5 has strengthened cyber and CBRN safeguards and is better calibrated to engage with sensitive questions rather than refuse them outright. That last point is important: refusal alone is not a complete safety strategy for agents that may need to handle ambiguous or high-stakes tasks.

Buyers should judge agents by supervision cost

For software teams and AI buyers, the key question is no longer access. Gemini 3.5 Flash is generally available today through Antigravity, the Gemini API, Gemini Enterprise, the Gemini app, and AI Mode in Search.

The question is supervision.

A useful agent must reduce total work, not just move work from coding to review. Buyers should measure:

Completion cost: What does a finished workflow cost, including retries and human review?
Error rate: How often does the agent create hidden defects or bad assumptions?
Permission control: When does it pause for approval, and can that behavior be audited?
Integration quality: Can it operate safely inside existing codebases, data systems, and enterprise workflows?
Time saved: Does it compress multi-week work, as Google says partners are testing, or merely generate more output to inspect?
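The first of these measures, completion cost, can be made concrete with a small calculation. This is a sketch under stated assumptions: the `AgentRun` record, the dollar amounts, the review times, and the reviewer rate are all hypothetical, and a real evaluation would pull them from billing and review logs.

```python
from dataclasses import dataclass

@dataclass
class AgentRun:
    model_cost_usd: float   # API spend for the run, including retries
    review_minutes: float   # human time spent checking the output
    succeeded: bool         # whether the workflow was accepted as finished

def cost_per_completed_task(runs: list[AgentRun], reviewer_rate_usd_per_hour: float) -> float:
    """Total spend (model plus human review) divided by accepted workflows."""
    total = sum(
        r.model_cost_usd + r.review_minutes / 60 * reviewer_rate_usd_per_hour
        for r in runs
    )
    completed = sum(r.succeeded for r in runs)
    if completed == 0:
        raise ValueError("no completed tasks to divide by")
    return total / completed

# Illustrative numbers only: two accepted runs and one failed run.
runs = [
    AgentRun(model_cost_usd=0.40, review_minutes=10, succeeded=True),
    AgentRun(model_cost_usd=0.40, review_minutes=30, succeeded=False),
    AgentRun(model_cost_usd=0.50, review_minutes=5, succeeded=True),
]
cost = cost_per_completed_task(runs, reviewer_rate_usd_per_hour=90)
print(f"cost per completed task: ${cost:.2f}")
```

In this made-up example the model spend is $1.30 while review time costs $67.50, so human supervision, not token pricing, drives the result. Failed runs still count toward the total, which is why error rate and completion cost have to be read together.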

MLXIO analysis: the first adopters with the clearest payoff will likely be teams with well-defined workflows, good documentation, and strong review processes. Messy systems will not magically become agent-ready because the model is faster.

The next AI platform fight moves from chat windows to delegated labor

Gemini 3.5 Flash is less a model launch than a statement about where Google thinks AI monetization is headed. The chat interface was the entry point. The larger prize is delegated labor across code, documents, search, enterprise systems, and consumer software.

The evidence to watch is not another polished I/O demo. It is whether Google can show repeatable production use: lower review burden, fewer failed agent runs, clearer audit trails, and real cost-per-task advantages over slower frontier models.

If 3.5 Pro successfully acts as planner and 3.5 Flash becomes the fast execution layer, Google will have a more credible agent platform. If users still need to babysit every long-running task, then Flash will remain a faster chatbot with better demos — not the operating layer for AI work.

The Bottom Line

Google is shifting Gemini from chatbot interactions toward autonomous digital work.
Lower latency could make multi-hour AI agents more practical for coding and enterprise workflows.
The launch signals that speed and tool-use may matter as much as raw model intelligence in the next AI wave.

Chatbots vs. Gemini 3.5 Flash-style agents

Dimension	Chatbots	Agentic AI with Gemini 3.5 Flash
Core function	Answer user questions	Plan, execute, test, revise, and call tools
Work style	Single interaction or short exchange	Long-running autonomous workflows
Target users	General conversational use	Developers and enterprises needing completed workflows


MLXIO Insights Team
