Key takeaways

  1. Gemini 3.5 launched at Google I/O in May 2026, framed around agentic workflows — multi-step tasks, not single answers.
  2. 3.5 Flash is the first release and is now the default model in the Gemini app and Google Search's AI Mode.
  3. Google's pitch is speed plus tool use: Flash is built to plan and execute without getting bogged down in latency.
  4. It sits in a crowded field — released months after Gemini 3.1 Pro and alongside aggressive launches from OpenAI, xAI, and others.
  5. The real significance is distribution: a more capable default model reaching Search and the Gemini app touches billions of users.

Every Google I/O has a thesis, and in May 2026 the thesis was a single word repeated across the keynote: action. Gemini 3.5, the new model family Google unveiled, is built around the idea that the era of the chatbot that answers your question is ending, and the era of the assistant that completes your task is beginning. That’s a marketing line, but underneath it there’s a real product shift worth understanding.

The first model out of the gate is Gemini 3.5 Flash, released May 19, 2026. And the most important fact about it isn’t a benchmark — it’s where Google put it.

The move that matters: it’s the default now

Google made 3.5 Flash the default model behind the Gemini app and, critically, Search’s “AI Mode”. That placement is the whole story. Google doesn’t need Gemini to win every benchmark; it needs Gemini to be good enough while reaching more people than any competitor can. AI Overviews and the Gemini app already touch billions of users. Swapping in a more capable, more agentic default model means a capability upgrade propagates to that entire base overnight, with no one needing to choose it.

OpenAI and xAI ship models. Google ships models into the front page of the internet.

That’s a structural advantage no standalone model lab has. When you evaluate Gemini 3.5, the relevant comparison isn’t just “is it smarter than GPT” — it’s “what happens when a billion people get a better assistant without asking for one.”

What “agentic” actually means here

Strip the buzzword and “agentic” describes a concrete change in what the model is optimized to do. A traditional LLM is tuned to produce a good answer to a prompt. An agentic model is tuned to plan and execute a sequence — call a tool, read the result, decide the next step, call another tool — without losing the thread or stalling on latency.

That last part is why Google led with Flash rather than its most powerful model. Flash is the speed-optimized tier, and for agentic work, speed isn’t a luxury — it’s the constraint. A model that reasons brilliantly but takes ten seconds per step is unusable for a workflow that chains twenty steps. Google’s bet is that for the tasks people actually want automated — booking, drafting, researching, coordinating — fast-and-capable beats slow-and-brilliant. Google’s launch framing emphasized exactly this: frontier-level performance for agents and coding, delivered quickly enough to be practical.

The context Google won’t put on a slide

Here’s the honest framing the keynote skips. Gemini 3.5 didn’t land in empty space. It arrived months after Gemini 3.1 Pro, itself a strong February release, and into a market where xAI shipped Grok 4.3 at aggressive prices in April, OpenAI is moving toward a public listing while shipping models, and a wave of capable Chinese models keeps compressing the field. The pace is such that “the newest model” is a title that now changes hands roughly monthly.

In that environment, an incremental model release isn’t really news on its own — capability gaps between the frontier labs are the smallest they’ve been in years. What makes Gemini 3.5 matter is not that it’s a leap over its rivals on a chart. It’s that Google is the one player who can turn a modest capability gain into a planetary-scale product change by flipping the default. The model is the smaller story. The distribution is the bigger one.

What to take from it

If you’re a user, the practical upshot is simple: your Gemini app and your Google searches quietly got more capable, and more willing to do things rather than just describe them. You didn’t opt in; that’s the design.

If you’re building, the signal is about direction. The frontier labs are converging on the same conclusion at the same time — that the next phase of value isn’t a smarter answer, it’s reliable multi-step action. Google saying it out loud at I/O, and shipping a fast model purpose-built for it as the new default, is confirmation that the agentic era isn’t a forecast anymore. It’s the product roadmap of every major lab, arriving at once. The thing to watch with Gemini isn’t its next benchmark score — it’s whether the agentic actions inside AI Mode and the Gemini app actually finish the job: book the table, draft the reply, file the form, without a human stepping in to clean up after. Distribution is Google’s to lose. Reliability is the part it hasn’t earned yet — and for an assistant a billion people get without asking, the first time it confidently does the wrong thing is the moment they learn to switch it back off.

Frequently asked questions

When was Gemini 3.5 released?

Google introduced the Gemini 3.5 family at its I/O event in May 2026, beginning with the release of 3.5 Flash on May 19, 2026.

What is Gemini 3.5 Flash?

It's the first model in the 3.5 family — a speed-and-cost-optimized model built for agentic workflows like multi-step tool use, coding, and planning. Google has made it the default model behind the Gemini app and Search's AI Mode.

How is Gemini 3.5 different from Gemini 3.1?

Gemini 3.1 Pro (February 2026) was a reasoning-focused flagship. The 3.5 family shifts emphasis toward 'action' — agentic execution of multi-step tasks — and 3.5 Flash specifically prioritizes speed and tool use over topping every reasoning benchmark.

Why does Gemini 3.5 matter if it's not the smartest model?

Distribution. Because 3.5 Flash is the default in Google Search's AI Mode and the Gemini app, an incremental capability gain reaches billions of users automatically — which can matter more in practice than a higher benchmark score on a model few people use.

About Aditya Marin Gasga

Founding Editor

Aditya covers the whole AI surface area for Signal — frontier models, agent infrastructure, the economics of inference, and the policy decisions that quietly shape what everyone else can build. He writes for operators who need a calibrated view of what's actually shipping versus what's keynote theatre.

  • Founder of Signal; sets the publication's editorial line
  • A decade across product, growth, and AI tooling at venture-backed startups
  • Reads the model release notes, the system cards, and the benchmark papers — and tells you which ones matter
More from Aditya Marin Gasga →