// news · frontier-models · tools · 2026-05-23 · source: google / llm-stats / whatllm

Gemini 3.5 Flash hits GA — frontier-tier intelligence at 4x speed, $1.50/$9 pricing, 76.2% Terminal-Bench 2.1

Google made Gemini 3.5 Flash generally available this week, claiming frontier-level intelligence at four times the speed of comparable models. Pricing is $1.50 per million input tokens and $9 per million output tokens, with a 1M-token context window and a reported 76.2% on Terminal-Bench 2.1. The release shifts the frontier conversation away from the highest-scoring model and toward the most efficient capable model, a change in emphasis several analysts have been calling for since Q1.

Flash's positioning is explicit: it beats Gemini 3.1 Pro on coding and agent benchmarks at a fraction of the per-token cost and a fraction of the latency. For agent workloads, where a single user-facing action can fan out into dozens of model calls, that combination is more consequential than another two-point bump at the top of the leaderboard.
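
To make the fan-out economics concrete, here is a minimal sketch of the per-action cost arithmetic at the quoted prices. The prices come from the announcement; the call count and per-call token sizes are illustrative assumptions, not measurements, and `Call`, `call_cost`, and `action_cost` are hypothetical names.

```python
from dataclasses import dataclass

# Prices quoted in the article, in USD per million tokens.
INPUT_PRICE_PER_M = 1.50
OUTPUT_PRICE_PER_M = 9.00


@dataclass(frozen=True)
class Call:
    """Token usage of one model call (hypothetical structure)."""
    input_tokens: int
    output_tokens: int


def call_cost(call: Call) -> float:
    """Cost in USD of a single call at the quoted per-token prices."""
    return (
        call.input_tokens * INPUT_PRICE_PER_M
        + call.output_tokens * OUTPUT_PRICE_PER_M
    ) / 1_000_000


def action_cost(calls: list[Call]) -> float:
    """Total cost in USD of one user-facing action that fans out into many calls."""
    return sum(call_cost(c) for c in calls)


if __name__ == "__main__":
    # Assumed workload: 30 calls per action, each with 8,000 input
    # and 500 output tokens. These numbers are made up for illustration.
    calls = [Call(input_tokens=8_000, output_tokens=500)] * 30
    print(f"cost per action: ${action_cost(calls):.4f}")
```

Under these assumed numbers each call costs about $0.0165, so a 30-call action lands near $0.50. Because the figure scales linearly with the number of calls, per-token price differences compound quickly in agent workloads.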

The strategic effect is to compress the distance between "frontier" and "efficient" tiers. Anthropic's Opus 4.7 and OpenAI's GPT-5.5 still hold the absolute capability ceiling, but for agentic deployment economics Flash is now the default option. Several agent runtimes that previously routed top-tier requests to Opus are quietly switching the default to Flash for everything except long-horizon reasoning.
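
The routing change described above could look something like the following sketch. It is not any specific runtime's implementation: the task categories, the `pick_model` function, and the model identifier strings are all hypothetical placeholders rather than real API model IDs.

```python
from enum import Enum


class TaskKind(Enum):
    """Hypothetical task categories an agent runtime might distinguish."""
    TOOL_CALL = "tool_call"
    CODE_EDIT = "code_edit"
    LONG_HORIZON_REASONING = "long_horizon_reasoning"


# Placeholder identifiers, not real API model names.
DEFAULT_MODEL = "gemini-3.5-flash"
CEILING_MODEL = "opus-4.7"


def pick_model(kind: TaskKind) -> str:
    """Route everything to the efficient default except long-horizon reasoning."""
    if kind is TaskKind.LONG_HORIZON_REASONING:
        return CEILING_MODEL
    return DEFAULT_MODEL


if __name__ == "__main__":
    for kind in TaskKind:
        print(f"{kind.value} -> {pick_model(kind)}")
```

The design choice here is that the expensive model becomes the exception rather than the default, which matches the shift the article describes.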

LLM-Stats — AI Updates Today May 2026 Latest AI Model Releases → · WhatLLM — New AI Models May 2026: The Frontier Took a Breath → · FelloAI — Best AI Models in May 2026: ChatGPT, Claude, Gemini & Grok →