// news · multimodal · frontier-models · 2026-05-21 · source: google / antigravity

Gemini 3.5 Flash hits 76.2% on Terminal-Bench 2.1 and 1656 Elo on GDPval-AA: frontier-class capability at Flash-tier pricing

Google's Gemini 3.5 Flash launched this week with scores of 76.2% on Terminal-Bench 2.1, 1656 Elo on GDPval-AA, and 83.6% on MCP Atlas. Those numbers put Flash within striking distance of full Pro-tier frontier models on coding and agentic benchmarks while shipping at Flash-tier pricing. It is the first explicit demonstration that 'Flash' no longer means 'small, cheap, and limited'; it now means 'frontier capability with latency and cost optimizations.'

The benchmark mix is deliberate. Terminal-Bench captures CLI-driven engineering work; GDPval covers reasoning on economic tasks; MCP Atlas measures multi-tool agentic workflows. Hitting frontier-class numbers across all three at Flash pricing changes the procurement calculus for buyers running production inference at scale. The Flash tier becomes the default routing target, and the Pro tier is reserved for the residual workloads that genuinely need the capability ceiling.
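To make the "default to Flash, escalate to Pro" pattern concrete, here is a minimal sketch of a tier router. Everything in it is an assumption for illustration: the tier labels are placeholders rather than real model identifiers, the `difficulty` score is presumed to come from a buyer's own evals, and the 0.9 threshold is arbitrary.

```python
from dataclasses import dataclass

# Hypothetical tier labels; these are NOT real model identifiers.
FLASH_TIER = "flash"
PRO_TIER = "pro"


@dataclass
class Request:
    prompt: str
    # Assumed to come from the caller's own evals, scaled to 0.0-1.0.
    difficulty: float


def route(request: Request, pro_threshold: float = 0.9) -> str:
    """Send traffic to the Flash tier unless it clears the Pro threshold."""
    if request.difficulty >= pro_threshold:
        return PRO_TIER
    return FLASH_TIER


def pro_share(requests: list[Request], pro_threshold: float = 0.9) -> float:
    """Fraction of requests that end up on the Pro tier."""
    if not requests:
        return 0.0
    pro_count = sum(1 for r in requests if route(r, pro_threshold) == PRO_TIER)
    return pro_count / len(requests)
```

In a setup like this, the procurement question reduces to how small `pro_share` gets on real traffic once the cheaper tier handles most of the workload.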

The under-priced read is the competitive answer this demands from OpenAI and Anthropic. OpenAI's GPT-5.5 family has its own pricing curve, and so does Anthropic's Claude family. Both labs now have to compress the capability gradient, that is, push more capability into their cheaper tiers, to keep procurement teams from defaulting to Gemini 3.5 Flash for the bulk of their traffic.

CNBC: Google AI Ultra Gemini Spark Omni · Google Cloud: I/O 26 innovations · explainx: Google I/O 2026 complete recap