// news · multimodal · tools · 2026-05-24 · source: 9to5google / cryptobriefing / magicshot

Google Gemini Omni Flash capped at 10 seconds per clip — deployment decision, not a technical limit; longer-form coming from Omni Pro

Google clarified that Gemini Omni Flash's 10-second-per-clip cap is a deployment decision to broaden early access while compute demand is high — not a technical limitation of the model. Longer-form generation is expected from Omni Pro or later Flash updates. The detail matters because the 10-second cap reads as a quality limitation in benchmarks; it's actually a capacity-rationing choice.

The framing is consistent with Gemini Spark's launch on the same day. Google has decided that compute supply is the binding constraint in H1 2026 and is rationing capacity across the model surface: the high-end use cases (longer-form video, 24/7 cloud agents) stay gated to AI Ultra subscribers at $99.99–$200/mo, while free and lower-tier Flash users get the same model with shorter trajectories. It's a different way to monetize a compute-constrained product than NVIDIA's pure $/token model.

The competitive implication: OpenAI shut down the consumer Sora 2 app in March 2026 after reportedly burning $8–12 million a month, leaving Sora 2 as API-only. That decision took the opposite approach: sunset the loss-making consumer product and keep the API revenue. Google is choosing to keep the consumer surface alive but rate-limit the loss via clip-length caps. Which approach produces better long-term consumer retention is the open question; the immediate competitive answer is that Google's Gemini app and YouTube Shorts now have a working video-generation surface, which the consumer Sora app no longer offers.


9to5Google — Gemini Omni create anything model starts today with lifelike video → · Crypto Briefing — Google unveils Gemini Omni multimodal AI video generation → · AVB / AI Video Bootcamp — What Is Gemini Omni Google New AI Video Model Explained →