// news · multimodal · frontier-models · 2026-05-29 · source: google blog / vo3ai / digitalapplied

Gemini Omni Flash rolls out to AI Plus / Pro / Ultra subscribers via the Gemini app and Flow creative studio: any-input multimodal generation goes consumer-default

Google's Gemini Omni Flash started rolling out on May 19, 2026 to AI Plus / Pro / Ultra subscribers via the Gemini app and Google's Flow creative studio. The any-input multimodal model accepts text, image, audio, and video in a single prompt and produces video output, edited photos, or custom digital avatars. The rollout positions Omni Flash as the consumer-default any-input multimodal model, paralleling Veo 3.1's positioning as the Vertex-AI-side video specialist.

The any-input architecture is the substantive piece. Through 2024-2025 the dominant multimodal-generation pattern was input-type-specific: text-to-video, text-to-image, image-to-video, audio-to-text. Each input-output pairing had its own model and its own pipeline. Omni collapses those pipelines by accepting arbitrary combinations of text, image, audio, and video in a single prompt and reasoning across all of them to produce a single output. The workflow consequence is that creators can compose multimodal prompts without the multi-tool orchestration overhead that previous-generation tools required.
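The contrast between the two patterns can be sketched in a few lines of Python. This is a toy model under stated assumptions, not a real Gemini, Flow, or Vertex AI API: `Part`, `SPECIALIST_PIPELINES`, `plan_specialist`, and `plan_any_input` are hypothetical names, and the rule that an unsupported input is bridged through text is an illustrative guess at how a creator might chain tools by hand.

```python
from dataclasses import dataclass

# All names here are hypothetical illustrations, not a real Gemini, Flow, or Vertex AI API.


@dataclass(frozen=True)
class Part:
    modality: str  # "text", "image", "audio", or "video"
    payload: str   # stand-in for the content or a file reference


# Input-type-specific pattern: one pipeline per (input, output) pairing,
# using the four pairings named in the paragraph above.
SPECIALIST_PIPELINES = {
    ("text", "video"): "text-to-video",
    ("text", "image"): "text-to-image",
    ("image", "video"): "image-to-video",
    ("audio", "text"): "audio-to-text",
}


def plan_specialist(parts, output):
    """List the specialist pipelines a creator would have to chain by hand."""
    steps = []
    for part in parts:
        direct = SPECIALIST_PIPELINES.get((part.modality, output))
        if direct is not None:
            steps.append(direct)
            continue
        # Assumption: with no direct pipeline, bridge through text if possible.
        to_text = SPECIALIST_PIPELINES.get((part.modality, "text"))
        from_text = SPECIALIST_PIPELINES.get(("text", output))
        if to_text is None or from_text is None:
            raise ValueError(f"no pipeline from {part.modality} to {output}")
        steps.extend([to_text, from_text])
    return steps


def plan_any_input(parts, output):
    """Any-input pattern: every part travels in one request to one model."""
    modalities = list(dict.fromkeys(part.modality for part in parts))
    return [f"any-input({'+'.join(modalities)})-to-{output}"]


prompt = [
    Part("text", "make the dog surf"),
    Part("image", "dog.png"),
    Part("audio", "waves.wav"),
]

print(plan_specialist(prompt, "video"))
print(plan_any_input(prompt, "video"))
```

In this toy model the three-part prompt needs four specialist steps, whose outputs would still have to be merged by hand, against a single any-input request. The step counts illustrate the orchestration overhead only; they say nothing about output quality.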

The competitive context is the multimodal-generation landscape, and the Omni Flash announcement at I/O 2026 set out the architecture-level positioning. Sora 2 (OpenAI), Kling 3.0 (Kuaishou), Seedance 2.0 (ByteDance), and Veo 3.1 (Google, on the Vertex AI side) make up the four-model video-specialist field. Omni Flash occupies the any-input axis rather than the video-specialist axis, a different competitive frame from the head-to-head video-generation benchmark race. The consumer rollout to the AI Plus / Pro / Ultra tiers via the Gemini app and Flow studio brings any-input generation into the consumer default tier; the competitive question through Q3 2026 is whether any-input convenience displaces the video specialists' quality advantage at consumer-creator scale.


VO3AI: Gemini Omni Google Unified Multimodal Video Model I/O 2026 · Digital Applied: AI Video Generation 2026 Omni vs Sora vs Veo 3 Compared · Visla: What Gemini Omni Means for Veo and Business Video Creation