// news · multimodal · tools · 2026-05-27 · source: google / techcrunch / blog.google

Google Gemini Omni's video-generation API rolls out — unified image, audio, video, and text input with full-video output as primary surface

Google Gemini Omni's video-generation API began rolling out this cycle, with the unified multimodal interface (image + audio + video + text input → full-video output) opening to developer customers under the Vertex AI Studio surface. The API is the first commercially available multi-modality-in, multi-modality-out interface from a frontier lab, and it ships in the same week the broader video-generation tier became competitive.

The API surface is the substantive operational news. Through 2024-2025, video-generation APIs accepted only text or text-plus-image input; even Veo 3.1's strongest releases supported just one or two input modalities. Gemini Omni's API accepts image, audio, video, and text in a single call, with the model orchestrating across them to produce video output. The pattern matches how creators actually work: drop in a reference image, an audio clip, a partial video, and a text instruction, and get a coherent video back. It is the first API surface that matches that reality rather than forcing creators into text-only or text-plus-image workflows.
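To make the "single call" idea concrete, here is a minimal sketch of how mixed-modality inputs might be bundled into one request body. It is not Google's API: the article gives no schema, so `Part`, `build_request`, and every field name (`inputs`, `modality`, `data_b64`, `output`) are hypothetical, chosen only to illustrate the pattern.

```python
import base64
import json
from dataclasses import dataclass

# Hypothetical modality names; the article does not give the real API's schema.
INPUT_MODALITIES = {"text", "image", "audio", "video"}


@dataclass
class Part:
    """One input to the call. All field names here are illustrative assumptions."""
    modality: str
    payload: bytes
    mime_type: str = "text/plain"

    def to_dict(self) -> dict:
        if self.modality not in INPUT_MODALITIES:
            raise ValueError(f"unsupported modality: {self.modality}")
        if self.modality == "text":
            return {"modality": "text", "text": self.payload.decode("utf-8")}
        # Binary media is base64-encoded so the whole request stays valid JSON.
        return {
            "modality": self.modality,
            "mime_type": self.mime_type,
            "data_b64": base64.b64encode(self.payload).decode("ascii"),
        }


def build_request(parts: list[Part]) -> dict:
    """Bundle mixed-modality inputs into one request body with video output."""
    if not parts:
        raise ValueError("at least one input part is required")
    return {
        "inputs": [p.to_dict() for p in parts],
        "output": {"modality": "video"},
    }


# Placeholder bytes stand in for real media files.
request = build_request([
    Part("image", b"<png bytes>", "image/png"),
    Part("audio", b"<wav bytes>", "audio/wav"),
    Part("video", b"<mp4 bytes>", "video/mp4"),
    Part("text", "Extend the clip in the style of the reference image.".encode("utf-8")),
])
body = json.dumps(request)  # one serialized body carrying all four inputs
```

The point of the sketch is the shape rather than the names: four heterogeneous inputs travel in one ordered list with one declared output, instead of being split across separate text-to-video and image-to-video calls.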

The impact on the downstream third-party ecosystem is what makes the rollout consequential. Adobe Firefly Video's integration with Gemini Omni is the obvious first beneficiary; Adobe positions the feature as "powered by Gemini Omni" while keeping the Adobe creative-suite surface as the customer-facing layer. Runway, Pika, and Luma face the harder strategic question: integrate Omni and compete on workflow value-add, or differentiate against Omni on style and specialized vertical control. Combined with the broader video-tier competitive week, the rollout shifts the dominant pattern in the multimodal API ecosystem from "text-to-video" to "multimodal-orchestration". The creator-stack architecture has changed.


Google Blog — Search and I/O 2026 announcements → · TechCrunch — Google Gemini Omni and Antigravity 2.0 at I/O 2026 → · ResultSense — Google launches Gemini Omni for multimodal AI video →