// news · multimodal2026-06-24source: google / aimlapi

Google releases Gemini Nano Banana 2 (gemini-3.1-flash-image) + Nano Banana Pro (gemini-3-pro-image) as GA — native visual models with video-to-image generation support

Google released Gemini Nano Banana 2 (gemini-3.1-flash-image) and Nano Banana Pro (gemini-3-pro-image) as generally-available native visual models. New capability: video-to-image generation — pass a video file as multimodal context alongside text prompt to generate high-quality thumbnails, cinematic movie posters, or summary infographics.

The substantive piece is the video-to-image multimodal-context primitive. Pre-Nano-Banana-2 multimodal generation accepted text-and-image inputs producing image outputs; video as input modality required separate processing pipelines. The video-as-multimodal-context capability directly opens production workflows for thumbnail generation, movie-poster creation, video-summary infographics — categories that previously required multi-step pipelines.

The competitive read against Seedance 2.0's director-workspace architecture and the broader video-AI vendor landscape is that Google's Nano Banana stack competes on different production-workflow dimensions. Seedance for fused-video-generation, Nano Banana for video-derived-image production. Procurement decisions stratify across these specialization axes.

See our analysis →

Google AI — Release notes | Gemini API → · AI/ML API — Best AI Video Generators 2026 →