// news · open-source · models · agents · 2026-04-28 · source: blogs.nvidia.com

NVIDIA ships Nemotron 3 Nano Omni — 30B hybrid Mamba-Transformer MoE (3B active), multimodal for agents

Nemotron 3 Nano Omni (April 28) unifies vision, audio, and language in one open multimodal model. The architecture is the interesting bit: a hybrid Mamba-Transformer MoE with 30B total parameters, of which only 3B are activated per forward pass.
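To make "3B active out of 30B" concrete, here is a minimal Python sketch of top-k expert routing, the general idea behind MoE gating. The expert count (8), k (2), and router scores are hypothetical toy values; the post does not give Nano Omni's actual routing configuration. Only the 3B and 30B figures come from the announcement.

```python
def top_k_experts(router_scores, k):
    """Return the indices of the k highest-scoring experts for one token.

    Ties are broken in favor of the lower expert index.
    """
    ranked = sorted(range(len(router_scores)), key=lambda i: (-router_scores[i], i))
    return sorted(ranked[:k])


def active_fraction(active_params, total_params):
    """Share of the model's parameters used in a single forward pass."""
    return active_params / total_params


# Hypothetical router output for one token over 8 toy experts (assumed values).
scores = [0.1, 2.3, -0.5, 1.7, 0.0, 0.9, -1.2, 0.4]
chosen = top_k_experts(scores, k=2)

# Figures from the announcement: 3B activated out of 30B total.
fraction = active_fraction(3e9, 30e9)

print("experts used for this token:", chosen)
print("fraction of parameters active:", fraction)
```

Only the selected experts run for a given token, so per-token compute tracks the active parameter count rather than the total.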

NVIDIA's positioning for the Nemotron 3 family is explicitly agentic — these are open models built for AI agent workloads rather than chat. Nano Omni is the small variant; Super and Ultra sizes are in the works for H1 2026.

The architectural detail worth tracking: hybrid Mamba-Transformer is no longer a research-paper curiosity. NVIDIA has now shipped a 30B production model on the architecture, with Mixture-of-Experts gating activating 3B parameters per forward pass. For agentic workloads (long context, sustained inference, lots of tool calls), this matters because Mamba layers scale linearly with sequence length, whereas Transformer self-attention scales quadratically.
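A back-of-the-envelope Python sketch of the scaling argument follows. It counts only how the work grows with sequence length and ignores constant factors, hidden sizes, and the actual mix of Mamba and attention layers in a hybrid model, none of which the post specifies. The function names are illustrative, not part of any library.

```python
def attention_pair_count(seq_len):
    """Full self-attention scores every token against every token: n * n entries."""
    return seq_len * seq_len


def ssm_step_count(seq_len):
    """A state-space scan updates a fixed-size state once per token: n steps."""
    return seq_len


for n in (1_000, 10_000, 100_000):
    quadratic = attention_pair_count(n)
    linear = ssm_step_count(n)
    # The gap between the two grows in proportion to the sequence length itself.
    print(f"n={n:>7}  attention~{quadratic:>14,}  ssm~{linear:>7,}  ratio={quadratic // linear:,}")
```

Under this simplified model, a 10x longer context costs 100x more in the attention term but only 10x more in the state-space term, which is why the difference matters most for long agent sessions.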

The model is available via Hugging Face, OpenRouter, build.nvidia.com, and 25+ partner platforms. NVIDIA also announced the Nemotron Coalition — a group of global AI labs that will collaborate on the upcoming Nemotron 4 family.

NVIDIA blog →