NVIDIA Nemotron 3 Nano Omni Ships at Edge Scale With Open Weights
30B parameters, 3B active per forward pass, vision-audio-language in one architecture, and a 9x throughput claim against comparable open omni models. The interesting piece is the licensing: full open weights, datasets, and training techniques, with Palantir, Foxconn, and Dell named as launch adopters.
NVIDIA released Nemotron 3 Nano Omni on April 29. The open-weight multimodal model pairs a C-RADIOv4-H vision encoder and a Parakeet-TDT-0.6B-v2 audio encoder with the company's Nemotron 3 Mamba-Transformer Mixture-of-Experts backbone. It has 30 billion total parameters and activates only 3 billion per forward pass, which makes it small enough to run on a single workstation GPU and large enough to handle long-context document, video, and audio reasoning in a single architecture rather than a multi-model pipeline.
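To make the "single workstation GPU" sizing concrete, here is a back-of-envelope sketch of the weight footprint implied by the 30B/3B figures. The parameter counts come from the release as reported above; the storage precisions (bf16, fp8, int4) are illustrative assumptions, not NVIDIA's published deployment formats. The estimate covers weights only and ignores the KV cache, activations, and runtime overhead, and it is not clear from the announcement whether the encoders are counted in the 30B total.

```python
# Back-of-envelope weight-memory estimate for a sparse MoE model.
# Parameter counts come from the article; the byte widths below are
# illustrative assumptions, not NVIDIA's published deployment precisions.

TOTAL_PARAMS = 30e9   # total parameters (from the article)
ACTIVE_PARAMS = 3e9   # parameters active per forward pass (from the article)

# Assumed storage widths in bytes per parameter.
BYTES_PER_PARAM = {"bf16": 2.0, "fp8": 1.0, "int4": 0.5}


def weight_memory_gb(num_params: float, bytes_per_param: float) -> float:
    """Return weight storage in decimal gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bytes_per_param / 1e9


def active_fraction(active: float, total: float) -> float:
    """Return the fraction of parameters used per forward pass."""
    return active / total


if __name__ == "__main__":
    for name, width in BYTES_PER_PARAM.items():
        gb = weight_memory_gb(TOTAL_PARAMS, width)
        print(f"{name}: {gb:.0f} GB of weights")
    frac = active_fraction(ACTIVE_PARAMS, TOTAL_PARAMS)
    print(f"active fraction: {frac:.0%}")
```

Under these assumptions the weights alone come to roughly 60 GB at 16-bit, 30 GB at 8-bit, and 15 GB at 4-bit. In a typical Mixture-of-Experts deployment all experts stay resident in memory, so the 3B active figure mainly lowers compute per token rather than the memory needed to hold the model.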
The headline claim is 9x higher throughput than other open omni models at comparable interactivity. That is plausible, because a unified architecture eliminates the separate forward passes that a pipeline of per-modality models requires. The licensing is the more consequential piece. NVIDIA released weights, training data, and training recipes, and shipped the model simultaneously on Hugging Face and as an NVIDIA NIM microservice on Amazon SageMaker JumpStart. Palantir, Foxconn, and Dell were named as launch adopters, a strong signal that the model is being positioned for industrial and edge deployment rather than chatbot work.
The release lands in the same window as Google's Gemini Omni and Alibaba's Qwen3.5 Omni, completing a three-vendor architectural consensus: a unified "omni" architecture covering all modalities is now the default for frontier work, and the open-weight tier no longer trails the proprietary one on this axis. NVIDIA is leaning on throughput and edge-deployable size as its differentiators, not parameter count.
NVIDIA Blog — NVIDIA Launches Nemotron 3 Nano Omni Model, Unifying Vision, Audio and Language for up to 9x More Efficient AI Agents → · Hugging Face — Introducing NVIDIA Nemotron 3 Nano Omni: Long-Context Multimodal Intelligence for Documents, Audio and Video Agents → · AWS Machine Learning Blog — NVIDIA Nemotron 3 Nano Omni model now available on Amazon SageMaker JumpStart →