// news · open-source · models · agents · 2026-04-28 · source: nvidia developer

NVIDIA Nemotron 3 Super — 120B hybrid MoE (12B active) tuned for local agent deployment

NVIDIA's open Nemotron 3 Super lands as a 120B-parameter hybrid MoE with 12B active and a 1M-token context window. The explicit design target: local agent deployment with tool-augmented coding workloads.

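The "Super" tier sits above Nano in the Nemotron 3 family and below the closed Nemotron 4 Coalition models. The activation ratio (12B of 120B parameters per token, or 10%) is the deployability story: per-token compute is roughly that of a dense 12B model, which puts serving within reach of a single workstation-class GPU, while the total parameter count buys capacity that dense 12B models can't match. The caveat is that all 120B weights still have to be stored somewhere, so memory footprint rather than compute sets the hardware floor.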
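A quick back-of-the-envelope check on those numbers, as a minimal sketch. The 120B and 12B figures come from the announcement; the storage precisions (bf16, int8, int4) are assumptions chosen for illustration, and the estimate covers weights only, ignoring KV cache, activations, and runtime overhead.

```python
# Parameter counts from the announcement.
TOTAL_PARAMS = 120e9
ACTIVE_PARAMS = 12e9

# Assumed storage precisions in bytes per parameter (illustrative, not from the source).
PRECISIONS = {"bf16": 2.0, "int8": 1.0, "int4": 0.5}


def active_fraction(active_params: float, total_params: float) -> float:
    """Share of the model's parameters used for each token."""
    return active_params / total_params


def weight_memory_gb(total_params: float, bytes_per_param: float) -> float:
    """Weight storage in GB (1e9 bytes). Weights only, no KV cache or activations."""
    return total_params * bytes_per_param / 1e9


def footprint_table() -> dict:
    """Weight footprint of the full 120B model at each assumed precision."""
    return {name: weight_memory_gb(TOTAL_PARAMS, b) for name, b in PRECISIONS.items()}


if __name__ == "__main__":
    print(f"active fraction: {active_fraction(ACTIVE_PARAMS, TOTAL_PARAMS):.0%}")
    for name, gb in footprint_table().items():
        print(f"{name}: ~{gb:.0f} GB of weights")
```

Under these assumptions the full weight set is about 240 GB at bf16, 120 GB at int8, and 60 GB at int4, which is why quantization or offloading matters for single-GPU serving even though only a tenth of the parameters are used per token.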

Why this shape: agentic coding workloads aren't a single forward pass; they're long, tool-augmented sequences in which frequent retrieval of routine knowledge is interspersed with rare specialist calls. An MoE router selects a small set of experts for each token, so only the relevant specialists are active at any step. NVIDIA is positioning Nemotron 3 Super as the open-weights answer to that workload, and as a way to keep developers on NVIDIA silicon for local inference.
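To make per-token expert selection concrete, here is a minimal sketch of generic top-k softmax routing. It is not NVIDIA's implementation: the announcement does not describe the router, and the expert count, the value of k, and the toy "experts" below are all hypothetical.

```python
import math
import random


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]


def top_k_route(router_logits, k):
    """Return [(expert_index, weight)] for the k highest-scoring experts.

    Weights are renormalised over the chosen experts so they sum to 1.
    """
    probs = softmax(router_logits)
    chosen = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    total = sum(probs[i] for i in chosen)
    return [(i, probs[i] / total) for i in chosen]


def moe_forward(x, experts, router_logits, k):
    """Weighted sum of the selected experts' outputs; unselected experts never run."""
    return sum(w * experts[i](x) for i, w in top_k_route(router_logits, k))


if __name__ == "__main__":
    random.seed(0)
    NUM_EXPERTS, K = 8, 2  # hypothetical sizes, not Nemotron's configuration
    # Toy experts: each one just scales its input by a different factor.
    experts = [lambda x, s=i + 1: s * x for i in range(NUM_EXPERTS)]
    for token in range(3):
        # Stand-in for a learned router: random logits per token.
        logits = [random.gauss(0, 1) for _ in range(NUM_EXPERTS)]
        print(token, top_k_route(logits, K), moe_forward(1.0, experts, logits, K))
```

Each token runs only K of the NUM_EXPERTS expert functions, which is the mechanism behind an "active" parameter count that is much smaller than the total.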

NVIDIA Developer Blog →