NVIDIA Vera Rubin: six chips delivering 3-4× compute density and 10× inference-cost reduction over Blackwell
NVIDIA's Vera Rubin platform, the successor to Blackwell, is in full production, with shipments to AWS, Google Cloud, Microsoft, and OCI in the second half of 2026. Rubin comprises six new chips: the Vera CPU, Rubin GPU, NVLink 6 Switch, ConnectX-9 SuperNIC, BlueField-4 DPU, and Spectrum-6 Ethernet Switch. NVIDIA claims 3-4× the compute density of Blackwell, a 10× reduction in inference token cost, and 4× fewer GPUs needed to train mixture-of-experts (MoE) models.
The headline 10× inference-cost reduction is the number that matters most economically. Frontier-model inference dominates AI infrastructure spend today. At 10× lower cost, models that are unprofitable to serve on Blackwell become profitable on Rubin, and inference workloads previously considered impossible (e.g., 1M-context daily queries in consumer apps) become standard.
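To make the profitability argument concrete, here is a minimal sketch of the arithmetic. The price and Blackwell serving cost below are hypothetical figures chosen only for illustration, not numbers from NVIDIA or any provider; the only input taken from the announcement is the claimed 10× cost reduction.

```python
def margin_per_million_tokens(price_usd: float, serving_cost_usd: float) -> float:
    """Gross margin (USD) on one million tokens served."""
    return price_usd - serving_cost_usd


# Hypothetical inputs, for illustration only.
price = 2.00           # USD charged per 1M tokens (assumed)
blackwell_cost = 5.00  # USD to serve 1M tokens on Blackwell (assumed)

# Apply NVIDIA's claimed 10x reduction in inference token cost.
claimed_reduction = 10
rubin_cost = blackwell_cost / claimed_reduction

blackwell_margin = margin_per_million_tokens(price, blackwell_cost)
rubin_margin = margin_per_million_tokens(price, rubin_cost)

print(f"Blackwell margin per 1M tokens: {blackwell_margin:+.2f} USD")
print(f"Rubin margin per 1M tokens:     {rubin_margin:+.2f} USD")
```

Under these assumed numbers the same workload moves from losing 3.00 USD per million tokens to earning 1.50 USD. More generally, if the claim holds, any workload whose Blackwell serving cost is below 10× its price would clear break-even on Rubin.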
The 4× reduction in GPUs needed for MoE training is the bet on architectural direction. NVIDIA is saying explicitly: MoE wins. Dense models will continue to exist (Mistral Medium 3.5 proves they can compete), but the silicon roadmap is now optimized for MoE-style activation patterns.
NVIDIA: Rubin platform press release · NVIDIA GTC 2026 live blog