Nvidia launches Nemotron 3 Super, a hybrid‑architecture model that outpaces GPT‑OSS and Qwen
Photo by Mariia Shalabaieva (unsplash.com/@maria_shalabaieva) on Unsplash
Nvidia unveiled Nemotron 3 Super, a 120‑billion‑parameter open‑weight hybrid model that the company says outpaces GPT‑OSS and Qwen in inference throughput, VentureBeat reports.
Key Facts
- Key company: Nvidia
Nvidia says Nemotron 3 Super can sustain a 1‑million‑token context window, a scale aimed at long‑horizon agentic tasks such as software engineering and cybersecurity triage (VentureBeat). The hybrid architecture combines Mixture‑of‑Experts (MoE) layers with a Mamba‑Transformer backbone and a latent‑MoE component, activating roughly 12 B of its 120 B total parameters per token (NVIDIA Nemotron 3 Super research report).
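The "12 B active out of 120 B total" figure reflects how MoE routing works: each token is sent to only a few experts, so the parameters exercised per token are a small slice of the full model. The sketch below illustrates that arithmetic with entirely hypothetical numbers; the expert counts, routing width, and parameter sizes are illustrative assumptions, not Nemotron's real configuration.

```python
# Toy illustration of MoE "active parameters": a token touches the shared
# backbone plus only its routed experts, not every expert in the model.
# All figures are hypothetical, chosen to land near a 12B/120B ratio.

def active_fraction(total_experts: int, experts_per_token: int,
                    expert_params: int, shared_params: int) -> float:
    """Fraction of total parameters exercised for a single token."""
    total = shared_params + total_experts * expert_params
    active = shared_params + experts_per_token * expert_params
    return active / total

# Hypothetical config: 64 experts, top-4 routing, 1.8B params per expert,
# 5B shared (attention/backbone) params -> roughly 10% active per token.
frac = active_fraction(64, 4, 1_800_000_000, 5_000_000_000)
print(f"active fraction: {frac:.2%}")
```

This is why a 120 B MoE model can approach the serving cost of a much smaller dense model: throughput scales with the active slice, not the total parameter count.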
Benchmark data released by Nvidia shows the model outperforming GPT‑OSS‑120B and Qwen 3.5‑122B on the Artificial Analysis suite, with up to 2.2× higher inference throughput on reasoning workloads and 7.5× on token‑generation tasks (TheDeepView). The speed gains stem from native speculative decoding via multi‑token‑prediction (MTP) layers, which Nvidia claims cuts latency without sacrificing accuracy (NVIDIA Nemotron 3 Super blog).
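Speculative decoding, the mechanism behind those throughput claims, can be sketched in a few lines: a cheap draft proposes several tokens at once, the full model verifies them, and the longest agreed prefix is accepted in a single step. The "models" below are stand‑in functions invented for illustration, not Nemotron's MTP layers or any real network.

```python
# Minimal speculative-decoding sketch. The draft proposes k tokens; the
# target verifies each one and keeps the agreed prefix, substituting its
# own token at the first disagreement. Both "models" are toy functions.

def draft_model(prefix):
    # Hypothetical cheap drafter: guesses next token = previous + 1.
    return prefix[-1] + 1 if prefix else 0

def target_model(prefix):
    # Hypothetical full model: same rule, but "disagrees" at multiples
    # of 5, emitting 0 instead.
    nxt = prefix[-1] + 1 if prefix else 0
    return 0 if nxt % 5 == 0 else nxt

def speculative_step(prefix, k=4):
    """Draft k tokens, then accept the run the target model agrees with."""
    drafts, ctx = [], list(prefix)
    for _ in range(k):
        t = draft_model(ctx)
        drafts.append(t)
        ctx.append(t)
    accepted, ctx = [], list(prefix)
    for t in drafts:
        verified = target_model(ctx)
        accepted.append(verified)
        if verified != t:          # target's correction ends the step
            break
        ctx.append(t)
    return prefix + accepted

print(speculative_step([1], k=4))
```

When draft and target mostly agree, each verification pass emits several tokens instead of one, which is where the latency reduction comes from; the verifier guarantees the output matches what the full model would have produced alone.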
Open‑weight availability is confirmed on Hugging Face, where the model appears under the “Qwen3‑Nemotron‑235B‑A22B‑GenRM‑2603” repository (Hugging Face). Community observers note that no open model has previously combined MoE, Mamba, and latent‑MoE layers in a single architecture, positioning Nemotron 3 Super as a potential “game‑changer” for unit‑testing‑style quality control in AI pipelines (user report).
Bryan Catanzaro, Nvidia’s VP of applied deep learning, highlighted the model’s focus on agentic workflows, emphasizing that the 1 M‑token context is designed to keep large language models efficient when handling multi‑step, high‑complexity tasks (TheDeepView). The release follows Nvidia’s broader push to expand open‑source AI offerings, a strategy the company argues is necessary as competing closed models increasingly run on rival silicon (Wired).
Sources
- Dataconomy
- Reddit - r/LocalLLaMA
This article was created using AI technology and reviewed by the SectorHQ editorial team for accuracy and quality.