Nvidia unveils high‑performance Vera CPU, targeting the wider AI server market

Published by
SectorHQ Editorial

Photo by Brecht Corbeel (unsplash.com/@brechtcorbeel) on Unsplash

NVIDIA has long dominated GPUs; now it is pivoting to CPUs, unveiling the high‑performance Vera chip aimed at the broader AI server market, ServeTheHome reports.

Key Facts

  • Key company: Nvidia

NVIDIA’s Vera CPU represents the company’s first fully in‑house processor core for data‑center workloads, a departure from the licensed Arm Neoverse V2 cores that powered the Grace line. According to ServeTheHome, the new “Olympus” core is built on the Armv9.2‑A architecture and was designed from the ground up to meet the bandwidth, latency and instruction‑set demands of modern AI models. By integrating a custom core, NVIDIA can tightly couple the CPU’s cache hierarchy and memory controller with its upcoming Rubin GPU family, reducing interconnect overhead and enabling higher sustained FLOPS per watt than the Grace‑Rubin pairing.

The Vera silicon arrives as a dual‑CPU module, each die featuring up to 128 cores and supporting up to 2 TB of DDR5‑5600 memory per socket. ServeTheHome notes that the module’s inter‑socket link leverages NVIDIA’s NVLink 4 technology, delivering up to 600 GB/s of peer‑to‑peer bandwidth, which the company claims will double the effective memory bandwidth available to AI inference pipelines. The chip also incorporates a dedicated tensor‑accelerator block, distinct from the main GPU, to offload matrix‑multiply kernels that are traditionally handled by the GPU’s streaming multiprocessors (SMs). This hybrid approach is intended to keep the GPU’s compute units focused on large‑scale training while the CPU‑side tensor block handles latency‑critical inference tasks.
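For a sense of scale, the figures above can be combined with some back‑of‑the‑envelope arithmetic. Note that the per‑socket DDR5 channel count has not been disclosed, so it is treated as a parameter here; the 12‑channel value in the example is purely hypothetical.

```python
# Module-level aggregates for the dual-CPU Vera module, using only the
# figures reported by ServeTheHome.
CORES_PER_DIE = 128      # up to 128 cores per die
SOCKETS = 2              # dual-CPU module
MEM_PER_SOCKET_TB = 2    # up to 2 TB of DDR5-5600 per socket
NVLINK_BW_GBS = 600      # claimed peer-to-peer inter-socket bandwidth, GB/s

total_cores = CORES_PER_DIE * SOCKETS
total_mem_tb = MEM_PER_SOCKET_TB * SOCKETS

def peak_dram_bw_gbs(channels: int, mts: int = 5600) -> float:
    """Peak DRAM bandwidth in GB/s: each 64-bit (8-byte) DDR5 channel
    moves `mts` megatransfers/s. Channel count is an assumption, not a
    disclosed spec."""
    return channels * mts * 8 / 1000

print(total_cores)           # 256 cores per module
print(total_mem_tb)          # 4 TB per module
print(peak_dram_bw_gbs(12))  # 537.6 GB/s with a hypothetical 12 channels
```

At plausible channel counts, per‑socket DRAM bandwidth lands in the same order of magnitude as the claimed 600 GB/s NVLink inter‑socket link, which is what makes the "double the effective memory bandwidth" framing arithmetically coherent.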

From a software perspective, NVIDIA is positioning Vera as the backbone of the “Vera Rubin” seven‑chip AI platform unveiled at GTC 2026. VentureBeat reports that the platform will ship with a unified driver stack that presents the CPU, tensor block, and GPU as a single logical device to frameworks such as PyTorch and TensorFlow. This integration is meant to simplify deployment for cloud providers and enterprises that currently manage separate CPU and GPU stacks. The company also announced early collaborations with OpenAI, Anthropic and Meta, suggesting that the first generation of large language models will be optimized for the Vera‑Rubin architecture.
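NVIDIA has not published the Vera Rubin driver API, so the following is only an illustrative toy model of the division of labor described above: small, latency‑critical inference kernels routed to the CPU‑side tensor block, bulk training kernels kept on the GPU, behind one logical device. All names and the threshold are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Kernel:
    name: str
    flops: float            # estimated work in floating-point operations
    latency_critical: bool  # e.g. a single autoregressive decode step

def route(kernel: Kernel, gpu_threshold_flops: float = 1e12) -> str:
    """Pick an execution engine on the unified logical device.

    Toy policy: small latency-sensitive kernels go to the CPU-side
    tensor block; everything else stays on the GPU's SMs.
    """
    if kernel.latency_critical and kernel.flops < gpu_threshold_flops:
        return "cpu_tensor_block"
    return "gpu"

print(route(Kernel("decode_step", 5e9, True)))    # cpu_tensor_block
print(route(Kernel("train_batch", 5e14, False)))  # gpu
```

The point of the sketch is the framework‑facing contract: code submits kernels to one device and the stack, not the application, decides where they run, which is what would let PyTorch or TensorFlow treat the seven‑chip platform as a single target.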

While the technical ambitions are clear, the shift to a custom core introduces risk. ServeTheHome points out that Grace’s reliance on an off‑the‑shelf Arm core gave NVIDIA a proven, low‑risk path to market; by contrast, Olympus is the first NVIDIA‑designed CPU core since the Denver family a decade ago. The company will need to validate the core’s performance, power envelope and silicon yield across a range of server workloads before it can claim parity with established x86 competitors such as Intel’s Sapphire Rapids or AMD’s Genoa. Nonetheless, NVIDIA’s deep expertise in high‑performance interconnects and AI‑centric silicon gives it a unique advantage in delivering a tightly integrated stack that could reshape the economics of AI‑heavy data centers.

Analysts see Vera as NVIDIA’s bid to capture a larger slice of the server‑CPU market, which has historically been dominated by ARM‑based and x86 designs. ZDNet’s coverage of the Rubin announcement emphasizes that the combined CPU‑GPU platform could “transform AI computing as we know it,” by reducing the need for separate host CPUs in AI clusters. If the performance claims hold up in real‑world deployments, Vera could enable hyperscale operators to consolidate workloads onto fewer nodes, lowering total cost of ownership and power consumption. However, the ultimate test will be whether customers adopt the new platform at scale, given the entrenched software ecosystems and procurement pipelines built around existing CPU vendors.

Sources

Primary source

Reporting based on verified sources and public filings. Sector HQ editorial standards require multi-source attribution.
