NVIDIA Vera CPU: The First Processor Built Specifically for Agentic AI
#Chips

Startups Reporter

NVIDIA has launched Vera CPU, claiming twice the efficiency and 50% faster performance than traditional CPUs, with major cloud providers and manufacturers already adopting the technology.

NVIDIA has unveiled the Vera CPU, positioning it as the world's first processor purpose-built for the age of agentic AI and reinforcement learning. The company claims the new chip delivers twice the energy efficiency and 50% higher performance than traditional rack-scale CPUs, marking a significant shift in how AI infrastructure is designed.

The timing appears strategic. As reasoning and agentic AI advance, the computational demands of systems that plan tasks, run tools, interact with data, execute code, and validate results are growing rapidly. NVIDIA argues that scale, performance, and cost are increasingly driven by the infrastructure supporting these models rather than by the models themselves.

Built on Grace, Designed for the Future

The Vera CPU builds on the NVIDIA Grace CPU, enabling organizations across industries to build AI factories that can unlock agentic AI at scale. NVIDIA says Vera offers the highest single-thread performance and bandwidth per core of any available CPU, and positions it as a new class of processor designed to deliver higher AI throughput, responsiveness, and efficiency for large-scale AI services.

These services include coding assistants, consumer agents, and enterprise AI applications that require consistent, predictable performance under heavy workloads. The chip's architecture appears particularly suited for multi-tenant environments where many jobs run simultaneously.

Configurable for Every Data Center

NVIDIA announced a new Vera CPU rack integrating 256 liquid-cooled Vera CPUs capable of sustaining more than 22,500 concurrent CPU environments, each running independently at full performance. This allows AI factories to quickly deploy and scale to tens of thousands of simultaneous instances and agentic tools within a single rack.
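The rack-level figure is consistent with the core count NVIDIA quotes for the chip itself: 256 CPUs at 88 cores each gives 22,528 cores, which matches "more than 22,500" if each environment is pinned to one physical core (an assumption on our part; NVIDIA does not spell out the mapping). A quick sanity check:

```python
# Sanity check on the Vera rack arithmetic.
# Assumption: one "CPU environment" per physical core --
# NVIDIA does not state the mapping explicitly.
CPUS_PER_RACK = 256
CORES_PER_CPU = 88  # custom Olympus cores, per NVIDIA's announcement

environments = CPUS_PER_RACK * CORES_PER_CPU
print(environments)  # → 22528, i.e. "more than 22,500"
```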

The new Vera rack utilizes the NVIDIA MGX modular reference architecture, supported by 80 ecosystem partners worldwide. As part of the NVIDIA Vera Rubin NVL72 platform, Vera CPUs are paired with NVIDIA GPUs through NVIDIA NVLink-C2C interconnect technology, providing 1.8 TB/s of coherent bandwidth—seven times the bandwidth of PCIe Gen 6—for high-speed data sharing between CPUs and GPUs.
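The "seven times the bandwidth of PCIe Gen 6" claim lines up with public PCIe numbers, assuming the comparison is against a single x16 PCIe 6.0 link (roughly 128 GB/s per direction, about 256 GB/s bidirectional):

```python
# Rough check of the NVLink-C2C vs. PCIe Gen 6 bandwidth ratio.
# Assumption: the baseline is one x16 PCIe 6.0 link, counted bidirectionally.
NVLINK_C2C_GBPS = 1800         # 1.8 TB/s coherent bandwidth
PCIE6_X16_GBPS = 2 * 16 * 8    # ~8 GB/s per lane per direction, x16, both directions

ratio = NVLINK_C2C_GBPS / PCIE6_X16_GBPS
print(round(ratio, 1))  # → 7.0
```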

Additionally, NVIDIA introduced reference designs using Vera as the host CPU for NVIDIA HGX Rubin NVL8 systems, coordinating data movement and system control for GPU-accelerated workloads. System partners are offering both dual and single-socket CPU server configurations optimized for workloads such as reinforcement learning, agentic inference, data processing, orchestration, storage management, cloud applications, and high-performance computing.

Designed for Agentic Scaling

The Vera CPU combines high-performance, energy-efficient CPU cores with a high-bandwidth memory subsystem and the second-generation NVIDIA Scalable Coherency Fabric. This combination enables faster agentic responses under the extreme utilization conditions common for agentic AI and reinforcement learning workloads.

Key specifications include 88 custom NVIDIA-designed Olympus cores, delivering high performance for compilers, runtime engines, analytics pipelines, agentic tooling, and orchestration services. Each core can run two tasks using NVIDIA Spatial Multithreading, providing consistent, predictable performance ideal for multi-tenant AI factories running many jobs simultaneously.

To enhance energy efficiency, Vera introduces a second-generation low-power memory subsystem built on LPDDR5X memory, delivering up to 1.2 TB/s of bandwidth—twice the bandwidth at half the power compared with general-purpose CPUs.

Early Adoption and Ecosystem Support

The Vera CPU has already attracted significant interest from major technology companies. Cursor, an AI-native software development platform, is adopting Vera to boost performance for its AI coding agents. "We're excited to use NVIDIA Vera CPUs to improve overall throughput and efficiency so we can deliver faster, more responsive coding agent experiences for our customers," said Michael Truell, cofounder and CEO of Cursor.

Redpanda, a streaming data and AI platform, has tested Vera running Apache Kafka-compatible workloads and reported dramatically better performance than other systems. "We saw up to 5.5x lower latency," said Alex Gallego, founder and CEO of Redpanda. "Vera represents a new direction in CPU architecture, with more memory and less overhead per core, enabling our customers to scale real-time streaming workloads further than ever and unlock new AI and agentic applications."

National laboratories planning to deploy Vera CPUs include Leibniz Supercomputing Centre, Los Alamos National Laboratory, Lawrence Berkeley National Laboratory's National Energy Research Scientific Computing Center, and the Texas Advanced Computing Center (TACC). "Vera's per-core performance and memory bandwidth represent a giant step forward for scientific computing," said John Cazes, director of high-performance computing at TACC.

Leading cloud service providers planning to deploy Vera CPUs include Alibaba, ByteDance, Cloudflare, CoreWeave, Crusoe, Lambda, Nebius, Nscale, Oracle Cloud Infrastructure, Together.AI, and Vultr. Infrastructure providers adopting Vera CPUs include Aivres, ASRock Rack, ASUS, Compal, Cisco, Dell, Foxconn, GIGABYTE, HPE, Hyve, Inventec, Lenovo, MiTAC, MSI, Pegatron, Quanta Cloud Technology (QCT), Supermicro, Wistron, and Wiwynn.

Availability and Market Positioning

NVIDIA Vera is in full production and will be available from partners in the second half of this year. The broad adoption across hyperscalers, system makers, and cloud providers suggests NVIDIA is positioning Vera as the new CPU standard for AI workloads that matter most to developers, startups, public-private institutions, and enterprises.

This strategy aims to democratize access to AI infrastructure while accelerating innovation across the ecosystem. By providing a single software stack across the NVIDIA platform, customers can optimize for specific workloads while maintaining consistency.

The launch comes at a pivotal moment when AI systems are becoming increasingly autonomous and capable of complex reasoning. Whether Vera can deliver on its ambitious performance claims and establish itself as the go-to CPU for agentic AI remains to be seen, but the level of early interest suggests the market is ready for purpose-built AI infrastructure.

For developers and organizations building the next generation of AI applications, Vera represents a potential foundation for scaling agentic systems beyond what traditional CPUs can support. The question now is whether this specialized approach will prove more effective than continuing to adapt general-purpose processors for AI workloads.
