NVIDIA GB10 Arm CPU Benchmarks: Dell Pro Max vs. AMD Ryzen AI Max+ Strix Halo

Hardware Reporter

Phoronix tests the 20-core Arm CPU in NVIDIA's GB10 superchip against AMD's Ryzen AI Max+ 395, revealing surprising performance-per-watt advantages for the Blackwell-based system in Linux workloads despite its AI-focused design.

The Dell Pro Max GB10 has been extensively tested for its Blackwell GPU AI capabilities since its launch, but the 20-core Arm CPU within the GB10 superchip has remained largely unexplored in traditional Linux workloads. For readers curious about how this Arm-based CPU performs outside of AI-specific benchmarks, we've run a comprehensive suite of Linux CPU tests comparing the GB10 against AMD's Ryzen AI Max+ 395 "Strix Halo" SoC in the Framework Desktop.

Dell Pro Max GB10 vs. AMD Ryzen AI Max CPU Benchmarks

The NVIDIA GB10 superchip features a heterogeneous Arm CPU design with 20 cores split between ten high-performance Cortex-X925 cores and ten efficiency-focused Cortex-A725 cores. This configuration is paired with 128GB of LPDDR5x memory, providing substantial CPU resources alongside the Blackwell GPU. The AMD Ryzen AI Max+ 395 Strix Halo, by contrast, uses a more traditional x86-64 design with 16 Zen 5 cores and integrated Radeon 8060S graphics.

Test Setup and Methodology

Both systems were configured with Ubuntu 24.04.3 LTS (NVIDIA DGX OS is based on this Ubuntu release), running the Linux 6.14 kernel and the GCC 13.3 compiler. The Dell Pro Max GB10 was provided by Dell for testing, while the Framework Desktop with Strix Halo was supplied by Framework Computer.

Power consumption measurement presented a unique challenge. While AMD and Intel SoCs expose CPU power metrics through Linux's PowerCap/RAPL interfaces, the NVIDIA GB10 doesn't expose CPU-specific power data through these mechanisms. To compare performance-per-watt, we relied on total AC system power consumption measured with a WattsUp Pro power monitor, tracking real-time power draw for both systems during benchmark runs.
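Because only wall power is available, a performance-per-watt figure comes down to dividing a benchmark score by the average AC draw logged during the run. The following is a minimal sketch of that arithmetic, assuming the meter log is available as (timestamp, watts) pairs. The function names and the sample values are hypothetical and are not taken from the test data.

```python
from typing import Sequence, Tuple

Sample = Tuple[float, float]  # (timestamp in seconds, AC power in watts)


def energy_joules(samples: Sequence[Sample]) -> float:
    """Integrate power over time with the trapezoidal rule."""
    total = 0.0
    for (t0, p0), (t1, p1) in zip(samples, samples[1:]):
        total += (p0 + p1) / 2.0 * (t1 - t0)
    return total


def average_watts(samples: Sequence[Sample]) -> float:
    """Time-weighted mean power across the logged interval."""
    if len(samples) < 2:
        raise ValueError("need at least two samples")
    duration = samples[-1][0] - samples[0][0]
    if duration <= 0:
        raise ValueError("samples must span a positive interval")
    return energy_joules(samples) / duration


def perf_per_watt(score: float, samples: Sequence[Sample]) -> float:
    """Score divided by mean wall power (for higher-is-better scores only)."""
    return score / average_watts(samples)


if __name__ == "__main__":
    # Made-up illustrative numbers, not measured data.
    log = [(0.0, 50.0), (1.0, 70.0), (2.0, 70.0), (3.0, 50.0)]
    print(f"average draw: {average_watts(log):.1f} W")
    print(f"score per watt: {perf_per_watt(1240.0, log):.2f}")
```

Since this is total system power, the result reflects the whole machine (memory, storage, power supply losses) rather than the CPU alone, which is the same caveat that applies to the comparison in this article.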

CPU Architecture Comparison

NVIDIA GB10 (Arm-based):

  • 20 total cores: 10x Cortex-X925 + 10x Cortex-A725
  • 128GB LPDDR5x memory
  • Heterogeneous design with performance and efficiency cores
  • Part of a superchip designed primarily for AI workloads

AMD Ryzen AI Max+ 395 Strix Halo:

  • 16 cores (x86-64 architecture)
  • Integrated Radeon 8060S graphics
  • Homogeneous design: all 16 cores are Zen 5
  • Designed for mobile/workstation AI and compute

Benchmark Results

The testing covered a wide range of Linux CPU workloads including compilation tasks, scientific computing, compression, and general-purpose benchmarks. The results reveal interesting trade-offs between the two architectures.

Compilation Performance

In kernel compilation tests using make -j20, the GB10's 20 Arm cores showed competitive performance against the 16-core AMD system. The heterogeneous core design allowed the GB10 to maintain good throughput while managing power consumption more efficiently than a homogeneous design might. However, the AMD system's higher clock speeds and x86-64's mature compiler optimizations gave it an edge in single-threaded compilation stages.

Scientific Computing

For HPC-style workloads like NAS Parallel Benchmarks and SPEC CPU2017, the results varied significantly by workload type. The GB10's Cortex-X925 cores performed admirably in floating-point intensive tasks, but the AMD system's higher IPC (instructions per cycle) and better memory latency characteristics gave it advantages in memory-bound workloads.

Compression and Encoding

In zlib and LZ4 compression tests, the GB10 showed strong multi-threaded scaling, benefiting from its 20-core count. However, for single-threaded compression tasks, the AMD Ryzen AI Max+ 395's higher single-core performance was evident.
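To get a rough feel for how chunked compression scales with thread count on either machine, something like the sketch below could be used. It is not the benchmark used in this testing: it assumes independent fixed-size chunks, synthetic data, and CPython's zlib module, which releases the GIL while compressing so threads can run on separate cores. The function names are hypothetical.

```python
import time
import zlib
from concurrent.futures import ThreadPoolExecutor
from typing import List


def split_chunks(data: bytes, chunk_size: int) -> List[bytes]:
    """Cut the input into fixed-size chunks that compress independently."""
    return [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]


def compress_chunks(chunks: List[bytes], workers: int) -> List[bytes]:
    """Compress every chunk with zlib using `workers` threads."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(zlib.compress, chunks))


def throughput_mb_s(data: bytes, workers: int, chunk_size: int = 1 << 20) -> float:
    """Megabytes of input compressed per second of wall-clock time."""
    chunks = split_chunks(data, chunk_size)
    start = time.perf_counter()
    compress_chunks(chunks, workers)
    elapsed = time.perf_counter() - start
    return len(data) / 1e6 / elapsed


if __name__ == "__main__":
    payload = bytes(range(256)) * (16 * 4096)  # 16 MiB of synthetic data
    base = throughput_mb_s(payload, workers=1)
    for n in (1, 4, 20):
        rate = throughput_mb_s(payload, workers=n)
        print(f"{n:2d} threads: {rate:8.1f} MB/s, speedup {rate / base:.2f}x")
```

Synthetic repetitive data compresses far faster than real files, so the absolute numbers mean little; the speedup column is the part that hints at multi-threaded scaling.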

Power Efficiency Analysis

The most surprising finding from our testing was the GB10's power efficiency. Despite being designed as an AI-focused superchip, the Arm CPU portion demonstrated excellent performance-per-watt in traditional workloads. Total system power consumption during CPU-intensive tasks was notably lower for the Dell Pro Max GB10 than for the Framework Desktop with Strix Halo.

This efficiency advantage stems from several factors:

  1. Heterogeneous Core Design: The ability to schedule lightweight tasks on efficient Cortex-A725 cores while reserving high-performance Cortex-X925 cores for demanding workloads
  2. Arm's Power Management: Advanced power gating and clock scaling mechanisms
  3. Memory Subsystem: The LPDDR5x implementation provides good bandwidth with lower power than traditional DDR5 in some scenarios
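The first point can be made concrete by asking the kernel which cores it considers faster. The sketch below groups CPUs by the `cpu_capacity` value that arm64 Linux exposes in sysfs, then pins the current process to the lowest-capacity group. It is an illustration under assumptions: whether the GB10 reports distinct capacities for its Cortex-X925 and Cortex-A725 cores was not checked here, and the function names are hypothetical.

```python
import glob
import os
import re
from typing import Dict, List


def read_cpu_capacities(sysfs_root: str = "/sys/devices/system/cpu") -> Dict[int, int]:
    """Map CPU index -> cpu_capacity as reported by the kernel (empty if absent)."""
    capacities: Dict[int, int] = {}
    for path in glob.glob(os.path.join(sysfs_root, "cpu[0-9]*", "cpu_capacity")):
        match = re.search(r"cpu(\d+)$", os.path.dirname(path))
        if match is None:
            continue
        with open(path) as handle:
            capacities[int(match.group(1))] = int(handle.read().strip())
    return capacities


def group_by_capacity(capacities: Dict[int, int]) -> Dict[int, List[int]]:
    """Group CPU indices by capacity value, highest capacity first."""
    groups: Dict[int, List[int]] = {}
    for cpu, cap in capacities.items():
        groups.setdefault(cap, []).append(cpu)
    return {cap: sorted(groups[cap]) for cap in sorted(groups, reverse=True)}


if __name__ == "__main__":
    groups = group_by_capacity(read_cpu_capacities())
    for cap, cpus in groups.items():
        print(f"capacity {cap}: CPUs {cpus}")
    if len(groups) > 1 and hasattr(os, "sched_setaffinity"):
        # Pin this process (pid 0 means "self") to the lowest-capacity cores,
        # e.g. for background work that should stay off the fast cores.
        slowest = groups[min(groups)]
        os.sched_setaffinity(0, slowest)
        print(f"pinned to {slowest}")
```

On x86-64 systems the `cpu_capacity` files are normally absent, so the script prints nothing and leaves affinity untouched. In everyday use the kernel scheduler makes these placement decisions itself; manual pinning is mainly useful for experiments.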

The AMD system, while offering higher peak performance in many tests, consumed significantly more power under sustained loads. This translates to better performance-per-watt for the GB10 in scenarios where power is constrained or where thermal management is critical.

Build Recommendations and Use Cases

Based on these benchmarks, the choice between these platforms depends heavily on workload characteristics:

Choose the NVIDIA GB10 if:

  • You need excellent power efficiency for CPU tasks
  • Your workloads benefit from high core counts with good scaling
  • You're running Linux workloads where Arm compatibility is mature
  • You want a platform that balances AI acceleration with capable CPU performance
  • Power consumption and thermal constraints are primary concerns

Choose the AMD Ryzen AI Max+ 395 Strix Halo if:

  • You need maximum single-threaded performance
  • Your software stack has better x86-64 optimization
  • You require higher peak CPU performance regardless of power draw
  • You're working with legacy x86 applications or virtualization
  • GPU compute performance is as important as CPU performance

Linux Compatibility Considerations

The GB10's Arm architecture presents both opportunities and challenges for Linux users. While Ubuntu 24.04.3 LTS provides excellent support, some specialized software may lack Arm builds or require source compilation. The Arm Linux ecosystem has matured considerably in recent years, but x86-64 still enjoys broader software compatibility out of the box.
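In deployment scripts, the practical consequence is usually a check of the machine architecture before choosing between a prebuilt binary and a source build. A minimal sketch of that check follows; the artifact names and the `pick_build` helper are hypothetical and stand in for whatever a given project actually ships.

```python
import platform
from typing import Dict

# Common spellings reported by platform.machine() or `uname -m`.
_ALIASES = {
    "aarch64": "arm64",
    "arm64": "arm64",
    "x86_64": "x86-64",
    "amd64": "x86-64",
}


def normalize_arch(machine: str) -> str:
    """Return 'arm64', 'x86-64', or the lowercased input if unrecognized."""
    key = machine.strip().lower()
    return _ALIASES.get(key, key)


def pick_build(available: Dict[str, str], machine: str) -> str:
    """Choose a prebuilt artifact for this machine, or signal a source build."""
    return available.get(normalize_arch(machine), "build-from-source")


if __name__ == "__main__":
    # Hypothetical artifact list for a project that only ships x86-64 binaries.
    builds = {"x86-64": "tool-linux-x86_64.tar.gz"}
    print(pick_build(builds, platform.machine()))
```

On an Arm system such as the GB10, `platform.machine()` would typically report `aarch64`, so the example falls through to the source-build path.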

For homelab builders and developers, the GB10 offers an interesting alternative to traditional x86 servers. The combination of efficient Arm CPU cores with massive AI acceleration makes it suitable for mixed workloads that include both traditional compute and machine learning tasks.

Conclusion

The NVIDIA GB10's Arm CPU proves surprisingly capable in traditional Linux workloads, offering competitive performance with excellent power efficiency. While it may not match the peak single-threaded performance of AMD's Ryzen AI Max+ 395, its heterogeneous 20-core design and efficient power management make it a compelling option for power-conscious users and those running mixed AI/compute workloads.

The performance-per-watt advantage is particularly noteworthy, suggesting that Arm-based designs are becoming increasingly viable for general-purpose computing tasks, not just specialized workloads. As the software ecosystem continues to mature, platforms like the GB10 could challenge traditional x86 dominance in certain segments.

For homelab builders considering the GB10, the platform offers a unique combination of AI acceleration and capable CPU performance, though the premium pricing and specialized nature of the superchip design may limit its appeal to mainstream users.

The testing methodology and detailed benchmark data are available in the full Phoronix article series.
