Cloudflare's Gen 13 Servers: 2x Throughput and 50% Better Perf/Watt with EPYC Turin

Hardware Reporter

Cloudflare has detailed their upgrade to AMD EPYC Turin processors, achieving 2x throughput and 50% better performance-per-watt in their new Gen 13 server platform.

Cloudflare has unveiled its latest server generation, dubbed "Gen 13," built on AMD's EPYC 9005 "Turin" processors. The company's technical blog post reports 2x throughput and 50% better performance-per-watt compared with its previous-generation hardware, a substantial upgrade for its global infrastructure.

From Genoa-X to Turin: The Evolution of Cloudflare's Hardware

Two years ago, Cloudflare detailed its choice of AMD EPYC Genoa-X processors for its Gen 12 servers, which provided excellent performance for its edge computing workloads. For the Gen 13 platform, the company has moved to the flagship EPYC 9965 SKU from AMD's latest EPYC 9005 series. Cloudflare presents the transition as more than a generational upgrade: it is a rethinking of how the company approaches server architecture.

The decision to go with the EPYC 9965, which lacks the 3D V-Cache that the Gen 12 Genoa-X parts carried, demonstrates Cloudflare's confidence in Turin's core architecture. The company's engineers found that the base Turin processors delivered strong results across a diverse workload profile, from HTTP request handling to more complex edge computing tasks.

FL2: The Rust Rewrite That Amplified Performance

One of the most interesting aspects of Cloudflare's Gen 13 deployment isn't the hardware upgrade itself but the software improvements that came alongside it. FL2, Cloudflare's Rust-based rewrite of its core request handling layer, played a crucial role in the performance figures the company is now reporting.

FL2's architecture, with its emphasis on better memory access patterns and reduced overhead, delivered up to 50% more requests per CPU and up to 70% lower latency. The pairing shows how hardware and software optimization must work in concert to achieve maximum performance gains.
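
To put those two FL2 figures in more familiar terms, here is a small back-of-envelope sketch. It is not from Cloudflare's post: the function names and the 10 ms baseline are hypothetical, and it treats the "up to" figures as if they applied in full.

```python
def cpu_time_saving(throughput_gain: float) -> float:
    """Fractional drop in CPU time per request for a given gain in requests per CPU.

    throughput_gain=0.5 means 50% more requests per CPU.
    """
    if throughput_gain <= -1.0:
        raise ValueError("throughput_gain must be greater than -1")
    # Serving (1 + gain) times the requests on one CPU means each
    # request costs 1 / (1 + gain) of the CPU time it used to.
    return 1.0 - 1.0 / (1.0 + throughput_gain)


def latency_after(baseline_ms: float, reduction: float) -> float:
    """Latency after a fractional reduction (0.7 means 70% lower)."""
    if not 0.0 <= reduction <= 1.0:
        raise ValueError("reduction must be between 0 and 1")
    return baseline_ms * (1.0 - reduction)


if __name__ == "__main__":
    # 50% more requests per CPU is roughly a one-third cut in CPU time per request.
    print(f"CPU time saved per request: {cpu_time_saving(0.5):.1%}")
    # Hypothetical 10 ms baseline, chosen only for illustration.
    print(f"Latency after 70% reduction: {latency_after(10.0, 0.7):.1f} ms")
```

The point of the first function is that "50% more requests per CPU" is not the same as "50% less CPU": the saving per request works out to about a third.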

PQOS: Fine-Grained Resource Control

Cloudflare's engineers also highlighted AMD's Platform Quality of Service (PQOS) extensions as a valuable feature in their Turin deployment. PQOS provides more fine-grained control over shared resources like cache and memory bandwidth, allowing Cloudflare to better manage their multi-tenant edge environment.

This level of control is particularly important for a company like Cloudflare, which runs a wide range of services on the same hardware, from simple CDN caching to complex serverless functions. The ability to allocate resources more precisely helps ensure that high-priority workloads get the resources they need without starving background processes.
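
For readers unfamiliar with how this kind of control is expressed in practice, the sketch below builds the text lines that Linux's resctrl interface accepts in a control group's `schemata` file. The source does not say how Cloudflare drives PQOS, so the use of resctrl is an assumption, and the `Allocation` type, the 16-way cache, and the bandwidth values are all hypothetical. The code only formats strings; it does not touch `/sys/fs/resctrl`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Allocation:
    """Hypothetical allocation for one L3 cache domain."""
    domain: int     # cache domain id as listed by resctrl
    first_way: int  # lowest cache way in the contiguous range
    num_ways: int   # number of contiguous ways granted
    mem_bw: int     # memory bandwidth value; units depend on the platform


def way_mask(first_way: int, num_ways: int, total_ways: int) -> int:
    """Contiguous capacity bitmask covering num_ways ways from first_way."""
    if num_ways < 1 or first_way < 0 or first_way + num_ways > total_ways:
        raise ValueError("way range does not fit in the cache")
    return ((1 << num_ways) - 1) << first_way


def schemata_lines(allocs, total_ways=16):
    """Format L3 and MB lines in the resctrl schemata syntax."""
    if not allocs:
        raise ValueError("at least one allocation is required")
    l3 = ";".join(
        f"{a.domain}={way_mask(a.first_way, a.num_ways, total_ways):x}"
        for a in allocs
    )
    mb = ";".join(f"{a.domain}={a.mem_bw}" for a in allocs)
    return [f"L3:{l3}", f"MB:{mb}"]


if __name__ == "__main__":
    # Hypothetical background group: 4 of 16 ways on two cache domains.
    background = [Allocation(0, 0, 4, 2048), Allocation(1, 0, 4, 2048)]
    for line in schemata_lines(background):
        print(line)
```

On a real system, lines like these would be written to the `schemata` file of a resctrl group, with the tasks to be constrained listed in that group's `tasks` file. The valid mask widths and bandwidth values depend on the specific CPU.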

Quantitative Improvements: The Numbers Behind the Upgrade

When comparing their Gen 13 EPYC 9965-based servers to the previous Gen 12 Genoa-X platform, Cloudflare found:

  • 2x throughput improvement
  • 50% better performance-per-watt
  • Up to 60% higher rack throughput

These numbers are particularly impressive given that they're achieved without the additional performance boost that 3D V-Cache would provide. For a company operating at Cloudflare's scale, these improvements translate directly to reduced operational costs and the ability to handle more traffic with the same physical infrastructure.
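
The first two figures can be combined into a rough estimate of what they imply for power draw. This is a back-of-envelope sketch, not a calculation from Cloudflare's post: it assumes the 2x throughput and 50% performance-per-watt numbers were measured on the same workload at the same operating point, and the function names are hypothetical.

```python
def per_server_power_ratio(throughput_ratio: float, perf_per_watt_ratio: float) -> float:
    """Power of a new server relative to an old one.

    power = throughput / (throughput per watt)
    """
    if throughput_ratio <= 0 or perf_per_watt_ratio <= 0:
        raise ValueError("ratios must be positive")
    return throughput_ratio / perf_per_watt_ratio


def fleet_for_fixed_load(throughput_ratio: float, perf_per_watt_ratio: float):
    """Relative server count and total power needed to serve an unchanged load."""
    servers = 1.0 / throughput_ratio
    power = servers * per_server_power_ratio(throughput_ratio, perf_per_watt_ratio)
    return servers, power


if __name__ == "__main__":
    ratio = per_server_power_ratio(2.0, 1.5)
    servers, power = fleet_for_fixed_load(2.0, 1.5)
    print(f"Power per server vs. Gen 12: {ratio:.2f}x")
    print(f"Servers for the same load:   {servers:.2f}x")
    print(f"Total power for same load:   {power:.2f}x")
```

Under those assumptions, each Gen 13 server would draw about a third more power than its predecessor while doing twice the work, so serving the same traffic would take half the servers and roughly two-thirds of the power.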

The Broader Context: EPYC Turin's Market Impact

Cloudflare's results with EPYC Turin align with the broader performance narrative that has emerged from independent benchmarking over the past year and a half. The EPYC 9005 series has consistently demonstrated exceptional performance across various workloads, from traditional server tasks to more specialized computing scenarios.

For those interested in the technical details, Cloudflare has published a second blog post that outlines more specifics about their AMD EPYC 9965 server layout and components. This companion piece covers topics like ideal GB-per-core configuration, thermal efficiency considerations, 100 GbE networking, and PCIe 5.0 capabilities.

What This Means for the Industry

Cloudflare's successful deployment of EPYC Turin servers provides valuable data points for other companies considering similar upgrades. The combination of raw performance improvements, power efficiency gains, and software optimization creates a compelling case for modernizing server infrastructure.

The 50% improvement in performance-per-watt is particularly noteworthy at a time when energy costs and environmental concerns weigh increasingly on infrastructure decisions. For large-scale operators, an efficiency gain of this size can mean the difference between building new data centers and getting more capacity out of existing facilities.

As cloud providers and enterprises continue to evaluate their hardware strategies, Cloudflare's Gen 13 deployment serves as a real-world validation of AMD's server processor roadmap. It also illustrates the benefits of an approach to infrastructure modernization that considers hardware and software improvements together.
