Cloudflare's Gen 13 servers represent a fundamental architectural shift from cache-dependent to parallelism-based design, delivering double the traffic capacity with the same response times and improved energy efficiency.
Cloudflare recently introduced its Gen 13 servers, marking a significant shift in how its edge network handles traffic. Rather than relying on large CPU caches for speed, the company redesigned its software to leverage many more processor cores working in parallel in its latest AMD-based servers. This hardware-software co-design approach demonstrates how modern cloud infrastructure is evolving to take advantage of changing processor architectures.
The Architectural Shift: From Cache to Parallelism
The most notable aspect of Cloudflare's approach is its move away from depending on very large CPU caches that had compensated for software not scaling well across many cores. Instead, Cloudflare optimized its stack to work efficiently with processors that have smaller caches but more cores, representing a fundamental change in how edge computing infrastructure is designed.
According to Cloudflare's engineers, when testing newer AMD Turin Dense CPUs with about one-third of the cache of previous generations, latency initially increased by around 50%. By analyzing the bottlenecks and rewriting key parts of their software, Cloudflare eliminated this latency penalty while unlocking significant performance gains.
Gen 13 Hardware Specifications
The Gen 13 servers are built around impressive hardware:
- 192-core AMD EPYC Turin 9965 processor
- 768 GB of DDR5-6400 memory
- 24 TB of PCIe 5.0 NVMe storage
- Dual 100 GbE network ports
These specifications allow Gen 13 servers to handle up to twice as much traffic as the previous Gen 12 models while meeting the same response-time targets. More importantly, the changes deliver around 60% more capacity per rack without increasing power consumption, while also providing more memory, storage, and network bandwidth.

The Software Revolution: FL2 in Rust
The key to Cloudflare's success lies in its redesigned FL2 software stack, written in Rust. The engineers explain that FL2's cleaner architecture, with better memory access patterns and less dynamic allocation, doesn't depend on massive L3 caches the way their previous FL1 implementation did.
"The goal was to support workloads that now scale with parallelism rather than cache, enabling significantly higher request capacity and better performance-per-watt across Cloudflare's global edge infrastructure," write Syona Sarma, JQ Lau, Ma Xiong, and Victor Hwang in their detailed technical post about the platform.
This transition represents a significant engineering effort. Cloudflare had to identify and eliminate bottlenecks in its code that the large caches of earlier processors had masked. The move to Rust likely contributed to this improvement through better memory safety and more explicit control over memory access patterns.
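Cloudflare has not published FL2's internals, so the following is only an illustrative sketch of the kind of change the engineers describe as "less dynamic allocation": replacing a fresh heap allocation on every request with a reusable scratch buffer, which keeps hot data in the same memory and eases pressure on smaller caches. All function and variable names here are hypothetical.

```rust
// Per-request allocation: a new Vec is created on every call, so each
// request touches freshly allocated memory.
fn process_allocating(payload: &[u8]) -> usize {
    let upper: Vec<u8> = payload.iter().map(|b| b.to_ascii_uppercase()).collect();
    upper.iter().filter(|&&b| b == b'A').count()
}

// Buffer reuse: one scratch buffer is allocated once and reused across
// calls, so the working set stays in the same (likely cached) memory.
fn process_reusing(payload: &[u8], scratch: &mut Vec<u8>) -> usize {
    scratch.clear();
    scratch.extend(payload.iter().map(|b| b.to_ascii_uppercase()));
    scratch.iter().filter(|&&b| b == b'A').count()
}

fn main() {
    let requests = [b"alpha".as_slice(), b"banana".as_slice(), b"cache".as_slice()];
    let mut scratch = Vec::with_capacity(64);
    for req in requests {
        // Both variants compute the same result; only allocation behavior differs.
        assert_eq!(process_allocating(req), process_reusing(req, &mut scratch));
    }
    println!("counts match");
}
```

The pattern generalizes: arenas, object pools, and preallocated per-worker state all trade per-request allocation for stable, locality-friendly memory, which matters more when L3 cache is scarce.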
Performance Improvements and Energy Efficiency
The results of this architectural shift are substantial. Gen 13 servers handle twice the traffic of Gen 12 while maintaining the same response times, and the roughly 60% increase in per-rack capacity comes without any increase in power consumption.
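As a back-of-envelope check on those numbers (illustrative arithmetic only, with a normalized power baseline, not Cloudflare's measurements): 60% more capacity per rack at unchanged power implies a 1.6x rack-level performance-per-watt gain.

```rust
fn main() {
    let gen12_capacity = 100.0_f64; // arbitrary baseline capacity units per rack
    let gen13_capacity = gen12_capacity * 1.6; // +60% per rack, per the article
    let rack_power = 1.0_f64; // same power budget, normalized

    // perf/watt ratio between generations reduces to the capacity ratio
    // because power is held constant.
    let gain = (gen13_capacity / rack_power) / (gen12_capacity / rack_power);
    assert!((gain - 1.6).abs() < 1e-9);
    println!("rack-level perf/watt gain: {gain:.1}x");
}
```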
The combination of more cores, faster memory, and improved networking (100 GbE) creates a platform that can handle the growing demands of edge applications, including increasingly complex AI workloads and real-time data processing.
Community Reaction and Technical Questions
The announcement sparked interesting discussions in the developer community. On Hacker News, many readers found the architectural shift intriguing but questioned how much of the improvement came from the hardware versus the software rewrite.
User gdwatson commented: "I don't think they explained how they solved the cache issue except to say they rewrote the software in Rust. They talked about Rust's greater memory safety; it would have been nice to know whether there were specific language features that played into the cache difference or whether it just made the authors comfortable using a systems language in this application and that made the difference."
The comment points to a common request from the community: more detailed disclosure of the specific optimizations and benchmarks, which would make it possible to attribute the performance gains between the new hardware and the software rewrite.
Broader Implications for Cloud Infrastructure
Cloudflare's approach represents a significant trend in cloud infrastructure design: hardware-software co-design. As processor architectures evolve with more cores and different cache characteristics, software must adapt to take advantage of these changes.
This shift has several important implications:
- Rust in systems programming: Cloudflare's success with Rust demonstrates the language's potential for high-performance systems programming.
- Parallelism over caching: As processor designs favor more cores over larger caches, software must be designed to scale horizontally rather than depending on cache locality.
- Energy efficiency: Better performance-per-watt is becoming increasingly important as data centers expand and face scrutiny over energy consumption.
- Edge computing evolution: The edge is becoming more capable of handling complex workloads, requiring more sophisticated infrastructure designs.
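The "parallelism over caching" point above can be sketched in Rust: rather than relying on one shared, cache-resident structure, work is sharded across threads that each own their slice of the data outright, so throughput scales with core count and there is no lock or cache-line contention. This is a generic pattern, not Cloudflare's actual design.

```rust
use std::thread;

// Split the input into one shard per worker, process shards in parallel
// with no shared mutable state, then merge the per-shard results.
fn shard_and_sum(values: &[u64], workers: usize) -> u64 {
    let chunk = (values.len() + workers - 1) / workers; // ceiling division
    thread::scope(|s| {
        let handles: Vec<_> = values
            .chunks(chunk.max(1))
            .map(|shard| s.spawn(move || shard.iter().sum::<u64>()))
            .collect();
        // Each worker owns its shard; joining merges independent results.
        handles.into_iter().map(|h| h.join().unwrap()).sum()
    })
}

fn main() {
    let data: Vec<u64> = (1..=1000).collect();
    assert_eq!(shard_and_sum(&data, 8), 500_500);
    println!("ok");
}
```

Because shards share nothing while running, adding cores adds capacity almost linearly, which is exactly the property that makes software "scale with parallelism rather than cache."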
Technical Implementation Details
In their detailed post, Cloudflare's engineers shared some technical insights about their implementation:
- They worked closely with AMD to analyze performance bottlenecks
- The FL2 stack was redesigned with better memory access patterns
- Dynamic allocation was reduced to improve performance
- The transition to 100 GbE networking required careful thermal management
- PCIe encryption hardware support was improved for security
These details show that the transition wasn't just about swapping hardware but required a comprehensive rethinking of how the software interacts with the underlying hardware.
The Future of Edge Computing
Cloudflare's Gen 13 servers represent a step forward in edge computing capabilities. By optimizing for parallelism rather than cache, they've created a platform that can handle more complex workloads at the edge, reducing the need for round trips to central data centers.
This approach is particularly important as edge applications become more sophisticated, including AI inference, real-time data processing, and increasingly complex security features. The ability to handle these workloads efficiently at the edge reduces latency, improves user experience, and reduces bandwidth costs.
Conclusion
Cloudflare's architectural shift from cache-dependent to parallelism-based design demonstrates the importance of hardware-software co-design in modern cloud infrastructure. By redesigning their FL2 stack in Rust to better utilize the many cores in their new AMD processors, they've achieved significant performance improvements while also improving energy efficiency.
This approach offers valuable lessons for other organizations building edge infrastructure:
- Plan for hardware evolution when designing software
- Consider memory access patterns and allocation strategies
- Leverage modern systems programming languages for better control
- Focus on performance-per-watt as well as raw performance
As processor architectures continue to evolve, we can expect to see more organizations adopting similar approaches to build more efficient and capable edge infrastructure.
For more technical detail, see Cloudflare's engineering post about the Gen 13 servers and the official announcement covering the server hardware.
