A critical performance regression tied to the new "sheaves" changes in Linux 7.0's SLUB allocator has been identified and partially fixed. One benchmark saw a 64% drop in IOPS.
A critical performance regression has been identified in the Linux 7.0 kernel's reworked SLUB allocator, and developers are working to fix a bug that caused a 64% performance drop in at least one benchmark. The issue was discovered in late February and has since been traced to changes introduced during the Linux 7.0 merge window.
The Performance Regression
The problem stems from the "sheaves" series of changes merged into the mainline kernel, which imposes unnecessary restrictions on sheaf refill when used together with mempool allocation. The regression is most noticeable in workloads with persistent cross-CPU allocation and free patterns, where objects are repeatedly allocated on one CPU and freed on another.
According to Ming Lei of Red Hat, the impact was severe: "ublk null target benchmark IOPS drops significantly compared to v6.19: from 36M IOPS to ~13M IOPS (64% drop)." This dramatic performance decrease affected systems running workloads that heavily utilize the slab allocator across multiple CPUs.
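The 64% figure follows directly from the two IOPS numbers in the quote. A quick check, treating the approximate "~13M" as exactly 13M for the sake of the arithmetic (the function name `percent_drop` is just illustrative):

```python
def percent_drop(before: float, after: float) -> float:
    """Return the relative decrease from `before` to `after`, in percent."""
    return (before - after) / before * 100.0


baseline_iops = 36e6   # v6.19 figure from the quote
regressed_iops = 13e6  # "~13M" in the quote; taken as exact here (assumption)

print(f"{percent_drop(baseline_iops, regressed_iops):.1f}% drop")  # prints "63.9% drop"
```

Rounded to a whole number, this matches the 64% reported.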
Technical Details
The regression was introduced through commit 815c8e35511d, which merged the 'slab/for-7.0/sheaves' branch into the main slab development branch. However, pinpointing the exact problematic commit proved challenging. Lei noted that bisecting within the sheaves series was blocked by a kernel panic at commit 17c38c88294d ("slab: remove cpu (partial) slabs usage from allocation paths"), preventing identification of the first bad commit.
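To see why a single untestable commit can block a bisection, consider the toy model below. It is not Lei's actual procedure: the function `first_bad_candidates` and the "good"/"bad"/"skip" labels are hypothetical, and the sketch assumes a linear history in which every commit after the first bad one is also bad. A commit that panics cannot be benchmarked, so it is marked "skip", much like a commit passed to `git bisect skip`.

```python
def first_bad_candidates(results: list[str]) -> list[int]:
    """Given per-commit results (oldest first), return the indices that
    could still be the first bad commit.

    Each entry is "good", "bad", or "skip" (untestable, e.g. the kernel
    panics before the benchmark can run). Hypothetical helper.
    """
    # Newest commit known to be good.
    last_good = max(i for i, r in enumerate(results) if r == "good")
    # Oldest commit after that which is known to be bad.
    first_bad = min(
        i for i, r in enumerate(results) if r == "bad" and i > last_good
    )
    # Everything after the last good commit, up to and including the first
    # known-bad one, remains a suspect.
    return list(range(last_good + 1, first_bad + 1))


# Every commit testable: the culprit is pinned down exactly.
print(first_bad_candidates(["good", "good", "bad", "bad"]))          # [2]

# A panicking commit sits on the boundary: two suspects remain.
print(first_bad_candidates(["good", "good", "skip", "bad", "bad"]))  # [2, 3]
```

In the second case the search cannot tell whether the regression arrived with the untestable commit or with the one after it, which is the kind of ambiguity a panic in the middle of a series produces.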
The Fix
SUSE engineer Vlastimil Babka has devised an initial fix that addresses part of the problem by allowing sheaf refill even when the allocation is not allowed to block. This change does not completely resolve the performance regression, but it represents significant progress toward a full solution.
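The effect of a refill restriction can be illustrated with a deliberately simplified model. This is not kernel code: the function `count_slow_path_trips`, the capacity of 4, and the idea of counting "trips" to shared slabs are all assumptions made for illustration. The model assumes one CPU only allocates while another CPU performs every free, so the allocating CPU's local cache (its sheaf) is never replenished by frees.

```python
def count_slow_path_trips(n_allocs: int, sheaf_capacity: int,
                          refill_allowed: bool) -> int:
    """Count how often the allocating CPU must leave its local cache.

    Toy model (hypothetical): frees happen on a different CPU, so the
    local sheaf only gains objects through an explicit refill.
    """
    sheaf = 0        # objects currently cached on the allocating CPU
    slow_trips = 0
    for _ in range(n_allocs):
        if sheaf > 0:
            sheaf -= 1            # fast path: served from the local sheaf
            continue
        slow_trips += 1           # sheaf empty: go to the shared slabs
        if refill_allowed:
            # One trip fetches a whole sheaf; one object is handed out now.
            sheaf = sheaf_capacity - 1
        # Otherwise only the requested object is fetched and the sheaf
        # stays empty, so the next allocation takes the slow path again.
    return slow_trips


print(count_slow_path_trips(1000, 4, refill_allowed=True))   # 250
print(count_slow_path_trips(1000, 4, refill_allowed=False))  # 1000
```

In this sketch, forbidding refill turns every allocation into a slow-path trip, while permitting it spreads the cost over a full sheaf. The real allocator is far more involved, and the model says nothing about the actual size of the regression.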
Babka's fix is currently pending in a pull request and is expected to be merged before the Linux 7.0-rc3 kernel release scheduled for this Sunday. A second patch is also in development to handle the possibility of memory-less nodes, which should further improve performance.
Impact and Context
This regression highlights the complexity of kernel memory management and how changes intended to improve one aspect of performance can inadvertently harm another. The SLUB allocator is a critical component of the Linux kernel, responsible for managing kernel memory allocations efficiently.
A 64% drop in IOPS for affected workloads shows how even well-intentioned architectural changes can have unintended consequences at scale. Users running workloads with heavy cross-CPU allocation and free patterns stand to benefit most from the fix.
Looking Ahead
With one fix pending merge and another patch on the way, Linux 7.0 should see performance restored once both changes are incorporated. The rapid response from the kernel development community demonstrates the effectiveness of the open-source development model, where problems can be quickly identified, analyzed, and resolved by contributors from multiple organizations, including Red Hat and SUSE.
The incident serves as a reminder of the importance of thorough testing, particularly for fundamental kernel components like the slab allocator, before changes are merged into mainline development kernels.
