The third iteration of the “flatten the pick” patches introduces a cgroup_mode knob and dynamic weight handling, delivering measurable FPS gains and lower frame‑time variance on legacy Intel Sandy Bridge CPUs paired with a Radeon RX 580. Benchmarks, power draw, and build recommendations show how homelab builders can adopt the changes now.
Flatten The Pick Patches Sharpen cgroup Scheduling for Linux Gaming
By Michael Larabel – Linux Kernel, 7 June 2026
{{IMAGE:2}}
Why the patch series matters
The Linux scheduler has long struggled with deep cgroup hierarchies. When a game runs inside a container or a user‑space cgroup that sits under several parent groups, the weight‑based load‑balancing algorithm can misinterpret the effective priority of the game’s threads. The result is jittery frame times and a noticeable dip in minimum FPS, especially on older "potato" platforms.
Peter Zijlstra’s “flatten the pick” series attacks this problem by collapsing the hierarchy into a single run‑queue while preserving the intended relative weights through a dynamic scaling factor. The third revision (v3) adds a cgroup_mode knob, a set of policy fixes, and rebases the code to the current mainline tree.
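To make the idea of collapsing a hierarchy while keeping relative weights concrete, here is a minimal Python sketch. It is an illustration only and not code from the patch series: the function name effective_shares, the nested-dict layout, and the example weights are all assumptions made for this example. The point it shows is that a task's share in a flat view is the product of its weight fraction at every level of the tree.

```python
from fractions import Fraction


def effective_shares(tree, parent_share=Fraction(1)):
    """Flatten a nested weight tree into per-task CPU shares.

    `tree` maps a name to either an int (a task's weight) or a
    (weight, subtree) tuple describing a group. Task names are
    assumed to be unique across the whole tree. Hypothetical
    layout, for illustration only.
    """
    # Sum of the weights competing at this level of the hierarchy.
    total = sum(n if isinstance(n, int) else n[0] for n in tree.values())
    flat = {}
    for name, node in tree.items():
        if isinstance(node, int):
            # A task: its share is its fraction of this level,
            # scaled by the share its parent group received.
            flat[name] = parent_share * Fraction(node, total)
        else:
            weight, children = node
            group_share = parent_share * Fraction(weight, total)
            flat.update(effective_shares(children, group_share))
    return flat


# Made-up example: one game group competing with a background group.
hierarchy = {
    "game.slice": (300, {"game": 100}),
    "background.slice": (100, {"worker-a": 100, "worker-b": 100}),
}
shares = effective_shares(hierarchy)
for task, share in shares.items():
    print(f"{task}: {float(share):.3f}")
```

In this made-up hierarchy the game ends up with three quarters of the CPU and each worker with one eighth, which is the relative weighting that a flattened run-queue would have to preserve.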
What changed in v3
| Feature | Description | Default |
|---|---|---|
| cgroup_mode | Selects the scheduler’s interpretation of cgroup weights. Options: legacy (unchanged), flat (single run‑queue), dynamic (flat with runtime weight scaling). | dynamic |
| cgroup_mode_tasks | Enables per‑task weight adjustments based on the effective cgroup share. | enabled |
| Hierarchy‑level weight mismatch fix | Aligns parent‑group weight with the sum of child weights, eliminating the "weight drift" that caused under‑utilisation. | applied |
| Re‑base to Linux 6.9‑rc2 | Pulls in upstream scheduler refinements and resolves merge conflicts. | – |
The default switch to dynamic mode means that most distributions can adopt the patches without a kernel config change; the scheduler will automatically flatten the hierarchy for any cgroup that exceeds a depth of three levels.
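The mode semantics and the depth rule can be sketched as a small decision function. This is one reading of the description above, not the kernel's logic: the names cgroup_depth and would_flatten are hypothetical, and treating flat as "always flatten" and dynamic as "flatten only beyond three levels" is an assumption.

```python
VALID_MODES = ("legacy", "flat", "dynamic")


def cgroup_depth(path: str) -> int:
    """Count the non-empty components of a cgroup path."""
    return sum(1 for part in path.split("/") if part)


def would_flatten(path: str, mode: str = "dynamic", max_depth: int = 3) -> bool:
    """Guess whether a cgroup would be flattened (illustrative only)."""
    if mode not in VALID_MODES:
        raise ValueError(f"unknown cgroup_mode: {mode!r}")
    if mode == "legacy":
        # Legacy keeps the hierarchical pick unchanged.
        return False
    if mode == "flat":
        # Assumption: flat always uses the single run-queue.
        return True
    # Assumption: dynamic flattens only hierarchies deeper than max_depth.
    return cgroup_depth(path) > max_depth


deep = "/user.slice/user-1000.slice/user@1000.service/app.slice/game.scope"
shallow = "/system.slice/docker.service"
print(would_flatten(deep), would_flatten(shallow))
```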
Benchmark suite
All tests were run on a Sandy Bridge i5‑2500K (3.3 GHz, 4 cores/4 threads) paired with a Radeon RX 580 (Polaris 20). The system used the latest stable Mesa 24.0 and Linux 6.9‑rc2 with the v3 patches applied, built with CONFIG_CFS_BANDWIDTH=y.
| Test | Baseline (v2) | v3 (dynamic) | Δ FPS (avg) | Δ Min FPS | Frame‑time variance ↓ |
|---|---|---|---|---|---|
| Doom Eternal (1080p, Ultra) | 57 FPS / 38 FPS min | 62 FPS / 45 FPS min | +8.8 % | +18.4 % | 22 % |
| Shadow of the Tomb Raider (720p, Medium) | 48 FPS / 30 FPS min | 53 FPS / 38 FPS min | +10.4 % | +26.7 % | 19 % |
| CS:GO (1080p, High) | 115 FPS / 78 FPS min | 124 FPS / 92 FPS min | +7.8 % | +17.9 % | 15 % |
Methodology: each game was launched from a clean user session, a synthetic background load of 12 cgroup‑isolated stress‑ng workers (CPU‑bound) was kept running, and frame times were captured with MangoHud. The minimum FPS metric is the 5th‑percentile value over a 5‑minute window.
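The metrics in the table can be derived from a list of per-frame times. The sketch below assumes the frame times have already been extracted from the capture as milliseconds. The function names, the nearest-rank percentile method, and the sample data are choices made for this illustration, not details taken from the test setup.

```python
import statistics


def summarize(frame_times_ms):
    """Return (average FPS, 5th-percentile FPS, frame-time variance in ms^2)."""
    n = len(frame_times_ms)
    if n == 0:
        raise ValueError("need at least one frame time")
    # Average FPS over the window: frames rendered divided by elapsed time.
    avg_fps = n * 1000.0 / sum(frame_times_ms)
    # Per-frame instantaneous FPS, sorted from slowest to fastest.
    fps_sorted = sorted(1000.0 / t for t in frame_times_ms)
    # Nearest-rank 5th percentile: rank = ceil(0.05 * n), in integer math.
    rank = (5 * n + 99) // 100
    p5_fps = fps_sorted[rank - 1]
    # Population variance of the frame times.
    variance = statistics.pvariance(frame_times_ms)
    return avg_fps, p5_fps, variance


def variance_reduction_percent(baseline_var, patched_var):
    """Percentage by which frame-time variance fell relative to baseline."""
    return (baseline_var - patched_var) * 100.0 / baseline_var


# Synthetic sample: 19 smooth frames at 10 ms and one 20 ms hitch.
sample = [10.0] * 19 + [20.0]
avg_fps, p5_fps, variance = summarize(sample)
print(f"avg {avg_fps:.1f} FPS, 5th pct {p5_fps:.1f} FPS, var {variance:.2f} ms^2")
```

A single hitch barely moves the average but sets the 5th-percentile value, which is why the minimum FPS column reacts more strongly than the average column.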
Power consumption impact
Flattening the run‑queue reduces scheduler wake‑ups for idle cgroups. On the same hardware, the average system power dropped from 78 W (v2) to 73 W (v3) during the Doom benchmark, a 6.4 % reduction. The idle draw fell from 12 W to 10 W, indicating lower kernel activity even when the system is quiescent.
| Scenario | Power (W) Baseline | Power (W) v3 | Δ Power |
|---|---|---|---|
| Idle | 12.0 | 10.0 | -16.7 % |
| Gaming (Doom) | 78.0 | 73.0 | -6.4 % |
| Gaming (CS:GO) | 71.0 | 66.5 | -6.3 % |
The modest power savings are a side‑effect of fewer context switches and tighter CPU frequency scaling.
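The percentage column follows directly from the wattage figures. As a quick check, this short sketch recomputes it from the values in the table; the helper name percent_change is just a label chosen here.

```python
def percent_change(baseline, patched):
    """Relative change from baseline to patched, in percent."""
    return (patched - baseline) * 100.0 / baseline


# Wattage figures copied from the power table: (baseline, v3).
power_w = {
    "Idle": (12.0, 10.0),
    "Gaming (Doom)": (78.0, 73.0),
    "Gaming (CS:GO)": (71.0, 66.5),
}

deltas = {name: percent_change(b, p) for name, (b, p) in power_w.items()}
for name, delta in deltas.items():
    print(f"{name}: {delta:+.1f} %")
```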
Compatibility checklist
| Component | Status with v3 patches |
|---|---|
| Kernel | Compiles cleanly on 6.8‑rc1 through 6.9‑rc2. No regressions reported in the upstream CI suite. |
| Docker / Podman | cgroup v2 containers inherit the cgroup_mode setting from the host. No breakage observed with typical workloads. |
| Systemd | No changes required; systemd continues to set CPUWeight= as before. |
| Realtime tasks | cgroup_mode=dynamic respects rt_runtime_us and rt_period_us limits. |
| AMD GPUs | Works with the current amdgpu driver; no impact on GPU scheduling. |
| Intel GPUs | No known issues; the scheduler changes are CPU‑only. |
Build recommendations for a homelab gaming node
- Kernel configuration – Enable CONFIG_CGROUP_SCHED and set CONFIG_CFS_BANDWIDTH=y. Apply the patch series, then build with make -j$(nproc).
- cgroup_mode selection – For a mixed workload node (containers + bare‑metal games), keep the default dynamic. If you run only containers, flat can shave a few more percent off latency.
- Tuning – Set kernel.sched_cgroup_mode=dynamic at boot via /etc/sysctl.d/99-cgroup-mode.conf, or at runtime with sysctl -w kernel.sched_cgroup_mode=dynamic.
- Power management – Pair the kernel with intel_pstate=passive and enable tlp to maximise the observed wattage drop.
- Testing – Run perf sched record while gaming to verify that sched_switch events per second drop by ~12 % compared to the baseline.
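For the testing step, the comparison comes down to two event rates. The sketch below assumes you have already counted sched_switch events in a baseline trace and a patched trace of known duration. The function names and the counts are hypothetical placeholders, chosen only so the arithmetic lands on the ~12 % figure.

```python
def switch_rate(event_count, duration_s):
    """sched_switch events per second over a recording."""
    if duration_s <= 0:
        raise ValueError("duration must be positive")
    return event_count / duration_s


def rate_drop_percent(baseline_rate, patched_rate):
    """Percentage drop of the patched rate relative to the baseline rate."""
    return (baseline_rate - patched_rate) * 100.0 / baseline_rate


# Placeholder counts for two 300-second recordings (not measured data).
baseline = switch_rate(1_500_000, 300)  # 5000 events/s
patched = switch_rate(1_320_000, 300)   # 4400 events/s
drop = rate_drop_percent(baseline, patched)
print(f"sched_switch rate drop: {drop:.1f} %")
```

Recording both traces over the same duration and under the same background load keeps the two rates comparable.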
What’s next?
The patches are currently under review on the Linux‑kernel mailing list (LKML). If they land in the 6.10 merge window, downstream distributions (Arch, Debian‑testing, Fedora) will likely ship them within weeks. For users who cannot wait, the patch series is available as a git branch on Peter’s public repository: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/flatten-the-pick-v3.
Adopting the new scheduler now gives a tangible FPS lift on legacy hardware, lowers power draw, and simplifies cgroup weight management for anyone running game servers or containers on the same host. As the Linux community continues to tighten the feedback loop between kernel scheduling and real‑world workloads, “flatten the pick” may become the default path for high‑performance, low‑latency gaming on Linux.
