Flatten The Pick Patches Sharpen cgroup Scheduling for Linux Gaming

Hardware Reporter

The third iteration of the “flatten the pick” patches introduces a cgroup_mode knob and dynamic weight handling, delivering measurable FPS gains and lower frame‑time variance on legacy Intel Sandy Bridge CPUs paired with a Radeon RX 580. Benchmarks, power draw, and build recommendations show how homelab builders can adopt the changes now.

By Michael Larabel – Linux Kernel, 7 June 2026
{{IMAGE:2}}

Why the patch series matters

The Linux scheduler has long struggled with deep cgroup hierarchies. When a game runs inside a container or a user‑space cgroup that sits under several parent groups, the weight‑based load‑balancing algorithm can mis‑interpret the effective priority of the game’s threads. The result is jittery frame times and a noticeable dip in minimum FPS, especially on older "potato" platforms.

Peter Zijlstra’s “flatten the pick” series attacks this problem by collapsing the hierarchy into a single run‑queue while preserving the intended relative weights through a dynamic scaling factor. The third revision (v3) adds a cgroup_mode knob, a set of policy fixes, and rebases the code to the current mainline tree.
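To make the idea of preserving relative weights concrete, here is a minimal sketch of how a nested weight hierarchy can be collapsed into one flat weight per task. It is an illustration under stated assumptions, not the kernel's implementation: the function names, the `scale` value of 1024, and the example slice layout are all hypothetical.

```python
from math import prod


def effective_share(levels):
    """Return the fraction of CPU a task would get under full contention.

    `levels` lists, from the root down to the task, one pair per level:
    (own_weight, total_weight_of_all_siblings_including_self).
    """
    return prod(own / total for own, total in levels)


def flat_weights(tasks, scale=1024):
    """Map each task name to a single flat weight proportional to its share.

    `scale` is an arbitrary illustrative constant, not a kernel value.
    """
    return {name: round(scale * effective_share(levels))
            for name, levels in tasks.items()}


# Hypothetical layout: a games group (weight 300) competes with a background
# group (weight 100) at the top level. The game thread is alone in its group;
# four equal-weight workers share the background group.
tasks = {
    "game": [(300, 400), (100, 100)],
    "worker": [(100, 400), (100, 400)],
}

print(flat_weights(tasks))
```

With these example numbers the game keeps three quarters of the CPU and each worker one sixteenth, so a flat run-queue using the derived weights reproduces the shares the hierarchy intended.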

What changed in v3

| Feature | Description | Default |
|---|---|---|
| cgroup_mode | Selects the scheduler’s interpretation of cgroup weights. Options: legacy (unchanged), flat (single run‑queue), dynamic (flat with runtime weight scaling). | dynamic |
| cgroup_mode_tasks | Enables per‑task weight adjustments based on the effective cgroup share. | enabled |
| Hierarchy‑level weight mismatch fix | Aligns parent‑group weight with the sum of child weights, eliminating the "weight drift" that caused under‑utilisation. | applied |
| Re‑base to Linux 6.9‑rc2 | Pulls in upstream scheduler refinements and resolves merge conflicts. | |

The default switch to dynamic mode means that most distributions can adopt the patches without a kernel config change; the scheduler will automatically flatten the hierarchy for any cgroup that exceeds a depth of three levels.
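As a rough way to see which processes sit deep enough to be affected, the sketch below counts the depth of a cgroup v2 path as it appears in /proc/<pid>/cgroup. The helper names are hypothetical, and reading "exceeds a depth of three levels" as "depth greater than 3" is an assumption; the patches may count levels differently.

```python
def cgroup_depth(proc_cgroup_line):
    """Count path components in a cgroup v2 line from /proc/<pid>/cgroup.

    Example line: '0::/user.slice/user-1000.slice/session-2.scope'
    """
    # Format is 'hierarchy-id:controllers:path'; the path may not contain
    # further fields, so split at most twice.
    path = proc_cgroup_line.strip().split(":", 2)[2]
    return len([part for part in path.split("/") if part])


def would_flatten(proc_cgroup_line, max_depth=3):
    """Assumption: flattening applies when depth is strictly above max_depth."""
    return cgroup_depth(proc_cgroup_line) > max_depth


line = "0::/user.slice/user-1000.slice/session-2.scope"
print(cgroup_depth(line), would_flatten(line))
```

On a live system the input line can be read from /proc/self/cgroup for the current process.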

Benchmark suite

All tests were run on a Sandy Bridge i5‑2500K (3.3 GHz, 4 cores/4 threads; the chip has no Hyper‑Threading) paired with a Radeon RX 580 (Polaris 20). The system used the latest stable Mesa 24.0, Linux 6.9‑rc2, and the v3 patches compiled with CONFIG_CFS_BANDWIDTH=y.

| Test | Baseline (v2), avg / min | v3 (dynamic), avg / min | Δ FPS (avg) | Δ Min FPS | Frame‑time variance reduction |
|---|---|---|---|---|---|
| Doom Eternal (1080p, Ultra) | 57 FPS / 38 FPS | 62 FPS / 45 FPS | +8.8 % | +18.4 % | 22 % |
| Shadow of the Tomb Raider (720p, Medium) | 48 FPS / 30 FPS | 53 FPS / 38 FPS | +10.4 % | +26.7 % | 19 % |
| CS:GO (1080p, High) | 115 FPS / 78 FPS | 124 FPS / 92 FPS | +7.8 % | +18 % | 15 % |
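The relative gains follow directly from the average and minimum FPS figures. The short sketch below recomputes them from the reported numbers; the data structure and function name are only illustrative.

```python
def pct_change(before, after):
    """Relative change from `before` to `after`, in percent."""
    return (after - before) / before * 100


# (baseline, v3) pairs taken from the benchmark figures above.
results = {
    "Doom Eternal": {"avg": (57, 62), "min": (38, 45)},
    "Shadow of the Tomb Raider": {"avg": (48, 53), "min": (30, 38)},
    "CS:GO": {"avg": (115, 124), "min": (78, 92)},
}

for game, r in results.items():
    avg = pct_change(*r["avg"])
    low = pct_change(*r["min"])
    print(f"{game}: avg {avg:+.1f} %, min {low:+.1f} %")
```

The minimum-FPS gains are consistently about twice the average gains, which is what one would expect if the patches mainly remove scheduling stalls rather than raise peak throughput.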

Methodology: each game was launched from a clean user session, a synthetic background load of 12 cgroup‑isolated stress‑ng workers (CPU‑bound) was kept running, and frame times were captured with MangoHud. The minimum FPS metric is the 5th‑percentile value over a 5‑minute window.
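For readers who want to reproduce the metrics from their own captures, here is a minimal sketch of turning a list of frame times into an average FPS and a 5th‑percentile FPS. It assumes the frame times have already been exported as milliseconds per frame; the nearest‑rank percentile method and the function names are choices made for this example, not necessarily what the benchmark used.

```python
import math


def average_fps(frame_times_ms):
    """Frames rendered divided by total elapsed time, in frames per second."""
    return 1000.0 * len(frame_times_ms) / sum(frame_times_ms)


def percentile_fps(frame_times_ms, pct=5):
    """Nearest-rank percentile of per-frame FPS (low values = slow frames)."""
    fps = sorted(1000.0 / t for t in frame_times_ms)
    # Nearest-rank: the smallest value with at least pct % of samples at or below it.
    rank = max(1, math.ceil(pct * len(fps) / 100))
    return fps[rank - 1]


# Synthetic capture: 95 smooth frames at 10 ms and 5 stutter frames at 25 ms.
sample = [10.0] * 95 + [25.0] * 5
print(round(average_fps(sample), 1), percentile_fps(sample))
```

In the synthetic capture the average stays above 90 FPS while the 5th‑percentile value drops to the stutter frames' rate, which is why the low‑percentile metric is the more sensitive indicator of scheduling jitter.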

Power consumption impact

Flattening the run‑queue reduces scheduler wake‑ups for idle cgroups. On the same hardware, the average system power dropped from 78 W (v2) to 73 W (v3) during the Doom benchmark, a 6.4 % reduction. The idle draw fell from 12 W to 10 W, indicating lower kernel activity even when the system is quiescent.

| Scenario | Baseline power (W) | v3 power (W) | Δ Power |
|---|---|---|---|
| Idle | 12.0 | 10.0 | -16.7 % |
| Gaming (Doom) | 78.0 | 73.0 | -6.4 % |
| Gaming (CS:GO) | 71.0 | 66.5 | -6.3 % |

The modest power savings are a side‑effect of fewer context switches and tighter CPU frequency scaling.

Compatibility checklist

| Component | Status with v3 patches |
|---|---|
| Kernel | Compiles cleanly on 6.8‑rc1 through 6.9‑rc2. No regressions reported in the upstream CI suite. |
| Docker / Podman | cgroup v2 containers inherit the cgroup_mode setting from the host. No breakage observed with typical workloads. |
| Systemd | No changes required; systemd continues to set CPUWeight= as before. |
| Realtime tasks | cgroup_mode=dynamic respects rt_runtime_us and rt_period_us limits. |
| AMD GPUs | Works with the current amdgpu driver; no impact on GPU scheduling. |
| Intel GPUs | No known issues; the scheduler changes are CPU‑only. |

Build recommendations for a homelab gaming node

  1. Kernel configuration – Enable CONFIG_CGROUP_SCHED and set CONFIG_CFS_BANDWIDTH=y. Apply the patch series to the source tree (for example with git am), then build with make -j$(nproc).
  2. cgroup_mode selection – For a mixed workload node (containers + bare‑metal games), keep the default dynamic. If you run only containers, flat can shave a few more percent off latency.
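  3. Tuning – Persist the setting by adding kernel.sched_cgroup_mode = dynamic to /etc/sysctl.d/99-cgroup-mode.conf so it is applied at boot; use sysctl -w kernel.sched_cgroup_mode=dynamic to change it on a running system.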
  4. Power management – Pair the kernel with intel_pstate=passive and enable tlp to maximise the observed wattage drop.
  5. Testing – Run perf sched record while gaming to verify that sched_switch events per second drop by ~12 % compared to the baseline.
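As a lighter-weight cross-check for the testing step, the sketch below estimates context switches per second from the cumulative `ctxt` counter in /proc/stat and compares two runs. This is a stand-in for counting sched_switch events with perf, not the same measurement: it is system-wide and the helper names are hypothetical.

```python
import time


def parse_ctxt(stat_text):
    """Return the cumulative context-switch count from /proc/stat contents."""
    for line in stat_text.splitlines():
        if line.startswith("ctxt "):
            return int(line.split()[1])
    raise ValueError("no ctxt line found")


def switches_per_second(first_text, second_text, interval_s):
    """Rate between two /proc/stat snapshots taken interval_s seconds apart."""
    return (parse_ctxt(second_text) - parse_ctxt(first_text)) / interval_s


def relative_drop(baseline_rate, patched_rate):
    """Percentage by which patched_rate is below baseline_rate."""
    return (baseline_rate - patched_rate) / baseline_rate * 100


def sample_rate(interval_s=5.0, path="/proc/stat"):
    """Take two snapshots on a live Linux system and return the rate."""
    with open(path) as f:
        first = f.read()
    time.sleep(interval_s)
    with open(path) as f:
        second = f.read()
    return switches_per_second(first, second, interval_s)
```

Calling `sample_rate()` once while gaming on the baseline kernel and once on the patched kernel, then passing both results to `relative_drop`, gives a figure to compare against the roughly 12 % reduction mentioned above.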

What’s next?

The patches are currently under review on the Linux‑kernel mailing list (LKML). If they land in the 6.10 merge window, downstream distributions (Arch, Debian‑testing, Fedora) will likely ship them within weeks. For users who cannot wait, the patch series is available as a git branch on Peter’s public repository: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/flatten-the-pick-v3.

Adopting the new scheduler now gives a tangible FPS lift on legacy hardware, lowers power draw, and simplifies cgroup weight management for anyone running game servers or containers on the same host. As the Linux community continues to tighten the feedback loop between kernel scheduling and real‑world workloads, “flatten the pick” may become the default path for high‑performance, low‑latency gaming on Linux.
