#Vulnerabilities

PinTheft: How an RDS Zerocopy Double‑Free Became a Page‑Cache Overwrite Primitive

Tech Essays Reporter

The V12 security team disclosed PinTheft, a Linux local privilege escalation that exploits a double‑free in the RDS zerocopy send path. By chaining the bug with io_uring fixed buffers, an attacker can corrupt the page cache of a SUID‑root binary and obtain a root shell. The vulnerable code lives in the RDS kernel module (CONFIG_RDS + CONFIG_RDS_TCP); among the distributions tested, only Arch Linux enables the module by default.

Thesis

The newly disclosed PinTheft exploit demonstrates how a seemingly innocuous double‑free in the Reliable Datagram Sockets (RDS) subsystem can be amplified into a full local privilege escalation (LPE) by leveraging the fixed‑buffer feature of io_uring. The attack chain rewrites the page cache of a SUID‑root binary, allowing arbitrary code execution as root. Understanding the mechanics of the bug, the role of io_uring, and the distribution‑specific exposure is essential for system administrators and kernel developers alike.


Key Arguments

1. The underlying kernel defect: RDS zerocopy double‑free

  • Location in the code – The flaw resides in rds_message_zcopy_from_user() within the RDS module. When a zerocopy send is performed, the function pins each user page individually. If a later page faults, the error‑handling path releases the pages that have already been pinned.
  • Why it is a double‑free – After the error path drops the pinned pages, the RDS message cleanup routine later frees the same pages again because the scatter‑list entries and the entry count remain live even after the zerocopy notifier is cleared. Consequently, each failed zerocopy send can steal one reference from the first page in the scatter‑list.
  • Reference‑count impact – Each stolen reference lowers the page’s reference count (page_count) below the number of users that actually hold the page, so the page can eventually be freed and reused while it is still referenced elsewhere.
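The imbalance described in these three points can be sketched with a toy reference-count model. This is not kernel code: the Page and Message classes, the function names, and the fault_at parameter are all hypothetical stand-ins, and the only behaviour taken from the description above is that the error path drops the pins while the scatter-list entries stay populated for the later cleanup.

```python
# Toy model of the refcount imbalance; this is NOT kernel code.
from dataclasses import dataclass, field


@dataclass
class Page:
    refcount: int = 1  # baseline: one reference held by the owning mapping


@dataclass
class Message:
    sg_pages: list = field(default_factory=list)  # stands in for the scatter-list


def zcopy_from_user(msg, pages, fault_at):
    """Pin pages one by one; on a simulated fault, unpin what was pinned so far."""
    for i, page in enumerate(pages):
        if i == fault_at:
            # Error path: drop the pins taken so far...
            for pinned in msg.sg_pages:
                pinned.refcount -= 1
            # ...but (the modelled bug) sg_pages is left populated.
            return False
        page.refcount += 1
        msg.sg_pages.append(page)
    return True


def message_cleanup(msg):
    """Generic cleanup: release every page still listed in the message."""
    for page in msg.sg_pages:
        page.refcount -= 1
    msg.sg_pages.clear()


def failed_send(pages, fault_at):
    """One send attempt followed by message teardown."""
    msg = Message()
    ok = zcopy_from_user(msg, pages, fault_at)
    message_cleanup(msg)
    return ok


if __name__ == "__main__":
    pages = [Page(), Page()]
    failed_send(pages, fault_at=1)      # the second page "faults"
    print([p.refcount for p in pages])  # [0, 1]
```

In this model a successful send (fault_at=None) leaves every count at its baseline, while a send that faults on the second page leaves the first page one reference short.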

2. Amplification via io_uring fixed buffers

  • Fixed‑buffer semantics – io_uring can register an anonymous page as a fixed buffer (IORING_REGISTER_BUFFERS). The kernel then treats the buffer as pinned with a FOLL_PIN bias of 1024 references (GUP_PIN_COUNTING_BIAS), effectively inflating its reference count.
  • Stealing the inflated references – By repeatedly triggering the RDS double‑free, the attacker drains the artificial 1024 references, causing the page to become eligible for reclamation despite still being registered with io_uring.
  • Re‑allocation as page cache – Once the page is freed, the kernel may reuse it for the page cache of a file. The PoC targets a SUID‑root binary, causing the reclaimed page to back that binary’s executable image.
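The arithmetic behind the drain step is simple enough to write down. The sketch below assumes, as stated above, that each failed send steals exactly one reference and that registration adds a bias of 1024; the baseline_refs parameter and the function name are hypothetical, and the real exploit may batch or order these steps differently.

```python
# Toy arithmetic for the drain step; only the 1024 bias comes from the article.
PIN_BIAS = 1024  # FOLL_PIN bias added when the page is registered as a fixed buffer


def drain(baseline_refs, steals_per_failed_send=1):
    """Return (failed sends needed to cancel the bias, refcount afterwards)."""
    refcount = baseline_refs + PIN_BIAS  # count right after registration
    sends = 0
    while refcount > baseline_refs:
        refcount -= steals_per_failed_send
        sends += 1
    return sends, refcount


if __name__ == "__main__":
    print(drain(baseline_refs=1))  # (1024, 1)
```

Once the bias has been cancelled, the count no longer reflects the io_uring registration at all, so dropping the remaining baseline references frees the page even though the fixed-buffer table still points at it.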

3. Overwriting the page cache with a malicious ELF payload

  • Stale pointer exploitation – The io_uring registration still holds a pointer to the original page. After the page has been repurposed, the attacker writes through the stale pointer, overwriting the page‑cache contents of the SUID binary with a tiny ELF payload.
  • Execution flow – When the binary is next executed, the kernel loads the corrupted page cache, resulting in immediate execution of the attacker‑controlled payload, which spawns a root shell.
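The stale-pointer step is a classic use-after-free pattern, which a small model can illustrate. Everything here is hypothetical: FrameAllocator stands in for the kernel's page allocator, the `registered` variable for the io_uring fixed-buffer table, and the eight-byte payload is a placeholder rather than the PoC's actual ELF.

```python
# Toy model of a reused page frame; this is NOT kernel code.
PAGE_SIZE = 4096


class FrameAllocator:
    """Hands out bytearrays and reuses freed ones, like a page free list."""

    def __init__(self):
        self.free_list = []

    def alloc(self):
        return self.free_list.pop() if self.free_list else bytearray(PAGE_SIZE)

    def free(self, frame):
        self.free_list.append(frame)


def demo():
    allocator = FrameAllocator()

    # The attacker's buffer; `registered` keeps pointing at the frame,
    # as the fixed-buffer registration does in the description above.
    frame = allocator.alloc()
    registered = frame

    # The refcount bug lets the frame be freed while still registered.
    allocator.free(frame)

    # The page cache allocates a frame for a file and gets the same one back.
    cached = allocator.alloc()
    cached[:4] = b"\x7fELF"          # stand-in for the original file contents
    original = bytes(cached[:8])

    # A write through the stale registration lands in the cached file page.
    payload = b"\x7fELFEVIL"         # hypothetical placeholder payload
    registered[: len(payload)] = payload

    return cached is registered, original, bytes(cached[:8])


if __name__ == "__main__":
    print(demo())
```

The point of the model is that no write permission on the file is ever checked: the write goes to memory the attacker legitimately registered, and only the allocator's reuse makes it file content.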

4. Real‑world exposure and distribution impact

  • Kernel configuration requirement – The exploit relies on the RDS kernel module being compiled (CONFIG_RDS and CONFIG_RDS_TCP). Most mainstream distributions ship the module as a loadable kernel module, but it is enabled by default only on Arch Linux among the tested distros.
  • Patch status – A patch addressing the double‑free has been merged upstream. The V12 security team released a proof‑of‑concept (PoC) alongside the patch to aid verification.
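To see whether a given kernel was built with the options named above, one can inspect its build configuration. The following is a minimal sketch: the two file locations (/proc/config.gz and /boot/config-<release>) are common conventions rather than guarantees, and the helper names are hypothetical.

```python
# Best-effort check of CONFIG_RDS / CONFIG_RDS_TCP in a kernel config file.
import gzip
import platform


def rds_config_state(config_text):
    """Map CONFIG_RDS and CONFIG_RDS_TCP to 'y', 'm', or 'n' (not set)."""
    state = {"CONFIG_RDS": "n", "CONFIG_RDS_TCP": "n"}
    for line in config_text.splitlines():
        line = line.strip()
        if line.startswith("#") or "=" not in line:
            continue  # comments include "# CONFIG_X is not set"
        key, _, value = line.partition("=")
        if key in state:
            state[key] = value
    return state


def read_kernel_config():
    """Return the running kernel's config text, or '' if it cannot be found."""
    # Assumed locations; distributions differ in which one they provide.
    candidates = ["/proc/config.gz", f"/boot/config-{platform.release()}"]
    for path in candidates:
        try:
            if path.endswith(".gz"):
                with gzip.open(path, "rt") as fh:
                    return fh.read()
            with open(path) as fh:
                return fh.read()
        except OSError:
            continue
    return ""


if __name__ == "__main__":
    print(rds_config_state(read_kernel_config()))
```

A value of "m" means the code is available as a loadable module, which is the case the exposure discussion above is concerned with; it does not by itself say whether the module is currently loaded.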

Implications

  • Immediate mitigation – Systems that do not need RDS should unload the module (modprobe -r rds, after removing dependents such as rds_tcp) and keep it from being loaded again; systems that do need it should upgrade to a kernel version containing the patch (kernel 6.9.4 or later). Administrators of Arch Linux should audit /etc/modprobe.d/ to ensure the module is not loaded unintentionally.
  • Broader security lessons – The chain illustrates how reference‑count bugs can be weaponized when combined with unrelated kernel features that manipulate reference counts (io_uring, splice, etc.). Auditors must therefore consider cross‑subsystem interactions, not just isolated code paths.
  • Future kernel hardening – The incident may prompt kernel developers to tighten validation around page‑pinning APIs and to add sanity checks that prevent a page’s reference count from dropping below a safe threshold after error handling.
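For the audit suggested under immediate mitigation, a short script can report whether RDS modules are loaded and whether a modprobe.d file overrides their loading. This is a sketch under assumptions: it reads /proc/modules, and it recognises only the common `install <module> /bin/false` (or /bin/true) override, which is a general modprobe.d convention and not something the PinTheft advisory prescribes. The function names are hypothetical.

```python
# Audit helpers for RDS module exposure; a sketch, not a complete policy check.


def loaded_rds_modules(proc_modules_text):
    """Return loaded module names that are 'rds' or start with 'rds_'."""
    names = [ln.split()[0] for ln in proc_modules_text.splitlines() if ln.strip()]
    return [n for n in names if n == "rds" or n.startswith("rds_")]


def is_blocked(modprobe_conf_text, module="rds"):
    """True if the text contains 'install <module> /bin/false' or '/bin/true'."""
    for line in modprobe_conf_text.splitlines():
        fields = line.split("#", 1)[0].split()  # drop trailing comments
        if len(fields) >= 3 and fields[0] == "install" and fields[1] == module:
            if fields[2] in ("/bin/false", "/bin/true"):
                return True
    return False


if __name__ == "__main__":
    try:
        with open("/proc/modules") as fh:
            print("loaded:", loaded_rds_modules(fh.read()))
    except OSError:
        print("could not read /proc/modules")
```

Feeding is_blocked the concatenated contents of the files under /etc/modprobe.d/ gives a quick yes/no answer, though it deliberately ignores other directives such as blacklist, whose effect is narrower.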

Counter‑Perspectives

  • Limited attack surface – Critics argue that the requirement for both the RDS module and the ability to register fixed buffers via io_uring narrows the practical exploitability to a small subset of systems, primarily development machines or specialized servers.
  • Performance trade‑offs – Disabling RDS may impact workloads that rely on high‑throughput, low‑latency messaging (e.g., certain HPC or storage clusters). In such environments, the benefit of retaining RDS must be weighed against the residual risk, especially if the kernel version is recent and patched.
  • Alternative vectors – Some security researchers suggest that similar reference‑count manipulation could be achieved without io_uring, using other mechanisms such as userfaultfd or splice. While the current PoC focuses on io_uring, the underlying principle may inspire new exploit attempts.

Conclusion

PinTheft is a compelling case study in how a modest double‑free within the RDS zerocopy path can be elevated to a full LPE by exploiting the high‑reference‑count semantics of io_uring fixed buffers. The exploit underscores the necessity of holistic kernel auditing, the importance of timely patch deployment, and the need for administrators to scrutinize optional kernel modules that may not be essential for their workloads. For further technical details, the original PoC and patch can be examined in the V12 security repository: PinTheft PoC on GitHub.
