Researchers have uncovered a critical vulnerability in Linux's io_uring ZCRX subsystem that allows privilege escalation from regular user to root under specific conditions.
A new security vulnerability in the Linux kernel's io_uring ZCRX (Zero-Copy Receive) subsystem has been disclosed, affecting versions 6.15 through 6.19. The vulnerability, which allows local privilege escalation (LPE) from a regular user to root, stems from an out-of-bounds heap write in the zero-copy receive implementation.
The ZCRX subsystem, introduced in Linux 6.15, enables high-performance network packet reception directly into user memory without an intermediate kernel copy. It manages a pool of network I/O vectors (niovs) using a freelist stack and a counter tracking available slots. The vulnerability occurs because nothing enforces an upper bound on this counter, so it can grow past the number of slots actually allocated.
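To make the missing check concrete, the following is a minimal user-space sketch of a freelist stack with a counter. It is not the kernel's code: the struct `niov_freelist`, the `MAX_NIOVS` limit, and all function names are hypothetical and chosen only for illustration. The unchecked push shows the general pattern described above, and the checked push shows the kind of bounds check that would prevent the counter from running past the array.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MAX_NIOVS 64u /* hypothetical fixed size for this sketch */

/* Hypothetical stand-in for the freelist state, not the kernel's struct. */
struct niov_freelist {
    uint32_t slots[MAX_NIOVS]; /* stack of free niov indices */
    uint32_t capacity;         /* slots in use by this pool, <= MAX_NIOVS */
    uint32_t free_count;       /* indices currently on the stack */
};

/* Unchecked push: the bug pattern. Once free_count == capacity, the next
 * call stores a 4-byte index past the end of the pool's slots. */
static inline void freelist_push_unchecked(struct niov_freelist *fl,
                                           uint32_t idx)
{
    fl->slots[fl->free_count++] = idx;
}

/* Checked push: refuse the write when the stack is already full. */
static inline bool freelist_push_checked(struct niov_freelist *fl,
                                         uint32_t idx)
{
    if (fl->free_count >= fl->capacity)
        return false;
    fl->slots[fl->free_count++] = idx;
    return true;
}

/* Hands every index back twice, as two cleanup passes over the same pool
 * would, and returns how many pushes the bounds check rejected. */
static inline uint32_t simulate_overlapping_teardown(uint32_t capacity)
{
    struct niov_freelist fl = { .capacity = capacity, .free_count = 0 };
    uint32_t rejected = 0;

    assert(capacity <= MAX_NIOVS);
    for (int pass = 0; pass < 2; pass++)
        for (uint32_t i = 0; i < capacity; i++)
            if (!freelist_push_checked(&fl, i))
                rejected++;

    /* With the check in place the counter never exceeds the capacity. */
    assert(fl.free_count <= fl.capacity);
    return rejected;
}
```

In this sketch every push from the second pass is rejected, so the counter stops at the capacity. With `freelist_push_unchecked` the same sequence would write past the pool's slots.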
"The issue arises from two separate kernel teardown paths both returning niovs to the same freelist," explains the researcher who discovered the vulnerability. "When these paths overlap, the free_count exceeds the allocated array length, resulting in a 4-byte out-of-bounds write into adjacent slab memory."
The attack chain begins with a controlled 4-byte write of a small integer (the niov index) into memory adjacent to the freelist. By carefully choosing the memory area size during registration, an attacker can target specific kernel objects. The exploit involves:
- Corrupting a msg_msg object's list pointer
- Using heap grooming to place a fake msg_msg at a controlled address
- Leaking kernel addresses to bypass KASLR
- Overwriting modprobe_path to execute arbitrary code as root
The vulnerability requires specific conditions to be exploitable:
- Kernel versions 6.15 through 6.19 (without the fix)
- CONFIG_IO_URING_ZCRX=y enabled in the kernel configuration
- Real ZCRX-capable NIC hardware (Mellanox ConnectX-6+, Intel E800 series, Netronome NFP, etc.)
- CAP_NET_ADMIN privileges
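As a rough aid for triage, two of the conditions above (kernel version and CAP_NET_ADMIN) can be checked from user space. The sketch below is an assumption-laden illustration, not a detection tool: the helper names are hypothetical, the version test only compares the major and minor numbers of a `uname -r` style string and cannot tell whether a fix has been backported, and the capability test assumes the `CapEff:` line format found in `/proc/self/status`.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* CAP_NET_ADMIN is capability number 12 in <linux/capability.h>. */
#define CAP_NET_ADMIN_BIT 12

/* True if a release string such as "6.17.3-generic" is in 6.15..6.19.
 * This says nothing about whether the fix has been applied. */
static inline bool release_in_affected_range(const char *release)
{
    unsigned major = 0, minor = 0;

    if (release == NULL || sscanf(release, "%u.%u", &major, &minor) != 2)
        return false;
    return major == 6 && minor >= 15 && minor <= 19;
}

/* Parses a "CapEff:\t<hex mask>" line into *out. Returns false when the
 * line is some other field or cannot be parsed. */
static inline bool parse_cap_eff_line(const char *line, uint64_t *out)
{
    assert(out != NULL);
    return line != NULL && sscanf(line, "CapEff: %" SCNx64, out) == 1;
}

/* True if the effective capability mask includes CAP_NET_ADMIN. */
static inline bool mask_has_net_admin(uint64_t cap_eff)
{
    return ((cap_eff >> CAP_NET_ADMIN_BIT) & 1u) != 0;
}
```

A caller could feed `release_in_affected_range` the `release` field returned by `uname(2)` and pass each line of `/proc/self/status` to `parse_cap_eff_line`. The kernel configuration option and the NIC hardware still need to be checked separately.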
"This vulnerability represents a significant risk for systems using ZCRX-capable hardware with the necessary privileges," security analysts note. "Container environments, particularly Kubernetes networking sidecars and Docker configurations with NET_ADMIN capabilities, are especially vulnerable."
The vulnerability was fixed in commit 770594e, which adds a bounds check before writing to the freelist. However, at the time of writing, the fix had not yet been incorporated into any stable kernel branch.
"The discovery highlights the increasing complexity of kernel subsystems and the challenges in ensuring their security," observes one kernel security researcher. "As performance optimizations like zero-copy become more prevalent, we must carefully audit the edge cases and interactions between different kernel components."
Organizations running affected kernel versions on compatible hardware should apply the fix as soon as it becomes available in stable branches. Until then, restricting CAP_NET_ADMIN privileges and disabling ZCRX functionality where possible can mitigate the risk.
For more technical details about the vulnerability and proof-of-concept code, refer to the original disclosure at ze3ter's blog.