Disabling setuid: How Linux's no_new_privs Feature Reshapes Privilege Escalation Defenses
For decades, setuid binaries have been both essential functionality and a persistent security liability on Linux systems. These executables (programs such as passwd, sudo, and su, which run with the privileges of the file's owner rather than those of the caller) have repeatedly served as vectors for privilege escalation exploits. A different approach is now taking shape: using the kernel's no_new_privs feature to stop setuid binaries from granting privileges at all.
The Dual Problem Space
Two seemingly unrelated issues converge here:
1. UID/GID Drift: Image-based distributions (e.g., container hosts) require strict /usr directory ownership by root:root—complicated by scattered setuid binaries owned by other users.
2. Setuid Exploit Surface: Vulnerabilities in privileged binaries enable attackers to hijack elevated rights—a risk amplified by complex legacy tools like sudo.
Disabling setuid elegantly resolves both. By deprecating these binaries, distributions eliminate ownership exceptions and shrink the attack surface.
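To make the ownership problem concrete, the following is a minimal sketch of how one might list files that break the root:root expectation. The function name list_foreign_owned and its arguments are hypothetical, chosen for illustration; they do not come from any distribution's tooling.

```shell
#!/bin/sh
# list_foreign_owned DIR [UID] [GID]
# Hypothetical helper: print every path under DIR whose owner or group
# differs from UID:GID (default 0:0, i.e. root:root).
list_foreign_owned() {
    # -xdev keeps find on the same filesystem as DIR.
    find "${1:-/usr}" -xdev \
        \( ! -user "${2:-0}" -o ! -group "${3:-0}" \) -print 2>/dev/null
}
```

Called as `list_foreign_owned /usr`, an empty result would suggest that no ownership exceptions remain under /usr.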
How no_new_privs Works
When enabled (in systemd via NoNewPrivileges=yes, or directly with prctl(PR_SET_NO_NEW_PRIVS)), this kernel flag is inherited by all child processes, cannot be unset, and prevents execve() from granting new privileges:
- Ignores setuid/setgid bits
- Disregards file capabilities
- Blocks LSM profile relaxation post-exec
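Critically, it restricts only execve(): a process that already holds the necessary privilege can still change credentials via syscalls like setuid(). Privileged services can therefore still act on behalf of users, while unprivileged processes lose the exec-based path to elevation.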
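Whether the flag is active for a process can be read from the NoNewPrivs field in /proc/<pid>/status (available since Linux 4.10). The sketch below parses that field; the function name nnp_flag is hypothetical and only serves as an illustration.

```shell
#!/bin/sh
# nnp_flag [STATUS_FILE]
# Hypothetical helper: print the NoNewPrivs value (0 or 1) found in a
# /proc/<pid>/status-style file. Defaults to /proc/self/status, which the
# awk child resolves to its own entry; the flag is inherited, so the value
# matches the calling shell. Returns non-zero if the field is absent.
nnp_flag() {
    awk '$1 == "NoNewPrivs:" { print $2; found = 1 }
         END { exit !found }' "${1:-/proc/self/status}"
}
```

On a Linux host, running `nnp_flag` inside a shell started with `setpriv --no-new-privs sh` should report 1, while an ordinary login shell would typically report 0.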
Architectural Overhaul: Replacing Setuid Binaries
Instead of monolithic privileged binaries, functionality shifts to tightly scoped IPC services. Systemd-managed, socket-activated services now handle tasks like password changes or privilege delegation:
# Example systemd unit hardening for pwaccessd (shadow data service)
[Service]
RestrictAddressFamilies=AF_UNIX
RestrictSUIDSGID=true
ProtectSystem=strict
Key replacements:
- Authentication: pam_unix_ng (replacing pam_unix) + pwaccessd service
- Account Management: account-utils tools (chage, passwd) communicating via varlink
- Command Execution: run0 (systemd's sudo/su alternative) with transient units
Bridging Compatibility Gaps
Legacy script dependency on sudo/su is addressed via wrapper scripts that translate commands to run0 calls. For example:
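#!/bin/bash
# run0-sudo wrapper sketch: map common sudo invocations onto run0
if [[ $1 == "-i" ]]; then
    # Assumption: run0 without a command starts the target user's shell
    exec run0
else
    # Pass everything else through unchanged
    exec run0 "$@"
fi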
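The wrapper idea above can be pushed a little further by translating individual sudo options rather than passing them through. The following is a dry-run sketch, not the wrapper shipped by any distribution: the function name sudo_to_run0 is hypothetical, it handles only `-u USER` and `-i`, and it prints the resulting command line instead of executing it. It assumes that run0 accepts `--user=` and that run0 without a command starts the target user's shell.

```shell
#!/bin/sh
# sudo_to_run0 ARGS...
# Hypothetical dry-run helper: print the run0 command line that a small
# subset of sudo-style arguments would map to. Nothing is executed.
sudo_to_run0() {
    local cmd="run0"
    while [ $# -gt 0 ]; do
        case "$1" in
            -u)
                # sudo -u USER -> run0 --user=USER
                [ $# -ge 2 ] || { echo "missing user after -u" >&2; return 2; }
                cmd="$cmd --user=$2"; shift 2 ;;
            -i)
                # Assumption: run0 with no command opens a shell, so the
                # flag is dropped. A trailing command, if present, is run
                # directly rather than via a login shell (simplification).
                shift ;;
            --) shift; break ;;
            -*) echo "unsupported sudo option: $1" >&2; return 2 ;;
            *)  break ;;
        esac
    done
    # Joined with spaces for display only; this loses shell quoting.
    [ $# -eq 0 ] || cmd="$cmd $*"
    printf '%s\n' "$cmd"
}
```

Rejecting unknown options instead of silently forwarding them is a deliberate choice here: a script that relies on a sudo flag with no run0 equivalent fails loudly rather than running with different semantics.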
Polkit rules simulate classic sudo policies, though limitations remain around command-line argument validation (see upstream issues).
Current State and Open Challenges
openSUSE Tumbleweed/MicroOS offer working implementations via the disable-setuid package. However, critical hurdles persist:
- Containers: no_new_privs breaks setuid binaries inside containers. A BPF LSM-based solution is proposed for dynamic NNP toggling.
- SELinux/Policy Gaps: Policies for new services (pwaccessd, newidmapd) require refinement.
- Edge Binaries: Exceptions like newgrp and gpasswd need alternatives.
The Binary Inventory
Thorsten Kukuk's analysis catalogs setuid/file-capability binaries in openSUSE:
| Binary | Package | Status |
|---|---|---|
| sudo | sudo | ❌ (use run0) |
| polkit-agent-helper-1 | polkit | ✅ (patched) |
| unix_chkpwd | pam | ❌ (use pam_unix_ng) |
| ping | iputils | ❌ (capabilities) |
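A comparable inventory can be approximated on any system with find. The sketch below is not the method used for the analysis above; the function name list_setid_files is hypothetical, and it covers only the setuid and setgid bits, not file capabilities.

```shell
#!/bin/sh
# list_setid_files [DIR]
# Hypothetical helper: print regular files under DIR (default /usr) that
# carry the setuid (4000) or setgid (2000) bit.
list_setid_files() {
    find "${1:-/usr}" -xdev -type f \
        \( -perm -4000 -o -perm -2000 \) -print 2>/dev/null
}
```

File capabilities, such as those the table lists for ping, are not visible in the mode bits; where libcap's tools are installed, `getcap -r /usr` can list them separately.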
Path Forward
While containerized setuid remains unresolved, the elimination of host-level setuid binaries marks significant progress. By leveraging kernel primitives like no_new_privs and rearchitecting privilege delegation through minimal services, Linux distributions can finally mitigate a half-century-old attack vector—without sacrificing functionality.
Source: Thorsten Kukuk, Enabling no_new_privs/NoNewPrivs, disabling setuid on Linux