What starts as a mundane task, configuring a RAID array, can spiral into a years-long odyssey through Linux kernel internals. For one developer, the journey began with a power outage and the unsettling realization that Linux's ubiquitous software RAID (the md driver, managed with mdadm) harbors a silent data-corruption flaw known as the RAID5 write hole. The vulnerability isn't confined to RAID5: as the developer discovered by reading the man pages closely, RAID6 is equally susceptible, upending common assumptions about storage safety.

The Write Hole: A Ticking Time Bomb

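At its core, the RAID5 write hole stems from non-atomic writes. Updating a stripe means writing data and parity to different disks, and a power failure between those writes can leave the parity inconsistent with the data. Nothing flags the mismatch, so it surfaces later as corrupted data, typically when the parity is used to rebuild a failed disk. As the developer notes: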

"RAID could notice that it's corrupt by checking against the parity, but it doesn’t, and there’s no guarantee the parity is correct anyway."

Scrubbing the array offers no fix: it blindly recomputes parity from whatever data is on disk and overwrites the old parity, cementing any corruption. RAID6’s dual parity should theoretically enable error correction via a "three-way vote," but kernel documentation confirms it shares the same flaw.
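The two failure modes above can be sketched with plain XOR parity. This is a toy model of a single stripe on a hypothetical 3-disk array, not the md driver's actual code; the block contents and the helper name xor_blocks are made up for illustration.

```python
from functools import reduce


def xor_blocks(*blocks: bytes) -> bytes:
    """XOR equal-length blocks byte by byte (RAID5-style parity)."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Case 1: the write hole.
# A consistent stripe: two data blocks plus their parity.
d0, d1 = b"AAAA", b"BBBB"
parity = xor_blocks(d0, d1)

# Torn write: the new d0 reaches its disk, then power is lost
# before the matching parity update is written.
d0 = b"ZZZZ"

# Later the disk holding d1 fails and d1 is rebuilt from the survivors.
rebuilt_d1 = xor_blocks(d0, parity)
print(rebuilt_d1 == b"BBBB")  # False: d1 was never written to, yet it comes back wrong

# Case 2: a repair scrub after silent corruption.
good_d0, good_d1 = b"AAAA", b"BBBB"
stripe_parity = xor_blocks(good_d0, good_d1)
bad_d1 = b"BB?B"  # one byte of d1 silently changes on disk

# Before the scrub, the old parity still encodes the good d1.
before_scrub = xor_blocks(good_d0, stripe_parity)

# The scrub sees a mismatch and recomputes parity from the data as found.
stripe_parity = xor_blocks(good_d0, bad_d1)
after_scrub = xor_blocks(good_d0, stripe_parity)

print(before_scrub == good_d1, after_scrub == bad_d1)  # True True
```

In the second case the array reports itself consistent after the scrub, but the only copy of the information needed to recover the original block has been overwritten.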

Chasing Solutions: dm-integrity and Performance Pitfalls

Seeking resilience, the developer explored dm-integrity, which adds per-sector checksums to detect corruption. However, its default journaling slashed performance by over 50%, because each write lands in the journal before being written again to its final location. Disabling journaling (--integrity-no-journal) was possible but unsupported by LVM, requiring manual setup via dmsetup and kernel-level hacks. Worse, dm-integrity’s sparse documentation demanded code archaeology:

  • Metadata reads always precede data writes, causing seeks that hamper throughput.
  • Optimal configuration (e.g., aligning 4MiB chunks with metadata sectors) is nontrivial and tooling-unfriendly.
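To make the trade-off concrete, here is a minimal sketch of per-sector checksumming with an optional journal. It is a toy model under stated assumptions, not dm-integrity's on-disk format: the class ToyIntegrityDevice is hypothetical, zlib.crc32 stands in for whatever checksum the real device is configured with, and the write counter ignores metadata sectors entirely.

```python
import zlib

SECTOR = 512  # bytes per sector in this toy model


class ToyIntegrityDevice:
    """Hypothetical block device that stores one checksum per sector."""

    def __init__(self, sectors: int, journal: bool = True):
        self.data = [bytes(SECTOR)] * sectors
        self.tags = [zlib.crc32(bytes(SECTOR))] * sectors
        self.journal = journal
        self.sector_writes = 0  # counts data-sector writes only

    def write(self, index: int, payload: bytes) -> None:
        if len(payload) != SECTOR:
            raise ValueError("payload must be exactly one sector")
        if self.journal:
            # Journal mode: the sector is written to the journal first...
            self.sector_writes += 1
        # ...and then (or only, without a journal) to its final location.
        self.data[index] = payload
        self.tags[index] = zlib.crc32(payload)
        self.sector_writes += 1

    def read(self, index: int) -> bytes:
        payload = self.data[index]
        if zlib.crc32(payload) != self.tags[index]:
            raise OSError(f"checksum mismatch in sector {index}")
        return payload


journaled = ToyIntegrityDevice(sectors=8, journal=True)
direct = ToyIntegrityDevice(sectors=8, journal=False)
for dev in (journaled, direct):
    for i in range(8):
        dev.write(i, bytes([i]) * SECTOR)
print(journaled.sector_writes, direct.sector_writes)  # 16 8

# Simulate silent corruption by changing a stored sector behind the device's back.
direct.data[3] = b"\xff" + direct.data[3][1:]
try:
    direct.read(3)
except OSError as err:
    print("detected:", err)
```

The doubled write count in journal mode mirrors the redundant writes described above, and the failed read shows the point of the checksums: corruption becomes an I/O error instead of silently wrong data, which a RAID layer on top can then treat like any other unreadable sector.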

Contributing Back: Patches and Systemic Gaps

This investigation spurred tangible open-source impact. The developer:
1. Patched kernel documentation to clarify dm-integrity behavior.
2. Enhanced systemd-integritysetup for Ubuntu, enabling journalless operation.
3. Highlighted tooling flaws, like md(4)’s obscurity and integritytab’s reliance on /dev/sd* paths.

Yet the compromises proved exhausting. After two years of wrestling with complexity, the developer abandoned the effort for ZFS + SnapRAID + mergerfs, a pragmatic if imperfect alternative.

Why This Matters for Engineers

This saga underscores critical lessons:
- RAID is not backup: Write holes and silent corruption risks persist, demanding checksummed filesystems like ZFS or Btrfs for critical data.
- Documentation saves lives: Reading man pages (mdadm(8), md(4)) revealed truths absent from tutorials.
- Performance vs. integrity: dm-integrity’s trade-offs reflect a broader tension; stronger integrity guarantees often cost speed.

In an era of sprawling cloud storage, the episode is a stark reminder: even decades-old solutions harbor hidden perils, and sometimes, the path to resilience leads away from convention.

Source: russ.har.mn