Dell and Kioxia push 10 PB storage into a 2‑RU chassis – what it means for AI and data‑lake builders
#Hardware


Trends Reporter

Dell’s new PowerEdge R7725xd packs 40 × 245 TB QLC SSDs from Kioxia into a 2‑RU server, delivering almost 10 PB of flash. The move showcases how high‑capacity NVMe is becoming a practical option for AI workloads, but it also raises questions about cost, endurance and the future role of HDDs.

A new density milestone for flash storage

Dell’s latest PowerEdge R7725xd server crams 40 × Kioxia LC9 E3.L QLC SSDs into a 2‑RU enclosure, reaching a nominal 9.8 PB of raw capacity. The system runs on an AMD EPYC 9005 platform and can be equipped with up to five 400 Gbps Ethernet adapters, meaning that the massive data set can be moved out of the box at line‑rate speeds.
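A quick back-of-envelope calculation (illustrative only, using the figures quoted above) shows what those numbers imply: roughly 9.8 PB of raw flash, and about eleven hours to drain the entire box at the aggregate 2 Tbit/s line rate of five 400 Gbps adapters.

```python
# Back-of-envelope check of the headline numbers (illustrative only).

DRIVES = 40
TB_PER_DRIVE = 245.76          # Kioxia LC9 capacity in terabytes (10**12 bytes)
NICS = 5
GBPS_PER_NIC = 400             # Ethernet line rate per adapter, in gigabits/s

raw_tb = DRIVES * TB_PER_DRIVE           # 9830.4 TB, i.e. ~9.8 PB raw
raw_bits = raw_tb * 1e12 * 8             # total capacity in bits

aggregate_bps = NICS * GBPS_PER_NIC * 1e9    # 2 Tbit/s aggregate
hours = raw_bits / aggregate_bps / 3600      # time to move everything once

print(f"raw capacity: {raw_tb / 1000:.1f} PB")
print(f"full drain at line rate: {hours:.1f} hours")
```

In practice, protocol overhead, redundancy and read parallelism across 40 drives will change the real numbers, but the calculation shows the network and storage sides of the box are roughly in balance.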


The LC9 drive, part of Kioxia’s high‑capacity QLC family, offers 245.76 TB per unit in the E3.L form factor. By stacking 40 of them, Dell achieves a storage density that would have required several rack units a few years ago. The company estimates that a full rack populated with twenty of these servers could hold close to 200 PB of data.

Why the community is paying attention

  • AI training pipelines – Large language models and computer‑vision systems routinely consume terabytes of training data. Having that data on‑premises in a single rack reduces the latency of shuffling between storage and compute nodes.
  • Data‑lake economics – Claimed TCO improvements stem from eliminating multiple tiers (HDD + SSD + cloud) and from the power‑efficiency of QLC flash compared with older NAND generations.
  • Network readiness – Five 400 Gbps NICs align with the bandwidth needs of modern distributed training frameworks such as PyTorch Distributed or TensorFlow Multi‑Worker Mirrored Strategy.

Industry analysts have pointed to this as a sign that flash‑only architectures are moving from niche (high‑performance caching) to mainstream bulk storage. The announcement also puts pressure on competitors: Micron’s 6600 ION, SanDisk’s UltraQLC SN670 and SK Hynix’s AIN D are all targeting the 200‑plus‑TB per‑drive segment.

Counter‑points and lingering doubts

While the headline numbers are impressive, several practical concerns temper the enthusiasm:

  1. Endurance of QLC NAND – QLC cells wear out faster than MLC or TLC. Dell’s configuration relies on over‑provisioning and aggressive wear‑leveling, but workloads that involve frequent writes (e.g., continuous ingestion pipelines) may still see a reduced drive lifespan. Customers will need to monitor write amplification closely.
  2. Cost per gigabyte – Even with volume pricing, a 245 TB QLC SSD still costs several thousand dollars. For organizations that can tolerate slightly higher latency, a hybrid approach that mixes lower‑cost HDDs for cold data could remain more economical.
  3. Supply‑chain volatility – The QLC market has seen periodic shortages. If demand for AI‑grade storage outpaces production, lead times could stretch, making the Dell‑Kioxia solution a premium offering rather than a commodity.
  4. Alternative architectures – Some cloud providers are experimenting with disaggregated storage fabrics that separate compute from flash pools. In such models, the need for a monolithic 10 PB server diminishes, as data can be accessed over high‑speed fabrics such as NVMe‑over‑Fabrics or CXL (which absorbed the earlier Gen‑Z effort).
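The write-amplification monitoring mentioned in point 1 boils down to comparing two counters the drive exposes. A minimal sketch, assuming you have already extracted the raw byte counts from the vendor's SMART / NVMe log pages (attribute names vary by vendor and are not standardized):

```python
def write_amplification_factor(nand_bytes_written: int,
                               host_bytes_written: int) -> float:
    """WAF = bytes the NAND actually programs / bytes the host asked to write.

    A WAF near 1.0 means the drive is barely rewriting data internally;
    sustained high values on QLC media shorten drive lifespan.
    Inputs are assumed to be raw byte counters already read from the
    drive's vendor-specific SMART / NVMe log pages.
    """
    if host_bytes_written <= 0:
        raise ValueError("host write counter must be positive")
    return nand_bytes_written / host_bytes_written

# Hypothetical counters sampled after a week of continuous ingestion:
waf = write_amplification_factor(nand_bytes_written=3_600 * 10**12,
                                 host_bytes_written=1_200 * 10**12)
print(f"WAF = {waf:.1f}")  # every host byte cost 3 NAND program bytes
```

Tracking this ratio over time (rather than a single sample) is what reveals whether an ingestion-heavy workload is eating into the drives' rated endurance.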

What this means for developers and architects

For teams building large‑scale AI pipelines, the Dell‑Kioxia server offers a concrete example of how to collapse storage hierarchy and simplify data movement. When evaluating whether to adopt a similar design, consider:

  • Workload profile – If the primary pattern is read‑heavy inference or batch training with relatively static datasets, QLC endurance is less of a concern.
  • Capacity planning – A single 2‑RU box can hold almost 10 PB, but planning for growth beyond that may still require multiple racks or a tiered strategy.
  • Software stack – Ensure that the chosen orchestration tools (Kubernetes, Slurm, etc.) can expose the NVMe devices directly to containers or VMs, avoiding unnecessary abstraction layers that could erode performance.
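The capacity-planning point above is easy to quantify. A minimal sketch, assuming a simple k+m erasure-coding scheme (the 8+2 default here is illustrative, not something Dell specifies):

```python
import math

def servers_needed(target_usable_pb: float,
                   raw_pb_per_server: float = 9.8,
                   data_shards: int = 8,
                   parity_shards: int = 2) -> int:
    """Servers required to hit a usable-capacity target under erasure coding.

    Usable capacity per server = raw * k / (k + m), where k is the number
    of data shards and m the number of parity shards. Defaults assume one
    9.8 PB box and an illustrative 8+2 scheme (80% storage efficiency).
    """
    efficiency = data_shards / (data_shards + parity_shards)
    usable_per_server = raw_pb_per_server * efficiency
    return math.ceil(target_usable_pb / usable_per_server)

print(servers_needed(100))  # boxes needed for 100 PB usable under 8+2 EC
```

Note that even the raw 9.8 PB of a single box yields two servers once redundancy overhead is applied, which is exactly why "almost 10 PB in 2 RU" and "10 PB usable" are different planning targets.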

Looking ahead

Kioxia’s roadmap hints at 1 PB‑class drives in the next few years, while Samsung is reportedly developing a near‑line SSD that could rival HDDs on a per‑drive basis. If those products materialize, the density curve will keep rising, but the trade‑offs around cost, endurance and power will remain central to the conversation.

For now, the Dell PowerEdge R7725xd demonstrates that flash‑only storage at the petabyte scale is no longer a theoretical exercise. Whether it becomes the default choice for AI infrastructure will depend on how quickly the ecosystem can address the durability and pricing challenges that accompany QLC technology.


Read the official Dell announcement and explore Kioxia’s LC9 specifications on their respective product pages.
