Nvidia’s AI Rack Costs Surge: Memory Now 25% of $7.8 M Price Tag

Morgan Stanley’s teardown shows Nvidia’s next‑gen Vera Rubin‑based VR200 NVL72 rack will cost hyperscalers about $7.8 million, with DRAM and NAND memory alone accounting for roughly $2 million – a 435 % increase over the prior generation. The analysis breaks down the bill‑of‑materials, explains why LPDDR5X and 3D NAND prices have spiked, and assesses the impact on cloud providers’ capex planning.

Announcement

Nvidia’s upcoming VR200 NVL72 rack, built around the Vera Rubin GPU and Vera CPU, is projected to cost hyperscale cloud operators around $7.8 million per unit. Morgan Stanley’s recent BOM analysis indicates that memory now consumes about 25 % of that total, roughly $2 million per rack – a 435 % jump from the GB300 NVL72 predecessor.
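The headline percentages can be cross-checked with simple arithmetic. The sketch below is illustrative only: it takes the article's rounded figures (about $2 million of memory, a 435 % increase, a $7.8 million rack) at face value and backs out what they imply. The function name is a placeholder, not part of any published model.

```python
def implied_previous_cost(new_cost: float, pct_increase: float) -> float:
    """Return the earlier value implied by a new value and a percentage increase."""
    # A 435 % increase means new = old * (1 + 4.35), so old = new / 5.35.
    return new_cost / (1.0 + pct_increase / 100.0)


new_memory_cost = 2_000_000   # ~$2 M memory per VR200 rack (article figure)
rack_price = 7_800_000        # ~$7.8 M per rack (article figure)

prev_memory_cost = implied_previous_cost(new_memory_cost, 435)
memory_share = new_memory_cost / rack_price

print(f"Implied GB300 memory cost per rack: ${prev_memory_cost:,.0f}")
print(f"Memory share of VR200 rack price: {memory_share:.1%}")
```

Under these rounded inputs the predecessor's memory bill works out to roughly $374 k per rack, and the memory share to a little over 25 %.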


Technical specifications and cost breakdown

Component	Qty per VR200 NVL72	Unit cost (est.)	Total cost
Vera Rubin GPU	8	$55,000	$440,000
Vera CPU	8	$5,000	$40,000
LPDDR5X (54 TB)	54 TB	$8‑$10 / GB	$432,000‑$540,000
3D NAND SSD (≈1 TB per GPU)	~8 TB	$125 / GB*	$1,000,000
Advanced PCB, cooling, power, networking	–	–	$2.5 M‑$3.0 M
Misc. (packaging, test, markup)	–	–	$1.3 M‑$1.5 M
Total	–	–	≈$7.8 M

*Spot price for 3D NAND is roughly $125 / GB in Q2 2026, according to DRAMeXchange.
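Summing the itemized rows is a quick cross-check on the table. The sketch below recomputes each line total from quantity and unit cost, assuming decimal terabytes (1 TB = 1,000 GB), which is what the table's own arithmetic implies. Under these figures the itemized rows come to less than the rounded ≈$7.8 M total, which suggests some costs are not broken out in the table.

```python
GB_PER_TB = 1_000  # assumption: decimal terabytes, matching the table's arithmetic

# Each entry: (component, low estimate, high estimate) in US dollars.
bom = [
    ("Vera Rubin GPU", 8 * 55_000, 8 * 55_000),
    ("Vera CPU", 8 * 5_000, 8 * 5_000),
    ("LPDDR5X (54 TB)", 54 * GB_PER_TB * 8, 54 * GB_PER_TB * 10),
    ("3D NAND SSD (~8 TB)", 8 * GB_PER_TB * 125, 8 * GB_PER_TB * 125),
    ("PCB, cooling, power, networking", 2_500_000, 3_000_000),
    ("Misc. (packaging, test, markup)", 1_300_000, 1_500_000),
]

low_total = sum(low for _, low, _ in bom)
high_total = sum(high for _, _, high in bom)
headline_total = 7_800_000  # rounded total quoted in the table

for name, low, high in bom:
    print(f"{name:<34} ${low:>10,} to ${high:>10,}")
print(f"{'Itemized sum':<34} ${low_total:>10,} to ${high_total:>10,}")
print(f"Not itemized vs. headline: ${headline_total - high_total:,} "
      f"to ${headline_total - low_total:,}")
```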

Memory explosion

LPDDR5X – The VR200 rack ships 54 TB of LPDDR5X, three times the 17 TB found in the GB300 generation. At $8 / GB the raw die cost is $432 k; at $10 / GB it climbs to $540 k. Nvidia’s own markup on SOCAMM2 modules (the only package that fits Vera CPUs) is believed to add another 20‑30 %.
3D NAND – Each Rubin GPU is paired with a high‑capacity 3D NAND module for model weights and inference buffers. The cumulative storage per rack exceeds 1 PB, but the analysis isolates roughly $1 M of that as “memory” because it is directly tied to AI workload performance rather than bulk archival storage.
HBM4 – The Rubin GPUs also carry HBM4 stacks (≈30 GB per GPU). Because HBM cost is already folded into the $55 k GPU price tag, it is not counted in the memory line and does not shift the memory‑percentage calculation.
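The LPDDR5X figures above follow directly from capacity times price per GB. As a rough sketch, assuming decimal terabytes and treating the reported 20‑30 % SOCAMM2 markup as a simple multiplier on the raw cost (the article does not say how the markup is applied), the range looks like this:

```python
GB_PER_TB = 1_000  # assumption: decimal terabytes
CAPACITY_GB = 54 * GB_PER_TB


def lpddr5x_cost(price_per_gb: float, markup: float = 0.0) -> float:
    """Cost of the rack's LPDDR5X at a given $/GB, with an optional fractional markup."""
    return CAPACITY_GB * price_per_gb * (1.0 + markup)


raw_low = lpddr5x_cost(8)              # raw cost at $8 / GB
raw_high = lpddr5x_cost(10)            # raw cost at $10 / GB
marked_low = lpddr5x_cost(8, 0.20)     # low price, 20 % module markup
marked_high = lpddr5x_cost(10, 0.30)   # high price, 30 % module markup
capacity_ratio = 54 / 17               # VR200 vs. GB300 LPDDR5X capacity

print(f"Raw:         ${raw_low:,.0f} to ${raw_high:,.0f}")
print(f"With markup: ${marked_low:,.0f} to ${marked_high:,.0f}")
print(f"Capacity ratio vs. GB300: {capacity_ratio:.2f}x")
```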

Why the BOM is heavier than before

Switching fabric – The VR200 uses a next‑gen NVLink‑3 mesh with 400 Gbps per link, requiring custom silicon and high‑speed PCB traces. Estimated cost increase: $300 k per rack.
Power delivery – 600 W per GPU plus 250 W per CPU forces a redesign of the power‑module architecture, adding $250 k‑$350 k.
Cooling – Liquid‑directed cooling plates with integrated pumps replace the air‑only solution of the GB300, adding roughly $200 k.
Packaging – Nvidia now bundles the Rubin GPU and Vera CPU in a system‑in‑package (SiP) to reduce latency, a process that adds $150 k‑$200 k in test and yield costs.
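Adding up the four increments listed above gives a sense of how much of the rack's price growth comes from non-memory components. This is a minimal tally of the article's estimates, with single-point estimates entered as identical low and high values:

```python
# (low, high) estimated cost increase per rack in US dollars, from the list above.
increments = {
    "switching fabric": (300_000, 300_000),
    "power delivery": (250_000, 350_000),
    "cooling": (200_000, 200_000),
    "SiP packaging": (150_000, 200_000),
}

low_increase = sum(low for low, _ in increments.values())
high_increase = sum(high for _, high in increments.values())

print(f"Non-memory BOM increase: ${low_increase:,} to ${high_increase:,} per rack")
```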

Market implications

Cloud capex pressure

Hyperscalers typically amortize AI rack spend over 3‑5 years. At $7.8 M per rack on a three‑year schedule, the annualized cost rises from roughly $1.6 M (GB300) to $2.6 M. Assuming a 30 % utilization target for inference workloads, the effective cost per inference dollar climbs by about $0.30.
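The $2.6 M annualized figure corresponds to straight-line amortization over three years ($7.8 M / 3). The sketch below shows how the annualized cost moves across the 3‑5 year range and what it implies per utilized hour at a 30 % utilization target. Straight-line amortization and the per-hour metric are assumptions made here for illustration; the article does not specify the accounting method.

```python
HOURS_PER_YEAR = 8_760


def annualized_cost(rack_price: float, years: int) -> float:
    """Straight-line annual cost of a rack amortized over the given number of years."""
    return rack_price / years


def cost_per_utilized_hour(rack_price: float, years: int, utilization: float) -> float:
    """Annualized cost divided by the hours per year the rack is actually busy."""
    return annualized_cost(rack_price, years) / (HOURS_PER_YEAR * utilization)


rack_price = 7_800_000
for years in (3, 4, 5):
    annual = annualized_cost(rack_price, years)
    hourly = cost_per_utilized_hour(rack_price, years, utilization=0.30)
    print(f"{years} years: ${annual:,.0f} per year, ${hourly:,.2f} per utilized hour")
```

A longer schedule lowers the annual charge proportionally, so the choice between three and five years changes the annualized figure from $2.6 M to $1.56 M.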

Supply‑chain stress points

LPDDR5X scarcity – The LPDDR5X market is currently supplied by Samsung, SK Hynix, and Micron, each operating at >90 % capacity. A 20 % YoY demand increase from AI servers could push spot prices toward $12 / GB, eroding Nvidia’s margin unless long‑term contracts are secured.
3D NAND fab constraints – NAND fabs are already booked for consumer SSD demand. Nvidia’s need for high‑density, low‑latency modules may force the company to lock in capacity at a premium, similar to the 2023 HBM2e shortage.
Packaging bottlenecks – SiP assembly requires advanced under‑fill and wafer‑level packaging equipment that only a handful of vendors operate. Lead times of 12‑18 months have been reported for similar high‑performance AI stacks.
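To put the LPDDR5X price risk in per-rack terms, the following sketch reprices the rack's 54 TB at the $12 / GB spot level mentioned above and compares it with the $8‑$10 / GB range used in the cost breakdown. It assumes decimal terabytes and ignores any module markup or contract discount.

```python
CAPACITY_GB = 54 * 1_000  # 54 TB of LPDDR5X, assuming 1 TB = 1,000 GB


def rack_lpddr5x_cost(price_per_gb: int) -> int:
    """Raw LPDDR5X cost per rack at a given spot price in $/GB."""
    return CAPACITY_GB * price_per_gb


stressed = rack_lpddr5x_cost(12)
delta_vs_low = stressed - rack_lpddr5x_cost(8)
delta_vs_high = stressed - rack_lpddr5x_cost(10)

print(f"LPDDR5X at $12/GB: ${stressed:,} per rack")
print(f"Increase vs. $8/GB:  ${delta_vs_low:,}")
print(f"Increase vs. $10/GB: ${delta_vs_high:,}")
```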

Pricing strategy outlook

Nvidia’s announced $55 k price for the Rubin GPU is ≈30 % lower than the $78 k price of the H100 at launch (adjusted for inflation). The lower GPU price appears to be a tactic to keep the overall rack cost competitive against rival offerings from AMD (MI300X) and custom ASICs from Graphcore. However, the memory‑driven cost surge means that future price reductions will have to come from supply‑chain negotiations rather than GPU price cuts.

Competitive positioning

AMD – The MI300X‑based racks still rely on DDR5 for host memory, which is cheaper per GB but offers lower bandwidth for the largest models. Their BOM shows memory at ~15 % of total cost, giving them a cost advantage on smaller workloads.
Google TPU v5 – Google’s in‑house TPU continues to use HBM2e and on‑chip SRAM, keeping memory share below 20 %. Their vertical integration shields them from market‑wide DRAM price spikes.

Bottom line

Nvidia’s VR200 NVL72 rack illustrates how memory economics now dominate AI server pricing. With per‑rack memory cost up more than 400 % over the prior generation, hyperscalers will need to secure long‑term supply contracts or redesign workloads to fit tighter memory budgets. The $7.8 M price tag is not just a headline number; it reflects a shift in the cost structure where silicon performance gains are increasingly offset by the price of the data they must hold.

Sources: Morgan Stanley VR200 BOM analysis (via Twitter), DRAMeXchange Q2 2026 pricing, SemiAnalysis memory cost model, Framework DDR5 pricing tracker.
