Early-Z testing is a foundational GPU optimization that dramatically reduces unnecessary pixel shading by culling occluded fragments early. This deep dive explores its intricate mechanics, surprising limitations with modern shader techniques, and practical strategies for maximizing rendering performance.
The Hidden Engine of GPU Efficiency: Demystifying Early-Z Testing

For decades, Early-Z testing has been the silent workhorse of real-time graphics, enabling techniques like depth pre-passes that keep forward rendering viable. Yet its nuanced interactions with modern shader features remain widely misunderstood. As GPUs evolve, mastering Early-Z becomes critical for unlocking peak performance in complex rendering pipelines.
The Logical Pipeline vs. Hardware Reality
Graphics APIs depict a logical pipeline where depth operations occur after pixel shading—a historical artifact from when depth buffers primarily resolved visibility. In reality, drivers analyze shaders and states to determine if depth testing can safely move before pixel shading (Early-Z), culling fragments without executing expensive shaders:

"This ‘sneaky’ optimization works because for opaque geometry, culling before shading produces identical results to the logical pipeline—just faster," explains the analysis. "The magic lies in drivers guaranteeing correctness while exploiting hardware parallelism."
When Early-Z Thrives… and Stumbles
The Ideal Scenario
With standard opaque shaders (no discards, depth exports, or UAV writes), Early-Z shines. Front-to-back rendering slashes pixel shader invocations dramatically, as demonstrated by the author's test app:
- Back-to-front draw: 648,000 shader invocations
- Front-to-back draw: 440,640 invocations (32% reduction)
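
The effect of draw order can be reproduced with a toy software rasterizer. The sketch below is not the author's test app and does not reproduce its numbers. It assumes axis-aligned rectangles at constant depth, a less-than depth test, and a hypothetical helper named `count_shader_invocations`.

```python
def count_shader_invocations(draws, width, height, early_z):
    """Rasterize rectangles in draw order; return (shader runs, final depth buffer).

    draws: list of (x0, y0, x1, y1, depth); a smaller depth is nearer the camera.
    """
    depth = [[1.0] * width for _ in range(height)]  # cleared to the far plane
    invocations = 0
    for x0, y0, x1, y1, z in draws:
        for y in range(y0, y1):
            for x in range(x0, x1):
                passes = z < depth[y][x]
                if early_z and not passes:
                    continue  # Early-Z: culled before the shader runs
                invocations += 1  # the pixel shader executes for this fragment
                if passes:
                    depth[y][x] = z  # depth write for surviving fragments
    return invocations, depth


near = (0, 0, 4, 8, 0.25)  # covers the left half of an 8x8 target
far = (0, 0, 8, 8, 0.75)   # covers the whole target, behind `near`

back_to_front, depth_a = count_shader_invocations([far, near], 8, 8, early_z=True)
front_to_back, depth_b = count_shader_invocations([near, far], 8, 8, early_z=True)
late_z, depth_c = count_shader_invocations([near, far], 8, 8, early_z=False)

print(back_to_front, front_to_back, late_z)  # 96 64 96
print(round(1 - 440_640 / 648_000, 2))       # 0.32, the reduction quoted above
```

In this model the final depth buffer is the same in all three runs. Only the number of shader invocations changes, which is why the optimization is safe for opaque geometry.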

The Disruptors
- Discard/Alpha Test: Forces partial Late-Z when depth writes are enabled, crippling culling efficiency. Even an unused `discard` instruction in the shader disables full Early-Z:

      // This unused discard still disables full Early-Z!
      if (false) discard;

- Depth Export: Pixel shader depth overrides (`SV_Depth`, `gl_FragDepth`) force full Late-Z, because the GPU can't predict the output before shading. Conservative variants (`SV_DepthGreaterEqual`) offer limited reprieves.
- UAV/Storage Writes: Side effects break Early-Z's "pure function" assumption. Without explicit forcing, drivers default to Late-Z to preserve correctness.
Taking Control: Forcing Early-Z
Shading languages let you override the driver's decision: HLSL has the `[earlydepthstencil]` attribute, and GLSL has `layout(early_fragment_tests)`. Forcing enables Early-Z with UAVs, which is crucial for techniques like Order-Independent Transparency, but introduces caveats:
- Depth exports are ignored
- Discard doesn't prevent depth writes
- Without ROVs, UAV writes race across overlapping fragments
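
The discard caveat is the easiest one to trip over, so here is a minimal sketch of it. This is a toy single-pixel model, not real driver or hardware behavior. It assumes a less-than depth test with depth writes enabled, and `shade_fragment` is a hypothetical helper.

```python
def shade_fragment(depth_buffer, x, z, shader_discards, forced_early_z):
    """Process one fragment; return True if its color reaches the render target."""
    if forced_early_z:
        # Depth test and depth write both complete before the shader runs.
        if not z < depth_buffer[x]:
            return False
        depth_buffer[x] = z
        return not shader_discards  # discard only suppresses the color output
    # Late-Z: the shader runs first, so a discarded fragment never writes depth.
    if shader_discards:
        return False
    if not z < depth_buffer[x]:
        return False
    depth_buffer[x] = z
    return True


def wall_visible(forced_early_z):
    depth = [1.0]
    # A fully transparent, alpha-tested texel at z=0.2 that the shader discards...
    shade_fragment(depth, 0, 0.2, shader_discards=True, forced_early_z=forced_early_z)
    # ...followed by an opaque wall behind it at z=0.5.
    return shade_fragment(depth, 0, 0.5, shader_discards=False, forced_early_z=forced_early_z)


print(wall_visible(forced_early_z=False))  # True: the wall shows through the cutout
print(wall_visible(forced_early_z=True))   # False: the discarded texel still wrote depth
```

With forcing enabled, the discarded fragment leaves an invisible occluder in the depth buffer, so alpha-tested geometry with depth writes is a poor fit for forced Early-Z.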

Rasterizer Order Views: The Savior?
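Rasterizer Order Views (ROVs) in D3D, and the equivalent fragment shader interlock (FSI) in Vulkan and OpenGL, enforce submission-order UAV writes, restoring expected depth-test behavior when forcing Early-Z: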
"ROVs guarantee UAV writes only occur for visible fragments and respect draw order, making forced Early-Z viable for advanced techniques—with a parallelism penalty."
The Decision Matrix
| Shader Features | Depth Write | Implicit Early-Z? | Forced Early-Z Behavior |
|---|---|---|---|
| None | Off | ✅ Likely | Correct |
| Discard | On | ⚠️ Partial (reduced) | ❌ Depth write ignores discard |
| UAV Writes | Off | ❌ Late-Z | ✅ Writes if visible (unordered) |
| UAV + ROV | On | ❌ Late-Z | ✅ Correct with ROVs |
| Depth Export | Any | ❌ Late-Z | ❌ Export ignored |
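
The matrix can also be encoded as data, for instance as the basis of a material-validation check in a tool. The sketch below is a direct transcription of the five rows above and nothing more. The feature names and the `lookup` helper are hypothetical, and combinations the table does not list return `None` rather than a guess.

```python
# (shader features, depth write, implicit Early-Z?, forced Early-Z behavior)
# A depth write of None means "any", as in the Depth Export row.
ROWS = [
    (frozenset(), False, "likely", "correct"),
    (frozenset({"discard"}), True, "partial (reduced)", "depth write ignores discard"),
    (frozenset({"uav"}), False, "late-z", "writes if visible (unordered)"),
    (frozenset({"uav", "rov"}), True, "late-z", "correct with ROVs"),
    (frozenset({"depth_export"}), None, "late-z", "export ignored"),
]


def lookup(features, depth_write):
    """Return (implicit, forced) for a listed combination, else None."""
    key = frozenset(features)
    for row_features, row_depth_write, implicit, forced in ROWS:
        if row_features == key and row_depth_write in (None, depth_write):
            return implicit, forced
    return None  # not covered by the table: measure on target hardware


print(lookup([], depth_write=False))                # ('likely', 'correct')
print(lookup(["depth_export"], depth_write=True))   # ('late-z', 'export ignored')
print(lookup(["discard"], depth_write=False))       # None
```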
Strategic Insights
- Prepass Wisely: Depth-only passes maximize Early-Z efficiency for opaque geometry.
- Isolate Disruptors: Batch non-discard opaques first to prime the depth buffer.
- ROVs > Atomics: For OIT, prefer ROVs over depth+payload atomics when forcing Early-Z.
- Mobile Caveat: Behavior varies—test target hardware aggressively.
As rendering complexity escalates, understanding Early-Z transitions from optimization to necessity. The difference between theory and hardware reality isn't just academic—it's the gap between stutter and silky frames.
Source: To Early-Z or Not to Early-Z by Michał Iwanicki (Principal Engine Architect)
