Tagged Unions Unleashed: How Zig and Modern Languages Fix C's Dangerous Legacy
When I upgraded my Zig project from version 0.11.0 to 0.12.0, a change in the build system API forced me into the compiler’s source code. There, in Build.zig and Step.zig, I rediscovered tagged unions—a concept as old as C but reimagined with life-changing rigor. What began as routine maintenance became a revelation: modern languages are turning these once-error-prone constructs into indispensable tools for robust systems development.
The Fragile Foundation: Tagged Unions in C
In C, tagged unions are a manual dance of enum and union, lacking compiler safeguards. Consider this typical implementation:
union supported_values {
    bool b;
    uint8_t u8;
    uint32_t u32;
    // ... other types
};

struct value_tracker {
    enum value_type type;
    union supported_values value;
};
Here, the type field indicates which union member is active, but nothing enforces consistency. Reading value.u32 when type says the union holds a bool compiles without complaint and silently reinterprets whatever bytes happen to be stored there. Real-world systems amplify the risk. Consider a telemetry module I encountered:
typedef void (*get_value_cb)(void *);

struct value_tracker {
    enum telemetry_value_type type;
    get_value_cb value_cb;
    uint64_t value; // Stores all types, cast unsafely
};

static void fetch_gyro_x(void *data) {
    *(double *)data = accelerometer_get_data()->gyro.x;
}
This design relies on void* callbacks and manual type-casting, inviting memory errors. Updating a value involves treacherous steps:
void update_value(struct value_tracker *val_track) {
    val_track->value_cb((void *)&val_track->value); // Unsafe write
    switch (val_track->type) {
    case VAL_DOUBLE: {
        double *p_val = (double *)&val_track->value;
        // Hazardous conversion logic
        break;
    }
    // ... other cases
    }
}
The compiler can’t catch mismatches between type and the callback’s behavior, turning development into a high-wire act.
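One way to narrow the gap in plain C is to let the callback return the tag and the payload together instead of writing through a void pointer. The following is a minimal sketch under stated assumptions, not the telemetry module's actual code: struct telemetry_value, VAL_U32, value_as_double, and the constant 1.5 standing in for the sensor read are all hypothetical names introduced for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* VAL_DOUBLE appears in the article; VAL_U32 is a hypothetical second tag. */
enum telemetry_value_type { VAL_DOUBLE, VAL_U32 };

/* Tag and payload travel together, so a callback cannot update one alone. */
struct telemetry_value {
    enum telemetry_value_type type;
    union {
        double f64;
        uint32_t u32;
    } as;
};

/* The callback returns a typed value instead of writing through void *. */
typedef struct telemetry_value (*get_value_cb)(void);

struct value_tracker {
    get_value_cb value_cb;
    struct telemetry_value value;
};

/* Stand-in for the real sensor read; 1.5 is a placeholder reading. */
struct telemetry_value fetch_gyro_x(void) {
    struct telemetry_value v = { .type = VAL_DOUBLE, .as = { .f64 = 1.5 } };
    return v;
}

void update_value(struct value_tracker *val_track) {
    assert(val_track != NULL && val_track->value_cb != NULL);
    val_track->value = val_track->value_cb(); /* no casts, no aliasing tricks */
}

/* Reads only the member that the tag says is active. */
double value_as_double(const struct telemetry_value *v) {
    switch (v->type) {
    case VAL_DOUBLE:
        return v->as.f64;
    case VAL_U32:
        return (double)v->as.u32;
    }
    return 0.0; /* only reached if the tag itself is corrupt */
}
```

This removes the pointer casts, but it is still a convention. Nothing stops a callback from setting type to VAL_U32 and then filling in f64, which is exactly the hole that compiler-enforced tagged unions close.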
Zig’s Compiler-Enforced Revolution
Languages like Zig eliminate this fragility by baking tagged unions into the type system. Here, unions and tags are inextricably linked—modifying one without the other triggers compile-time errors:
const ComplexType = union(enum) {
    ok: u8,
    not_ok: void,
};

pub fn main() void {
    var c = ComplexType{ .ok = 42 };
    switch (c) {
        .ok => |*value| value.* += 1, // Type-safe access
        .not_ok => unreachable,
    }
}
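Declare the union against an explicit tag enum, as in union(Tag), and adding a maybe: u8 variant without also adding maybe to the enum is a compile error. With union(enum), as above, the compiler derives the tag from the fields, so the two cannot drift apart at all. This "unbreakable bond" extends to control flow: a switch must cover every variant or supply an else branch, so a newly added variant breaks the build at each switch that forgot it instead of failing at runtime. Zig applies the same idea to error handling. A function such as fn read() !u8 returns an error union, which holds either a u8 or an error, and try and catch provide ergonomic control flow over it.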
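To see what an error union saves, it helps to write one by hand in C. This is a rough sketch, not a model of Zig's actual representation: struct read_result, enum read_error, read_byte, and read_sum are hypothetical names, and the early returns in read_sum only approximate what try does.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical error set with a single member. */
enum read_error { READ_ERR_EOF = 1 };

/* Hand-rolled "u8 or error" result: the tag says which member is valid. */
struct read_result {
    enum { READ_OK, READ_FAILED } tag;
    union {
        uint8_t value;       /* valid only when tag == READ_OK */
        enum read_error err; /* valid only when tag == READ_FAILED */
    } as;
};

/* Reads one byte from buf at *pos, or reports end of input. */
struct read_result read_byte(const uint8_t *buf, size_t len, size_t *pos) {
    struct read_result r;
    assert(pos != NULL);
    if (*pos >= len) {
        r.tag = READ_FAILED;
        r.as.err = READ_ERR_EOF;
    } else {
        r.tag = READ_OK;
        r.as.value = buf[(*pos)++];
    }
    return r;
}

/* Adds two bytes (wrapping at 8 bits), propagating the first failure by hand. */
struct read_result read_sum(const uint8_t *buf, size_t len, size_t *pos) {
    struct read_result a = read_byte(buf, len, pos);
    if (a.tag != READ_OK) {
        return a; /* manual equivalent of an early return on error */
    }
    struct read_result b = read_byte(buf, len, pos);
    if (b.tag != READ_OK) {
        return b;
    }
    a.as.value = (uint8_t)(a.as.value + b.as.value);
    return a;
}
```

Every caller has to remember the tag check before touching as.value, and the C compiler will not object if one forgets. In Zig the payload is unreachable until the error case has been handled or propagated.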
Beyond Zig: Hare and Odin’s Innovations
Hare simplifies unions further with intuitive syntax:
type signed = (int | i8 | i16 | i32 | i64);
fn read_line(h: io::handle) ([]u8 | io::EOF | io::error);
Hare's match expression dispatches on whichever type the value currently holds, while Odin pairs its union types with a type switch to similar effect. Both demonstrate how modern languages treat unions as first-class citizens, not afterthoughts.
Why This Matters for Developers
Tagged unions, when compiler-enforced, rule out entire categories of bugs: tags that disagree with the stored payload, unhandled cases, and unsafe casts. They're foundational for error handling, state machines, and data serialization. In an era of high-stakes software, from embedded systems to distributed clouds, these gains aren't just theoretical. They reduce debugging nightmares and accelerate reliable development.
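The state-machine case can be sketched even in C, which shows both the shape of the pattern and how much is left to discipline. Everything here is hypothetical (conn_state, conn_step, and the three states are invented for illustration). The one real safeguard is that GCC and Clang, with -Wall, warn through -Wswitch when a switch over an enum omits an enumerator and has no default label.

```c
#include <assert.h>

enum conn_tag { CONN_IDLE, CONN_RETRYING, CONN_UP };

/* Each state carries only the data that makes sense for it. */
struct conn_state {
    enum conn_tag tag;
    union {
        unsigned attempts; /* valid only for CONN_RETRYING */
        int session_id;    /* valid only for CONN_UP */
    } as;
};

/* Advances the machine by one event; ok != 0 means the last attempt succeeded. */
struct conn_state conn_step(struct conn_state s, int ok) {
    struct conn_state next = s;
    switch (s.tag) { /* no default label, so -Wswitch can flag a forgotten state */
    case CONN_IDLE:
        next.tag = CONN_RETRYING;
        next.as.attempts = 1;
        break;
    case CONN_RETRYING:
        assert(s.as.attempts >= 1);
        if (ok) {
            next.tag = CONN_UP;
            next.as.session_id = 1; /* placeholder session identifier */
        } else {
            next.as.attempts = s.as.attempts + 1;
        }
        break;
    case CONN_UP:
        break; /* already connected; nothing to do */
    }
    return next;
}
```

The warning is opt-in and covers only the switch. Reading as.session_id while the tag is CONN_RETRYING still compiles cleanly in C, whereas a language with built-in tagged unions rejects or traps it.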
Languages like Zig, Hare, and Odin recognize this, turning a C-era pattern into a pillar of modern type safety. As systems grow more complex, embracing such tools isn’t optional; it’s how we build software that doesn’t break silently. For deeper dives, explore The Codepath Combinatoric Explosion or Algebraic Data Types.