
Tagged Unions Unleashed: How Zig and Modern Languages Fix C's Dangerous Legacy

LavX Team

A developer's journey updating a Zig project reveals how tagged unions, once a fragile design pattern in C, become compiler-enforced powerhouses in languages like Zig and Hare. This exploration uncovers the critical safety gains and ergonomic improvements that eliminate entire classes of bugs in systems programming.

When I upgraded my Zig project from version 0.11.0 to 0.12.0, a change in the build system API sent me into Zig's own source code. There, in Build.zig and Step.zig, I rediscovered tagged unions, a concept as old as C but reimagined with far more rigor. What began as routine maintenance became a revelation: modern languages are turning these once-error-prone constructs into indispensable tools for robust systems development.

The Fragile Foundation: Tagged Unions in C

In C, tagged unions are a manual dance of enum and union, lacking compiler safeguards. Consider this typical implementation:

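#include <stdbool.h>
#include <stdint.h>

// Tag enum (member names are illustrative; the original does not list them)
enum value_type {
    VALUE_BOOL,
    VALUE_U8,
    VALUE_U32,
    // ... other types
};

union supported_values {
    bool b;
    uint8_t u8;
    uint32_t u32;
    // ... other types
};

struct value_tracker {
    enum value_type type;          // says which member of value is valid
    union supported_values value;
};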

Here, the type field indicates which union variant is active, but nothing enforces consistency. Reading value.u32 when type says a different member was stored compiles without complaint and silently reinterprets whatever bytes happen to be there. Real-world systems amplify the risk, as in a telemetry module I encountered:

typedef void (*get_value_cb)(void *);

struct value_tracker {
    enum telemetry_value_type type;
    get_value_cb value_cb;
    uint64_t value; // Stores all types, cast unsafely
};

static void fetch_gyro_x(void *data) {
    *(double *)data = accelerometer_get_data()->gyro.x;
}

This design relies on void * callbacks and manual casts, so nothing ties the type a callback writes to the type field that later decides how the bytes are read. Updating a value takes several error-prone steps:

void update_value(struct value_tracker *val_track) {
    val_track->value_cb((void *)&val_track->value); // Unsafe write
    
    switch(val_track->type) {
        case VAL_DOUBLE: {
            double *p_val = (double *)&val_track->value;
            // Hazardous conversion logic
            break;
        }
        // ... other cases
    }
}

The compiler can’t catch mismatches between type and the callback’s behavior, turning development into a high-wire act.
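The usual mitigation in C is discipline rather than enforcement. The sketch below is one way to apply it, not the telemetry module's actual code: constructors set tag and payload together, and a checked accessor refuses to read the wrong member. The enumerator names and the helpers make_u32, make_bool, get_u32 and type_name are hypothetical, introduced only for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical tag names; the article does not list the enum's members. */
enum value_type { VALUE_BOOL, VALUE_U8, VALUE_U32 };

struct value_tracker {
    enum value_type type;
    union {
        bool b;
        uint8_t u8;
        uint32_t u32;
    } value;
};

/* Constructors set the tag and the payload together. */
static struct value_tracker make_u32(uint32_t v) {
    struct value_tracker t = { .type = VALUE_U32 };
    t.value.u32 = v;
    return t;
}

static struct value_tracker make_bool(bool v) {
    struct value_tracker t = { .type = VALUE_BOOL };
    t.value.b = v;
    return t;
}

/* Checked accessor: writes *out and returns true only if the tag matches. */
static bool get_u32(const struct value_tracker *t, uint32_t *out) {
    if (t->type != VALUE_U32) {
        return false;
    }
    *out = t->value.u32;
    return true;
}

/* No default case, so GCC and Clang can warn (-Wswitch, part of -Wall)
   when a new enumerator is added but not handled here. */
static const char *type_name(const struct value_tracker *t) {
    switch (t->type) {
        case VALUE_BOOL: return "bool";
        case VALUE_U8:   return "u8";
        case VALUE_U32:  return "u32";
    }
    return "unknown"; /* reached only if the tag holds an invalid value */
}
```

Calling get_u32 on a value built with make_bool returns false instead of reinterpreting the bytes. The check happens at run time, though, and any code that touches value directly bypasses it. That gap is what a language-level tagged union closes.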

Zig’s Compiler-Enforced Revolution

Languages like Zig eliminate this fragility by baking tagged unions into the type system. Here, unions and tags are inextricably linked—modifying one without the other triggers compile-time errors:

const ComplexType = union(enum) {
    ok: u8,
    not_ok: void,
};

pub fn main() void {
    var c = ComplexType{ .ok = 42 };
    
    switch (c) {
        .ok => |*value| value.* += 1, // Type-safe access
        .not_ok => unreachable,
    }
}

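With union(enum), Zig derives the tag enum from the union's fields, so the two cannot drift apart. Add a maybe: u8 variant and the switch above stops compiling until the new case is handled or an else branch is added. If the tag is a separately declared enum instead, a field that exists on only one side is likewise rejected at compile time. This "unbreakable bond" turns a forgotten variant into a compile error rather than a runtime surprise. Zig applies the same idea to error handling: a function such as fn read() !u8 returns an error union, which holds either a u8 or an error, and try and catch keep the control flow ergonomic.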
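For contrast, here is roughly what a return type like !u8 looks like when written by hand in C. This is a sketch under assumptions, not how Zig implements error unions: read_result, read_byte, read_sum2 and the error codes are hypothetical names chosen for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical error codes, standing in for a Zig error set. */
enum read_error { READ_ERR_EOF, READ_ERR_IO };

enum read_tag { READ_OK, READ_FAILED };

/* Rough C analogue of `!u8`: a tag plus either a value or an error. */
struct read_result {
    enum read_tag tag;
    union {
        uint8_t value;        /* valid when tag == READ_OK */
        enum read_error err;  /* valid when tag == READ_FAILED */
    } as;
};

/* Hypothetical reader: returns the byte at pos, or an error past the end. */
static struct read_result read_byte(const uint8_t *buf, size_t len, size_t pos) {
    struct read_result r;
    if (buf == NULL) {
        r.tag = READ_FAILED;
        r.as.err = READ_ERR_IO;
    } else if (pos >= len) {
        r.tag = READ_FAILED;
        r.as.err = READ_ERR_EOF;
    } else {
        r.tag = READ_OK;
        r.as.value = buf[pos];
    }
    return r;
}

/* Error propagation written out by hand, where Zig would use `try`. */
static struct read_result read_sum2(const uint8_t *buf, size_t len) {
    struct read_result a = read_byte(buf, len, 0);
    if (a.tag != READ_OK) return a;   /* pass the error up unchanged */
    struct read_result b = read_byte(buf, len, 1);
    if (b.tag != READ_OK) return b;
    a.as.value = (uint8_t)(a.as.value + b.as.value);
    return a;
}
```

Every early return in read_sum2 is a line the caller has to remember to write, and nothing stops code from reading as.value without checking tag first. In Zig the payload of an error union is only reachable through constructs such as try, catch, or if, which all account for the error case.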

Beyond Zig: Hare and Odin’s Innovations

Hare simplifies unions further with intuitive syntax:

type signed = (int | i8 | i16 | i32 | i64);

fn read_line(h: io::handle) ([]u8 | io::EOF | io::error);

Its match expression dispatches on whichever type the value currently holds, while Odin's union types, unpacked with a type switch, offer similar rigor. Both demonstrate how modern languages treat unions as first-class citizens, not afterthoughts.
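To see what Hare's type-based syntax saves, here is a hedged C11 approximation of the signed union above. C cannot use a type as a tag, so the sketch spells the tag out and uses _Generic to choose a constructor from the argument's static type. All names (signed_value, SIGNED_OF, widen, the from_ helpers) are hypothetical, and plain int is omitted for the reason given in the comment.

```c
#include <stdint.h>

/* The tag must be spelled out by hand. Plain int is left out because
   int32_t is usually a typedef for it, and _Generic rejects two
   associations with compatible types. */
enum signed_tag { SIGNED_I8, SIGNED_I16, SIGNED_I32, SIGNED_I64 };

struct signed_value {
    enum signed_tag tag;
    union { int8_t i8; int16_t i16; int32_t i32; int64_t i64; } as;
};

static struct signed_value from_i8(int8_t v) {
    struct signed_value s = { .tag = SIGNED_I8 };
    s.as.i8 = v;
    return s;
}

static struct signed_value from_i16(int16_t v) {
    struct signed_value s = { .tag = SIGNED_I16 };
    s.as.i16 = v;
    return s;
}

static struct signed_value from_i32(int32_t v) {
    struct signed_value s = { .tag = SIGNED_I32 };
    s.as.i32 = v;
    return s;
}

static struct signed_value from_i64(int64_t v) {
    struct signed_value s = { .tag = SIGNED_I64 };
    s.as.i64 = v;
    return s;
}

/* C11 _Generic picks the constructor from the argument's static type. */
#define SIGNED_OF(x) _Generic((x), \
    int8_t:  from_i8,  \
    int16_t: from_i16, \
    int32_t: from_i32, \
    int64_t: from_i64)(x)

/* Hand-written stand-in for a match: widen any variant to int64_t. */
static int64_t widen(struct signed_value s) {
    switch (s.tag) {
        case SIGNED_I8:  return s.as.i8;
        case SIGNED_I16: return s.as.i16;
        case SIGNED_I32: return s.as.i32;
        case SIGNED_I64: return s.as.i64;
    }
    return 0; /* reached only if the tag holds an invalid value */
}
```

With an int16_t variable h, SIGNED_OF(h) produces a value tagged SIGNED_I16 and widen returns it as an int64_t. That is roughly forty lines of scaffolding for what Hare expresses in a single type declaration, and the C version still depends on every reader checking the tag.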

Why This Matters for Developers

Tagged unions, when compiler-enforced, eliminate entire categories of bugs: tags that no longer match their payload, unhandled cases, and unsafe casts. They're foundational for error handling, state machines, and data serialization. In an era of high-stakes software, from embedded systems to distributed clouds, these gains aren't just theoretical. They cut debugging time and make reliable development faster.

Languages like Zig, Hare, and Odin recognize this, turning a C-era pattern into a pillar of modern type safety. As systems grow more complex, embracing such tools isn’t optional; it’s how we build software that doesn’t break silently. For deeper dives, explore The Codepath Combinatoric Explosion or Algebraic Data Types.

Source: https://ciesie.com/post/tagged_unions/
