The Strange Elegance of a Structless Dynamic Array in C

A tiny C vector design turns absence into architecture, removing both structs and stored capacity while exposing the hidden bargains behind low-level convenience.

Thesis

The GitHub note titled “A generic dynamic array in C that stores no capacity and needs no struct” describes a compact experiment in C API design: a dynamic array represented as exactly two pointers, where one pointer-sized slot stores the current length and the other points to the allocated elements. Its appeal is not that it should replace every ordinary vector abstraction, but that it forces a useful question about systems programming: how much machinery does a data structure actually need before it becomes understandable, portable, and pleasant to use?

In a conventional C dynamic array, the programmer usually creates a struct containing a data pointer, a length, and a capacity. That model is familiar because it mirrors the real state of the allocation. The length tells us how many elements are meaningful, the capacity tells us how many can be written before reallocation, and the pointer gives us the backing storage. The design discussed here removes one of those fields entirely and hides another in an unusual place. An empty vector of integers can be written as int *vec[2] = { 0 };, while an empty vector of struct person values can be written as struct person *people[2] = { 0 };. The first slot, interpreted through uintptr_t, represents length. The second slot is the actual data pointer.
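As a minimal sketch of the layout just described, the two slots can be read and written with a few helpers. The names ivec_len, ivec_set_len, ivec_data, and ivec_demo are hypothetical and do not come from the original note. The sketch assumes an implementation that provides uintptr_t and preserves small integers across the conversions, as mainstream compilers do.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper names; only the two-slot layout comes from the text. */

/* Slot 0: the length, read back by converting the pointer to an integer
   (implementation-defined). */
static size_t ivec_len(int *const vec[2])
{
    return (size_t)(uintptr_t)vec[0];
}

/* Store a length in slot 0 by converting an integer to a pointer
   (implementation-defined; the result is never dereferenced). */
static void ivec_set_len(int *vec[2], size_t len)
{
    vec[0] = (int *)(uintptr_t)len;
}

/* Slot 1: the real, correctly typed data pointer. */
static int *ivec_data(int *const vec[2])
{
    return vec[1];
}

static int ivec_demo(void)
{
    static int storage[4] = { 10, 20, 30, 0 };  /* stand-in for heap storage */
    int *vec[2] = { 0 };                        /* empty vector */

    /* Assumes a null pointer converts to integer 0, as on common platforms. */
    assert(ivec_len(vec) == 0 && ivec_data(vec) == NULL);

    vec[1] = storage;      /* attach storage ... */
    ivec_set_len(vec, 3);  /* ... and record that three elements are live */

    return ivec_data(vec)[ivec_len(vec) - 1];   /* last live element */
}
```

Calling ivec_demo() reads the last live element through the two accessors. Nothing in the declaration itself says that slot 0 is a counter rather than a pointer. That convention lives entirely in the helpers.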

That is a small syntactic trick with larger philosophical consequences. C often rewards programmers who make representation explicit, but this design gains genericity by making representation slightly strange. There is no IntVec, no generated family of vector structs, no macro that declares a named container type. The vector’s type is visible in the second pointer, and the vector’s metadata is squeezed into the two-slot array that the programmer already declares. It is C at its most compressed, attractive because it looks like almost nothing, suspicious for the same reason.

Key arguments

The first argument in favor of the design is ergonomic. C has no built-in generics in the style of C++ templates, Rust generics, or Go’s type parameters, so reusable containers often involve one of several compromises. A library may use void *, sacrificing type checking at the boundary. It may use macros to generate type-specific structs and functions, producing better type behavior but more names and more generated surface area. It may require the programmer to manually define a container type for every element type. Each option is serviceable, but none feels native to the language.

This two-pointer vector sidesteps part of that problem. Because the data slot has the type T *, operations such as push can be written as macros that infer the element type from the vector expression. In spirit, it belongs to the lineage of C container experiments such as Sean Barrett’s stretchy buffer, which also tries to make dynamic arrays feel like ordinary C arrays while hiding metadata nearby. The difference is that stretchy buffers traditionally store metadata before the allocation, while this design stores the length in the wrapper object and declines to store capacity at all.

The second argument is that capacity can be reconstructed from length under a disciplined growth policy. Many dynamic arrays grow geometrically, often by doubling their capacity when full. If capacity always follows powers of two, then the moments of growth are predictable: when length is zero, allocate initial storage; when length is a power of two, the buffer is exactly full, so grow to the next power of two. At every other length, the previous doubling has already left room. The design therefore does not need to remember capacity, because the current length determines when the next realloc should happen. The C library function realloc becomes the engine behind the whole abstraction, expanding the backing storage at the familiar exponential rhythm that gives amortized constant-time append behavior.
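The growth rule can be sketched in portable C by isolating it from the pointer-slot trick. In the sketch below the length is an ordinary size_t, the names len_needs_growth, int_push, and int_push_demo are hypothetical, and the initial capacity of one element is an assumption, since the text does not say how much storage the first allocation requests.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* True for 0 and for every power of two: the lengths at which a buffer
   on the doubling schedule has no spare room. */
static bool len_needs_growth(size_t len)
{
    return (len & (len - 1)) == 0;
}

/* Append one int. Hypothetical function; length is a plain size_t here. */
static bool int_push(int **data, size_t *len, int value)
{
    assert(data != NULL && len != NULL);

    if (len_needs_growth(*len)) {
        if (*len > SIZE_MAX / (2 * sizeof **data))
            return false;                       /* size would overflow */
        size_t new_cap = *len ? *len * 2 : 1;   /* assumed initial capacity: 1 */
        int *grown = realloc(*data, new_cap * sizeof **data);
        if (grown == NULL)
            return false;                       /* old buffer is still valid */
        *data = grown;
    }
    (*data)[(*len)++] = value;
    return true;
}

static int int_push_demo(void)
{
    int *data = NULL;
    size_t len = 0;
    int sum = 0;

    for (int i = 0; i < 10; i++)
        if (!int_push(&data, &len, i))
            break;
    for (size_t i = 0; i < len; i++)
        sum += data[i];
    free(data);
    return sum;   /* 0 + 1 + ... + 9 when every push succeeds */
}
```

No capacity variable appears anywhere: the decision to call realloc and the size passed to it are both functions of the length alone.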

This is clever because it distinguishes between information that is necessary and information that is merely convenient. Capacity is not fundamental in the same way length is fundamental. A vector must know how many initialized elements it contains, but capacity is an optimization boundary, a record of space already paid for. If the growth policy is rigid enough, that boundary can be recomputed at the moments when it matters. The program loses some flexibility, but it also deletes a field from the representation.

The third argument is aesthetic, and in C that is not trivial. Systems programmers often distrust aesthetics because beauty can disguise undefined behavior, hidden allocation, or surprising control flow. Still, C APIs live or die by local readability. A declaration like int *vec[2] = { 0 }; is terse and mechanically simple. There is no separate type declaration. There is no constructor function. There is no naming ceremony. For small internal tools, experiments, and single-header utilities, that matters.

The design also demonstrates how much of C’s extensibility happens outside the standard type system. The described code depends on C23 plus GNU statement expressions, documented in the GCC manual. Statement expressions allow a macro to behave more like an inline generic function, evaluating local temporaries and returning a value. In this case, vec_push can push an element and return success or failure. That return value matters because allocation can fail, and a vector abstraction that hides allocation without reporting failure would be a poor citizen in serious C code.
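To make the role of statement expressions concrete, here is one way such a macro might look. It is a sketch, not the note's actual code: the names vec_len_sketch and vec_push_sketch are hypothetical, the GNU spelling __typeof__ stands in for C23 typeof so the example also builds in older GNU modes, and the initial capacity of one element is an assumption. It compiles only with compilers that accept GNU statement expressions, such as GCC and Clang.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* GNU C only: relies on statement expressions and __typeof__. */
#define vec_len_sketch(v) ((size_t)(uintptr_t)(v)[0])

/* Evaluates to 1 on success and 0 on allocation failure.
   Note that the argument v is expanded several times. */
#define vec_push_sketch(v, value) ({                                      \
    size_t len_ = vec_len_sketch(v);                                      \
    int ok_ = 1;                                                          \
    if ((len_ & (len_ - 1)) == 0) {                                       \
        void *grown_ = NULL;                                              \
        if (len_ <= SIZE_MAX / (2 * sizeof *(v)[1]))                      \
            grown_ = realloc((v)[1],                                      \
                             (len_ ? len_ * 2 : 1) * sizeof *(v)[1]);     \
        if (grown_ != NULL) (v)[1] = grown_; else ok_ = 0;                \
    }                                                                     \
    if (ok_) {                                                            \
        (v)[1][len_] = (value);                                           \
        (v)[0] = (__typeof__((v)[0]))(uintptr_t)(len_ + 1);               \
    }                                                                     \
    ok_;                                                                  \
})

static double vec_push_demo(void)
{
    double *xs[2] = { 0 };   /* element type is inferred from this declaration */
    double last = 0.0;

    for (int i = 1; i <= 5; i++)
        if (!vec_push_sketch(xs, i * 0.5))
            break;           /* allocation failed; xs is still consistent */
    assert(vec_len_sketch(xs) <= 5);
    if (vec_len_sketch(xs) > 0)
        last = xs[1][vec_len_sketch(xs) - 1];
    free(xs[1]);
    return last;
}
```

The final expression ok_ is what the whole macro evaluates to, which is how a push can be used directly inside an if condition. The element size comes from sizeof *(v)[1], so the same macro serves any element type without a generated struct.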

Supporting evidence

The strongest evidence for the idea is the smallness of the representation. A vector is just two pointer slots. The first is not a real pointer in the ordinary sense, but a pointer-sized storage location for the length. The second points to the allocation. In effect, the design treats the two-element array as a tiny header object whose second member is already typed correctly for the element array.

The reliance on uintptr_t is the most delicate part. uintptr_t is an optional unsigned integer type from <stdint.h>; where it exists, any valid pointer to void can be converted to it and back, and the result compares equal to the original pointer. The design reverses that direction: instead of converting a pointer into an integer for inspection or hashing, it stores an integer length in a slot that has pointer type, then later reads that value back through uintptr_t. The standard's round-trip guarantee covers pointer to integer to pointer, not integer to pointer to integer, so the original note correctly characterizes this as implementation-defined behavior. A program using this trick is asking the implementation to preserve the integer value through a pointer-shaped channel, and that request is not as portable or semantically clean as keeping a size_t len field in a struct.

The no-capacity rule is similarly elegant but constraining. Suppose the vector currently has length 8. Since 8 is a power of two, the next push triggers growth to capacity 16. At length 9, no growth is needed. At length 16, the next push grows to 32. This gives the familiar doubling schedule without storing the current allocation size. For append-heavy workloads, that may be good enough, because the key performance property remains: most pushes do not allocate, and occasional pushes pay for larger storage.
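The worked example above can be checked with a little arithmetic. This sketch assumes the schedule starts at a capacity of one element, and the names implied_capacity, realloc_count, and schedule_check are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Capacity implied by the doubling schedule: the smallest power of two
   that is >= len, or 0 for an empty vector. */
static size_t implied_capacity(size_t len)
{
    size_t cap = (len == 0) ? 0 : 1;
    while (cap < len && cap <= SIZE_MAX / 2)   /* stop before overflow */
        cap *= 2;
    return cap;
}

/* Number of realloc calls made by `pushes` appends to an empty vector. */
static size_t realloc_count(size_t pushes)
{
    size_t count = 0;
    for (size_t len = 0; len < pushes; len++)
        if ((len & (len - 1)) == 0)            /* len is 0 or a power of two */
            count++;
    return count;
}

static void schedule_check(void)
{
    assert(implied_capacity(8) == 8);    /* full: the next push grows to 16 */
    assert(implied_capacity(9) == 16);   /* room left: no growth needed */
    assert(implied_capacity(16) == 16);  /* full again: next push grows to 32 */
    assert(realloc_count(20) == 6);      /* at lengths 0, 1, 2, 4, 8, 16 */
}
```

Twenty pushes cost six allocations and a thousand cost eleven, which is the logarithmic count behind the amortized constant-time claim.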

The cost appears when the programmer wants reservation. In a conventional vector, reserve(1000) means the data structure records that it has room for at least 1000 elements even if its length is still 0. In this structless design, there is nowhere durable to store that fact. If a manual reservation creates a larger allocation while length remains small, the push logic will still call realloc at the next power-of-two length boundary, because it only sees length. The size it requests is derived from that length as well, so it can even be smaller than the block that was reserved. The abstraction has forgotten that extra space exists. That is not merely an implementation inconvenience. It is a consequence of the representation’s central bargain: facts not stored must either be recomputable or unavailable.
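A small experiment makes the forgotten reservation visible. The sketch below uses a hypothetical push_int function over the two-slot layout, assumes an initial capacity of one element, and counts realloc calls with a global counter that exists only for the demonstration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

static int realloc_calls;   /* instrumentation for the demo only */

/* Hypothetical push over the two-slot layout; returns 1 on success. */
static int push_int(int *vec[2], int value)
{
    size_t len = (size_t)(uintptr_t)vec[0];

    if ((len & (len - 1)) == 0) {              /* decision uses length only */
        if (len > SIZE_MAX / (2 * sizeof *vec[1]))
            return 0;
        size_t cap = len ? len * 2 : 1;
        int *p = realloc(vec[1], cap * sizeof *p);
        if (p == NULL)
            return 0;
        realloc_calls++;
        vec[1] = p;
    }
    vec[1][len] = value;
    vec[0] = (int *)(uintptr_t)(len + 1);
    return 1;
}

static int reserve_then_push_demo(void)
{
    int *vec[2] = { 0 };

    /* Manual "reserve": room for 1000 ints, but the length stays 0. */
    vec[1] = malloc(1000 * sizeof *vec[1]);
    if (vec[1] == NULL)
        return -1;

    realloc_calls = 0;
    for (int i = 0; i < 100; i++)
        if (!push_int(vec, i))
            break;

    assert((size_t)(uintptr_t)vec[0] <= 100);
    free(vec[1]);
    return realloc_calls;   /* one call per length 0, 1, 2, 4, ..., 64 */
}
```

Even though the buffer could already hold every element, a hundred pushes still trigger eight realloc calls, and the very first one asks for room for a single element, handing the reserved space back to the allocator.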

This gives the design a clean educational value. Many data structures look inevitable only because their trade-offs have been normalized. Length, capacity, and pointer are so familiar that they seem like the natural atoms of a dynamic array. This example separates them. Pointer is necessary. Length is necessary. Capacity is useful, often very useful, but under a strict growth law it can be treated as derived state. Once seen that way, the ordinary vector struct becomes less like a law of nature and more like a practical treaty among speed, flexibility, and clarity.

Implications

The broader implication is that C remains a language where abstraction is negotiated through representation. In languages with stronger generic systems, a vector is usually a library type whose interface is separated from its physical layout. In C, the layout often leaks into the API because allocation, ownership, and type identity are all close to the surface. A clever macro can reduce friction, but it cannot fully erase the responsibilities that the type system does not carry.

That is why this design is most compelling as a study in minimalism rather than a universal recommendation. It shows how far one can compress a dynamic array while preserving typed element access and amortized growth. It also shows where compression starts to tax the reader. A normal struct { T *data; size_t len; size_t cap; } is boring, but boring representations have virtues. They are easy to inspect in a debugger, easy to serialize mentally, easy to extend with operations like reserve, and less likely to depend on implementation-specific behavior.
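For contrast, a minimal sketch of the boring three-field version shows how naturally reserve falls out once capacity is stored. The names int_vec, int_vec_reserve, int_vec_push, and int_vec_demo are hypothetical, and int stands in for the element type T.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

struct int_vec { int *data; size_t len; size_t cap; };

/* Ensure room for at least `want` elements; never shrinks. */
static bool int_vec_reserve(struct int_vec *v, size_t want)
{
    assert(v != NULL);
    if (want <= v->cap)
        return true;                           /* the stored capacity answers */
    if (want > SIZE_MAX / sizeof *v->data)
        return false;
    int *p = realloc(v->data, want * sizeof *p);
    if (p == NULL)
        return false;
    v->data = p;
    v->cap = want;
    return true;
}

static bool int_vec_push(struct int_vec *v, int value)
{
    assert(v != NULL);
    if (v->len == v->cap) {
        if (v->cap > SIZE_MAX / 2)
            return false;
        if (!int_vec_reserve(v, v->cap ? v->cap * 2 : 1))
            return false;
    }
    v->data[v->len++] = value;
    return true;
}

/* Reserve once, push 100 values, and report whether the buffer stayed put. */
static bool int_vec_demo(void)
{
    struct int_vec v = { 0 };
    if (!int_vec_reserve(&v, 1000))
        return false;

    int *before = v.data;
    for (int i = 0; i < 100; i++)
        if (!int_vec_push(&v, i)) {
            free(v.data);
            return false;
        }

    bool stable = v.data == before && v.cap == 1000 && v.len == 100;
    free(v.data);
    return stable;
}
```

Because cap is remembered, the hundred pushes after the reservation never touch the allocator, and pointers into the buffer remain valid throughout. The price is one extra word per vector and a named type.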

There is also a lesson about hidden state. The design appears to have less state because it stores no capacity, but the allocation still has capacity in reality. The allocator tracks the size of each block it hands out, even though standard C offers no portable way to query it, and the program has arranged for that size to follow a power-of-two schedule. The capacity has not disappeared from the machine, only from the program’s explicit model. That can be a reasonable choice when the model’s job is narrow. It becomes hazardous when callers begin to expect the richer behaviors of a full vector library.

For library authors, the question becomes one of audience. If the header is meant for personal projects, controlled platforms, or exploratory code, the compactness may be a virtue. If it is meant for a public C library consumed across compilers, architectures, sanitizers, and build modes, the implementation-defined pointer conversion becomes harder to justify. Portability in C is not an abstract moral preference. It is the difference between code that merely works where it was born and code that can survive contact with other toolchains.

The dependence on GNU statement expressions adds a second portability boundary. GCC and Clang accept them as an extension and diagnose them under -pedantic, but they are not part of any ISO C standard, and compilers that do not implement the GNU extensions reject them. A project already committed to GNU C may accept that happily. A project that advertises strict C23 compatibility cannot. This distinction matters because the phrase “C23” can sound like standard legitimacy, while the macro technique still requires a compiler extension. The design lives in the productive gray zone where many real C programs live, close to the standard but not fully inside it.

Counter-perspectives

A sympathetic critic would say that the ordinary three-field vector struct is still the better default. It is explicit, portable, and honest about the state a dynamic array manages. It supports reservation naturally. It avoids storing integers in pointer slots. It works with ordinary functions instead of depending heavily on macros. It may require a name, but names are not always clutter. Sometimes they are the price of making a concept visible.

Another critique is that the API’s visual simplicity may shift complexity into the programmer’s expectations. int *vec[2] does not immediately announce “dynamic array header” to a reader unfamiliar with the convention. A named type like IntVec or a generic macro declaration such as VEC(int) vec can communicate intent more directly, even if it involves more scaffolding. In low-level code, clarity often comes less from reducing tokens than from making invariants easy to find.

A third perspective is that capacity is not just an optimization, but an interface capability. Programs reserve space for many reasons: to avoid repeated allocation in a known workload, to keep pointers stable for a phase, to make failure happen before mutation begins, or to match data arriving from a file or network protocol. A vector design that cannot remember reserved space has chosen append simplicity over allocation control. That choice is coherent, but it narrows the domain where the abstraction feels complete.

The most balanced reading is that this GitHub experiment is valuable because it is precise about its own compromises. It does not pretend that implementation-defined behavior is free. It does not pretend that removing capacity has no consequence. It presents a small mechanism and lets the reader feel both its elegance and its pressure points. That is the best kind of systems programming artifact: not a finished doctrine, but a compact object lesson in how representation, portability, and ergonomics keep negotiating with one another in C.
