Flatiron and the Clojure Case for Small, Fast Analytics

Flatiron argues that many analytics workloads do not need a database server so much as they need a better memory shape.

Thesis

Flatiron is a small but revealing Clojure project because it treats performance less as an afterthought and more as a question of representation. Its central claim is simple: if analytical data is stored in the same loose, object-heavy shape that ordinary Clojure programs use for application data, then the JVM is forced to spend too much time chasing pointers, opening maps, boxing values, and rediscovering types. If the same data is stored as typed primitive columns, then a Clojure program can begin to behave more like a compact analytical engine, while still remaining inside the host language and its ordinary workflow.

That makes Flatiron more than another library for group-by and filtering. It is an argument about where the boundary between application code and database code should sit. Instead of asking developers to send data out to SQLite, H2, DuckDB, or a service-backed warehouse, Flatiron offers an in-process columnar table, a SQL-like macro DSL, fast aggregations, stable sorting, simple persistence, and graph algorithms over the same data. The result is not a replacement for full database systems, but a focused tool for a common middle zone: datasets too large for idiomatic sequences of maps, yet too local, transient, or embedded to justify a heavier database architecture.

Key arguments

The first technical idea behind Flatiron is that Clojure’s usual data model is expressive but expensive for analytics. A sequence of maps is pleasant to construct, inspect, transform, and pass around, but every row is a separate object graph. Numeric values are often boxed. Field lookup means map lookup. Iterating through one million rows to sum a single numeric field means walking through one million row containers, even though the computation really wants one contiguous numeric array. That mismatch between logical intent and physical layout is the kind of abstraction tax that becomes invisible at small scale and overwhelming once the row count rises.
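To make the layout mismatch concrete, here is a minimal sketch in plain Clojure. It does not use Flatiron; the names `rows`, `qty-col`, `sum-rows`, and `sum-col` are hypothetical, chosen only to contrast a sum over row maps with a sum over one primitive array.

```clojure
;; Illustrative only: plain Clojure, not Flatiron's API.

(def rows
  ;; Row-oriented: one map per row, each :qty held as a boxed Long.
  (mapv (fn [i] {:sym :AAPL :qty i}) (range 1000)))

(def qty-col
  ;; Column-oriented: the same quantities packed into one long[].
  (long-array (map :qty rows)))

(defn sum-rows
  "Sums :qty by visiting every row map and looking the field up in each one."
  [rows]
  (reduce + (map :qty rows)))

(defn sum-col
  "Sums a long[] in a single loop over primitive values."
  ^long [^longs col]
  (let [n (alength col)]
    (loop [i 0 acc 0]
      (if (< i n)
        (recur (unchecked-inc i) (unchecked-add acc (aget col i)))
        acc))))

(sum-rows rows)   ;; walks 1000 maps
(sum-col qty-col) ;; walks one array
```

Both calls return the same total. The difference is in what the JVM has to touch to get there: a thousand separate maps in the first case, one contiguous array in the second.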

Flatiron changes the physical layout. A table is a schema plus a vector of columns, and each column is backed by a Java array matched to its type: long[] for 64-bit integers, double[] for floating point values, byte storage for booleans, and object arrays for keywords and strings. This is the classic columnar move, familiar from analytical systems such as Apache Arrow, ClickHouse, DuckDB, and modern query engines, but Flatiron applies it in a deliberately small Clojure form. The important shift is that operations over one field no longer need to touch the whole row object. A sum over :Qty can run over the :Qty array, with tight loops and unchecked arithmetic, instead of repeatedly entering and leaving generic data structures.
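The "schema plus a vector of columns" idea can be sketched with an ordinary map. This is an assumed shape for illustration, not Flatiron's actual table type; the keys `:schema` and `:columns` and the helpers `column` and `sum-long-column` are hypothetical.

```clojure
;; Hypothetical table shape for illustration; Flatiron's real type may differ.
(def trades
  {:schema  [[:sym :keyword] [:qty :long] [:px :double] [:live :boolean]]
   :columns [(object-array [:AAPL :MSFT :AAPL]) ; keywords in an Object[]
             (long-array   [100 250 75])        ; 64-bit integers in a long[]
             (double-array [189.5 402.1 190.0]) ; floats in a double[]
             (byte-array   [1 0 1])]})          ; booleans stored as bytes

(defn column
  "Returns the backing array for column `k`, located by its position in the schema."
  [table k]
  (let [idx (first (keep-indexed (fn [i [nm _]] (when (= nm k) i))
                                 (:schema table)))]
    (nth (:columns table) idx)))

(defn sum-long-column
  "Sums one long column without touching any other column."
  [table k]
  (let [^longs col (column table k)]
    (areduce col i acc 0 (unchecked-add acc (aget col i)))))

(sum-long-column trades :qty)
```

The sum reads only the `:qty` array. The symbol, price, and flag columns are never loaded, which is the property the columnar layout is meant to provide.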

The second idea is that Flatiron preserves some of Clojure’s compositional style without turning query execution into runtime string parsing. Its DSL, exposed through functions and macros such as select, where, sum, avg, count, and pivot, compiles into direct function calls. A query that filters trades where quantity exceeds a threshold and then groups by symbol is written in a shape that resembles SQL, but it is never sent through a SQL parser. It becomes Clojure code that dispatches once on column types and then loops over primitive storage. This matters because macro expansion lets Flatiron keep a pleasant surface without paying the cost of interpreting a query language at runtime.
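A toy macro shows what "compiles into direct function calls" can mean in practice. Everything here is hypothetical: `toy-where`, `mask-gt`, `mask-lt`, and `kernels` are invented for this sketch and are not Flatiron's DSL. The point is only that the operator is resolved at macro-expansion time, so nothing is parsed or interpreted when the query runs.

```clojure
;; Hypothetical sketch; these names are not part of Flatiron.

(defn mask-gt
  "Kernel: returns a boolean[] marking positions where col[i] > threshold."
  [^longs col ^long threshold]
  (let [n (alength col)
        ^booleans out (boolean-array n)]
    (dotimes [i n]
      (aset out i (> (aget col i) threshold)))
    out))

(defn mask-lt
  "Kernel: returns a boolean[] marking positions where col[i] < threshold."
  [^longs col ^long threshold]
  (let [n (alength col)
        ^booleans out (boolean-array n)]
    (dotimes [i n]
      (aset out i (< (aget col i) threshold)))
    out))

(def kernels
  "Maps a comparison symbol to the fully qualified kernel that implements it."
  {'> `mask-gt
   '< `mask-lt})

(defmacro toy-where
  "Rewrites (toy-where cols (> :qty 100)) at expansion time into
  (mask-gt (get cols :qty) 100)."
  [cols [op col-key threshold]]
  (if-let [kernel (kernels op)]
    `(~kernel (get ~cols ~col-key) ~threshold)
    (throw (ex-info "Unsupported operator" {:op op}))))

(def cols {:qty (long-array [50 150 300])})

(vec (toy-where cols (> :qty 100)))
```

Calling `macroexpand-1` on the `toy-where` form shows a plain call to `mask-gt`. An unsupported operator fails when the code is compiled, not while the query is running.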

The morsel engine is where this design becomes more concrete. Flatiron processes element-wise operations in batches of 1024 rows, using a source that pulls chunks through type-specialized operations. The purpose of the batch is to amortize dispatch: decide what operation is happening, then apply it across a block of primitive values. This is related to the broader vectorized execution tradition in analytical databases, where work is divided into cache-friendly chunks rather than interpreted row by row. A row-oriented executor repeatedly asks, for each row, what needs to happen next. A vectorized or morsel-oriented executor asks once for a block, then lets the CPU do repetitive work with fewer interruptions.
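The amortization argument can be sketched as follows. The batch size of 1024 comes from the description above; the function `map-batched` and the operation keywords `:negate` and `:double` are hypothetical and stand in for whatever element-wise operations a real engine supports.

```clojure
;; Illustrative sketch of batch-at-a-time execution; not Flatiron's engine.

(def ^:const batch-size 1024)

(defn map-batched
  "Applies the element-wise operation `op` (:negate or :double) to long[] `src`
  and returns a new long[]. The `case` dispatch runs once per batch of up to
  1024 rows; the inner loops only read and write primitives."
  [op ^longs src]
  (let [n (long (alength src))
        ^longs out (long-array n)]
    (loop [start 0]
      (when (< start n)
        (let [end (long (min n (+ start batch-size)))]
          ;; Decide what to do once for the whole batch...
          (case op
            :negate (loop [i start]
                      (when (< i end)
                        (aset out i (unchecked-negate (aget src i)))
                        (recur (unchecked-inc i))))
            :double (loop [i start]
                      (when (< i end)
                        (aset out i (unchecked-multiply 2 (aget src i)))
                        (recur (unchecked-inc i)))))
          ;; ...then move on to the next batch.
          (recur end))))
    out))

(def doubled (map-batched :double (long-array (range 3000))))
```

For 3000 rows the dispatch runs three times, not 3000 times. A row-at-a-time interpreter would make the same decision once per element.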

The design is especially visible in aggregation. Flatiron’s sum, count, avg, min, and max are single-pass reductions that skip nulls and run over backing arrays. Nulls are represented through sentinels rather than boxed nullable values, which avoids another layer of indirection. Filtering builds boolean masks, and group-by can accept a mask directly through lower-level APIs, allowing filter-then-aggregate pipelines to avoid materializing an intermediate table. That is a small implementation detail with a larger philosophical point: performance often comes from refusing to create unnecessary intermediate objects in the first place.
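A filter-then-aggregate step without an intermediate table might look like the sketch below. Using `Long/MIN_VALUE` as the null sentinel is an assumption made for this example, since the text does not say which sentinel Flatiron picks, and `null-long` and `masked-sum` are hypothetical names.

```clojure
;; Illustrative sketch; the sentinel value is an assumption, not Flatiron's.

(def ^:const null-long Long/MIN_VALUE)

(defn masked-sum
  "Single-pass sum over long[] `col`. A row contributes only if its mask bit
  is true and its value is not the null sentinel. No filtered copy of the
  column is ever built."
  ^long [^longs col ^booleans mask]
  (let [n (alength col)]
    (loop [i 0 acc 0]
      (if (< i n)
        (let [v (aget col i)]
          (recur (unchecked-inc i)
                 (if (and (aget mask i) (not (== v null-long)))
                   (unchecked-add acc v)
                   acc)))
        acc))))

(def qty  (long-array    [10 null-long 30 40]))
(def keep? (boolean-array [true true false true]))

(masked-sum qty keep?)
```

Row 1 is skipped because it holds the sentinel and row 2 because the mask excludes it. The only allocation in the pipeline is the mask itself.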

Flatiron also shows care in its handling of domain types. Dates, instants, datetimes, durations, and other logical values can be encoded into primitive numeric representations while still being exposed as richer domain values at the edges. A LocalDate, for example, can be represented as an epoch day in a long[]. That means comparison, sorting, grouping, and min or max operations keep their primitive execution path. The value becomes a domain object only when it crosses a boundary, such as construction, retrieval, or persistence. This is a useful pattern beyond Flatiron: do not make the hot path carry semantic ornamentation if the semantics can be encoded into an ordered primitive form.
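The epoch-day encoding can be shown with `java.time` alone. The helper names `encode-dates` and `max-date` are hypothetical; only `LocalDate/toEpochDay` and `LocalDate/ofEpochDay` are real JDK methods. The sketch assumes a non-empty column.

```clojure
;; Illustrative sketch of a logical-type codec; not Flatiron's implementation.
(import 'java.time.LocalDate)

(defn encode-dates
  "Encodes LocalDate values as epoch days in a long[]."
  [dates]
  (long-array (map (fn [^LocalDate d] (.toEpochDay d)) dates)))

(defn max-date
  "Finds the latest date by comparing primitive longs, then decodes only the
  final result back into a LocalDate. Assumes `col` is non-empty."
  [^longs col]
  (LocalDate/ofEpochDay
   (areduce col i m Long/MIN_VALUE (max m (aget col i)))))

(def date-col
  (encode-dates [(LocalDate/of 2024 1 15)
                 (LocalDate/of 2024 3 1)
                 (LocalDate/of 2023 12 31)]))

(max-date date-col)
```

Because epoch days sort in the same order as the dates they encode, the reduction never constructs a `LocalDate` until the boundary, where a single object is created for the answer.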

The graph support adds a second dimension to the project. Flatiron includes a CSR, or compressed sparse row, graph representation built from source and target columns, with algorithms such as BFS, DFS, Dijkstra, PageRank, and connected components. This is unusual for a small analytics library because graph processing and table analytics are often split between different tools. Flatiron’s view is that edges are tabular data until they need to become adjacency structures, and once that conversion happens, the same in-memory system can support graph algorithms without exporting to a separate engine. For application developers who are already holding relationship data in Clojure, that is a meaningful reduction in ceremony.
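The step from edge columns to a traversable graph can be sketched as a standard two-pass CSR build followed by a breadth-first search. This is a generic textbook construction, not Flatiron's code; `build-csr` and `bfs-order` are hypothetical names, and the sketch assumes a directed graph whose node ids are the integers 0 to n-1.

```clojure
;; Illustrative CSR build and BFS; assumes node ids 0..n-1, directed edges.

(defn build-csr
  "Builds CSR arrays from parallel long[] edge columns `src` and `dst`.
  Returns {:offsets long[n+1] :targets long[m]}; the neighbors of node v are
  targets[offsets[v]] up to, but not including, targets[offsets[v+1]]."
  [n ^longs src ^longs dst]
  (let [n (long n)
        m (alength src)
        ^longs offsets (long-array (inc n))
        ^longs targets (long-array m)]
    ;; Pass 1: count each source's out-degree, stored one slot to the right.
    (dotimes [e m]
      (let [s (inc (aget src e))]
        (aset offsets s (inc (aget offsets s)))))
    ;; A prefix sum turns the counts into start offsets.
    (dotimes [v n]
      (aset offsets (inc v) (+ (aget offsets (inc v)) (aget offsets v))))
    ;; Pass 2: write each target into the next free slot of its source.
    (let [^longs cursor (aclone offsets)]
      (dotimes [e m]
        (let [s   (aget src e)
              pos (aget cursor s)]
          (aset targets pos (aget dst e))
          (aset cursor s (inc pos)))))
    {:offsets offsets :targets targets}))

(defn bfs-order
  "Returns the nodes reachable from `start` in breadth-first visit order."
  [{:keys [offsets targets]} start]
  (let [^longs offsets offsets
        ^longs targets targets
        start (long start)
        n     (dec (alength offsets))
        ^booleans seen (boolean-array n)
        ^longs queue   (long-array n)] ; each node is enqueued at most once
    (aset seen start true)
    (aset queue 0 start)
    (loop [head 0 tail 1]
      (if (< head tail)
        (let [v (aget queue head)
              ;; Scan v's contiguous neighbor range, enqueueing unseen nodes.
              new-tail (long
                        (loop [e (aget offsets v) t tail]
                          (if (< e (aget offsets (inc v)))
                            (let [w (aget targets e)]
                              (if (aget seen w)
                                (recur (inc e) t)
                                (do (aset seen w true)
                                    (aset queue t w)
                                    (recur (inc e) (inc t)))))
                            t)))]
          (recur (inc head) new-tail))
        (vec (take tail queue))))))

;; Edges 0->1, 0->2, 1->3, 2->3, 3->4; node 5 has no edges.
(def g (build-csr 6 (long-array [0 0 1 2 3]) (long-array [1 2 3 3 4])))

(bfs-order g 0)
```

The edge list starts life as two ordinary columns. Only the offsets array and the reordered targets are added, and traversal afterwards reads each node's neighbors as one contiguous slice.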

The benchmark story is measured rather than inflated. Rayforce, the C17 engine whose design inspired Flatiron, remains much faster. The provided numbers show Rayforce beating Flatiron’s parallel path by several multiples, helped by SIMD kernels, custom allocation, and lower-level control over memory behavior. Flatiron’s more interesting claim is not that pure Clojure can magically erase the distance to C, but that it can get close enough for many in-process analytical workloads while preserving the convenience of a hosted language. On an Apple M1 Max, the Flatiron parallel path is far behind Rayforce in absolute terms, but still fast enough to make ordinary Clojure sequence processing look like the wrong abstraction for million-row numeric work.

The scalar sum benchmark is also a useful caution. Flatiron reports that a single-threaded sum over one million integers is faster than its parallel version, because the operation is memory-bandwidth-bound and too simple to benefit from thread coordination. This is a healthy sign in the project’s framing. Parallelism is not treated as a universal accelerator. It helps when there is enough work per row, enough grouping complexity, or enough partitionable state to justify its overhead. For simple scans, a tight single-threaded loop can win. That distinction matters because many performance libraries sell parallel execution as if cores were the only constraint, when in practice memory layout, cache behavior, allocation, and coordination costs are often more decisive.

Implications

Flatiron’s largest implication is that language ecosystems can recover serious analytical performance without abandoning their native idioms entirely, but only if they are willing to change the physical representation of data. Clojure has long encouraged programmers to build systems out of immutable maps, vectors, sets, and sequences. That model is excellent for clarity, transformation, and correctness in many application domains. Flatiron does not reject it. Instead, it draws a boundary around a specific workload and says that analytical execution needs a different substrate.

This is similar to the division between application objects and columnar execution inside larger data systems. A business object may want names, attributes, methods, validation, and identity. A query engine wants compact arrays, predictable types, and loops that the CPU can understand. Flatiron lets a Clojure developer cross that boundary without leaving the process. Data can enter from CSV, Clojure collections, or a binary column store, move into typed columns, and then be queried with a compact DSL. That is a practical bridge between expressive programming and mechanical sympathy.

The project also reflects a broader trend toward embedded analytics. Tools such as DuckDB have become popular because they make analytical queries feel local again. Instead of provisioning a server, pushing data into a warehouse, and writing glue code around remote execution, a developer can run serious analytical SQL in-process. Flatiron belongs to that family in spirit, but it takes a narrower, more language-native route. It does not offer full SQL, transactions, a large optimizer, or a mature file ecosystem. Its value is that it is pure Clojure, has almost no dependencies beyond core.async, and can be manipulated directly as data structures and functions.

That smallness is not only aesthetic. It changes the operational shape of the tool. A full embedded database brings a query planner, storage rules, SQL dialect decisions, transaction semantics, and interop boundaries. Those features are powerful when the problem asks for them. They are also weight. Flatiron is closer to a performance-oriented data structure library than a database. It gives the developer columns, masks, group-by, sorting, windows, persistence, and graph algorithms, then stays out of the way. For many internal tools, data science helpers, simulation outputs, batch jobs, local dashboards, and analytical components inside larger Clojure systems, that may be the better trade.

There is also a lesson here about the JVM. The JVM is sometimes caricatured as hostile to low-level performance, but Flatiron shows the more precise truth: the JVM punishes certain object shapes and rewards others. A million boxed numbers inside a million maps is a very different program from a million primitive longs inside one array. Clojure’s abstractions do not prevent fast execution by themselves. The question is whether the library author can arrange data so the hot path becomes simple enough for the JVM to optimize. Flatiron’s use of primitive arrays, unchecked arithmetic, logical type codecs, and type-specialized loops is a reminder that high-level languages often need low-level data discipline more than low-level syntax.

The graph engine strengthens this point because graph algorithms are also representation-sensitive. A graph stored as a collection of maps or nested sets is easy to inspect but costly to traverse at scale. CSR representation, used widely in sparse matrix and graph processing, compresses adjacency information into arrays so traversal can move through contiguous ranges. By building CSR graphs from columns, Flatiron turns tabular edge lists into an efficient graph form without making the developer adopt an entirely separate conceptual model. That connection between tables and graphs is especially valuable in domains such as fraud detection, dependency analysis, recommendation systems, package graphs, social networks, and infrastructure topology.

Counter-perspectives

The strongest counter-perspective is that Flatiron’s narrowness is both its virtue and its limit. Projects such as tech.ml.dataset already offer a broader Clojure data ecosystem, with richer data handling, more formats, stronger interoperability, and a larger collection of analytical conveniences. Embedded databases such as SQLite, H2, and DuckDB offer mature query languages, persistence models, indexing strategies, and years of production hardening. For users who need full SQL, joins, transactions, flexible file ingestion, or integration with BI tools, Flatiron is probably the wrong center of gravity.

Another concern is that custom analytical engines tend to accumulate complexity as users ask for familiar database features. Filtering, grouping, sorting, and simple windows are a coherent initial set. But analytical users soon ask for joins, richer expressions, more window frames, approximate aggregations, string functions, nested data, missing-value policies, statistics, partitioned storage, predicate pushdown, and query optimization. Each feature adds surface area, and each addition puts pressure on the clean primitive core. Flatiron’s future quality will depend partly on its ability to preserve focus, resisting the temptation to become a miniature database unless the implementation can support that ambition without losing its mechanical clarity.

There is also the question of ergonomics. Clojure developers are comfortable with data as plain maps and vectors because those structures compose across libraries. Flatiron’s table and column types are specialized. That specialization is exactly why it is fast, but it creates a conversion boundary. Users need to decide when data is large enough, numeric enough, or query-heavy enough to justify moving into Flatiron’s representation. If the answer is unclear, a developer may prefer the slower but universal path of ordinary Clojure transformations until performance pain becomes obvious.

The benchmark comparison with Rayforce cuts both ways. On one hand, staying within an order of magnitude of a C engine for parts of the workload is impressive for a pure Clojure library. On the other hand, users with truly latency-sensitive or massive workloads may reasonably choose the lower-level engine, a mature embedded analytical database, or a native library. Flatiron’s value is not absolute speed in isolation. Its value is speed per unit of integration cost for Clojure programs. That is a narrower claim, but it is also a more credible one.

Flatiron is therefore best understood as a thoughtful experiment in choosing the right abstraction layer. It does not say that every Clojure program should become columnar, or that databases are unnecessary, or that pure language implementations can ignore the advantages of native code. It says that there is a significant class of analytical work where the decisive move is to stop representing tables as rows of maps and start representing them as columns of primitives. Once that move is made, a surprising amount becomes possible: fast aggregations, typed filters, compact persistence, stable sorting, window functions, parallel group-by, and graph algorithms. The philosophy is modest but consequential: performance is not only about algorithms, and not only about languages. It is about giving the machine a shape of data it can actually work with.

#Analytics #Clojure #Performance #data-structures #embedded-databases
