Chasing Efficient Java Development: Lessons from the 1BRC to the Hardwood Parquet Parser

Backend Reporter

Gunnar Morling’s InfoQ podcast reveals how modern Java runtimes, durable execution engines, and a zero‑dependency Parquet parser reshape scalability, consistency, and API design for data‑intensive systems.

Chasing Efficient Java Development: From the One‑Billion‑Row Challenge to Hardwood AI‑Native Parsing


Published May 25 2026 – Podcast with Gunnar Morling, Senior Principal Technologist at Confluent


The problem: Java’s reputation for sluggish data pipelines

For years many teams treated Java as the “slow” option for high‑throughput analytics. The classic complaint was two‑fold:

  1. Memory pressure – the default object header and heap layout caused large GC pauses.
  2. Dependency bloat – parsing columnar formats like Apache Parquet pulled in the whole Hadoop stack, inflating the attack surface and making reproducible builds painful.

When you combine those issues with long‑running business workflows (order fulfillment, CDC pipelines, etc.) you end up with a system that is hard to reason about, difficult to scale, and fragile under failure. The One‑Billion‑Row Challenge (1BRC) demonstrated that Java can be fast, but it left open the question of how to keep that speed while delivering production‑grade consistency and clean APIs.


Solution approach: three pillars

1. Upgrade the runtime – let the JVM do the heavy lifting

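  • Compact object headers (JEP 450) reduce the header of every heap object from between 96 and 128 bits down to 64 bits on 64‑bit HotSpot, shrinking objects by up to 25 % in favorable cases and directly reducing GC work.
  • Concurrent collectors (ZGC, Shenandoah) do most of their work alongside application threads; ZGC in particular is designed for sub‑millisecond pauses even on multi‑terabyte heaps.
  • Virtual threads (Project Loom) provide lightweight concurrency without the OS thread overhead, enabling fine‑grained parallelism in parsers and workflow engines.

The practical takeaway is simple: move every production service to at least JDK 17 and enable ZGC. The performance gain is measurable without any code change, a classic “upgrade‑first” win. Two of the features above need newer releases, though: virtual threads became final in JDK 21, and compact object headers first shipped as an experimental option in JDK 24.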
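
Before measuring anything, it helps to confirm which JDK release and collector a service is actually running. The sketch below is one way to do that with standard JDK management APIs; the class name RuntimeReport is illustrative, and matching on the substring "ZGC" is an assumption about how the JVM names its collector beans.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

/** Reports the running JDK feature release and the registered garbage collectors. */
class RuntimeReport {

    /** Feature release of the running JVM, for example 17 or 21. */
    static int featureVersion() {
        return Runtime.version().feature();
    }

    /** Names of the collector beans the JVM registered at startup. */
    static List<String> collectorNames() {
        List<String> names = new ArrayList<>();
        for (GarbageCollectorMXBean bean : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(bean.getName());
        }
        return names;
    }

    /** Assumption: ZGC collector beans carry "ZGC" somewhere in their name. */
    static boolean usesZgc() {
        return collectorNames().stream().anyMatch(name -> name.contains("ZGC"));
    }

    public static void main(String[] args) {
        System.out.println("JDK feature release: " + featureVersion());
        System.out.println("Collectors: " + collectorNames());
        System.out.println("ZGC active: " + usesZgc());
    }
}
```

Running it once with the default flags and once with -XX:+UseZGC shows whether the flag took effect in a given deployment.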


2. Durable execution engine – resumable workflows in plain Java

Morling’s Persistasaurus wraps a SQLite state store and lets developers write a workflow as ordinary Java code. Each logical step is automatically persisted; on failure the engine resumes from the last committed step.
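
The resume‑from‑last‑step idea can be sketched in a few lines. This is not Persistasaurus's API: the class DurableFlowSketch, the step method, and the in‑memory map standing in for the SQLite state store are all hypothetical, chosen only to show how a committed step is replayed from the log instead of being executed again.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

/** Hypothetical sketch of step-level durability; NOT the Persistasaurus API. */
class DurableFlowSketch {

    private final String flowId;
    /** Stand-in for the persistent step log; a real engine would use a database table. */
    private final Map<String, Object> stepLog;

    DurableFlowSketch(String flowId, Map<String, Object> stepLog) {
        this.flowId = flowId;
        this.stepLog = stepLog;
    }

    /** Runs the step once; on later runs returns the committed result without re-executing. */
    @SuppressWarnings("unchecked")
    <T> T step(String name, Supplier<T> body) {
        String key = flowId + "/" + name;
        if (stepLog.containsKey(key)) {
            return (T) stepLog.get(key); // replay: step already committed
        }
        T result = body.get();           // may throw; nothing is committed in that case
        stepLog.put(key, result);        // commit the step result
        return result;
    }

    /** Example flow: reserve stock, then charge. The second step can be made to fail. */
    static int runOrderFlow(DurableFlowSketch flow, AtomicInteger reserveCalls, boolean failPayment) {
        Integer reserved = flow.step("reserve", () -> {
            reserveCalls.incrementAndGet();
            return 3;
        });
        Integer charged = flow.step("charge", () -> {
            if (failPayment) {
                throw new IllegalStateException("payment gateway down");
            }
            return reserved * 10;
        });
        return charged;
    }

    public static void main(String[] args) {
        Map<String, Object> store = new HashMap<>();
        AtomicInteger reserveCalls = new AtomicInteger();
        try {
            runOrderFlow(new DurableFlowSketch("order-1", store), reserveCalls, true);
        } catch (IllegalStateException e) {
            System.out.println("first attempt failed: " + e.getMessage());
        }
        // Second attempt reuses the same store, so "reserve" is replayed, not re-run.
        int total = runOrderFlow(new DurableFlowSketch("order-1", store), reserveCalls, false);
        System.out.println("total=" + total + ", reserve executed " + reserveCalls.get() + " time(s)");
    }
}
```

In this sketch a crash between running a step body and committing its result would cause the body to run again on resume, which is why step bodies in such designs are usually written to be idempotent.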

Scalability impact – because the engine is just a library, you can run thousands of independent flows on a single JVM. No external orchestrator is required, which eliminates network latency and reduces the number of moving parts.

Consistency model – Persistasaurus guarantees exactly‑once execution for each step, similar to a transactional outbox pattern, but without the boilerplate. The state store is ACID‑compliant, so you get strong consistency for the workflow while the surrounding services can stay eventually consistent.

API pattern – the engine exposes a fluent builder that returns a DurableTask<T>; the task can be started, paused, or queried for status. This mirrors the popular command pattern but with built‑in persistence, and the fluent style should feel familiar to developers who already use Java streams.


3. Hardwood – a zero‑dependency, AI‑native Parquet parser

Hardwood was built to address the two pain points identified earlier: dependency bloat and parallel throughput.

  • Zero mandatory dependencies – the core parser uses only the JDK. Optional modules add compression (LZ4, GZIP) or S3 support, keeping the BOM under 200 KB.
  • Page‑level parallelism – Parquet files are split into pages; Hardwood schedules each page on a virtual‑thread pool. An adaptive balancer assigns more threads to columns with expensive encodings, achieving near‑linear scaling on a 32‑core machine.
  • AI‑first development – Claude Code generated most of the boilerplate and the initial page decoder implementations. A strict CLAUDE.md workflow forces a design document, ADRs, and code reviews before any AI‑generated snippet lands in main. This keeps the code maintainable and prevents the “copy‑paste‑bug” syndrome.
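
Page‑level parallelism as described in the list above can be sketched roughly as follows. This is not Hardwood's code: PageParallelSketch, decodePage, and the toy "one byte per value" encoding are hypothetical, and the adaptive balancer is left out. The executor is a parameter so the example runs on older JDKs; on JDK 21 or later, Executors.newVirtualThreadPerTaskExecutor() could be passed in to run each page on a virtual thread.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Hypothetical sketch of page-level parallel decoding; not Hardwood's implementation. */
class PageParallelSketch {

    /** Toy decoder: treats every byte of a page as one unsigned value. */
    static int[] decodePage(byte[] page) {
        int[] values = new int[page.length];
        for (int i = 0; i < page.length; i++) {
            values[i] = page[i] & 0xFF;
        }
        return values;
    }

    /** Submits one task per page, then concatenates the results in page order. */
    static int[] decodeColumn(List<byte[]> pages, ExecutorService executor) {
        List<Future<int[]>> futures = new ArrayList<>();
        for (byte[] page : pages) {
            futures.add(executor.submit(() -> decodePage(page)));
        }
        List<int[]> decoded = new ArrayList<>();
        int total = 0;
        try {
            for (Future<int[]> future : futures) {
                int[] part = future.get(); // waiting in submission order preserves page order
                decoded.add(part);
                total += part.length;
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while decoding", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("page decode failed", e.getCause());
        }
        int[] column = new int[total];
        int offset = 0;
        for (int[] part : decoded) {
            System.arraycopy(part, 0, column, offset, part.length);
            offset += part.length;
        }
        return column;
    }

    public static void main(String[] args) {
        // Fixed pool used here for portability; swap in a virtual-thread executor on JDK 21+.
        ExecutorService executor = Executors.newFixedThreadPool(4);
        try {
            List<byte[]> pages = List.of(new byte[] {1, 2}, new byte[] {3});
            System.out.println(Arrays.toString(decodeColumn(pages, executor)));
        } finally {
            executor.shutdown();
        }
    }
}
```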

Consistency guarantees – Hardwood validates row‑group statistics against the file footer before decoding. If a mismatch is detected, the parser aborts with a clear exception, preventing silent data corruption.
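
A fail‑fast metadata check of this kind might look like the sketch below. The types are hypothetical and not taken from Hardwood; the only Parquet fact relied on is that the footer records a total row count and each row group records its own row count. The min/max range test is an extra illustrative check.

```java
import java.util.List;

/** Hypothetical metadata check; names and fields are illustrative, not Hardwood's API. */
class FooterCheckSketch {

    /** Per-row-group metadata as it might be read from a file. */
    record RowGroupStats(long rowCount, long min, long max) {}

    /** Raised when the metadata is internally inconsistent. */
    static final class CorruptFileException extends RuntimeException {
        CorruptFileException(String message) {
            super(message);
        }
    }

    /** Throws if row groups disagree with the footer's row count or carry an inverted range. */
    static void validate(long footerRowCount, List<RowGroupStats> groups) {
        long sum = 0;
        for (int i = 0; i < groups.size(); i++) {
            RowGroupStats group = groups.get(i);
            boolean invertedRange = group.rowCount() > 0 && group.min() > group.max();
            if (group.rowCount() < 0 || invertedRange) {
                throw new CorruptFileException("row group " + i + " has invalid statistics: " + group);
            }
            sum += group.rowCount();
        }
        if (sum != footerRowCount) {
            throw new CorruptFileException(
                "footer declares " + footerRowCount + " rows, row groups sum to " + sum);
        }
    }

    public static void main(String[] args) {
        validate(5, List.of(new RowGroupStats(2, 0, 9), new RowGroupStats(3, 4, 7)));
        System.out.println("metadata is consistent");
    }
}
```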

API design – Two entry points are offered:

  1. RowReader – an iterator that yields Map<String, Object> or Avro records, ideal for object‑oriented processing.
  2. ColumnReader – returns primitive arrays (int[], double[]) for a single column, enabling zero‑copy vector‑API pipelines.

Both APIs are deliberately minimal; the public surface is limited to a handful of interfaces, which reduces the cognitive load for new contributors and eases versioning.
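
To make the difference between the two styles concrete, here is a small sketch that sums one column both ways. It does not use Hardwood's interfaces: ReaderStylesSketch and its methods are hypothetical stand‑ins that only mirror the shapes described above, with rows as Map<String, Object> and a column as a primitive double[].

```java
import java.util.List;
import java.util.Map;

/** Hypothetical comparison of row-oriented and column-oriented access; not Hardwood's API. */
class ReaderStylesSketch {

    /** Row style: every value arrives boxed inside a map and must be cast. */
    static double sumRows(Iterable<Map<String, Object>> rows, String column) {
        double sum = 0;
        for (Map<String, Object> row : rows) {
            sum += ((Number) row.get(column)).doubleValue();
        }
        return sum;
    }

    /** Column style: one primitive array, no boxing, a tight loop the JIT can optimize. */
    static double sumColumn(double[] values) {
        double sum = 0;
        for (double value : values) {
            sum += value;
        }
        return sum;
    }

    public static void main(String[] args) {
        List<Map<String, Object>> rows = List.of(
            Map.<String, Object>of("price", 1.5),
            Map.<String, Object>of("price", 2.5));
        System.out.println(sumRows(rows, "price"));
        System.out.println(sumColumn(new double[] {1.5, 2.5}));
    }
}
```

The row form is convenient when several fields of a record are needed together; the column form suits aggregations, since it avoids a map lookup and an unboxing step per value.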


Trade‑offs and what to watch out for

| Aspect | Benefit | Cost / Risk |
|---|---|---|
| JDK upgrade | Immediate performance boost, smaller GC pauses | Requires testing of legacy libraries; some frameworks still lag on JDK 17+ support |
| Durable execution | No external orchestrator, exactly‑once semantics | SQLite adds a native dependency; not suited for ultra‑high‑throughput (>10 k ops/s) without sharding |
| Virtual threads | Millions of lightweight tasks, simple code | Still experimental in some monitoring tools; thread‑dump analysis differs from OS threads |
| Hardwood zero‑dep | Small footprint, fast start‑up, easier supply‑chain security | Missing advanced features (e.g., column pruning on encrypted files) that Hadoop‑based parsers provide |
| AI‑generated code | Faster prototype cycles, consistent style | Requires vigilant human review; risk of hidden performance regressions |

In practice the combination works well for micro‑batch workloads (e.g., daily CDC ingestion, analytics pipelines that read a few hundred gigabytes per run). For real‑time streaming at millions of events per second, you may still need a dedicated native parser written in Rust or C++ and a separate orchestrator.


How to adopt these ideas in your own stack

  1. Audit your Java version – if you are on 8 or 11, schedule a migration to 17. Enable ZGC (-XX:+UseZGC) and measure pause times with JFR.
  2. Introduce Persistasaurus – replace ad‑hoc retry loops with a DurableTask wrapper. Start with a single flow (e.g., order validation) and expand.
  3. Swap the Parquet dependency – add io.github.gunnarhardwood:hardwood-core (see the GitHub repo, and confirm the current Maven coordinates there before pinning them). Begin with the RowReader API for compatibility, then benchmark the ColumnReader path for aggregation jobs.
  4. Set up a performance‑regression guard – integrate Apache Otava or a custom JMH suite that runs on each PR. Track both latency and allocation rates.
  5. Codify AI assistance – create a CLAUDE.md in your repo that mandates a design doc before any AI‑generated commit. Use the same pattern Morling described to keep the codebase clean.
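
For step 1, pause times can also be read programmatically through the JDK's JFR API. The sketch below is a minimal illustration: GcPauseProbe is a made‑up name, and it assumes the built‑in jdk.GarbageCollection event with its longestPause field is available on the JVM in use. It records one workload and returns the longest pause seen.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.time.Duration;
import jdk.jfr.Recording;
import jdk.jfr.consumer.RecordedEvent;
import jdk.jfr.consumer.RecordingFile;

/** Minimal sketch: run a workload under JFR and report the longest GC pause observed. */
class GcPauseProbe {

    static Duration longestPause(Runnable workload) {
        try (Recording recording = new Recording()) {
            recording.enable("jdk.GarbageCollection");
            recording.start();
            workload.run();
            recording.stop();

            Path file = Files.createTempFile("gc-pauses", ".jfr");
            try {
                recording.dump(file);
                Duration longest = Duration.ZERO;
                for (RecordedEvent event : RecordingFile.readAllEvents(file)) {
                    if (!event.getEventType().getName().equals("jdk.GarbageCollection")) {
                        continue; // skip any other event types in the file
                    }
                    Duration pause = event.getDuration("longestPause");
                    if (pause.compareTo(longest) > 0) {
                        longest = pause;
                    }
                }
                return longest; // Duration.ZERO if no collection happened
            } finally {
                Files.deleteIfExists(file);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // System.gc() is only a request; the JVM may ignore it.
        Duration pause = longestPause(System::gc);
        System.out.println("longest GC pause: " + pause.toNanos() / 1_000 + " µs");
    }
}
```

Comparing the result before and after switching collectors, on a realistic workload rather than a bare System.gc() call, gives a first data point for the regression guard in step 4.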

Looking ahead

The Hardwood project is still in beta, but its design shows a clear path toward a full‑stack, Java‑first data platform that can compete with native‑code alternatives while staying within the familiar Java ecosystem. The key lesson from the podcast is that performance, consistency, and API simplicity are not mutually exclusive; they become achievable when you combine a modern JVM, a durable execution model, and disciplined AI‑assisted development.


Author: Gunnar Morling (Java Champion, Confluent) – interview conducted by Olimpiu Pop, InfoQ editor.

