MYRA: Java’s Off‑Heap Engine for Sub‑Microsecond Latency

A recent post by Rohan Ray introduces MYRA (Memory Yielded, Rapid Access), a production‑grade Java library stack built on the Foreign Function & Memory (FFM) API, which was finalized in JDK 22 (JEP 454). The goal is clear: deliver deterministic, sub‑microsecond latency for high‑frequency trading, market data feeds, and other real‑time systems while preserving Java’s safety guarantees.

The Problem with Legacy Off‑Heap Approaches

Historically, Java developers have turned to sun.misc.Unsafe or JNI wrappers to access off‑heap memory. Both approaches suffer from stability and portability issues:

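  • Unsafe is an unsupported internal API with no compatibility guarantees; its memory‑access methods were deprecated for removal in JDK 23 (JEP 471), so code built on it faces a forced migration.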
  • JNI incurs heavy boilerplate, increases the attack surface, and can introduce hard‑to‑debug memory corruption.

FFM offers a standardized, safe alternative that exposes the same low‑level capabilities but within the JVM’s bounds‑checking and type‑safety mechanisms.
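To make the bounds‑checking claim concrete, here is a minimal sketch that uses only the standard java.lang.foreign API (JDK 22 or later). The class and method names are illustrative and are not taken from MYRA.

```java
import java.lang.foreign.Arena;
import java.lang.foreign.MemorySegment;
import java.lang.foreign.ValueLayout;

// Illustrative only: FfmBoundsDemo is a hypothetical name, not part of MYRA.
final class FfmBoundsDemo {

    // Writes a long to off-heap memory and reads it back.
    static long roundTrip(long value) {
        // A confined arena releases its native memory deterministically on close.
        try (Arena arena = Arena.ofConfined()) {
            MemorySegment segment = arena.allocate(ValueLayout.JAVA_LONG);
            segment.set(ValueLayout.JAVA_LONG, 0, value);
            return segment.get(ValueLayout.JAVA_LONG, 0);
        }
    }

    // Returns true if a read past the end of the segment is rejected
    // with an exception instead of touching arbitrary memory.
    static boolean outOfBoundsIsRejected() {
        try (Arena arena = Arena.ofConfined()) {
            MemorySegment segment = arena.allocate(ValueLayout.JAVA_LONG);
            segment.get(ValueLayout.JAVA_LONG, 8); // offset 8 is past the 8-byte segment
            return false;
        } catch (IndexOutOfBoundsException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip(42L));
        System.out.println(outOfBoundsIsRejected());
    }
}
```

With Unsafe, the equivalent out‑of‑range read could corrupt memory or crash the JVM; here it surfaces as an ordinary Java exception.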

MYRA’s Core Design Principles

The stack is built around four pillars that directly address the performance‑memory trade‑off:

  1. Zero GC – All data lives off‑heap in memory arenas; the garbage collector never touches the critical path.
  2. Zero Allocation – Reusable, stateless flyweight views replace object churn; the hot path never allocates on the heap.
  3. Zero Copy – Structured layouts allow direct reads/writes to raw memory, eliminating serialization overhead.
  4. Ultra‑Low Latency – Targeting < 30 µs mean latency with controlled tail behavior.
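The zero‑allocation and zero‑copy principles can be sketched with a reusable flyweight over a fixed layout. This is a hypothetical illustration on the plain FFM API (JDK 22 or later), not MYRA's actual API; the QuoteFlyweight class, its field names, and the two‑field record layout are assumptions made for the example.

```java
import java.lang.foreign.Arena;
import java.lang.foreign.MemoryLayout;
import java.lang.foreign.MemoryLayout.PathElement;
import java.lang.foreign.MemorySegment;
import java.lang.foreign.StructLayout;
import java.lang.foreign.ValueLayout;

// Hypothetical flyweight over a fixed-layout quote record; not MYRA's API.
final class QuoteFlyweight {
    // Assumed record layout: two 8-byte fields, 16 bytes per record.
    static final StructLayout LAYOUT = MemoryLayout.structLayout(
            ValueLayout.JAVA_LONG.withName("priceTicks"),
            ValueLayout.JAVA_LONG.withName("quantity"));
    static final long PRICE_OFFSET = LAYOUT.byteOffset(PathElement.groupElement("priceTicks"));
    static final long QTY_OFFSET = LAYOUT.byteOffset(PathElement.groupElement("quantity"));

    private MemorySegment segment; // backing off-heap memory
    private long base;             // byte offset of the current record

    // Re-points this view at record `index`; no object is allocated.
    QuoteFlyweight wrap(MemorySegment segment, long index) {
        this.segment = segment;
        this.base = index * LAYOUT.byteSize();
        return this;
    }

    // Accessors read and write raw memory directly; nothing is serialized or copied.
    long priceTicks() { return segment.get(ValueLayout.JAVA_LONG, base + PRICE_OFFSET); }
    long quantity()   { return segment.get(ValueLayout.JAVA_LONG, base + QTY_OFFSET); }

    QuoteFlyweight priceTicks(long v) { segment.set(ValueLayout.JAVA_LONG, base + PRICE_OFFSET, v); return this; }
    QuoteFlyweight quantity(long v)   { segment.set(ValueLayout.JAVA_LONG, base + QTY_OFFSET, v); return this; }

    // Fills `count` records off-heap, then sums their quantities with one reused view.
    static long sumQuantities(int count) {
        try (Arena arena = Arena.ofConfined()) {
            MemorySegment quotes = arena.allocate(LAYOUT.byteSize() * count, LAYOUT.byteAlignment());
            QuoteFlyweight view = new QuoteFlyweight(); // allocated once, outside the loops
            for (int i = 0; i < count; i++) {
                view.wrap(quotes, i).priceTicks(100L + i).quantity(10L * (i + 1));
            }
            long total = 0;
            for (int i = 0; i < count; i++) {
                total += view.wrap(quotes, i).quantity();
            }
            return total;
        }
    }

    public static void main(String[] args) {
        System.out.println(sumQuantities(3));
    }
}
```

The single view object is created once and re‑pointed at each record, so the loops themselves perform no heap allocation, and the record data never leaves the arena.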

Each principle is realized through a set of six libraries, five named and one planned:

  • roray-ffm-utils – Memory arenas and native resource handling.
  • myra-codec – Zero‑copy serialization.
  • myra-transport – Linux io_uring‑based networking.
  • express-rpc – A lightweight RPC framework.
  • jia-cache – Off‑heap caching.
  • (Future) – Additional utilities for metrics and diagnostics.

Benchmarks that Matter

The author reports on two key performance dimensions: serialization and networking.

Serialization

Using a realistic order‑book snapshot workload on an AWS c6a.4xlarge instance with JDK 25, MYRA leads the field in decode throughput:

Codec       | Decode (ops/s) | Encode (ops/s)
MYRA        | 4,150,079      | 1,911,781
SBE         | 2,204,557      | 4,990,071
FlatBuffers | 1,968,855      | 1,045,843
Kryo        | 1,322,754      | 1,342,611
Avro        | 454,553        | 466,816

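MYRA decodes about 1.9 times faster than SBE, the runner‑up, which suits the read‑heavy workloads common in trading and market data pipelines. Encoding is a different story: SBE encodes about 2.6 times faster than MYRA (4,990,071 vs. 1,911,781 ops/s).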

Networking

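A ping‑pong latency test on an ARM64 Graviton instance compares MYRA’s io_uring‑based transport with Netty and NIO. In the figures as listed here, MYRA clearly beats Netty, but the NIO row shows the lowest latency and highest throughput of the three, so the NIO figures should be checked against the source post: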

Transport  | Mean Latency (µs) | Throughput (ops/s)
MYRA_TOKEN | 28.70             | 34,843
Netty      | 39.34             | 25,417
NIO        | 13.22             | 75,645

The token‑based completion tracking strikes a balance between latency and consistency, delivering 27 % lower latency and 37 % higher throughput than Netty.
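The percentages quoted for MYRA versus Netty follow directly from the table, as the short check below shows. It also relies on an inference that is not stated in the source: with one request in flight at a time, throughput is roughly the reciprocal of mean latency, which matches the reported figures. The class name is hypothetical.

```java
// Illustrative arithmetic on the reported figures; PingPongMath is a hypothetical name.
final class PingPongMath {

    // Assumption: strict ping-pong, so ops/s is about 1 / mean round-trip latency.
    static double opsPerSecond(double meanLatencyMicros) {
        return 1_000_000.0 / meanLatencyMicros;
    }

    // Percentage by which `candidate` is lower than `baseline`.
    static double percentLower(double candidate, double baseline) {
        return (baseline - candidate) / baseline * 100.0;
    }

    // Percentage by which `candidate` is higher than `baseline`.
    static double percentHigher(double candidate, double baseline) {
        return (candidate - baseline) / baseline * 100.0;
    }

    public static void main(String[] args) {
        // MYRA_TOKEN vs. Netty mean latency, in microseconds.
        System.out.println(Math.round(percentLower(28.70, 39.34)));
        // MYRA_TOKEN vs. Netty throughput, in ops/s.
        System.out.println(Math.round(percentHigher(34_843, 25_417)));
        // Throughput implied by MYRA_TOKEN's mean latency.
        System.out.println(Math.round(opsPerSecond(28.70)));
    }
}
```

The three values come out at 27, 37, and 34843, in line with the 27 % and 37 % claims and with the throughput reported for MYRA_TOKEN.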

Why Java + FFM Beats C/C++/Rust for Most Real‑World Use Cases

The article addresses a common question: Why not write the entire stack in C++ or Rust?

Factor | C/C++ | Rust | Java + FFM (MYRA)
Memory Safety | Undefined behavior, manual checks | Ownership model, steep learning curve | Bounds‑checked, no segfaults
Performance Tuning | Manual SIMD, architecture‑specific code | Zero‑cost abstractions, but still requires expertise | Off‑heap access with deterministic behavior
Talent Pool | Scarce, high cost | Niche, crypto‑centric | Broad Java community
Tooling | gdb, perf, valgrind | rustc, cargo-watch | JDK profilers, Flight Recorder
Ecosystem | Limited to low‑level libs | Fragmented async runtimes | Mature Maven, Spring, Loom

C++ may still win on raw speed where latency budgets are around 1 µs, but the article puts the performance gap at 10‑15 % in many cases. For systems that process millions of messages per second, it argues that the gains in developer velocity and safety favor Java + MYRA.

Implications for the Industry

MYRA’s approach signals a shift in how latency‑critical Java applications are built:

  • Deterministic Off‑Heap – Eliminates GC pauses in the critical path, a long‑standing pain point for HFT firms.
  • Standardized API – Because the stack builds on the officially supported FFM API, it is covered by the JDK’s compatibility guarantees, so future releases are far less likely to break it than code built on internal APIs.
  • Open‑Source, No‑Enterprise Lock‑In – The author commits to a fully open‑source model, encouraging community contributions and reducing vendor lock‑in.

Developers in finance, ad‑tech, game servers, and IoT can now prototype high‑performance pipelines with the safety and tooling of the JVM, potentially reducing time‑to‑market by months.

Next Steps

The stack is slated for a public open‑source release by Christmas 2025. The author plans to continue work on optimization, documentation, and community engagement. For those interested in exploring FFM or building low‑latency Java services, the repository is available at github.com/mvp-express.

Source: https://www.roray.dev/blog/myra-stack/