pg_jitter: A Faster JIT Compilation Provider for PostgreSQL

Startups Reporter

A lightweight JIT compilation provider for PostgreSQL that delivers microsecond-level compilation times and competitive query execution across PostgreSQL 14–18.

PostgreSQL's JIT compilation, introduced in version 11, targets a critical performance problem: the inefficiency of interpreting expressions and using per-row loops for internal data conversions. However, the standard LLVM-based JIT comes with a notorious drawback: compilation times that range from tens to hundreds of milliseconds and sometimes reach seconds. For typical OLTP queries, this overhead can easily exceed the query execution time itself, making JIT impractical for anything but the heaviest OLAP-style workloads.

Enter pg_jitter, a lightweight JIT compilation provider that adds three alternative backends (sljit, AsmJit, and MIR), delivering faster compilation and competitive query execution across PostgreSQL 14–18.

The Performance Problem

The core issue pg_jitter addresses is straightforward: standard LLVM JIT compilation is too slow for many real-world scenarios. When compilation takes hundreds of milliseconds, it is only worthwhile for queries that take seconds to execute. For the vast majority of database operations (point lookups, small result sets, or high-frequency queries), the compilation overhead becomes the dominant cost.

pg_jitter flips this equation by providing native code generation with microsecond-level compilation times. While LLVM takes tens to hundreds of milliseconds, pg_jitter's backends compile in tens to low hundreds of microseconds (sljit), hundreds of microseconds (AsmJit), or hundreds of microseconds to single-digit milliseconds (MIR).
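The trade-off between compile time and execution savings can be made concrete with a little arithmetic. The sketch below is purely illustrative: the function names are hypothetical, the inputs are made-up numbers rather than measurements, and "time saved" is assumed to mean the fraction of interpreted execution time that compiled code avoids.

```c
#include <assert.h>
#include <stdbool.h>

/* Microseconds saved per execution when compiled code replaces the interpreter.
 * time_saved_fraction = 0.25 means the compiled query takes 25% less time. */
double demo_saving_us(double interp_exec_us, double time_saved_fraction)
{
    return interp_exec_us * time_saved_fraction;
}

/* JIT is a net win only when the saved execution time exceeds the compile time. */
bool demo_jit_pays_off(double interp_exec_us, double time_saved_fraction,
                       double compile_us)
{
    return demo_saving_us(interp_exec_us, time_saved_fraction) > compile_us;
}

/* Shortest interpreted run time at which compilation exactly breaks even. */
double demo_break_even_exec_us(double time_saved_fraction, double compile_us)
{
    assert(time_saved_fraction > 0.0);
    return compile_us / time_saved_fraction;
}
```

Under an assumed 25% saving, a 100-microsecond compile breaks even on a query that takes 400 microseconds interpreted, while a 100-millisecond compile needs a query of about 400 milliseconds before it pays for itself.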

Three Backends, Three Strengths

sljit emerges as the most consistent performer, delivering 5–25% faster execution than the interpreter across all workloads. Its compilation speed, in the tens to low hundreds of microseconds, makes it the best choice for most scenarios. Its library is also the smallest, at around 200 KB.

AsmJit excels on wide-row and deform-heavy queries, running up to 32% faster than the interpreter. Its strength lies in specialized tuple deforming operations, making it ideal for analytical workloads dealing with wide tables.

MIR provides solid gains while being the most portable backend. It supports a wide range of architectures including arm64, x86_64, s390x, ppc, mips, and risc-v, making it a good choice when targeting multiple platforms.

Interestingly, even without considering compilation performance differences, LLVM often performs worse than all three pg_jitter backends in execution time. This counterintuitive result stems from pg_jitter's zero-cost inlining using compile-time pre-extracted code and manual instruction-level optimization.

Architecture and Implementation

pg_jitter implements PostgreSQL's JitProviderCallbacks interface. When PostgreSQL decides to JIT-compile a query, it calls compile_expr(), which walks the ExprState->steps[] array and emits native machine code for approximately 30 hot-path opcodes. These include arithmetic operations, comparisons, variable access, tuple deforming, aggregation, boolean logic, and jumps.

For the remaining opcodes, pg_jitter delegates to pg_jitter_fallback_step(), which calls the corresponding ExecEval* C functions. This two-tier approach ensures comprehensive coverage while optimizing the most frequently used operations.
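The split between natively emitted opcodes and fallback calls can be pictured as a single pass over the step array. This is a minimal sketch, not pg_jitter's code: the DemoOp enum, DemoStep struct, and both functions are hypothetical stand-ins, and a counter takes the place of actual machine-code emission.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical opcode set; PostgreSQL's real opcode enum is far larger. */
typedef enum {
    OP_CONST, OP_INT4_ADD, OP_INT4_LT, OP_JUMP, OP_RARE_THING, OP_DONE
} DemoOp;

typedef struct { DemoOp opcode; } DemoStep;

typedef struct {
    size_t native_steps;    /* steps that would get inline machine code */
    size_t fallback_steps;  /* steps that would call a C helper instead */
} DemoStats;

/* Does this opcode belong to the hand-optimized hot path? */
bool demo_has_native_emitter(DemoOp op)
{
    switch (op) {
    case OP_CONST:
    case OP_INT4_ADD:
    case OP_INT4_LT:
    case OP_JUMP:
    case OP_DONE:
        return true;
    default:
        return false;
    }
}

/* Walk the steps once, choosing a tier for each one. */
DemoStats demo_compile_steps(const DemoStep *steps, size_t nsteps)
{
    DemoStats stats = {0, 0};

    assert(steps != NULL || nsteps == 0);
    for (size_t i = 0; i < nsteps; i++) {
        if (demo_has_native_emitter(steps[i].opcode))
            stats.native_steps++;    /* a real backend would emit native code */
        else
            stats.fallback_steps++;  /* a real backend would emit a fallback call */
    }
    return stats;
}
```

The design point is that every opcode is handled one way or the other, so adding a native emitter for another opcode only moves it from the second counter to the first.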

Two-Tier Function Optimization

A key innovation in pg_jitter is its two-tier function optimization strategy:

Tier 1 handles pass-by-value operations (int, float, bool, date, timestamp, OID) as direct native calls with inline overflow checking. This eliminates FunctionCallInfo overhead entirely.

Tier 2 handles pass-by-reference operations (numeric, text, interval, uuid) through DirectFunctionCall C wrappers. When built with optional LLVM or c2mir support, these can be further optimized.

Runtime Flexibility

One of pg_jitter's most practical features is its meta provider, which allows runtime backend switching without restarting PostgreSQL. Users can set jit_provider = 'pg_jitter' and then switch between backends using SET pg_jitter.backend = 'sljit' or similar commands. This flexibility means you can tune performance for specific workloads without downtime.

Memory Management and Stability

JIT-compiled code in pg_jitter is tied to PostgreSQL's ResourceOwner system. A PgJitterContext is created per query, and each compiled function is registered on a linked list with a backend-specific free callback. When the query's ResourceOwner is released, all compiled code is freed appropriately for each backend.
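The per-query bookkeeping described above resembles a singly linked list whose nodes each carry their own free callback. The following is a simplified sketch under that assumption: DemoContext, DemoCompiled, and the functions are hypothetical, plain malloc stands in for executable memory, and no actual ResourceOwner integration is shown.

```c
#include <assert.h>
#include <stdlib.h>

typedef void (*DemoFreeFn)(void *code);

/* One compiled function plus the backend-specific way to release it. */
typedef struct DemoCompiled {
    void                *code;
    DemoFreeFn           free_fn;
    struct DemoCompiled *next;
} DemoCompiled;

/* Per-query context holding every function compiled for that query. */
typedef struct {
    DemoCompiled *head;
    int           live;  /* registered functions not yet freed */
} DemoContext;

/* Stand-in free callback that counts how often it ran. */
int demo_freed_count = 0;

void demo_counting_free(void *code)
{
    free(code);
    demo_freed_count++;
}

/* Register compiled code; returns 0 on success, -1 if allocation fails. */
int demo_register(DemoContext *ctx, void *code, DemoFreeFn free_fn)
{
    DemoCompiled *node;

    assert(ctx != NULL && free_fn != NULL);
    node = malloc(sizeof *node);
    if (node == NULL)
        return -1;
    node->code = code;
    node->free_fn = free_fn;
    node->next = ctx->head;
    ctx->head = node;
    ctx->live++;
    return 0;
}

/* Called when the owning query ends: free everything exactly once. */
void demo_release(DemoContext *ctx)
{
    assert(ctx != NULL);
    while (ctx->head != NULL) {
        DemoCompiled *node = ctx->head;

        ctx->head = node->next;
        node->free_fn(node->code);  /* backend-specific cleanup */
        free(node);
        ctx->live--;
    }
}
```

Storing the callback in each node lets one release routine clean up code produced by different backends without knowing which backend produced it.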

The current source code is considered beta-quality: it passes all standard Postgres regression tests and shows good improvements in performance tests, but it lacks large-scale production verification. The project supports PostgreSQL 14–18 from a single codebase, using compile-time guards to handle version-specific differences.

Getting Started

Setting up pg_jitter requires PostgreSQL 14–18 with development headers, CMake >= 3.16, and C11/C++17 compilers. The build process supports all three backends simultaneously or individually, with options for custom PostgreSQL installations and precompiled function blobs for zero-cost inlining.

Configuration is straightforward: set jit_provider to either a specific backend (e.g., 'pg_jitter_sljit') or the meta provider ('pg_jitter'), then reload the configuration. For the meta provider, you can switch backends on the fly using the pg_jitter.backend GUC.

Performance Considerations

While pg_jitter dramatically reduces compilation overhead, execution can still slow down by up to ~1 ms, even with the fastest backends, because of cold processor cache effects and increased memory pressure from rapid allocations during code generation.

For systems executing many queries per second, it is recommended to avoid JIT compilation for very fast queries such as point lookups or queries processing only a few records. The jit_above_cost parameter defaults to a very high value (100,000) to reflect this consideration, though users may want to lower it to roughly 200 to 5,000, depending on their backend and workload.
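The effect of lowering the threshold can be sketched as a simple gate. This assumes the behavior described in PostgreSQL's documentation for jit_above_cost (JIT is considered only when the estimated plan cost exceeds the threshold, and -1 disables it); the function name and the sample costs are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Decide whether a plan is expensive enough to be JIT-compiled. */
bool demo_should_jit(bool jit_enabled, double jit_above_cost,
                     double plan_total_cost)
{
    assert(plan_total_cost >= 0.0);   /* planner cost estimates are non-negative */
    if (!jit_enabled || jit_above_cost < 0.0)
        return false;                 /* JIT off, or threshold set to -1 */
    return plan_total_cost > jit_above_cost;
}
```

With the default threshold of 100,000, a plan with an estimated cost of 850 stays on the interpreter; lowering the threshold to 500 would make the same plan eligible for compilation.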

The Bottom Line

pg_jitter represents a significant advancement in PostgreSQL's JIT capabilities. By providing microsecond-level compilation times and competitive execution performance across multiple backends, it makes JIT compilation practical for a much wider range of queries than the standard LLVM implementation.

Whether you're running analytical workloads on wide tables, need the flexibility of runtime backend switching, or simply want to eliminate the compilation overhead that makes standard JIT impractical for your use case, pg_jitter offers a compelling alternative that's worth exploring.

The project is actively developed and welcomes testing on additional platforms. If you've had success (or issues) running it on platforms beyond the tested Linux/macOS/ARM64 and Linux/x86_64 combinations, the maintainer encourages reaching out at [email protected].

[IMAGE:1]

[IMAGE:2]
