QBE 1.3: Faster, Smarter, and More Portable Compiler Backend

QBE 1.3 adds a novel IL‑matching engine, several performance‑focused optimisation passes, Windows ABI support, and position‑independent code generation. The new passes lift QBE from about 40 % to roughly 63 % of gcc -O2 on CoreMark (70 % once the benchmark is hand‑tuned) and cut the runtime of a real‑world test suite by about a third.

Thesis

QBE 1.3 represents the most substantial evolution of the project since the original 1.0 release, delivering three intertwined advances: measurable speed gains, a programmable instruction‑selection pipeline, and broader platform support. Together these changes narrow the gap between QBE and mainstream optimising compilers while preserving the simplicity that makes QBE attractive for research and teaching.

Key arguments

1. Performance gains rooted in targeted optimisation passes

Benchmark‑driven focus – The team used the CoreMark suite as a concrete yardstick, discovering that QBE 1.2 was delivering only about 40 % of the performance of gcc -O2. By profiling the benchmark they isolated two hot functions (ee_isdigit and crcu8) whose non‑idiomatic C implementations inflated the runtime.
Selective pass suite – Rather than enable every experimental transformation, the release ships a curated set of passes (global value numbering, global code motion, loop optimisation, dead‑branch elimination, CFG simplification). On CoreMark these passes raise the score to roughly 63 % of gcc’s, and when the benchmark is manually tuned to inline the tiny digit test and replace the CRC routine with a table‑lookup, the 70 % target is met.
Real‑world impact – The Hare test suite, a more representative workload for QBE users, finishes in 1.7 s instead of 2.6 s with the new passes, about a third less time, confirming that the improvements are not limited to synthetic benchmarks.
Inlining deferred – Inlining remains excluded because it conflicts with QBE’s streaming, per‑function compilation model. The developers acknowledge this limitation and plan to revisit it once the model evolves.
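
The manual CoreMark tuning described above (a hand‑inlined digit test and a table‑driven CRC) can be sketched in C. This is a minimal illustration under stated assumptions, not CoreMark's or QBE's source: the names crc16_bitwise, crc16_lookup, crc16_buf, and is_digit are hypothetical, and the CRC shown is a generic CRC‑16 with the reflected polynomial 0xA001.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bit-at-a-time CRC-16 step (reflected polynomial 0xA001).
 * Generic illustration; NOT copied from CoreMark's crcu8. */
static uint16_t crc16_bitwise(uint8_t data, uint16_t crc)
{
    crc ^= data;
    for (int i = 0; i < 8; i++)
        crc = (crc & 1) ? (uint16_t)((crc >> 1) ^ 0xA001)
                        : (uint16_t)(crc >> 1);
    return crc;
}

static uint16_t crc16_table[256];

/* Fill the table once: entry i is the CRC of byte i from a zero state. */
static void crc16_init(void)
{
    for (int i = 0; i < 256; i++)
        crc16_table[i] = crc16_bitwise((uint8_t)i, 0);
}

/* Table-driven step: one lookup replaces the eight-iteration loop.
 * crc16_init() must have been called first. */
static uint16_t crc16_lookup(uint8_t data, uint16_t crc)
{
    return (uint16_t)((crc >> 8) ^ crc16_table[(crc ^ data) & 0xFF]);
}

/* CRC over a buffer using the table-driven step. */
static uint16_t crc16_buf(const uint8_t *p, size_t n, uint16_t crc)
{
    assert(p != NULL || n == 0);
    while (n--)
        crc = crc16_lookup(*p++, crc);
    return crc;
}

/* Digit test written as a single unsigned comparison, small enough
 * to paste into the caller when the compiler cannot inline it. */
static inline int is_digit(uint8_t c)
{
    return (unsigned)(c - '0') < 10u;
}
```

Both CRC steps compute the same value; the table version trades 512 bytes of memory for the removal of a data‑dependent loop, which matters more for a backend that does not unroll or inline on its own.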

2. A metaprogramming layer for instruction selection

From hand‑written to generated matchers – Historically QBE matched instruction DAGs using a bottom‑up tree‑numbering algorithm derived from Ken Thompson’s Plan 9 compiler. The new mgen tool, written in OCaml, consumes specially‑annotated comment blocks that contain IL patterns and emits idiomatic C matchers.
Pattern representation – Each IL node receives a bitset indicating which top‑level patterns it satisfies. Hand‑written selector logic can then pick the most appropriate rule. Variables in patterns are collected by a tiny bytecode interpreter generated by mgen (the runmatch() function).
Future extensibility – Because the matcher is generated from declarative patterns, extending QBE with new back‑ends or optimisation recognisers (e.g., detecting bit‑rotation idioms) becomes a matter of adding pattern specifications rather than rewriting low‑level C code.
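
The bitset idea can be illustrated with a hand‑written miniature. The sketch below is not mgen output and does not use QBE's data structures; the IL (OP_TMP, OP_CON, OP_ADD, OP_MUL), the pattern names, and the functions label and pick are all hypothetical. It labels each node bottom‑up with the set of patterns it satisfies, then lets a small selector choose among them.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical miniature IL; none of these names come from QBE. */
enum Op { OP_TMP, OP_CON, OP_ADD, OP_MUL };

/* One bit per top-level pattern a node can satisfy. */
enum {
    PAT_TMP   = 1u << 0,  /* any value held in a temporary */
    PAT_CON   = 1u << 1,  /* a constant                    */
    PAT_SCALE = 1u << 2,  /* mul(tmp, con)                 */
    PAT_ADDR  = 1u << 3   /* add(tmp, mul(tmp, con))       */
};

struct Node {
    enum Op op;
    struct Node *l, *r;   /* NULL for leaves   */
    unsigned pats;        /* filled by label() */
};

/* True when operands a and b carry the wanted bits in either order,
 * which handles commutative operators. */
static int pair(unsigned a, unsigned b, unsigned wa, unsigned wb)
{
    return ((a & wa) && (b & wb)) || ((a & wb) && (b & wa));
}

/* Bottom-up labelling: children first, then the node itself. */
static unsigned label(struct Node *n)
{
    unsigned l, r;

    assert(n != NULL);
    if (n->op == OP_TMP)
        return n->pats = PAT_TMP;
    if (n->op == OP_CON)
        return n->pats = PAT_CON;

    assert(n->l != NULL && n->r != NULL);
    l = label(n->l);
    r = label(n->r);
    n->pats = PAT_TMP;    /* any computed result can sit in a temporary */
    if (n->op == OP_MUL && pair(l, r, PAT_TMP, PAT_CON))
        n->pats |= PAT_SCALE;
    if (n->op == OP_ADD && pair(l, r, PAT_TMP, PAT_SCALE))
        n->pats |= PAT_ADDR;
    return n->pats;
}

/* Hand-written selector: prefer the most specific pattern available. */
static unsigned pick(const struct Node *n)
{
    if (n->pats & PAT_ADDR)
        return PAT_ADDR;
    if (n->pats & PAT_SCALE)
        return PAT_SCALE;
    return n->pats & (PAT_TMP | PAT_CON);
}
```

Because a node keeps every matching bit rather than a single winner, the choice of rule stays in ordinary code, which mirrors the division of labour described above between generated matching and hand‑written selection.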

3. Platform reach: Windows ABI and position‑independent code

Windows support – Scott Graham contributed an implementation of the Windows AMD64 ABI. The assembler output remains AT&T syntax, making it compatible with the MinGW toolchain. Users can now target Windows simply with -t amd64_win.
Position‑independent code (PIC) – A new IL flag DYNCONST enables indirect access to globals via a dynamic constant, the analogue of a GOT entry on ELF platforms. This makes it possible to emit shared objects and link against dynamically‑loaded libraries on most targets, a capability that previously required hand‑crafted workarounds.
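
To show what a separate Windows ABI target has to change, the sketch below contrasts how the System V AMD64 and Windows x64 calling conventions assign registers to integer arguments. It is an illustration only, not code from QBE: int_arg_reg and shadow_space are hypothetical helpers, and the model assumes every argument is an integer or pointer.

```c
#include <assert.h>
#include <stddef.h>

enum Abi { ABI_SYSV, ABI_WIN64 };

/* Register holding the index-th integer argument (0-based), or NULL
 * when that argument is passed on the stack. Assumes all arguments
 * are integers or pointers. */
static const char *int_arg_reg(enum Abi abi, size_t index)
{
    static const char *const sysv[]  = { "rdi", "rsi", "rdx", "rcx", "r8", "r9" };
    static const char *const win64[] = { "rcx", "rdx", "r8", "r9" };

    assert(abi == ABI_SYSV || abi == ABI_WIN64);
    if (abi == ABI_SYSV)
        return index < 6 ? sysv[index] : NULL;
    return index < 4 ? win64[index] : NULL;
}

/* Bytes the caller reserves on the stack for register arguments:
 * the Windows convention requires 32 bytes of shadow space,
 * System V requires none. */
static size_t shadow_space(enum Abi abi)
{
    return abi == ABI_WIN64 ? 32 : 0;
}
```

The fifth integer argument, for example, travels in r8 under System V but on the stack under the Windows convention, so argument lowering cannot be shared unchanged between the two targets.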

Implications

Educational value – The mgen pipeline demonstrates how a compiler can move from hard‑coded heuristics to a data‑driven matcher, offering a concrete case study for courses on compiler construction.
Research foothold – By exposing a programmable IL‑matching layer, QBE becomes a more attractive platform for experiments in instruction selection, peephole optimisation, and domain‑specific code generation.
Adoption barrier lowered – Windows and PIC support open QBE to a wider audience, especially developers who need a tiny, embeddable backend for cross‑compilation or sandboxed environments.
Performance trajectory – Although QBE still trails aggressive optimisers such as clang -O3, the roughly one‑third cut in Hare test‑suite runtime and the climb from about 40 % to about 63 % of gcc -O2 on CoreMark (70 % with a hand‑tuned benchmark) indicate that continued pass engineering could eventually place QBE in the same performance tier as lightweight commercial compilers.

Counter‑perspectives

Inlining limitation – The decision to omit inlining may frustrate users who rely on aggressive function expansion for tight loops. Until the streaming model is reconciled with inlining, such workloads will continue to hit a ceiling on achievable speed.
Complexity vs simplicity – Adding a metaprogramming toolchain and new IL flags inevitably increases the codebase and build dependencies (OCaml, generated C). Purists who value QBE’s minimalism might view this as a drift away from the original philosophy.
Windows ABI maturity – The Windows backend has not been extensively tested on actual Windows machines; users may encounter edge cases in calling conventions or exception handling that are not yet covered.

Overall, QBE 1.3 delivers a balanced set of enhancements that improve raw performance, extend the compiler’s expressive power, and broaden its platform reach, all while keeping the core design principles intact.

For a deeper look at the generated matcher code, see the isel.c source file in the QBE repository.
