Building a Multi-Target Compiler Backend from Scratch: The No-LLVM Approach
Summary
An 18-year-old developer is documenting his journey of building a multi-target compiler backend from scratch, without relying on LLVM or other frameworks. The project aims to create a compiler that can generate optimized code for multiple architectures (x86-64, SPIR-V, ARM64, RISC-V, WebAssembly) with fine-grained control over SIMD operations and security-hardened code generation.
Introduction
Gideon, an 18-year-old with three years of C++ experience, is undertaking an ambitious project: building a complete compiler backend from first principles. Having previously written ray tracers and video codecs without frameworks, Gideon is now tackling the complex world of compiler construction with the same ground-up approach.
"What I'm actually building is not a programming language," Gideon explains, "but a compiler backend toolkit—the part that turns intermediate representation into fast machine code across multiple targets."
The Architecture: A Custom Compiler Pipeline
The compiler follows a multi-stage pipeline:
Source → Parser → AST → SSMOL (HIR) → MREL (LIR) → x86-64 / SPIR-V / ARM64 / RISC-V / WASM
At the heart of this pipeline are two custom intermediate representations:
SSMOL (Static Single Assignment Meta-Object Language): A high-level IR in SSA form that carries type information, ownership rules, and program semantics.
MREL (Machine Representation Expression Language): A target-agnostic low-level IR that handles virtual registers, stack slots, and machine operations without knowledge of physical register names.
The backend then translates MREL into specific machine code for each target architecture, handling register allocation, instruction selection, and final code emission per target.
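To make the split between a target-agnostic IR and a per-target backend concrete, here is a minimal sketch in C++. It is not taken from Gideon's specifications: the `Op`, `Inst`, and `lower_to_x86_64` names are hypothetical, the instruction set has only two operations, and the "register allocator" simply maps virtual registers 0 to 2 onto `eax`, `ecx`, and `edx`.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical MREL-style operations; the real MREL opcode set is not shown in the article.
enum class Op { LoadImm, Ret };

struct Inst {
    Op op;
    std::uint32_t vreg;  // virtual register: destination for LoadImm, source for Ret
    std::int32_t imm;    // immediate operand, only used by LoadImm
};

// Lowers the toy IR to raw x86-64 machine code bytes.
// Toy allocation (an assumption for this sketch): vreg 0 -> eax, 1 -> ecx, 2 -> edx.
std::vector<std::uint8_t> lower_to_x86_64(const std::vector<Inst>& code) {
    std::vector<std::uint8_t> out;
    for (const Inst& inst : code) {
        assert(inst.vreg < 3 && "toy allocator only knows three registers");
        const auto phys = static_cast<std::uint8_t>(inst.vreg);
        switch (inst.op) {
        case Op::LoadImm: {
            // mov r32, imm32 is encoded as (0xB8 + register code) followed by
            // the immediate in little-endian byte order.
            out.push_back(static_cast<std::uint8_t>(0xB8 + phys));
            const auto bits = static_cast<std::uint32_t>(inst.imm);
            for (int i = 0; i < 4; ++i) {
                out.push_back(static_cast<std::uint8_t>(bits >> (8 * i)));
            }
            break;
        }
        case Op::Ret:
            // The integer return value lives in eax, so copy it there if needed:
            // mov eax, r32 is 0x89 with ModRM = 11 (reg = source) (rm = eax).
            if (phys != 0) {
                out.push_back(0x89);
                out.push_back(static_cast<std::uint8_t>(0xC0 | (phys << 3)));
            }
            out.push_back(0xC3);  // ret
            break;
        }
    }
    return out;
}
```

For the two-instruction program `LoadImm v0, 42; Ret v0`, this produces the bytes `B8 2A 00 00 00 C3`, which disassemble to `mov eax, 42; ret`. A real backend would add instruction selection, a proper register allocator, and an object-file writer around this core.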
Why Not Just Use LLVM?
The decision to build from scratch rather than leverage LLVM, a mature and widely used compiler infrastructure, deserves examination. LLVM comprises approximately 4 million lines of code and is a general-purpose solution: in trying to serve everyone's needs, it serves no one's perfectly.
Gideon's requirements include:
Fine-grained control over SIMD width selection per target: Essential for optimizing vector operations across different hardware capabilities.
Constant-time crypto primitive emission with secret register annotations: Critical for security-sensitive applications where timing attacks are a concern.
Security obfuscation passes: Including control flow flattening and opaque predicates to protect intellectual property and hinder reverse engineering.
Complete understanding and licensing control: A codebase that Gideon can fully comprehend and license without restrictions.
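As an illustration of the constant-time requirement, the sketch below shows the branch-free patterns such a backend would want to preserve for values marked secret. The function names `ct_select` and `ct_equal` are hypothetical and not part of Gideon's project. Writing code this way in C++ does not by itself guarantee constant-time machine code, since an optimizer is free to reintroduce branches; that gap is one motivation for controlling emission in the backend.

```cpp
#include <cstddef>
#include <cstdint>

// Returns a if the low bit of cond is 1, otherwise b, without branching on cond.
inline std::uint32_t ct_select(std::uint32_t cond, std::uint32_t a, std::uint32_t b) {
    // mask is 0xFFFFFFFF when the low bit of cond is set, and 0 otherwise.
    const std::uint32_t mask = 0u - (cond & 1u);
    return (a & mask) | (b & ~mask);
}

// Compares two buffers of length n without an early exit, so the number of
// loop iterations does not depend on where the first mismatch occurs.
inline bool ct_equal(const std::uint8_t* x, const std::uint8_t* y, std::size_t n) {
    std::uint8_t diff = 0;
    for (std::size_t i = 0; i < n; ++i) {
        diff = static_cast<std::uint8_t>(diff | (x[i] ^ y[i]));
    }
    return diff == 0;
}
```

A secret-register annotation in the IR could let the backend reject any lowering that turns these patterns into a conditional jump or a data-dependent memory access.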
Building from scratch is undeniably slower, but it offers complete ownership over every design decision and implementation choice.
Current Status: The Parser Stage
Gideon is currently in the parser stage, implementing a hand-written recursive descent parser for a C++-like language. The language includes:
- Functions and structs
- Basic type system
- Ownership semantics (inspired by Rust but simplified)
- Explicit SIMD types (v128, v256, v512)
The parser emits an Abstract Syntax Tree (AST), which is then lowered to SSMOL.
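For readers unfamiliar with the technique, a hand-written recursive descent parser uses one function per grammar rule, and each function builds the AST node for the rule it recognizes. The sketch below applies the idea to a toy arithmetic grammar, not to Gideon's C++-like language; `Node`, `Parser`, and `eval` are hypothetical names chosen for this example.

```cpp
#include <cctype>
#include <cstddef>
#include <memory>
#include <stdexcept>
#include <string_view>
#include <utility>

// AST node: op == 0 marks a number literal, otherwise op is '+', '-' or '*'.
struct Node {
    char op = 0;
    long value = 0;
    std::unique_ptr<Node> lhs, rhs;
};

// Grammar:  expr   := term (('+' | '-') term)*
//           term   := factor ('*' factor)*
//           factor := number | '(' expr ')'
class Parser {
public:
    explicit Parser(std::string_view src) : src_(src) {}

    std::unique_ptr<Node> parse() {
        auto root = expr();
        if (peek() != '\0') throw std::runtime_error("unexpected trailing input");
        return root;
    }

private:
    // Skips whitespace and returns the next character, or '\0' at end of input.
    char peek() {
        while (pos_ < src_.size() && std::isspace(static_cast<unsigned char>(src_[pos_]))) ++pos_;
        return pos_ < src_.size() ? src_[pos_] : '\0';
    }

    static std::unique_ptr<Node> binary(char op, std::unique_ptr<Node> l, std::unique_ptr<Node> r) {
        auto n = std::make_unique<Node>();
        n->op = op;
        n->lhs = std::move(l);
        n->rhs = std::move(r);
        return n;
    }

    std::unique_ptr<Node> expr() {
        auto left = term();
        while (peek() == '+' || peek() == '-') {
            const char op = src_[pos_++];
            auto right = term();
            left = binary(op, std::move(left), std::move(right));
        }
        return left;
    }

    std::unique_ptr<Node> term() {
        auto left = factor();
        while (peek() == '*') {
            ++pos_;
            auto right = factor();
            left = binary('*', std::move(left), std::move(right));
        }
        return left;
    }

    std::unique_ptr<Node> factor() {
        const char c = peek();
        if (c == '(') {
            ++pos_;
            auto inner = expr();
            if (peek() != ')') throw std::runtime_error("expected ')'");
            ++pos_;
            return inner;
        }
        if (!std::isdigit(static_cast<unsigned char>(c))) throw std::runtime_error("expected number");
        auto node = std::make_unique<Node>();
        while (pos_ < src_.size() && std::isdigit(static_cast<unsigned char>(src_[pos_]))) {
            node->value = node->value * 10 + (src_[pos_++] - '0');
        }
        return node;
    }

    std::string_view src_;
    std::size_t pos_ = 0;
};

// Walks the AST; a compiler would lower the tree to an IR here instead of evaluating it.
long eval(const Node& n) {
    switch (n.op) {
    case '+': return eval(*n.lhs) + eval(*n.rhs);
    case '-': return eval(*n.lhs) - eval(*n.rhs);
    case '*': return eval(*n.lhs) * eval(*n.rhs);
    default:  return n.value;
    }
}
```

Operator precedence falls out of the call structure: because `expr` calls `term`, multiplication binds tighter than addition, so `Parser("1 + 2 * 3").parse()` yields a tree whose root is `+`.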
The Road Ahead
The immediate next steps involve:
SSMOL → MREL lowering: Converting high-level types to sizes, structs to offsets, and control flow to basic blocks.
MREL → x86-64 backend: Implementing register allocation, instruction selection, and ELF emission for the x86-64 architecture.
A working program: Getting a simple program (main() returning 42) to compile, link, and run successfully.
Expanding to other targets: Implementing support for SPIR-V compute kernels, followed by ARM64, RISC-V, and WebAssembly backends.
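The "structs to offsets" part of the first step can be sketched as a small layout pass. This is an assumption about how such a lowering might work, following the usual C-style rule of aligning each field to its natural alignment; `FieldType`, `StructLayout`, and `layout_struct` are hypothetical names, and the article does not describe Gideon's actual layout rules.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Size and alignment of a field in bytes, as known after type lowering.
struct FieldType {
    std::size_t size;
    std::size_t align;
};

struct StructLayout {
    std::vector<std::size_t> offsets;  // byte offset of each field, in declaration order
    std::size_t size = 0;              // total size including tail padding
    std::size_t align = 1;             // alignment of the struct as a whole
};

// Rounds value up to the next multiple of align, which must be a power of two.
inline std::size_t align_up(std::size_t value, std::size_t align) {
    assert(align != 0 && (align & (align - 1)) == 0);
    return (value + align - 1) & ~(align - 1);
}

// Places fields in declaration order, inserting padding where alignment requires it.
StructLayout layout_struct(const std::vector<FieldType>& fields) {
    StructLayout out;
    std::size_t cursor = 0;
    for (const FieldType& f : fields) {
        cursor = align_up(cursor, f.align);
        out.offsets.push_back(cursor);
        cursor += f.size;
        if (f.align > out.align) out.align = f.align;
    }
    // Tail padding keeps every element of an array of this struct aligned.
    out.size = align_up(cursor, out.align);
    return out;
}
```

For a struct with a 1-byte, a 4-byte, and an 8-byte field (each naturally aligned), this yields offsets 0, 4, and 8, a total size of 16, and an alignment of 8. Once offsets are fixed, a field access in the high-level IR becomes a plain base-plus-offset memory operation in the low-level IR.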
Documenting the Process
What makes this project particularly valuable is Gideon's commitment to documenting the build process in real time. Rather than polished tutorials, he promises "build logs from someone actually building," focusing on:
- Problems that took days to solve
- Technical specifications written to maintain design coherence
- Wrong assumptions that cost significant time
This approach offers an unfiltered look at the challenges and triumphs of compiler development, providing practical insights for others working in systems programming, compiler development, or graphics programming.
Broader Implications
Gideon's project represents a significant undertaking in the world of compiler development. In an era where most compilers are built atop massive frameworks like LLVM, building from scratch is a rare and ambitious endeavor.
The project's focus on security-hardened code generation and fine-grained control over optimization targets addresses specific needs that general-purpose compiler infrastructures may not fully satisfy. For developers working in security-sensitive domains or requiring highly specialized optimizations, such a custom backend could offer substantial advantages.
Furthermore, documenting the entire process provides a valuable educational resource for aspiring compiler developers, offering insights into the practical challenges of compiler construction that are often glossed over in academic treatments.
Following the Journey
Gideon plans to document each stage of the project as he reaches it, providing regular updates on his progress. The series is likely to interest:
- Systems programmers and compiler developers
- Graphics programmers interested in SPIR-V generation
- Developers curious about what "building from scratch" actually entails
- Anyone interested in seeing whether the project succeeds or fails
The GitHub repository for the project is available at https://github.com/ayndlr. Gideon has also shared technical specifications for the MREL backend there; they cover x86-64, ARM64, RISC-V, SPIR-V, and WebAssembly, and include calling conventions, opcode tables, and security passes.
Conclusion
Building a multi-target compiler backend from scratch is an ambitious undertaking that requires deep knowledge of computer architecture, algorithms, and programming language design. Gideon's project offers a fascinating look into this complex process, with its emphasis on practical implementation over theoretical perfection.
As the project progresses, it will be interesting to see how the custom compiler backend compares to established solutions in terms of performance, code quality, and maintainability. Regardless of the ultimate outcome, the documentation of this journey provides a valuable contribution to the field of compiler development.
