OCaml-Powered Static Search: How One Developer Built a Zero-Dependency Full-Text Engine
For developers committed to static site performance, adding dynamic features like full-text search presents a paradox: how do you deliver interactive functionality without compromising the core principles of speed and simplicity? OCaml developer Alex Leighton tackled this challenge head-on by engineering a custom solution that shares identical code between static generation and client execution, all while keeping dependencies lean and payloads small.
The Static Search Conundrum
Static sites excel at delivering content quickly by pre-rendering HTML. But traditional search implementations often rely on server-side APIs or bulky JavaScript libraries, introducing latency, complexity, or heavy dependencies. Leighton’s requirements were uncompromising:
- Minimal footprint: Total solution under 100KB
- OCaml-native toolchain: Avoid maintaining polyglot build pipelines
- No external services: Preserve the static site’s self-contained nature
Architecture: OCaml Everywhere
Leighton’s approach leverages OCaml’s versatility across environments:
Build-Time Indexing
- Modified the search OCaml library to strip prefixes under 3 characters and integrate InnoDB stop words
- Added field-specific weighting (5× for titles, 1× for bodies/descriptions)
- Serialized the index using space-efficient Base85 encoding

    type document = {
      id : int;
      name : string;
      description : string;
      body : string;
      url : string;
      created : string; (* ISO8601 *)
    }

    let add_document t doc =
      Index_impl.add_document t.core doc.id doc;
      t.docs <- doc :: t.docs
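The indexing rules listed above can be sketched with the standard library alone. This is an illustration, not the author's code or the search library's API: every name here (`tokenize`, `prefixes`, `index_terms`, the stand-in stop-word list) is hypothetical, and it assumes one reading of "strip prefixes under 3 characters", namely that the index stores token prefixes for prefix matching and drops those shorter than three characters.

```ocaml
(* Hypothetical sketch of the indexing rules; names are illustrative and
   not taken from the search library or the author's code. *)
module StringSet = Set.Make (String)

(* A small stand-in stop-word list, NOT the actual InnoDB list. *)
let stop_words = StringSet.of_list [ "the"; "for"; "with"; "this"; "that" ]

(* Split on anything that is not an ASCII letter or digit; lowercase tokens. *)
let tokenize (s : string) : string list =
  let buf = Buffer.create 16 in
  let out = ref [] in
  let flush () =
    if Buffer.length buf > 0 then begin
      out := Buffer.contents buf :: !out;
      Buffer.clear buf
    end
  in
  String.iter
    (fun c ->
      match c with
      | 'a' .. 'z' | '0' .. '9' -> Buffer.add_char buf c
      | 'A' .. 'Z' -> Buffer.add_char buf (Char.lowercase_ascii c)
      | _ -> flush ())
    s;
  flush ();
  List.rev !out

(* Assumed interpretation: keep only prefixes of at least 3 characters. *)
let min_prefix_len = 3

let prefixes (tok : string) : string list =
  let n = String.length tok in
  if n < min_prefix_len then []
  else
    List.init (n - min_prefix_len + 1) (fun i ->
        String.sub tok 0 (i + min_prefix_len))

(* Field weights from the article: 5x for titles, 1x for body text. *)
let title_weight = 5
let body_weight = 1

(* Weighted prefix scores for one document, sorted by prefix. *)
let index_terms ~title ~body : (string * int) list =
  let tbl = Hashtbl.create 16 in
  let add weight tok =
    if not (StringSet.mem tok stop_words) then
      List.iter
        (fun p ->
          let prev = Option.value (Hashtbl.find_opt tbl p) ~default:0 in
          Hashtbl.replace tbl p (prev + weight))
        (prefixes tok)
  in
  List.iter (add title_weight) (tokenize title);
  List.iter (add body_weight) (tokenize body);
  List.sort compare (Hashtbl.fold (fun k v acc -> (k, v) :: acc) tbl [])
```

Under these assumptions, a term that appears once in the title and once in the body scores 6 for each of its stored prefixes, while stop words and tokens shorter than three characters contribute nothing.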
Client-Side Execution via js_of_ocaml
- Compiled the search library to JavaScript with dead-code elimination (--opt=3)
- Embedded serialized index directly in HTML:

    <script id="search-index" type="text/plain">{compressed_index}</script>
    <script src="/search-client.bc.js"></script>
- Exposed a minimal JavaScript API:

    let js_query q = SI.search ~limit:10 !index (to_string q)

    let () =
      Js.Unsafe.set Js.Unsafe.global "searchClient"
        (obj [| "query", wrap_callback js_query |])
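To make the `~limit:10` query call above more concrete, here is a rough sketch of what a limited, ranked lookup over an in-memory inverted index could look like. It is not the `SI.search` implementation: the `index` type, the postings layout, and the tie-breaking rule are all assumptions made for illustration.

```ocaml
(* Hypothetical inverted index: term -> postings of (document id, weight).
   This layout is an assumption, not the search library's real structure. *)
type index = (string, (int * int) list) Hashtbl.t

(* Sum the weights of every matching term per document, then return at most
   [limit] (document id, score) pairs, highest score first. Ties are broken
   by ascending document id so the result order is deterministic. *)
let search ?(limit = 10) (idx : index) (terms : string list) :
    (int * int) list =
  let scores = Hashtbl.create 16 in
  List.iter
    (fun term ->
      match Hashtbl.find_opt idx term with
      | None -> ()
      | Some postings ->
        List.iter
          (fun (doc_id, w) ->
            let prev =
              Option.value (Hashtbl.find_opt scores doc_id) ~default:0
            in
            Hashtbl.replace scores doc_id (prev + w))
          postings)
    terms;
  Hashtbl.fold (fun id s acc -> (id, s) :: acc) scores []
  |> List.sort (fun (id1, s1) (id2, s2) ->
         match compare s2 s1 with 0 -> compare id1 id2 | c -> c)
  |> List.filteri (fun i _ -> i < limit)

(* Example data: document 2 matches both terms and ranks first. *)
let example_index : index =
  let idx = Hashtbl.create 4 in
  Hashtbl.replace idx "ocaml" [ (1, 5); (2, 1) ];
  Hashtbl.replace idx "search" [ (2, 5); (3, 1) ];
  idx
```

With this shape, each query term costs one hash-table probe plus a pass over its postings, and the final sort only touches documents that matched at least one term.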
Performance Payoff
Through aggressive optimization:
- Search library shrank from 26k to 2k LOC after minification
- Total search page payload: 76KB (Brotli) / 132KB uncompressed
- Sub-millisecond query execution in-browser
Why This Matters
This implementation demonstrates how compiled languages like OCaml can transcend traditional boundaries between build-time and client-side execution. By unifying the toolchain:
- No abstraction leaks: Identical tokenization/logic at build and runtime
- Dependency hygiene: Entire solution lives within OPAM/dune ecosystem
- Future-proof: Base85 serialization allows index growth without format shifts
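For readers unfamiliar with Base85, the idea is that every 4 bytes of binary data become 5 printable characters, roughly 25% overhead, which is what makes it practical to embed a binary index as text in a page. The article does not say which Base85 variant is used, so the sketch below assumes the Ascii85 alphabet ('!' through 'u') without the 'z' shortcut or delimiters. The `encode` and `decode` names are hypothetical, and the code assumes OCaml's 63-bit native `int`.

```ocaml
(* Hypothetical Ascii85-style codec; the variant actually used by the author
   is not stated in the article. Assumes 63-bit native ints. *)

(* Every 4 input bytes become 5 characters in '!'..'u'. A final partial
   group of n bytes is zero-padded and emits only n + 1 characters. *)
let encode (s : string) : string =
  let len = String.length s in
  let buf = Buffer.create ((len * 5 / 4) + 5) in
  let i = ref 0 in
  while !i < len do
    let n = min 4 (len - !i) in
    (* Pack up to 4 bytes big-endian, padding with zero bytes. *)
    let v = ref 0 in
    for k = 0 to 3 do
      let b = if k < n then Char.code s.[!i + k] else 0 in
      v := (!v lsl 8) lor b
    done;
    (* Extract 5 base-85 digits, most significant first. *)
    let digits = Bytes.create 5 in
    for k = 4 downto 0 do
      Bytes.set digits k (Char.chr (33 + (!v mod 85)));
      v := !v / 85
    done;
    Buffer.add_string buf (Bytes.sub_string digits 0 (n + 1));
    i := !i + n
  done;
  Buffer.contents buf

(* Inverse of [encode]. A short final group of n characters is padded with
   'u' (digit 84) and yields n - 1 bytes. Input is assumed to be valid. *)
let decode (s : string) : string =
  let len = String.length s in
  let buf = Buffer.create ((len * 4 / 5) + 4) in
  let i = ref 0 in
  while !i < len do
    let n = min 5 (len - !i) in
    let v = ref 0 in
    for k = 0 to 4 do
      let d = if k < n then Char.code s.[!i + k] - 33 else 84 in
      v := (!v * 85) + d
    done;
    (* Emit the top n - 1 bytes of the 32-bit group value. *)
    for k = 0 to n - 2 do
      Buffer.add_char buf (Char.chr ((!v lsr (8 * (3 - k))) land 0xff))
    done;
    i := !i + n
  done;
  Buffer.contents buf
```

Because the group size is fixed, a larger index only produces a longer string; nothing in the encoding itself depends on the total length.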
As Leighton notes: "It tickles me to use a language-to-language compiler to reuse code across contexts." For static site authors resisting JavaScript framework bloat, this OCaml-centric approach offers a compelling template for adding dynamic features without surrendering to complexity.
Source: Alex Leighton's blog