Parsing JSON typically conjures images of recursive descent or state machines, but in array languages like BQN, a radically different approach emerges—one leveraging flat arrays, depth scans, and strategic indexing. This article walks through building a JSON parser for a practical subset (integers, strings, objects, arrays) using BQN's array-oriented paradigm, where control flow yields to mathematical transformations.

Why Flat Arrays?

Traditional parsers rely heavily on branching and nested function calls. BQN flips this: operations apply uniformly across entire arrays. The result? Code that's dense, branchless, and often faster. We achieve this through:
1. Tokenization: Converting input into a "blueprint" of tokens (0 for values, brackets as-is).
2. Depth Ordering: Using cumulative sums to track nesting depth via +``opening - closing.
3. Value Indexing: Assigning positions to literals and sublists via permutation vectors like ⍋⍋.

Tokenization: From Chaos to Structure

Start by identifying numbers, strings, and structural tokens. For numbers:

Tokenise ← {
  e ← »<`'\\'=𝕩             # Handle escapes
  sb ← »⊸< s← ≠`(¬e)∧'"'=𝕩 # String starts
  sr ← (1-˜(s>sb)×+`sb)⊔𝕩  # Extract strings

  ex ← s∨ 𝕩∊@+9‿10‿13‿32   # Exclude whitespace/strings
  nb ← »⊸< n←(¬ex)∧𝕩∊'0'+↕10 # Number starts
  nr ← •ParseFloat¨(1-˜n×+`nb)⊔𝕩 # Parse numbers

  ts ← sb∨(¬ex)∧nb∨𝕩∊"[]{},:" # All tokens
  ⟨ts/ '0'¨⌾(nb⊸/) 𝕩, nr, sr⟩ # Return tokens & values
}

Key insight: Group vectors () and under () elegantly isolate values while preserving structure. For strings, backslash escaping is resolved via a rolling mask (»<'\'=𝕩`).

Parsing: Depth Scans and Value Reconstruction

The parser navigates nested structures using depth permutations and value indices:
1. Depth Ordering: d ← ⍋+``(ts∊"{[")-(ts∊"]}") creates a breadth-first traversal order.
2. Sublist Identification: Nesting levels are tracked via n ← (⍋d)⊏+``(d⊏ts)∊"[{".
3. Object Handling: Keys are separated from values using colon positions and object-depth masks.
4. Value Assembly: Literals and sublists are stitched together bottom-up using fixed indices:

Parse ← {
  # ... (tokenize, compute depth, nesting)
  vs ← nums ∾ ⊑ks                 # Initial values (numbers + non-key strings)
  vi ← ⍋⍋ (lit_mask) + values     # Assign indices
  vi ⊏↩ (↕≠vs)∾(≠vs)+ ≠⊸- c/»n   # Adjust for nesting

  # Build nested structures recursively
  {vs ∾↩ <𝕨Sel𝕩⊏vs}¨○⌽ grouped_indices
  ¯1⊑vs  # Final parsed structure
}

Visualizing depth ordering transforms ][1][ into a traversal path.

Why This Approach Shines

  • No Explicit Branching: Logic encoded in array operations (e.g., +`` for depth).
  • Data Parallelism: Operations apply holistically, leveraging SIMD-like optimizations.
  • Debuggability: Flat arrays make intermediate states inspectable.

Tradeoffs and Joys

As the author notes:

"It’s hard to write, hard to read, hard to modify, but an absolute joy to debug."

While not suited for all parsing tasks, this method offers a refreshing perspective: complexity emerges from composition, not control flow. The complete parser fits in <30 lines—a testament to BQN’s expressive power.


Source: Writing a JSON Parser in BQN by Tony Zorman.