Shrubbery Notation and the Quiet Reinvention of How Lisp Reads Code

Tech Essays Reporter

Rhombus, the experimental language growing out of Racket, leans on a layered text format called shrubbery notation that does something unusual: it groups code by lines and indentation without committing to what any of it means. The result is a careful negotiation between the macro power that made Lisp famous and the conventional surface syntax that scared most programmers away from it.

Programming language design rarely splits its concerns as cleanly as shrubbery notation does. The notation, which underpins the Rhombus language being built on top of Racket, is not a grammar in the usual sense. It is a set of text-level conventions that produce a partially grouped structure and then deliberately stop, handing the remaining work to a later parsing layer the Rhombus designers call enforestation. That decision to do less, and to draw a precise line around how much less, is the most interesting thing about it.

The core argument behind shrubbery notation is that the two great traditions of language surface syntax have each paid for their strengths with a corresponding weakness, and that a well-chosen intermediate representation can keep more of the strengths than either side usually admits. Lisp and its descendants, Racket among them, expose their syntax as data through S-expressions, which makes macros uniformly powerful because every program is already a tree of parenthesized lists. The cost is a reading experience that most programmers never warm to, where structure is carried almost entirely by nesting parentheses rather than by the visual cues, infix operators, and indentation that the rest of the programming world relies on. Conventional languages invert the tradeoff. They offer familiar notation and pay for it with grammars that resist the kind of deep, hygienic macro extension that Racket treats as ordinary.

What the notation actually commits to

Shrubbery notation tries to occupy the middle by being explicit about what it knows and what it refuses to decide. It is line- and indentation-sensitive, and its job is to impose enough grouping that any later parsing stays consistent with how the code looks on the page. The specification builds this grouping from a small number of mechanisms that interact in regular ways.

Groups form along lines. Opener-closer pairs (parentheses, brackets, and braces) group their contents regardless of layout. A colon followed by indentation introduces a block, which is how shrubbery represents the bodies that other languages would wrap in braces or mark with keywords. An operator at the start of a continuation line extends the current group rather than starting a new one, so a long expression can wrap without ambiguity. The vertical bar introduces alternatives, the shape you want for conditionals and pattern matching. Commas separate groups inside an opener-closer pair, and semicolons separate groups that share a line elsewhere, such as inside a block, when indentation alone would be awkward.
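Two of these mechanisms, the colon-introduced block and the operator-led continuation line, are easy to sketch in a few lines of Python. What follows is a toy grouper written for illustration, not the shrubbery algorithm: the tokenizer, the set of characters treated as leading operators, and the names parse_groups and group_source are all assumptions of this sketch, and it ignores opener-closer pairs, bars, commas, and semicolons entirely.

```python
import re

# Toy tokenizer (assumption of this sketch): identifiers, integers,
# single brackets, or runs of other punctuation. ASCII input only.
TOKEN = re.compile(r"[A-Za-z_]\w*|\d+|[()\[\]{}]|[^\s\w()\[\]{}]+")
OPERATOR_STARTS = "+-*/<>="  # characters treated as a leading operator here


def indent_of(line):
    """Number of leading spaces on a line."""
    return len(line) - len(line.lstrip(" "))


def next_nonblank(lines, i):
    """Index of the first non-blank line at or after i."""
    while i < len(lines) and not lines[i].strip():
        i += 1
    return i


def parse_groups(lines, i, indent):
    """Group lines indented at least `indent`; return (groups, next_index)."""
    groups = []
    while True:
        i = next_nonblank(lines, i)
        if i >= len(lines) or indent_of(lines[i]) < indent:
            break
        start = indent_of(lines[i])
        tokens = TOKEN.findall(lines[i])
        i += 1
        # A deeper-indented line that starts with an operator extends the group.
        while (tokens[-1] != ":" and i < len(lines) and lines[i].strip()
               and indent_of(lines[i]) > start
               and lines[i].lstrip()[0] in OPERATOR_STARTS):
            tokens += TOKEN.findall(lines[i])
            i += 1
        group = ["group"] + tokens
        # A trailing colon opens a block made of the more-indented lines below.
        if tokens[-1] == ":":
            group.pop()
            body = []
            j = next_nonblank(lines, i)
            if j < len(lines) and indent_of(lines[j]) > start:
                body, i = parse_groups(lines, j, indent_of(lines[j]))
            group.append(["block"] + body)
        groups.append(group)
    return groups, i


def group_source(source):
    """Group a whole source string, starting at column zero."""
    groups, _ = parse_groups(source.splitlines(), 0, 0)
    return groups


EXAMPLE = """\
if x > 0:
  y = x
      + 1
  emit y
done
"""

print(group_source(EXAMPLE))
```

The grouper never asks what if, emit, or + mean. It only records that the wrapped line belongs to the group that starts with y, and that both indented groups sit inside the block opened by the colon.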

What the notation produces from all of this is a parsed representation that is still itself an S-expression. This is the move that makes the whole design coherent. Shrubbery does not abandon the Lisp insight that code should be available as data. It changes the surface a programmer types and reads, then lowers that surface into a tree that tools and macros can manipulate with the same uniformity Racket programmers expect. A shrubbery is a tree with the parentheses mostly gone from the page but fully present in the structure underneath.
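To make the idea of a tree underneath the page concrete, here is a small Python sketch that renders a hand-built tree as S-expression text and then rewrites it with a generic walker. The tag names multi, group, parens, block, and op follow my reading of the shrubbery specification's parsed representation and should be treated as an assumption, and the sample tree is typed in by hand for a line like fun f(x): x + 1 rather than produced by a real parser.

```python
def to_sexpr(node):
    """Render nested Python lists as parenthesized S-expression text."""
    if isinstance(node, list):
        return "(" + " ".join(to_sexpr(child) for child in node) + ")"
    return str(node)


def rename(node, old, new):
    """Rename an atom everywhere in the tree, with no knowledge of its meaning."""
    if isinstance(node, list):
        return [rename(child, old, new) for child in node]
    return new if node == old else node


# Hand-built tree; tag names are assumed from the specification, not parser output.
tree = [
    "multi",
    ["group", "fun", "f", ["parens", ["group", "x"]],
     ["block", ["group", "x", ["op", "+"], 1]]],
]

print(to_sexpr(tree))
print(to_sexpr(rename(tree, "x", "y")))
```

The rename pass is deliberately naive, with no hygiene and no scoping, but it shows the uniformity at stake: one recursive function reaches every position in the program because everything is a list.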

The escape mechanisms say a good deal about the design philosophy. The guillemet characters, the « and » pair, let a programmer override the column- and line-sensitivity when a construct needs to ignore layout, and the backslash continues a line in the older, layout-insensitive style. A notation confident in indentation alone would not need these. Their presence is an admission that indentation is the right default rather than an absolute law, and that a serious language has to provide an exit for the cases where the default fights the programmer instead of helping.

Why the layering matters beyond Rhombus

The consequence worth sitting with is that shrubbery decouples the question of layout from the question of meaning. The notation knows that a block follows a colon and that an operator continues a group, but it does not know what a colon means, what any operator does, or whether a given identifier names a function, a macro, or a type. All of that lives in enforestation and the Rhombus language layered on top. Because the grouping layer is fixed and shared, different language variants can be built on the same notation, and editor tooling can reason about structure without understanding semantics.
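A minimal way to see the decoupling is to hand the same flat group to two different later stages. The Python sketch below is not Rhombus's enforestation algorithm. It is a plain precedence-climbing parser, and the function name enforest, the ("op", name) encoding, and both precedence tables are assumptions made for illustration.

```python
def enforest(terms, precedence):
    """Nest a flat list of operands and ("op", name) pairs by precedence.

    All operators are treated as left-associative binary operators.
    """
    pos = 0

    def parse(min_prec):
        nonlocal pos
        left = terms[pos]
        pos += 1
        while pos < len(terms):
            op = terms[pos][1]
            prec = precedence[op]
            if prec < min_prec:
                break
            pos += 1
            right = parse(prec + 1)
            left = (op, left, right)
        return left

    return parse(0)


# The grouping layer delivers this flat sequence and nothing more.
group = [1, ("op", "+"), 2, ("op", "*"), 3]

conventional = {"+": 1, "*": 2}  # hypothetical language A
inverted = {"+": 2, "*": 1}      # hypothetical language B

print(enforest(group, conventional))
print(enforest(group, inverted))
```

Both calls consume an identical group, so anything that operates purely on grouping, such as an indenter or a structural navigator, behaves the same for either language. Only the stage that owns the precedence table disagrees about the resulting tree.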

That tooling story is not an afterthought in the documentation, which devotes real attention to syntax coloring, indentation behavior, and term and group navigation, along with concrete editor support in DrRacket. An editor that knows the grouping rules can indent correctly and navigate by structure even for a language extension it has never seen, because the grouping never changes. This is a quieter version of the same benefit S-expressions gave Lisp editors, recovered for a syntax that no longer looks like Lisp. The parser also preserves source locations and raw-text properties, which means a shrubbery can be read, transformed, and written back out while retaining the programmer's original formatting, a property that matters enormously for refactoring tools and for macros that need to report errors against code the programmer actually wrote.
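The round-trip property can be sketched without any of the real parser's machinery. In the Python below, each token keeps the whitespace that preceded it along with a line and column, so a transformation can change a token's text and still write the file back with its layout intact. The Token fields and the tokenize and unparse names are assumptions of this sketch, not the representation the shrubbery parser uses.

```python
import re
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Token:
    text: str     # the token as written
    prefix: str   # raw whitespace that preceded it
    line: int     # 1-based line of the token's first character
    column: int   # 0-based column of the token's first character


PIECE = re.compile(r"(\s*)(\S+)")


def tokenize(source):
    """Split on whitespace, keeping every skipped character; return (tokens, trailing)."""
    tokens = []
    line, column, end = 1, 0, 0
    for match in PIECE.finditer(source):
        prefix, text = match.group(1), match.group(2)
        for ch in prefix:
            if ch == "\n":
                line, column = line + 1, 0
            else:
                column += 1
        tokens.append(Token(text, prefix, line, column))
        column += len(text)
        end = match.end()
    return tokens, source[end:]


def unparse(tokens, trailing):
    """Reassemble source text from tokens and their raw prefixes."""
    return "".join(t.prefix + t.text for t in tokens) + trailing


source = "fun  f(x):\n    x   +   1\n"
tokens, trailing = tokenize(source)

# Rewrite one token; spacing and line breaks come back exactly as typed.
rewritten = [replace(t, text="y") if t.text == "x" else t for t in tokens]
print(unparse(rewritten, trailing))
```

Because the positions travel with the tokens, an error raised after a rewrite can still point at the line and column the programmer originally typed.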

The honest counter-perspective is that intermediate layers have a way of leaking. Every additional stage between what a programmer types and what the machine runs is another place where the mental model can diverge from behavior, and indentation-sensitive grouping has a long history of subtle failures, the kind where a misaligned line changes meaning without any visible error. Shrubbery's escape hatches reduce but do not eliminate this, and a programmer debugging an indentation surprise now has to reason about grouping rules, enforestation, and language semantics as separate but interacting systems. There is also a reasonable skepticism about whether the broader world wants a more approachable Lisp at all, given how many attempts have come before. The design considerations section of the documentation engages directly with this prior art, which suggests the authors are aware they are walking a path littered with earlier efforts.

Still, the shape of the idea is sound, and it is more general than its current home. Separating the grouping of text from the interpretation of that grouping is a useful boundary for any language that wants both macro extensibility and a conventional reading experience. Shrubbery notation is a concrete, carefully specified argument that you do not have to choose between the tree and the page, and that the seam between them can be drawn deliberately rather than left to accident. Whether or not Rhombus succeeds as a language, that argument is the part most likely to outlast it.
