When Code Is Data: What SCSS Teaches Us About Designing Domain Specific Languages

Tech Essays Reporter

A Scheme programmer's attempt to represent CSS as first-class data turns into a meditation on the deeper question of what separates a good domain specific language from a clumsy one. The answer lives in the gap between generating output and actually modeling a problem.

Most of us encounter CSS preprocessors as conveniences. We reach for Sass or Less because plain CSS makes us repeat color values, because a single brand color scattered across a stylesheet becomes a find-and-replace minefield, and because the language offers no native way to say that two style rules are logically related even when they share nearly every declaration. These tools paper over real deficiencies. But the more interesting story is not that they exist; it is what their design choices reveal about the act of wrapping one language inside another.

The author of the post under discussion, a self-described "smug Scheme weenie," starts from a position that most web developers never consider. To a Lisp programmer, code and data are the same substance. So the goal is not merely to generate CSS from a friendlier syntax, which is what Sass and Less do, but to represent CSS itself as a first-class value, a structure you can hold, inspect, and transform with the same machinery you use for any other data. This is the promise of SCSS, a Scheme library that expresses stylesheets as s-expressions (not to be confused with the Sass syntax of the same name). The post is the first in a series about Lispy DSLs, and it uses SCSS as a cautionary opening example: a design that aimed for first-class representation but stopped short of delivering its full benefit.

The thesis: representation is not the same as modeling

The central argument is deceptively simple. Putting CSS into parentheses is not the same as modeling CSS. SCSS represents a rule set as a list of selectors paired with a list of property/value declarations, which looks elegant at first glance. A rule like #my-id, p.my-class, div { background-color: green; } maps cleanly onto nested Scheme lists. The trouble begins the moment you look past the surface structure into the values themselves.

In SCSS, property values are stored as flat strings or symbols. The string "1px solid rgb(0, 128, 0)" is opaque. The library knows it is a value for the border property, but it knows nothing about what is inside it. This single decision quietly cancels most of the reasons you wanted first-class data in the first place.
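
To make the distinction concrete, here is a minimal sketch of the same shape in Python rather than Scheme. It is not the SCSS library's API; the names rule, render, properties, and border_value are hypothetical. The outer structure is real data, but the border value is still a single piece of text.

```python
# Hypothetical Python analogue of an SCSS-style rule set: nested lists
# for the outer structure, flat strings for the property values.
rule = (
    ["#my-id", "p.my-class", "div"],            # selectors
    [("background-color", "green"),             # declarations
     ("border", "1px solid rgb(0, 128, 0)")],
)

def render(rule):
    """Serialize a (selectors, declarations) pair to CSS text."""
    selectors, declarations = rule
    body = " ".join(f"{prop}: {value};" for prop, value in declarations)
    return f"{', '.join(selectors)} {{ {body} }}"

def properties(rule):
    """The outer structure is easy to inspect: list the property names."""
    return [prop for prop, _ in rule[1]]

def border_value(rule):
    """The value itself is only text: no width, style, or color to ask for."""
    return dict(rule[1])["border"]

print(render(rule))
print(properties(rule))
print(type(border_value(rule)).__name__)
```

Everything down to the property name can be walked like any other list. One level further in, the only tools left are string functions.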

The evidence: three failures that compound

The post builds its case through a sequence of concrete problems, and the cleverness is in showing how each one is a different symptom of the same root cause.

The first is that you cannot query or compose values without resorting to string manipulation. Because green, rgb(0,128,0), and #008000 are three spellings of the identical color, finding every rule that uses that color requires parsing the strings back into something structured. Composing a value from variables means building strings with sprintf, which the author rightly notes mostly defeats the point of a first-class representation. If you are concatenating text to produce your output, you have a templating system wearing the costume of a data structure.
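
A small Python sketch shows what "parsing the strings back into something structured" costs in practice. The parse_color and selectors_using helpers, the three-entry color table, and the sample declarations are all assumptions made for illustration; real CSS color syntax is considerably larger than the three spellings handled here.

```python
import re

# Tiny illustrative table; real CSS defines many more named colors.
NAMED = {"green": (0, 128, 0), "red": (255, 0, 0), "white": (255, 255, 255)}

def parse_color(text):
    """Parse a few CSS color spellings into an (r, g, b) tuple, else None."""
    text = text.strip().lower()
    if text in NAMED:
        return NAMED[text]
    m = re.fullmatch(r"#([0-9a-f]{6})", text)
    if m:
        digits = m.group(1)
        return tuple(int(digits[i:i + 2], 16) for i in (0, 2, 4))
    m = re.fullmatch(r"rgb\(\s*(\d+)\s*,\s*(\d+)\s*,\s*(\d+)\s*\)", text)
    if m:
        return tuple(int(group) for group in m.groups())
    return None

# Hypothetical (selector, property, value) triples with string values.
declarations = [
    ("h1", "color", "green"),
    ("a", "color", "#008000"),
    ("p", "background-color", "rgb(0, 128, 0)"),
    ("em", "color", "red"),
]

def selectors_using(declarations, color):
    """Find selectors whose value denotes the given (r, g, b) color."""
    return [sel for sel, _, value in declarations if parse_color(value) == color]

print(selectors_using(declarations, (0, 128, 0)))
```

A plain string comparison against "green" would find only the first rule. Had the values been stored as color objects from the start, the query would need no parser at all.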

The second failure is sharper because it crosses into safety. Strings injected directly into CSS are not escaped, so any user-supplied value, a font name or a color, can carry a stray semicolon or curly brace that breaks the layout or, worse, opens a security hole. The author drives the point home with a single line where an entire malicious declaration hides inside what looks like a color value. A representation that cannot distinguish data from syntax cannot protect you from injection, and this is the same lesson the web learned painfully with SQL and HTML.
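
The hazard is easy to reproduce. The following Python sketch is not the author's example; the hostile value, the helper names, and the deliberately narrow color whitelist are all assumptions chosen for illustration. It contrasts naive interpolation with a constructor that only accepts things shaped like a color, plus a string quoter for the places where free text really is appropriate.

```python
import re

def declaration_unsafe(prop, value):
    """Naive interpolation: whatever is in value lands in the stylesheet."""
    return f"{prop}: {value};"

# Assumption: a deliberately narrow whitelist, not the full CSS color grammar.
COLOR = re.compile(r"#[0-9a-fA-F]{6}|[a-zA-Z]+")

def color_declaration(prop, value):
    """Emit the declaration only if value is a hex color or a bare keyword."""
    if not COLOR.fullmatch(value):
        raise ValueError(f"not a color: {value!r}")
    return f"{prop}: {value};"

def css_string(text):
    """Quote text as a CSS string, escaping backslashes, quotes, and newlines."""
    escaped = (text.replace("\\", "\\\\")
                   .replace('"', '\\"')
                   .replace("\n", "\\a "))
    return f'"{escaped}"'

# A hypothetical user-supplied "color" that smuggles in extra syntax.
hostile = "red; } body { display: none"

print(declaration_unsafe("color", hostile))
print(color_declaration("color", "#008000"))
print(css_string('Evil"; } body {'))
```

The unsafe version happily closes the current rule and opens a new one. The typed version has no way to express that, because a color is not allowed to contain a brace in the first place.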

The third problem is framed as a problem in scare quotes, because it actually points toward the solution. CSS is riddled with shorthand properties. border expands into border-top, border-right, border-bottom, and border-left. Each of those expands again into width, style, and color. Decomposing these shorthands is exactly the kind of transformation a first-class representation should make trivial, and it is exactly what SCSS cannot do, because the information lives inside unparsed strings. The author connects this to Scheme macros, which rewrite convenient surface notation down to a small core language. The better design would compile CSS shorthands the same way, assembling complex properties from simple atomic ones rather than treating the complex forms as primitive.
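
Once the value is structured, the expansion is a few lines. This Python sketch assumes a border given as three already-separated components and a hypothetical expand_border helper; it ignores the rules real CSS applies when a component is omitted from the shorthand.

```python
SIDES = ("top", "right", "bottom", "left")
PARTS = ("width", "style", "color")

def expand_border(width, style, color):
    """Expand a structured border value into its twelve longhand properties."""
    values = dict(zip(PARTS, (width, style, color)))
    return {
        f"border-{side}-{part}": values[part]
        for side in SIDES
        for part in PARTS
    }

# Assumption: a length is modeled as a (number, unit) pair for this sketch.
longhands = expand_border((1, "px"), "solid", "green")

for name, value in longhands.items():
    print(name, value)
```

With the string "1px solid green" the same function would first have to guess which token is the width and which is the color, which is the parsing problem all over again.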

The implication: find the atoms, then build up

What saves the situation, and what generalizes beyond CSS, is the observation that all of CSS's baroque complexity is constructed from a small and relatively stable set of atomic types: lengths with their units, URIs, colors, and a handful of others. The property syntaxes change constantly. CSS3 added gradients, elaborate background shorthands, and a new nested syntax for animation keyframes, one of the few places in the language (alongside at-rules such as @media) where curly braces nest inside curly braces. Trying to model every property exhaustively is a losing race against a moving specification. Modeling the atoms is tractable, because the atoms barely move.

This is the same insight the W3C reached, somewhat accidentally, when it designed the CSS DOM API. The post is candid that the DOM API is unwieldy and un-Lispy in its object-oriented heaviness, but its underlying instinct, which is to expose the atomic value types as manipulable objects, is correct. The redesign sketched in the post follows this path. Strings are banished except where genuinely appropriate, and there they are always quoted and escaped. Lengths become expressions like (em 1) and (px 1). Colors become either symbols that alias constructors or full color objects you can operate on, so that (darken green .5) is a meaningful expression rather than impossible string surgery. Vectors express ordered sequences like font stacks. Composite properties become expressions with extra subexpressions instead of opaque text.
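
The atoms translate readily into any language with value types. Below is a rough Python rendering of the idea, not the post's actual Scheme design: the Length and Color classes, the em and px constructors, and in particular the semantics of darken (linear scaling of each RGB channel toward black) are assumptions made for the sketch.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Length:
    value: float
    unit: str

    def __str__(self):
        return f"{self.value:g}{self.unit}"

def em(n):
    return Length(n, "em")

def px(n):
    return Length(n, "px")

@dataclass(frozen=True)
class Color:
    r: int
    g: int
    b: int

    def __str__(self):
        return f"#{self.r:02x}{self.g:02x}{self.b:02x}"

green = Color(0, 128, 0)

def darken(color, amount):
    """Scale each channel toward black: 0 leaves it unchanged, 1 gives black.

    Assumed semantics for illustration; other tools define darken differently.
    """
    scale = 1.0 - amount
    return Color(*(round(c * scale) for c in (color.r, color.g, color.b)))

print(px(1), em(1.5))
print(darken(green, .5))
```

Because a color is an object rather than a substring, darken is ordinary arithmetic, and serialization to CSS text happens once, at the edge, in __str__.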

The counter-perspective the author respects, and the one he doesn't

The post pauses to acknowledge Bert Bos, one of the creators of CSS, who argues that real programming features, even constants, do not belong in CSS at all. The author is dismissive, characterizing the argument as paternalism toward less sophisticated authors, and the heat of that reaction is worth examining. Bos has a defensible position that the author undersells: a styling language used by millions, including non-programmers, gains something real from being declarative and inert. Power and abstraction carry costs in comprehensibility and in the surface area for mistakes. The preprocessor ecosystem exists precisely because that trade-off does not satisfy everyone, but it is a trade-off, not a settled question.

The author is more honest when he turns the critical lens on his own work. He brings up intarweb, his Scheme library for HTTP headers, as an example of the opposite mistake. There, every header type has its own bespoke parser and constructor, with no native syntax binding them together. Users kept asking how to do simple things, or just begged for a way to write a raw header. He counts this as a DSL failure, while also defending the genuine value of never having to parse a cookie or an authentication attribute by hand more than once. That tension, between a unified small set of atoms and a sprawl of special-case parsers, is the practical heart of the whole essay.

What carries over to any DSL

The closing rules are where the post earns its place as the start of a series rather than a one-off critique. Identify the atomic building blocks first, and treat a large number of them as a warning sign. Decide deliberately which atoms deserve structured first-class representation and which can remain unstructured symbols or strings, because not everything needs a type; the author cannot think of a useful operation on font names, so he leaves sans-serif a plain symbol. Study how the language has evolved and extended itself in the past, because your representation must survive the directions it will keep growing. And spend parentheses and noise symbols carefully, balancing the convenience of writing against the convenience of programmatic manipulation, since those two goals frequently pull against each other.

The most valuable admission comes last. Some of the advice is vague, some of it is contradictory, and some of it is obvious to anyone who has built one of these things. Design is hard and resolves into trade-offs rather than answers. The practical counsel that survives is to decide what use cases the DSL must support before deciding how to answer any specific design question. SCSS, in the author's reading, suffered from too little design, while it is just as easy to overbuild a language into something nobody can extend. The discipline lies in knowing which failure you are closer to, and the SCSS case suggests that the line between a structure you can compute over and a structure that merely looks structured is exactly the line between a working DSL and a decorated string template.
