The Boilerplate Paradox: How 60-Year-Old Lisp Concepts Outperform Modern Languages in Code Density

An analysis of 400 million lines of code finds that Clojure and Haskell have the highest share of unique lines, while newer languages such as Go and Rust show surprisingly high levels of repetition, challenging assumptions about programming language evolution.

The Measurement That Changes How We Judge Language Efficiency

For decades, software engineers have debated programming language efficiency through subjective lenses of syntax preference and ecosystem maturity. Ben Boyter's ULOC (Unique Lines of Code) metric offers a measurable alternative, applied here to 400 million lines drawn from roughly 100 top GitHub repositories for each of 34 languages. The results upend conventional wisdom about language expressiveness.

Methodology: Beyond Line Counts

Traditional SLOC (Source Lines of Code) measurements fail to account for:

Structural repetition (closing braces, mandatory imports)
License header inflation
Comment maintenance costs

ULOC addresses these by:

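Counting each distinct line only once, however often it repeats
Including comments as maintainable artifacts
Collapsing boilerplate repeated across files (license headers, common imports) into a single occurrence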

The analysis used Boyter's scc tool with a custom automation script to process 2,703,656 files from 3,418 repositories, calculating "dryness" percentages (ULOC / total lines).
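The dryness ratio described above can be sketched in a few lines. This is an illustration, not scc's implementation: the function name uloc_stats is hypothetical, and it assumes lines are compared after trimming whitespace and that blank lines are left out of both counts. scc's actual normalization may differ.

```python
from typing import Iterable


def uloc_stats(lines: Iterable[str]) -> tuple[int, int, float]:
    """Return (total_lines, unique_lines, dryness) for a sequence of source lines.

    Assumption: lines are compared after stripping surrounding whitespace,
    and blank lines are ignored. The real tool may normalize differently.
    """
    total = 0
    unique: set[str] = set()
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # blank lines count toward neither total nor unique
        total += 1
        unique.add(line)
    dryness = len(unique) / total if total else 0.0
    return total, len(unique), dryness


# Two tiny functions sharing a license comment and a closing brace.
sample = """\
// Copyright Example Corp
func a() {
    return 1
}

// Copyright Example Corp
func b() {
    return 2
}
""".splitlines()

total, unique, dryness = uloc_stats(sample)
print(f"total={total} unique={unique} dryness={dryness:.2%}")
```

In this sample the repeated license comment and the repeated closing brace each count once, so 8 non-blank lines yield 6 unique lines and a dryness of 75%.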

The Density Hierarchy

Languages ranked by expressiveness (ULOC/total):

Tier	Dryness	Languages	Characteristics
High	75%+	Clojure, Haskell, MATLAB	Functional paradigms, minimal syntax
Standard	60-70%	Java, Python, TypeScript	Balanced logic/structure
Boilerplate	<60%	C#, Go, CSS	Mandatory ceremonies, config bloat

Surprising Findings:

Java (65.72%) trails Kotlin (67.72%) and Scala (66.1%) in JVM ecosystem dryness, but only by about two percentage points
CoffeeScript (70.05%) beats modern alternatives like TypeScript (63.34%)
Go (58.78%) and Rust (60.5%) sit within two percentage points of each other
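The figures quoted in this article can be compared directly. The sketch below ranks them and computes a few gaps in percentage points. The numbers are copied from the text, nothing is measured here, and the helper name gap is hypothetical.

```python
# Dryness percentages as quoted in this article.
dryness = {
    "Clojure": 77.91,
    "CoffeeScript": 70.05,
    "Kotlin": 67.72,
    "Scala": 66.1,
    "Java": 65.72,
    "TypeScript": 63.34,
    "Rust": 60.5,
    "Go": 58.78,
    "C#": 58.4,
}


def gap(a: str, b: str) -> float:
    """Difference in dryness between two languages, in percentage points."""
    return round(dryness[a] - dryness[b], 2)


# Highest dryness first.
ranked = sorted(dryness, key=dryness.get, reverse=True)
for name in ranked:
    print(f"{name:<13}{dryness[name]:6.2f}%")

print("Kotlin vs Java:", gap("Kotlin", "Java"))
print("CoffeeScript vs TypeScript:", gap("CoffeeScript", "TypeScript"))
print("Rust vs Go:", gap("Rust", "Go"))
```

Working in percentage points keeps the comparisons unambiguous: a two-point gap between Kotlin and Java is a much smaller effect than the nearly seven points separating CoffeeScript from TypeScript.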

The Lisp Renaissance

Clojure's 77.91% dryness illustrates Lisp's enduring advantage: most lines carry distinct logic rather than structural ceremony. Against C#'s 58.4%, that is a gap of about 20 percentage points in unique lines. The metric measures repetition in a code base rather than developer time, so the gap does not translate directly into hours saved.
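The Clojure versus C# comparison can be read two ways: as a gap in percentage points, or as a relative reduction in repeated lines. A small sketch using only the two figures quoted above (the variable names are illustrative):

```python
# Dryness percentages quoted in the article.
clojure = 77.91
csharp = 58.4

# Absolute gap in unique-line share, in percentage points.
point_gap = clojure - csharp

# Share of lines that are repeats (the complement of dryness).
repeated_clojure = 100 - clojure
repeated_csharp = 100 - csharp

# Relative reduction in repeated lines when moving from C# to Clojure.
relative_reduction = 1 - repeated_clojure / repeated_csharp

print(f"gap: {point_gap:.2f} percentage points")
print(f"repeated lines: {repeated_clojure:.2f}% vs {repeated_csharp:.2f}%")
print(f"relative reduction in repeats: {relative_reduction:.1%}")
```

The two readings differ a lot. The gap is about 19.5 percentage points, while the share of repeated lines falls from 41.6% to about 22.1%, a relative reduction of roughly 47%.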

Modern Language Tradeoffs

Despite improvements in:

Memory safety (Rust)
Concurrency (Go)
Type systems (TypeScript)

these languages introduce new forms of boilerplate:

Go's explicit error handling
Rust's trait implementations
TypeScript's type guards
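To see how one of these patterns shows up in the metric, the sketch below measures a short Go fragment with three explicit error checks. The Go code is embedded as plain text, parse and apply are hypothetical helpers, and the dryness function assumes whitespace-trimmed lines with blanks skipped, which may not match scc exactly.

```python
# A Go fragment with the usual explicit error check after each call.
# parse and apply are hypothetical functions used only for illustration.
GO_SNIPPET = """\
data, err := os.ReadFile(path)
if err != nil {
    return err
}
cfg, err := parse(data)
if err != nil {
    return err
}
err = apply(cfg)
if err != nil {
    return err
}
"""


def dryness(source: str) -> float:
    """Unique non-blank lines divided by total non-blank lines."""
    lines = [ln.strip() for ln in source.splitlines() if ln.strip()]
    return len(set(lines)) / len(lines)


print(f"dryness: {dryness(GO_SNIPPET):.0%}")
```

Only three of the twelve lines do distinct work. The check itself contributes three more unique lines the first time it appears, and every later repetition adds to the total without adding anything unique, so the fragment scores 50%.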

As Boyter notes: "We spent decades building modern languages to solve old mistakes, but increased our noise-to-signal ratio."

The LLM Wildcard

Large Language Models could neutralize boilerplate disadvantages by:

Auto-generating repetitive patterns
Abstracting ceremony behind natural language prompts
Compressing verbose syntax

However, this creates new challenges in:

Code review effectiveness
Architectural coherence
Cognitive load from generated code

Implications for Engineering Leaders

Language Selection: Projects requiring rapid iteration benefit more from high-dryness languages
Tooling Investments: Linters/IDEs must target language-specific boilerplate hotspots
Training: Engineers need conscious boilerplate recognition skills

As Boyter concludes: "If you want the highest ratio of human thought to keystrokes, the winner is the 60-year-old concept running as a modern JVM language." This paradox forces reevaluation of what constitutes true progress in language design.

Data and methodology: Full technical write-up, Analysis script, scc tool