
The Boilerplate Paradox: How 60-Year-Old Lisp Concepts Outperform Modern Languages in Code Density

Tech Essays Reporter

An analysis of 400 million lines of code finds that Clojure and Haskell have among the highest shares of unique lines, while newer languages such as Go and Rust carry more repetition than their reputations suggest. The result challenges common assumptions about how programming languages have evolved.

The Measurement That Changes How We Judge Language Efficiency

For decades, software engineers have debated programming language efficiency through the subjective lenses of syntax preference and ecosystem maturity. Ben Boyter's ULOC (Unique Lines of Code) metric offers a more objective measure, applied here to roughly 400 million lines from the top 100 GitHub repositories for each of 34 languages. The results upend conventional wisdom about language expressiveness.

Methodology: Beyond Line Counts

Traditional SLOC (Source Lines of Code) measurements fail to account for:

  1. Structural repetition (closing braces, mandatory imports)
  2. License header inflation
  3. Comment maintenance costs

ULOC addresses these by:

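  • Counting each distinct line only once
  • Including comments, since they must be maintained like code
  • Collapsing boilerplate that repeats across files, such as closing braces and license headers, into a single count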

The analysis used Boyter's scc tool with a custom automation script to process 2,703,656 files from 3,418 repositories, calculating "dryness" percentages (ULOC / total lines).
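
The dryness ratio described above can be sketched in a few lines of Python. This is a minimal illustration, not scc's implementation: the `dryness` function is hypothetical, and it assumes that lines are compared after stripping surrounding whitespace and that blank lines are ignored, which may differ from how the real tool normalises input.

```python
def dryness(lines):
    """Return (uloc, total, ratio) for an iterable of source lines.

    Assumption: lines are compared after stripping surrounding whitespace,
    and blank lines are ignored. The real tool may normalise differently.
    """
    cleaned = [line.strip() for line in lines]
    cleaned = [line for line in cleaned if line]
    total = len(cleaned)
    uloc = len(set(cleaned))  # each distinct line counts once
    ratio = uloc / total if total else 0.0
    return uloc, total, ratio


# Contrived sample: the license comment and the closing brace repeat.
sample = """\
// License: MIT
func a() {
    return 1
}

// License: MIT
func b() {
    return 2
}
""".splitlines()

uloc, total, ratio = dryness(sample)
print(f"ULOC={uloc} total={total} dryness={ratio:.1%}")
# prints: ULOC=6 total=8 dryness=75.0%
```

In the sample, the repeated license comment and closing brace each count once, so 6 of the 8 non-blank lines are unique.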

The Density Hierarchy

Languages ranked by expressiveness (ULOC/total):

Tier        | Dryness | Languages                | Characteristics
High        | 75%+    | Clojure, Haskell, MATLAB | Functional paradigms, minimal syntax
Standard    | 60-70%  | Java, Python, TypeScript | Balanced logic/structure
Boilerplate | <55%    | C#, Go, CSS              | Mandatory ceremonies, config bloat

Surprising Findings:

  1. Java (65.72%) trails Kotlin (67.72%) and Scala (66.1%) by two percentage points or less in JVM ecosystem dryness, despite its reputation for verbosity
  2. CoffeeScript (70.05%) beats modern alternatives like TypeScript (63.34%)
  3. Go (58.78%) and Rust (60.5%) show nearly identical boilerplate levels

The Lisp Renaissance

Clojure's 77.91% dryness reflects Lisp's enduring advantage: a larger share of its lines carries distinct logic rather than structural ceremony. Against C#'s 58.4%, that is a gap of about 19.5 percentage points, or roughly one line in five. By line share, this is comparable to one workday in a five-day week spent on redundant code.
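
The Clojure and C# comparison can be read either as a gap in percentage points or as a relative reduction in repeated lines, and the two give different numbers. The short check below uses the dryness figures quoted in the article and assumes that "boilerplate" means the non-unique share of lines (100 minus dryness), which is an interpretation rather than a definition from the analysis.

```python
clojure = 77.91  # dryness %, as quoted in the article
csharp = 58.40   # dryness %, as quoted in the article

# Gap in dryness, measured in percentage points.
gap_points = clojure - csharp

# Assumption: "boilerplate" is the non-unique share, i.e. 100 - dryness.
repeated_clojure = 100 - clojure
repeated_csharp = 100 - csharp
relative_cut = (repeated_csharp - repeated_clojure) / repeated_csharp

print(f"gap: {gap_points:.2f} percentage points")
print(f"repeated-line share: {repeated_clojure:.2f}% vs {repeated_csharp:.2f}%")
print(f"relative reduction in repeated lines: {relative_cut:.1%}")
# prints:
# gap: 19.51 percentage points
# repeated-line share: 22.09% vs 41.60%
# relative reduction in repeated lines: 46.9%
```

Under this reading, the figure of about 20 describes percentage points of dryness. Relative to C#'s repeated-line share, Clojure's is a little under half as large.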

Modern Language Tradeoffs

Despite improvements in:

  • Memory safety (Rust)
  • Concurrency (Go)
  • Type systems (TypeScript)

these languages introduce new forms of boilerplate:

  1. Go's explicit error handling
  2. Rust's trait implementations
  3. TypeScript's type guards

As Boyter notes: "We spent decades building modern languages to solve old mistakes, but increased our noise-to-signal ratio."
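
To make the error-handling case concrete, the sketch below treats a small Go-style snippet as plain text and lists the lines that occur more than once. The snippet and the `repeated_lines` helper are hypothetical and written for illustration only. The result says nothing about Go codebases in general; it only shows how a repeated idiom lowers the share of unique lines.

```python
from collections import Counter

# Hypothetical Go-style snippet, used here only as text data.
GO_SNIPPET = """\
data, err := os.ReadFile(path)
if err != nil {
    return nil, err
}
cfg, err := parse(data)
if err != nil {
    return nil, err
}
return cfg, nil
"""


def repeated_lines(source):
    """Map each non-blank line that occurs more than once to its count.

    Lines are compared after stripping surrounding whitespace (an assumption).
    """
    counts = Counter(
        line.strip() for line in source.splitlines() if line.strip()
    )
    return {line: n for line, n in counts.items() if n > 1}


hotspots = repeated_lines(GO_SNIPPET)
for line, n in sorted(hotspots.items(), key=lambda kv: -kv[1]):
    print(f"{n}x  {line}")
```

In this contrived snippet the three lines of the error check each appear twice, so only 6 of the 9 lines are unique.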

The LLM Wildcard

Large Language Models could neutralize boilerplate disadvantages by:

  1. Auto-generating repetitive patterns
  2. Abstracting ceremony behind natural language prompts
  3. Compressing verbose syntax

However, this creates new challenges in:

  • Code review effectiveness
  • Architectural coherence
  • Cognitive load from generated code

Implications for Engineering Leaders

  1. Language Selection: Projects requiring rapid iteration may benefit more from high-dryness languages
  2. Tooling Investments: Linters and IDEs should target language-specific boilerplate hotspots
  3. Training: Engineers benefit from learning to recognize boilerplate deliberately

As Boyter concludes: "If you want the highest ratio of human thought to keystrokes, the winner is the 60-year-old concept running as a modern JVM language." This paradox forces reevaluation of what constitutes true progress in language design.


Data and methodology: Full technical write-up, Analysis script, scc tool
