#Dev

Goal-Directed Evaluation: What Icon Still Teaches Us About Language Design

Tech Essays Reporter
6 min read

David Hanson's compact 1993 introduction to the Icon programming language describes a system where failure is a first-class outcome and expressions can produce many values. Decades later, the ideas behind its goal-directed evaluation feel more relevant than the language's modest profile would suggest.

When David Hanson wrote his two-page introduction to Icon for the second History of Programming Languages conference, he was documenting a language that already had a reputation among a small circle of admirers and almost no presence in the mainstream. Reading it now, the brevity is the point. Icon was designed by Ralph Griswold and his collaborators at the University of Arizona to make text and structure manipulation so direct that a thirteen-line program could replace something many times longer in C. The deeper lesson, though, has less to do with line counts and more to do with a single design decision that reorganizes how an entire language behaves.

The central claim Hanson makes is that Icon's most distinguishing characteristic is its goal-directed expression evaluation. In most languages, an expression computes a value, and that value either exists or causes an error. Icon takes a different stance. An expression either succeeds, producing at least one value, or it fails, producing none. Failure is not an exception or a sentinel return code. It is an ordinary outcome that the language knows how to react to, and that reaction is woven through every control structure. This is the thesis around which everything else in the language arranges itself, and it is worth taking seriously because so few languages before or since have committed to it as fully.

Generators and the idea that expressions yield sequences

The first supporting idea is the generator. A conventional expression like x + 5 produces one value. An Icon expression like x | 5 produces x, and then, if the surrounding computation demands another value, it produces 5. Built-in generators include x to y, which yields the integers in that range, and !x, which yields each character of a string or each element of an aggregate. Procedures can themselves be written as generators using suspend, which behaves like a return that can be resumed to deliver further values.

This turns a familiar mental model on its head. An expression is no longer a recipe for one answer but a potential sequence of answers, evaluated lazily and only as far as the context requires. The evaluation engine tries combinations of generated values in pursuit of success. Consider y < (x | 5). The comparison first tests y against x. If that fails, the subexpression offers its next value, 5, and the comparison tries again. The whole thing succeeds if either branch works, and a comparison operator, when it succeeds, returns the value of its right operand. The control flow that a programmer would normally spell out with nested conditionals is absorbed into the way values flow through the expression.

Failure as a control mechanism

The second supporting idea follows directly. Because failure is an ordinary signal, it can drive control structures and suppress further computation. The assignment max := max < x updates max only when the comparison succeeds, which is only when max is less than x. There is no separate if. Procedures inherit the same discipline: a call happens only when all of its arguments succeed, so write("y=", (x | 5) > y) prints nothing at all if neither comparison holds. The backtracking that makes this work is deliberately bounded. It stays inside the expression where it originates, so a failure in one statement does not unwind a successful assignment that preceded it.

The every control structure makes the generator and failure machinery explicit. every e1 do e2 drives the first expression, evaluating the second for each value the first produces. Hanson's example of printing the first and last ten elements of a hundred-element list, every write(p[(1 to 10) | (91 to 100)]), captures the economy of the approach. The index variable that an imperative loop would require simply disappears, because the generator supplies the sequence of positions and the evaluation mechanism walks through them.

String scanning as a domain inside the language

Icon's reputation rested heavily on string processing, the lineage it inherited from its predecessor SNOBOL4. The string scanning facility, written s ? e, establishes a subject string and a position, then applies analysis operations within e. Functions like find, move, tab, upto, and many work against this implicit subject and return positions, which are locations between characters rather than the characters themselves. Hanson's getword procedure shows how these compose: tab(upto(&letters)) advances to the next letter, tab(many(&letters)) consumes the run of letters that follows, and suspend hands back each word in turn while remaining ready to resume.

The common words program ties the pieces together. It reads input line by line, generates words with getword, accumulates frequencies in an associative table whose entries default to zero, sorts the table by count, and pulls the most frequent entries off the end of the resulting list. Thirteen lines. Hanson points the reader to the equivalent C program published in Communications of the ACM in 1987, where the same task sprawls across far more code. The comparison is not bragging for its own sake. It demonstrates that when a language internalizes the patterns a programmer keeps reaching for, generation, search, and graceful failure, the surface area of a program shrinks toward the actual problem.

What the design implies for language thinking

Icon never became a popular production language, and its successor Unicon extends it with objects and additional libraries while keeping the core evaluation model intact. Yet the ideas migrated. Generators in Python, sequences and lazy evaluation in functional languages, and the broad category of pattern-matching and parser-combinator libraries all echo the recognition that expressions can usefully yield more than one value and that failure can be a routine, composable outcome rather than a catastrophe. Icon argued, more completely than most, that control flow and value production do not have to be separate concerns.

There is a counter-perspective that deserves honest weight. Goal-directed evaluation asks a programmer to reason about backtracking and implicit search, and that reasoning can become difficult precisely when programs grow large. The same mechanism that makes a ten-line text utility elegant can make a sprawling system harder to trace, because the path of execution depends on chains of success and failure that are not always visible at a glance. The bounded backtracking Hanson describes was, in part, a concession to this risk, keeping the search local so that programmers could still predict behavior. Languages that adopted generators later tended to expose them as an explicit feature rather than as the substrate of all evaluation, which suggests the broader community found value in the idea but caution in its totality.

What endures from Hanson's brief introduction is a reminder that the defaults of a language are arguments about how people should think. By making failure ordinary and values plural, Icon proposed that much of the bookkeeping we treat as essential is really an artifact of languages that insist every expression must produce exactly one answer. That proposition remains a useful provocation for anyone designing a language today, even if the answer most of us arrived at was to borrow the idea rather than to adopt it wholesale.

Comments

Loading comments...