Lisp and the Idea of Code as Data

In 1960, John McCarthy published a paper with a forbidding title, "Recursive Functions of Symbolic Expressions and Their Computation by Machine, Part I," that turned out to contain one of the most beautiful ideas in the history of computing. The language it described, Lisp, is now over sixty years old, which makes it the second-oldest high-level language still in widespread use, after Fortran. But its age is the least interesting thing about it. What matters is a single structural decision, and everything that decision made possible.

Programs are made of the same stuff as data

In most programming languages there is a firm wall between code and data. Code is the active stuff — instructions, the program. Data is the passive stuff — the numbers and strings the program pushes around. They look different, they're written differently, and a program can freely manipulate data but cannot easily get its hands on code.

Lisp tears the wall down. In Lisp, a program is written as a list — and lists are exactly the kind of data Lisp is built to manipulate. The expression (+ 1 2) is, at the same time, a function call that adds one and two, and an ordinary three-element list whose first element is the symbol +. There's no conversion, no special parsing you have to do yourself. Code is data, in the most literal sense: a program is a data structure your program can take apart and build up with the same tools it uses for everything else.
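The double reading of (+ 1 2) can be sketched outside Lisp too. The snippet below is only an illustration under stated assumptions: a Python list stands in for a Lisp list, a string stands in for a symbol, and the `OPERATORS` table and `evaluate` function are hypothetical helpers invented for this sketch, not part of any Lisp.

```python
from functools import reduce
import operator

# Assumption: a Python list stands in for a Lisp list, a str for a symbol.
expr = ["+", 1, 2]                      # the Lisp expression (+ 1 2)

# Hypothetical operator table, used only by this sketch.
OPERATORS = {"+": operator.add, "*": operator.mul}

def evaluate(form):
    """Evaluate a nested arithmetic form such as ["+", 1, ["*", 2, 3]]."""
    if isinstance(form, list):
        head, *args = form
        values = [evaluate(arg) for arg in args]
        return reduce(OPERATORS[head], values)
    return form                         # numbers evaluate to themselves

# Read as data: an ordinary three-element list whose first element is "+".
head = expr[0]
length = len(expr)

# Read as code: a call that adds one and two.
total = evaluate(expr)

# Build a new program with ordinary list operations: (* 1 2 4).
product_expr = ["*"] + expr[1:] + [4]
product = evaluate(product_expr)
```

Here `total` is 3 and `product` is 8. The point is that `product_expr` was assembled by slicing and concatenating, the same operations used on any other list.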

This property — a language whose code is represented in its own primary data structure — later got a name: homoiconicity. But the name came afterward. McCarthy just needed a simple, uniform way to write expressions, chose lists, and in doing so changed what programs could do to programs.

Why that changes everything

Once code is data, a cascade of powers falls out, almost for free.

You can write programs that write programs, because generating code is just generating a list, which is something you already know how to do. This is what Lisp macros are: functions that run at macro-expansion time, before the code they produce is compiled or evaluated, take the code you wrote as data, and return new code in its place. A macro lets you extend the language itself, adding new control structures and new notation that look exactly like built-in features. In most languages the syntax is a fixed thing handed down from on high; in Lisp it's clay.
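A rough sketch of the macro idea, again with Python lists standing in for Lisp code. Everything named here (`expand_unless`, `MACROS`, `macroexpand`) is hypothetical and written for illustration; a real Lisp macro system also handles quoting, hygiene, and much else that this ignores.

```python
# Assumption: nested Python lists stand in for Lisp code; strings are symbols.

def expand_unless(form):
    """Hypothetical macro: (unless test body) becomes (if (not test) body)."""
    _, test, body = form
    return ["if", ["not", test], body]

# Hypothetical macro table: macro name -> function from code to code.
MACROS = {"unless": expand_unless}

def macroexpand(form):
    """Recursively rewrite every macro call found in a form."""
    if not isinstance(form, list) or not form:
        return form                              # atoms and empty lists are left alone
    head = form[0]
    if isinstance(head, str) and head in MACROS:
        # Expand, then expand again in case the result contains macro calls.
        return macroexpand(MACROS[head](form))
    return [macroexpand(part) for part in form]

# (unless (ready?) (unless (cancelled?) (launch)))
source = ["unless", ["ready?"], ["unless", ["cancelled?"], ["launch"]]]
expanded = macroexpand(source)
```

After expansion, `expanded` holds the code for (if (not (ready?)) (if (not (cancelled?)) (launch))). The new `unless` form never reaches the evaluator; by then it has already been rewritten into forms the language already understands.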

You can treat functions as values — pass them around, store them, build them — which made Lisp the early home of functional programming and ideas like higher-order functions and closures, decades before they reached the mainstream.
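For readers who have not met these terms, a small sketch in Python of functions as values. The names `make_adder` and `compose` are illustrative only.

```python
def make_adder(n):
    """Build and return a function that closes over n."""
    def add(x):
        return x + n                    # n is remembered from the enclosing call
    return add

def compose(f, g):
    """Higher-order function: takes two functions, returns their composition."""
    return lambda x: f(g(x))

add_three = make_adder(3)               # a closure, created at run time
double = lambda x: x * 2
pipeline = compose(add_three, double)   # x -> (x * 2) + 3

# Functions can be stored in data structures like any other value.
table = {"inc": make_adder(1), "pipe": pipeline}

# And passed as arguments to other functions.
results = list(map(pipeline, [1, 2, 3]))
```

With these definitions, `results` is `[5, 7, 9]` and `table["inc"](41)` returns 42.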

And you can do the thing that gives this essay its punchline: you can write an interpreter for Lisp, in Lisp, in about a page.

The Maxwell's equations of software

In his 1960 paper, McCarthy defined a function called eval — a Lisp program that takes a Lisp expression (as data, naturally) and evaluates it, returning its value. But to define eval is to define what Lisp means: it spells out, in Lisp itself, exactly how every kind of expression should be computed. It is Lisp explaining Lisp to itself, with nothing left over. This is the original metacircular evaluator.

Alan Kay tells the story of encountering this and being stunned. He reached for physics to name the feeling: a few lines of definition from which an entire universe of behavior unfolds. He called McCarthy's eval "the Maxwell's equations of software."

The comparison is apt. James Clerk Maxwell compressed all of classical electromagnetism into a handful of equations; from those few lines, every classical electric and magnetic phenomenon follows. McCarthy's half-page of Lisp does the same for a whole model of computation: from it, the meaning of the entire language follows. To read it is to see a language hold up a mirror to itself and lose nothing in the reflection, which is only possible because, in Lisp, the thing being described (code) and the thing doing the describing (data) are made of the same substance.
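The shape of such an evaluator can be sketched in a few dozen lines. What follows is not McCarthy's definition, and because it is written in Python rather than Lisp it is not metacircular. It is a minimal sketch assuming Python lists for Lisp lists, strings for symbols, and only three special forms (quote, if, lambda). The names `lisp_eval` and `GLOBAL_ENV` are hypothetical.

```python
import operator

# Assumptions for this sketch: Python lists stand in for Lisp lists, Python
# strings for symbols, and only quote, if, and lambda are special forms.
GLOBAL_ENV = {
    "+": operator.add,
    "-": operator.sub,
    "*": operator.mul,
    "<": operator.lt,
    "car": lambda xs: xs[0],
    "cdr": lambda xs: xs[1:],
    "cons": lambda x, xs: [x] + xs,
}

def lisp_eval(expr, env):
    """Evaluate expr (code held as data) in env (a dict from names to values)."""
    if isinstance(expr, str):                 # symbol: look up its value
        return env[expr]
    if not isinstance(expr, list):            # number: evaluates to itself
        return expr
    head = expr[0]
    if head == "quote":                       # (quote x): return x unevaluated
        return expr[1]
    if head == "if":                          # (if test then else)
        _, test, then_branch, else_branch = expr
        chosen = then_branch if lisp_eval(test, env) else else_branch
        return lisp_eval(chosen, env)
    if head == "lambda":                      # (lambda (params...) body)
        _, params, body = expr
        def closure(*args):
            # Extend the defining environment with the argument bindings.
            return lisp_eval(body, {**env, **dict(zip(params, args))})
        return closure
    # Anything else is an application: evaluate operator and operands, then call.
    function = lisp_eval(head, env)
    arguments = [lisp_eval(arg, env) for arg in expr[1:]]
    return function(*arguments)

# Quoting yields code as plain data; evaluating that data runs it.
quoted = lisp_eval(["quote", ["+", 1, 2]], GLOBAL_ENV)
three = lisp_eval(quoted, GLOBAL_ENV)

# Recursion without definitions: the function receives itself as an argument.
factorial = ["lambda", ["self", "n"],
             ["if", ["<", "n", 2],
              1,
              ["*", "n", ["self", "self", ["-", "n", 1]]]]]
result = lisp_eval([["lambda", ["f"], ["f", "f", 5]], factorial], GLOBAL_ENV)
```

Here `quoted` is the list `["+", 1, 2]`, `three` is 3, and `result` is 120. Even in this reduced form, the evaluator is a case analysis over list shapes, which is the sense in which defining eval pins down what each kind of expression means.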

The long shadow

Lisp's direct descendants are a distinguished family — Scheme, Common Lisp, Clojure, and, by way of its list-processing and clean recursion, Papert's Logo, the language that put these ideas in front of children. Through the textbook Structure and Interpretation of Computer Programs, which builds the metacircular evaluator as its climax, generations of programmers met the idea that a program is a medium for expressing ideas, and that the most powerful thing a language can do is let you reshape it.

But Lisp's deeper influence is sneakier: much of what's praised as "new" in modern languages is Lisp arriving late. Garbage collection, first-class functions, closures, dynamic typing, the REPL, interactive development: all were in the Lisp family early, and trickled into the mainstream over the following decades, often without credit. The recurring joke, usually cited as Greenspun's tenth rule, is that any sufficiently complicated C or Fortran program contains an ad hoc, informally specified, bug-ridden, slow implementation of half of Common Lisp.

There's a melancholy version of this story, too. For all its power, Lisp never dominated the way its admirers thought it should, losing ground to cruder, more static languages. It is a textbook case of Richard Gabriel's "worse is better," and not unrelated to why Smalltalk's elegant dynamism also stayed marginal. The most flexible ideas were often not the ones that spread.

But the idea itself is indestructible, because it's not really about a language. It's about a stance: that the boundary between the program and the things the program works on is not fixed by nature, and that enormous power is waiting on the other side of erasing it. McCarthy wrote that down in 1960, in a page of lists. We're still unpacking it.
