August 02, 2004

The Early History of Smalltalk

Alan C. Kay

Abstract

Most ideas come from previous ideas. The sixties, particularly in the arpa community, gave rise to a host of notions about “human-computer symbiosis” through interactive time-shared computers, graphics screens and pointing devices. Advanced computer languages were invented to simulate complex systems such as oil refineries and semi-intelligent behavior. The soon-to-follow paradigm shift of modern personal computing, overlapping window interfaces, and object-oriented design came from seeing the work of the sixties as something more than a “better old thing.” That is, more than a better way: to do mainframe computing; for end-users to invoke functionality; to make data structures more abstract. Instead the promise of exponential growth in computing volume demanded that the sixties be regarded as “almost a new thing” and to find out what the actual “new things” might be. For example, one would compute with a handheld “Dynabook” in a way that would not be possible on a shared mainframe; millions of potential users meant that the user interface would have to become a learning environment along the lines of Montessori and Bruner; and needs for large scope, reduction in complexity, and end-user literacy would require that data and control structures be done away with in favor of a more biological scheme of protected universal cells interacting only through messages that could mimic any desired behavior.

Early Smalltalk was the first complete realization of these new points of view as parented by its many predecessors in hardware, language and user interface design. It became the exemplar of the new computing, in part, because we were actually trying for a qualitative shift in belief structures—a new Kuhnian paradigm in the same spirit as the invention of the printing press—and thus took highly extreme positions which almost forced these new styles to be invented.

Introduction

I’m writing this introduction in an airplane at 35,000 feet. On my lap is a five pound notebook computer—1992’s “Interim Dynabook”—by the end of the year it sold for under $700. It has a flat, crisp, high-resolution bitmap screen, overlapping windows, icons, a pointing device, considerable storage and computing capacity, and its best software is object-oriented. It has advanced networking built-in and there are already options for wireless networking. Smalltalk runs on this system, and is one of the main systems I use for my current work with children. In some ways this is more than a Dynabook (quantitatively), and some ways not quite there yet (qualitatively). All in all, pretty much what was in mind during the late sixties.

Smalltalk was part of this larger pursuit of arpa, and later of Xerox parc, that I called personal computing. There were so many people involved in each stage from the research communities that the accurate allocation of credit for ideas is intractably difficult. Instead, as Bob Barton liked to quote Goethe, we should “share in the excitement of discovery without vain attempts to claim priority.”

I will try to show where most of the influences came from and how they were transformed in the magnetic field formed by the new personal computing metaphor. It was the attitudes as well as the great ideas of the pioneers that helped Smalltalk get invented. Many of the people I admired most at this time—such as Ivan Sutherland, Marvin Minsky, Seymour Papert, Gordon Moore, Bob Barton, Dave Evans, Butler Lampson, Jerome Bruner, and others—seemed to have a splendid sense that their creations, though wonderful by relative standards, were not near to the absolute thresholds that had to be crossed. Small minds try to form religions, the great ones just want better routes up the mountain. Where Newton said he saw further by standing on the shoulders of giants, computer scientists all too often stand on each other’s toes. Myopia is still a problem where there are giants’ shoulders to stand on—“outsight” is better than insight—but it can be minimized by using glasses whose lenses are highly sensitive to esthetics and criticism.

Programming languages can be categorized in a number of ways: imperative, applicative, logic-based, problem-oriented, etc. But they all seem to be either an “agglutination of features” or a “crystallization of style.” cobol, pl/1, Ada, etc., belong to the first kind; lisp, apl— and Smalltalk—are the second kind. It is probably not an accident that the agglutinative languages all seem to have been instigated by committees, and the crystallization languages by a single person.

Smalltalk’s design—and existence—is due to the insight that everything we can describe can be represented by the recursive composition of a single kind of behavioral building block that hides its combination of state and process inside itself and can be dealt with only through the exchange of messages. Philosophically, Smalltalk’s objects have much in common with the monads of Leibniz and the notions of 20th century physics and biology. Its way of making objects is quite Platonic in that some of them act as idealisations of concepts—Ideas—from which manifestations can be created. That the Ideas are themselves manifestations (of the Idea-Idea) and that the Idea-Idea is a-kind-of Manifestation-Idea—which is a-kind-of itself, so that the system is completely self-describing— would have been appreciated by Plato as an extremely practical joke [Plato].

In computer terms, Smalltalk is a recursion on the notion of computer itself. Instead of dividing “computer stuff” into things each less strong than the whole—like data structures, procedures, and functions which are the usual paraphernalia of programming languages—each Smalltalk object is a recursion on the entire possibilities of the computer. Thus its semantics are a bit like having thousands and thousands of computers all hooked together by a very fast network. Questions of concrete representation can thus be postponed almost indefinitely because we are mainly concerned that the computers behave appropriately, and are interested in particular strategies only if the results are off or come back too slowly.
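To make this concrete in modern terms, here is a minimal sketch in Python (the class and message names are invented for illustration; early Smalltalk itself looked quite different): every object is reached only through message sends, and the receiver alone decides what a message means.

class Obj:
    # Base "cell": all interaction goes through send(); state stays private.
    def send(self, selector, *args):
        handler = getattr(self, "msg_" + selector, None)
        if handler is None:
            return self.send("doesNotUnderstand", selector)
        return handler(*args)

    def msg_doesNotUnderstand(self, selector):
        raise AttributeError("message not understood: " + selector)

class Counter(Obj):
    # One "little computer": its count is sealed inside.
    def __init__(self):
        self._count = 0

    def msg_increment(self):
        self._count += 1

    def msg_value(self):
        return self._count

c = Counter()
c.send("increment")
c.send("increment")
print(c.send("value"))    # -> 2

The point of the sketch is the shape, not the syntax: the sender only knows that a request was made and an answer came back, so questions of representation stay postponed behind the message interface.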

Though it has noble ancestors indeed, Smalltalk’s contribution is a new design paradigm—which I called object-oriented—for attacking large problems of the professional programmer, and making small ones possible for the novice user. Object-oriented design is a successful attempt to qualitatively improve the efficiency of modeling the ever more complex dynamic systems and user relationships made possible by the silicon explosion.

“We would know what they thought
when they did it.”
—Richard Hamming

“Memory and imagination are but two
words for the same thing.”
—Thomas Hobbes

In this history I will try to be true to Hamming’s request as moderated by Hobbes’ observation. I have had difficulty in previous attempts to write about Smalltalk because my emotional involvement has always been centered on personal computing as an amplifier for human reach—rather than programming system design—and we haven’t got there yet. Though I was the instigator and original designer of Smalltalk, it has always belonged more to the people who make it work and got it out the door, especially Dan Ingalls and Adele Goldberg. Each of the LRGers contributed in deep and remarkable ways to the project, and I wish there was enough space to do them all justice. But I think all of us would agree that for most of the development of Smalltalk, Dan was the central figure. Programming is at heart a practical art in which real things are built, and a real implementation thus has to exist. In fact many if not most languages are in use today not because they have any real merits but because of their existence on one or more machines, their ability to be bootstrapped, etc. But Dan was far more than a great implementer, he also became more and more of the designer, not just of the language but also of the user interface as Smalltalk moved into the practical world.

Here, I will try to center focus on the events leading up to Smalltalk-72 and its transition to its modern form as Smalltalk-76. Most of the ideas occurred here, and many of the earliest stages of oop are poorly documented in references almost impossible to find.

This history is too long, but I was amazed at how many people and systems that had an influence appear only as shadows or not at all. I am sorry not to be able to say more about Bob Balzer, Bob Barton, Danny Bobrow, Steve Carr, Wes Clark, Barbara Deutsch, Peter Deutsch, Bill Duvall, Bob Flegal, Laura Gould, Bruce Horn, Butler Lampson, Dave Liddle, William Newman, Bill Paxton, Trygve Reenskaug, Dave Robson, Doug Ross, Paul Rovner, Bob Sproull, Dan Swinehart, Bert Sutherland, Bob Taylor, Warren Teitelman, Bonnie Tennenbaum, Chuck Thacker, and John Warnock. Worse, I have omitted to mention many systems whose design I detested, but that generated considerable useful ideas and attitudes in reaction. In other words, “histories” should not be believed very seriously but considered as “feeble gestures off” done long after the actors have departed the stage.

Thanks to the numerous reviewers for enduring the many drafts they had to comment on. Special thanks to Mike Mahoney for helping so gently that I heeded his suggestions and so well that they greatly improved this essay—and to Jean Sammet, an old old friend, who quite literally frightened me into finishing it—I did not want to find out what would happen if I were late. Sherri McLoughlin and Kim Rose were of great help in getting all the materials together.

I. 1960-66—Early oop and other formative ideas of the sixties

Though oop came from many motivations, two were central. The large scale one was to find a better module scheme for complex systems involving hiding of details, and the small scale one was to find a more flexible version of assignment, and then to try to eliminate it altogether. As with most new ideas, it originally happened in isolated fits and starts.

New ideas go through stages of acceptance, both from within and without. From within, the sequence moves from “barely seeing” a pattern several times, then noting it but not perceiving its “cosmic” significance, then using it operationally in several areas, then comes a “grand rotation” in which the pattern becomes the center of a new way of thinking, and finally, it turns into the same kind of inflexible religion that it originally broke away from. From without, as Schopenhauer noted, the new idea is first denounced as the work of the insane, in a few years it is considered obvious and mundane, and finally the original denouncers will claim to have invented it.

True to the stages, I “barely saw” the idea several times ca. 1961 while a programmer in the Air Force. The first was on the Burroughs 220 in the form of a style for transporting files from one Air Training Command installation to another. There were no standard operating systems or file formats back then, so some (to this day unknown) designer decided to finesse the problem by taking each file and dividing it into three parts. The third part was all of the actual data records of arbitrary size and format. The second part contained the b220 procedures that knew how to get at records and fields to copy and update the third part. And the first part was an array of relative pointers into entry points of the procedures in the second part (the initial pointers were in a standard order representing standard meanings). Needless to say, this was a great idea, and was used in many subsequent systems until the enforced use of cobol drove it out of existence.
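The trick is easy to render in modern terms. Below is a sketch in Python (the slot names and record format are invented for illustration) of the three-part layout: a table of pointers into the access procedures, the procedures themselves, and the raw data that only those procedures know how to read.

# Standard slot meanings, in the standard order ("part one").
GET_RECORD, GET_FIELD, UPDATE = 0, 1, 2

def make_file(records):
    # "Part two": procedures that know this file's private record format.
    procedures = [
        lambda data, i: data[i],                          # get record i
        lambda data, i, f: data[i][f],                    # get field f of record i
        lambda data, i, f, v: data[i].__setitem__(f, v),  # update a field
    ]
    # "Part three" is the data; a reader only goes through the table.
    return {"table": procedures, "data": records}

def call(file, slot, *args):
    return file["table"][slot](file["data"], *args)

f = make_file([{"temp": 71}, {"temp": 68}])
call(f, UPDATE, 1, "temp", 70)
print(call(f, GET_FIELD, 1, "temp"))   # -> 70

Because every reader calls through the standard slots rather than touching the data directly, each tape effectively carries its own access code: a procedural interface to a module.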

The second barely-seeing of the idea came just a little later when atc decided to replace the 220 with a b5000. I didn’t have the perspective to really appreciate it at the time, but I did take note of its segmented storage system, its efficiency of hll compilation and byte-coded execution, its automatic mechanisms for subroutine calling and multiprocess switching, its pure code for sharing, its protected mechanisms, etc. And, I saw that the access to its Program Reference Table corresponded to the 220 file system scheme of providing a procedural interface to a module. However, my big hit from this machine at this time was not the oop idea, but some insights into hll translation and evaluation. [Barton, 1961] [Burroughs, 1961]

After the Air Force, I worked my way through the rest of college by programming mostly retrieval systems for large collections of weather data for the National Center for Atmospheric Research. I got interested in simulation in general—particularly of one machine by another—but aside from doing a one-dimensional version of a bit-field block transfer (bitblt) on a cdc 6600 to simulate word sizes of various machines, most of my attention was distracted by school, or I should say the theatre at school. While in Chippewa Falls helping to debug the 6600, I read an article by Gordon Moore which predicted that integrated silicon on chips was going to exponentially improve in density and cost over many years [Moore 65]. At the time in 1965, standing next to the room-sized freon-cooled 10 mip 6600, his astounding predictions had little projection into my horizons.

Sketchpad and Simula

Through a series of flukes, I wound up in graduate school at the University of Utah in the Fall of 1966, “knowing nothing.” That is to say, I had never heard of arpa or its projects, or that Utah’s main goal in this community was to solve the “hidden line” problem in 3d graphics, until I actually walked into Dave Evans’ office looking for a job and a desk. On Dave’s desk was a foot-high stack of brown covered documents, one of which he handed to me: “Take this and read it.”

Every newcomer got one. The title was “Sketchpad: A man-machine graphical communication system” [Sutherland, 1963]. What it could do was quite remarkable, and completely foreign to any use of a computer I had ever encountered. The three big ideas that were easiest to grapple with were: it was the invention of modern interactive computer graphics; things were described by making a “master drawing” that could produce “instance drawings”; control and dynamics were supplied by “constraints,” also in graphical form, that could be applied to the masters to shape and inter-relate parts. Its data structures were hard to understand—the only vaguely familiar construct was the embedding of pointers to procedures and using a process called reverse indexing to jump through them to routines, like the 220 file system [Ross, 1961]. It was the first to have clipping and zooming windows—one “sketched” on a virtual sheet about 1/3 mile square!

Head whirling, I found my desk. On it was a pile of tapes and listings, and a note: “This is the Algol for the 1108. It doesn’t work. Please make it work.” The latest graduate student gets the latest dirty task.

The documentation was incomprehensible. Supposedly, this was the Case-Western Reserve 1107 Algol—but it had been doctored to make a language called Simula; the documentation read like Norwegian transliterated into English, which in fact it was. There were uses of words like activity and process that didn’t seem to coincide with normal English usage.

Finally, another graduate student and I unrolled the program listing 80 feet down the hall and crawled over it yelling discoveries to each other. The weirdest part was the storage allocator, which did not obey a stack discipline as was usual for Algol. A few days later, that provided the clue. What Simula was allocating were structures very much like the instances of Sketchpad. There were descriptions that acted like masters and they could create instances, each of which was an independent entity. What Sketchpad called masters and instances, Simula called activities and processes. Moreover, Simula was a procedural language for controlling Sketchpad-like objects, thus having considerably more flexibility than constraints (though at some cost in elegance) [Nygaard, 1966, Nygaard, 1983].

This was the big hit, and I’ve not been the same since. I think the reason the hit had such impact was that I had seen the idea enough times in enough different forms that the final recognition was in such general terms to have the quality of an epiphany. My math major had centered on abstract algebras with their few operations generally applying to many structures. My biology major had focused on both cell metabolism and larger scale morphogenesis with its notions of simple mechanisms controlling complex processes and one kind of building block able to differentiate into all needed building blocks. The 220 file system, the b5000, Sketchpad, and finally Simula, all used the same idea for different purposes. Bob Barton, the main designer of the b5000 and a professor at Utah, had said in one of his talks a few days earlier: “The basic principle of recursive design is to make the parts have the same power as the whole.” For the first time I thought of the whole as the entire computer and wondered why anyone would want to divide it up into weaker things called data structures and procedures. Why not divide it up into little computers, as time sharing was starting to? But not in dozens. Why not thousands of them, each simulating a useful structure?

I recalled the monads of Leibniz, the “dividing nature at its joints” discourse of Plato, and other attempts to parse complexity. Of course, philosophy is about opinion and engineering is about deeds, with science the happy medium somewhere in between. It is not too much of an exaggeration to say that most of my ideas from then on took their roots from Simula—but not as an attempt to improve it. It was the promise of an entirely new way to structure computations that took my fancy. As it turned out, it would take quite a few years to understand how to use the insights and to devise efficient mechanisms to execute them.

II. 1967-69—The flex Machine, a first attempt at an oop-based personal computer

Dave Evans was not a great believer in graduate school as an institution. As with many of the arpa “contracts” he wanted his students to be doing “real things”; they should move through graduate school as quickly as possible; and their theses should advance the state of the art. Dave would often get consulting jobs for his students, and in early 1967, he introduced me to Ed Cheadle, a friendly hardware genius at a local aerospace company who was working on a “little machine.” It was not the first personal computer—that was the linc of Wes Clark—but Ed wanted it for noncomputer professionals, in particular, he wanted to program it in a higher level language, like basic. I said: “What about joss? It’s nicer.” He said: “Sure, whatever you think,” and that was the start of a very pleasant collaboration we called the flex machine. As we got deeper into the design, we realized that we wanted to dynamically simulate and extend, neither of which joss (or any existing language that I knew of) was particularly good at. The machine was too small for Simula, so that was out. The beauty of joss was the extreme attention of its design to the end-user—in this respect, it has not been surpassed [Joss 1964, Joss 1978]. joss was too slow for serious computing (but cf. Lampson 65), and did not have real procedures, variable scope, and so forth. A language that looked a little like joss but had considerably more potential power was Wirth’s euler [Wirth 1966]. This was a generalization of Algol along lines first set forth by van Wijngaarden [van Wijngaarden 1963] in which types were discarded, different features consolidated, procedures were made into first class objects, and so forth. Actually kind of lisp-like, but without the deeper insights of lisp.

But euler was enough of “an almost new thing” to suggest that the same techniques be applied to simplify Simula. The euler compiler was a part of its formal definition and made a simple conversion into b5000-like byte-codes. This was appealing because it suggested that Ed’s little machine could run byte-codes emulated in the longish slow microcode that was then possible. The euler compiler, however, was tortuously rendered in an “extended precedence” grammar that actually required concessions in the language syntax (e.g. “,” could only be used in one role because the precedence scheme had no state space). I initially adopted a bottom-up Floyd-Evans parser (adapted from Jerry Feldman’s original compiler-compiler [Feldman 1977]) and later went to various top-down schemes, several of them related to Shorre’s meta ii [Shorre 1963], that eventually put the translator in the name space of the language.

The semantics of what was now called the flex language needed to be influenced more by Simula than by Algol or euler. But it was not completely clear how. Nor was it clear how the users should interact with the system. Ed had a display (for graphing, etc.) even on his first machine, and the linc had a “glass teletype,” but a Sketchpad-like system seemed far beyond the scope that we could accomplish with the maximum of 16k 16-bit words that our cost budget allowed.

Doug Engelbart and nls

This was in early 1967, and while we were pondering the flex machine, Utah was visited by Doug Engelbart. A prophet of Biblical dimensions, he was very much one of the fathers of what on the flex machine I had started to call “personal computing.” He actually traveled with his own 16mm projector with a remote control for starting and stopping it to show what was going on (people were not used to seeing and following cursors back then). His notion of the arpa dream was that the destiny of Online Systems (nls) was the “augmentation of human intellect” via an interactive vehicle navigating through “thought vectors in concept space.” What his system could do then—even by today’s standards—was incredible. Not just hypertext, but graphics, multiple panes, efficient navigation and command input, interactive collaborative work, etc. An entire conceptual world and world view [Engelbart 68]. The impact of this vision was to produce in the minds of those who were “eager to be augmented” a compelling metaphor of what interactive computing should be like, and I immediately adopted many of the ideas for the flex machine.

In the midst of the arpa context of human-computer symbiosis and in the presence of Ed’s “little machine”, Gordon Moore’s “Law” again came to mind, this time with great impact. For the first time I made the leap of putting the room-sized interactive tx-2 or even a 10 mip 6600 on a desk. I was almost frightened by the implications; computing as we knew it couldn’t survive—the actual meaning of the word changed—it must have been the same kind of disorientation people had after reading Copernicus and first looked up from a different Earth to a different Heaven.

Instead of at most a few thousand institutional mainframes in the world—even today in 1992 it is estimated that there are only 4000 ibm mainframes in the entire world—and at most a few thousand users trained for each application, there would be millions of personal machines and users, mostly outside of direct institutional control. Where would the applications and training come from? Why should we expect an applications programmer to anticipate the specific needs of a particular one of the millions of potential users? An extensional system seemed to be called for in which the end-users would do most of the tailoring (and even some of the direct constructions) of their tools. arpa had already figured this out in the context of their early successes in time-sharing. Their larger metaphor of human-computer symbiosis helped the community avoid making a religion of their subgoals and kept them focused on the abstract holy grail of “augmentation.”

One of the interesting features of nls was that its user interface was parametric and could be supplied by the end user in the form of a “grammar of interaction” given in their compiler-compiler TreeMeta. This was similar to William Newman’s early “Reaction Handler” [Newman 66] work in specifying interfaces by having the end-user or developer construct through tablet and stylus an iconic regular expression grammar with action procedures at the states (nls allowed embeddings via its context free rules). This was attractive in many ways, particularly William’s scheme, but to me there was a monstrous bug in this approach. Namely, these grammars forced the user to be in a system state which required getting out of before any new kind of interaction could be done. In hierarchical menus or “screens” one would have to backtrack to a master state in order to go somewhere else. What seemed to be required were states in which there was a transition arrow to every other state—not a fruitful concept in formal grammar theory. In other words, a much “flatter” interface seemed called for—but could such a thing be made interesting and rich enough to be useful?

Again, the scope of the flex machine was too small for a miniNLS, and we were forced to find alternate designs that would incorporate some of the power of the new ideas, and in some cases to improve them. I decided that Sketchpad’s notion of a general window that viewed a larger virtual world was a better idea than restricted horizontal panes and with Ed came up with a clipping algorithm very similar to that under development at the same time by Sutherland and his students at Harvard for the 3d “virtual reality” helmet project [Sutherland 1968].

Object references were handled on the flex machine as a generalization of b5000 descriptors. Instead of a few formats for referencing numbers, arrays, and procedures, a flex descriptor contained two pointers: the first to the “master” of the object, and the second to the object instances (later we realized that we should put the master pointer in the instance to save space). A different method was taken for handling generalized assignment. The b5000 used l-values and r-values [Strachey*] which worked for some cases but couldn’t handle more complex objects. For example: a[55] := 0—if a was a sparse array whose default element was 0—would still generate an element in the array because := is an “operator” and a[55] is dereferenced into an l-value before anyone gets to see that the r-value is the default element, regardless of whether a is an array or a procedure fronting for an array. What is needed is something like: a(55 := 0), which can look at all relevant operands before any store is made. In other words, := is not an operator, but a kind of index that can select a behavior from a complex object. It took me a remarkably long time to see this, partly I think because one has to invert the traditional notion of operators and functions, etc., to see that objects need to privately own all of their behaviors: that objects are a kind of mapping whose values are its behaviors. A book on logic by Carnap [Ca *] helped by showing that “intensional” definitions covered the same territory as the more traditional extensional technique and were often more intuitive and convenient.
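The point can be sketched in Python (the names at_put and at are invented stand-ins for a(i := v) and a(i)): because assignment arrives as a message, the object sees all the operands before any store happens, and a sparse array can decline to materialize a default element.

class SparseArray:
    def __init__(self, default=0):
        self._default = default
        self._cells = {}                     # only non-default elements are stored

    def at_put(self, index, value):          # plays the role of a(55 := 0)
        if value == self._default:
            self._cells.pop(index, None)     # no element is generated
        else:
            self._cells[index] = value

    def at(self, index):                     # plays the role of a(55)
        return self._cells.get(index, self._default)

a = SparseArray(default=0)
a.at_put(55, 0)       # behaves as a no-op; nothing is materialized
print(a.at(55))       # -> 0, supplied by the default

Here := really is “a kind of index”: at_put is just one more behavior the object selects, privately owned like everything else it does.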

As in Simula, a coroutine control structure [Conway, 1963] was used as a way to suspend and resume objects. Persistent objects like files and documents were treated as suspended processes and were organized according to their Algol-like static variable scopes. These were shown on the screen and could be opened by pointing at them. Coroutining was also used as a control structure for looping. A single operator while was used to test the generators which returned false when unable to furnish a new value. Booleans were used to link multiple generators. So a “for-type” loop would be written as:

while i <= 1 to 30 by 2 ∧ j <= 2 to k by 3 do j <- j * i;

where the … to … by … was a kind of coroutine object. Many of these ideas were reimplemented in a stronger style in Smalltalk later on.
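In modern terms the “… to … by …” objects behave like generators. A rough Python sketch (not the flex syntax) of the loop above:

def to_by(start, stop, step):
    # stands in for flex's "start to stop by step" coroutine object
    i = start
    while i <= stop:
        yield i
        i += step

# zip stops as soon as either generator is exhausted, playing the role of
# the boolean that linked the two generators in the flex while loop.
for i, j in zip(to_by(1, 30, 2), to_by(2, 14, 3)):
    print(j * i)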

Another control structure of interest in flex was a kind of event-driven “soft interrupt” called when. Its boolean expression was compiled into a “tournament sort” tree that cached all possible intermediate results. The relevant variables were threaded through all of the sorting trees in all of the whens so that any change only had to compute through the necessary parts of the booleans. The efficiency was very high and was similar to the techniques now used for spreadsheets. This was an embarrassment of riches with difficulties often encountered in event-driven systems. Namely, it was a complex task to control the context of just when the whens should be sensitive. Part of the boolean expression had to be used to check the contexts, whereas I felt that somehow the structure of the program should be able to set and unset the event drivers. This turned out to be beyond the scope of the flex system and needed to wait for a better architecture.
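The flavor of when, though not its clever caching, can be sketched in Python (an invented scheme; the tournament-sort tree of cached intermediate results is omitted): each watched variable knows which whens mention it, so a store re-tests only the affected conditions, much as a spreadsheet recomputes only dependent cells.

class Whens:
    def __init__(self):
        self._watchers = {}                  # variable name -> interested whens
        self._vars = {}

    def when(self, names, condition, action):
        for n in names:
            self._watchers.setdefault(n, []).append((condition, action))

    def store(self, name, value):            # every assignment is the event
        self._vars[name] = value
        for condition, action in self._watchers.get(name, []):
            if condition(self._vars):
                action(self._vars)

w = Whens()
w.when(["temp"], lambda v: v.get("temp", 0) > 100,
       lambda v: print("boiler alarm at", v["temp"]))
w.store("temp", 80)       # condition false: nothing happens
w.store("temp", 120)      # triggers the alarm

The sketch also shows the difficulty mentioned above: nothing here says in which contexts a when should be sensitive at all.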

Still, quite a few of the original flex ideas in their proto-object form did turn out to be small enough to be feasible on the machine. I was writing the first compiler when something unusual happened: the Utah graduate students got invited to the arpa contractors meeting held that year at Alta, Utah. Towards the end of the three days, Bob Taylor, who had succeeded Ivan Sutherland as head of arpa-ipto, asked the graduate students (sitting in a ring around the outside of the 20 or so contractors) if they had any comments. John Warnock raised his hand and pointed out that since the arpa grad students would all soon be colleagues (and since we did all the real work anyway), arpa should have a contractors-type meeting each year for the grad students. Taylor thought this was a great idea and set it up for the next summer.

Another ski-lodge meeting happened in Park City later that spring. The general topic was education and it was the first time I heard Marvin Minsky speak. He put forth a terrific diatribe against traditional education methods, and from him I heard the ideas of Piaget and Papert for the first time. Marvin’s talk was about how we think about complex situations and why schools are really bad places to learn these skills. He didn’t have to make any claims about computer+kids to make his point. It was clear that education and learning had to be rethought in the light of 20th century cognitive psychology and how good thinkers really think. Computing enters as a new representation system with new and useful metaphors for dealing with complexity, especially of systems [Minsky 70].

For the summer 1968 arpa grad students meeting at Allerton House in Illinois, I boiled all the mechanisms in the flex machine down into one 2’x3’ chart. This included all the “object structures,” the compiler, the byte-code interpreter, i/o handlers, and a simple display editor for text and graphics. The grad students were a distinguished group that did indeed become colleagues in subsequent years. My flex machine talk was a success, but the big whammy for me came during a tour of U of Illinois where I saw a 1” square lump of glass and neon gas in which individual spots would light up on command—it was the first flat-panel display. I spent the rest of the conference calculating just when the silicon of the flex machine could be put on the back of the display. According to Gordon Moore’s “Law”, the answer seemed to be sometime in the late seventies or early eighties. A long time off—it seemed too long to worry much about it then.

But later that year at rand I saw a truly beautiful system. This was grail, the graphical follow-on to joss. The first tablet (the famous rand tablet) was invented by Tom Ellis [Davis 1964] in order to capture human gestures, and Gabe Groner wrote a program to efficiently recognize and respond to them [Groner 1966]. Though everything was fastened with bubble gum and the system crashed often, I have never forgotten my first interactions with this system. It was direct manipulation, it was analogical, it was modeless, it was beautiful. I realized that the flex interface was all wrong, but how could something like grail be stuffed into such a tiny machine since it required all of a stand-alone 360/44 to run in?

A month later, I finally visited Seymour Papert, Wally Feurzeig, Cynthia Solomon and some of the other original researchers who had built logo and were using it with children in the Lexington schools. Here were children doing real programming with a specially designed language and environment. As with Simula’s leading to oop, this encounter finally hit me with what the destiny of personal computing really was going to be. Not a personal dynamic vehicle, as in Engelbart’s metaphor opposed to the ibm “railroads”, but something much more profound: a personal dynamic medium. With a vehicle one could wait until high school and give “drivers ed”, but if it was a medium, it had to extend into the world of childhood.

Now the collision of the flex machine, the flat-screen display, grail, Barton’s “communications” talk, McLuhan, and Papert’s work with children all came together to form an image of what a personal computer really should be. I remembered Aldus Manutius who 40 years after the printing press put the book into its modern dimensions by making it fit into saddlebags. It had to be no larger than a notebook, and needed an interface as friendly as joss’, grail’s, and logo’s, but with the reach of Simula and flex. A clear romantic vision has a marvelous ability to focus thought and will. Now it was easy to know what to do next. I built a cardboard model of it to see what it would look and feel like, and poured in lead pellets to see how light it would have to be (less than two pounds). I put a keyboard on it as well as a stylus because, even if handprinting and writing were recognized perfectly (and there was no reason to expect that it would be), there still needed to be a balance between the lowspeed tactile degrees of freedom offered by the stylus and the more limited but faster keyboard. Since arpa was starting to experiment with packet radio, I expected that the Dynabook, when it arrived a decade or so hence, would have a wireless networking system.

Early next year (1969) there was a conference on Extensible Languages in which almost every famous name in the field attended. The debate was great and weighty—it was a religious war of unimplemented, poorly thought out ideas. As Alan Perlis, one of the great men in Computer Science, put it with characteristic wit:

It has been such a long time since I have seen so many familiar faces shouting among so many familiar ideas. Discovery of something new in programming languages, like any discovery, has somewhat the same sequence of emotions as falling in love. A sharp elation followed by euphoria, a feeling of uniqueness, and ultimately the wandering eye (the urge to generalize) [acm 69].

But it was all talk—no one had done anything yet. In the midst of all this, Ned Irons got up and presented imp, a system that had already been working for several years that was more elegant than most of the nonworking proposals. The basic idea of imp was that you could use any phrase in the grammar as a procedure heading and write a semantic definition in terms of the language as extended so far [Irons 1970].

I had already made the first version of the flex machine syntax driven, but where the meaning of a phrase was defined in the more usual way as the kind of code that was emitted. This separated the compiler-extensor part of the system from the end-user. In Irons’ approach, every procedure in the system defined its own syntax in a natural and useful manner. I incorporated these ideas into the second version of the flex machine and started to experiment with the idea of a direct interpreter rather than a syntax directed compiler. Somewhere in all of this, I realized that the bridge to an object-based system could be in terms of each object as a syntax directed interpreter of messages sent to it. In one fell swoop this would unify object-oriented semantics with the ideal of a completely extensible language. The mental image was one of separate computers sending requests to other computers that had to be accepted and understood by the receivers before anything could happen. In today’s terms every object would be a server offering services whose deployment and discretion depended entirely on the server’s notion of relationship with the servee. As Leibniz said: “To get everything out of nothing, you only need to find one principle.” This was not well thought out enough to do the flex machine any good, but formed a good point of departure for my thesis [Kay 69], which as Ivan Sutherland liked to say was “anything you can get three people to sign.”

After three people signed it (Ivan was one of them), I went to the Stanford AI project and spent much more time thinking about notebook KiddyKomputers than AI. But there were two AI designs that were very intriguing. The first was Carl Hewitt’s planner, a programmable logic system that formed the deductive basis of Winograd’s shrdlu [Sussman 69, Hewitt 69]. I designed several languages based on a combination of the pattern matching schemes of flex and planner [Kay 70]. The second design was Pat Winston’s concept formation system, a scheme for building semantic networks and comparing them to form analogies and learning processes [Winston 70]. It was kind of “object-oriented”. One of its many good ideas was that the arcs of each net which served as attributes in aov triples should themselves be modeled as nets. Thus, for example, a first order arc called left-of could be asked a higher order question such as “What is your converse?” and its net could answer: right-of. This point of view later formed the basis for Minsky’s frame systems [Minsky 75]. A few years later I wished I had paid more attention to this idea.

That fall, I heard a wonderful talk by Butler Lampson about cal-tss, a capability-based operating system that seemed very “object-oriented” [Lampson 69]. Unforgeable pointers (a la b5000) were extended by bit-masks that restricted access to the object’s internal operations. This confirmed my “objects as server” metaphor. There was also a very nice approach to exception handling which reminded me of the way failure was often handled in pattern matching systems. The only problem—which the cal designers did not see as a problem at all—was that only certain (usually large and slow) things were “objects”. Fast things and small things, etc., weren’t. This needed to be fixed.

The biggest hit for me while at sail in late ’69 was to really understand lisp. Of course, every student knew about car, cdr, and cons, but Utah was impoverished in that no one there used lisp and hence, no one had penetrated the mysteries of eval and apply. I could hardly believe how beautiful and wonderful the idea of lisp was [McCarthy 1960]. I say it this way because lisp had not only been around enough to get some honest barnacles, but worse, there were deep flaws in its logical foundations. By this, I mean that the pure language was supposed to be based on functions, but its most important components—such as lambda expressions, quotes, and conds—were not functions at all, and instead were called special forms. Landin and others had been able to get quotes and conds in terms of lambda by tricks that were variously clever and useful, but the flaw remained in the jewel. In the practical language things were better. There were not just exprs (which evaluated their arguments), but fexprs (which did not). My next question was, why on earth call it a functional language? Why not just base everything on fexprs and force evaluation on the receiving side when needed? I could never get a good answer, but the question was very helpful when it came time to invent Smalltalk, because this started a line of thought that said “take the hardest and most profound thing you need to do, make it great, and then build every easier thing out of it”. That was the promise of lisp and the lure of lambda—needed was a better “hardest and most profound” thing. Objects should be it.
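The expr/fexpr distinction is easy to show with a toy evaluator (a Python sketch with an invented representation, not McCarthy’s lisp): an expr receives evaluated arguments, while a fexpr receives the raw forms and forces evaluation on the receiving side only as needed.

def evaluate(form, env):
    if isinstance(form, str):                 # a variable
        return env[form]
    if not isinstance(form, list):            # a literal
        return form
    op, *args = form
    if op in FEXPRS:                          # fexpr: gets the raw forms
        return FEXPRS[op](args, env)
    fn = EXPRS[op]                            # expr: arguments evaluated first
    return fn(*[evaluate(a, env) for a in args])

EXPRS = {"+": lambda a, b: a + b}
FEXPRS = {
    # 'if' must be a fexpr: only the chosen branch may be evaluated.
    "if": lambda args, env: evaluate(args[1], env)
          if evaluate(args[0], env) else evaluate(args[2], env),
    # 'quote' never evaluates its argument at all.
    "quote": lambda args, env: args[0],
}

print(evaluate(["if", True, ["+", 1, 2], ["quote", "never"]], {}))   # -> 3

Basing everything on the fexpr side, with the receiver deciding what gets evaluated and when, is the same move that messages to objects would later make routine.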

III. 1970-72—Xerox parc: The KiddiKomp, miniCOM, and Smalltalk-71

In July 1970, Xerox, at the urging of its chief scientist Jack Goldman, decided to set up a long range research center in Palo Alto, California. In September, George Pake, the former chancellor at Washington University where Wes Clark’s arpa project was sited, hired Bob Taylor (who had left the arpa office and was taking a sabbatical year at Utah) to start a “Computer Science Laboratory.” Bob visited Palo Alto and we stayed up all night talking about it. The Mansfield Amendment was threatening to blindly muzzle the most enlightened arpa funding in favor of directly military research, and this new opportunity looked like a promising alternative. But work for a company? He wanted me to consult and I asked for a direction. He said: follow your instincts. I immediately started working up a new version of the KiddiKomp that could be made in enough quantity to do experiments leading to the user interface design for the eventual notebook. Bob Barton liked to say that “good ideas don’t often scale.” He was certainly right when applied to the flex machine. The b5000 just didn’t directly scale down into a tiny machine. Only the byte-codes did, and even these needed modification. I decided to take another look at Wes Clark’s linc, and was ready to appreciate it much more this time [Clark 1965].

I still liked pattern-directed approaches and oop so I came up with a language design called “Simulation logo” or slogo for short (I had a feeling the first versions might run nice and slow). This was to be built into a sony “tummy trinitron” and would use a coarse bit-map display and the flex machine rubber tablet as a pointing device.

Another beautiful system that I had come across was Peter Deutsch’s pdp-1 lisp (implemented when he was only 15) [Deutsch 1966]. It used only 2k (18-bit words) of code and could run quite well in a 4k machine (it was its own operating system and interface). It seemed that even more could be done if the system were byte-coded, run by an architecture that was hospitable to dynamic systems, and stuck into the ever larger roms that were becoming available. One of the basic insights I had gotten from Seymour was that you didn’t have to do a lot to make a computer an “object for thought” for children, but what you did had to be done well and be able to apply deeply.

Right after New Years 1971, Bob Taylor scored an enormous coup by attracting most of the struggling Berkeley Computer Corp to parc. This group included Butler Lampson, Chuck Thacker, Peter Deutsch, Jim Mitchell, Dick Shoup, Willie Sue Haugeland, and Ed Fiala. Jim Mitchell urged the group to hire Ed McCreight from cmu and he arrived soon after. Gary Starkweather was there already, having been thrown out of the Xerox Rochester Labs for wanting to build a laser printer (which was against the local religion). Not long after, many of Doug Engelbart’s people joined up—part of the reason was that they wanted to reimplement nls as a distributed network system, and Doug wanted to stay with time-sharing. The group included Bill English (the co-inventor of the mouse), Jeff Rulifson, and Bill Paxton.

Almost immediately we got into trouble with Xerox when the group decided that the new lab needed a pdp-10 for continuity with the arpa community. Xerox (which had bought sds essentially sight unseen a few years before) was horrified at the idea of their main competitor’s computer being used in the lab. They balked. The newly formed parc group had a meeting in which it was decided that it would take about three years to do a good operating system for the xds sigma-7 but that we could build “our own pdp-10” in a year. My reaction was “Holy cow!” In fact, they pulled it off with considerable panache. maxc was actually a microcoded emulation of the pdp-10 that used for the first time the new integrated chip memories (1k bits!) instead of core memory. Having practical in-house experience with both of these new technologies was critical for the more radical systems to come.

One little incident of lisp beauty happened when Allen Newell visited parc with his theory of hierarchical thinking and was challenged to prove it. He was given a programming problem to solve while the protocol was collected. The problem was: given a list of items, produce a list consisting of all of the odd indexed items followed by all of the even indexed items. Newell’s internal programming language resembled ipl-v in which pointers are manipulated explicitly, and he got into quite a struggle to do the program. In 2 seconds I wrote down:

oddsEvens(x) = append(odds(x), evens(x))

the statement of the problem in Landin’s lisp syntax—and also the first part of the solution. Then a few seconds later:

where odds(x) = if null(x) v null(tl(x)) then x
                else hd(x) & odds(ttl(x))
      evens(x) = if null(x) v null(tl(x)) then nil
                 else odds(tl(x))

This characteristic of writing down many solutions in declarative form and having them also be the programs is part of the appeal and beauty of this kind of language. Watching a famous guy much smarter than I struggle for more than 30 minutes to not quite solve the problem his way (there was a bug) made quite an impression. It brought home to me once again that “point of view is worth 80 IQ points.” I wasn’t smarter but I had a much better internal thinking tool to amplify my abilities. This incident and others like it made paramount that any tool for children should have great thinking patterns and deep beauty “built-in.”
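For comparison, a direct transcription of the Landin-style solution above into Python (recursion and all):

def odds(x):
    # every other item, starting with the first
    return x if len(x) <= 1 else [x[0]] + odds(x[2:])

def evens(x):
    # the items odds() skips: odds of the tail
    return [] if len(x) <= 1 else odds(x[1:])

def odds_evens(x):
    return odds(x) + evens(x)

print(odds_evens([1, 2, 3, 4, 5]))   # -> [1, 3, 5, 2, 4]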

Right around this time we were involved in another conflict with Xerox management, in particular with Don Pendery, the head “planner”. He really didn’t understand what we were talking about and instead was interested in “trends” and “what was the future going to be like” and how could Xerox “defend against it.” I got so upset I said to him, “Look. The best way to predict the future is to invent it. Don’t worry about what all those other people might do, this is the century in which almost any clear vision can be made!” He remained unconvinced, and that led to the famous “Pendery Papers for parc Planning Purposes,” a collection of essays on various aspects of the future. Mine proposed a version of the notebook as a “Display Transducer,” and Jim Mitchell’s was entitled “nls on a Minicomputer.”

Bill English took me under his wing and helped me start my group as I had always been a lone wolf and had no idea how to do it. One of his suggestions was that I should make a budget. I’m afraid that I really did ask Bill, “What’s a budget?” I remembered at Utah, in pre-Mansfield Amendment days, Dave Evans saying to me as he went off on a trip to arpa, “We’re almost out of money. Got to go get some more.” That seemed about right to me. They give you some money. You spend it to find out what to do next. You run out. They give you some more. And so on. parc never quite made it to that idyll.