Unix and Computer Science

Ronda Hauben

Issue #4, August 1994

Although the word Linux does not appear in this article, it does offer a lot of interesting background on Unix. That background tells us how we got to the point where Linux is a reality.

This year is the 25th anniversary of the invention of the Unix kernel in 1969 at Bell Labs. The following Work In Progress was presented at the Usenix Summer 1993 Conference in Cincinnati, Ohio. A longer paper based on this research has been proposed for the June 1994 Usenix Conference as a contribution to a 25th year commemorative discussion about the significance of the Unix breakthrough and lessons to be learned from it for making the next step forward.

The Multics (1965-68) project had been created to “show that general-purpose, multiuser, timesharing systems were viable”1. Based on the results of research gained at MIT using the MIT Compatible Time-Sharing System (CTSS), AT&T and G.E. agreed to work with MIT to build “new hardware, a new operating system, a new file system, and a new user interface”. Though the project proceeded slowly and it took years to develop Multics, Doug Comer, a Professor of Computer Science at Purdue University, explains that “fundamental issues were uncovered” in the process of the research on Multics, “new approaches were explored and new mechanisms were invented”. The most important, he explains, was that “participants and observers alike became devoted to a new form of computing (the interactive, multiuser, timesharing system). As a result, the Multics project dominated computer systems research for many years, and many of its results are still considered seminal.”

By 1969, however, AT&T made a decision to withdraw from the project. Describing that period, Dennis Ritchie, one of the inventors of Unix at Bell Labs, writes, “By 1969, Bell Labs management, and even the researchers came to believe that the promises of Multics could be fulfilled only too late and too expensively.”2

“Even before the GE-645 Multics machine was removed from the premises,” Ritchie explains, “an informal group led primarily by Ken Thompson, had begun investigating alternatives.”

Thompson and Ritchie presented Bell Labs with proposals to buy them a computer so they could build their own interactive, timesharing operating system. Their proposals weren't acted on. Eventually, Ken Thompson found a little-used and obsolete PDP-7 computer, a tiny machine in the class of a Commodore 64 computer.

The environment Thompson was attempting, explains Ritchie, included “many of the innovative aspects of Multics”, such as “an explicit notion of a process as a locus of control, a tree-structured file system, a command interpreter as a user-level program, simple representation of text files, and generalized access to devices”3. Describing the primitive conditions that Thompson faced when attempting to create his desired programming environment, Ritchie writes:

At the start, Thompson did not even program on the PDP itself, but instead used a set of macros for the GEMAP assembler on a GE-635 machine. A postprocessor generated a paper tape readable by the PDP-7. These tapes were carried from the GE machine to the PDP-7 for testing until a primitive Unix kernel, an editor, an assembler, a simple shell (command interpreter), and a few utilities (like the Unix rm, cat, cp commands) were completed. At this point, the operating system was self-supporting; programs could be written and tested without resort to paper tape, and development continued on the PDP-7 itself.4

The result, Ritchie explains, was that,

Thompson's PDP-7 assembler outdid even DEC's in simplicity; it evaluated expressions and emitted the corresponding bits. There were no libraries, no loader or link editor: the entire source of a program was presented to the assembler, and the output file --with a fixed name-- that emerged was directly executable.5

The operating system was named Unix, to distinguish it from the complexity that burdened Multics.

As work continued on the Bell Labs operating system, the researchers developed a set of principles to guide their work. Among these principles were:

  1. Make each program do one thing well. To do a new job, build afresh rather than complicate old programs by adding new features.

  2. Expect the output of every program to become the input to another, as yet unknown, program. Don't clutter output with extraneous information. Avoid stringently columnar or binary input formats. Don't insist on interactive input.

  3. Design and build software, even operating systems, to be tried early, ideally within weeks. Don't hesitate to throw away the clumsy parts and rebuild them.

  4. Use tools in preference to unskilled help to lighten a programming task, even if you have to detour to build the tools and expect to throw some of them out after you've finished using them.6
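
The first two principles can be illustrated with a small sketch. It is written in Python rather than as the Bell Labs tools themselves, and the names select, number, and pipeline are hypothetical, chosen only for this example. Each stage does one job on plain lines of text, and the output of one stage becomes the input of the next:

```python
from typing import Callable, Iterable, Iterator

# A stage takes lines of text and yields lines of text, nothing else.
Stage = Callable[[Iterable[str]], Iterable[str]]


def select(word: str) -> Stage:
    """Build a stage that keeps only lines containing `word` (roughly what grep does)."""
    def stage(lines: Iterable[str]) -> Iterator[str]:
        return (line for line in lines if word in line)
    return stage


def number(lines: Iterable[str]) -> Iterator[str]:
    """Prefix each line with its 1-based position, separated by a tab."""
    return (f"{i}\t{line}" for i, line in enumerate(lines, start=1))


def pipeline(lines: Iterable[str], *stages: Stage) -> Iterable[str]:
    """Feed the output of each stage into the next, like `a | b | c` in a shell."""
    for stage in stages:
        lines = stage(lines)
    return lines


text = ["unix kernel", "multics", "unix shell"]
result = list(pipeline(text, select("unix"), number))
print(result)
```

Because every stage reads and writes unadorned lines, a new stage can be added to the chain without changing the existing ones, which is the point of the second principle.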

By 1970, Ritchie writes, the Unix researchers were “able to acquire a new DEC PDP-11. The processor”, he remembers, “was among the first of its line delivered by DEC, and three months passed before its disk arrived.” Soon after the machine's arrival and while “still waiting for the disk,” Ritchie recalls, “Thompson recoded the Unix kernel and some basic commands in PDP assembly language. Of the 24K bytes of memory on the machine, the earliest PDP-11 Unix system used 12K bytes for the operating system, a tiny space for user programs, and the remainder as a RAM disk.”8

Ritchie explains, “By early 1973, the essentials of modern C were complete. The language and compiler were strong enough to permit us to rewrite the kernel for the PDP-11 in C during the summer of that year.”9

Each program the researchers built provided some simple capability, and they called such a program a tool. They wanted the programs to be fun to use and to be helpful to programmers. Doug McIlroy, one of the researchers and the head of Thompson's department when Thompson created the Unix kernel, describes the atmosphere at the lab:

Constant discussions honed the system.... Should tools usually accept output file names? How to handle demountable media? How to manipulate addresses in a higher level language? How to minimize the information deducible from a rejected login? Peer pressure and simple pride in workmanship caused gobs of code to be rewritten or discarded as better or more basic ideas emerged. Professional rivalry and protection of turf were practically unknown: so many good things were happening that nobody needed to be proprietary about innovations.10

The research done at the Labs was concerned with using the computer to automate programming tasks. By a scientific approach to their work and careful attention to detail, Bell Labs researchers determined the essential elements in a design and then created a program to do as simple a job as possible. These simple computer automation tools would then be available to build programs to do more complicated tasks. They created a Unix kernel accompanied by a toolbox of programs that could be used by others at Bell Labs. The kernel consisted of about 11,000 lines of code. Eventually, 10,000 lines of the code were rewritten in C and thus could be transported to other computer systems. “The kernel,” Ken Thompson writes, “is the only Unix code that cannot be substituted by a user to his own liking. For this reason, the kernel should make as few real decisions as possible.”11 Thompson describes creating the kernel:

What is or is not implemented in the kernel represents both a great responsibility and a great power. It is a soap-box platform on `the way things should be done.' Even so, if `the way' is too radical, no one will follow it. Every important decision was weighed carefully. Throughout, simplicity has been substituted for efficiency. Complex algorithms are used only if their complexity can be localized.12

The kernel was conceived as what was essential and other features were left to be developed as part of the tools or software that would be available. Thompson explains:

The Unix kernel is an I/O multiplexer more than a complete operating system. This is as it should be. Because of this outlook, many features found in most other operating systems are missing from the Unix kernel. For example, the Unix kernel does not support file access methods, file disposition, file formats, file maximum sizes, spooling, command language, logical records, physical records, assignment of logical file names, logical file names, more than one character set, an operator's console, an operator, log-in, or log-out. Many of these things are symptoms rather than features. Many of these things are implemented in user software using the kernel as a tool. A good example of this is the command language. Maintenance of such code is as easy as maintaining user code. The idea of implementing `system' code and general user primitives comes directly from Multics.13
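
Thompson's example of the command language living in user software can be sketched in a few lines. This is an illustration in Python under stated assumptions, not the original Unix shell: the function name run_line is hypothetical, and the exit status 127 for a missing command follows the convention of later shells. The program only parses a line and asks the operating system to start a child process, so replacing it means replacing an ordinary user program:

```python
import shlex
import subprocess


def run_line(line: str) -> int:
    """Parse one command line, run it as a child process, return its exit status."""
    argv = shlex.split(line)  # split into words, honoring quotes
    if not argv:
        return 0  # empty line: nothing to run
    try:
        # The operating system creates the process; this program only waits for it.
        return subprocess.run(argv).returncode
    except FileNotFoundError:
        print(f"{argv[0]}: command not found")
        return 127  # assumption: conventional "not found" status of later shells
```

A read loop that calls run_line on each line typed by the user would complete a minimal interpreter.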

During the same period that Bell Labs researchers were doing their early work on Unix, the Bell System was faced with the problem of automating their telephone operations using minicomputers.

“The discovery that we had the need—or actually, the opportunity—in the early '70s to use these minis to support telephone company operations encouraged us to work with the Unix system,” writes Berkley Tague.14 “We knew we could do a better job with maintenance, traffic control, repair, and accounting applications. The existing systems were made up of people and paper,” he relates. “The phone business was in danger of being overwhelmed in the early '70s with the boom of the '60s. There was a big interest then in using computers to help manage that part of the business. We wanted to get rid of all of those Rolodex files and help those guys who had to pack instruments and parts back and forth just to keep things going”.

He goes on to describe the kind of operations that the Bell System needed to automate. Just as operating systems people in the Bell System had come to recognize the need for portability in a computer operating system, Ritchie, Thompson, and the other programming researchers at Bell Labs had created the computer language C and rewritten the majority of the Unix kernel in it. They had thus made the important breakthrough of creating a computer operating system that was not machine dependent. Thompson and Ritchie presented their first paper on Unix at the Fourth ACM Symposium on Operating Systems Principles, IBM Thomas J. Watson Research Center, Yorktown Heights, New York, October 15-17, 1973.15

With the research breakthrough of a portable computer operating system, “the first Unix applications”, writes Mohr, in an article in Unix Review, “were installed in 1973 on a system involved in updating directory information and intercepting calls to numbers that had been changed. The automatic intercept system was delivered for use on early PDP-11s. This was essentially the first time Unix was used to support an actual, ongoing operating business.”16

The Labs made the software available to academic institutions at a very small charge. For example, John Lions, a faculty member in the Department of Computer Science at the University of New South Wales, in Australia, reported that his school was able to acquire a copy of research Unix Edition 5 for $150 ($110 Australian) in December, 1974, including tape and manuals.17

By 1979, the automation at AT&T had the benefit of research done not only at Bell Labs, but also by researchers in the academic community. Early in its development, word of the Unix operating system and its advantages spread outside of Bell Labs. (Several sources attribute this to the paper that Ritchie and Thompson presented on Unix at the Symposium on Operating Systems Principles in 1973.18)

Unix was attractive to the academic Computer Science community for several reasons. After describing its more obvious advantages (its price, that it could be shaped to the installation, that it was written in C, which was attractive when compared with assembly language, and that it was sufficiently small that an individual could study and understand it), John Stoneback writes:

Unix had another appealing virtue that many may have recognized only after the fact—its faithfulness to the prevailing mid-'70s philosophy of software design and development. Not only was Unix proof that real software could be built the way many said it could, but it lent credibility to a science that was struggling to establish itself as a science. Faculty could use Unix and teach about it at the same time. In most respects, the system exemplified good computer science. It provided a clean and powerful user interface and tools that promoted and encouraged the development of software. The fact that it was written in C allowed actual code to be presented and discussed, and made it possible to lift textbook examples into the real world. Obviously, Unix was destined to grow in the academic community.19

In trying to teach his students the essentials of a good operating system, John Lions of the University of New South Wales in Australia describes how he prepared a booklet containing the source files for a version of Edition 6 of research Unix in 1976 and the following year completed a set of explanatory notes to introduce students to the code. “Writing these,” he recounts, “was a real learning exercise for me. By slowly and methodically surveying the whole kernel, I came to understand things that others had overlooked.”20

This ability to present his students with a real example of an operating system kernel was a breakthrough. Lions writes:

Before I wrote my notes on Unix, most people thought of operating systems as huge and inaccessible. Because I had been at Burroughs, I knew that people could get to learn a whole program if they spent some time working at it. I knew it would be possible for one person to effectively become an expert on the whole system. The Edition 6 Unix code contained less than 10,000 lines, which positioned it nicely to become the first really accessible operating system.21

In keeping with the Unix community spirit of helping each other, Lions wrote a letter to Mel Ferentz, Lou Katz and others from Usenix and offered to make copies of his notes available to others. After some negotiation with Western Electric over the patent licensing, he distributed the notes titled “A Commentary on the Unix Operating System” to others with Unix licenses on the conditions that Western Electric had set out.22

Describing how research Unix and its adoption at academic institutions has served to develop computer science, Doug Comer writes:

The use of Unix as a basis for operating systems research has produced three highly desirable consequences. First, the availability of a common system allowed researchers to reproduce and verify each other's experiments. Such verification is the essence of science. Second, having a solid base of systems software made it possible for experimenters to build on the work of others and to tackle significant ideas without wasting time developing all the pieces from scratch. Such a basis is prerequisite to productive research. Third, the use of a single system as both a research vehicle and a conventional source of computing allowed researchers to move results from the laboratory to the production environment quickly. Such quick transition is mandatory of state-of-the-art computing.23

Not only did research Unix serve the academic community, but the contributions of the academic community were incorporated into research Unix. An example is the work at UC Berkeley of designing a virtual memory version of Unix for the VAX computer, which was later optimized and incorporated into a release of Unix. “A tide of ideas,” explains Comer, “had started a new cycle, flowing from academia to an industrial laboratory, back to academia, and finally moving on to a growing number of commercial sites.”24

Summarizing the relationship between Bell Labs and the academic community in developing Unix, Comer concludes:

Unix was not invented by hackers who were fooling around, nor did it take shape in a vacuum. It grew from strong academic roots and it has both nurtured and taken nourishment from academia throughout its development. The primary contributors to Unix were highly educated mathematicians and computer scientists employed by what many people feel is the world's premier industrial research center, Bell Laboratories. Although they were knowledgeable and experienced in their own right, these developers maintained professional contacts with researchers in academia, leading to an exchange of ideas that proved beneficial for both sides. Understanding the symbiotic relationship between Unix and the academic community means understanding the background of the system's inventors and the history of interactions between universities and Bell Laboratories.25

John Lions, reviewing his experience as part of the Unix community, concludes, “We have made a large number of contacts and exchanged a great deal of information around the world through this Unix connection. Possibly that is the nicest thing about Unix: it is not so much that the system itself is friendly but that the people who use it are.”26

It is a rare and wonderful event in the development of human society when a scientific and technological breakthrough is made which will certainly affect the future course of social development and which becomes known while its midwives are still alive to tell us about it. Unix is such an event. It is the product of researchers at Bell Labs, the then-regulated AT&T system, and academic computer science, and it is a valuable invention for computer science, for computer education and for the education of the next generation of computer scientists and engineers.

Unix and Computer Science© by Ronda Hauben, is adapted from an article to appear in the Spring 1994 issue of The Amateur Computerist newsletter and is being reproduced by permission of the editors of The Amateur Computerist. Electronic copies of the Amateur Computerist are available free from ronda@umcc.umich.edu or au329@cleveland.freenet.edu. Printed copies are available by subscription at $5.00 for a 4 issue subscription (add $2.50 for foreign postage). Checks are to be made payable to R. Hauben. For printed copies, write: The Amateur Computerist, c/o R. Hauben, P.O. Box 4344, Dearborn, MI 48126.

Also, comments are welcome on this work in progress, as a longer work is in process and there will be an effort to respond to comments and suggestions in the longer work.

References