A Lisp Operating System (LispOS for short) is not just another operating system that happens to be written in Lisp (although that would be a good thing in itself). A LispOS is also an operating system that uses the Lisp interactive environment as an inspiration for the interface between the user and the system, and between applications and the system.
Below, we give some ideas on what a LispOS might contain, how it would be different from existing operating systems, and how such a system might be created.
Most popular existing operating systems are derived from Unix which was written in the 1970s. The computers for which Unix was intended has a very small address space; too small for most usable end-user applications. To solve this problem, the creators of Unix used the concept of a process. A large application was written so that it consisted of several smaller programs, each of which ran in its own address space. These smaller programs would communicate by having one application write text to its output stream for another application to read. This method of communication was called a pipe and a sequence of small applications was called a pipeline. As a typical example of a chain of applications, consider the pipeline for producing a typeset document (one of the main applications for which Unix was designed). This chain had a program for creating tables (called tbl), a program for generating pictures (called pic), a program for generating equations (called eqn), and of course the typesetting program itself (called troff).
Using pipes to communicate between different components of an application has several disadvantages:
It is an interesting observation that in most text books on operating systems, the concept of a process is presented as playing a central role in operating-system design, whereas it ought to be presented as an unfortunate necessity due to the limited address space of existing minicomputers in the 1970s. It is also presented as the method for obtaining some kind of security, preventing one application from intentionally or accidentally modifying the data of some other application. In reality, there are several ways of obtaining such security, and separate address spaces should be considered to be a method with too many disadvantages.
Nowadays, computers have addresses that are 64 bit wide, making it possible to address almost 20 exabytes of data. To get an idea of the order of magnitude of such a number, consider a fairly large disc that can hold a terabyte of data. Then 20 million such discs can be directly addressed by the processor. We can thus consider the problem of too small an address space to be solved.
Existing operating system come with a hierarchical file system. There are two significant problems, namely hierarchical and file.
The hierarchy is also a concept that dates back to the 1970s, and it was considered a vast improvement on flat file systems. However, as this article clearly explains, most things are not naturally hierarchical. A hierarchical organization imposes an artificial order between names. Whether a document is called Lisp/Programs/2013/stuff, Programs/Lisp/2013/stuff, 2013/Programs/Lisp/stuff, etc, is usually not important.
The problem with a file is that it is only a sequence of bytes with no structure. This lack of structure fits the Unix pipe model very well, because intermediate steps between individual software components can be saved to a file without changing the result. But it also means that in order for complex data structures to be stored in the file system, they have to be transformed into a sequence of bytes. And whenever such a structure needs to be modified by some application, it must again be parsed and transformed into an in-memory structure.
Current system (at least for desktop computers) make a very clear distinction between primary and secondary memory. Not only are the two not the same, but they also have totally different semantics:
Techniques were developed as early in the 1960s for presenting primary and secondary memory as a single abstraction to the user. For example, the Multics system had a single hierarchy of fixed-size byte arrays (called segments) that served as permanent storage, but that could also be treated as any in-memory array by applications. As operating systems derived from Unix became widespread, these techniques were largely forgotten.
The kernel of an operating system is a fairly large, monolithic program that is started when the computer is powered on. The kernel is not an ordinary program of the computer. It executes in a privileged state so that it has direct access to devices and to data structures that must be protected from direct use by user-level programs.
The very existence of a kernel is problematic because the computer needs to be restarted whenever the kernel is updated, and then all existing state is lost, including open files and data structures that reside in volatile memory. Some programs, such as web browsers, compensate somewhat for this problem by remembering the open windows and the links that were associated with each window.
The fact that the kernel is monolithic poses a problem because when code needs to be added to the kernel in the form of a kernel module, such code has full access to the entire computer system. This universal access represents a security risk, of course, but more commonly, it can be defective and then it will fail often by crashing the entire computer.
We have had solutions to this problem for many decades. The Multics system, for example, did not have a kernel at all. An interrupt or a system call was executed by the user-level process that issued the system call or that happened to be executing when the interrupt arrived. The code that executed then was not part of a monolithic kernel, but existed as independent programs that could be added or replaced without restarting the system. The system could still crash, of course, if some essential system-wide data structure was corrupted, but most of the time, only the user-level process that issued the request would crash.
Applications in current operating systems are written in low-level languages such as C or C++. An application is built using techniques from more than half a century ago where source code is compiled to object code and then linked to produce an executable file meant to run in its own dedicated address space.
Aside from system calls, all code in an application is run in one single address space. Together with the fact that low-level languages are used, this organization makes the application vulnerable to viruses and other security-related attacks. A typical attack exploits the vulnerability to buffer overflow which is due to the fact that the programming language used to write the application does not insert boundary checks for arrays, requiring the programmer to do that with explicit code, which is therefore frequently not done.
A buffer overflow in such an application can be exploited in order to modify the execution of the program so that code that was not intended by the application writer is executed in place of the intended application code. Such modification is possible because the execution stack is part of the application address space, and the execution stack contains addresses of code to be executed later, so that the application has direct access to these code addresses.
In a Lisp operating system, the stack is not accessible to application code. It is therefore not possible to alter addresses on the stack representing code to be executed later. Furthermore, the programming language automatically checks boundaries of arrays, so that buffer overflows are not possible.
An application in a Lisp operating system is not built as a monolithic executable meant to execute in its own address space. Instead, an application consists of a large number of objects whose addresses are protected by the system and not directly accessible to application code. The most common techniques for security attacks are therefore not possible in such a system.
The three main objectives of a Lisp operating system correspond to solutions to the two main problems with exiting systems as indicated in the previous section.
Instead of each application having its own address space, we propose that all applications share a single large address space. This way, applications can share data simply by passing pointers around, because a pointer is globally valid, unlike pointers in current operating systems.
Clearly, if there is a single address space shared by all applications, there needs to be a different mechanism to ensure protection between them so that one application can not intentionally or accidentally destroy the data of another application. Most high-level programming languages (in particular Lisp, but also Java, and many more) propose a solution to this problem by simply not allowing users to execute arbitrary machine code. Instead, they allow only code that has been produced from the high-level notation of the language and which excludes arbitrary pointer arithmetic so that the application can only address its own data. This technique is sometimes called "trusted compiler".
It might sometimes be desirable to write an application in a low-level language like C or even assembler, or it might be necessary to run applications that have been written for other systems. Such applications could co-exist with the normal ones, but they would have to work in their own address space as with current operating systems, and with the same difficulties of communicating with other applications.
Instead of a hierarchical file system, we propose an object store which can contain any objects. If a file (i.e. a sequence of bytes) is desired, it would be stored as an array of bytes.
Instead of organizing the objects into a hierarchy, objects in the store can optionally be associated with an arbitrary number of tags. These tags are key/value pairs, such as for example the date of creation of the archive entry, the creator (a user) of the archive entry, and the access permissions for the entry. Notice that tags are not properties of the objects themselves, but only of the archive entry that allows an object to be accessed. Some tags might be derived from the contents of the object being stored such as the sender or the date of an email message. It should be possible to accomplish most searches of the store without accessing the objects themselves, but only the tags. Occasionally, contents must be accessed such as when a raw search of the contents of a text is wanted.
It is sometimes desirable to group related objects together as with directories of current operating systems. Should a user want such a group, it would simply be another object (say instances of the class directory) in the store. Users who can not adapt to a non-hierarchical organization can even store such directories as one of the objects inside another directory.
Here are some examples of possible keyword/value pairs, how they might be used, and what kinds of values are permitted:
Keyword | Possible values |
category | The nature of the object such as movie, music, article, book, user manual, dictionary, course, lecture, recipe, program, bank statement, email. These would be chosen from an editable set that is defined per user. |
name | A string that is displayed with the object, such as "A Dramatic Turn of Events", "Three seasons", "Alternative energy". |
author | An object identifying a person, an organization, a company, etc. |
genre. Can be used for movies, music albums, programs, articles, etc. | progressive metal, science, algorithms, garbage collection, game, programming language implementation, operating system. These would be chosen from an editable set that is defined per user. |
format | This tag can be used to identify the file type of documents such as PDF, ogg/vorbis, MPEG4 PNG, in which case the tag can be assigned automatically, but also to identify the source format of files in a directory containing things like articles or user manuals, for example LaTeX, Texinfo, HTML. These would be chosen from an editable set that is defined per user. |
date of creation | A date interval. |
composer. Used for music. | An object representing a person. On a compilation album there can be more than one tag of this kind. |
language. | An object representing a natural language such as English, Vietnamese, or a programming languages such as Lisp, Python. These would be chosen from an editable set that is defined per user. If appropriate, a document can have several of these tags, for instance if some program uses multiple programming languages, or if a document is written using several languages, such as a dictionary. |
duration. Can be used for things like movies or music albums. | An object representing a duration. |
source control. Can be used for any documents that use source control such a programs, etc. | GIT, SVN, CVS, darks, etc. These would be chosen from an editable set that is defined per user. |
When (a pointer to) an object is returned to a user as a result of a search of the object store, it is actually similar to what is called a "capability" in the operating-system literature. Such a capability is essentially only a pointer with a few bits indicating what access rights the user has to the objects. Each creator may interpret the contents of those bits as he or she likes, but typically they would be used to restrict access, so that for instance executing a reader method is allowed, but executing a writer method is not.
Instead of two different memory abstractions (primary and secondary), the Lisp operating system would contain a single abstraction which looks like any interactive Lisp system, except that data is permanent.
Since data is permanent, application writers are encouraged to provide a sophisticated undo facility.
The physical main (semiconductor) memory of the computer simply acts as a cache for the disk(s), so that the address of an object uniquely determines where on the disk it is stored. The cache is managed as an ordinary virtual memory with existing algorithms.
There is extensive work on crash-proof systems, be it operating systems or data base systems. In our opinion, this work is confusing in that the objective is not clearly stated.
Sometimes the objective is stated as the desire that no data be lost when power is lost. But the solution to that problem already exists in every laptop computer; it simply provides a battery that allow the system to continue to work, or to be shut down in a controlled way.
Other times, the objective is stated as a protection against defective software, so that data is stored at regular intervals (checkpointing) perhaps combined with a transaction log so that the state of the system immediately before a crash can always be recovered. But it is very hard to protect oneself against defective software. There can be defects in the checkpointing code or in the code for logging transactions, and there can be defects in the underlying file system. We believe that it is a better use of developer time to find and eliminate defects than to aim for a recovery as a result of existing defects.
To allow for a user to add methods to standard generic functions (such as print-object) without interfering with other users, we suggest that each user gets a different global environment. The environment maps names to objects such as functions, classes, types, packages, and more. Immutable objects (such as the common-lisp package) can exist in several different environments simultaneously, but objects (such as the generic function print-object would be different in different environments.
Multiple environments would also provide more safety for users in that if a user inadvertently removes some system feature, then it can be recovered from a default environment, and in the worst case a fresh default environment could be installed for a user who inadvertently destroyed large parts of his or her environment.
Finally, multiple environments would simplify experimentation with new features without running the risk of destroying the entire system. Different versions of a single package could exist in different environments.
The most important aspect of a Lisp operating system is not that all the code be written in Lisp, but rather to present a Lisp-like interface between users and the system and between applications and the system. It is therefore legitimate to take advantage of some existing system (probably Linux or some BSD version) in order to provide services such as device drivers, network communication, thread scheduling, etc.
The first step is to create a Common Lisp system that can be used as a basis for the Lisp operating system. It should already allow for multiple environments, and it should be available on 64-bit platforms. Preferably, this system should use as little C code as possible and interact directly with the system calls of the underlying kernel.
The next step is to transform the Common Lisp system into an operating system in the sense of the API for users and applications. This system would contain the object store, but perhaps not access control functionality.
When this step is accomplished, it is possible to write or adapt applications such as text editors, inspectors, debuggers, GUI interface libraries, etc. for the system.
The final step is to replace the temporary Unix kernel with native device drivers for the new system and to turn the system into a full multi-user operating system.