From a519e5320bec2f2df42b7356740e33189dd13d29 Mon Sep 17 00:00:00 2001 From: Andrew Yu Date: Tue, 26 Jul 2022 01:17:29 +0800 Subject: backup public domain --- lispos.html | 597 ------------------------------------------------------------ 1 file changed, 597 deletions(-) delete mode 100644 lispos.html (limited to 'lispos.html') diff --git a/lispos.html b/lispos.html deleted file mode 100644 index cd114fa..0000000 --- a/lispos.html +++ /dev/null @@ -1,597 +0,0 @@ - - -Lisp Operating System - - - -

Lisp Operating System

- -

- A Lisp Operating System (LispOS for short) is not just another - operating system that happens to be written in Lisp (although that - would be a good thing in itself). A LispOS is also an operating - system that uses the Lisp interactive environment as an inspiration - for the interface between the user and the system, and between - applications and the system. -

- -

- Below, we give some ideas on what a LispOS might contain, how it - would be different from existing operating systems, and how such a - system might be created. -

- -

The problem with existing systems

- -

The concept of a process

- -

- Most popular existing operating systems are derived from Unix which - was written in the 1970s. The computers for which Unix was intended - has a very small address space; too small for most usable end-user - applications. To solve this problem, the creators of Unix used the - concept of a process. A large application was written so - that it consisted of several smaller programs, each of which ran in - its own address space. These smaller programs would communicate by - having one application write text to its output stream for another - application to read. This method of communication was called - a pipe and a sequence of small applications was called - a pipeline. As a typical example of a chain of applications, - consider the pipeline for producing a typeset document (one of the - main applications for which Unix was designed). This chain had a - program for creating tables (called tbl), a program for - generating pictures (called pic), a program for generating - equations (called eqn), and of course the typesetting program - itself (called troff). -

- -

- Using pipes to communicate between different components of an - application has several disadvantages: -

-

- -

- It is an interesting observation that in most text books on - operating systems, the concept of a process is presented as playing - a central role in operating-system design, whereas it ought to be - presented as an unfortunate necessity due to the limited address - space of existing minicomputers in the 1970s. It is also presented - as the method for obtaining some kind of security, - preventing one application from intentionally or accidentally - modifying the data of some other application. In reality, there are - several ways of obtaining such security, and separate address spaces - should be considered to be a method with too many disadvantages. -

- -

- Nowadays, computers have addresses that are 64 bit wide, making it - possible to address almost 20 exabytes of data. To get an idea of - the order of magnitude of such a number, consider a fairly large - disc that can hold a terabyte of data. Then 20 million such discs - can be directly addressed by the processor. We can thus consider - the problem of too small an address space to be solved. -

- -

Hierarchical file systems

- -

- Existing operating system come with a hierarchical file - system. There are two significant problems, - namely hierarchical and file. -

- -

- The hierarchy is also a concept that dates back to the - 1970s, and it was considered a vast improvement on flat file - systems. However, - as this - article clearly explains, most things are not naturally - hierarchical. A hierarchical organization imposes an artificial - order between names. Whether a document is called - Lisp/Programs/2013/stuff, Programs/Lisp/2013/stuff, - 2013/Programs/Lisp/stuff, etc, is usually not important. -

- -

- The problem with a file is that it is only a sequence of - bytes with no structure. This lack of structure fits the Unix pipe - model very well, because intermediate steps between individual - software components can be saved to a file without changing the - result. But it also means that in order for complex data structures - to be stored in the file system, they have to be transformed into a - sequence of bytes. And whenever such a structure needs to be - modified by some application, it must again be parsed and - transformed into an in-memory structure. -

- -

Distinction between primary and secondary memory

- -

- Current system (at least for desktop computers) make a very clear - distinction between primary and secondary memory. Not only are the - two not the same, but they also have totally different semantics: -

- This distinction coupled with the semantics of the two memories - creates a permanent conundrum for the user of most applications, in - that if current application data is not saved, then it will - be lost in case of power loss, and if it is saved, then - previously saved data is forever lost. -

- -

- Techniques were developed as early in the 1960s for presenting - primary and secondary memory as a single abstraction to the user. - For example, the Multics system had a single hierarchy of fixed-size - byte arrays (called segments) that served as permanent storage, but - that could also be treated as any in-memory array by applications. - As operating systems derived from Unix became widespread, these - techniques were largely forgotten. -

- -

The concept of a kernel

- -

- The kernel of an operating system is a fairly large, - monolithic program that is started when the computer is powered on. - The kernel is not an ordinary program of the computer. It executes - in a privileged state so that it has direct access to devices and to - data structures that must be protected from direct use by user-level - programs. -

- -

- The very existence of a kernel is problematic because the computer - needs to be restarted whenever the kernel is updated, and then all - existing state is lost, including open files and data structures - that reside in volatile memory. Some programs, such as web - browsers, compensate somewhat for this problem by remembering the - open windows and the links that were associated with each window. -

- -

- The fact that the kernel is monolithic poses a problem because when - code needs to be added to the kernel in the form of a kernel - module, such code has full access to the entire computer - system. This universal access represents a security risk, of - course, but more commonly, it can be defective and then it - will fail often by crashing the entire computer. -

- -

- We have had solutions to this problem for many decades. The Multics - system, for example, did not have a kernel at all. An interrupt or - a system call was executed by the user-level process that issued the - system call or that happened to be executing when the interrupt - arrived. The code that executed then was not part of a monolithic - kernel, but existed as independent programs that could be added or - replaced without restarting the system. The system could still - crash, of course, if some essential system-wide data structure was - corrupted, but most of the time, only the user-level process that - issued the request would crash. -

- -

Monolithic applications

- -

- Applications in current operating systems are written in low-level - languages such as C or C++. An application is built using - techniques from more than half a century ago where source code is - compiled to object code and then linked to produce - an executable file meant to run in its own - dedicated address space. -

- -

- Aside from system calls, all code in an application is run in one - single address space. Together with the fact that low-level - languages are used, this organization makes the application - vulnerable to viruses and other security-related attacks. A - typical attack exploits the vulnerability to buffer overflow - which is due to the fact that the programming language used to write - the application does not insert boundary checks for arrays, - requiring the programmer to do that with explicit code, which is - therefore frequently not done. -

- -

- A buffer overflow in such an application can be exploited in order - to modify the execution of the program so that code that was not - intended by the application writer is executed in place of the - intended application code. Such modification is possible because - the execution stack is part of the application address space, and - the execution stack contains addresses of code to be executed later, - so that the application has direct access to these code addresses. -

- -

- In a Lisp operating system, the stack is not accessible to - application code. It is therefore not possible to alter addresses - on the stack representing code to be executed later. Furthermore, - the programming language automatically checks boundaries of arrays, - so that buffer overflows are not possible. -

- -

- An application in a Lisp operating system is not built as a - monolithic executable meant to execute in its own address space. - Instead, an application consists of a large number of objects whose - addresses are protected by the system and not directly accessible to - application code. The most common techniques for security attacks - are therefore not possible in such a system. -

- -

Objectives for a Lisp operating system

- -

- The three main objectives of a Lisp operating system correspond to - solutions to the two main problems with exiting systems as indicated - in the previous section. -

- -

Single address space

- -

- Instead of each application having its own address space, we propose - that all applications share a single large address space. This way, - applications can share data simply by passing pointers around, - because a pointer is globally valid, unlike pointers in current - operating systems. -

- -

- Clearly, if there is a single address space shared by all - applications, there needs to be a different mechanism to - ensure protection between them so that one application can - not intentionally or accidentally destroy the data of another - application. Most high-level programming languages (in particular - Lisp, but also Java, and many more) propose a solution to this - problem by simply not allowing users to execute arbitrary machine - code. Instead, they allow only code that has been produced from the - high-level notation of the language and which excludes arbitrary - pointer arithmetic so that the application can only address its own - data. This technique is sometimes called "trusted compiler". -

- -

- It might sometimes be desirable to write an application in a - low-level language like C or even assembler, or it might be - necessary to run applications that have been written for other - systems. Such applications could co-exist with the normal ones, but - they would have to work in their own address space as with current - operating systems, and with the same difficulties of communicating - with other applications. -

- -

Object store based on tags

- -

- Instead of a hierarchical file system, we propose an object - store which can contain any objects. If a file (i.e. a - sequence of bytes) is desired, it would be stored as an array of - bytes. -

- -

- Instead of organizing the objects into a hierarchy, objects in the - store can optionally be associated with an arbitrary number - of tags. These tags are key/value pairs, such as for - example the date of creation of the archive entry, the creator (a - user) of the archive entry, and the access permissions for - the entry. Notice that tags are not properties of the objects - themselves, but only of the archive entry that allows an object to - be accessed. Some tags might be derived from the contents of the - object being stored such as the sender or the date of - an email message. It should be possible to accomplish most searches - of the store without accessing the objects themselves, but only the - tags. Occasionally, contents must be accessed such as when a raw - search of the contents of a text is wanted. -

- -

- It is sometimes desirable to group related objects together as - with directories of current operating systems. Should a user - want such a group, it would simply be another object (say instances - of the class directory) in the store. Users who can not - adapt to a non-hierarchical organization can even store such - directories as one of the objects inside another directory. -

- -

Here are some examples of possible keyword/value pairs, how they - might be used, and what kinds of values are permitted:

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
- Keyword - - Possible values -
- category - - The nature of the object such - as movie, music, article, book, user - manual, dictionary, course, lecture, - recipe, program, bank statement, - email. These would be chosen from an - editable set that is defined per user. -
- name - - A string that is displayed with the object, such as "A Dramatic - Turn of Events", "Three seasons", "Alternative energy". -
- author - - An object identifying a person, an organization, a company, - etc. -
- genre. Can be used for movies, music albums, programs, - articles, etc. - - progressive - metal, science, algorithms, garbage - collection, game, programming language - implementation, operating system. These would be - chosen from an editable set that is defined per user. -
- format - - This tag can be used to identify the file type of documents such - as PDF, ogg/vorbis, MPEG4 PNG, in - which case the tag can be assigned automatically, but also to - identify the source format of files in a directory containing - things like articles or user manuals, for - example LaTeX, Texinfo, HTML. These would - be chosen from an editable set that is defined per user. -
- date of creation - - A date interval. -
- composer. Used for music. - - An object representing a person. On a compilation album there - can be more than one tag of this kind. -
- language. - - An object representing a natural language such - as English, Vietnamese, or a programming languages - such as Lisp, Python. These would - be chosen from an editable set that is defined per user. If - appropriate, a document can have several of these tags, for - instance if some program uses multiple programming languages, or - if a document is written using several languages, such as a - dictionary. -
- duration. Can be used for things like movies or music - albums. - - An object representing a duration. -
- source control. Can be used for any documents that use - source control such a programs, etc. - - GIT, SVN, CVS, darks, etc. These - would be chosen from an editable set that is defined per user. -
- -

- When (a pointer to) an object is returned to a user as a result of a - search of the object store, it is actually similar to what is called - a "capability" in the operating-system literature. Such a - capability is essentially only a pointer with a few bits indicating - what access rights the user has to the objects. Each creator - may interpret the contents of those bits as he or she likes, but - typically they would be used to restrict access, so that for - instance executing a reader method is allowed, but executing - a writer method is not. -

- -

Single memory abstraction

- -

- Instead of two different memory abstractions (primary and - secondary), the Lisp operating system would contain a single - abstraction which looks like any interactive Lisp system, except - that data is permanent. -

- -

- Since data is permanent, application writers are encouraged to - provide a sophisticated undo facility. -

- -

- The physical main (semiconductor) memory of the computer simply acts - as a cache for the disk(s), so that the address of an object - uniquely determines where on the disk it is stored. The cache is - managed as an ordinary virtual memory with existing - algorithms. -

- -

Other features

- -

Crash proof (maybe)

- -

- There is extensive work on crash-proof systems, be it operating - systems or data base systems. In our opinion, this work is - confusing in that the objective is not clearly stated. -

- -

- Sometimes the objective is stated as the desire that no data be lost - when power is lost. But the solution to that problem already exists - in every laptop computer; it simply provides a battery that - allow the system to continue to work, or to be shut down in a - controlled way. -

- -

- Other times, the objective is stated as a protection against - defective software, so that data is stored at regular intervals - (checkpointing) perhaps combined with a transaction log so - that the state of the system immediately before a crash can always - be recovered. But it is very hard to protect oneself against - defective software. There can be defects in the checkpointing code - or in the code for logging transactions, and there can be defects in - the underlying file system. We believe that it is a better use of - developer time to find and eliminate defects than to aim for a - recovery as a result of existing defects. -

- -

Multiple simultaneous environments

- -

- To allow for a user to add methods to standard generic functions - (such as print-object) without interfering with other - users, we suggest that each user gets a different global - environment. The environment maps names - to objects such as functions, classes, types, packages, and - more. Immutable objects (such as the common-lisp package) - can exist in several different environments simultaneously, but - objects (such as the generic function print-object would be - different in different environments. -

- -

- Multiple environments would also provide more safety for users in - that if a user inadvertently removes some system feature, then it - can be recovered from a default environment, and in the worst case a - fresh default environment could be installed for a user who - inadvertently destroyed large parts of his or her environment. -

- -

- Finally, multiple environments would simplify experimentation with - new features without running the risk of destroying the entire - system. Different versions of a single package could exist in - different environments. -

- -

How to accomplish it

- -

- The most important aspect of a Lisp operating system is not that all - the code be written in Lisp, but rather to present a Lisp-like - interface between users and the system and between applications and - the system. It is therefore legitimate to take advantage of some - existing system (probably Linux or some BSD version) in order to - provide services such as device drivers, network communication, - thread scheduling, etc. -

- -

Create a Lisp system to be used as basis

- -

- The first step is to create a Common Lisp system that can be used as - a basis for the Lisp operating system. It should already allow for - multiple environments, and it should be available on 64-bit - platforms. Preferably, this system should use as little C code as - possible and interact directly with the system calls of the - underlying kernel. -

- -

Create a single-user system as a Unix process

- -

- The next step is to transform the Common Lisp system into an - operating system in the sense of the API for users and - applications. This system would contain the object store, but - perhaps not access control functionality. -

- -

- When this step is accomplished, it is possible to write or adapt - applications such as text editors, inspectors, debuggers, GUI - interface libraries, etc. for the system. -

- -

Create device drivers

- -

- The final step is to replace the temporary Unix kernel with native - device drivers for the new system and to turn the system into a full - multi-user operating system. -

- -
-
-robert.strandh@gmail.com -
- - -- cgit 1.4.1-2-gfad0