
Architecture of Chawan

This document describes some aspects of how Chawan works.

Module organization

Explanation for the separate directories found in src/:

Additionally, "adapters" of various protocols and file formats can be found in adapter/:

Process model

Described as a tree:

Main process

The main process runs code related to the pager. This includes processing user input, printing buffer contents to the screen, and managing buffers in general. The complete list of buffers is only known to the main process.

Mailcap commands are executed by the main process. This depends on knowing the content type of the resource, so the main process also reads in all network headers of navigation responses before launching a buffer process. More on this in Opening buffers.

Forkserver

A fork server process is launched at the very beginning of every 'cha' invocation. It is responsible for forking the loader process and all buffer processes.

We use a fork server for two reasons:

  1. It helps clean up child processes when the main process crashes. (We open pipes between the main process and the fork server, and kill all child processes from the fork server on EOF.)
  2. It allows us to start new processes without cloning the pager's entire address space. This reduces the impact of memory bugs somewhat, and also our memory usage.

The fork server is not used for mailcap or CGI processes, because their address space is replaced by exec anyway. (Also, it would be slow.)
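The first reason above (cleaning up children on EOF) can be illustrated with a minimal sketch. It is written in Python rather than Chawan's Nim and assumes a POSIX system; the names `fork_server` and `spawn_fork_server`, the one-byte "fork a child" command, and the reply pipe carrying child PIDs are all hypothetical and not taken from Chawan's actual protocol.

```python
import os
import signal
import struct


def fork_server(ctl_r: int, reply_w: int) -> None:
    """Fork one child per command byte; on EOF, kill every child."""
    children = []
    while True:
        cmd = os.read(ctl_r, 1)
        if not cmd:
            # EOF: the write end is gone, so the main process exited or crashed.
            break
        pid = os.fork()
        if pid == 0:
            # Child: stand-in for a loader or buffer process; idles until killed.
            try:
                while True:
                    signal.pause()
            finally:
                os._exit(0)
        children.append(pid)
        # Report the new child's PID back to the main process.
        os.write(reply_w, struct.pack("!I", pid))
    for pid in children:
        os.kill(pid, signal.SIGKILL)
        os.waitpid(pid, 0)  # reap, so no zombies are left behind


def spawn_fork_server():
    """Start the fork server; return (server pid, control fd, reply fd)."""
    ctl_r, ctl_w = os.pipe()
    reply_r, reply_w = os.pipe()
    pid = os.fork()
    if pid == 0:
        try:
            # The server must not hold the write end, or EOF would never arrive.
            os.close(ctl_w)
            os.close(reply_r)
            fork_server(ctl_r, reply_w)
        finally:
            os._exit(0)
    os.close(ctl_r)
    os.close(reply_w)
    return pid, ctl_w, reply_r
```

In this sketch the main process never has to send an explicit "shut down" message: closing the control pipe, whether deliberately or because the process died, is enough for the fork server to notice and clean up.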

Loader

The loader process takes requests from the main process and the buffer processes. Then, depending on the scheme, it responds by performing one of the following steps:

The loader process distinguishes between clients (i.e. the main process or buffers) through client keys. In theory this should help against rogue clients, though in practice it is still trivial to crash the loader as a client. It also helps us block further requests from buffers that have been discarded by the pager but have not yet found out that their lifetime has ended.
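As a rough illustration of the client key idea, here is a toy sketch in Python. The `Loader` class, its method names, and the 32-byte random key are assumptions made for this example; the real loader's key format and message handling are not described here.

```python
import secrets


class Loader:
    """Toy loader that only serves clients holding a registered key."""

    def __init__(self) -> None:
        self._clients: dict[bytes, str] = {}

    def add_client(self, name: str) -> bytes:
        # Hypothetical: hand each new client an unguessable random key.
        key = secrets.token_bytes(32)
        self._clients[key] = name
        return key

    def remove_client(self, key: bytes) -> None:
        # Called when the pager discards a buffer.
        self._clients.pop(key, None)

    def request(self, key: bytes, url: str) -> str:
        name = self._clients.get(key)
        if name is None:
            # Unknown key: a rogue client, or a buffer that was already discarded.
            raise PermissionError("unknown or discarded client")
        return f"{name} -> {url}"
```

Once `remove_client` has run, a buffer that still believes it is alive gets its requests rejected instead of served.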

Buffer

Buffer processes parse HTML, optionally query external resources from the loader, run styling and JS, and finally render the page to an internal canvas.

Buffers are managed by the pager through Container objects. A UNIX domain socket is established between each buffer and the pager to enable communication between them.
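A minimal sketch of such a channel, in Python: `socket.socketpair` creates a connected pair of UNIX domain sockets, one end for each side. The length-prefixed framing and the helper names `send_msg` and `recv_msg` are assumptions for illustration; this is not Chawan's actual wire format.

```python
import socket


def send_msg(sock: socket.socket, payload: bytes) -> None:
    """Send one message, prefixed by its length as a 4-byte big-endian integer."""
    sock.sendall(len(payload).to_bytes(4, "big") + payload)


def recv_exact(sock: socket.socket, n: int) -> bytes:
    """Read exactly n bytes, or raise if the peer closes the socket early."""
    data = b""
    while len(data) < n:
        chunk = sock.recv(n - len(data))
        if not chunk:
            raise ConnectionError("peer closed the socket")
        data += chunk
    return data


def recv_msg(sock: socket.socket) -> bytes:
    """Receive one length-prefixed message."""
    size = int.from_bytes(recv_exact(sock, 4), "big")
    return recv_exact(sock, size)


if __name__ == "__main__":
    pager_end, buffer_end = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
    send_msg(pager_end, b"render")       # hypothetical command name
    print(recv_msg(buffer_end))
    pager_end.close()
    buffer_end.close()
```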

Opening buffers

Scenario: the user attempts to navigate to https://example.org.

  1. pager creates a new container for the target URL.
  2. pager sends a request for "https://example.org" to the loader. Then, it registers the file descriptor in its selector, and does something else until poll() reports activity on the file descriptor.
  3. loader rewrites "https://example.org" into "cgi-bin:http". It then runs the http CGI script with the appropriate environment variables set to parts of this URL and request headers.
  4. The http CGI script opens a connection to example.org. When connected, it starts printing out headers it receives to stdout.
  5. loader parses these headers, and sends them to pager.
  6. pager reads in the headers, and decides what to do based on the Content-Type.
    • If Content-Type is found in mailcap, then the command in that mailcap entry is executed, with the response body dup2'd onto its stdin. If the entry has x-htmloutput, then the command's stdout is taken instead of the response body, and Content-Type is set to text/html. Otherwise, the container is discarded.
    • If Content-Type is text/html, then a new buffer process is created, which then parses the response body as HTML. If it is any other text/* subtype, then the response is simply inserted into a <plaintext> tag.
    • If Content-Type is not a text/* subtype, and no mailcap entry for it is found, then the user is prompted about where they wish to save the file.
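The decision in step 6 can be sketched as a small function. This is a Python illustration only: it assumes the three cases are checked in the order listed, that mailcap is a plain dictionary keyed by content type, and that parameters such as charset are stripped first. The function name `decide` and the returned labels are hypothetical.

```python
def decide(content_type: str, mailcap: dict) -> str:
    """Return a label for what the pager does with a navigation response."""
    # Assumption: ignore parameters such as "; charset=utf-8".
    ctype = content_type.split(";")[0].strip().lower()
    entry = mailcap.get(ctype)
    if entry is not None:
        if entry.get("x-htmloutput"):
            # The command's stdout replaces the body and is treated as text/html.
            return "run command, load its output as HTML"
        return "run command, discard container"
    if ctype == "text/html":
        return "open HTML buffer"
    if ctype.startswith("text/"):
        return "open plaintext buffer"
    return "prompt for save location"


if __name__ == "__main__":
    # Hypothetical mailcap entries, for illustration only.
    mailcap = {
        "application/pdf": {"command": "pdftotext %s -"},
        "text/markdown": {"command": "pandoc", "x-htmloutput": True},
    }
    for ctype in ("text/html; charset=utf-8", "text/css", "text/markdown",
                  "application/pdf", "application/zip"):
        print(ctype, "=>", decide(ctype, mailcap))
```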

Cache

Chawan's caching mechanism is largely inspired by that of w3m, which does not have a network cache. Instead, it simply saves source files to the disk before displaying them, and lets users view/edit the source without another network request.

Chawan improves upon this by simultaneously streaming files to the cache and buffers:

  1. Client (pager or buffer) initiates request by sending a message to loader.
  2. Loader starts CGI script, reads headers, sends a response, and waits.
  3. Client now may send an "addCacheFile" message, which prompts loader to add a cache file for this request.
  4. Client sends a "resume" message; the loader then streams the response to both the client and the cache.

Cached items may be shared between clients; this is how rewinding on wrong charset guess is implemented. They are also manually reference counted and are unlinked when their reference count drops to zero.
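The two mechanisms described here, streaming to client and cache at once and manual reference counting, can be sketched as follows. This is a simplified Python model, not the loader's code: `CacheItem`, `stream_to_client_and_cache`, and the chunk size are made up for the example, and the real loader works on file descriptors in an event loop rather than in a blocking loop.

```python
import io
import os
import tempfile


class CacheItem:
    """A cache file on disk with a manual reference count."""

    def __init__(self, path: str) -> None:
        self.path = path
        self.refs = 0

    def ref(self) -> None:
        self.refs += 1

    def unref(self) -> None:
        self.refs -= 1
        if self.refs == 0:
            # Last user is gone: remove the file from disk.
            os.unlink(self.path)


def stream_to_client_and_cache(source, client, item: CacheItem,
                               chunk_size: int = 4096) -> int:
    """Copy source to both the client and the cache file; return bytes copied."""
    total = 0
    with open(item.path, "ab") as cache_file:
        while chunk := source.read(chunk_size):
            client.write(chunk)
            cache_file.write(chunk)
            total += len(chunk)
    return total


if __name__ == "__main__":
    fd, path = tempfile.mkstemp()
    os.close(fd)
    item = CacheItem(path)
    item.ref()  # held by the buffer displaying the page
    item.ref()  # held by a second client, e.g. a "view source" buffer
    client = io.BytesIO()
    stream_to_client_and_cache(io.BytesIO(b"<p>hello</p>"), client, item)
    item.unref()
    print(os.path.exists(path))  # True: one reference remains
    item.unref()
    print(os.path.exists(path))  # False: unlinked at zero
```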

The cache is used in the following ways:

Crucially, the cache does not understand Cache-Control headers, and will never skip a download when requested by a user. Similarly, loading a "cache:" URL (e.g. view source) is guaranteed to never make a network request.

Future directions: for non-JS buffers, we could kill idle processes and reload them on-demand from the cache. This could solve the problem of spawning too many processes that then do nothing.

Parsing HTML

The character decoder and the HTML parser are implementations of the WHATWG standard, and are available as separate libraries.

The decoding and parsing of HTML documents happens in buffer processes. This operation is asynchronous; when bytes from the network are exhausted, the buffer will 1) partially render the current document as-is, and 2) return it to the pager so that the user can interact with the document.

Character encoding detection is rather primitive; the list specified in encoding.document-charset is enumerated until either no errors are produced by the decoder, or no more charsets exist. In some edge cases, the document must be (and is) re-downloaded from the cache, but this pretty much never happens in real-world scenarios. (The most common case is that the UTF-8 validator just runs through the entire document without reporting errors.)
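The enumeration strategy can be sketched in a few lines of Python. Note the assumptions: this version decodes the whole input at once instead of streaming, it uses Python's codecs rather than a WHATWG-conformant decoder, and falling back to the last candidate with replacement characters is a guess at what "no more charsets exist" leads to. `detect_charset` is a hypothetical name.

```python
def detect_charset(data: bytes, candidates: list[str]) -> tuple[str, str]:
    """Try each charset in order; return the first that decodes without errors."""
    for charset in candidates:
        try:
            return charset, data.decode(charset)
        except UnicodeDecodeError:
            continue  # decoder reported an error: try the next charset
    # Assumption: when the list is exhausted, keep the last charset and
    # substitute replacement characters for undecodable bytes.
    last = candidates[-1]
    return last, data.decode(last, errors="replace")


if __name__ == "__main__":
    # Valid UTF-8 is accepted by the first candidate.
    print(detect_charset("caf\u00e9".encode("utf-8"), ["utf-8", "latin-1"]))
    # A lone 0xE9 byte is not valid UTF-8, so the second candidate is used.
    print(detect_charset(b"caf\xe9", ["utf-8", "latin-1"]))
```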

The HTML parser then consumes the input buffer, which on the happy path (valid UTF-8) is just whatever we pulled from the network as-is. In some cases, a script calls document.write and then the parser is called re-entrantly. (Debugging this is not very fun.)

JavaScript

QuickJS is used both by the pager as a scripting language and by buffers for running on-page scripts when JavaScript is enabled.

The core JS related functionality has been separated out into the Monoucha library, so it can be used outside of Chawan too. Interested readers are invited to read the Monoucha manual as well.

General

To avoid having to type out all the type conversion & error handling code manually, we have JS pragmas to automagically turn Nim procedures into JavaScript functions. An explanation of what these pragmas are & what they do can be found in the header of js/javascript.nim.

The type conversion itself is handled by the overloaded toJS function and the generic fromJS function. toJS returns a JSValue, the native data type of QuickJS. fromJS returns a Result[T, JSError], which is interpreted as follows:

An additional point of interest is reference types: ref types registered with the registerType macro can be freely passed to JS, and the function-defining macros set functions on their JS prototypes. When a ref type is passed to JS, a shim JS object is associated with the Nim object, and will remain in memory until neither Nim nor JS has references to it.

Effectively, this means that you can expose Nim objects to JS and take Nim objects as arguments through the automagical .jsfunc pragma (& friends) without having to bother with error-prone manual reference counting. How this is achieved is detailed below. (You generally don't need the following info unless you're debugging the JS type conversion logic, in which case I offer my condolences.)

In fact, there is a complication in this system: QuickJS has a reference-counting GC, but Nim also has a reference-counting GC. Associating two objects that are managed by two separate GCs is problematic, because even if you can freely manage the references on both objects, you now have a cycle that only a cycle collector can break up. A cross-GC cycle collector is obviously out of the question; it would be easier to just replace the entire GC in one of the runtimes.

So instead, we hook into the QuickJS cycle collector (through a custom patch). Every time a JS companion object of a Nim object would be freed, we first check if the Nim object still has references from Nim, and if yes, prevent the JS object from being freed by "moving" a reference to the JS object (i.e. unref Nim, ref JS).

Then, if we want to pass the object to JS again, we add no references to the JS object, only to the Nim object. In this way, the reference is "moved" back to JS.

This way, the Nim cycle collector can destroy the object without problems if no more references to it exist. Moreover, if you set some properties on the JS companion object, they persist even if, for some time, the object is referenced only from Nim and not from JS. That is, this works:

document.querySelector("html").canary = "chirp";
console.log(document.querySelector("html").canary); /* chirp */

JS in the pager

Keybindings can be assigned JavaScript functions in the config, and then the pager executes those when the keybindings are pressed.

Also, contents of the start.startup-script option are executed at startup. This is used when cha is called with the -r flag.

There is an API, described at api.md. Web APIs are exposed to the pager too, but you cannot operate on the DOMs themselves from the pager, unless you create one yourself with DOMParser.parseFromString.

config.md describes all commands that are used in the default config.

JS in the buffer

The DOM is implemented through the same wrappers as those in the pager. (Obviously, the pager modules are not exposed to buffer JS.)

Aside from document.write, it is mostly straightforward, and usually works OK, though too many things are missing to really make it useful.

As for document.write: don't ask. It works as far as I can tell, but I wouldn't know why.

Styling

css/ contains everything related to styling: CSS parsing and cascading.

The parser is not very interesting; it's just an implementation of the CSS 3 parsing module. The latest iteration of the selector parser is pretty good. The media query parser and the CSS value parser both work OK, but are missing some commonly used features like variables.

Cascading is slow, though it could be slower. Chawan has style caching, so re-styles are normally very fast. Also, a hash map is used for reducing initial style calculation times. However, we don't have a Bloom filter yet.

Layout

Layout can be found in the layout/ module.

It has some problems:

Our layout engine is a rather simple procedural layout implementation. It runs in two passes.

In the first pass, it generates the layout tree; this is important because rules for generating anonymous boxes are surprisingly involved. (Specifically, anonymous inline box handling is kind of a mess.)

The second pass then does the actual arrangement of the boxes on the screen. The output tree uses relative coordinates; that is, every box is positioned relative to its parent.
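To make the relative-coordinate convention concrete, here is a small Python sketch that walks such a tree and computes absolute positions by accumulating parent offsets. The `Box` type and its fields are hypothetical and far simpler than the real layout tree.

```python
from dataclasses import dataclass, field


@dataclass
class Box:
    """A laid-out box; x and y are offsets from the parent box."""
    name: str
    x: int
    y: int
    children: list["Box"] = field(default_factory=list)


def absolute_positions(box: Box, origin_x: int = 0, origin_y: int = 0) -> dict:
    """Map each box name to its absolute (x, y) by summing ancestor offsets."""
    abs_x = origin_x + box.x
    abs_y = origin_y + box.y
    positions = {box.name: (abs_x, abs_y)}
    for child in box.children:
        positions.update(absolute_positions(child, abs_x, abs_y))
    return positions


if __name__ == "__main__":
    tree = Box("body", 0, 0, [
        Box("div", 2, 1, [Box("p", 1, 3)]),
        Box("footer", 0, 10),
    ])
    print(absolute_positions(tree))
    # {'body': (0, 0), 'div': (2, 1), 'p': (3, 4), 'footer': (0, 10)}
```

One consequence of this convention is that moving a box only requires changing its own offset; all of its descendants follow without being touched.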

Layout is fully recursive. This means that after a certain nesting depth, the buffer will run out of stack space and promptly crash.

Since we do not cache layout results, and the whole page is laid out every time (there is no partial layout), it gets quite slow on large documents.

Rendering

After layout is finished, the document is rendered onto a text-based canvas, which is represented as a sequence of strings associated with their formatting.

Again, the entire document is rendered, which is the main reason why Chawan performs poorly on large documents.

The positive side of this is that search is very simple (and fast), since we are just running regexes over a linear sequence of strings.
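A sketch of why this is simple, in Python: with the canvas modeled as a plain list of strings (formatting omitted), search is a loop over lines. The function name `search` is hypothetical, and the returned offsets are character indices within each line, which need not equal screen columns when wide characters are involved.

```python
import re


def search(lines: list[str], pattern: str) -> list[tuple[int, int]]:
    """Return (line index, character offset) for every match of pattern."""
    rx = re.compile(pattern)
    return [(y, m.start())
            for y, line in enumerate(lines)
            for m in rx.finditer(line)]


if __name__ == "__main__":
    canvas = ["Hello world", "say hello", "HELLO"]
    print(search(canvas, r"[Hh]ello"))  # [(0, 0), (1, 4)]
```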