Diffstat (limited to 'doc/cha-protocols.5')
-rw-r--r-- | doc/cha-protocols.5 | 283 |
1 files changed, 283 insertions, 0 deletions
diff --git a/doc/cha-protocols.5 b/doc/cha-protocols.5
new file mode 100644
index 00000000..202fd991
--- /dev/null
+++ b/doc/cha-protocols.5
@@ -0,0 +1,283 @@
+.\" Automatically generated by Pandoc 2.17.1.1
+.\"
+.\" Define V font for inline verbatim, using C font in formats
+.\" that render this, and otherwise B font.
+.ie "\f[CB]x\f[]"x" \{\
+. ftr V B
+. ftr VI BI
+. ftr VB B
+. ftr VBI BI
+.\}
+.el \{\
+. ftr V CR
+. ftr VI CI
+. ftr VB CB
+. ftr VBI CBI
+.\}
+.TH "cha-protocols" "5" "" "" "Protocol support in Chawan"
+.hy
+.SH Protocols
+.PP
+Chawan supports downloading resources over various protocols: HTTP, FTP,
+Gopher, Gemini, Finger, and Spartan.
+Details on these protocols, and information on how users can add support
+for their preferred protocols, are outlined in this document.
+.SS HTTP
+.PP
+HTTP(S) support is based on libcurl; supported features largely depend on
+your libcurl version.
+The adapter is found at \f[V]adapter/protocol/http.nim\f[R].
+.PP
+The libcurl HTTP adapter can take arbitrary headers and POST data, is
+able to use passed userinfo data
+(\f[V]https://username:password\[at]example.org\f[R]), and returns all
+headers and response body it receives from libcurl without exception.
+.PP
+It is possible to build these adapters using
+curl-impersonate (https://github.com/lwthiker/curl-impersonate) by
+setting the compile-time variable CURLLIBNAME to
+\f[V]libcurl-impersonate.so\f[R].
+Note that for curl-impersonate to work, you must set
+\f[V]network.default-headers = {}\f[R] in the Chawan config.
+(Otherwise, the libcurl adapter will happily override curl-impersonate
+headers, which is probably not what you want.)
+.PP
+The \f[V]bonus/libfetch\f[R] directory contains an alternative HTTP
+client, which is based on FreeBSD libfetch.
+It is mostly a proof of concept, as FreeBSD libfetch HTTP support is
+very limited; in particular, it does not support HTTP headers (beyond
+some basic request headers), so e.g.\ cookies will not work.
+.SS FTP
+.PP
+Chawan supports FTP through the \f[V]adapter/protocol/ftp.nim\f[R]
+libcurl adapter.
+For directory listings, it assumes UNIX output style, and will probably
+break horribly on receiving anything else.
+Otherwise, the directory listing view is identical to the file://
+directory listing.
+.PP
+SFTP \[lq]works\[rq] too, but YMMV.
+Note that if an IdentityFile declaration is found in your ssh config,
+then it will prompt for the identity file password, but there is no way
+to tell whether it is really asking for that.
+Also, settings covered by the Match field are ignored.
+.PP
+In theory, FTPS should work too, but it is completely untested.
+.SS Gopher
+.PP
+Gopher is supported through the \f[V]adapter/protocol/gopher.nim\f[R]
+libcurl adapter.
+Gopher directories are passed as the \f[V]text/gopher\f[R] type, and
+\f[V]adapter/format/gopher.nim\f[R] takes care of converting this to
+HTML.
+.PP
+Gopher selector types are converted to MIME types when possible; note,
+however, that this is very limited, as most of them (like \f[V]s\f[R]
+sound, or \f[V]I\f[R] image) cannot be unambiguously converted without
+some other sniffing method.
+Chawan will fall back to extension-based detection in these cases, and
+in the worst case may end up with \f[V]application/octet-stream\f[R].
+.SS Gemini
+.PP
+Chawan\[cq]s gemini adapter (in \f[V]adapter/protocol/gmifetch.c\f[R])
+is a C program.
+It requires OpenSSL to work.
+.PP
+Currently, it still has some limitations:
+.IP \[bu] 2
+It does not support proxies yet.
+.IP \[bu] 2
+It does not support sites that require private key authentication.
+.PP
+\f[V]adapter/format/gmi2html.nim\f[R] is its companion program to
+convert the \f[V]text/gemini\f[R] file format to HTML.
+Note that the gemtext specification insists on line breaks being
+visually significant, and forbids their collapsing onto a single line;
+gmi2html respects this.
+However, inline whitespace is still collapsed outside of preformatted
+blocks.
+.SS Finger
+.PP
+Finger is supported through the \f[V]adapter/protocol/cha-finger\f[R]
+shell script.
+It is implemented as a shell script because of the protocol\[cq]s
+simplicity.
+cha-finger uses the \f[V]curl\f[R] program\[cq]s telnet:// protocol to
+make requests.
+As such, it will not work if \f[V]curl\f[R] is not installed.
+.PP
+Aspiring protocol adapter writers are encouraged to study cha-finger for
+a simple example of how a custom protocol handler could be written.
+.SS Spartan
+.PP
+Spartan is a protocol similar to Gemini, but without TLS.
+It is supported through the \f[V]adapter/protocol/spartan\f[R] shell
+script, which uses \f[V]nc\f[R] to make requests.
+.PP
+Spartan has the very strange property of extending gemtext with a
+protocol-specific line type.
+This is sort of supported through a sed filter for gemtext outputs in
+the CGI script (in other words, no modification to gmi2html was done to
+support this).
+.SS Local schemes: file:, about:, man:, data:
+.PP
+While these are not necessarily \f[I]protocols\f[R], they are
+implemented similarly to the protocols listed above (and thus can also
+be replaced, if the user wishes; see below).
+.PP
+\f[V]file:\f[R] loads a file from the local filesystem.
+In case of directories, it shows the directory listing like the FTP
+protocol does.
+.PP
+\f[V]about:\f[R] contains informational pages about the browser.
+At the time of writing, the following pages are available:
+\f[V]about:chawan\f[R], \f[V]about:blank\f[R] and
+\f[V]about:license\f[R].
+.PP
+\f[V]man:\f[R], \f[V]man-k:\f[R] and \f[V]man-l:\f[R] are wrappers
+around the commands \f[V]man\f[R], \f[V]man -k\f[R] and
+\f[V]man -l\f[R].
+These look up man pages using \f[V]/usr/bin/man\f[R] and turn on-page
+references into links.
+A wrapper command \f[V]mancha\f[R] also exists; this has an interface
+similar to \f[V]man\f[R].
+Note: this used to be based on w3mman2html.cgi, but it has been
+rewritten in Nim (and therefore no longer depends on Perl either).
+.PP
+\f[V]data:\f[R] decodes a data URL as defined in RFC 2397.
+.SS Internal schemes: cgi-bin:, stream:, cache:
+.PP
+Three internal protocols exist: \f[V]cgi-bin:\f[R], \f[V]stream:\f[R]
+and \f[V]cache:\f[R].
+These are the basic building blocks for the implementation of every
+protocol mentioned above; for this reason, these can \f[I]not\f[R] be
+replaced, and are implemented in the main browser binary.
+.PP
+\f[V]cgi-bin:\f[R] executes a local CGI script.
+This scheme is used for the actual implementation of the non-internal
+protocols mentioned above.
+Local CGI scripts can also be used to implement wrappers of other
+programs inside Chawan (e.g.\ dictionaries).
+.PP
+\f[V]stream:\f[R] is used for reading in streams returned by external
+programs or passed to Chawan via standard input.
+It differs from \f[V]cgi-bin:\f[R] in that it does not cooperate with
+the external process, and that the loader does not keep track of where
+the stream originally comes from.
+Therefore, it is suitable for reading in the output of mailcap entries,
+or for turning stdin into a URL.
+.PP
+Since Chawan does not keep track of the origin of \f[V]stream:\f[R]
+URLs, it is not possible to reload them.
+(For that matter, reloading stdin does not make much sense anyway.)
+To support rewinding and \[lq]view source\[rq], the output of
+\f[V]stream:\f[R] URLs is stored in a temporary file until the buffer
+is discarded.
+.PP
+\f[V]cache:\f[R] is not something an end user would normally see;
+it\[cq]s used for rewinding or re-interpreting streams already
+downloaded.
+Note that this is not a real cache; files are deterministically loaded
+from the \[lq]cache\[rq] upon certain actions, and from the network upon
+others, but neither is used as a fallback to the other.
+.SS Custom protocols
+.PP
+Chawan is protocol-agnostic.
+This means that the \f[V]cha\f[R] binary itself does not know much about
+the protocols listed above; instead, it loads these through a
+combination of local CGI, urimethodmap, and, if conversion to HTML or
+plain text is necessary, mailcap (using x-htmloutput, x-ansioutput and
+copiousoutput).
+.PP
+urimethodmap can also be used to override default handlers for the
+protocols listed above.
+This is similar to how w3m allows you to override the default directory
+listing display, but much more powerful; this way, any library or
+program that can retrieve and output text through a certain protocol can
+be combined with Chawan.
+.PP
+For example, consider the urimethodmap definition of cha-finger:
+.IP
+.nf
+\f[C]
+finger: cgi-bin:cha-finger
+\f[R]
+.fi
+.PP
+This instructs Chawan to load the cha-finger CGI script, setting the
+\f[V]$MAPPED_URI_*\f[R] variables to the target URL\[cq]s parts in the
+process.
+.PP
+Then, cha-finger uses these passed parts to construct an appropriate
+curl command that will retrieve the specified \f[V]finger:\f[R] URL; it
+prints the header \f[V]Content-Type: text/plain\f[R] to the output, then an
+empty line, then the body of the retrieved resource.
+If an error is encountered, it prints a \f[V]Cha-Control\f[R] header
+with an error code and a specific error message instead.
+.SS Adding a new protocol
+.PP
+Here we will add a protocol called \[lq]cowsay\[rq], so that the URL
+cowsay:text prints the output of \f[V]cowsay text\f[R] after a second of
+waiting.
+.PP
+First, make sure you have a local CGI path \f[V]\[ti]/cgi-bin\f[R] set
+up in your \f[V]\[ti]/.config/chawan/config.toml\f[R]:
+.IP
+.nf
+\f[C]
+cgi-dir = [\[dq]\[ti]/cgi-bin\[dq], \[dq]${%CHA_LIBEXEC_DIR}/cgi-bin\[dq]]
+\f[R]
+.fi
+.PP
+It is also possible to just put your CGI scripts in
+\f[V]/usr/local/libexec/chawan/cgi-bin\f[R]; this is enabled by default,
+so you need no edits in your config.
+But it seems more convenient to use a dedicated cgi-bin in your home
+directory.
+.PP
+\f[V]mkdir \[ti]/cgi-bin\f[R], and create a CGI script in
+\f[V]\[ti]/cgi-bin\f[R] called \f[V]cowsay.cgi\f[R]:
+.IP
+.nf
+\f[C]
+#!/bin/sh
+# We are going to wait a second from now, but want Chawan to show
+# \[dq]Downloading...\[dq] instead of \[dq]Connecting...\[dq]. So signal to the browser that the
+# connection has succeeded.
+printf \[aq]Cha-Control: Connected\[rs]n\[aq]
+sleep 1 # sleep
+# Status is a special header that signals the equivalent HTTP status code.
+printf \[aq]Status: 200\[rs]n\[aq] # HTTP OK
+# Tell the browser that no more control headers are to be expected.
+# This is useful when you want to send remotely received headers; then, it would
+# be an attack vector to simply send the headers without ControlDone, as nothing
+# stops the website from sending a Cha-Control header. With ControlDone sent,
+# even Cha-Control headers will be interpreted as regular headers.
+printf \[aq]Cha-Control: ControlDone\[rs]n\[aq]
+# As in HTTP, you must send an empty line before the body.
+printf \[aq]\[rs]n\[aq]
+# Now, print the body. We take the path passed to the URL; urimethodmap
+# sets this as MAPPED_URI_PATH. This is URI-encoded, so we also run the urldec
+# utility on it.
+cowsay \[dq]$(printf \[aq]%s\[rs]n\[aq] \[dq]$MAPPED_URI_PATH\[dq] | \[dq]$CHA_LIBEXEC_DIR\[dq]/urldec)\[dq]
+\f[R]
+.fi
+.PP
+Now, create a \[lq].urimethodmap\[rq] file in your \f[V]$HOME\f[R]
+directory.
+.PP
+Then, enter into it the following:
+.IP
+.nf
+\f[C]
+cowsay: /cgi-bin/cowsay.cgi
+\f[R]
+.fi
+.PP
+Now try \f[V]cha cowsay:Hello,%20world.\f[R].
+If you did everything correctly, it should wait one second, then print a
+cow saying \[lq]Hello, world.\[rq].
+.SS See also
+.PP
+\f[B]cha\f[R](1), \f[B]cha-localcgi\f[R](5),
+\f[B]cha-urimethodmap\f[R](5), \f[B]cha-mailcap\f[R](5)
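
The response framing that the man page describes for local CGI handlers (control headers, a status line, an empty line, then the body) can be tried without Chawan or cowsay installed. What follows is a minimal sketch under stated assumptions, not the project's script: the function name `cgi_response` and the fallback sample path are hypothetical, while the header names and `MAPPED_URI_PATH` are taken from the man page. The body is printed still URI-encoded, whereas the documented script decodes it with urldec and hands it to cowsay.

```shell
#!/bin/sh
# Sketch of the local CGI response framing described in cha-protocols(5).
# cgi_response is a hypothetical helper name, not part of Chawan.

cgi_response() {
	# Signal that the connection has succeeded.
	printf 'Cha-Control: Connected\n'
	# Equivalent HTTP status code; each header ends with a newline.
	printf 'Status: 200\n'
	# From here on, Cha-Control headers are treated as regular headers.
	printf 'Cha-Control: ControlDone\n'
	# Empty line separating the headers from the body.
	printf '\n'
	# Body: the path as received, still URI-encoded (assumption: the
	# real handler would decode it and pass it to another program).
	printf '%s\n' "$1"
}

# Chawan sets MAPPED_URI_PATH; fall back to a sample value when the
# script is run by hand.
cgi_response "${MAPPED_URI_PATH:-Hello,%20world.}"
```

Run by hand with no environment set, this prints the three header lines, an empty line, and the sample path. Setting `MAPPED_URI_PATH` in the environment replaces the body line with that value.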