Diffstat (limited to 'doc/cha-protocols.5')
-rw-r--r-- doc/cha-protocols.5 283
1 files changed, 283 insertions, 0 deletions
diff --git a/doc/cha-protocols.5 b/doc/cha-protocols.5
new file mode 100644
index 00000000..202fd991
--- /dev/null
+++ b/doc/cha-protocols.5
@@ -0,0 +1,283 @@
+.\" Automatically generated by Pandoc 2.17.1.1
+.\"
+.\" Define V font for inline verbatim, using C font in formats
+.\" that render this, and otherwise B font.
+.ie "\f[CB]x\f[]"x" \{\
+. ftr V B
+. ftr VI BI
+. ftr VB B
+. ftr VBI BI
+.\}
+.el \{\
+. ftr V CR
+. ftr VI CI
+. ftr VB CB
+. ftr VBI CBI
+.\}
+.TH "cha-protocols" "5" "" "" "Protocol support in Chawan"
+.hy
+.SH Protocols
+.PP
+Chawan supports downloading resources over various protocols: HTTP, FTP,
+Gopher, Gemini, Finger, and Spartan.
+This document describes these protocols and explains how users can add
+support for their preferred protocols.
+.SS HTTP
+.PP
+HTTP(S) support is based on libcurl; supported features largely depend on
+your libcurl version.
+The adapter is found at \f[V]adapter/protocol/http.nim\f[R].
+.PP
+The libcurl HTTP adapter can take arbitrary headers and POST data, is
+able to use passed userinfo data
+(\f[V]https://username:password\[at]example.org\f[R]), and returns all
+headers and the response body it receives from libcurl without exception.
+.PP
+It is possible to build these adapters using
+curl-impersonate (https://github.com/lwthiker/curl-impersonate) by
+setting the compile-time variable CURLLIBNAME to
+\f[V]libcurl-impersonate.so\f[R].
+Note that for curl-impersonate to work, you must set
+\f[V]network.default-headers = {}\f[R] in the Chawan config.
+(Otherwise, the libcurl adapter will happily override curl-impersonate
+headers, which is probably not what you want.)
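+.PP
+As a sketch, and assuming the usual TOML equivalence between a dotted key
+and a table, the setting above could be written in
+\f[V]\[ti]/.config/chawan/config.toml\f[R] as follows:
+.IP
+.nf
+\f[C]
+[network]
+default-headers = {}
+\f[R]
+.fi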
+.PP
+The \f[V]bonus/libfetch\f[R] directory contains an alternative HTTP
+client, which is based on FreeBSD libfetch.
+It is mostly a proof of concept, as FreeBSD libfetch HTTP support is
+very limited; in particular, it does not support HTTP headers (beyond
+some basic request headers), so e.g.\ cookies will not work.
+.SS FTP
+.PP
+Chawan supports FTP through the \f[V]adapter/protocol/ftp.nim\f[R]
+libcurl adapter.
+For directory listings, it assumes UNIX output style, and will probably
+break horribly on receiving anything else.
+Otherwise, the directory listing view is identical to the file://
+directory listing.
+.PP
+SFTP \[lq]works\[rq] too, but YMMV.
+Note that if an IdentityFile declaration is found in your ssh config,
+then it will prompt for the identity file password, but there is no way
+to tell whether it is really asking for that.
+Also, settings covered by the Match field are ignored.
+.PP
+In theory, FTPS should work too, but it is completely untested.
+.SS Gopher
+.PP
+Gopher is supported through the \f[V]adapter/protocol/gopher.nim\f[R]
+libcurl adapter.
+Gopher directories are passed as the \f[V]text/gopher\f[R] type, and
+\f[V]adapter/format/gopher.nim\f[R] takes care of converting this to
+HTML.
+.PP
+Gopher selector types are converted to MIME types when possible; note
+however, that this is very limited, as most of them (like \f[V]s\f[R]
+sound, or \f[V]I\f[R] image) cannot be unambiguously converted without
+some other sniffing method.
+Chawan will fall back to extension-based detection in these cases, and
+in the worst case may end up with \f[V]application/octet-stream\f[R].
+.SS Gemini
+.PP
+Chawan\[cq]s gemini adapter (in \f[V]adapter/protocol/gmifetch.c\f[R])
+is a C program.
+It requires OpenSSL to work.
+.PP
+Currently, it still has some limitations:
+.IP \[bu] 2
+It does not support proxies yet.
+.IP \[bu] 2
+It does not support sites that require private key authentication.
+.PP
+\f[V]adapter/format/gmi2html.nim\f[R] is its companion program to
+convert the \f[V]text/gemini\f[R] file format to HTML.
+Note that the gemtext specification insists on line breaks being
+visually significant, and forbids their collapsing onto a single line;
+gmi2html respects this.
+However, inline whitespace is still collapsed outside of preformatted
+blocks.
+.SS Finger
+.PP
+Finger is supported through the \f[V]adapter/protocol/cha-finger\f[R]
+shell script.
+It is implemented as a shell script because of the protocol\[cq]s
+simplicity.
+cha-finger uses the \f[V]curl\f[R] program\[cq]s telnet:// protocol to
+make requests.
+As such, it will not work if \f[V]curl\f[R] is not installed.
+.PP
+Aspiring protocol adapter writers are encouraged to study cha-finger for
+a simple example of how a custom protocol handler could be written.
+.SS Spartan
+.PP
+Spartan is a protocol similar to Gemini, but without TLS.
+It is supported through the \f[V]adapter/protocol/spartan\f[R] shell
+script, which uses \f[V]nc\f[R] to make requests.
+.PP
+Spartan has the very strange property of extending gemtext with a
+protocol-specific line type.
+This is sort of supported through a sed filter for gemtext outputs in
+the CGI script (in other words, no modification to gmi2html was done to
+support this).
+.SS Local schemes: file:, about:, man:, data:
+.PP
+While these are not necessarily \f[I]protocols\f[R], they are
+implemented similarly to the protocols listed above (and thus can also
+be replaced, if the user wishes; see below).
+.PP
+\f[V]file:\f[R] loads a file from the local filesystem.
+In case of directories, it shows the directory listing like the FTP
+protocol does.
+.PP
+\f[V]about:\f[R] contains informational pages about the browser.
+At the time of writing, the following pages are available:
+\f[V]about:chawan\f[R], \f[V]about:blank\f[R] and
+\f[V]about:license\f[R].
+.PP
+\f[V]man:\f[R], \f[V]man-k:\f[R] and \f[V]man-l:\f[R] are wrappers
+around the commands \f[V]man\f[R], \f[V]man -k\f[R] and
+\f[V]man -l\f[R].
+These look up man pages using \f[V]/usr/bin/man\f[R] and turn on-page
+references into links.
+A wrapper command \f[V]mancha\f[R] also exists; this has an interface
+similar to \f[V]man\f[R].
+Note: this used to be based on w3mman2html.cgi, but it has been
+rewritten in Nim (and therefore no longer depends on Perl either).
+.PP
+\f[V]data:\f[R] decodes a data URL as defined in RFC 2397.
+.SS Internal schemes: cgi-bin:, stream:, cache:
+.PP
+Three internal protocols exist: \f[V]cgi-bin:\f[R], \f[V]stream:\f[R]
+and \f[V]cache:\f[R].
+These are the basic building blocks for the implementation of every
+protocol mentioned above; for this reason, these can \f[I]not\f[R] be
+replaced, and are implemented in the main browser binary.
+.PP
+\f[V]cgi-bin:\f[R] executes a local CGI script.
+This scheme is used for the actual implementation of the non-internal
+protocols mentioned above.
+Local CGI scripts can also be used to implement wrappers of other
+programs inside Chawan (e.g.\ dictionaries).
+.PP
+\f[V]stream:\f[R] is used for reading in streams returned by external
+programs or passed to Chawan via standard input.
+It differs from \f[V]cgi-bin:\f[R] in that it does not cooperate with
+the external process, and that the loader does not keep track of where
+the stream originally comes from.
+Therefore it is suitable for reading in the output of mailcap entries,
+or for turning stdin into a URL.
+.PP
+Since Chawan does not keep track of the origin of \f[V]stream:\f[R]
+URLs, it is not possible to reload them.
+(For that matter, reloading stdin does not make much sense anyway.)
+To support rewinding and \[lq]view source\[rq], the output of
+\f[V]stream:\f[R] URLs is stored in a temporary file until the buffer
+is discarded.
+.PP
+\f[V]cache:\f[R] is not something an end user would normally see;
+it\[cq]s used for rewinding or re-interpreting streams already
+downloaded.
+Note that this is not a real cache; files are deterministically loaded
+from the \[lq]cache\[rq] upon certain actions, and from the network upon
+others, but neither is used as a fallback to the other.
+.SS Custom protocols
+.PP
+Chawan is protocol-agnostic.
+This means that the \f[V]cha\f[R] binary itself does not know much about
+the protocols listed above; instead, it loads these through a
+combination of local CGI, urimethodmap, and if conversion to HTML or
+plain text is necessary, mailcap (using x-htmloutput, x-ansioutput and
+copiousoutput).
+.PP
+urimethodmap can also be used to override default handlers for the
+protocols listed above.
+This is similar to how w3m allows you to override the default directory
+listing display, but much more powerful; this way, any library or
+program that can retrieve and output text through a certain protocol can
+be combined with Chawan.
+.PP
+For example, consider the urimethodmap definition of cha-finger:
+.IP
+.nf
+\f[C]
+finger:     cgi-bin:cha-finger
+\f[R]
+.fi
+.PP
+This instructs Chawan to load the cha-finger CGI script, setting the
+\f[V]$MAPPED_URI_*\f[R] variables to the target URL\[cq]s parts in the
+process.
+.PP
+Then, cha-finger uses these passed parts to construct an appropriate
+curl command that will retrieve the specified \f[V]finger:\f[R] URL; it
+prints the header \f[V]Content-Type: text/plain\f[R] to the output, then an
+empty line, then the body of the retrieved resource.
+If an error is encountered, it prints a \f[V]Cha-Control\f[R] header
+with an error code and a specific error message instead.
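+.PP
+The same mechanism can be sketched for overriding a default handler.
+The entry below is an illustration only: \f[V]my-gopher.cgi\f[R] is a
+hypothetical script name, assumed to be an executable in one of your
+configured CGI directories.
+.IP
+.nf
+\f[C]
+gopher:     cgi-bin:my-gopher.cgi
+\f[R]
+.fi
+.PP
+With such an entry in your urimethodmap, \f[V]gopher:\f[R] URLs would be
+passed to the hypothetical script instead of the default adapter.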
+.SS Adding a new protocol
+.PP
+Here we will add a protocol called \[lq]cowsay\[rq], so that the URL
+cowsay:text prints the output of \f[V]cowsay text\f[R] after a second of
+waiting.
+.PP
+First, make sure you have a local CGI path \f[V]\[ti]/cgi-bin\f[R] set
+up in your \f[V]\[ti]/.config/chawan/config.toml\f[R]:
+.IP
+.nf
+\f[C]
+cgi-dir = [\[dq]\[ti]/cgi-bin\[dq], \[dq]${%CHA_LIBEXEC_DIR}/cgi-bin\[dq]]
+\f[R]
+.fi
+.PP
+It is also possible to just put your CGI scripts in
+\f[V]/usr/local/libexec/chawan/cgi-bin\f[R]; this is enabled by default,
+so it requires no changes to your config.
+But it seems more convenient to use a dedicated cgi-bin in your home
+directory.
+.PP
+Run \f[V]mkdir \[ti]/cgi-bin\f[R], then create an executable CGI script in
+\f[V]\[ti]/cgi-bin\f[R] called \f[V]cowsay.cgi\f[R]:
+.IP
+.nf
+\f[C]
+#!/bin/sh
+# We are going to wait a second from now, but want Chawan to show
+# \[dq]Downloading...\[dq] instead of \[dq]Connecting...\[dq]. So signal to the browser that the
+# connection has succeeded.
+printf \[aq]Cha-Control: Connected\[rs]n\[aq]
+sleep 1 # simulate a slow response
+# Status is a special header that signals the equivalent HTTP status code.
+printf \[aq]Status: 200\[rs]n\[aq] # HTTP OK
+# Tell the browser that no more control headers are to be expected.
+# This is useful when you want to send remotely received headers; then, it would
+# be an attack vector to simply send the headers without ControlDone, as nothing
+# stops the website from sending a Cha-Control header. With ControlDone sent,
+# even Cha-Control headers will be interpreted as regular headers.
+printf \[aq]Cha-Control: ControlDone\[rs]n\[aq]
+# As in HTTP, you must send an empty line before the body.
+printf \[aq]\[rs]n\[aq]
+# Now, print the body. We take the path passed to the URL; urimethodmap
+# sets this as MAPPED_URI_PATH. This is URI-encoded, so we also run the urldec
+# utility on it.
+cowsay \[dq]$(printf \[aq]%s\[rs]n\[aq] \[dq]$MAPPED_URI_PATH\[dq] | \[dq]$CHA_LIBEXEC_DIR\[dq]/urldec)\[dq]
+\f[R]
+.fi
+.PP
+Now, create a \[lq].urimethodmap\[rq] file in your \f[V]$HOME\f[R]
+directory.
+.PP
+Then, enter into it the following:
+.IP
+.nf
+\f[C]
+cowsay:     /cgi-bin/cowsay.cgi
+\f[R]
+.fi
+.PP
+Now try \f[V]cha cowsay:Hello,%20world.\f[R].
+If you did everything correctly, it should wait one second, then print a
+cow saying \[lq]Hello, world.\[rq].
+.SS See also
+.PP
+\f[B]cha\f[R](1), \f[B]cha-localcgi\f[R](5),
+\f[B]cha-urimethodmap\f[R](5), \f[B]cha-mailcap\f[R](5)