.\" Automatically generated by Pandoc 3.1.13
.\"
.TH "cha\-protocols" "5" "" "" "Protocol support in Chawan"
.SH Protocols
Chawan supports downloading resources over various protocols: HTTP, FTP,
Gopher, Gemini, Finger, and Spartan.
Details on these protocols, and information on how users can add support
for their preferred protocols, are outlined in this document.
.SS HTTP
HTTP(S) support is based on libcurl; supported features largely depend on
your libcurl version.
The adapter is found at \f[CR]adapter/protocol/http.nim\f[R].
.PP
The libcurl HTTP adapter can take arbitrary headers and POST data, is
able to use passed userinfo data
(\f[CR]https://username:password\[at]example.org\f[R]), and returns all
headers and response body it receives from libcurl without exception.
.PP
It is possible to build these adapters using \c
.UR https://github.com/lwthiker/curl-impersonate
curl\-impersonate
.UE \c
\ by setting the compile\-time variable CURLLIBNAME to
\f[CR]libcurl\-impersonate.so\f[R].
Note that for curl\-impersonate to work, you must set
\f[CR]network.default\-headers = {}\f[R] in the Chawan config.
(Otherwise, the libcurl adapter will happily override curl\-impersonate
headers, which is probably not what you want.)
.PP
The \f[CR]bonus/libfetch\f[R] directory contains an alternative HTTP
client, which is based on FreeBSD libfetch.
It is mostly a proof of concept, as FreeBSD libfetch HTTP support is
very limited; in particular, it does not support HTTP headers (beyond
some basic request headers), so e.g.\ cookies will not work.
.SS FTP
Chawan supports FTP through the \f[CR]adapter/protocol/ftp.nim\f[R]
libcurl adapter.
For directory listings, it assumes UNIX output style, and will probably
break horribly on receiving anything else.
Otherwise, the directory listing view is identical to the file://
directory listing.
.PP
SFTP \[lq]works\[rq] too, but YMMV.
Note that if an IdentityFile declaration is found in your ssh config,
then it will prompt for the identity file password, but there is no way
to tell whether it is really asking for that.
Also, settings covered by the Match field are ignored.
.PP
In theory, FTPS should work too, but it is completely untested.
.SS Gopher
Gopher is supported through the \f[CR]adapter/protocol/gopher.nim\f[R]
libcurl adapter.
Gopher directories are passed as the \f[CR]text/gopher\f[R] type, and
\f[CR]adapter/format/gopher.nim\f[R] takes care of converting this to
HTML.
.PP
Gopher selector types are converted to MIME types when possible; note,
however, that this is very limited, as most of them (like \f[CR]s\f[R]
sound, or \f[CR]I\f[R] image) cannot be unambiguously converted without
some other sniffing method.
Chawan will fall back to extension\-based detection in these cases, and
in the worst case may end up with \f[CR]application/octet\-stream\f[R].
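.PP
The fallback order described above can be sketched as a small shell
function. This is only an illustration: the function name
\f[CR]gopher_mime\f[R] and the mapping table are assumptions made for
this example, not the table the Chawan adapter actually uses.

```sh
#!/bin/sh
# Illustrative only: the mapping below is an assumption, not Chawan's own.
# $1 = Gopher item type character, $2 = selector (used for its extension).
gopher_mime() {
    # Item types that map to exactly one MIME type.
    case $1 in
    0) echo text/plain; return ;;
    1) echo text/gopher; return ;;   # directory listing
    g) echo image/gif; return ;;
    h) echo text/html; return ;;
    esac
    # Ambiguous types such as "I" (image) or "s" (sound): guess from the
    # extension of the selector instead.
    case $2 in
    *.png) echo image/png ;;
    *.jpg|*.jpeg) echo image/jpeg ;;
    *.mp3) echo audio/mpeg ;;
    *.ogg) echo audio/ogg ;;
    *) echo application/octet-stream ;;   # worst case
    esac
}
```

For instance, \f[CR]gopher_mime I /pics/cat.png\f[R] falls through the
first table and is resolved by its extension.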
.SS Gemini
Chawan\[cq]s gemini adapter (in \f[CR]adapter/protocol/gmifetch.c\f[R])
is a C program.
It requires OpenSSL to work.
.PP
Currently, it still has some limitations:
.IP \[bu] 2
It does not support proxies yet.
.IP \[bu] 2
It does not support sites that require private key authentication.
.PP
\f[CR]adapter/format/gmi2html.nim\f[R] is its companion program to
convert the \f[CR]text/gemini\f[R] file format to HTML.
Note that the gemtext specification insists on line breaks being
visually significant, and forbids their collapsing onto a single line;
gmi2html respects this.
However, inline whitespace is still collapsed outside of preformatted
blocks.
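.PP
To make the whitespace rule concrete, here is a rough sketch of the same
behaviour as a text filter. It is not how gmi2html is implemented (that
program emits HTML); the function name \f[CR]collapse_inline_ws\f[R] is
made up for this example.

```sh
#!/bin/sh
# Collapse runs of spaces/tabs outside preformatted blocks while keeping
# every line break. Reads gemtext on stdin, writes the result to stdout.
collapse_inline_ws() {
    # Three backticks toggle preformatted mode in gemtext; build the
    # marker with octal escapes so it does not appear literally here.
    fence=$(printf '\140\140\140')
    awk -v fence="$fence" '
        substr($0, 1, 3) == fence { pre = !pre; print; next }
        pre { print; next }                 # preformatted: leave untouched
        { gsub(/[ \t]+/, " "); print }      # elsewhere: one space per run
    '
}
```

Every input line produces exactly one output line, so line breaks stay
significant; only the spacing within a line changes.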
.SS Finger
Finger is supported through the \f[CR]adapter/protocol/cha\-finger\f[R]
shell script.
It is implemented as a shell script because of the protocol\[cq]s
simplicity.
cha\-finger uses the \f[CR]curl\f[R] program\[cq]s telnet:// protocol to
make requests.
As such, it will not work if \f[CR]curl\f[R] is not installed.
.PP
Aspiring protocol adapter writers are encouraged to study cha\-finger
for a simple example of how a custom protocol handler could be written.
.SS Spartan
Spartan is a protocol similar to Gemini, but without TLS.
It is supported through the \f[CR]adapter/protocol/spartan\f[R] shell
script, which uses \f[CR]nc\f[R] to make requests.
.PP
Spartan has the very strange property of extending gemtext with a
protocol\-specific line type.
This is sort of supported through a sed filter for gemtext outputs in
the CGI script (in other words, no modification to gmi2html was done to
support this).
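.PP
As a rough idea of what such a filter could look like: the Spartan
specification adds a prompt line type starting with \f[CR]=:\f[R], and
one simple approach is to rewrite it as an ordinary gemtext link line.
This is a sketch under that assumption; the filter in the actual script
may differ, and the function name is hypothetical.

```sh
#!/bin/sh
# Rewrite Spartan prompt lines ("=: target label") as ordinary gemtext
# link lines ("=> target label") so a plain gemtext converter accepts them.
# Simplification: lines inside preformatted blocks are not treated specially.
spartan_to_gemtext() {
    sed 's/^=:/=>/'
}
```

Note that this discards the information that the link was meant to
prompt for input.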
.SS Local schemes: file:, about:, man:, data:
While these are not necessarily \f[I]protocols\f[R], they are
implemented similarly to the protocols listed above (and thus can also
be replaced, if the user wishes; see below).
.PP
\f[CR]file:\f[R] loads a file from the local filesystem.
In case of directories, it shows the directory listing like the FTP
protocol does.
.PP
\f[CR]about:\f[R] contains informational pages about the browser.
At the time of writing, the following pages are available:
\f[CR]about:chawan\f[R], \f[CR]about:blank\f[R] and
\f[CR]about:license\f[R].
.PP
\f[CR]man:\f[R], \f[CR]man\-k:\f[R] and \f[CR]man\-l:\f[R] are wrappers
around the commands \f[CR]man\f[R], \f[CR]man \-k\f[R] and
\f[CR]man \-l\f[R].
These look up man pages using \f[CR]/usr/bin/man\f[R] and turn on\-page
references into links.
A wrapper command \f[CR]mancha\f[R] also exists; this has an interface
similar to \f[CR]man\f[R].
Note: this used to be based on w3mman2html.cgi, but it has been
rewritten in Nim (and therefore no longer depends on Perl either).
.PP
\f[CR]data:\f[R] decodes a data URL as defined in RFC 2397.
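.PP
For readers unfamiliar with the format, a data URL has the shape
\f[CR]data:[<mediatype>][;base64],<data>\f[R].
The sketch below splits such a URL with shell parameter expansion.
It is not Chawan's decoder: the function names are invented for this
example, only the base64 form is decoded, and it assumes a
\f[CR]base64\f[R] utility that accepts \f[CR]\-d\f[R].

```sh
#!/bin/sh
# Print the payload of a data: URL given as $1.
# Percent-decoding of the non-base64 form is deliberately omitted.
data_url_decode() {
    rest=${1#data:}
    [ "$rest" != "$1" ] || return 1       # not a data: URL
    meta=${rest%%,*}                      # everything before the first comma
    [ "$meta" != "$rest" ] || return 1    # no comma, so no payload
    payload=${rest#*,}
    case $meta in
    *";base64") printf '%s' "$payload" | base64 -d ;;
    *)          printf '%s' "$payload" ;;
    esac
}

# Print the media type, using the RFC 2397 default when it is omitted.
data_url_type() {
    rest=${1#data:}
    meta=${rest%%,*}
    meta=${meta%;base64}
    printf '%s\n' "${meta:-text/plain;charset=US-ASCII}"
}
```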
.SS Internal schemes: cgi\-bin:, stream:, cache:
Three internal protocols exist: \f[CR]cgi\-bin:\f[R], \f[CR]stream:\f[R]
and \f[CR]cache:\f[R].
These are the basic building blocks for the implementation of every
protocol mentioned above; for this reason, these can \f[I]not\f[R] be
replaced, and are implemented in the main browser binary.
.PP
\f[CR]cgi\-bin:\f[R] executes a local CGI script.
This scheme is used for the actual implementation of the non\-internal
protocols mentioned above.
Local CGI scripts can also be used to implement wrappers of other
programs inside Chawan (e.g.\ dictionaries).
.PP
\f[CR]stream:\f[R] is used for reading in streams returned by external
programs or passed to Chawan via standard input.
It differs from \f[CR]cgi\-bin:\f[R] in that it does not cooperate with
the external process, and that the loader does not keep track of where
the stream originally comes from.
Therefore it is suitable for reading in the output of mailcap entries,
or for turning stdin into a URL.
.PP
Since Chawan does not keep track of the origin of \f[CR]stream:\f[R]
URLs, it is not possible to reload them.
(For that matter, reloading stdin does not make much sense anyway.)
To support rewinding and \[lq]view source\[rq], the output of
\f[CR]stream:\f[R] URLs is stored in a temporary file until the buffer
is discarded.
.PP
\f[CR]cache:\f[R] is not something an end user would normally see;
it\[cq]s used for rewinding or re\-interpreting streams already
downloaded.
Note that this is not a real cache; files are deterministically loaded
from the \[lq]cache\[rq] upon certain actions, and from the network upon
others, but neither is used as a fallback to the other.
.SS Custom protocols
Chawan is protocol\-agnostic.
This means that the \f[CR]cha\f[R] binary itself does not know much
about the protocols listed above; instead, it loads these through a
combination of local CGI, urimethodmap, and if conversion to HTML or
plain text is necessary, mailcap (using x\-htmloutput, x\-ansioutput and
copiousoutput).
.PP
urimethodmap can also be used to override default handlers for the
protocols listed above.
This is similar to how w3m allows you to override the default directory
listing display, but much more powerful; this way, any library or
program that can retrieve and output text through a certain protocol can
be combined with Chawan.
.PP
For example, consider the urimethodmap definition of cha\-finger:
.IP
.EX
finger:     cgi\-bin:cha\-finger
.EE
.PP
This tells Chawan to load the cha\-finger CGI script, setting the
\f[CR]$MAPPED_URI_*\f[R] variables to the target URL\[cq]s parts in the
process.
.PP
Then, cha\-finger uses these passed parts to construct an appropriate
curl command that will retrieve the specified \f[CR]finger:\f[R] URL; it
prints the header \f[CR]Content\-Type: text/plain\f[R] to the output, then an
empty line, then the body of the retrieved resource.
If an error is encountered, it prints a \f[CR]Cha\-Control\f[R] header
with an error code and a specific error message instead.
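.PP
The flow just described might look roughly like the sketch below.
It is not the real cha\-finger script.
The function names, the exact syntax of the error header
(\f[CR]ConnectionError\f[R] followed by a code and a message), and the
placeholder code \f[CR]1\f[R] are assumptions; see
\f[B]cha\-localcgi\f[R](5) for the authoritative header format.

```sh
#!/bin/sh
# Hypothetical sketch of a finger handler; not the actual cha-finger.
# The network step is its own function so it can be swapped out.
fetch_finger() {
    # $1 = host, $2 = user name. The query is piped to curl, which
    # forwards its standard input to the server over telnet://.
    printf '%s\r\n' "$2" | curl -s -m 10 "telnet://$1:79"
}

finger_cgi() {
    host=$1 user=$2
    if body=$(fetch_finger "$host" "$user"); then
        printf 'Content-Type: text/plain\n'
        printf '\n'                       # empty line ends the headers
        printf '%s\n' "$body"
    else
        # Assumed syntax: error name, numeric code (placeholder), message.
        printf 'Cha-Control: ConnectionError 1 could not reach %s\n' "$host"
    fi
}
```

A script built this way would end with a call such as
\f[CR]finger_cgi \[dq]$MAPPED_URI_HOST\[dq] \[dq]${MAPPED_URI_PATH#/}\[dq]\f[R],
where \f[CR]MAPPED_URI_HOST\f[R] is assumed to be one of the
\f[CR]$MAPPED_URI_*\f[R] variables mentioned above.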
.SS Adding a new protocol
Here we will add a protocol called \[lq]cowsay\[rq], so that the URL
cowsay:text prints the output of \f[CR]cowsay text\f[R] after a second
of waiting.
.PP
First, make sure you have a local CGI path \f[CR]\[ti]/cgi\-bin\f[R] set
up in your \f[CR]\[ti]/.config/chawan/config.toml\f[R]:
.IP
.EX
cgi\-dir = [\[dq]\[ti]/cgi\-bin\[dq], \[dq]${%CHA_LIBEXEC_DIR}/cgi\-bin\[dq]]
.EE
.PP
It is also possible to just put your CGI scripts in
\f[CR]/usr/local/libexec/chawan/cgi\-bin\f[R]; this directory is enabled by
default, so no config edits are needed.
However, a dedicated cgi\-bin in your home directory is usually more
convenient.
.PP
Run \f[CR]mkdir \[ti]/cgi\-bin\f[R], then create an executable CGI script in
\f[CR]\[ti]/cgi\-bin\f[R] called \f[CR]cowsay.cgi\f[R]:
.IP
.EX
\f[I]#!/bin/sh\f[R]
\f[I]# We are going to wait a second from now, but want Chawan to show\f[R]
\f[I]# \[dq]Downloading...\[dq] instead of \[dq]Connecting...\[dq]. So signal to the browser that the\f[R]
\f[I]# connection has succeeded.\f[R]
printf \[aq]Cha\-Control: Connected\[rs]n\[aq]
sleep 1 \f[I]# simulate a slow connection\f[R]
\f[I]# Status is a special header that signals the equivalent HTTP status code.\f[R]
printf \[aq]Status: 200\[rs]n\[aq] \f[I]# HTTP OK\f[R]
\f[I]# Tell the browser that no more control headers are to be expected.\f[R]
\f[I]# This is useful when you want to send remotely received headers; then, it would\f[R]
\f[I]# be an attack vector to simply send the headers without ControlDone, as nothing\f[R]
\f[I]# stops the website from sending a Cha\-Control header. With ControlDone sent,\f[R]
\f[I]# even Cha\-Control headers will be interpreted as regular headers.\f[R]
printf \[aq]Cha\-Control: ControlDone\[rs]n\[aq]
\f[I]# As in HTTP, you must send an empty line before the body.\f[R]
printf \[aq]\[rs]n\[aq]
\f[I]# Now, print the body. We take the path passed to the URL; urimethodmap\f[R]
\f[I]# sets this as MAPPED_URI_PATH. This is URI\-encoded, so we also run the urldec\f[R]
\f[I]# utility on it.\f[R]
cowsay \[dq]$(printf \[aq]%s\[rs]n\[aq] \[dq]$MAPPED_URI_PATH\[dq] \f[B]|\f[R] \[dq]$CHA_LIBEXEC_DIR\[dq]/urldec)\[dq]
.EE
.PP
Now, create a \[lq].urimethodmap\[rq] file in your \f[CR]$HOME\f[R]
directory.
.PP
Then, enter into it the following:
.IP
.EX
cowsay:     /cgi\-bin/cowsay.cgi
.EE
.PP
Now try \f[CR]cha cowsay:Hello,%20world.\f[R].
If you did everything correctly, it should wait one second, then print a
cow saying \[lq]Hello, world.\[rq].
.SS See also
\f[B]cha\f[R](1), \f[B]cha\-localcgi\f[R](5),
\f[B]cha\-urimethodmap\f[R](5), \f[B]cha\-mailcap\f[R](5)