.\" Automatically generated by Pandoc 3.4
.\"
.TH "cha\-localcgi" "5" "" "" "Local CGI support in Chawan"
.SH Local CGI support in Chawan
Chawan supports the invocation of CGI scripts locally.
This feature can be used in the following way:
.IP \[bu] 2
All local CGI scripts must be placed in a directory specified in
\f[CR]external.cgi\-dir\f[R].
Multiple directories can be specified in an array too, and directories
specified first have higher precedence.
.IP \[bu] 2
Then, a CGI script in one of these directories can be executed by
visiting the URL \f[CR]cgi\-bin:script\-name\f[R].
$PATH_INFO and $QUERY_STRING are set as normal;
e.g.\ \f[CR]cgi\-bin:script\-name/abcd?defgh=ijkl\f[R] will set
$PATH_INFO to \f[CR]/abcd\f[R], and $QUERY_STRING to
\f[CR]defgh=ijkl\f[R].
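.PP
As a rough illustration of the steps above, a minimal script might look
like the following POSIX shell sketch.
The file name \f[CR]hello.cgi\f[R] and the function name
\f[CR]render_page\f[R] are hypothetical, and the
\f[CR]Content\-Type\f[R] header is the usual CGI convention rather than
something this page specifies.
.IP
.EX
```shell
#!/bin/sh
# Minimal local CGI sketch (hypothetical file name: hello.cgi).
# render_page prints a CGI response: a header, a blank line, then a body
# that echoes the two variables discussed above.
render_page() {
  echo "Content-Type: text/plain"
  echo ""
  echo "PATH_INFO=${PATH_INFO:-}"
  echo "QUERY_STRING=${QUERY_STRING:-}"
}

render_page
```
.EE
.PP
Assuming the file is executable and sits in a configured CGI directory,
visiting \f[CR]cgi\-bin:hello.cgi/abcd?defgh=ijkl\f[R] should show
\f[CR]/abcd\f[R] and \f[CR]defgh=ijkl\f[R].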
.PP
Further notes on processing CGI paths:
.IP \[bu] 2
The URL must be opaque, so you must not add a double slash after the
scheme.
For example, \f[CR]cgi\-bin://script\-name\f[R] will NOT work; only
\f[CR]cgi\-bin:script\-name\f[R] does.
.IP \[bu] 2
Paths beginning with \f[CR]/cgi\-bin/\f[R] or \f[CR]/$LIB/\f[R] are
stripped of this segment automatically.
So e.g.\ \f[CR]cgi\-bin:/cgi\-bin/script\-name\f[R] becomes
\f[CR]cgi\-bin:script\-name\f[R].
.IP \[bu] 2
If \f[CR]external.w3m\-cgi\-compat\f[R] is true, file: URLs are
converted to cgi\-bin: URLs if the path name starts with
\f[CR]/cgi\-bin/\f[R], \f[CR]/$LIB/\f[R], or the path of a local CGI
script.
Note: this is unsafe, please do not use it unless you must.
.IP \[bu] 2
Absolute paths are accepted as
e.g.\ \f[CR]cgi\-bin:/path/to/cgi/dir/script\-name\f[R].
Note however, that this only works if \f[CR]/path/to/cgi/dir\f[R] has
already been specified as a CGI directory in
\f[CR]external.cgi\-dir\f[R].
.PP
Note that this is different from w3m\[cq]s cgi\-bin functionality, in
that we use a custom scheme for local CGI instead of interpreting all
requests to a designated path as a CGI request.
(This incompatibility is bridged over when
\f[CR]external.w3m\-cgi\-compat\f[R] is true.)
.SS Headers
Local CGI scripts may send some headers that Chawan will interpret
specially (and thus will not pass on to e.g.\ the fetch API):
.IP \[bu] 2
\f[CR]Status\f[R]: interpreted as the HTTP status code.
.IP \[bu] 2
\f[CR]Cha\-Control\f[R]: special header, see below.
.PP
Note that these headers MUST be sent before any regular headers.
Headers received after a regular header or a
\f[CR]Cha\-Control: ControlDone\f[R] header will be treated as regular
headers.
.PP
The \f[CR]Cha\-Control\f[R] header\[cq]s value is parsed as follows:
.IP
.EX
Cha\-Control\-Value = Command *Parameter
Command = ALPHA *ALPHA
Parameter = SPACE *CHAR
.EE
.PP
In other words, it is \f[CR]Command [Param1] [Param2] ...\f[R].
.PP
Currently available commands are:
.IP \[bu] 2
\f[CR]Connected\f[R]: Takes no parameters.
Must be the first reported header; it means that a connection to the
server has been successfully established, but no data has been received
yet.
When any other header is sent first, Chawan will act as if a
\f[CR]Cha\-Control: Connected\f[R] header had been implicitly sent
before that.
.IP \[bu] 2
\f[CR]ConnectionError\f[R]: Must be the first reported header.
Parameter 1 is the error code, see below.
If any following parameters are given, they are concatenated to form a
custom error message.
Note: short but descriptive error messages are preferred; messages that
do not fit on the screen are currently truncated.
(TODO fix this somehow :P)
.IP \[bu] 2
\f[CR]ControlDone\f[R]: Signals that no more special headers will be
sent; this means that \f[CR]Cha\-Control\f[R] and \f[CR]Status\f[R]
headers sent after this must be interpreted as regular headers (and thus
e.g.\ will be available for JS code calling the script using the fetch
API).
WARNING: this header must be sent before any non\-hardcoded headers that
take external input.
For example, an HTTP client would have to send
\f[CR]Cha\-Control: ControlDone\f[R] before returning the retrieved
headers.
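.PP
The ordering rules above can be sketched in POSIX shell as follows.
The helper names \f[CR]fail_with\f[R] and \f[CR]begin_response\f[R] are
hypothetical, and the sketch only shows one plausible ordering: special
headers first, then \f[CR]ControlDone\f[R], then everything else.
.IP
.EX
```shell
#!/bin/sh
# Sketch of header ordering for a script that relays a remote response.
# fail_with and begin_response are hypothetical helper names.

# Report a connection error; this must be the first header sent.
# $1 is the error code (number or name), the rest is an optional message.
fail_with() {
  code=$1
  shift
  if [ $# -gt 0 ]; then
    echo "Cha-Control: ConnectionError $code $*"
  else
    echo "Cha-Control: ConnectionError $code"
  fi
}

# Send the special headers, then close the control section so that any
# header copied from an external source is treated as a regular header.
begin_response() {
  echo "Cha-Control: Connected"
  echo "Status: $1"
  echo "Cha-Control: ControlDone"
}

begin_response 200
echo "Content-Type: text/plain"
echo ""
echo "hello"
```
.EE
.PP
A script that cannot reach its target would instead call something like
\f[CR]fail_with 7 connection refused\f[R] and stop.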
.PP
Following is a list of error codes and their string counterparts.
CGI scripts may use either (but not both) in a ConnectionError header.
.IP \[bu] 2
\f[CR]1 InternalError\f[R]: An internal error prevented the script from
retrieving the requested resource.
CGI scripts can also use this to signal that they have no information on
what went wrong.
.IP \[bu] 2
\f[CR]2 InvalidMethod\f[R]: The client requested data using a method not
supported by this protocol.
.IP \[bu] 2
\f[CR]3 InvalidURL\f[R]: The request URL could not be interpreted as a
valid URL for this format.
.IP \[bu] 2
\f[CR]4 FileNotFound\f[R]: No file was found at the requested address,
and thus the request is meaningless.
Note: this should only be used by protocols that do not rely on a
client\-server architecture, e.g.\ local file access, local databases,
or peer\-to\-peer file retrieval mechanisms.
A server responding with \[lq]no file found\[rq] is NOT a connection
error, and is better represented as a response with a 404 status code.
.IP \[bu] 2
\f[CR]5 FailedToResolveHost\f[R]: The hostname could not be resolved.
.IP \[bu] 2
\f[CR]6 FailedToResolveProxy\f[R]: The proxy could not be resolved.
.IP \[bu] 2
\f[CR]7 ConnectionRefused\f[R]: The server refused to establish a
connection.
.IP \[bu] 2
\f[CR]8 ProxyRefusedToConnect\f[R]: The proxy refused to establish a
connection.
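.PP
For scripts that track errors numerically but prefer to report the
string form, the list above can be written as a small lookup.
This is only a sketch; \f[CR]error_name\f[R] is a hypothetical helper
name.
.IP
.EX
```shell
#!/bin/sh
# Map a numeric ConnectionError code to the string form listed above.
# error_name is a hypothetical helper; unknown codes print nothing and
# return a non-zero status.
error_name() {
  case $1 in
    1) echo InternalError ;;
    2) echo InvalidMethod ;;
    3) echo InvalidURL ;;
    4) echo FileNotFound ;;
    5) echo FailedToResolveHost ;;
    6) echo FailedToResolveProxy ;;
    7) echo ConnectionRefused ;;
    8) echo ProxyRefusedToConnect ;;
    *) return 1 ;;
  esac
}

echo "Cha-Control: ConnectionError $(error_name 7)"
```
.EE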
.SS Environment variables
Chawan sets the following environment variables:
.IP \[bu] 2
\f[CR]SERVER_SOFTWARE=\[dq]Chawan\[dq]\f[R]
.IP \[bu] 2
\f[CR]SERVER_PROTOCOL=\[dq]HTTP/1.0\[dq]\f[R]
.IP \[bu] 2
\f[CR]SERVER_NAME=\[dq]localhost\[dq]\f[R]
.IP \[bu] 2
\f[CR]SERVER_PORT=\[dq]80\[dq]\f[R]
.IP \[bu] 2
\f[CR]REMOTE_HOST=\[dq]localhost\[dq]\f[R]
.IP \[bu] 2
\f[CR]REMOTE_ADDR=\[dq]127.0.0.1\[dq]\f[R]
.IP \[bu] 2
\f[CR]GATEWAY_INTERFACE=\[dq]CGI/1.1\[dq]\f[R]
.IP \[bu] 2
\f[CR]SCRIPT_NAME=\[dq]/cgi\-bin/script\-name\[dq]\f[R] if called with a
relative path, and \f[CR]\[dq]/path/to/script/script\-name\[dq]\f[R] if
called with an absolute path.
.IP \[bu] 2
\f[CR]SCRIPT_FILENAME=\[dq]/path/to/script/script\-name\[dq]\f[R]
.IP \[bu] 2
\f[CR]QUERY_STRING=\f[R] the query string (i.e.\ \f[CR]URL.search\f[R]).
Note that this variable is percent\-encoded.
.IP \[bu] 2
\f[CR]PATH_INFO=\f[R] everything after the script\[cq]s path name,
e.g.\ for \f[CR]cgi\-bin:script\-name/abcd/efgh\f[R]
\f[CR]\[dq]/abcd/efgh\[dq]\f[R].
Note that this variable is NOT percent\-encoded.
.IP \[bu] 2
\f[CR]REQUEST_URI=\[dq]$SCRIPT_NAME/$PATH_INFO?$QUERY_STRING\[dq]\f[R]
.IP \[bu] 2
\f[CR]REQUEST_METHOD=\f[R] HTTP method used for making the request,
e.g.\ GET or POST
.IP \[bu] 2
\f[CR]REQUEST_HEADERS=\f[R] A newline\-separated list of all headers for
this request.
.IP \[bu] 2
\f[CR]CHA_LIBEXEC_DIR=\f[R] The libexec directory Chawan was configured
to use at compile time.
See the tools section below for details of why this is useful.
.IP \[bu] 2
\f[CR]CONTENT_TYPE=\f[R] for POST requests, the Content\-Type header.
Not set for other request types (e.g.\ GET).
.IP \[bu] 2
\f[CR]CONTENT_LENGTH=\f[R] the content length, if $CONTENT_TYPE has been
set.
.IP \[bu] 2
\f[CR]ALL_PROXY=\f[R] if a proxy has been set, the proxy URL.
WARNING: for security reasons, this MUST be respected when making
external connections.
If a CGI script does not support proxies, it must never make any
external connections when the \f[CR]ALL_PROXY\f[R] variable is set, even
if this results in it returning an error.
.IP \[bu] 2
\f[CR]HTTP_COOKIE=\f[R] if set, the Cookie header.
.IP \[bu] 2
\f[CR]HTTP_REFERER=\f[R] if set, the Referer header.
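.PP
The \f[CR]ALL_PROXY\f[R] rule above can be sketched as a guard that runs
before any network access.
The helper name \f[CR]check_proxy\f[R] is hypothetical, and reporting
the refusal as \f[CR]InternalError\f[R] is an assumption; this page does
not prescribe a specific code for that case.
.IP
.EX
```shell
#!/bin/sh
# Sketch of the ALL_PROXY rule for a script with no proxy support.
# check_proxy is a hypothetical helper name.
check_proxy() {
  if [ -n "${ALL_PROXY:-}" ]; then
    # Refuse to connect rather than bypass the configured proxy.
    # (InternalError is an assumed choice of error code.)
    echo "Cha-Control: ConnectionError InternalError proxy not supported"
    return 1
  fi
  return 0
}

main() {
  check_proxy || return 1
  echo "Content-Type: text/plain"
  echo ""
  echo "direct connection allowed"
}

main
```
.EE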
.PP
For requests originating from a urimethodmap rewrite, Chawan will also
set the parsed URL\[cq]s parts as environment variables.
Use of these is highly encouraged, to avoid exploits originating from
double\-parsing of URLs.
.PP
e.g.\ if
example://username:password\[at]example.org:1234/path/name.html?example
is the original URL, then:
.IP \[bu] 2
\f[CR]MAPPED_URI_SCHEME=\f[R] the scheme of the original URL, in this
case \f[CR]example\f[R].
.IP \[bu] 2
\f[CR]MAPPED_URI_USERNAME=\f[R] the username part, in this case
\f[CR]username\f[R].
If no username was specified, the variable is set to the empty string.
.IP \[bu] 2
\f[CR]MAPPED_URI_PASSWORD=\f[R] the password part, in this case
\f[CR]password\f[R].
If no password was specified, the variable is set to the empty string.
.IP \[bu] 2
\f[CR]MAPPED_URI_HOST=\f[R] the host part, in this case
\f[CR]example.org\f[R].
If no host was specified, the variable is set to the
empty string.
(An example of a URL with no host: \f[CR]about:blank\f[R], here
\f[CR]blank\f[R] is the path name.)
.IP \[bu] 2
\f[CR]MAPPED_URI_PORT=\f[R] the port, in this case \f[CR]1234\f[R].
If no port was specified, the variable is set to the empty string.
(In this case, the CGI script is expected to use the default port for
the scheme, if any.)
.IP \[bu] 2
\f[CR]MAPPED_URI_PATH=\f[R] the path name, in this case
\f[CR]/path/name.html\f[R].
If no path was specified, the variable is set to the empty string.
Note: the path name is percent\-encoded.
.IP \[bu] 2
\f[CR]MAPPED_URI_QUERY=\f[R] the query string, in this case
\f[CR]example\f[R].
Note that, unlike in JavaScript, no question mark is prepended to the
string.
The query string is percent\-encoded as well.
.PP
Note: the fragment part is omitted intentionally.
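.PP
A script can use these variables directly instead of parsing a URL a
second time.
The following POSIX shell sketch assumes hypothetical helper names
(\f[CR]target_address\f[R], \f[CR]request_target\f[R]) and takes the
scheme\[cq]s default port as an argument, since this page does not
define one.
.IP
.EX
```shell
#!/bin/sh
# Build connection details from the pre-parsed MAPPED_URI_* variables.
# Both helper names are hypothetical.

# Print "host:port"; $1 is the default port used when MAPPED_URI_PORT
# is empty or unset.
target_address() {
  echo "${MAPPED_URI_HOST:-}:${MAPPED_URI_PORT:-$1}"
}

# Print the path, followed by "?query" only when a query is present.
request_target() {
  if [ -n "${MAPPED_URI_QUERY:-}" ]; then
    echo "${MAPPED_URI_PATH:-}?${MAPPED_URI_QUERY}"
  else
    echo "${MAPPED_URI_PATH:-}"
  fi
}

# Example values taken from the URL used in the list above.
(
  MAPPED_URI_HOST=example.org
  MAPPED_URI_PORT=1234
  MAPPED_URI_PATH=/path/name.html
  MAPPED_URI_QUERY=example
  target_address 70
  request_target
)
```
.EE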
.SS Request body
If the request body is not empty, it is streamed into the program
through the standard input.
.PP
Note that this may be either an application/x\-www\-form\-urlencoded or a
multipart/form\-data request; \f[CR]CONTENT_TYPE\f[R] stores information
about the request type, and in case of a multipart request, the boundary
as well.
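.PP
Reading the body could look like the following POSIX shell sketch.
The helper names \f[CR]read_body\f[R] and \f[CR]body_kind\f[R] are
hypothetical; limiting the read to \f[CR]CONTENT_LENGTH\f[R] bytes is
the common CGI convention and is assumed here rather than required by
this page.
.IP
.EX
```shell
#!/bin/sh
# Sketch of request body handling; both helper names are hypothetical.

# Copy at most CONTENT_LENGTH bytes from standard input to standard
# output. Prints nothing when CONTENT_LENGTH is unset or zero.
read_body() {
  if [ "${CONTENT_LENGTH:-0}" -gt 0 ]; then
    dd bs=1 count="$CONTENT_LENGTH" 2>/dev/null
  fi
}

# Classify the body based on CONTENT_TYPE.
body_kind() {
  case ${CONTENT_TYPE:-} in
    application/x-www-form-urlencoded*) echo urlencoded ;;
    multipart/form-data*) echo multipart ;;
    *) echo other ;;
  esac
}

echo "body kind: $(body_kind)"
```
.EE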
.SS Tools
Chawan provides certain helper binaries that may be useful for CGI
scripts.
These can be portably accessed by executing
\f[CR]\[dq]$CHA_LIBEXEC_DIR\[dq]/[program name]\f[R].
.PP
Currently, the following tools are available:
.IP \[bu] 2
\f[CR]urldec\f[R]: percent\-decode strings passed on standard input.
.IP \[bu] 2
\f[CR]urlenc\f[R]: percent\-encode strings passed on standard input,
taking a percent\-encode set as the first parameter.
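.PP
As a sketch of how a script might call one of these tools, the function
below pipes \f[CR]QUERY_STRING\f[R] (which is percent\-encoded) through
\f[CR]urldec\f[R].
The function name \f[CR]decoded_query\f[R] is hypothetical, and the
exact output format of \f[CR]urldec\f[R] is not described on this page,
so treat the result handling as an assumption.
.IP
.EX
```shell
#!/bin/sh
# Percent-decode QUERY_STRING using the urldec helper.
# decoded_query is a hypothetical helper name; CHA_LIBEXEC_DIR is set
# by Chawan when it runs the script.
decoded_query() {
  printf '%s' "${QUERY_STRING:-}" | "$CHA_LIBEXEC_DIR"/urldec
}
```
.EE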
.SS Troubleshooting
Note that standard error is redirected to the browser console (by
default, M\-cM\-c).
This makes it easy to debug a misbehaving CGI script, but may also slow
down the browser in case of excessive logging.
If this is not the desired behavior, we recommend wrapping your script
in a shell script that redirects stderr to /dev/null.
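.PP
Such a wrapper can be very small.
The sketch below uses a hypothetical helper, \f[CR]run_quiet\f[R], to
show the redirection; the script path in the comment is a placeholder.
.IP
.EX
```shell
#!/bin/sh
# Wrapper sketch: run a command with standard error discarded.
# run_quiet is a hypothetical helper. In a real wrapper, the whole file
# could instead be a single line such as:
#   exec /path/to/real-script 2>/dev/null
run_quiet() {
  "$@" 2>/dev/null
}

run_quiet sh -c "echo kept; echo dropped >&2"
```
.EE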
.SS My script is returning a \[lq]no local\-CGI directory configured\[rq] error message.
Currently, the default setting includes a cgi\-bin directory at
\f[CR]$(which cha)/../libexec/chawan/cgi\-bin\f[R], which usually looks
something like \f[CR]/usr/local/libexec/chawan/cgi\-bin\f[R].
You only get the above message if you intentionally set the cgi\-dir
setting to an empty array.
(This will likely break everything else too, so do not.)
.PP
To change the default local\-CGI directory, use the
\f[CR]external.cgi\-dir\f[R] option.
.PP
e.g.\ you could add this to your config.toml:
.IP
.EX
\f[B][external]\f[R]
cgi\-dir = [\[dq]\[ti]/cgi\-bin\[dq], \[dq]${%CHA_LIBEXEC_DIR}/cgi\-bin\[dq]]
.EE
.PP
and then put your script in \f[CR]$HOME/cgi\-bin\f[R].
Note the second element in the array; if you don\[cq]t add it, the
default CGI scripts (including http, https, etc\&...)
will not work.
.SS My script is returning a \[lq]Failed to execute script\[rq] error message.
This means the \f[CR]execl\f[R] call to the script failed.
Make sure that your CGI script\[cq]s executable bit is set, i.e.\ run
\f[CR]chmod +x /path/to/cgi/script\f[R].
.SS My script is returning an \[lq]invalid CGI path\[rq] error message.
Make sure that you did not include leading slashes.
Reminder: \f[CR]cgi\-bin://script\-name\f[R] does not work, use
\f[CR]cgi\-bin:script\-name\f[R].
.SS My script is returning a \[lq]CGI file not found\[rq] error message.
Double check that your CGI script is in the correct location.
Also, make sure that you are not accidentally calling the script with an
absolute path via \f[CR]cgi\-bin:/script\-name\f[R] (instead of the
correct \f[CR]cgi\-bin:script\-name\f[R]).
.PP
It is also possible that \f[CR]external.cgi\-dir\f[R] is not really set
to the directory your script is in.
Note that by default, this depends on the binary\[cq]s path, so e.g.\ if
your binary is in \f[CR]\[ti]/src/chawan/target/release/bin/cha\f[R],
but you put your CGI script in
\f[CR]/usr/local/libexec/chawan/cgi\-bin\f[R], then it will not work.
.SS My script is returning a \[lq]failed to set up CGI script\[rq] error message.
This means that either \f[CR]pipe\f[R] or \f[CR]fork\f[R] failed.
Something strange is going on with your system; we recommend exorcism.
(Maybe you are running out of memory?)
.SS See also
\f[B]cha\f[R](1)