Diffstat (limited to 'doc/cha-localcgi.5')
-rw-r--r-- | doc/cha-localcgi.5 | 336 |
1 files changed, 336 insertions, 0 deletions
diff --git a/doc/cha-localcgi.5 b/doc/cha-localcgi.5
new file mode 100644
index 00000000..c9a8a54e
--- /dev/null
+++ b/doc/cha-localcgi.5
@@ -0,0 +1,336 @@
+.\" Automatically generated by Pandoc 2.17.1.1
+.\"
+.\" Define V font for inline verbatim, using C font in formats
+.\" that render this, and otherwise B font.
+.ie "\f[CB]x\f[]"x" \{\
+. ftr V B
+. ftr VI BI
+. ftr VB B
+. ftr VBI BI
+.\}
+.el \{\
+. ftr V CR
+. ftr VI CI
+. ftr VB CB
+. ftr VBI CBI
+.\}
+.TH "cha-localcgi" "5" "" "" "Local CGI support in Chawan"
+.hy
+.SH Local CGI support in Chawan
+.PP
+Chawan supports the invocation of CGI scripts locally.
+This feature can be used in the following way:
+.IP \[bu] 2
+All local CGI scripts must be placed in a directory specified in
+\f[V]external.cgi-dir\f[R].
+Multiple directories can be specified in an array too, and directories
+specified first have higher precedence.
+.IP \[bu] 2
+Then, a CGI script in one of these directories can be executed by
+visiting the URL \f[V]cgi-bin:script-name\f[R].
+$PATH_INFO and $QUERY_STRING are set as normal,
+i.e.\ \f[V]cgi-bin:script-name/abcd?defgh=ijkl\f[R] will set $PATH_INFO
+to \f[V]/abcd\f[R], and $QUERY_STRING to \f[V]defgh=ijkl\f[R].
+.PP
+Further notes on processing CGI paths:
+.IP \[bu] 2
+The URL must be opaque, so you must not add a double slash after the
+scheme.
+e.g.\ \f[V]cgi-bin://script-name\f[R] will NOT work, only
+\f[V]cgi-bin:script-name\f[R].
+.IP \[bu] 2
+Paths beginning with \f[V]/cgi-bin/\f[R] or \f[V]/$LIB/\f[R] are
+stripped of this segment automatically.
+So e.g.\ \f[V]cgi-bin:/cgi-bin/script-name\f[R] becomes
+\f[V]cgi-bin:script-name\f[R].
+.IP \[bu] 2
+If \f[V]external.w3m-cgi-compat\f[R] is true, file: URLs are converted
+to cgi-bin: URLs if the path name starts with \f[V]/cgi-bin/\f[R],
+\f[V]/$LIB/\f[R], or the path of a local CGI script.
+Note: this is unsafe, please do not use it unless you must.
+.IP \[bu] 2
+Absolute paths are accepted as
+e.g.\ \f[V]cgi-bin:/path/to/cgi/dir/script-name\f[R].
+Note however, that this only works if \f[V]/path/to/cgi/dir\f[R] has
+already been specified as a CGI directory in \f[V]external.cgi-dir\f[R].
+.PP
+Note that this is different from w3m\[cq]s cgi-bin functionality, in
+that we use a custom scheme for local CGI instead of interpreting all
+requests to a designated path as a CGI request.
+(This incompatibility is bridged over when
+\f[V]external.w3m-cgi-compat\f[R] is true.)
+.SS Headers
+.PP
+Local CGI scripts may send some headers that Chawan will interpret
+specially (and thus will not pass forward to e.g.\ the fetch API, etc):
+.IP \[bu] 2
+\f[V]Status\f[R]: interpreted as the HTTP status code.
+.IP \[bu] 2
+\f[V]Cha-Control\f[R]: special header, see below.
+.PP
+Note that these headers MUST be sent before any regular headers.
+Headers received after a regular header or a
+\f[V]Cha-Control: ControlDone\f[R] header will be treated as regular
+headers.
+.PP
+The \f[V]Cha-Control\f[R] header\[cq]s value is parsed as follows:
+.IP
+.nf
+\f[C]
+Cha-Control-Value = Command *Parameter
+Command = ALPHA *ALPHA
+Parameter = SPACE *CHAR
+\f[R]
+.fi
+.PP
+In other words, it is \f[V]Command [Param1] [Param2] ...\f[R].
+.PP
+Currently available commands are:
+.IP \[bu] 2
+\f[V]Connected\f[R]: Takes no parameters.
+Must be the first reported header; it means that connection to the
+server has been successfully established, but no data has been received
+yet.
+When any other header is sent first, Chawan will act as if a
+\f[V]Cha-Control: Connected\f[R] header had been implicitly sent before
+that.
+.IP \[bu] 2
+\f[V]ConnectionError\f[R]: Must be the first reported header.
+Parameter 1 is the error code, see below.
+If any following parameters are given, they are concatenated to form a
+custom error message.
+Note: short but descriptive error messages are preferred, messages that
+do not fit on the screen are currently truncated.
+(TODO fix this somehow :P)
+.IP \[bu] 2
+\f[V]ControlDone\f[R]: Signals that no more special headers will be
+sent; this means that \f[V]Cha-Control\f[R] and \f[V]Status\f[R] headers
+sent after this must be interpreted as regular headers (and thus
+e.g.\ will be available for JS code calling the script using the fetch
+API).
+WARNING: this header must be sent before any non-hardcoded headers that
+take external input.
+For example, an HTTP client would have to send
+\f[V]Cha-Control: ControlDone\f[R] before returning the retrieved
+headers.
+.PP
+List of public error codes:
+.IP \[bu] 2
+\f[V]1 internal error\f[R]: An internal error prevented the script from
+retrieving the requested resource.
+CGI scripts can also use this to signal that they have no information on
+what went wrong.
+.IP \[bu] 2
+\f[V]2 invalid method\f[R]: The client requested data using a method not
+supported by this protocol.
+.IP \[bu] 2
+\f[V]3 invalid URL\f[R]: The request URL could not be interpreted as a
+valid URL for this format.
+.IP \[bu] 2
+\f[V]4 file not found\f[R]: No file was found at the requested address,
+and thus the request is meaningless.
+Note: this should only be used by protocols that do not rely on a
+client-server architecture, e.g.\ local file access, local databases, or
+peer-to-peer file retrieval mechanisms.
+A server responding with \[lq]no file found\[rq] is NOT a connection
+error, and is better represented as a response with a 404 status code.
+.IP \[bu] 2
+\f[V]5 failed to resolve host\f[R]: The hostname could not be resolved.
+.IP \[bu] 2
+\f[V]6 failed to resolve proxy\f[R]: The proxy could not be resolved.
+.IP \[bu] 2
+\f[V]7 connection refused\f[R]: The server refused to establish a
+connection.
+.IP \[bu] 2
+\f[V]8 proxy refused to connect\f[R]: The proxy refused to establish a
+connection.
+.SS Environment variables
+.PP
+Chawan sets the following environment variables:
+.IP \[bu] 2
+\f[V]SERVER_SOFTWARE=\[dq]Chawan\[dq]\f[R]
+.IP \[bu] 2
+\f[V]SERVER_PROTOCOL=\[dq]HTTP/1.0\[dq]\f[R]
+.IP \[bu] 2
+\f[V]SERVER_NAME=\[dq]localhost\[dq]\f[R]
+.IP \[bu] 2
+\f[V]SERVER_PORT=\[dq]80\[dq]\f[R]
+.IP \[bu] 2
+\f[V]REMOTE_HOST=\[dq]localhost\[dq]\f[R]
+.IP \[bu] 2
+\f[V]REMOTE_ADDR=\[dq]127.0.0.1\[dq]\f[R]
+.IP \[bu] 2
+\f[V]GATEWAY_INTERFACE=\[dq]CGI/1.1\[dq]\f[R]
+.IP \[bu] 2
+\f[V]SCRIPT_NAME=\[dq]/cgi-bin/script-name\[dq]\f[R] if called with a
+relative path, and \f[V]\[dq]/path/to/script/script-name\[dq]\f[R] if
+called with an absolute path.
+.IP \[bu] 2
+\f[V]SCRIPT_FILENAME=\[dq]/path/to/script/script-name\[dq]\f[R]
+.IP \[bu] 2
+\f[V]QUERY_STRING=\f[R] the query string (i.e.\ \f[V]URL.search\f[R]).
+Note that this variable is percent-encoded.
+.IP \[bu] 2
+\f[V]PATH_INFO=\f[R] everything after the script\[cq]s path name,
+e.g.\ for \f[V]cgi-bin:script-name/abcd/efgh\f[R] this is
+\f[V]\[dq]/abcd/efgh\[dq]\f[R].
+Note that this variable is NOT percent-encoded.
+.IP \[bu] 2
+\f[V]REQUEST_URI=\[dq]$SCRIPT_NAME/$PATH_INFO?$QUERY_STRING\[dq]\f[R]
+.IP \[bu] 2
+\f[V]REQUEST_METHOD=\f[R] HTTP method used for making the request,
+e.g.\ GET or POST.
+.IP \[bu] 2
+\f[V]REQUEST_HEADERS=\f[R] A newline-separated list of all headers for
+this request.
+.IP \[bu] 2
+\f[V]CHA_LIBEXEC_DIR=\f[R] The libexec directory Chawan was configured
+to use at compile time.
+See the tools section below for details of why this is useful.
+.IP \[bu] 2
+\f[V]CONTENT_TYPE=\f[R] for POST requests, the Content-Type header.
+Not set for other request types (e.g.\ GET).
+.IP \[bu] 2
+\f[V]CONTENT_LENGTH=\f[R] the content length, if $CONTENT_TYPE has been
+set.
+.IP \[bu] 2
+\f[V]ALL_PROXY=\f[R] if a proxy has been set, the proxy URL.
+WARNING: for security reasons, this MUST be respected when making
+external connections.
+If a CGI script does not support proxies, it must never make any
+external connections when the \f[V]ALL_PROXY\f[R] variable is set, even
+if this results in it returning an error.
+.IP \[bu] 2
+\f[V]HTTP_COOKIE=\f[R] if set, the Cookie header.
+.IP \[bu] 2
+\f[V]HTTP_REFERER=\f[R] if set, the Referer header.
+.PP
+For requests originating from a urimethodmap rewrite, Chawan will also
+set the parsed URL\[cq]s parts as environment variables.
+Use of these is highly encouraged, to avoid exploits originating from
+double-parsing of URLs.
+.PP
+e.g.\ if
+example://username:password\[at]example.org:1234/path/name.html?example
+is the original URL, then:
+.IP \[bu] 2
+\f[V]MAPPED_URI_SCHEME=\f[R] the scheme of the original URL, in this
+case \f[V]example\f[R].
+.IP \[bu] 2
+\f[V]MAPPED_URI_USERNAME=\f[R] the username part, in this case
+\f[V]username\f[R].
+If no username was specified, the variable is set to the empty string.
+.IP \[bu] 2
+\f[V]MAPPED_URI_PASSWORD=\f[R] the password part, in this case
+\f[V]password\f[R].
+If no password was specified, the variable is set to the empty string.
+.IP \[bu] 2
+\f[V]MAPPED_URI_HOST=\f[R] the host part, in this case
+\f[V]example.org\f[R]. If no host was specified, the variable is set to the
+empty string.
+(An example of a URL with no host: \f[V]about:blank\f[R], here
+\f[V]blank\f[R] is the path name.)
+.IP \[bu] 2
+\f[V]MAPPED_URI_PORT=\f[R] the port, in this case \f[V]1234\f[R].
+If no port was specified, the variable is set to the empty string.
+(In this case, the CGI script is expected to use the default port for
+the scheme, if any.)
+.IP \[bu] 2
+\f[V]MAPPED_URI_PATH=\f[R] the path name, in this case
+\f[V]/path/name.html\f[R].
+If no path was specified, the variable is set to the empty string.
+Note: the path name is percent-encoded.
+.IP \[bu] 2
+\f[V]MAPPED_URI_QUERY=\f[R] the query string, in this case
+\f[V]example\f[R].
+Note that, unlike in JavaScript, no question mark is prepended to the
+string.
+The query string is percent-encoded as well.
+.PP
+Note: the fragment part is omitted intentionally.
+.SS Request body
+.PP
+If the request body is not empty, it is streamed into the program
+through the standard input.
+.PP
+Note that this may be either an application/x-www-form-urlencoded or a
+multipart/form-data request; \f[V]CONTENT_TYPE\f[R] stores information
+about the request type, and in case of a multipart request, the boundary
+as well.
+.SS Tools
+.PP
+Chawan provides certain helper binaries that may be useful for CGI
+scripts.
+These can be portably accessed by executing
+\f[V]\[dq]$CHA_LIBEXEC_DIR\[dq]/[program name]\f[R].
+.PP
+Currently, the following tools are available:
+.IP \[bu] 2
+\f[V]urldec\f[R]: percent-decode strings passed on standard input.
+.IP \[bu] 2
+\f[V]urlenc\f[R]: percent-encode strings passed on standard input,
+taking a percent-encode set as the first parameter.
+.SS Troubleshooting
+.PP
+Note that standard error is redirected to the browser console (by
+default, M-cM-c).
+This makes it easy to debug a misbehaving CGI script, but may also slow
+down the browser in case of excessive logging.
+If this is not the desired behavior, we recommend wrapping your script
+in a shell script that redirects stderr to /dev/null.
+.SS My script is returning a \[lq]no local-CGI directory configured\[rq] error message.
+.PP
+Currently, the default setting includes a cgi-bin directory at
+\f[V]$(which cha)/../libexec/chawan/cgi-bin\f[R], which usually looks
+something like \f[V]/usr/local/libexec/chawan/cgi-bin\f[R].
+You only get the above message if you intentionally set the cgi-dir
+setting to an empty array.
+(This will likely break everything else too, so do not.)
+.PP
+To change the default local-CGI directory, use the
+\f[V]external.cgi-dir\f[R] option.
+.PP
+e.g.\ you could add this to your config.toml:
+.IP
+.nf
+\f[C]
+[external]
+cgi-dir = [\[dq]\[ti]/cgi-bin\[dq], \[dq]${%CHA_LIBEXEC_DIR}/cgi-bin\[dq]]
+\f[R]
+.fi
+.PP
+and then put your script in \f[V]$HOME/cgi-bin\f[R].
+Note the second element in the array; if you don\[cq]t add it, the
+default CGI scripts (including http, https, etc\&...)
+will not work.
+.SS My script is returning a \[lq]Failed to execute script\[rq] error message.
+.PP
+This means the \f[V]execl\f[R] call to the script failed.
+Make sure that your CGI script\[cq]s executable bit is set, i.e.\ run
+\f[V]chmod +x /path/to/cgi/script\f[R].
+.SS My script is returning an \[lq]invalid CGI path\[rq] error message.
+.PP
+Make sure that you did not include leading slashes.
+Reminder: \f[V]cgi-bin://script-name\f[R] does not work, use
+\f[V]cgi-bin:script-name\f[R].
+.SS My script is returning a \[lq]CGI file not found\[rq] error message.
+.PP
+Double check that your CGI script is in the correct location.
+Also, make sure that you are not accidentally calling the script with an
+absolute path via \f[V]cgi-bin:/script-name\f[R] (instead of the correct
+\f[V]cgi-bin:script-name\f[R]).
+.PP
+It is also possible that \f[V]external.cgi-dir\f[R] is not really set to
+the directory your script is in.
+Note that by default, this depends on the binary\[cq]s path, so e.g.\ if
+your binary is in \f[V]\[ti]/src/chawan/target/release/bin/cha\f[R], but
+you put your CGI script in \f[V]/usr/local/libexec/chawan/cgi-bin\f[R],
+then it will not work.
+.SS My script is returning a \[lq]failed to set up CGI script\[rq] error message.
+.PP
+This means that either \f[V]pipe\f[R] or \f[V]fork\f[R] failed.
+Something strange is going on with your system; we recommend exorcism.
+(Maybe you are running out of memory?)
+.SS See also
+.PP
+\f[B]cha\f[R](1)
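
The header ordering described in the Headers section of the man page above (special headers first, then `Cha-Control: ControlDone`, then regular headers) can be illustrated with a short script. What follows is only a minimal sketch in Python, not code taken from Chawan. The function names `build_response`, `connection_error` and `handle` are hypothetical, and so is the echo-style response body. Two further assumptions are not confirmed by the man page: that a bare newline is an acceptable header line terminator, and that nothing needs to follow a `ConnectionError` header.

```python
#!/usr/bin/env python3
"""Hypothetical local CGI script illustrating the Cha-Control header order."""
import os
import sys

# Public error code from the man page's list ("2 invalid method").
INVALID_METHOD = 2


def connection_error(code, message=""):
    """Return a single Cha-Control: ConnectionError header line.

    Parameter 1 is the error code; any further parameters are joined by
    Chawan into a custom error message. Assumption: nothing else has to be
    sent after this header.
    """
    value = f"ConnectionError {code}"
    if message:
        value += " " + message
    return f"Cha-Control: {value}\n"


def build_response(status, headers, body):
    """Return a full response with the special headers placed first.

    Order: Connected, Status, ControlDone, then the regular headers.
    ControlDone is sent before any header that could carry external input,
    as the man page's WARNING requires.
    """
    lines = [
        "Cha-Control: Connected",
        f"Status: {status}",
        "Cha-Control: ControlDone",
    ]
    lines += [f"{name}: {value}" for name, value in headers]
    # Assumption: headers end with an empty line, as in ordinary CGI output.
    return "\n".join(lines) + "\n\n" + body


def handle(environ):
    """Build the response for one request from CGI environment variables."""
    if environ.get("REQUEST_METHOD", "GET") != "GET":
        return connection_error(INVALID_METHOD, "only GET is supported")
    path_info = environ.get("PATH_INFO", "")      # not percent-encoded
    query = environ.get("QUERY_STRING", "")       # percent-encoded
    body = f"PATH_INFO={path_info}\nQUERY_STRING={query}\n"
    return build_response(200, [("Content-Type", "text/plain")], body)


if __name__ == "__main__":
    sys.stdout.write(handle(os.environ))
```

If a script like this were saved as an executable file named, say, `echo-env` in a configured `external.cgi-dir` directory, visiting `cgi-bin:echo-env/abcd?defgh=ijkl` should, per the man page, reach it with `PATH_INFO` set to `/abcd` and `QUERY_STRING` set to `defgh=ijkl`.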