path: root/doc/cha-localcgi.5
author bptato <nincsnevem662@gmail.com> 2024-04-26 19:35:21 +0200
committer bptato <nincsnevem662@gmail.com> 2024-04-26 19:45:53 +0200
commit 601ad98818f3b966686181445339c52f74f75979
tree 3f245aacb085ccbee7e7b7c5efbf5b7cd318a5c1 /doc/cha-localcgi.5
parent 83dae4a87a78190262317eca15cbb5d25989d41b
doc: include auto-generated manpages in repository
The 100kb or so doesn't hurt as much as not having manual pages at all
without pandoc (+ not auto-updating them through make all) does.
Diffstat (limited to 'doc/cha-localcgi.5')
-rw-r--r-- doc/cha-localcgi.5 336
1 files changed, 336 insertions, 0 deletions
diff --git a/doc/cha-localcgi.5 b/doc/cha-localcgi.5
new file mode 100644
index 00000000..c9a8a54e
--- /dev/null
+++ b/doc/cha-localcgi.5
@@ -0,0 +1,336 @@
+.\" Automatically generated by Pandoc 2.17.1.1
+.\"
+.\" Define V font for inline verbatim, using C font in formats
+.\" that render this, and otherwise B font.
+.ie "\f[CB]x\f[]"x" \{\
+. ftr V B
+. ftr VI BI
+. ftr VB B
+. ftr VBI BI
+.\}
+.el \{\
+. ftr V CR
+. ftr VI CI
+. ftr VB CB
+. ftr VBI CBI
+.\}
+.TH "cha-localcgi" "5" "" "" "Local CGI support in Chawan"
+.hy
+.SH Local CGI support in Chawan
+.PP
+Chawan supports the invocation of CGI scripts locally.
+This feature can be used in the following way:
+.IP \[bu] 2
+All local CGI scripts must be placed in a directory specified in
+\f[V]external.cgi-dir\f[R].
+Multiple directories can be specified in an array too, and directories
+specified first have higher precedence.
+.IP \[bu] 2
+Then, a CGI script in one of these directories can be executed by
+visiting the URL \f[V]cgi-bin:script-name\f[R].
+$PATH_INFO and $QUERY_STRING are set as normal,
+i.e.\ \f[V]cgi-bin:script-name/abcd?defgh=ijkl\f[R] will set $PATH_INFO
+to \f[V]/abcd\f[R], and $QUERY_STRING to \f[V]defgh=ijkl\f[R].
+.PP
+Further notes on processing CGI paths:
+.IP \[bu] 2
+The URL must be opaque, so you must not add a double slash after the
+scheme.
+For example, \f[V]cgi-bin://script-name\f[R] will NOT work, only
+\f[V]cgi-bin:script-name\f[R] will.
+.IP \[bu] 2
+Paths beginning with \f[V]/cgi-bin/\f[R] or \f[V]/$LIB/\f[R] are
+stripped of this segment automatically.
+So e.g.\ \f[V]cgi-bin:/cgi-bin/script-name\f[R] becomes
+\f[V]cgi-bin:script-name\f[R].
+.IP \[bu] 2
+If \f[V]external.w3m-cgi-compat\f[R] is true, file: URLs are converted
+to cgi-bin: URLs if the path name starts with \f[V]/cgi-bin/\f[R],
+\f[V]/$LIB/\f[R], or the path of a local CGI script.
+Note: this is unsafe; do not use it unless you must.
+.IP \[bu] 2
+Absolute paths are also accepted,
+e.g.\ \f[V]cgi-bin:/path/to/cgi/dir/script-name\f[R].
+Note, however, that this only works if \f[V]/path/to/cgi/dir\f[R] has
+already been specified as a CGI directory in \f[V]external.cgi-dir\f[R].
+.PP
+Note that this is different from w3m\[cq]s cgi-bin functionality, in
+that we use a custom scheme for local CGI instead of interpreting all
+requests to a designated path as a CGI request.
+(This incompatibility is bridged over when
+\f[V]external.w3m-cgi-compat\f[R] is true.)
+.SS Headers
+.PP
+Local CGI scripts may send some headers that Chawan will interpret
+specially (and thus will not pass on to e.g.\ the fetch API):
+.IP \[bu] 2
+\f[V]Status\f[R]: interpreted as the HTTP status code.
+.IP \[bu] 2
+\f[V]Cha-Control\f[R]: special header, see below.
+.PP
+Note that these headers MUST be sent before any regular headers.
+Headers received after a regular header or a
+\f[V]Cha-Control: ControlDone\f[R] header will be treated as regular
+headers.
+.PP
+The \f[V]Cha-Control\f[R] header\[cq]s value is parsed as follows:
+.IP
+.nf
+\f[C]
+Cha-Control-Value = Command *Parameter
+Command = ALPHA *ALPHA
+Parameter = SPACE *CHAR
+\f[R]
+.fi
+.PP
+In other words, it is \f[V]Command [Param1] [Param2] ...\f[R].
+.PP
+Currently available commands are:
+.IP \[bu] 2
+\f[V]Connected\f[R]: Takes no parameters.
+Must be the first reported header; it means that the connection to the
+server has been successfully established, but no data has been received
+yet.
+When any other header is sent first, Chawan will act as if a
+\f[V]Cha-Control: Connected\f[R] header had been implicitly sent before
+that.
+.IP \[bu] 2
+\f[V]ConnectionError\f[R]: Must be the first reported header.
+Parameter 1 is the error code, see below.
+If any following parameters are given, they are concatenated to form a
+custom error message.
+Note: short but descriptive error messages are preferred; messages that
+do not fit on the screen are currently truncated.
+(TODO fix this somehow :P)
+.IP \[bu] 2
+\f[V]ControlDone\f[R]: Signals that no more special headers will be
+sent; this means that \f[V]Cha-Control\f[R] and \f[V]Status\f[R] headers
+sent after this must be interpreted as regular headers (and thus
+e.g.\ will be available for JS code calling the script using the fetch
+API).
+WARNING: this header must be sent before any non-hardcoded headers that
+take external input.
+For example, an HTTP client would have to send
+\f[V]Cha-Control: ControlDone\f[R] before returning the retrieved
+headers.
+.PP
+List of public error codes:
+.IP \[bu] 2
+\f[V]1 internal error\f[R]: An internal error prevented the script from
+retrieving the requested resource.
+CGI scripts can also use this to signal that they have no information on
+what went wrong.
+.IP \[bu] 2
+\f[V]2 invalid method\f[R]: The client requested data using a method not
+supported by this protocol.
+.IP \[bu] 2
+\f[V]3 invalid URL\f[R]: The request URL could not be interpreted as a
+valid URL for this format.
+.IP \[bu] 2
+\f[V]4 file not found\f[R]: No file was found at the requested address,
+and thus the request is meaningless.
+Note: this should only be used by protocols that do not rely on a
+client-server architecture, e.g.\ local file access, local databases, or
+peer-to-peer file retrieval mechanisms.
+A server responding with \[lq]no file found\[rq] is NOT a connection
+error, and is better represented as a response with a 404 status code.
+.IP \[bu] 2
+\f[V]5 failed to resolve host\f[R]: The hostname could not be resolved.
+.IP \[bu] 2
+\f[V]6 failed to resolve proxy\f[R]: The proxy could not be resolved.
+.IP \[bu] 2
+\f[V]7 connection refused\f[R]: The server refused to establish a
+connection.
+.IP \[bu] 2
+\f[V]8 proxy refused to connect\f[R]: The proxy refused to establish a
+connection.
+.SS Environment variables
+.PP
+Chawan sets the following environment variables:
+.IP \[bu] 2
+\f[V]SERVER_SOFTWARE=\[dq]Chawan\[dq]\f[R]
+.IP \[bu] 2
+\f[V]SERVER_PROTOCOL=\[dq]HTTP/1.0\[dq]\f[R]
+.IP \[bu] 2
+\f[V]SERVER_NAME=\[dq]localhost\[dq]\f[R]
+.IP \[bu] 2
+\f[V]SERVER_PORT=\[dq]80\[dq]\f[R]
+.IP \[bu] 2
+\f[V]REMOTE_HOST=\[dq]localhost\[dq]\f[R]
+.IP \[bu] 2
+\f[V]REMOTE_ADDR=\[dq]127.0.0.1\[dq]\f[R]
+.IP \[bu] 2
+\f[V]GATEWAY_INTERFACE=\[dq]CGI/1.1\[dq]\f[R]
+.IP \[bu] 2
+\f[V]SCRIPT_NAME=\[dq]/cgi-bin/script-name\[dq]\f[R] if called with a
+relative path, and \f[V]\[dq]/path/to/script/script-name\[dq]\f[R] if
+called with an absolute path.
+.IP \[bu] 2
+\f[V]SCRIPT_FILENAME=\[dq]/path/to/script/script-name\[dq]\f[R]
+.IP \[bu] 2
+\f[V]QUERY_STRING=\f[R] the query string (i.e.\ \f[V]URL.search\f[R]).
+Note that this variable is percent-encoded.
+.IP \[bu] 2
+\f[V]PATH_INFO=\f[R] everything after the script\[cq]s path name,
+e.g.\ for \f[V]cgi-bin:script-name/abcd/efgh\f[R]
+\f[V]\[dq]/abcd/efgh\[dq]\f[R].
+Note that this variable is NOT percent-encoded.
+.IP \[bu] 2
+\f[V]REQUEST_URI=\[dq]$SCRIPT_NAME/$PATH_INFO?$QUERY_STRING\[dq]\f[R]
+.IP \[bu] 2
+\f[V]REQUEST_METHOD=\f[R] HTTP method used for making the request,
+e.g.\ GET or POST
+.IP \[bu] 2
+\f[V]REQUEST_HEADERS=\f[R] A newline-separated list of all headers for
+this request.
+.IP \[bu] 2
+\f[V]CHA_LIBEXEC_DIR=\f[R] The libexec directory Chawan was configured
+to use at compile time.
+See the tools section below for details of why this is useful.
+.IP \[bu] 2
+\f[V]CONTENT_TYPE=\f[R] for POST requests, the Content-Type header.
+Not set for other request types (e.g.\ GET).
+.IP \[bu] 2
+\f[V]CONTENT_LENGTH=\f[R] the content length, if $CONTENT_TYPE has been
+set.
+.IP \[bu] 2
+\f[V]ALL_PROXY=\f[R] if a proxy has been set, the proxy URL.
+WARNING: for security reasons, this MUST be respected when making
+external connections.
+If a CGI script does not support proxies, it must never make any
+external connections when the \f[V]ALL_PROXY\f[R] variable is set, even
+if this results in it returning an error.
+.IP \[bu] 2
+\f[V]HTTP_COOKIE=\f[R] if set, the Cookie header.
+.IP \[bu] 2
+\f[V]HTTP_REFERER=\f[R] if set, the Referer header.
+.PP
+For requests originating from a urimethodmap rewrite, Chawan will also
+set the parsed URL\[cq]s parts as environment variables.
+Use of these is highly encouraged, to avoid exploits originating from
+double-parsing of URLs.
+.PP
+e.g.\ if
+example://username:password\[at]example.org:1234/path/name.html?example
+is the original URL, then:
+.IP \[bu] 2
+\f[V]MAPPED_URI_SCHEME=\f[R] the scheme of the original URL, in this
+case \f[V]example\f[R].
+.IP \[bu] 2
+\f[V]MAPPED_URI_USERNAME=\f[R] the username part, in this case
+\f[V]username\f[R].
+If no username was specified, the variable is set to the empty string.
+.IP \[bu] 2
+\f[V]MAPPED_URI_PASSWORD=\f[R] the password part, in this case
+\f[V]password\f[R].
+If no password was specified, the variable is set to the empty string.
+.IP \[bu] 2
+\f[V]MAPPED_URI_HOST=\f[R] the host part, in this case
+\f[V]example.org\f[R]. If no host was specified, the variable is set to the
+empty string.
+(An example of a URL with no host: \f[V]about:blank\f[R], here
+\f[V]blank\f[R] is the path name.)
+.IP \[bu] 2
+\f[V]MAPPED_URI_PORT=\f[R] the port, in this case \f[V]1234\f[R].
+If no port was specified, the variable is set to the empty string.
+(In this case, the CGI script is expected to use the default port for
+the scheme, if any.)
+.IP \[bu] 2
+\f[V]MAPPED_URI_PATH=\f[R] the path name, in this case
+\f[V]/path/name.html\f[R].
+If no path was specified, the variable is set to the empty string.
+Note: the path name is percent-encoded.
+.IP \[bu] 2
+\f[V]MAPPED_URI_QUERY=\f[R] the query string, in this case
+\f[V]example\f[R].
+Note that, unlike in JavaScript, no question mark is prepended to the
+string.
+The query string is percent-encoded as well.
+.PP
+Note: the fragment part is omitted intentionally.
+.SS Request body
+.PP
+If the request body is not empty, it is streamed into the program
+through the standard input.
+.PP
+Note that this may be either an application/x-www-form-urlencoded or a
+multipart/form-data request; \f[V]CONTENT_TYPE\f[R] stores information
+about the request type, and in case of a multipart request, the boundary
+as well.
+.SS Tools
+.PP
+Chawan provides certain helper binaries that may be useful for CGI
+scripts.
+These can be portably accessed by executing
+\f[V]\[dq]$CHA_LIBEXEC_DIR\[dq]/[program name]\f[R].
+.PP
+Currently, the following tools are available:
+.IP \[bu] 2
+\f[V]urldec\f[R]: percent-decode strings passed on standard input.
+.IP \[bu] 2
+\f[V]urlenc\f[R]: percent-encode strings passed on standard input,
+taking a percent-encode set as the first parameter.
+.SS Troubleshooting
+.PP
+Note that standard error is redirected to the browser console (by
+default, M-cM-c).
+This makes it easy to debug a misbehaving CGI script, but may also slow
+down the browser in case of excessive logging.
+If this is not the desired behavior, we recommend wrapping your script
+in a shell script that redirects stderr to /dev/null.
+.SS My script is returning a \[lq]no local-CGI directory configured\[rq] error message.
+.PP
+Currently, the default setting includes a cgi-bin directory at
+\f[V]$(which cha)/../libexec/chawan/cgi-bin\f[R], which usually looks
+something like \f[V]/usr/local/libexec/chawan/cgi-bin\f[R].
+You only get the above message if you intentionally set the cgi-dir
+setting to an empty array.
+(This will likely break everything else too, so do not.)
+.PP
+To change the default local-CGI directory, use the
+\f[V]external.cgi-dir\f[R] option.
+.PP
+e.g.\ you could add this to your config.toml:
+.IP
+.nf
+\f[C]
+[external]
+cgi-dir = [\[dq]\[ti]/cgi-bin\[dq], \[dq]${%CHA_LIBEXEC_DIR}/cgi-bin\[dq]]
+\f[R]
+.fi
+.PP
+and then put your script in \f[V]$HOME/cgi-bin\f[R].
+Note the second element in the array; if you don\[cq]t add it, the
+default CGI scripts (including http, https, etc\&...)
+will not work.
+.SS My script is returning a \[lq]Failed to execute script\[rq] error message.
+.PP
+This means the \f[V]execl\f[R] call to the script failed.
+Make sure that your CGI script\[cq]s executable bit is set, i.e.\ run
+\f[V]chmod +x /path/to/cgi/script\f[R].
+.SS My script is returning an \[lq]invalid CGI path\[rq] error message.
+.PP
+Make sure that you did not include leading slashes.
+Reminder: \f[V]cgi-bin://script-name\f[R] does not work, use
+\f[V]cgi-bin:script-name\f[R].
+.SS My script is returning a \[lq]CGI file not found\[rq] error message.
+.PP
+Double check that your CGI script is in the correct location.
+Also, make sure that you are not accidentally calling the script with an
+absolute path via \f[V]cgi-bin:/script-name\f[R] (instead of the correct
+\f[V]cgi-bin:script-name\f[R]).
+.PP
+It is also possible that \f[V]external.cgi-dir\f[R] is not really set to
+the directory your script is in.
+Note that by default, this depends on the binary\[cq]s path, so e.g.\ if
+your binary is in \f[V]\[ti]/src/chawan/target/release/bin/cha\f[R], but
+you put your CGI script in \f[V]/usr/local/libexec/chawan/cgi-bin\f[R],
+then it will not work.
+.SS My script is returning a \[lq]failed to set up CGI script\[rq] error message.
+.PP
+This means that either \f[V]pipe\f[R] or \f[V]fork\f[R] failed.
+Something strange is going on with your system; we recommend exorcism.
+(Maybe you are running out of memory?)
+.SS See also
+.PP
+\f[B]cha\f[R](1)
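
Illustration (not part of the commit above): a minimal Python sketch of a local CGI script that follows the header ordering described in the Headers section of the manual page. The function name build_response, the error message text, the bare "Status: 200" value, and the blank line written after a ConnectionError header are all assumptions made for the example, not behavior confirmed by the manual page.

```python
#!/usr/bin/env python3
"""Hypothetical local CGI script for Chawan; an illustrative sketch only."""
import os
import sys

# Public error code taken from the manual page's list ("2 invalid method").
INVALID_METHOD = 2


def build_response(environ):
    """Return the whole CGI response (headers, blank line, body) as a string.

    `environ` is any mapping with the CGI variables Chawan sets, e.g. os.environ.
    """
    method = environ.get("REQUEST_METHOD", "GET")
    if method != "GET":
        # ConnectionError must be the first reported header. Parameter 1 is the
        # error code; the remaining parameters form a custom error message.
        # Assumption: terminating the header block with a blank line is fine.
        return f"Cha-Control: ConnectionError {INVALID_METHOD} only GET is supported\n\n"

    path_info = environ.get("PATH_INFO", "")    # NOT percent-encoded
    query = environ.get("QUERY_STRING", "")     # percent-encoded
    lines = [
        # Special headers first: Connected, then Status.
        "Cha-Control: Connected",
        "Status: 200",
        # From here on, Status/Cha-Control would be treated as regular headers.
        "Cha-Control: ControlDone",
        "Content-Type: text/plain",
        "",  # blank line separates headers from the body
        f"PATH_INFO={path_info}",
        f"QUERY_STRING={query}",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    sys.stdout.write(build_response(os.environ))
```

If this were saved as an executable file named hello (a hypothetical name) in a directory listed in external.cgi-dir, such as the $HOME/cgi-bin example from the manual page, visiting cgi-bin:hello/abcd?defgh=ijkl should show the PATH_INFO and QUERY_STRING values that the manual page describes.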