path: root/src/utils
Commit message | Author | Age | Files | Lines
* twtstr: fix overflow check | bptato | 2024-05-21 | 1 | -2/+2
* sandbox: add sigreturn | bptato | 2024-05-21 | 1 | -0/+2
    seems to get called for signal handlers
* html: improve Request, derive Client from Window | bptato | 2024-05-20 | 1 | -8/+8
    * make Client an instance of Window (for less special casing)
    * misc work on Request & fetch
    * improve origin comparison (opaque origins of same URLs are now considered the same)
* forkserver: simplify fcLoadConfig | bptato | 2024-05-18 | 1 | -3/+1
* config: separate tmp dir for sockets, users | bptato | 2024-05-16 | 1 | -2/+0
    * add $LOGNAME to the tmp directory name, so that tmpdirs of separate users don't conflict
    * use separate directory for sockets, so that we do not have to give buffers access to all cached pages
* luwrap: use separate context (+ various cleanups) | bptato | 2024-05-10 | 4 | -191/+147
    Use a LUContext to only load required CharRanges once per pager.
    Also, add kana & hangul vi word break categories for convenience.
* luwrap: replace Nim unicode maps with libunicode | bptato | 2024-05-09 | 3 | -48/+108
    Instead of using the built-in (and outdated, and buggy) tables, we now use libunicode from QJS. This shaves some bytes off the executable, though far less than I had imagined it would.

    Also, a surprising effect of this change: because libunicode's tables aren't glitched out, kanji properly gets classified as alpha. I found this greatly annoying because `w' in Japanese text would now jump through whole sentences. As a band-aid solution I added an extra Han category, but I wish we had a more robust solution that could differentiate between *all* scripts.

    TODO: I suspect that separately loading the tables for every rune in breaksViWordCat is rather inefficient. Using some context object (at least per operation) would probably be beneficial.
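To illustrate why a separate Han category changes how `w' behaves, here is a minimal Python sketch (not the project's Nim code). The function names, the category labels, and the hard-coded code point ranges are all assumptions made for the example; the real implementation takes its character classes from libunicode.

```python
# Illustrative sketch only: names, labels and ranges below are assumptions,
# not the project's actual breaksViWordCat logic.

def vi_word_category(ch: str) -> str:
    """Classify one character into a coarse vi-style word category."""
    cp = ord(ch)
    if ch.isspace():
        return "space"
    if ch == "_":
        return "alpha"  # underscore counts as part of a word
    if 0x4E00 <= cp <= 0x9FFF:
        return "han"  # CJK Unified Ideographs (basic block only)
    if 0x3040 <= cp <= 0x30FF:
        return "kana"  # Hiragana and Katakana blocks
    if 0xAC00 <= cp <= 0xD7A3:
        return "hangul"  # precomposed Hangul syllables
    if ch.isalnum():
        return "alpha"
    return "symbol"


def next_word_start(text: str, pos: int) -> int:
    """Return the index where the next word starts, roughly like vi's w."""
    n = len(text)
    if pos >= n:
        return n
    i = pos
    cat = vi_word_category(text[pos])
    if cat != "space":
        # Skip the rest of the current run of same-category characters.
        while i < n and vi_word_category(text[i]) == cat:
            i += 1
    # Skip any whitespace separating the runs.
    while i < n and vi_word_category(text[i]) == "space":
        i += 1
    return i
```

With only an alpha category, a run such as 漢字のテスト would be a single word; with the extra categories the motion stops at each script change, which is closer to what a reader of Japanese text expects.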
* dom: simplify ButtonType | bptato | 2024-05-08 | 1 | -1/+1
* sandbox: allow getpid in seccomp network sandbox | bptato | 2024-04-27 | 1 | -0/+1
    openssl needs it
* Remove unnecessary unsigned casts | bptato | 2024-04-26 | 1 | -1/+1
    Unsigned operations and conversions to unsigned types always wrap/narrow without checks, so no need to manually mask/cast/etc. them.
* data: replace std/base64 with atob | bptato | 2024-04-25 | 1 | -0/+63
    std's version is known to be broken on versions we still support, and it makes no sense to use different decoders anyway.

    (This does introduce a bit of a dependency hell, because js/base64 depends on js/javascript which tries to bring in the entire QuickJS runtime. So we move that out into twtstr, and manually convert a Result[string, string] to DOMException in js/base64.)
* sandbox: remove unveil call | bptato | 2024-04-23 | 1 | -7/+4
    We no longer modify the file system inside the sandbox, so this permission is simply not needed.
* sandbox: allow syscalls for epoll Nim selectors | bptato | 2024-04-20 | 1 | -0/+4
    This fixes setTimeout/setInterval causing crashes.

    Note: timerfd_gettime is not actually used by Nim right now. However, it seems like a good idea to add it to the set in case a future Nim version needs it, as it does no harm.

    We still do not allow signalfd, because it would let rogue buffers override our SIGSYS handler. (Not sure if this really matters, but we don't need it for now anyway.)
* http: fix sandbox violation in readFromStdin | bptato | 2024-04-19 | 1 | -0/+2
    glibc apparently calls fstat from fread, and we didn't allow it in seccomp. So:
    * allow fstat in the sandbox; no reason not to, and it seems too big of a footgun to assume we never call fread
    * use read(2) in http; no need for buffered i/o here
* twtstr: remove pointless checks in parseIntImpl | bptato | 2024-04-19 | 1 | -5/+3
* url, twtstr: correct number parsing | bptato | 2024-04-18 | 2 | -33/+48
    * do not use std's parse*Int; they accept weird stuff that we do not want to accept in any case
    * fix bug in parseHost where a parseIpv4 failure would result in an empty host
    * do not use isDigit, isAlphaAscii
    * improve parse*IntImpl error handling
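The general idea of a strict parser that accepts nothing but ASCII digits can be sketched as follows. This is a hypothetical Python illustration, not the project's parse*IntImpl; the function name and the 32-bit limit are assumptions, and which "weird" inputs the commit had in mind is not stated in the log.

```python
from typing import Optional

ASCII_DIGITS = "0123456789"
UINT32_MAX = 0xFFFFFFFF


def parse_uint32_strict(s: str) -> Optional[int]:
    """Parse s as an unsigned 32-bit decimal integer, or return None.

    Unlike a lenient library parser, this rejects signs, whitespace,
    digit separators and non-ASCII digits instead of guessing.
    """
    if not s:
        return None  # empty input is an error, not zero
    value = 0
    for ch in s:
        if ch not in ASCII_DIGITS:
            return None
        value = value * 10 + (ord(ch) - ord("0"))
        if value > UINT32_MAX:
            return None  # report overflow instead of wrapping
    return value
```

For comparison, Python's built-in `int()` accepts inputs such as `"1_000"` and `" 7"`, which shows how a general-purpose parser can be more permissive than a URL or header parser wants.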
* sandbox: seccomp support on Linux | bptato | 2024-04-18 | 1 | -2/+118
    We use libseccomp, which is now a semi-mandatory dependency on Linux. (You can still build without it, but only if you pass a scary long flag to make.)

    For this to work I had to disable getTimezoneOffset, which would otherwise call localtime_r which in turn reads in some files from /usr/share/zoneinfo. To allow this we would have to give unrestricted openat(2) access to buffer processes, which is unacceptable. (Giving websites access to the local timezone is a fingerprinting vector so if this ever gets fixed then it should be an opt-in config setting.)

    This patch also includes misc fixes to buffer cloning, and fixes the LIBEXECDIR override in the makefile so that it is actually useful.
* strwidth: return alpha for underscore in vi words | bptato | 2024-04-17 | 1 | -1/+1
* Update code style | bptato | 2024-04-17 | 4 | -50/+68
    * separate params with ; (semicolon) instead of , (comma)
    * reduce screaming snake case use
    * wrap long lines
* utils: polyfill addr/unsafeAddr distinction in Nim 2+ | bptato | 2024-04-14 | 1 | -0/+18
    I wish they didn't change this. unsafeAddr may be a confusing name, but it's more powerful than addr. Merging them violates the principle of least power.

    e.g. say I get n thru a param, and shadow it

        proc x(n: int) =
          var n = n + 1

    a screen or two later I call

        mutates_variable_in_c(addr n)

    then later I no longer need to add 1, so I remove the var line. In Nim 1.6 the compiler refuses to compile, so I can instantly find the bug. In 2.0 it does... whatever?? Maybe for an int it "works", for an object it likely doesn't. Certainly not something I'd enjoy debugging.
* twtstr: remove isAscii, simplify onlyWhitespace | bptato | 2024-04-10 | 1 | -10/+1
* twtstr: remove pointless lookup tables | bptato | 2024-04-10 | 1 | -18/+10
    it's a waste of space; we don't use these *that* much.
* remove dead code, fix openArray casing | bptato | 2024-04-08 | 1 | -4/+1
* sandbox: add OpenBSD pledge/unveil support | bptato | 2024-04-03 | 1 | -3/+26
    pledge is a bit more fine-grained than Capsicum's capability mode, so the buffer & http ("network") sandboxes are now split up into two parts.

    I applied the same hack as in FreeBSD for overriding the buffer selector kqueue, because a) I didn't want to request the sysctl promise, b) I'm not sure if it would even work, and c) if it breaks on OpenBSD, then it's broken on FreeBSD too, so there's a greater chance of discovering the bug.
* ansi2html: support passing titles | bptato | 2024-03-29 | 1 | -17/+21
    Use content type attributes so e.g. git.cgi can set the title even with a text/x-ansi content type.

    (This commit also fixes some bugs in content type attribute handling.)
* Add capsicum support | bptato | 2024-03-28 | 1 | -0/+13
    It's the sandboxing system of FreeBSD. Quite pleasant to work with. (Just trying to figure out the basics with this one before tackling the abomination that is seccomp.)

    Indeed, the only non-trivial part was getting newSelector to work with Capsicum. Long story short it doesn't, so we use an ugly pointer cast + assignment. But even that is stdlib's "fault", not Capsicum's.

    This also gets rid of that ugly SocketPath global.
* config: parse mime.types/mailcap/urimethodmap inside parseConfig | bptato | 2024-03-18 | 1 | -1/+0
    Better (and simpler) than storing them all over the place.

    extra: change lmDownload text to match w3m
* forkserver: set process titles for processes | bptato | 2024-03-17 | 1 | -0/+38
    this is unfortunately truncated on Linux, but I don't care enough to hack around this
* container: fall back to text/plain instead of application/octet-stream | bptato | 2024-03-16 | 1 | -2/+3
    This has its own problems, but application/octet-stream has the horrible consequence that opening any local file with an unrecognized type automatically quits the browser. (FWIW, w3m also falls back to text/plain, so it's not such an unreasonable default.)

    The proper solution would be to a) fix the bug that makes the browser auto-quit and b) show a "what to do" prompt for unrecognized file types (and allow users to override it, preferably on a per-protocol basis.)
* twtstr: fix deleteChars, do not remove space in replaceControls | bptato | 2024-03-14 | 1 | -6/+6
* twtstr: simplify control char procs | bptato | 2024-03-13 | 1 | -24/+7
* loader: remove applyHeaders | bptato | 2024-03-12 | 2 | -11/+29
    Better compute the values we need on-demand at the call sites; this way, we can pass through content type attributes to mailcap too.

    (Also, remove a bug where applyResponse was called twice.)
* loader: rework process model | bptato | 2024-03-11 | 1 | -62/+38
    Originally we had several loader processes so that the loader did not need asynchronicity for loading several buffers at once. Since then, the scope of what loader does has been reduced significantly, and with that loader has become mostly asynchronous.

    This patch finishes the above work as follows:

    * We only fork a single loader process for the browser. It is a waste of resources to do otherwise, and would have made future work on a download manager very difficult.
    * loader becomes (almost) fully async. Now the only sync parts are a) processing commands and b) waiting for clients to consume responses. b) is a bit more problematic than a), but should not cause problems unless some other horrible bug exists in a client. (TODO: make it fully async.)
      This gives us a noticeable improvement in CSS loading speed, since all resources can now be queried at once (even before the previous ones are connected).
    * Buffers now only get processes when the *connection* is finished. So headers, status code, etc. are handled by the client, and the buffer is forked when the loader starts streaming the response body.
      As a result, mailcap entries can simply dup2 the first UNIX domain socket connection as their stdin. This allows us to remove the ugly (and slow) `canredir' hack, which required us to send file handles on a tour across the entire codebase.
    * The "cache" has been reworked somewhat:
      - Since canredir is gone, buffer-level requests usually start in a suspended state, and are explicitly resumed only after the client could decide whether it wants to cache the response.
      - Instead of a flag on Request and the URL as the cache key, we now use a global counter and the special `cache:' scheme.
    * misc fixes: referer_from is now actually respected by buffers (not just the pager), load info display should work slightly better, etc.
* strwidth, renderdocument: small refactoring | bptato | 2024-03-03 | 2 | -45/+26
    * put attrs pointer in state
    * simplify width()
    * use unsigned int as ptint to avoid UB
* misc refactorings | bptato | 2024-02-27 | 1 | -13/+1
    * rename buffer enums
    * fix isAscii for char 0x80
    * remove dead code from URL
* regex: re-work compileSearchRegex | bptato | 2024-02-17 | 1 | -0/+16
    I've gotten tired of not being able to search for forward slashes.

    Now it works like in vim, and you can also set default ignore case in the config.
* widthconv: bugfixes | bptato | 2024-02-11 | 1 | -25/+11
    * fix failed assertion on non-ha-column half-width chars followed by handakuten with text-transform: full-width
    * fix dquot full-width conversion
    * fix lone half-width han/dakuten conversion
* twtstr: misc refactorings | bptato | 2024-02-09 | 2 | -141/+150
    * move out half width <-> full width converters
    * snake_case -> camelCase
    * improve toScreamingSnakeCase slicing
* mimetypes: simplify parseMimeTypes | bptato | 2024-01-27 | 1 | -0/+6
    * use functions like until
    * do not call atEnd for every line, use boolean readLine instead
* Re-design word handling, add e, E, W, B, etc. | bptato | 2024-01-19 | 1 | -11/+19
    * Add functions for moving to the beginning/end of words (vi `b', `e').
    * As it turns out, there are many possible interpretations of what a word is. Now we have a function for each reasonable interpretation, and the default settings match those of vi (and w3m in w3m.toml). (Exception: it's still broken on line boundaries... TODO)
    * Remove `bounds` from lineedit, it was horrible API design and mostly useless. In the future, an API similar to what pager now has could be added.
    * Update docs, and fix some spacing issues with symbols in the tables.
* utils/map: remove unused special case | bptato | 2024-01-17 | 1 | -3/+0
    Even if it were used, it's UB...
* Add urlenc, urldec; fix a URL encoding bug; improve trans.cgi | bptato | 2024-01-08 | 1 | -12/+18
    * Fix incorrect internal definition of the fragment percent-encode set
    * urlenc, urldec: these are simple utility programs mainly for use with shell local CGI scripts. (Sadly the printf + xargs solution is not portable.)
    * Pass libexec directory as an env var to local CGI scripts
    * Update trans.cgi to use urldec and add an example for combining it with selections
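For reference, the fragment percent-encode set in the WHATWG URL Standard is the C0 control percent-encode set (C0 controls and everything above U+007E) plus space, `"`, `<`, `>` and the backtick. The Python sketch below shows that definition in use; the function names are made up for the example, and the log does not say what the earlier incorrect definition looked like.

```python
# Extra code points that the fragment set adds on top of the
# C0 control percent-encode set, per the WHATWG URL Standard.
FRAGMENT_EXTRA = frozenset(' "<>`')


def in_fragment_percent_encode_set(byte: int) -> bool:
    """True if this UTF-8 byte must be percent-encoded in a fragment."""
    return byte <= 0x1F or byte > 0x7E or chr(byte) in FRAGMENT_EXTRA


def percent_encode_fragment(s: str) -> str:
    """UTF-8 percent-encode s using the fragment percent-encode set."""
    out = []
    for byte in s.encode("utf-8"):
        if in_fragment_percent_encode_set(byte):
            out.append(f"%{byte:02X}")
        else:
            out.append(chr(byte))
    return "".join(out)
```

Note that `#` and `?` are not in this set, so they pass through unchanged inside a fragment, while a space becomes `%20`.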
* Use std/* imports everywhere | bptato | 2024-01-07 | 3 | -15/+15
* charwidth: use pre-generated map file | bptato | 2024-01-04 | 2 | -136/+42
    Also for reducing compilation time.
* Compile with styleCheck:usages | bptato | 2023-12-28 | 1 | -1/+1
    much better
* strwidth & url fixes | bptato | 2023-12-16 | 2 | -4/+4
    * actually search Combining for isCombining
    * fix searchInMap
    * fix cmpRange of url
* charcategory: move out isDigitAscii | bptato | 2023-12-14 | 3 | -6/+4
    so we do not have to import unicode
* Various fixes | bptato | 2023-12-13 | 2 | -37/+46
    * Makefile: fix parallel build, add new binaries to install target
    * twtstr: split out libunicode-related stuff to luwrap
    * config: quote default gopher2html URL env var for unquote
    * adapter/: get rid of types/url dependency, use CURL url in all cases
* break up twtstr somewhat | bptato | 2023-12-13 | 4 | -372/+274
    Avoid computing e.g. charwidth data for http which does not need it at all.
* twtstr: import functions from gopher2html | bptato | 2023-12-12 | 1 | -15/+10