path: root/src/utils/twtstr.nim
Commit message | Author | Age | Files | Lines
* utils: add twtuni | bptato | 2024-09-08 | 1 | -11/+9
    std/unicode has the following issues:
    * Rune is an int32, which implies overflow checking. Also, it is distinct, so you have to convert it manually to do arithmetic.
    * QJS libunicode and Chagashi work with uint32; interfacing with these required pointless type conversions.
    * fastRuneAt is a template, meaning it's pasted into every call site. Also, it decodes to UCS-4, so it generates two branches that aren't even used. Overall this led to quite some code bloat.
    * fastRuneAt and lastRune have frustratingly different interfaces. Writing code to handle both cases is error prone.
    * On older Nim versions which we still support, std/unicode takes strings, not openArray[char]'s.
    Replace it with "twtuni", which includes some improved versions of the few procedures from std/unicode that we actually use.
* md2html: code, pre, inline fixes | bptato | 2024-09-07 | 1 | -2/+2
* twtstr: type erase binarySearch instantiation | bptato | 2024-09-06 | 1 | -28/+38
    Do it like parseEnumNoCase0, so we no longer instantiate a gazillion different binary searches for the same type.
    While we're at it, make matchNameProduction's searchInMap use uint32 too.
* xhr: progress | bptato | 2024-08-13 | 1 | -0/+2
    * fix header case sensitivity issues -> probably still wrong as it discards the original casing. better than nothing, anyway
    * fix fulfill on generic promises
    * support standard open() async parameter weirdness
    * refactor loader response body reading (so bodyRead is no longer mandatory)
    * actually read response body
    still missing: response body getters
* dom: fix crash on wrong image content type | bptato | 2024-08-11 | 1 | -0/+3
    + slightly optimize getContentType
* twtstr: don't cast in parseEnum | bptato | 2024-08-09 | 1 | -2/+2
    Nim 1.6 does not like it.
* cssvalues, twtstr, mediaquery: refactor & fixes | bptato | 2024-08-02 | 1 | -28/+25
    * cssvalues, twtstr: unify enum parsing code paths, parse enums by bisearch instead of hash tables
    * mediaquery: refactor (long overdue), fix range comparison syntax parsing, make ident comparisons case-insensitive (as they should be)
* twtstr: fix startsWithIgnoreCase | bptato | 2024-07-29 | 1 | -2/+2
* buffer, pager, config: add meta-refresh + misc fixes | bptato | 2024-07-28 | 1 | -18/+12
    * buffer, pager, config: add meta-refresh value, which makes it possible to follow http-equiv=refresh META tags.
    * config: clean up redundant format mode parser
    * timeout: accept varargs for params to pass on to functions
    * pager: add "options" dict to JS gotoURL
    * twtstr: remove redundant startsWithNoCase
* url: misc fixes & improvements | bptato | 2024-07-24 | 1 | -25/+13
    * fix various parsing bugs
    * rewrite state machine
    * other small optimizations
* html: event cleanup, XHR progress | bptato | 2024-07-18 | 1 | -0/+27
* img, loader: separate out png codec into cgi, misc improvements | bptato | 2024-06-20 | 1 | -0/+3
    * multi-processed and sandboxed PNG decoding & encoding (through local CGI)
    * improved request body passing (including support for output id as response body)
    * simplified & faster blob()/text() - now every request starts suspended, and OngoingData.buf has been replaced with loader's buffering capability
    * image caching: we no longer pull bitmaps from the container after every single getLines call
    Next steps: replace our bespoke PNG decoder with something more usable, add other decoders, and make them stream.
* twtstr: fix overflow check | bptato | 2024-05-21 | 1 | -2/+2
* html: improve Request, derive Client from Window | bptato | 2024-05-20 | 1 | -8/+8
    * make Client an instance of Window (for less special casing)
    * misc work on Request & fetch
    * improve origin comparison (opaque origins of same URLs are now considered the same)
* luwrap: use separate context (+ various cleanups) | bptato | 2024-05-10 | 1 | -137/+34
    Use a LUContext to only load required CharRanges once per pager.
    Also, add kana & hangul vi word break categories for convenience.
* dom: simplify ButtonType | bptato | 2024-05-08 | 1 | -1/+1
* Remove unnecessary unsigned casts | bptato | 2024-04-26 | 1 | -1/+1
    Unsigned operations and conversions to unsigned types always wrap/narrow without checks, so no need to manually mask/cast/etc. them.
* data: replace std/base64 with atob | bptato | 2024-04-25 | 1 | -0/+63
    std's version is known to be broken on versions we still support, and it makes no sense to use different decoders anyway.
    (This does introduce a bit of a dependency hell, because js/base64 depends on js/javascript which tries to bring in the entire QuickJS runtime. So we move that out into twtstr, and manually convert a Result[string, string] to DOMException in js/base64.)
* twtstr: remove pointless checks in parseIntImpl | bptato | 2024-04-19 | 1 | -5/+3
* url, twtstr: correct number parsing | bptato | 2024-04-18 | 1 | -33/+47
    * do not use std's parse*Int; they accept weird stuff that we do not want to accept in any case
    * fix bug in parseHost where a parseIpv4 failure would result in an empty host
    * do not use isDigit, isAlphaAscii
    * improve parse*IntImpl error handling
* Update code style | bptato | 2024-04-17 | 1 | -20/+38
    * separate params with ; (semicolon) instead of , (comma)
    * reduce screaming snake case use
    * wrap long lines
* twtstr: remove isAscii, simplify onlyWhitespace | bptato | 2024-04-10 | 1 | -10/+1
* twtstr: remove pointless lookup tables | bptato | 2024-04-10 | 1 | -18/+10
    it's a waste of space; we don't use these *that* much.
* remove dead code, fix openArray casing | bptato | 2024-04-08 | 1 | -4/+1
* ansi2html: support passing titles | bptato | 2024-03-29 | 1 | -17/+21
    Use content type attributes so e.g. git.cgi can set the title even with a text/x-ansi content type.
    (This commit also fixes some bugs in content type attribute handling.)
* twtstr: fix deleteChars, do not remove space in replaceControls | bptato | 2024-03-14 | 1 | -6/+6
* twtstr: simplify control char procs | bptato | 2024-03-13 | 1 | -24/+7
* loader: remove applyHeaders | bptato | 2024-03-12 | 1 | -0/+21
    Better compute the values we need on-demand at the call sites; this way, we can pass through content type attributes to mailcap too.
    (Also, remove a bug where applyResponse was called twice.)
* loader: rework process model | bptato | 2024-03-11 | 1 | -62/+38
    Originally we had several loader processes so that the loader did not need asynchronicity for loading several buffers at once. Since then, the scope of what loader does has been reduced significantly, and with that loader has become mostly asynchronous.
    This patch finishes the above work as follows:
    * We only fork a single loader process for the browser. It is a waste of resources to do otherwise, and would have made future work on a download manager very difficult.
    * loader becomes (almost) fully async. Now the only sync part is a) processing commands and b) waiting for clients to consume responses. b) is a bit more problematic than a), but should not cause problems unless some other horrible bug exists in a client. (TODO: make it fully async.)
      This gives us a noticeable improvement in CSS loading speed, since all resources can now be queried at once (even before the previous ones are connected).
    * Buffers now only get processes when the *connection* is finished. So headers, status code, etc. are handled by the client, and the buffer is forked when the loader starts streaming the response body.
      As a result, mailcap entries can simply dup2 the first UNIX domain socket connection as their stdin. This allows us to remove the ugly (and slow) `canredir' hack, which required us to send file handles on a tour across the entire codebase.
    * The "cache" has been reworked somewhat:
      - Since canredir is gone, buffer-level requests usually start in a suspended state, and are explicitly resumed only after the client could decide whether it wants to cache the response.
      - Instead of a flag on Request and the URL as the cache key, we now use a global counter and the special `cache:' scheme.
    * misc fixes: referer_from is now actually respected by buffers (not just the pager), load info display should work slightly better, etc.
* misc refactorings | bptato | 2024-02-27 | 1 | -13/+1
    * rename buffer enums
    * fix isAscii for char 0x80
    * remove dead code from URL
* regex: re-work compileSearchRegex | bptato | 2024-02-17 | 1 | -0/+16
    I've gotten tired of not being able to search for forward slashes.
    Now it works like in vim, and you can also set default ignore case in the config.
* twtstr: misc refactorings | bptato | 2024-02-09 | 1 | -141/+24
    * move out half width <-> full width converters
    * snake_case -> camelCase
    * improve toScreamingSnakeCase slicing
* mimetypes: simplify parseMimeTypes | bptato | 2024-01-27 | 1 | -0/+6
    * use functions like until
    * do not call atEnd for every line, use boolean readLine instead
* Add urlenc, urldec; fix a URL encoding bug; improve trans.cgi | bptato | 2024-01-08 | 1 | -12/+18
    * Fix incorrect internal definition of the fragment percent-encode set
    * urlenc, urldec: these are simple utility programs mainly for use with shell local CGI scripts. (Sadly the printf + xargs solution is not portable.)
    * Pass libexec directory as an env var to local CGI scripts
    * Update trans.cgi to use urldec and add an example for combining it with selections
* Use std/* imports everywhere | bptato | 2024-01-07 | 1 | -11/+11
* charcategory: move out isDigitAscii | bptato | 2023-12-14 | 1 | -1/+1
    so we do not have to import unicode
* Various fixes | bptato | 2023-12-13 | 1 | -37/+0
    * Makefile: fix parallel build, add new binaries to install target
    * twtstr: split out libunicode-related stuff to luwrap
    * config: quote default gopher2html URL env var for unquote
    * adapter/: get rid of types/url dependency, use CURL url in all cases
* break up twtstr somewhat | bptato | 2023-12-13 | 1 | -369/+9
    Avoid computing e.g. charwidth data for http which does not need it at all.
* twtstr: import functions from gopher2html | bptato | 2023-12-12 | 1 | -15/+10
* css: add case-insensitive matching | bptato | 2023-12-11 | 1 | -0/+14
    Also case-sensitive, but for now that is the same as normal matching...
* css: add text-transform | bptato | 2023-12-11 | 1 | -3/+113
    Probably not fully correct, but it's a good start.
    Includes proprietary extension -cha-half-width, which converts full-width characters to half-width ones.
* config: better path handling; fix array parsing bug | bptato | 2023-12-10 | 1 | -1/+1
    * Paths are now parsed through a unified code path with some useful additions like environment variable substitution.
    * Fix a bug in parseConfigValue where strings would be appended to existing arrays (and not override them).
    * Fix beforeLast calling afterLast for some reason.
    * Add a default CGI directory.
* html: add HTMLElement.dataset (+ some twtstr cleanup) | bptato | 2023-12-01 | 1 | -11/+19
* twtstr: simplify expandPath | bptato | 2023-11-29 | 1 | -19/+12
* twtstr: remove tolower, isWhitespace | bptato | 2023-11-20 | 1 | -18/+4
    * tolower: strutils toLowerAscii is good enough for the cases where we need it. Also, it's easy to confuse with unicode toLower and vice versa.
    * isWhitespace: in AsciiWhitespace is more idiomatic. Also has a naming collision with unicode toLower.
* Add -C option | bptato | 2023-10-27 | 1 | -0/+7
* twtstr: optimize width | bptato | 2023-10-01 | 1 | -34/+17
* Add urimethodmap support | bptato | 2023-09-30 | 1 | -4/+2
    yay
* remove sequtils dependency | bptato | 2023-09-24 | 1 | -3/+5
* ftp: encode paths, escape displayed strings | bptato | 2023-09-19 | 1 | -0/+17
    avoid e.g. # being interpreted as a fragment