| Commit message (Collapse) | Author | Age | Files | Lines |
|
- add inline functions to test and convert surrogates:
  is_surrogate(c), is_hi_surrogate(c), is_lo_surrogate(c),
  get_hi_surrogate(c), get_lo_surrogate(c), from_surrogate(hi, lo)
- use named constants for bytecode (BC) header offsets and lengths in libregexp.c
- remove strict aliasing violations in `lre_exec_backtrack()`
- pass all context variables to XXX_CHAR macros in `lre_exec_backtrack()`
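
The surrogate helpers named above could look roughly like the following. This is a minimal sketch: only the function names come from the commit message, while the parameter types, return types, and bit arithmetic are assumptions based on the standard UTF-16 encoding rules, not the committed code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Sketch only: signatures are assumed, not copied from the commit. */

/* True for any code unit in the surrogate range 0xD800..0xDFFF. */
static inline bool is_surrogate(uint32_t c)
{
    return (c >> 11) == (0xD800 >> 11);
}

/* True for a high (leading) surrogate, 0xD800..0xDBFF. */
static inline bool is_hi_surrogate(uint32_t c)
{
    return (c >> 10) == (0xD800 >> 10);
}

/* True for a low (trailing) surrogate, 0xDC00..0xDFFF. */
static inline bool is_lo_surrogate(uint32_t c)
{
    return (c >> 10) == (0xDC00 >> 10);
}

/* High surrogate for a code point c in 0x10000..0x10FFFF. */
static inline uint32_t get_hi_surrogate(uint32_t c)
{
    return (c >> 10) - (0x10000 >> 10) + 0xD800;
}

/* Low surrogate for a code point c in 0x10000..0x10FFFF. */
static inline uint32_t get_lo_surrogate(uint32_t c)
{
    return (c & 0x3FF) | 0xDC00;
}

/* Combine a valid high/low surrogate pair into one code point. */
static inline uint32_t from_surrogate(uint32_t hi, uint32_t lo)
{
    return 0x10000 + 0x400 * (hi - 0xD800) + (lo - 0xDC00);
}
```

For example, U+1F600 splits into the pair 0xD83D, 0xDE00, and from_surrogate(0xD83D, 0xDE00) gives back 0x1F600.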
|
- rename is_utf16 structure member to is_unicode
- rename flag LRE_FLAG_UTF16 as LRE_FLAG_UNICODE
|
it's 0, not 1 :(
|
The previous approach to adding UTF-8 support to libregexp was broken. This
time, we use a separate flag (cbuf_len == 3) to indicate UTF-8 input.
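
To illustrate what reading UTF-8 input directly involves, here is a minimal sketch of decoding one code point from a byte buffer. The helper name `utf8_next` and its error policy (return U+FFFD and skip one byte on malformed input) are hypothetical and chosen for illustration; they are not taken from libregexp.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not part of libregexp: decode one code point
   from UTF-8 starting at *pp and advance *pp past it.
   Requires *pp < end. Malformed input yields 0xFFFD and consumes
   exactly one byte. */
static uint32_t utf8_next(const uint8_t **pp, const uint8_t *end)
{
    const uint8_t *p = *pp;
    uint32_t c = *p++;
    uint32_t min;   /* smallest code point valid for this length */
    int n;          /* number of continuation bytes expected */

    if (c < 0x80) {                 /* ASCII fast path */
        *pp = p;
        return c;
    }
    if (c >= 0xC2 && c <= 0xDF) {
        n = 1; c &= 0x1F; min = 0x80;
    } else if (c >= 0xE0 && c <= 0xEF) {
        n = 2; c &= 0x0F; min = 0x800;
    } else if (c >= 0xF0 && c <= 0xF4) {
        n = 3; c &= 0x07; min = 0x10000;
    } else {                        /* stray continuation or invalid lead */
        *pp = p;
        return 0xFFFD;
    }
    if (end - p < n) {              /* truncated sequence */
        *pp = p;
        return 0xFFFD;
    }
    for (int i = 0; i < n; i++) {
        if ((p[i] & 0xC0) != 0x80) {
            *pp = p;
            return 0xFFFD;
        }
        c = (c << 6) | (p[i] & 0x3F);
    }
    /* Reject overlong forms, surrogates, and values past U+10FFFF. */
    if (c < min || c > 0x10FFFF || (c >= 0xD800 && c <= 0xDFFF)) {
        *pp = p;
        return 0xFFFD;
    }
    *pp = p + n;
    return c;
}
```

A matcher that walks its input with a helper of this kind can compare code points without first building a UTF-16 copy of the text.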
|
This allows us to greatly simplify exec(Regex). In particular, we
no longer have to convert every line containing non-ASCII characters
into UTF-16, which until now was a significant inefficiency in regex
search.
|
Taken from txiki.js, so it includes zamofex's top-level await patch.
|