Commit message | Author | Age | Files | Lines | ||
---|---|---|---|---|---|---|
... | ||||||
| * | htmlparser: take Option[Handle] for `before' in insertText | bptato | 2023-12-04 | 2 | -4/+5 | |
| | had to be fixed too | |||||
| * | Update readme | bptato | 2023-12-03 | 1 | -7/+11 | |
| | | ||||||
| * | htmlparser: take Option[Handle] for `before' in insertBefore | bptato | 2023-12-03 | 2 | -14/+15 | |
| | Passing `nil' there was an unfortunate mistake that requires an API breakage to fix. | |||||
| * | Version 0.13.0 | bptato | 2023-12-03 | 2 | -2/+2 | |
| | | ||||||
| * | tests/tree: add tests 4-8 | bptato | 2023-12-03 | 1 | -17/+73 | |
| | | ||||||
| * | Various fixes & improvements in all modules | bptato | 2023-12-03 | 3 | -58/+136 | |
| | minidom: add fragment parsing algorithms; document parseHTML. htmlparser: fix table body/in caption being mixed up in resetInsertionMode; fix frameset-ok not being initialized to true; fix opts.ctx not being used; naively parse tags in `match' instead of using the tokenizer. htmltokenizer: remove special-cased compile-time tokenizer mode; change sbuf to an array (from a seq), and store length in a separate variable instead of constantly resizing it; do not check for eof in emit_current (it never occurs). | |||||
| * | entity: use pre-generated file | bptato | 2023-11-20 | 4 | -13/+1087 | |
| | Nim's JSON parser is slow, in nimvm even more so. Use a pre-generated entity_gen.nim file instead. | |||||
| * | tests/tree: add tests2, tests3 | bptato | 2023-11-19 | 1 | -3/+9 | |
| | | ||||||
| * | minidom: fix insertText if before is first in parent | bptato | 2023-11-19 | 1 | -1/+5 | |
| | | ||||||
| * | htmlparser, tests: make tests1.dat run without errors | bptato | 2023-11-19 | 3 | -99/+147 | |
| | Fix several bugs in adoptionAgencyAlgorithm, and factor out several "find index" operations. Fix some frameset, table col related bugs. minidom: simplify moveChildren, assert on adding children with an existing parent. | |||||
| * | tests/tree: fix comment handling, log data | bptato | 2023-11-18 | 1 | -26/+16 | |
| | | ||||||
| * | htmltokenizer: format | bptato | 2023-11-18 | 1 | -2/+2 | |
| | | ||||||
| * | htmlparser: adoption agency algorithm fixes | bptato | 2023-11-18 | 1 | -13/+20 | |
| | Fix misunderstanding: the stack grows *downwards*. Add some comments. | |||||
| * | tests: incomplete support for tree builder tests | bptato | 2023-11-18 | 1 | -0/+275 | |
| | | ||||||
| * | Update chakasu | bptato | 2023-11-18 | 1 | -1/+1 | |
| | | ||||||
| * | tokenizer: move flush_chars into a proc | bptato | 2023-10-27 | 1 | -28/+28 | |
| | | ||||||
| * | Add null character token type | bptato | 2023-10-27 | 3 | -47/+42 | |
| | So that we do not have to replace it in the parser. | |||||
| * | Version 0.12.0 | bptato | 2023-10-23 | 2 | -3/+3 | |
| | | ||||||
| * | Add pushInTemplate for fragment parsing | bptato | 2023-10-23 | 1 | -0/+5 | |
| | | ||||||
| * | Reduce nil usage for Handles | bptato | 2023-10-23 | 1 | -9/+13 | |
| | Still not nil-free, because insertBefore & insertText need nil. | |||||
| * | htmlparser: add openElementsInit, formInit to opts | bptato | 2023-10-23 | 1 | -1/+12 | |
| | Makes it possible to set an initial value for openElements and the form pointer, as required by the HTML fragment parsing algorithm. | |||||
| * | parser: add initial tokenizer state option; tokenizer: allow any kind of stream | bptato | 2023-10-22 | 4 | -27/+61 | |
| | Use this to enable the unicodeCharsProblematic test, by importing runestream. | |||||
| * | update chakasu | bptato | 2023-10-22 | 1 | -1/+1 | |
| | | ||||||
| * | Version 0.11.2 | bptato | 2023-09-30 | 2 | -2/+2 | |
| | | ||||||
| * | Fix potential OOB seq access in peek_char | bptato | 2023-09-30 | 1 | -1/+2 | |
| | Call consume() so that the buffer is filled if we are not at EOF yet (through checkBufLen). | |||||
| * | tolower -> toLowerAscii | bptato | 2023-09-24 | 1 | -1/+1 | |
| | | ||||||
| * | twtstr: remove unused functions | bptato | 2023-09-24 | 1 | -307/+0 | |
| | | ||||||
| * | Version 0.11.1 | bptato | 2023-09-24 | 2 | -2/+2 | |
| | | ||||||
| * | remove unused functions | bptato | 2023-09-24 | 1 | -8/+1 | |
| | | ||||||
| * | update chakasu | bptato | 2023-09-24 | 1 | -1/+1 | |
| | | ||||||
| * | Version 0.11.0 | bptato | 2023-09-19 | 2 | -3/+3 | |
| | | ||||||
| * | tags: clean up | bptato | 2023-09-19 | 1 | -72/+1 | |
| | InputType and ButtonType have nothing to do with the parser. Neither do many of the categories included in the module; these have been removed too. (Many of these are remnants of the previous HTML parser.) | |||||
| * | Version 0.10.1 | bptato | 2023-09-14 | 2 | -2/+2 | |
| | | ||||||
| * | htmlparser: add whitespace handling to text & in table states | bptato | 2023-09-14 | 1 | -2/+2 | |
| | a rather problematic omission | |||||
| * | Version 0.10.0 | bptato | 2023-09-14 | 2 | -3/+3 | |
| | | ||||||
| * | htmlparser: check for moveChildren not being nil | bptato | 2023-09-14 | 1 | -0/+1 | |
| | | ||||||
| * | Update chakasu | bptato | 2023-09-14 | 2 | -2/+3 | |
| | | ||||||
| * | tests: disable unicodeCharsProblematic | bptato | 2023-09-03 | 1 | -2/+10 | |
| | This really just won't work with what we have right now. | |||||
| * | tokenizer: fix more tests | bptato | 2023-09-03 | 3 | -50/+83 | |
| | Now all tokenizer tests work, except for unicodeCharsProblematic. | |||||
| * | tokenizer: make domjs tests work | bptato | 2023-09-02 | 2 | -16/+64 | |
| | add escaped inputs/outputs, fix some tokenizer bugs | |||||
| * | tokenizer: fix contentModelFlags tests | bptato | 2023-09-02 | 2 | -6/+8 | |
| | Fix some bugs with EOF handling, also some bugs in the test code. | |||||
| * | tokenizer: expose laststart | bptato | 2023-09-02 | 2 | -7/+13 | |
| | | ||||||
| * | Add html5lib-tests | bptato | 2023-09-02 | 3 | -0/+177 | |
| | For now, tokenizer tests only. | |||||
| * | tokenizer: emit strings instead of chars | bptato | 2023-09-02 | 2 | -109/+137 | |
| | Makes more sense overall. As an optimization, emit separate whitespace tokens so that we do not have to check for string contents. | |||||
| * | htmlparser: fix dependency on nodeType | bptato | 2023-09-02 | 1 | -1/+1 | |
| | | ||||||
| * | minidom: fix warning | bptato | 2023-09-02 | 1 | -1/+0 | |
| | | ||||||
| * | Add moveChildren, remove dependency on childList | bptato | 2023-09-02 | 2 | -14/+20 | |
| | moveChildren: to move child nodes in the adoption agency algorithm. We accidentally depended on childList existing in the DOM implementation; this has been fixed by the above addition. | |||||
| * | Version 0.9.3 | bptato | 2023-08-15 | 2 | -2/+2 | |
| | | ||||||
| * | Fix assertion on unexpected characters | bptato | 2023-08-15 | 2 | -0/+3 | |
| | In some cases, an unexpected character token could call parseErrorByTokenType... | |||||
| * | Add restart callback, implement setCharacterSet | bptato | 2023-08-15 | 2 | -7/+38 | |
| | restart is mainly needed for resetting the document node. setCharacterSet now works (albeit somewhat differently than previously specified). |