Commit message | Author | Age | Files | Lines
...
| * htmlparser: take Option[Handle] for `before' in insertText | bptato | 2023-12-04 | 2 | -4/+5
| |   had to be fixed too
| * Update readme | bptato | 2023-12-03 | 1 | -7/+11
| |
| * htmlparser: take Option[Handle] for `before' in insertBefore | bptato | 2023-12-03 | 2 | -14/+15
| |   Passing `nil' there was an unfortunate mistake that requires an API breakage to fix.
| * Version 0.13.0 | bptato | 2023-12-03 | 2 | -2/+2
| |
| * tests/tree: add tests 4-8 | bptato | 2023-12-03 | 1 | -17/+73
| |
| * Various fixes & improvements in all modules | bptato | 2023-12-03 | 3 | -58/+136
| |   minidom:
| |   * add fragment parsing algorithms
| |   * document parseHTML
| |
| |   htmlparser:
| |   * fix table body/in caption being mixed up in resetInsertionMode
| |   * fix frameset-ok not being initialized to true
| |   * fix opts.ctx not being used
| |   * naively parse tags in `match' instead of using the tokenizer
| |
| |   htmltokenizer:
| |   * remove special-cased compile-time tokenizer mode
| |   * change sbuf to an array (from a seq), and store length in a separate variable instead of constantly resizing it
| |   * do not check for eof in emit_current (it never occurs)
| * entity: use pre-generated file | bptato | 2023-11-20 | 4 | -13/+1087
| |   Nim's JSON parser is slow, in nimvm even more so. Use a pre-generated entity_gen.nim file instead.
| * tests/tree: add tests2, tests3 | bptato | 2023-11-19 | 1 | -3/+9
| |
| * minidom: fix insertText if before is first in parent | bptato | 2023-11-19 | 1 | -1/+5
| |
| * htmlparser, tests: make tests1.dat run without errors | bptato | 2023-11-19 | 3 | -99/+147
| |   * Fix several bugs in adoptionAgencyAlgorithm, and factor out several "find index" operations
| |   * Fix some frameset, table col related bugs
| |   * minidom: simplify moveChildren, assert on adding children with an existing parent
| * tests/tree: fix comment handling, log data | bptato | 2023-11-18 | 1 | -26/+16
| |
| * htmltokenizer: format | bptato | 2023-11-18 | 1 | -2/+2
| |
| * htmlparser: adoption agency algorithm fixes | bptato | 2023-11-18 | 1 | -13/+20
| |   * Fix misunderstanding: the stack grows *downwards*.
| |   * Add some comments
| * tests: incomplete support for tree builder tests | bptato | 2023-11-18 | 1 | -0/+275
| |
| * Update chakasu | bptato | 2023-11-18 | 1 | -1/+1
| |
| * tokenizer: move flush_chars into a proc | bptato | 2023-10-27 | 1 | -28/+28
| |
| * Add null character token type | bptato | 2023-10-27 | 3 | -47/+42
| |   So that we do not have to replace it in the parser.
| * Version 0.12.0 | bptato | 2023-10-23 | 2 | -3/+3
| |
| * Add pushInTemplate for fragment parsing | bptato | 2023-10-23 | 1 | -0/+5
| |
| * Reduce nil usage for Handles | bptato | 2023-10-23 | 1 | -9/+13
| |   Still not nil-free, because insertBefore & insertText needs nil.
| * htmlparser: add openElementsInit, formInit to opts | bptato | 2023-10-23 | 1 | -1/+12
| |   Makes it possible to set an initial value for openElements and the form pointer, as required by the HTML fragment parsing algorithm.
| * parser: add initial tokenizer state option; tokenizer: allow any kind of stream | bptato | 2023-10-22 | 4 | -27/+61
| |   Use this to enable the unicodeCharsProblematic test, by importing runestream.
| * update chakasu | bptato | 2023-10-22 | 1 | -1/+1
| |
| * Version 0.11.2 | bptato | 2023-09-30 | 2 | -2/+2
| |
| * Fix potential OOB seq access in peek_char | bptato | 2023-09-30 | 1 | -1/+2
| |   Call consume() so that the buffer is filled if we are not at EOF yet (through checkBufLen).
| * tolower -> toLowerAscii | bptato | 2023-09-24 | 1 | -1/+1
| |
| * twtstr: remove unused functions | bptato | 2023-09-24 | 1 | -307/+0
| |
| * Version 0.11.1 | bptato | 2023-09-24 | 2 | -2/+2
| |
| * remove unused functions | bptato | 2023-09-24 | 1 | -8/+1
| |
| * update chakasu | bptato | 2023-09-24 | 1 | -1/+1
| |
| * Version 0.11.0 | bptato | 2023-09-19 | 2 | -3/+3
| |
| * tags: clean up | bptato | 2023-09-19 | 1 | -72/+1
| |   * InputType, ButtonType have nothing to do with the parser.
| |   * Neither do many categories included in the module, these have been removed too. (Many of these are remnants of the previous HTML parser.)
| * Version 0.10.1 | bptato | 2023-09-14 | 2 | -2/+2
| |
| * htmlparser: add whitespace handling to text & in table states | bptato | 2023-09-14 | 1 | -2/+2
| |   a rather problematic omission
| * Version 0.10.0 | bptato | 2023-09-14 | 2 | -3/+3
| |
| * htmlparser: check for moveChildren not being nil | bptato | 2023-09-14 | 1 | -0/+1
| |
| * Update chakasu | bptato | 2023-09-14 | 2 | -2/+3
| |
| * tests: disable unicodeCharsProblematic | bptato | 2023-09-03 | 1 | -2/+10
| |   This really just won't work with what we have right now.
| * tokenizer: fix more tests | bptato | 2023-09-03 | 3 | -50/+83
| |   Now all tokenizer tests work, except for unicodeCharsProblematic.
| * tokenizer: make domjs tests work | bptato | 2023-09-02 | 2 | -16/+64
| |   add escaped inputs/outputs, fix some tokenizer bugs
| * tokenizer: fix contentModelFlags tests | bptato | 2023-09-02 | 2 | -6/+8
| |   Fix some bugs with EOF handling, also some bugs in the test code.
| * tokenizer: expose laststart | bptato | 2023-09-02 | 2 | -7/+13
| |
| * Add html5lib-tests | bptato | 2023-09-02 | 3 | -0/+177
| |   For now, tokenizer tests only.
| * tokenizer: emit strings instead of chars | bptato | 2023-09-02 | 2 | -109/+137
| |   Makes more sense overall.
| |
| |   As an optimization, emit separate whitespace tokens so that we do not have to check for string contents.
| * htmlparser: fix dependency on nodeType | bptato | 2023-09-02 | 1 | -1/+1
| |
| * minidom: fix warning | bptato | 2023-09-02 | 1 | -1/+0
| |
| * Add moveChildren, remove dependency on childList | bptato | 2023-09-02 | 2 | -14/+20
| |   * moveChildren: to move child nodes in the adoption agency algorithm.
| |   * We accidentally depended on childList existing in the DOM implementation, this has been fixed by the above addition.
| * Version 0.9.3 | bptato | 2023-08-15 | 2 | -2/+2
| |
| * Fix assertion on unexpected characters | bptato | 2023-08-15 | 2 | -0/+3
| |   In some cases, an unexpected character token could call parseErrorByTokenType...
| * Add restart callback, implement setCharacterSet | bptato | 2023-08-15 | 2 | -7/+38
| |   restart is mainly needed for resetting the document node.
| |   setCharacterSet now works (albeit somewhat differently than previously specified.)