author | apense <apense@users.noreply.github.com> | 2015-06-08 19:48:57 -0400 |
---|---|---|
committer | apense <apense@users.noreply.github.com> | 2015-06-08 19:48:57 -0400 |
commit | 0ee1672d69272aa75cf9be15dede34773a4fa487 (patch) | |
tree | 0c8b567b249b033d5c5d6d83f290294069205aa2 /lib/pure | |
parent | c4009c61820190c188f6bcf7469754b3c40201e5 (diff) | |
Updated whitespace ranges
Ranges sourced from <http://www.unicode.org/Public/7.0.0/ucd/PropList.txt>. Wikipedia uses the same ranges on its information page <http://en.wikipedia.org/wiki/Whitespace_character#Unicode>. 0xfeff isn't included in that list, but it is a zero-width no-break space, so I guess keeping it makes sense. 0x200b is actually a format character, but it is a zero-width space. To match Unicode exactly, both 0x200b and 0xfeff would have to be removed.
Diffstat (limited to 'lib/pure')
-rw-r--r-- | lib/pure/unicode.nim | 10 |
1 files changed, 8 insertions, 2 deletions
diff --git a/lib/pure/unicode.nim b/lib/pure/unicode.nim
index 5fd3c2418..4446eaa0c 100644
--- a/lib/pure/unicode.nim
+++ b/lib/pure/unicode.nim
@@ -372,11 +372,17 @@ const
     0xfe74] #

   spaceRanges = [
-    0x0009, 0x000a, # tab and newline
+    0x0009, 0x000d, # tab and newline
     0x0020, 0x0020, # space
+    0x0085, 0x0085, # next line
     0x00a0, 0x00a0, #
-    0x2000, 0x200b, # -
+    0x1680, 0x1680, # Ogham space mark
+    0x2000, 0x200b, # en dash .. zero-width space
+    0x200e, 0x200f, # LTR mark .. RTL mark (pattern whitespace)
     0x2028, 0x2029, # - 0x3000, 0x3000, #
+    0x202f, 0x202f, # narrow no-break space
+    0x205f, 0x205f, # medium mathematical space
+    0x3000, 0x3000, # ideographic space
     0xfeff, 0xfeff] #

   toupperRanges = [
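The table in the diff above stores whitespace as a flat list of inclusive (low, high) code point pairs. As a rough illustration of how such a table can be queried, here is a minimal standalone sketch. It copies the updated values into its own constant, and `inSpaceRanges` is a hypothetical helper written for this example, not a proc from `unicode.nim`. It uses a plain linear scan for clarity; the lookup the module itself performs may well differ.

```nim
# Standalone copy of the updated table, for illustration only.
# Each consecutive pair is an inclusive (low, high) range.
const spaceRanges = [
  0x0009, 0x000d,  # tab .. carriage return
  0x0020, 0x0020,  # space
  0x0085, 0x0085,  # next line
  0x00a0, 0x00a0,  # no-break space
  0x1680, 0x1680,  # Ogham space mark
  0x2000, 0x200b,  # up to and including zero-width space
  0x200e, 0x200f,  # LTR mark .. RTL mark
  0x2028, 0x2029,  # line separator .. paragraph separator
  0x202f, 0x202f,  # narrow no-break space
  0x205f, 0x205f,  # medium mathematical space
  0x3000, 0x3000,  # ideographic space
  0xfeff, 0xfeff]  # zero-width no-break space

proc inSpaceRanges(c: int): bool =
  ## Hypothetical helper: true if `c` falls inside any (low, high) pair.
  var i = 0
  while i < spaceRanges.len:
    if c >= spaceRanges[i] and c <= spaceRanges[i + 1]:
      return true
    inc i, 2  # step to the next pair
  result = false

when isMainModule:
  doAssert inSpaceRanges(0x000b)      # vertical tab, newly covered by 0x0009..0x000d
  doAssert inSpaceRanges(0x0085)      # next line, newly added
  doAssert inSpaceRanges(0xfeff)      # kept, although not in PropList.txt
  doAssert not inSpaceRanges(0x200c)  # falls between 0x200b and 0x200e
  doAssert not inSpaceRanges(0x0041)  # 'A'
```

Because the pairs are sorted by code point, a binary search over the pairs would also work; the linear scan is used here only to keep the sketch short.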