author | Kartik Agaram <vc@akkartik.com> | 2020-08-02 15:31:56 -0700 |
---|---|---|
committer | Kartik Agaram <vc@akkartik.com> | 2020-08-02 15:50:19 -0700 |
commit | 89c9ed80f9f7f4d4d40fea44c6e08362cfde50c7 | |
tree | 2ab8f044346695447468303e1f74372e5f158d76 /apps/mu.subx | |
parent | 0f5d0ec519c5b6fbb36ace912426e6a3fb8aa8ec | |
6706 - support utf-8
For example:

    fn main -> r/ebx: int {
      var x/eax: grapheme <- copy 0x9286e2  # code point 0x2192 in utf-8
      print-grapheme-to-real-screen x
      print-string-to-real-screen "\n"
    }

Graphemes must fit in 4 bytes (21 bits for code points). Unclear what we should do for longer clusters since graphemes are a fixed-size type at the moment.
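To see where the literal `0x9286e2` in the commit message comes from, here is a small Python sketch (not part of the Mu sources). It assumes, based only on the example above, that a grapheme holds the UTF-8 bytes of one code point packed into a 32-bit word with the first byte in the lowest position. The helper names `pack_grapheme` and `unpack_grapheme` are hypothetical.

```python
def pack_grapheme(code_point: int) -> int:
    """Pack the UTF-8 encoding of one code point into an integer.

    Assumption (inferred from the commit's example, not from Mu's code):
    the first UTF-8 byte occupies the least significant byte.
    """
    encoded = chr(code_point).encode("utf-8")  # 1 to 4 bytes for any valid code point
    return int.from_bytes(encoded, "little")


def unpack_grapheme(grapheme: int) -> str:
    """Inverse of pack_grapheme: recover the character from the packed word."""
    n_bytes = max(1, (grapheme.bit_length() + 7) // 8)
    return grapheme.to_bytes(n_bytes, "little").decode("utf-8")


# U+2192 (rightwards arrow) encodes to the bytes e2 86 92,
# which pack to the literal used in the commit message.
print(hex(pack_grapheme(0x2192)))   # 0x9286e2
print(unpack_grapheme(0x9286e2))    # the arrow character itself
```

The largest code point, U+10FFFF, needs 21 bits and encodes to exactly 4 UTF-8 bytes, so any single code point fits the fixed-size type. A cluster of several code points generally would not, which is the open question the commit message raises.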
Diffstat (limited to 'apps/mu.subx')
-rw-r--r-- | apps/mu.subx | 2 |
1 files changed, 2 insertions, 0 deletions
diff --git a/apps/mu.subx b/apps/mu.subx
index 20f59336..912b2b1f 100644
--- a/apps/mu.subx
+++ b/apps/mu.subx
@@ -414,6 +414,8 @@ Type-id: # (stream (addr array byte))
   "slice"/imm32  # 12
   "code-point"/imm32  # 13; smallest scannable unit from a text stream
   "grapheme"/imm32  # 14; smallest printable unit; will eventually be composed of multiple code-points, but currently corresponds 1:1
+                    # only 4-byte graphemes in utf-8 are currently supported;
+                    # unclear how we should deal with larger clusters.
   # Keep Primitive-type-ids in sync if you add types here.
   0/imm32
   0/imm32 0/imm32 0/imm32 0/imm32 0/imm32 0/imm32 0/imm32 0/imm32