author | Kartik K. Agaram <vc@akkartik.com> | 2021-11-09 08:12:11 -0800 |
---|---|---|
committer | Kartik K. Agaram <vc@akkartik.com> | 2021-11-09 08:12:11 -0800 |
commit | d253a3182859c7c989449122a60d5f362f19ded0 (patch) | |
tree | 7459cddc57f93107fa4cee89d4f0a94dd0f0f131 /tutorial | |
parent | d1808995b2c6b99749237a29e6ac6477d00ff8f9 (diff) | |
download | mu-d253a3182859c7c989449122a60d5f362f19ded0.tar.gz |
rename grapheme to code-point-utf8
Longer name, but it doesn't lie. We have no data structure right now for combining multiple code points. And it makes no sense for the notion of a grapheme to be conflated with its Unicode encoding.
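The distinction the commit message draws can be sketched in a few lines. This is Python rather than Mu, purely for illustration, and none of the variable names come from the Mu codebase. It shows one user-perceived character built from two code points, each of which has its own UTF-8 encoding.

```python
# Illustration only: not Mu code, and not Mu's in-memory representation.
# One user-perceived character written as two code points:
# U+0065 LATIN SMALL LETTER E followed by U+0301 COMBINING ACUTE ACCENT.
grapheme = "e\u0301"

# Each element of a Python str is one code point.
code_points = [ord(ch) for ch in grapheme]

# Each code point has its own UTF-8 byte sequence.
utf8_per_code_point = [ch.encode("utf-8") for ch in grapheme]

print(len(grapheme))                     # count of code points, not graphemes
print([hex(cp) for cp in code_points])
print(utf8_per_code_point)
```

A type holding the UTF-8 bytes of a single code point can represent either element of `utf8_per_code_point`, but not the pair as a unit, which seems to be the reason the old name was misleading.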
Diffstat (limited to 'tutorial')
-rw-r--r-- | tutorial/converter.mu | 2 |
-rw-r--r-- | tutorial/converter2.mu | 2 |
-rw-r--r-- | tutorial/index.md | 12 |
3 files changed, 8 insertions, 8 deletions
diff --git a/tutorial/converter.mu b/tutorial/converter.mu
index a8aa26e3..b101fbdd 100644
--- a/tutorial/converter.mu
+++ b/tutorial/converter.mu
@@ -55,7 +55,7 @@ fn main screen: (addr screen), keyboard: (addr keyboard), data-disk: (addr disk)
   # process a single keystroke
   $main:input: {
     var key/eax: byte <- read-key keyboard
-    var key/eax: grapheme <- copy key
+    var key/eax: code-point-utf8 <- copy key
     compare key, 0
     loop-if-=
     # tab = switch cursor between input areas
diff --git a/tutorial/converter2.mu b/tutorial/converter2.mu
index ae445239..5e338647 100644
--- a/tutorial/converter2.mu
+++ b/tutorial/converter2.mu
@@ -37,7 +37,7 @@ fn main screen: (addr screen), keyboard: (addr keyboard), data-disk: (addr disk)
   # process a single keystroke
   $main:input: {
     var key/eax: byte <- read-key keyboard
-    var key/eax: grapheme <- copy key
+    var key/eax: code-point-utf8 <- copy key
     compare key, 0
     loop-if-=
     # tab = switch cursor between input areas
diff --git a/tutorial/index.md b/tutorial/index.md
index 48173fbb..3c7f781f 100644
--- a/tutorial/index.md
+++ b/tutorial/index.md
@@ -541,7 +541,7 @@ fn main screen: (addr screen), keyboard: (addr keyboard) {
     var done?/eax: boolean <- stream-empty? in
     compare done?, 0/false
     break-if-!=
-    var g/eax: grapheme <- read-grapheme in
+    var g/eax: code-point-utf8 <- read-code-point-utf8 in
     # do stuff with g here
     loop
   }
@@ -550,8 +550,8 @@ fn main screen: (addr screen), keyboard: (addr keyboard) {
 `read-line-from-keyboard` reads keystrokes from the keyboard until you press
 the `Enter` (also called `newline`) key, and accumulates them into a _stream_
-of bytes. The loop then repeatedly reads _graphemes_ from the stream. A
-grapheme can consist of multiple bytes, particularly outside of the Latin
+of bytes. The loop then repeatedly reads _code-point-utf8s_ from the stream. A
+code-point-utf8 can consist of multiple bytes, particularly outside of the Latin
 alphabet and Arabic digits most prevalent in the West.

 Mu doesn't yet support non-Qwerty keyboards, but support for other keyboards
 should be easy to add.
@@ -561,12 +561,12 @@ give yourself a sense of what you can do with them. Does the above program make
 sense now? Feel free to experiment to make sense of it.

 Can you modify it to print out the line a second time, after you've typed it
-out until the `Enter` key? Can you print a space after every grapheme when you
+out until the `Enter` key? Can you print a space after every code-point-utf8 when you
 print the line out a second time? You'll need to skim the section on
 [printing to screen](https://github.com/akkartik/mu/blob/main/vocabulary.md#printing-to-screen)
 from Mu's vocabulary. Pay particular attention to the difference between a
-grapheme and a _code-point_. Mu programs often read characters in units of
-graphemes, but they must draw in units of code-points that the font manages.
+code-point-utf8 and a _code-point_. Mu programs often read characters in units of
+code-point-utf8s, but they must draw in units of code-points that the font manages.
 (This adds some complexity but helps combine multiple code-points into a single
 glyph as needed for some languages.)
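The reading loop described in the tutorial text above can be approximated outside Mu. The sketch below is Python, not Mu, and the helpers `utf8_length`, `read_code_point_utf8`, and `split_line` are hypothetical names chosen to echo the tutorial; they are not Mu's implementation. It pulls one code point's worth of UTF-8 bytes at a time from a byte stream, judging the length from the first byte.

```python
import io


def utf8_length(first_byte: int) -> int:
    """Length in bytes of a UTF-8 sequence, judged from its first byte."""
    if first_byte < 0x80:
        return 1                      # 0xxxxxxx: ASCII
    if first_byte >> 5 == 0b110:
        return 2                      # 110xxxxx
    if first_byte >> 4 == 0b1110:
        return 3                      # 1110xxxx
    if first_byte >> 3 == 0b11110:
        return 4                      # 11110xxx
    raise ValueError(f"invalid UTF-8 lead byte: {first_byte:#x}")


def read_code_point_utf8(stream: io.BytesIO) -> bytes:
    """Hypothetical helper: return the UTF-8 bytes of one code point,
    or b"" once the stream is empty. Continuation bytes are not validated."""
    first = stream.read(1)
    if not first:
        return b""
    rest = stream.read(utf8_length(first[0]) - 1)
    return first + rest


def split_line(line: bytes) -> list[bytes]:
    """Read the whole stream, one code point's bytes at a time."""
    stream = io.BytesIO(line)
    units = []
    while True:
        unit = read_code_point_utf8(stream)
        if not unit:                  # stream is empty, stop looping
            break
        units.append(unit)
    return units


# 'a' (1 byte), U+00E9 (2 bytes), U+20AC (3 bytes)
units = split_line("a\u00e9\u20ac".encode("utf-8"))
print(units)
# In the spirit of the tutorial's exercise: a space after every unit.
print(" ".join(u.decode("utf-8") for u in units))
```

Each element of `units` still holds encoded bytes. Turning one into a code point (here via `decode`) is a separate step, which parallels the tutorial's distinction between reading code-point-utf8s and drawing code-points.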