From 52432265c0b8a35af02e3d05911fb50613b5ce81 Mon Sep 17 00:00:00 2001
From: Andinus
Date: Sun, 7 Feb 2021 23:16:31 +0530
Subject: Fix the regex for puzzle
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

The older regex fails on [today's puzzle] & I didn't really understand
what it did. The newer one is simpler & I understand how it works.

[today's puzzle] https://mastodon.art/@Algot/105690195742318751

Thanks to guifa on #raku@freenode, who explained to me how they would
build a regex for this problem. I'm pasting the logs here:

|[...]
|10:35 Smallest element here is the letter. Lots of ways to
|      represent it, but I’d go with \S+
|10:35 The next smallest is the group of letters
|10:36 Which is what you just got with spaces in between it
|10:36 so you get (\S+)+ % \h
|10:36 Next you want to grab individual lines with that
|      pattern in it so
|10:37 ( (\S+)+ % \h )+ \n
|10:37 And lastly, you want to start the pattern after a
|      double return
|[...]
|10:39 \n \n ( (\S+)+ % \h )+ % \n
|10:41 The only problem here is that this technically does
|      match Hint. So to limit things more, you can either
|      be stricter about the inner bit (using \S \*? instead
|      of \S+), explicitly putting “Hint\n\n” in the regex
|      start, or requiring more than one inner match (\S+)
|      ** 2..* % \h
|[...]
|10:47 But you might consider breaking things out into
|      tokens
|[...]
|10:54 I think a lot of times people try to write regex left
|      to right, when they need to make it small to big
|10:54 That’s part of the reason you have the grammars in
|      Raku — it really pushes you to think of things that
|      way

I asked guifa before including this; they were okay with it.
---
 lib/Octans/Puzzle.rakumod | 16 ++++++++++------
 1 file changed, 10 insertions(+), 6 deletions(-)

diff --git a/lib/Octans/Puzzle.rakumod b/lib/Octans/Puzzle.rakumod
index 89eae69..d97f9d6 100644
--- a/lib/Octans/Puzzle.rakumod
+++ b/lib/Octans/Puzzle.rakumod
@@ -24,15 +24,19 @@ sub get-puzzle (
         $url = "https://mastodon.art/api/v1/statuses/" ~ $path.split("/")[*-1];
     }
 
+    # grids capture grids of a row.
+    my token grids { \S \*? }
+    # rows capture rows of the puzzle.
+    my token rows { <grids> ** 2..* % \h }
+
     # jget just get's the url & decodes the json. We access the
     # description field of 1st media attachment.
     if (jget($url)<media_attachments>[0]<description> ~~
-
-        # This regex gets the puzzle in $match.
-        / ([(\w [\*]?) \s*?]+ \n)+ $/) -> $match {
-        for 0 .. $match[0].end -> $y {
-            for 0 .. $match[0][$y].words.end -> $x {
-                @puzzle[$y][$x] = $match[0][$y].words[$x].lc;
+        / \n\n <rows>+ % \n /
+    ) -> $match {
+        for 0 .. $match<rows>.end -> $y {
+            for 0 .. $match<rows>[$y]<grids>.end -> $x {
+                @puzzle[$y][$x] = $match<rows>[$y]<grids>[$x].lc;
             }
         }
     }
-- 
cgit 1.4.1-2-gfad0
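
The small-to-big construction described in the log can be tried outside the module. What follows is only a minimal sketch, not the module's code: the `$description` string is a made-up stand-in for the attachment description that `jget` would return, and the loop is condensed for illustration. The two tokens and the top-level regex follow the patch.

```raku
# Hypothetical sample text standing in for the attachment description.
my $description = "Hint\n\na b c*\nd E f\ng h i";

# Smallest unit: one non-space character, optionally followed by "*".
my token grids { \S \*? }
# Next unit up: two or more grids separated by horizontal whitespace.
my token rows { <grids> ** 2..* % \h }

my @puzzle;
# Largest unit: after a blank line, one or more rows separated by newlines.
if $description ~~ / \n\n <rows>+ % \n / -> $match {
    for $match<rows>.list.kv -> $y, $row {
        # Store each row as an array of lowercased grid strings.
        @puzzle[$y] = $row<grids>.map(*.lc).Array;
    }
}

.say for @puzzle;
```

Because `rows` requires at least two grids, a lone word such as "Hint" on its own line cannot match as a row, which is the restriction suggested at 10:41 in the log.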