From 52432265c0b8a35af02e3d05911fb50613b5ce81 Mon Sep 17 00:00:00 2001
From: Andinus <andinus@nand.sh>
Date: Sun, 7 Feb 2021 23:16:31 +0530
Subject: Fix the regex for puzzle
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

The older regex fails on [today's puzzle] & I didn't really understand
what it did. The newer one is simpler & I understand how it works.

[today's puzzle] https://mastodon.art/@Algot/105690195742318751

Thanks to guifa on #raku@freenode, who explained to me how they would
build a regex for this problem. I'm pasting the logs here:

|[...]
|10:35 <guifa> Smallest element here is the letter.  Lots of ways to
|              represent it, but I’d go with \S+
|10:35 <guifa> The next smallest is the group of letters
|10:36 <guifa> Which is what you just got with spaces in between it
|10:36 <guifa> so you get (\S+)+ % \h
|10:36 <guifa> Next you want to grab individual lines with that
|              pattern in it so
|10:37 <guifa> ( (\S+)+ % \h )+ \n
|10:37 <guifa> And lastly, you want to start the pattern after a
|              double return
|[...]
|10:39 <guifa> \n \n ( (\S+)+ % \h )+ % \n
|10:41 <guifa> The only problem here is that this technically does
|              match Hint.  So to limit things more, you can either
|              be stricter about the inner bit (using \S \*? instead
|              of \S+), explicitly putting “Hint\n\n” in the regex
|              start, or requiring more than one inner match (\S+)
|              ** 2..* % \h
|[...]
|10:47 <guifa> But you might consider breaking things out into
|              tokens
|[...]
|10:54 <guifa> I think a lot of times people try to write regex left
|              to right, when they need to make it small to big
|10:54 <guifa> That’s part of the reason you have the grammars in
|              Raku — it really pushes you to think of things that
|              way

I asked guifa before including this; they were okay with it.
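
As a rough sketch of the small-to-big approach from the logs, here is
a standalone version that can be run outside Octans. The sample
description string and the token names `cell` and `row` are made up
for illustration (the patch itself names them `grids` and `rows`), and
the "Hint, blank line, grid" layout is an assumption based on the
discussion above, not a guarantee about every puzzle.

```raku
# Hypothetical sample input: a "Hint" line, a blank line, then the grid.
my $description = "Hint\n\nA B* C\nD E F\nG H I";

# Smallest unit: one non-space character, optionally followed by "*".
my token cell { \S \*? }
# Next unit: at least two cells, separated by one horizontal space each.
my token row  { <cell> ** 2..* % \h }

my @puzzle;
# Largest unit: rows separated by newlines, starting after a blank line.
if $description ~~ / \n\n <row>+ % \n / -> $match {
    for $match<row>.kv -> $y, $r {
        # Store each cell lowercased, as get-puzzle does.
        @puzzle[$y] = $r<cell>.map(*.Str.lc).Array;
    }
}

# Print the parsed grid, one row per line.
say @puzzle.map(*.join(" ")).join("\n");
```

Requiring at least two cells per row is what keeps a single word such
as "Hint" from being read as a row.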
---
 lib/Octans/Puzzle.rakumod | 16 ++++++++++------
 1 file changed, 10 insertions(+), 6 deletions(-)

diff --git a/lib/Octans/Puzzle.rakumod b/lib/Octans/Puzzle.rakumod
index 89eae69..d97f9d6 100644
--- a/lib/Octans/Puzzle.rakumod
+++ b/lib/Octans/Puzzle.rakumod
@@ -24,15 +24,19 @@ sub get-puzzle (
             $url = "https://mastodon.art/api/v1/statuses/" ~ $path.split("/")[*-1];
         }
 
+        # grids captures a single grid (cell) of a row.
+        my token grids { \S \*? }
+        # rows captures a single row: 2+ grids separated by \h.
+        my token rows { <grids> ** 2..* % \h }
+
         # jget just get's the url & decodes the json. We access the
         # description field of 1st media attachment.
         if (jget($url)<media_attachments>[0]<description> ~~
-
-            # This regex gets the puzzle in $match.
-            / ([(\w [\*]?) \s*?]+ \n)+  $/) -> $match {
-            for 0 .. $match[0].end -> $y {
-                for 0 .. $match[0][$y].words.end -> $x {
-                    @puzzle[$y][$x] = $match[0][$y].words[$x].lc;
+            / \n\n <rows>+ % \n /
+           ) -> $match {
+            for 0 .. $match<rows>.end -> $y {
+                for 0 .. $match<rows>[$y]<grids>.end -> $x {
+                    @puzzle[$y][$x] = $match<rows>[$y]<grids>[$x].lc;
                 }
             }
         }
-- 
cgit 1.4.1-2-gfad0