about summary refs log tree commit diff stats
diff options
context:
space:
mode:
authorAndinus <andinus@nand.sh>2021-01-19 18:20:23 +0530
committerAndinus <andinus@nand.sh>2021-01-19 18:20:23 +0530
commit5bb0f224483fbc1d57fd1c5a2f4a22dd7263ecd6 (patch)
tree1623478ad72521793c499522a88bf258a39f6464
parenta5fd4afc258afe61adc298101cf2665e0dfb6b9f (diff)
downloadoctans-5bb0f224483fbc1d57fd1c5a2f4a22dd7263ecd6.tar.gz
Re-implement octans, move subroutines to respective modules
Initially it went over the list of words & checked if they exist in
the grid. This was very slow.

Currently it walks the grid & checks if the current string exist in
the dictionary. This is faster for these reasons:

• The dictionary is sorted, we perform binary range search on the
  dictionary to return the list of all words that start with specific
  string.
• Starting positions are limited.

If the dictionary wasn't sorted then this probably would've been
-rw-r--r--.gitignore1
-rw-r--r--README85
-rw-r--r--README.org28
-rw-r--r--lib/Neighbors.rakumod53
-rw-r--r--lib/Puzzle.rakumod56
-rw-r--r--lib/RangeSearch.rakumod70
-rw-r--r--lib/WordSearch.rakumod67
-rwxr-xr-xoctans.raku268
8 files changed, 410 insertions, 218 deletions
diff --git a/.gitignore b/.gitignore
new file mode 100644
index 0000000..4a5e4c7
--- /dev/null
+++ b/.gitignore
@@ -0,0 +1 @@
+lib/.precomp
diff --git a/README b/README
new file mode 100644
index 0000000..b05d0ae
--- /dev/null
+++ b/README
@@ -0,0 +1,85 @@
+                               ━━━━━━━━━
+                                 OCTANS
+
+                                Andinus
+                               ━━━━━━━━━
+
+
+Table of Contents
+─────────────────
+
+1. Demo
+2. Documentation
+.. 1. Implementation
+.. 2. Options
+
+
+Octans is a program to solve Algot's Wordplay (Wordsearch) puzzles.
+
+• Website: <https://andinus.nand.sh/octans>
+• Source: <https://git.tilde.institute/andinus/octans>
+• GitHub: <https://github.com/andinus/octans>
+
+
+1 Demo
+══════
+
+  This was recorded with `asciinema(1)'.
+
+  [https://asciinema.org/a/384464.png]
+
+  ⁃ Octans 2020-01-14: <https://asciinema.org/a/384464>
+  ⁃ alt-link (download): <https://andinus.nand.sh/static/octans>
+
+
+[https://asciinema.org/a/384464.png] <https://asciinema.org/a/384464>
+
+
+2 Documentation
+═══════════════
+
+2.1 Implementation
+──────────────────
+
+  Initially it went over the list of words & checked if they exist in
+  the grid. This was very slow.
+
+  Currently it walks the grid & checks if the current string exist in
+  the dictionary. This is faster for these reasons:
+
+  • The dictionary is sorted, we perform binary range search on the
+    dictionary to return the list of all words that start with specific
+    string.
+  • Starting positions are limited.
+
+  If the dictionary wasn't sorted then this probably would've been
+  slower than previous implementation.
+
+
+2.2 Options
+───────────
+
+2.2.1 dict
+╌╌╌╌╌╌╌╌╌╌
+
+  Octans's default dictionary file is `/usr/share/dict/words', use
+  `--dict' flag to change the dictionary. The words in dictionary must
+  be seperated by a newline (`\n') & sorted alphabetically.
+
+
+2.2.2 url
+╌╌╌╌╌╌╌╌╌
+
+  The url to be passed must be in either format:
+
+  • Link when you view it from your local instance:
+    <https://tilde.zone/web/statuses/105531207939242077>
+
+  • Link from Algot's profile:
+    <https://mastodon.art/@Algot/105333136907848390>
+
+
+2.2.3 verbose
+╌╌╌╌╌╌╌╌╌╌╌╌╌
+
+  This will increase verbosity.
diff --git a/README.org b/README.org
index 0895433..d657291 100644
--- a/README.org
+++ b/README.org
@@ -18,19 +18,35 @@ This was recorded with =asciinema(1)=.
 + alt-link (download): https://andinus.nand.sh/static/octans
 
 * Documentation
-The url to be passed must be in either format:
+** Implementation
+Initially it went over the list of words & checked if they exist in the
+grid. This was very slow.
 
-- Link when you view it from your local instance:
-  https://tilde.zone/web/statuses/105531207939242077
+Currently it walks the grid & checks if the current string exist in the
+dictionary. This is faster for these reasons:
 
-- Link from Algot's profile:
-  https://mastodon.art/@Algot/105333136907848390
+- The dictionary is sorted, we perform binary range search on the
+  dictionary to return the list of all words that start with specific
+  string.
+- Starting positions are limited.
+
+If the dictionary wasn't sorted then this probably would've been slower
+than previous implementation.
 
 ** Options
 *** dict
 Octans's default dictionary file is =/usr/share/dict/words=, use =--dict=
 flag to change the dictionary. The words in dictionary must be seperated
-by a newline (=\n=).
+by a newline (=\n=) & sorted alphabetically.
+
+*** url
+The url to be passed must be in either format:
+
+- Link when you view it from your local instance:
+  https://tilde.zone/web/statuses/105531207939242077
+
+- Link from Algot's profile:
+  https://mastodon.art/@Algot/105333136907848390
 
 *** verbose
 This will increase verbosity.
diff --git a/lib/Neighbors.rakumod b/lib/Neighbors.rakumod
new file mode 100644
index 0000000..41cb35d
--- /dev/null
+++ b/lib/Neighbors.rakumod
@@ -0,0 +1,53 @@
+unit module Neighbors;
+
+# neighbors returns the neighbors of given index. Neighbors are cached
+# in @neighbors array. This way we don't have to compute them
+# everytime neighbors subroutine is called for the same position.
+sub neighbors (
+    @puzzle, Int $y, Int $x --> List
+) is export {
+    # @directions is holding a list of directions we can move in. It's
+    # used later for neighbors subroutine.
+    state List @directions = (
+        # $y, $x
+        ( +1, +0 ), # bottom
+        ( -1, +0 ), # top
+        ( +0, +1 ), # left
+        ( +0, -1 ), # right
+    );
+
+    # @neighbors holds the neighbors of given position.
+    state Array @neighbors;
+
+    if @puzzle[$y][$x] {
+
+        # If we've already computed the neighbors then no need to do
+        # it again.
+        unless @neighbors[$y][$x] {
+            my Int $pos-x;
+            my Int $pos-y;
+
+            # Starting from the intital position of $y, $x we move to
+            # each direction according to the values specified in
+            # @directions array. In this case we're just trying to
+            # move in 4 directions (top, bottom, left & right).
+            DIRECTION: for @directions -> $direction {
+                $pos-y = $y + $direction[0];
+                $pos-x = $x + $direction[1];
+
+                # If movement in this direction is out of puzzle grid
+                # boundary then move on to next direction.
+                next DIRECTION unless @puzzle[$pos-y][$pos-x];
+
+                # If neighbors exist in this direction then add them
+                # to @neighbors[$y][$x] array.
+                push @neighbors[$y][$x], [$pos-y, $pos-x];
+            }
+        }
+    } else {
+        # If it's out of boundary then return no neighbor.
+        @neighbors[$y][$x] = [];
+    }
+
+    return @neighbors[$y][$x];
+}
diff --git a/lib/Puzzle.rakumod b/lib/Puzzle.rakumod
new file mode 100644
index 0000000..bf4f8c3
--- /dev/null
+++ b/lib/Puzzle.rakumod
@@ -0,0 +1,56 @@
+unit module Puzzle;
+
+use WWW;
+
+# get-puzzle returns the @puzzle along with it's @gray-squares.
+sub get-puzzle (
+    Str $url,
+
+    # @puzzle will hold the puzzle grid.
+    @puzzle,
+
+    # @gray-squares will hold the position of gray squares. Algot
+    # marks them with an asterisk ("*") after the character.
+    @gray-squares
+) is export {
+    # $toot_url will hold the url that we'll call to get the toot data.
+    my Str $toot_url;
+
+    # User can pass 2 types of links, either it will be the one when they
+    # view it from their local instance or the one they get from Algot's
+    # profile. We set $toot_url from it.
+    if $url.match("web/statuses") -> $match {
+        $toot_url = $match.replace-with("api/v1/statuses");
+    } else {
+        $toot_url = "https://mastodon.art/api/v1/statuses/" ~ $url.split("/")[*-1];
+    }
+
+    # @gray-squares should be empty.
+    @gray-squares = ();
+
+    # jget just get's the url & decodes the json. We access the
+    # description field of 1st media attachment.
+    if (jget($toot_url)<media_attachments>[0]<description> ~~
+
+        # This regex gets the puzzle in $match.
+        / [[(\w [\*]?) \s*] ** 4] ** 4 $/) -> $match {
+
+        # We have each character of the puzzle stored in $match. It's
+        # assumed that it'll be a 4x4 grid.
+        for 0 .. 3 -> $y {
+            for 0 .. 3 -> $x {
+                with $match[0][($y * 4) + $x].Str.lc -> $char {
+
+                    # If it ends with an asterisk then we push the
+                    # position to @gray-squares.
+                    if $char.ends-with("*") {
+                        @puzzle[$y][$x] = $char.comb[0];
+                        push @gray-squares, [$y, $x];
+                    } else {
+                        @puzzle[$y][$x] = $char;
+                    }
+                }
+            }
+        }
+    }
+}
diff --git a/lib/RangeSearch.rakumod b/lib/RangeSearch.rakumod
new file mode 100644
index 0000000..16e43c1
--- /dev/null
+++ b/lib/RangeSearch.rakumod
@@ -0,0 +1,70 @@
+unit module RangeSearch;
+
+# range-starts-with returns a subset of given @dict list that start
+# with $str. It should be faster than:
+#
+#   @dict.grep: *.starts-with($str)
+#
+# @dict should be a sorted list of words. It performs binary lookup on
+# the list.
+sub range-starts-with (
+    @dict, Str $str --> List
+) is export {
+    # $lower, $upper hold the lower and upper index of the range
+    # respectively.
+    my Int ($lower, $upper);
+
+    # Lookup the whole dictionary.
+    my Int ($start, $end) = (0, @dict.end);
+
+    # Loop until we end up on the lower index of range.
+    while $start < $end {
+        # Divide the list into 2 parts.
+        my Int $mid = ($start + $end) div 2;
+
+        # Check if $mid word is le (less than or equal to) $str. If
+        # true then discard the bottom end of the list, if not then
+        # discard the top end.
+        if $str le @dict[$mid].substr(0, $str.chars).lc {
+            $end = $mid;
+        } else {
+            $start = $mid + 1;
+        }
+    }
+
+    # Found the lower index.
+    $lower = $start;
+
+    # Set $end to the end of list but keep $start at the lower index.
+    $end = @dict.end;
+
+    # Loop until we end up on the upper index of range.
+    while $start < $end {
+        # Divide the list into 2 parts. Adds 1 because we have to find
+        # the upper index in this part. `div' performs Interger
+        # division, output is floor'ed.
+        my Int $mid = (($start + $end) div 2) + 1;
+
+        # Check if $mid word is lt (less than) $str. If true then
+        # discard the bottom end of the list, if not then discard the
+        # top end.
+        if $str lt @dict[$mid].substr(0, $str.chars).lc {
+            $end = $mid - 1;
+        } else {
+            $start = $mid;
+        }
+    }
+
+    # Found the upper index.
+    $upper = $end;
+
+    with @dict[$lower..$upper] -> @list {
+        # Maybe the word doesn't exist in the list, in that case there
+        # will be a single element in @list. We return an empty list
+        # unless that single element starts with $str.
+        if @list.elems == 1 {
+            return () unless @list[0].starts-with($str);
+        }
+        return @list;
+    }
+}
diff --git a/lib/WordSearch.rakumod b/lib/WordSearch.rakumod
new file mode 100644
index 0000000..14d29b8
--- /dev/null
+++ b/lib/WordSearch.rakumod
@@ -0,0 +1,67 @@
+unit module WordSearch;
+
+use Neighbors;
+use RangeSearch;
+
+# word-search walks the given grid & tries to find words in the
+# dictionary. It walks in Depth-First manner (lookup Depth-First
+# search).
+sub word-search (
+    # @dict holds the dictionary. @puzzle holds the puzzle.
+    @dict, @puzzle,
+
+    # $y, $x is the position of the current cell, we have to follow
+    # this path. $str is the string we've looked up until now. If it's
+    # not passed then assume that we're starting at $y, $x and take
+    # @puzzle[$y][$x] as the string.
+    #
+    # $str should be passed in recursive calls, it's not required when
+    # $y, $x is the starting position.
+    Int $y, Int $x, $str? = @puzzle[$y][$x],
+
+    # @visited holds the positions that we've already visited.
+    @visited? is copy --> List
+) is export {
+    # If @visited was not passed then mark the given cell as visited
+    # because it's the cell we're starting at.
+    @visited[$y][$x] = True unless @visited;
+
+    # neighbor block loops over the neighbors of $y, $x.
+    neighbor: for neighbors(@puzzle, $y, $x).List -> $pos {
+        # Move on to next neighbor if we've already visited this one.
+        next neighbor if @visited[$pos[0]][$pos[1]];
+
+        # Mark this cell as visited but only until we search this
+        # path. When moving to next neighbor, mark it False.
+        @visited[$pos[0]][$pos[1]] = True;
+
+        # $word is the string that we're going to lookup in the
+        # dictionary.
+        my Str $word = $str ~ @puzzle[$pos[0]][$pos[1]];
+
+        # range-starts-with returns a list of all words in the
+        # dictionary that start with $word.
+        with range-starts-with(@dict, $word) -> @list {
+            if @list.elems > 0 {
+                # If $word exist in the dictionary then it should be
+                # the first element in the list.
+                take @list[0], @visited if @list[0] eq $word;
+
+                # Continue on this path because there are 1 or more
+                # elements in @list which means we could find a word.
+                word-search(
+                    # Don't pass the whole dictionary for next search.
+                    # Words that start with "ab" will always be a
+                    # subset of words that start with "a", so keeping
+                    # this in mind we pass the output of last
+                    # range-starts-with (@list).
+                    @list, @puzzle, $pos[0], $pos[1], $word, @visited
+                );
+            }
+        }
+
+        # We're done looking up this path, mark this cell as False &
+        # move on to another neighbor.
+        @visited[$pos[0]][$pos[1]] = False;
+    }
+}
diff --git a/octans.raku b/octans.raku
index 0c2e3fa..e65613e 100755
--- a/octans.raku
+++ b/octans.raku
@@ -1,87 +1,34 @@
 #!/usr/bin/env raku
 
 use v6.d;
-use WWW;
+use lib 'lib';
+use Puzzle;
+use WordSearch;
 
 unit sub MAIN (
-    Str $url, #= url for Algot's crossword
+    Str $url?, #= url for Algot's crossword
     Str :$dict = "/usr/share/dict/words", #= dictionary file
     Bool :v($verbose), #= increase verbosity
 );
 
-# @directions is holding a list of directions we can move in. It's
-# used later for neighbors subroutine.
-my List @directions[4] = (
-    # $y, $x
-    ( +1, +0 ), # bottom
-    ( -1, +0 ), # top
-    ( +0, +1 ), # left
-    ( +0, -1 ), # right
-);
-
-# This code is just for testing purpose. The code below that is
-# getting the puzzle & parsing it will set @puzzle & @gray-squares
-# like this:
-
-# We can call @puzzle[$y][$x] to get the character. $y stands for
-# column & $x for row, so @puzzle[0][3] will return `k' for this
-# sample @puzzle:
-
-# my List @puzzle = (
-#     <n a t k>,
-#     <i m e c>,
-#     <a r d e>,
-#     <t e c h>
-# );
-
-# my List @gray-squares = (3, 0), (2, 0); # $y, $x
-
-# @puzzle will hold the puzzle grid.
-my @puzzle;
-
-# @gray-squares will hold the position of gray squares. Algot marks
-# them with an asterisk ("*") after the character.
-my @gray-squares;
-
-# $toot_url will hold the url that we'll call to get the toot data.
-my Str $toot_url;
-
-# User can pass 2 types of links, either it will be the one when they
-# view it from their local instance or the one they get from Algot's
-# profile. We set $toot_url from it.
-if $url.match("web/statuses") -> $match {
-    $toot_url = $match.replace-with("api/v1/statuses");
-} else {
-    $toot_url = "https://mastodon.art/api/v1/statuses/" ~ $url.split("/")[*-1];
-}
-
-say "Fetching: $toot_url" if $verbose;
-
-# jget just get's the url & decodes the json. We access the
-# description field of 1st media attachment.
-if (jget($toot_url)<media_attachments>[0]<description> ~~
-
-    # This regex gets the puzzle in $match.
-    / [[(\w [\*]?) \s*] ** 4] ** 4 $/) -> $match {
-
-    # We have each character of the puzzle stored in $match. It's
-    # assumed that it'll be a 4x4 grid.
-    for 0 .. 3 -> $y {
-        for 0 .. 3 -> $x {
-            with $match[0][($y * 4) + $x].Str.lc -> $char {
-
-                # If it ends with an asterisk then we push the
-                # position to @gray-squares.
-                if $char.ends-with("*") {
-                    @puzzle[$y][$x] = $char.comb[0];
-                    push @gray-squares, [$y, $x];
-                } else {
-                    @puzzle[$y][$x] = $char;
-                }
-            }
-        }
-    }
-}
+# @dict holds the sorted dictionary. Only consider words >= 7 chars.
+my Str @dict = $dict.IO.lines.grep(*.chars >= 7);
+
+# @puzzle holds the puzzle.
+#
+# @gray-squares holds the list of indexes of valid starting positions
+# in the puzzle.
+my (@puzzle, @gray-squares);
+@puzzle = [
+    [<n a t k>],
+    [<i m e c>],
+    [<a r d e>],
+    [<t e c h>]
+];
+@gray-squares = [3, 0], [2, 0]; # $y, $x
+
+# Get the puzzle from $url if it's passed.
+get-puzzle($url, @puzzle, @gray-squares) with $url;
 
 if $verbose {
     say "Gray squares: ", @gray-squares;
@@ -89,144 +36,41 @@ if $verbose {
     "    $_".say for @puzzle;
 }
 
-# This for block loops over every word in the dictionary & searches
-# the puzzle grid for it's presence.
-word: for $dict.IO.lines -> $word {
-    # We don't want words whose length is less than 7.
-    next word unless $word.chars >= 7;
-
-    # start-pos block loops over each starting position. In normal
-    # case every position could be the start position but for Algot's
-    # puzzle they're limited to a few blocks.
-    start-pos: for @gray-squares -> $pos {
-
-        # If the dictionary word doesn't start with the starting
-        # position character then move on to the next start position.
-        next start-pos unless $word.starts-with(
-            @puzzle[$pos[0]][$pos[1]]
-        );
-
-        # Check if each letter of word is present in puzzle grid.
-        next word unless $word.comb ⊆ @puzzle[*;*];
-
-        # Print the word if the search is successful.
-        say $word if word-search(@puzzle, $pos[0], $pos[1], $word);
-    }
-}
-
-# word-search performs a Depth-First search on @puzzle. word-search
-# matches the word character by character.
-sub word-search (
-    @puzzle, Int $y, Int $x,
-
-    # $count will keep the count of character's of $word present in
-    # the puzzle.
-    Str $word, Int $count = 1,
-    @visited? is copy
-     --> Bool
-) {
-    # If the number of character's we've found is equal to the length
-    # of $word then return True because we've found the whole word.
-    return True if $count == $word.chars;
-
-    # For each neighbor, we perform a Depth-First search to find the
-    # word.
-    neighbor: for neighbors(@puzzle, $y, $x).List -> $pos {
-
-        # Move on to next neighbor if we've already visited this one.
-        # This is because we cannot reuse a grid.
-        next neighbor if @visited[$pos[0]][$pos[1]];
-
-        if @puzzle[$pos[0]][$pos[1]] eq $word.comb[$count] {
-
-            # This explains why we have to mark this position as False
-            # if the search fails:
-            #
-            # Here we're marking this position as True. This approach
-            # might cause us to miss possible solutions. If the puzzle
-            # is like so:
-            #
-            # a b e
-            # c a f
-            #
-            # And the word we're looking for is "cabefa". Then let's
-            # say that we go through the other 'a' first (bottom-mid
-            # 'a') & at this point it would be marked as True but the
-            # search would fail (correctly so).
-            #
-            # And after that failure we move to next neighbor which is
-            # top-left 'a'. The search goes on until we reach 'f' &
-            # get the list of f's neighbors which would return 'e' &
-            # bottom-mid 'a'. Now 'e' would be discarded because it
-            # was marked as visited but 'a' also has been marked as
-            # visited & it too would be discarded.
-            #
-            # This would cause us to miss solutions. So we just make
-            # it False again if the word wasn't found with this
-            # neighbor. After making it False, we move on to the next
-            # neighbor.
-
-            @visited[$pos[0]][$pos[1]] = True;
-
-            # Call word-search recursively & increment $count as we
-            # find each character. If the search was successful then
-            # return True.
-            if word-search(
-                @puzzle, $pos[0], $pos[1],
-                $word, $count + 1,
-                @visited
-            ) {
-                return True;
-            } else {
-                # Mark this as not visited if the search was
-                # unsuccessful and move on to next neighbor.
-                @visited[$pos[0]][$pos[1]] = False;
-                next neighbor;
-            }
-        }
-    }
-
-    # return False if no neighbor matches the character.
-    return False;
-}
-
-# neighbors returns the neighbors of given index. Neighbors are cached
-# in @neighbors array. This way we don't have to compute them
-# everytime neighbors subroutine is called for the same position.
-sub neighbors (
-    @puzzle, Int $y, Int $x --> List
-) {
-    state Array @neighbors;
-
-    if @puzzle[$y][$x] {
-
-        # If we've already computed the neighbors then no need to do
-        # it again.
-        unless @neighbors[$y][$x] {
-            my Int $pos-x;
-            my Int $pos-y;
-
-            # Starting from the intital position of $y, $x we move to
-            # each direction according to the values specified in
-            # @directions array. In this case we're just trying to
-            # move in 4 directions (top, bottom, left & right).
-            DIRECTION: for @directions -> $direction {
-                $pos-y = $y + $direction[0];
-                $pos-x = $x + $direction[1];
-
-                # If movement in this direction is out of puzzle grid
-                # boundary then move on to next direction.
-                next DIRECTION unless @puzzle[$pos-y][$pos-x];
-
-                # If neighbors exist in this direction then add them
-                # to @neighbors[$y][$x] array.
-                push @neighbors[$y][$x], [$pos-y, $pos-x];
+# After the solution is found, the path is printed with these fancy chars.
+my %𝒻𝒶𝓃𝒸𝓎-𝒸𝒽𝒶𝓇𝓈 = <a a̶ b b̶ c c̶ d d̶ e e̶ f f̶ g g̶ h h̶ i i̶ j j̶ k k̶ l l̶ m m̶
+                     n n̶ o o̶ p p̶ q q̶ r r̶ s s̶ t t̶ u u̶ v v̶ w w̶ x x̶ y y̶ z z̶>;
+
+# start-pos block loops over each starting position.
+start-pos: for @gray-squares -> $pos {
+    my DateTime $initial = DateTime.now;
+
+    # gather all the words that word-search finds starting from $pos.
+    for gather word-search(
+        @dict, @puzzle, $pos[0], $pos[1],
+    ) -> (
+        # word-search returns the word along with @visited which holds
+        # the list of all grids that were visited when the word was
+        # found.
+        $word, @visited
+    ) {
+        # Print the word, along with the time taken (if $verbose).
+        say ($verbose ??
+             "\n" ~ $word ~ " [" ~ DateTime.now - $initial ~ "𝑠]" !!
+             $word);
+
+        # Print the puzzle, highlighting the path.
+        if $verbose {
+            for ^@puzzle.elems -> $y {
+                print " " x 3;
+                for ^@puzzle[$y].elems -> $x {
+                    print " ", (
+                        @visited[$y][$x] ??
+                        (%𝒻𝒶𝓃𝒸𝓎-𝒸𝒽𝒶𝓇𝓈{@puzzle[$y][$x]} // @puzzle[$y][$x]) !!
+                        @puzzle[$y][$x]
+                    );
+                }
+                print "\n";
             }
         }
-    } else {
-        # If it's out of boundary then return no neighbor.
-        @neighbors[$y][$x] = [];
     }
-
-    return @neighbors[$y][$x];
 }