Attempt to explain better why delImplIdx is the way it is. Maybe this can (#15108)

avoid future implementation mischief. (Maybe not. Sometimes, general distrust of theory leads people to distrust simple reasoning over times from CPUs trying as hard as possible to mask DRAM latency via pre-fetch.)
author: c-blake <c-blake@users.noreply.github.com> 2020-07-29 05:19:02 -0400
committer: GitHub <noreply@github.com> 2020-07-29 11:19:02 +0200
commit: 196e747df150ace81e7f4c01253128d4a89f03c7 (patch)
tree: 76f3959a54dab3f8b83173e6502ecd02760febea /lib/pure/collections
parent: a3a87cdb298d7166846133bca5d56f055717474c (diff)
download: Nim-196e747df150ace81e7f4c01253128d4a89f03c7.tar.gz
1 files changed, 30 insertions, 0 deletions
diff --git a/lib/pure/collections/tableimpl.nim b/lib/pure/collections/tableimpl.nim
index ec2806200..df9cb7517 100644
--- a/lib/pure/collections/tableimpl.nim
+++ b/lib/pure/collections/tableimpl.nim
@@ -78,6 +78,36 @@ template hasKeyOrPutImpl(enlarge) {.dirty.} =
     maybeRehashPutImpl(enlarge)
   else: result = true
 
+# delImplIdx is KnuthV3 Algo6.4R adapted to i=i+1 (from i=i-1) which has come to
+# be called "back shift delete".  It shifts elements in the collision cluster of
+# a victim backward to make things as-if the victim were never inserted in the
+# first place.  This is desirable to keep things "ageless" after many deletes.
+# It is trickier than you might guess since initial probe (aka "home") locations
+# of keys in a cluster may collide and since table addresses wrap around.
+#
+# A before-after diagram might look like ('.' means empty):
+#   slot:   0   1   2   3   4   5   6   7
+# before(1)
+#   hash1:  6   7   .   3   .   5   5   6  ; Really hash() and msk
+#   data1:  E   F   .   A   .   B   C   D  ; About to delete C @index 6
+# after(2)
+#   hash2:  7   .   .   3   .   5   6   6  ; Really hash() and msk
+#   data2:  F   .   .   A   .   B   D   E  ; After deletion of C
+#
+# This lowers total search depth over the whole table from 1+1+2+2+2+2=10 to 7.
+# Had the victim been B@5, C would need back shifting to slot 5.  Total depth is
+# always lowered by at least 1, e.g. victim A@3.  This is all quite fast when
+# empty slots are frequent (also needed to keep insert/miss searches fast) and
+# hash() is either fast or avoided (via `.hcode`).  It need not compare keys.
+#
+# delImplIdx realizes the above transformation, but only works for dense Linear
+# Probing, nextTry(h)=h+1.  This is not an important limitation since that's the
+# fastest sequence on any CPU made since the 1980s. { Performance analysis often
+# overweights "key cmp" neglecting cache behavior, giving bad ideas how big/slow
+# tables behave (when perf matters most!).  Comparing hcode first means usually
+# only 1 key cmp is needed for *any* seq.  Timing only predictable activity,
+# small tables, and/or integer keys often perpetuates such bad ideas. }
+
 template delImplIdx(t, i, makeEmpty, cellEmpty, cellHash) =
   let msk = maxHash(t)
   if i >= 0:
author	c-blake <c-blake@users.noreply.github.com>	2020-07-29 05:19:02 -0400
committer	GitHub <noreply@github.com>	2020-07-29 11:19:02 +0200
commit	196e747df150ace81e7f4c01253128d4a89f03c7 (patch)
tree	76f3959a54dab3f8b83173e6502ecd02760febea /lib/pure/collections
parent	a3a87cdb298d7166846133bca5d56f055717474c (diff)
download	Nim-196e747df150ace81e7f4c01253128d4a89f03c7.tar.gz