about summary refs log tree commit diff stats
path: root/033address.cc
diff options
context:
space:
mode:
authorKartik K. Agaram <vc@akkartik.com>2016-04-24 11:54:30 -0700
committerKartik K. Agaram <vc@akkartik.com>2016-04-24 11:54:30 -0700
commitb0bf5321de2ba32f3b92c2faf6b7b68a06b6b432 (patch)
tree9e0473bddd0d9a2d4aec7ac58dec8d85c2833a5d /033address.cc
parent15936c91a9f8023dc868a021029f84b45aa50176 (diff)
downloadmu-b0bf5321de2ba32f3b92c2faf6b7b68a06b6b432.tar.gz
2864 - replace all address:shared with just address
Now that we no longer have non-shared addresses, we can just always
track refcounts for all addresses.

Phew!
Diffstat (limited to '033address.cc')
-rw-r--r--033address.cc810
1 files changed, 723 insertions, 87 deletions
diff --git a/033address.cc b/033address.cc
index 1b956fe0..29beec1e 100644
--- a/033address.cc
+++ b/033address.cc
@@ -1,14 +1,400 @@
-//: Instructions can read from addresses pointing at other locations using the
-//: 'lookup' property.
+//: Addresses help us spend less time copying data around.
+
+//: So far we've been operating on primitives like numbers and characters, and
+//: we've started combining these primitives together into larger logical
+//: units (containers or arrays) that may contain many different primitives at
+//: once. Containers and arrays can grow quite large in complex programs, and
+//: we'd like some way to efficiently share them between recipes without
+//: constantly having to make copies. Right now 'next-ingredient' and 'reply'
+//: copy data across recipe boundaries. To avoid copying large quantities of
+//: data around, we'll use *addresses*. An address is a bookmark to some
+//: arbitrary quantity of data (the *payload*). It's a primitive, so it's as
+//: efficient to copy as a number. To read or modify the payload 'pointed to'
+//: by an address, we'll perform a *lookup*.
+//:
+//: The notion of 'lookup' isn't an instruction like 'add' or 'subtract'.
+//: Instead it's an operation that can be performed when reading any of the
+//: ingredients of an instruction, and when writing to any of the products.
+//: Modern computers provide efficient support for addresses and lookups,
+//: making this a realistic feature.
+//:
+//: To recap: an address is a bookmark to some potentially large payload, and
+//: you can replace any ingredient or product with a lookup to an address of
+//: the appropriate type. But how do we get addresses to begin with? That
+//: requires a little more explanation. Once we introduce the notion of
+//: bookmarks to data, we have to think about the life cycle of a piece of
+//: data and its bookmark. Otherwise several bad outcomes can result (and
+//: indeed *have* resulted in past languages like C):
+//:
+//:   a) You can run out of memory if you don't have a way to reclaim
+//:   data.
+//:   b) If you allow data to be reclaimed, you have to be careful not to
+//:   leave any stale addresses pointing at it. Otherwise your program might
+//:   try to lookup such an address and find something unexpected. Such
+//:   problems can be very hard to track down, and they can also be exploited
+//:   to break into your computer over the network, etc.
+//:
+//: To avoid these problems, we introduce a *reference count* or refcount. The
+//: life cycle of a bit of data accessed through addresses looks like this.
+//:
+//:    We create space in computer memory for it using the 'new' instruction.
+//:    The 'new' instruction takes a type as an ingredient, allocates
+//:    sufficient space to hold that type, and returns an address (bookmark)
+//:    to the allocated space.
+//:
+//:      x:address:number <- new number:type
+//:
+//:                     +------------+
+//:          x -------> |  number    |
+//:                     +------------+
+//:
+//:    That isn't entirely accurate. Under the hood, 'new' allocates an extra
+//:    number -- the refcount:
+//:
+//:                     +------------+------------+
+//:          x -------> | refcount   |  number    |
+//:                     +------------+------------+
+//:
+//:    This probably seems like a waste of space. In practice it isn't worth
+//:    allocating individual numbers and our payload will tend to be larger,
+//:    so the picture would look more like this (zooming out a bit):
+//:
+//:                         +-------------------------+
+//:                     +---+                         |
+//:          x -------> | r |                         |
+//:                     +---+        DATA             |
+//:                         |                         |
+//:                         |                         |
+//:                         +-------------------------+
+//:
+//:    (Here 'r' denotes the refcount. It occupies a tiny amount of space
+//:    compared to the payload.)
+//:
+//:    Anyways, back to our example where the data is just a single number.
+//:    After the call to 'new', Mu's map of memory looks like this:
+//:
+//:                     +---+------------+
+//:          x -------> | 1 |  number    |
+//:                     +---+------------+
+//:
+//:    The refcount of 1 here indicates that this number has one bookmark
+//:    outstanding. If you then make a copy of x, the refcount increments:
+//:
+//:      y:address:number <- copy x
+//:
+//:          x ---+     +---+------------+
+//:               +---> | 2 |  number    |
+//:          y ---+     +---+------------+
+//:
+//:    Whether you access the payload through x or y, Mu knows how many
+//:    bookmarks are outstanding to it. When you change x or y, the refcount
+//:    transparently decrements:
+//:
+//:      x <- copy 0  # an address is just a number, you can always write 0 to it
+//:
+//:                     +---+------------+
+//:          y -------> | 1 |  number    |
+//:                     +---+------------+
+//:
+//:    The final flourish is what happens when the refcount goes down to 0: Mu
+//:    reclaims the space occupied by both refcount and payload in memory, and
+//:    they're ready to be reused by later calls to 'new'.
+//:
+//:      y <- copy 0
+//:
+//:                     +---+------------+
+//:                     | 0 |  XXXXXXX   |
+//:                     +---+------------+
+//:
+//: Using refcounts fixes both our problems a) and b) above: you can use
+//: memory for many different purposes as many times as you want without
+//: running out of memory, and you don't have to worry about ever leaving a
+//: dangling bookmark when you reclaim memory.
+//:
+//: Ok, let's rewind the clock back to this situation where we have an
+//: address:
+//:
+//:                     +---+------------+
+//:          x -------> | 1 |  number    |
+//:                     +---+------------+
+//:
+//: Once you have an address you can read or modify its payload by performing
+//: a lookup:
+//:
+//:     x/lookup <- copy 34
+//:
+//: or more concisely:
+//:
+//:     *x <- copy 34
+//:
+//: This modifies not x, but the payload x points to:
+//:
+//:                     +---+------------+
+//:          x -------> | 1 |         34 |
+//:                     +---+------------+
+//:
+//: You can also read from the payload in instructions like this:
+//:
+//:     z:number <- add *x, 1
+//:
+//: After this instruction runs the value of z will be 35.
+//:
+//: The rest of this (long) layer is divided up into 4 sections:
+//:   the implementation of the 'new' instruction
+//:   how instructions lookup addresses
+//:   how instructions update refcounts when modifying address variables
+//:   how instructions abandon and reclaim memory when refcounts drop to 0
+
+//:: the 'new' instruction allocates unique memory including a refcount
+//: todo: give 'new' a custodian ingredient. Following malloc/free is a temporary hack.
+
+:(scenario new)
+# call 'new' two times with identical types without modifying the results; you
+# should get back different results
+def main [
+  1:address:number/raw <- new number:type
+  2:address:number/raw <- new number:type
+  3:boolean/raw <- equal 1:address:number/raw, 2:address:number/raw
+]
++mem: storing 0 in location 3
+
+//: 'new' takes a weird 'type' as its first ingredient; don't error on it
+:(before "End Mu Types Initialization")
+put(Type_ordinal, "type", 0);
+:(code)
+bool is_mu_type_literal(reagent r) {
+  return is_literal(r) && r.type && r.type->name == "type";
+}
+
+:(before "End Primitive Recipe Declarations")
+NEW,
+:(before "End Primitive Recipe Numbers")
+put(Recipe_ordinal, "new", NEW);
+:(before "End Primitive Recipe Checks")
+case NEW: {
+  const recipe& caller = get(Recipe, r);
+  if (inst.ingredients.empty() || SIZE(inst.ingredients) > 2) {
+    raise << maybe(caller.name) << "'new' requires one or two ingredients, but got " << to_original_string(inst) << '\n' << end();
+    break;
+  }
+  // End NEW Check Special-cases
+  reagent type = inst.ingredients.at(0);
+  if (!is_mu_type_literal(type)) {
+    raise << maybe(caller.name) << "first ingredient of 'new' should be a type, but got " << type.original_string << '\n' << end();
+    break;
+  }
+  if (inst.products.empty()) {
+    raise << maybe(caller.name) << "result of 'new' should never be ignored\n" << end();
+    break;
+  }
+  if (!product_of_new_is_valid(inst)) {
+    raise << maybe(caller.name) << "product of 'new' has incorrect type: " << to_original_string(inst) << '\n' << end();
+    break;
+  }
+  break;
+}
+:(code)
+bool product_of_new_is_valid(const instruction& inst) {
+  reagent product = inst.products.at(0);
+  canonize_type(product);
+  if (!product.type || product.type->value != get(Type_ordinal, "address"))
+    return false;
+  drop_from_type(product, "address");
+  if (SIZE(inst.ingredients) > 1) {
+    // array allocation
+    if (!product.type || product.type->value != get(Type_ordinal, "array")) return false;
+    drop_from_type(product, "array");
+  }
+  reagent expected_product("x:"+inst.ingredients.at(0).name);
+  // End Post-processing(expected_product) When Checking 'new'
+  return types_strictly_match(product, expected_product);
+}
+
+//: To implement 'new', a Mu transform turns all 'new' instructions into
+//: 'allocate' instructions that precompute the amount of memory they want to
+//: allocate.
+
+//: Ensure that we never call 'allocate' directly, and that there's no 'new'
+//: instructions left after the transforms have run.
+:(before "End Primitive Recipe Checks")
+case ALLOCATE: {
+  raise << "never call 'allocate' directly'; always use 'new'\n" << end();
+  break;
+}
+:(before "End Primitive Recipe Implementations")
+case NEW: {
+  raise << "no implementation for 'new'; why wasn't it translated to 'allocate'? Please save a copy of your program and send it to Kartik.\n" << end();
+  break;
+}
+
+:(after "Transform.push_back(check_instruction)")  // check_instruction will guard against direct 'allocate' instructions below
+Transform.push_back(transform_new_to_allocate);  // idempotent
+
+:(code)
+void transform_new_to_allocate(const recipe_ordinal r) {
+  trace(9991, "transform") << "--- convert 'new' to 'allocate' for recipe " << get(Recipe, r).name << end();
+  for (int i = 0; i < SIZE(get(Recipe, r).steps); ++i) {
+    instruction& inst = get(Recipe, r).steps.at(i);
+    // Convert 'new' To 'allocate'
+    if (inst.name == "new") {
+      inst.operation = ALLOCATE;
+      string_tree* type_name = new string_tree(inst.ingredients.at(0).name);
+      // End Post-processing(type_name) When Converting 'new'
+      type_tree* type = new_type_tree(type_name);
+      inst.ingredients.at(0).set_value(size_of(type));
+      trace(9992, "new") << "size of " << to_string(type_name) << " is " << inst.ingredients.at(0).value << end();
+      delete type;
+      delete type_name;
+    }
+  }
+}
+
+//: implement 'allocate' based on size
+
+:(before "End Globals")
+const int Reserved_for_tests = 1000;
+int Memory_allocated_until = Reserved_for_tests;
+int Initial_memory_per_routine = 100000;
+:(before "End Setup")
+Memory_allocated_until = Reserved_for_tests;
+Initial_memory_per_routine = 100000;
+:(before "End routine Fields")
+int alloc, alloc_max;
+:(before "End routine Constructor")
+alloc = Memory_allocated_until;
+Memory_allocated_until += Initial_memory_per_routine;
+alloc_max = Memory_allocated_until;
+trace(9999, "new") << "routine allocated memory from " << alloc << " to " << alloc_max << end();
+
+:(before "End Primitive Recipe Declarations")
+ALLOCATE,
+:(before "End Primitive Recipe Numbers")
+put(Recipe_ordinal, "allocate", ALLOCATE);
+:(before "End Primitive Recipe Implementations")
+case ALLOCATE: {
+  // compute the space we need
+  int size = ingredients.at(0).at(0);
+  if (SIZE(ingredients) > 1) {
+    // array allocation
+    trace(9999, "mem") << "array size is " << ingredients.at(1).at(0) << end();
+    size = /*space for length*/1 + size*ingredients.at(1).at(0);
+  }
+  // include space for refcount
+  size++;
+  trace(9999, "mem") << "allocating size " << size << end();
+//?   Total_alloc += size;
+//?   Num_alloc++;
+  // compute the region of memory to return
+  // really crappy at the moment
+  ensure_space(size);
+  const int result = Current_routine->alloc;
+  trace(9999, "mem") << "new alloc: " << result << end();
+  // save result
+  products.resize(1);
+  products.at(0).push_back(result);
+  // initialize allocated space
+  for (int address = result; address < result+size; ++address)
+    put(Memory, address, 0);
+  if (SIZE(current_instruction().ingredients) > 1) {
+    // initialize array length
+    trace(9999, "mem") << "storing " << ingredients.at(1).at(0) << " in location " << result+/*skip refcount*/1 << end();
+    put(Memory, result+/*skip refcount*/1, ingredients.at(1).at(0));
+  }
+  Current_routine->alloc += size;
+  // no support yet for reclaiming memory between routines
+  assert(Current_routine->alloc <= Current_routine->alloc_max);
+  break;
+}
+
+//: statistics for debugging
+//? :(before "End Globals")
+//? int Total_alloc = 0;
+//? int Num_alloc = 0;
+//? int Total_free = 0;
+//? int Num_free = 0;
+//? :(before "End Setup")
+//? Total_alloc = Num_alloc = Total_free = Num_free = 0;
+//? :(before "End Teardown")
+//? cerr << Total_alloc << "/" << Num_alloc
+//?      << " vs " << Total_free << "/" << Num_free << '\n';
+//? cerr << SIZE(Memory) << '\n';
+
+:(code)
+void ensure_space(int size) {
+  if (size > Initial_memory_per_routine) {
+    tb_shutdown();
+    cerr << "can't allocate " << size << " locations, that's too much compared to " << Initial_memory_per_routine << ".\n";
+    exit(0);
+  }
+  if (Current_routine->alloc + size > Current_routine->alloc_max) {
+    // waste the remaining space and create a new chunk
+    Current_routine->alloc = Memory_allocated_until;
+    Memory_allocated_until += Initial_memory_per_routine;
+    Current_routine->alloc_max = Memory_allocated_until;
+    trace(9999, "new") << "routine allocated memory from " << Current_routine->alloc << " to " << Current_routine->alloc_max << end();
+  }
+}
+
+:(scenario new_initializes)
+% Memory_allocated_until = 10;
+% put(Memory, Memory_allocated_until, 1);
+def main [
+  1:address:number <- new number:type
+  2:number <- copy 1:address:number/lookup
+]
++mem: storing 0 in location 2
+
+:(scenario new_error)
+% Hide_errors = true;
+def main [
+  1:number/raw <- new number:type
+]
++error: main: product of 'new' has incorrect type: 1:number/raw <- new number:type
+
+:(scenario new_array)
+def main [
+  1:address:array:number/raw <- new number:type, 5
+  2:address:number/raw <- new number:type
+  3:number/raw <- subtract 2:address:number/raw, 1:address:array:number/raw
+]
++run: {1: ("address" "array" "number"), "raw": ()} <- new {number: "type"}, {5: "literal"}
++mem: array size is 5
+# don't forget the extra location for array size, and the second extra location for the refcount
++mem: storing 7 in location 3
+
+:(scenario new_empty_array)
+def main [
+  1:address:array:number/raw <- new number:type, 0
+  2:address:number/raw <- new number:type
+  3:number/raw <- subtract 2:address:number/raw, 1:address:array:number/raw
+]
++run: {1: ("address" "array" "number"), "raw": ()} <- new {number: "type"}, {0: "literal"}
++mem: array size is 0
+# one location for array size, and one for the refcount
++mem: storing 2 in location 3
+
+//: If a routine runs out of its initial allocation, it should allocate more.
+:(scenario new_overflow)
+% Initial_memory_per_routine = 3;  // barely enough room for point allocation below
+def main [
+  1:address:number/raw <- new number:type
+  2:address:point/raw <- new point:type  # not enough room in initial page
+]
++new: routine allocated memory from 1000 to 1003
++new: routine allocated memory from 1003 to 1006
+
+//:: /lookup can go from an address to the payload it points at, skipping the refcount
+//: the tests in this section use unsafe operations so as to stay decoupled from 'new'
 
 :(scenario copy_indirect)
 def main [
-  1:address:number <- copy 2/unsafe
-  2:number <- copy 34
+  1:address:number <- copy 10/unsafe
+  11:number <- copy 34
   # This loads location 1 as an address and looks up *that* location.
-  3:number <- copy 1:address:number/lookup
+  2:number <- copy 1:address:number/lookup
 ]
-+mem: storing 34 in location 3
+# 1 contains 10. Skip refcount and lookup location 11.
++mem: storing 34 in location 2
 
 :(before "End Preprocess read_memory(x)")
 canonize(x);
@@ -17,10 +403,10 @@ canonize(x);
 //: 'lookup' property
 :(scenario store_indirect)
 def main [
-  1:address:number <- copy 2/unsafe
+  1:address:number <- copy 10/unsafe
   1:address:number/lookup <- copy 34
 ]
-+mem: storing 34 in location 2
++mem: storing 34 in location 11
 
 :(before "End Preprocess write_memory(x)")
 canonize(x);
@@ -60,10 +446,31 @@ void lookup_memory(reagent& x) {
   trace(9999, "mem") << "location " << x.value << " is " << no_scientific(get_or_insert(Memory, x.value)) << end();
   x.set_value(get_or_insert(Memory, x.value));
   drop_from_type(x, "address");
-  // End Drop Address In lookup_memory(x)
+  if (x.value != 0) {
+    trace(9999, "mem") << "skipping refcount at " << x.value << end();
+    x.set_value(x.value+1);  // skip refcount
+  }
   drop_one_lookup(x);
 }
 
+void test_lookup_address_skips_refcount() {
+  reagent x("*x:address:number");
+  x.set_value(34);  // unsafe
+  put(Memory, 34, 1000);
+  lookup_memory(x);
+  CHECK_TRACE_CONTENTS("mem: skipping refcount at 1000");
+  CHECK_EQ(x.value, 1001);
+}
+
+void test_lookup_zero_address_does_not_skip_refcount() {
+  reagent x("*x:address:number");
+  x.set_value(34);  // unsafe
+  put(Memory, 34, 0);
+  lookup_memory(x);
+  CHECK_TRACE_DOESNT_CONTAIN("mem: skipping refcount at 0");
+  CHECK_EQ(x.value, 0);
+}
+
 :(after "bool types_strictly_match(reagent to, reagent from)")
   if (!canonize_type(to)) return false;
   if (!canonize_type(from)) return false;
@@ -98,7 +505,6 @@ bool canonize_type(reagent& r) {
       return false;
     }
     drop_from_type(r, "address");
-    // End Drop Address In canonize_type(r)
     drop_one_lookup(r);
   }
   return true;
@@ -125,34 +531,40 @@ void drop_one_lookup(reagent& r) {
   assert(false);
 }
 
-//:: 'get' can read from container address
+//: Tedious fixup to support addresses in container/array instructions of previous layers.
+//: Most instructions don't require fixup if they use the 'ingredients' and
+//: 'products' variables in run_current_routine().
+
 :(scenario get_indirect)
 def main [
-  1:number <- copy 2
-  2:number <- copy 34
-  3:number <- copy 35
-  4:number <- get 1:address:point/lookup, 0:offset
+  1:address:point <- copy 10/unsafe
+  # 10 reserved for refcount
+  11:number <- copy 34
+  12:number <- copy 35
+  2:number <- get 1:address:point/lookup, 0:offset
 ]
-+mem: storing 34 in location 4
++mem: storing 34 in location 2
 
 :(scenario get_indirect2)
 def main [
-  1:number <- copy 2
-  2:number <- copy 34
-  3:number <- copy 35
-  4:address:number <- copy 5/unsafe
-  *4:address:number <- get 1:address:point/lookup, 0:offset
+  1:address:point <- copy 10/unsafe
+  # 10 reserved for refcount
+  11:number <- copy 34
+  12:number <- copy 35
+  2:address:number <- copy 20/unsafe
+  2:address:number/lookup <- get 1:address:point/lookup, 0:offset
 ]
-+mem: storing 34 in location 5
++mem: storing 34 in location 21
 
 :(scenario include_nonlookup_properties)
 def main [
-  1:number <- copy 2
-  2:number <- copy 34
-  3:number <- copy 35
-  4:number <- get 1:address:point/lookup/foo, 0:offset
+  1:address:point <- copy 10/unsafe
+  # 10 reserved for refcount
+  11:number <- copy 34
+  12:number <- copy 35
+  2:number <- get 1:address:point/lookup/foo, 0:offset
 ]
-+mem: storing 34 in location 4
++mem: storing 34 in location 2
 
 :(after "Update GET base in Check")
 if (!canonize_type(base)) break;
@@ -162,14 +574,14 @@ if (!canonize_type(product)) break;
 canonize(base);
 
 :(scenario put_indirect)
-# 'put' can read from container address
 def main [
-  1:number <- copy 2
-  2:number <- copy 34
-  3:number <- copy 35
+  1:address:point <- copy 10/unsafe
+  # 10 reserved for refcount
+  11:number <- copy 34
+  12:number <- copy 35
   1:address:point/lookup <- put 1:address:point/lookup, 0:offset, 36
 ]
-+mem: storing 36 in location 2
++mem: storing 36 in location 11
 
 :(after "Update PUT base in Check")
 if (!canonize_type(base)) break;
@@ -180,17 +592,18 @@ canonize(base);
 
 :(scenario copy_array_indirect)
 def main [
-  1:array:number:3 <- create-array
-  2:number <- copy 14
-  3:number <- copy 15
-  4:number <- copy 16
-  5:address:array:number <- copy 1/unsafe
-  6:array:number <- copy *5:address:array:number
+  # 10 reserved for refcount
+  11:array:number:3 <- create-array
+  12:number <- copy 14
+  13:number <- copy 15
+  14:number <- copy 16
+  1:address:array:number <- copy 10/unsafe
+  2:array:number <- copy 1:address:array:number/lookup
 ]
-+mem: storing 3 in location 6
-+mem: storing 14 in location 7
-+mem: storing 15 in location 8
-+mem: storing 16 in location 9
++mem: storing 3 in location 2
++mem: storing 14 in location 3
++mem: storing 15 in location 4
++mem: storing 16 in location 5
 
 :(before "Update CREATE_ARRAY product in Check")
 // 'create-array' does not support indirection. Static arrays are meant to be
@@ -203,14 +616,15 @@ assert(!has_property(product, "lookup"));
 
 :(scenario index_indirect)
 def main [
-  1:array:number:3 <- create-array
-  2:number <- copy 14
-  3:number <- copy 15
-  4:number <- copy 16
-  5:address:array:number <- copy 1/unsafe
-  6:number <- index 5:address:array:number/lookup, 1
+  # 10 reserved for refcount
+  11:array:number:3 <- create-array
+  12:number <- copy 14
+  13:number <- copy 15
+  14:number <- copy 16
+  1:address:array:number <- copy 10/unsafe
+  2:number <- index 1:address:array:number/lookup, 1
 ]
-+mem: storing 15 in location 6
++mem: storing 15 in location 2
 
 :(before "Update INDEX base in Check")
 if (!canonize_type(base)) break;
@@ -226,14 +640,15 @@ canonize(index);
 
 :(scenario put_index_indirect)
 def main [
-  1:array:number:3 <- create-array
-  2:number <- copy 14
-  3:number <- copy 15
-  4:number <- copy 16
-  5:address:array:number <- copy 1/unsafe
-  5:address:array:number/lookup <- put-index 5:address:array:number/lookup, 1, 34
+  # 10 reserved for refcount
+  11:array:number:3 <- create-array
+  12:number <- copy 14
+  13:number <- copy 15
+  14:number <- copy 16
+  1:address:array:number <- copy 10/unsafe
+  1:address:array:number/lookup <- put-index 1:address:array:number/lookup, 1, 34
 ]
-+mem: storing 34 in location 3
++mem: storing 34 in location 13
 
 :(scenario put_index_indirect_2)
 def main [
@@ -241,8 +656,9 @@ def main [
   2:number <- copy 14
   3:number <- copy 15
   4:number <- copy 16
-  5:address:number <- copy 6/unsafe
-  6:number <- copy 1
+  5:address:number <- copy 10/unsafe
+  # 10 reserved for refcount
+  11:number <- copy 1
   5:address:array:number/lookup <- put-index 1:array:number:3, 5:address:number/lookup, 34
 ]
 +mem: storing 34 in location 3
@@ -261,14 +677,15 @@ canonize(index);
 
 :(scenario length_indirect)
 def main [
-  1:array:number:3 <- create-array
-  2:number <- copy 14
-  3:number <- copy 15
-  4:number <- copy 16
-  5:address:array:number <- copy 1/unsafe
-  6:number <- length 5:address:array:number/lookup
+  # 10 reserved for refcount
+  11:array:number:3 <- create-array
+  12:number <- copy 14
+  13:number <- copy 15
+  14:number <- copy 16
+  1:address:array:number <- copy 10/unsafe
+  2:number <- length 1:address:array:number/lookup
 ]
-+mem: storing 3 in location 6
++mem: storing 3 in location 2
 
 :(before "Update LENGTH array in Check")
 if (!canonize_type(array)) break;
@@ -277,32 +694,35 @@ canonize(array);
 
 :(scenario maybe_convert_indirect)
 def main [
-  1:number-or-point <- merge 0/number, 34
-  10:address:number-or-point <- copy 1/unsafe
-  11:number, 12:boolean <- maybe-convert 10:address:number-or-point/lookup, i:variant
+  # 10 reserved for refcount
+  11:number-or-point <- merge 0/number, 34
+  1:address:number-or-point <- copy 10/unsafe
+  2:number, 3:boolean <- maybe-convert 1:address:number-or-point/lookup, i:variant
 ]
-+mem: storing 34 in location 11
-+mem: storing 1 in location 12
++mem: storing 34 in location 2
++mem: storing 1 in location 3
 
 :(scenario maybe_convert_indirect_2)
 def main [
-  1:number-or-point <- merge 0/number, 34
-  10:address:number-or-point <- copy 1/unsafe
-  11:address:number <- copy 20/unsafe
-  11:address:number/lookup, 12:boolean <- maybe-convert 10:address:number-or-point/lookup, i:variant
+  # 10 reserved for refcount
+  11:number-or-point <- merge 0/number, 34
+  1:address:number-or-point <- copy 10/unsafe
+  2:address:number <- copy 20/unsafe
+  2:address:number/lookup, 3:boolean <- maybe-convert 1:address:number-or-point/lookup, i:variant
 ]
-+mem: storing 34 in location 20
-+mem: storing 1 in location 12
++mem: storing 34 in location 21
++mem: storing 1 in location 3
 
 :(scenario maybe_convert_indirect_3)
 def main [
-  1:number-or-point <- merge 0/number, 34
-  10:address:number-or-point <- copy 1/unsafe
-  12:address:boolean <- copy 20/unsafe
-  11:number, 12:address:boolean/lookup <- maybe-convert 10:address:number-or-point/lookup, i:variant
+  # 10 reserved for refcount
+  11:number-or-point <- merge 0/number, 34
+  1:address:number-or-point <- copy 10/unsafe
+  2:address:boolean <- copy 20/unsafe
+  3:number, 2:address:boolean/lookup <- maybe-convert 1:address:number-or-point/lookup, i:variant
 ]
-+mem: storing 34 in location 11
-+mem: storing 1 in location 20
++mem: storing 34 in location 3
++mem: storing 1 in location 21
 
 :(before "Update MAYBE_CONVERT base in Check")
 if (!canonize_type(base)) break;
@@ -323,18 +743,20 @@ def main [
   1:address:number-or-point <- copy 10/unsafe
   1:address:number-or-point/lookup <- merge 0/number, 34
 ]
-+mem: storing 0 in location 10
-+mem: storing 34 in location 11
+# skip 10 for refcount
++mem: storing 0 in location 11
++mem: storing 34 in location 12
 
 :(before "Update size_mismatch Check for MERGE(x)
 canonize(x);
 
-//:: abbreviation for '/lookup': a prefix '*'
+//: abbreviation for '/lookup': a prefix '*'
 
 :(scenario lookup_abbreviation)
 def main [
-  1:address:number <- copy 2/unsafe
-  2:number <- copy 34
+  1:address:number <- copy 10/unsafe
+  # 10 reserved for refcount
+  11:number <- copy 34
   3:number <- copy *1:address:number
 ]
 +parse: ingredient: {1: ("address" "number"), "lookup": ()}
@@ -350,6 +772,220 @@ def main [
     raise << "illegal name " << original_string << '\n' << end();
 }
 
+//:: update refcounts when copying addresses
+
+:(scenario refcounts)
+def main [
+  1:address:number <- copy 1000/unsafe
+  2:address:number <- copy 1:address:number
+  1:address:number <- copy 0
+  2:address:number <- copy 0
+]
++run: {1: ("address" "number")} <- copy {1000: "literal", "unsafe": ()}
++mem: incrementing refcount of 1000: 0 -> 1
++run: {2: ("address" "number")} <- copy {1: ("address" "number")}
++mem: incrementing refcount of 1000: 1 -> 2
++run: {1: ("address" "number")} <- copy {0: "literal"}
++mem: decrementing refcount of 1000: 2 -> 1
++run: {2: ("address" "number")} <- copy {0: "literal"}
++mem: decrementing refcount of 1000: 1 -> 0
+# the /unsafe corrupts memory but fortunately we won't be running any more 'new' in this scenario
++mem: automatically abandoning 1000
+
+:(before "End write_memory(reagent x) Special-cases")
+if (x.type->value == get(Type_ordinal, "address")) {
+  // compute old address of x, as well as new address we want to write in
+  int old_address = get_or_insert(Memory, x.value);
+  assert(scalar(data));
+  int new_address = data.at(0);
+  // decrement refcount of old address
+  if (old_address) {
+    int old_refcount = get_or_insert(Memory, old_address);
+    trace(9999, "mem") << "decrementing refcount of " << old_address << ": " << old_refcount << " -> " << (old_refcount-1) << end();
+    put(Memory, old_address, old_refcount-1);
+  }
+  // perform the write
+  trace(9999, "mem") << "storing " << no_scientific(data.at(0)) << " in location " << x.value << end();
+  put(Memory, x.value, new_address);
+  // increment refcount of new address
+  if (new_address) {
+    int new_refcount = get_or_insert(Memory, new_address);
+    assert(new_refcount >= 0);  // == 0 only when new_address == old_address
+    trace(9999, "mem") << "incrementing refcount of " << new_address << ": " << new_refcount << " -> " << (new_refcount+1) << end();
+    put(Memory, new_address, new_refcount+1);
+  }
+  // abandon old address if necessary
+  // do this after all refcount updates are done just in case old and new are identical
+  assert(old_address >= 0);
+  if (old_address == 0) return;
+  if (get_or_insert(Memory, old_address) < 0) {
+    DUMP("");
+    cerr << old_address << ' ' << get_or_insert(Memory, old_address) << '\n';
+  }
+  assert(get_or_insert(Memory, old_address) >= 0);
+  if (get_or_insert(Memory, old_address) > 0) return;
+  // lookup_memory without drop_one_lookup {
+  trace(9999, "mem") << "automatically abandoning " << old_address << end();
+  trace(9999, "mem") << "computing size to abandon at " << x.value << end();
+  x.set_value(old_address+/*skip refcount*/1);
+  drop_from_type(x, "address");
+  // }
+  abandon(old_address, size_of(x)+/*refcount*/1);
+  return;
+}
+
+:(scenario refcounts_2)
+def main [
+  1:address:number <- new number:type
+  # over-writing one allocation with another
+  1:address:number <- new number:type
+  1:address:number <- copy 0
+]
++run: {1: ("address" "number")} <- new {number: "type"}
++mem: incrementing refcount of 1000: 0 -> 1
++run: {1: ("address" "number")} <- new {number: "type"}
++mem: automatically abandoning 1000
+
+:(scenario refcounts_3)
+def main [
+  1:address:number <- new number:type
+  # passing in addresses to recipes increments refcount
+  foo 1:address:number
+  1:address:number <- copy 0
+]
+def foo [
+  2:address:number <- next-ingredient
+  # return does NOT yet decrement refcount; memory must be explicitly managed
+  2:address:number <- copy 0
+]
++run: {1: ("address" "number")} <- new {number: "type"}
++mem: incrementing refcount of 1000: 0 -> 1
++run: {2: ("address" "number")} <- next-ingredient
++mem: incrementing refcount of 1000: 1 -> 2
++run: {2: ("address" "number")} <- copy {0: "literal"}
++mem: decrementing refcount of 1000: 2 -> 1
++run: {1: ("address" "number")} <- copy {0: "literal"}
++mem: decrementing refcount of 1000: 1 -> 0
++mem: automatically abandoning 1000
+
+:(scenario refcounts_4)
+def main [
+  1:address:number <- new number:type
+  # idempotent copies leave refcount unchanged
+  1:address:number <- copy 1:address:number
+]
++run: {1: ("address" "number")} <- new {number: "type"}
++mem: incrementing refcount of 1000: 0 -> 1
++run: {1: ("address" "number")} <- copy {1: ("address" "number")}
++mem: decrementing refcount of 1000: 1 -> 0
++mem: incrementing refcount of 1000: 0 -> 1
+
+:(scenario refcounts_5)
+def main [
+  1:address:number <- new number:type
+  # passing in addresses to recipes increments refcount
+  foo 1:address:number
+  # return does NOT yet decrement refcount; memory must be explicitly managed
+  1:address:number <- new number:type
+]
+def foo [
+  2:address:number <- next-ingredient
+]
++run: {1: ("address" "number")} <- new {number: "type"}
++mem: incrementing refcount of 1000: 0 -> 1
++run: {2: ("address" "number")} <- next-ingredient
++mem: incrementing refcount of 1000: 1 -> 2
++run: {1: ("address" "number")} <- new {number: "type"}
++mem: decrementing refcount of 1000: 2 -> 1
+
+:(scenario refcounts_array)
+def main [
+  1:number <- copy 30
+  # allocate an array
+  10:address:array:number <- new number:type, 20
+  11:number <- copy 10:address:array:number
+  # allocate another array in its place, implicitly freeing the previous allocation
+  10:address:array:number <- new number:type, 25
+]
++run: {10: ("address" "array" "number")} <- new {number: "type"}, {20: "literal"}
+# abandoned array is of old size (20, not 25)
++abandon: saving in free-list of size 22
+
+//:: abandon and reclaim memory when refcount drops to 0
+
+:(scenario new_reclaim)
+def main [
+  1:address:number <- new number:type
+  2:number <- copy 1:address:number  # because 1 will get reset during abandon below
+  1:address:number <- copy 0  # abandon
+  3:address:number <- new number:type  # must be same size as abandoned memory to reuse
+  4:boolean <- equal 2:number, 3:address:number
+]
+# both allocations should have returned the same address
++mem: storing 1 in location 4
+
+//: When abandoning addresses we'll save them to a 'free list', segregated by size.
+
+:(before "End routine Fields")
+map<int, int> free_list;
+
+:(code)
+void abandon(int address, int size) {
+  trace(9999, "abandon") << "saving in free-list of size " << size << end();
+//?   Total_free += size;
+//?   Num_free++;
+//?   cerr << "abandon: " << size << '\n';
+  // clear memory
+  for (int curr = address; curr < address+size; ++curr)
+    put(Memory, curr, 0);
+  // append existing free list to address
+  put(Memory, address, get_or_insert(Current_routine->free_list, size));
+  put(Current_routine->free_list, size, address);
+}
+
+:(before "ensure_space(size)" following "case ALLOCATE")
+if (get_or_insert(Current_routine->free_list, size)) {
+  trace(9999, "abandon") << "picking up space from free-list of size " << size << end();
+  int result = get_or_insert(Current_routine->free_list, size);
+  trace(9999, "mem") << "new alloc from free list: " << result << end();
+  put(Current_routine->free_list, size, get_or_insert(Memory, result));
+  for (int curr = result+1; curr < result+size; ++curr) {
+    if (get_or_insert(Memory, curr) != 0) {
+      raise << maybe(current_recipe_name()) << "memory in free list was not zeroed out: " << curr << '/' << result << "; somebody wrote to us after free!!!\n" << end();
+      break;  // always fatal
+    }
+  }
+  if (SIZE(current_instruction().ingredients) > 1)
+    put(Memory, result+/*skip refcount*/1, ingredients.at(1).at(0));
+  else
+    put(Memory, result, 0);
+  products.resize(1);
+  products.at(0).push_back(result);
+  break;
+}
+
+:(scenario new_differing_size_no_reclaim)
+def main [
+  1:address:number <- new number:type
+  2:number <- copy 1:address:number
+  1:address:number <- copy 0  # abandon
+  3:address:array:number <- new number:type, 2  # different size
+  4:boolean <- equal 2:number, 3:address:array:number
+]
+# no reuse
++mem: storing 0 in location 4
+
+:(scenario new_reclaim_array)
+def main [
+  1:address:array:number <- new number:type, 2
+  2:number <- copy 1:address:array:number
+  1:address:array:number <- copy 0  # abandon
+  3:address:array:number <- new number:type, 2  # same size
+  4:boolean <- equal 2:number, 3:address:array:number
+]
+# reuse
++mem: storing 1 in location 4
+
 //:: helpers for debugging
 
 :(before "End Primitive Recipe Declarations")