Arc Forumnew | comments | leaders | submitlogin
*Really* weird bug when table keys are tables
1 point by akkartik 5756 days ago | 2 comments
Let's create a hash table where the keys are tables.

  arc> (= tab (table))
  #hash()
  arc> (= key (table))
  #hash()
  arc> (= (tab key) 3)
  3
  arc> tab
  #hash((#hash() . 3))
  arc> (tab key)
  3
Quick: is the key a table value or reference?

  arc> (tab (table))
  3
Ok, so it's by value. But wait:

  arc> (= (key 5) 6)
  6
  arc> tab
  #hash((#hash((5 . 6)) . 3))
  arc> (keys tab)
  (#hash((5 . 6)))
So it's a reference, right?

  arc> (tab key)
  nil
Weird. So is it accessible by value?

  arc> (tab (obj 5 6))
  nil
  arc> (tab (table))
  nil
Finally:

  arc> (= (key 5) nil)
  nil
  arc> (tab key)
  3
  arc> (tab (table))
  3
Curiouser and curiouser.

How the #$%# heck does the table remember what the state of its key was at the time of binding?

Bug found while playing with palsecam's transparent persistence idea (http://arclanguage.org/item?id=10696).

Update: confirmed with original arc3/v372 and arc3.1/v4.2.2 under Snow Leopard, and with arc3.1/v4.1.3 under Ubuntu Jaunty.



2 points by rocketnia 5756 days ago | link

Quick: is the key a table value or reference?

It's a reference, as you can see, since the keys in the table change when you change them from outside the table. It's just that the keys are being compared by structure rather than by reference.

(tab key) => nil

This is pretty much a casualty of using hash tables. The point of a hash table is that it redundantly stores a hash of each of the keys, then compares further keys' hashes to those hashes before doing more in-depth comparisons. If you modify a key, then the hash table's precalculated hash for that key may be incorrect, so the shortcut doesn't work properly anymore.

The PLT Scheme docs specifically don't promise anything about this:

Caveat concerning mutable keys: If a key in an equal?-based hash table is mutated (e.g., a key string is modified with string-set!), then the hash table's behavior for insertion and lookup operations becomes unpredictable. (http://docs.plt-scheme.org/reference/hashtables.html)

I imagine a language could make sure to update its hash table hashes whenever the keys were modified, but as far as I can tell, that involves either recalculating the hash values at table access time (which defeats the purpose somewhat), recalculating them at modification time (which I'm guessing would make all modification take time proportional to the number of things that (in)directly reference the modified thing), or trying to do the recalculation sometime in between when there's time to spare, like the same time the garbage collector runs. There are probably some cleverer solutions, but any way you slice it, I doubt it's a straightforward fix. ^_^;

-----

2 points by akkartik 5756 days ago | link

Ah, that's most illuminating, thanks! So that's why palsecam used an alist ^_^

-----