Arc Forum
How to read mutable hash tables
2 points by lark 3974 days ago | 5 comments
I'm providing a function that can read a hash table from disk and return it as a mutable data structure. This function works with Arc 3.1.

  (def make-mutable-cons (data)
    (withs (ret nil)
           (each el data
                 (push el ret))
           ret))

  (def make-mutable-table (data)
    (withs (ret (table))
           (maptable (fn (k v)
                         (if (is (type v) 'table)
                             (= (ret (sym k)) (make-mutable-table v))
                             (is (type v) 'cons)
                             (= (ret (sym k)) (make-mutable-cons v))
                             (= (ret (sym k)) v)))
                     data)
           ret))

  ;; read a hash table from a file; but what you get back is mutable
  (def riff-table (file)
    (withs (immutable nil)
           (= immutable (w/infile i file (read i)))
           (make-mutable-table immutable)))

Here are some test data:

  arc> (= h (obj this "this" that 3 nested-hash-table (obj one "one" two "two" three 3) some-list (list 1 2 "three" 4)))

  $ cat nested
  #hash((nested-hash-table . #hash((three . 3) (one . "one") (two . "two"))) (some-list . (1 2 "three" 4 . nil)) (this . "this") (that . 3))
And here is an example of how to use this function:

  arc> (= nt (riff-table "nested"))
  #hash((nested-hash-table . #hash((three . 3) (one . "one") (two . "two"))) (some-list . (4 "three" 2 1 . nil)) (this . "this") (that . 3))
  arc> (= (nt "this") "THIS")
  "THIS"
  arc> (= ((nt 'nested-hash-table) 'one) "ONE")
  "ONE"
  arc> (= (nt 'some-list) (list 5 6 "seven"))
  (5 6 "seven")
  arc> nt
  #hash((nested-hash-table . #hash((three . 3) (one . "ONE") (two . "two"))) (some-list . (5 6 "seven" . nil)) (this . "this") (that . 3) ("this" . "THIS"))


4 points by akkartik 3974 days ago | link

Cool. Some comments:

a) You're reversing your lists, like the value of some-list above.

b) If you include a table inside a list, I think riff-table won't make it mutable. Here's a slightly different file:

  arc> (= h (w/infile f "nested" (read f)))
  #hash((this . "this") (nested-hash-table . #hash((three . 3) (one . "one") (two . "two"))) (some-list . (1 2 "three" #hash((four . "foo")) . nil)) (that . 3))
  arc> h!some-list
  (1 2 "three" #hash((four . "foo")))
  arc> h!some-list.3
  #hash((four . "foo"))
  arc> h!some-list.3!four
  "foo"
  arc> (= h!some-list.3!four "bar")
  Error: "hash-set!: contract violation\n  expected: (and/c hash? (not/c immutable?))\n  given: '#hash((four . \"foo\"))\n  argument position: 1st\n  other arguments...:\n   'four\n   \"bar\""
You might want to see how anarki fixes the read and write primitives to be fully general for tables as well as user-defined types:

  arc> (tofile "x" (write:obj a 1 b 2 c (list 2 3 (obj d 4 e 5 f 6))))
  nil
  arc> (= h (fromfile "x" (read)))
  #hash((a . 1) (c . (2 3 #hash((f . 6) (e . 5) (d . 4)))) (b . 2))
  arc> (= h!a 34)
  34
  arc> (= h!c.2!g 7)
  7
  arc> h
  #hash((a . 34) (c . (2 3 #hash((g . 7) (f . 6) (e . 5) (d . 4)))) (b . 2))
c) Why you no like let and with? :) I'd rewrite your expressions above as:

  (def make-mutable-cons (data)
    (let ret nil
      (each el data
        (push el ret))
      ret))

  (def make-mutable-table (data)
    (w/table ret  ; w/table implicitly returns ret
      (maptable (fn (k v)
                  (if (is (type v) 'table)
                       (= (ret (sym k)) (make-mutable-table v))
                      (is (type v) 'cons)
                       (= (ret (sym k)) (make-mutable-cons v))
                      (= (ret (sym k)) v)))
                data)))

  (def riff-table (file)
    (make-mutable-table
      (w/infile i file
        (read i))))
d) Why do you have those calls to sym?

-----

2 points by lark 3972 days ago | link

Thank you for the feedback. Thanks also to fallintothis: I wish the copy functions you wrote omitted unnecessary lines of code.

a,b) Thanks, fixed:

  (def make-mutable-cons (data)
    (withs (ret nil)
           (each el data
                 (if (is (type el) 'table)
                     (push (make-mutable-table el) ret)
                     (is (type el) 'cons)
                     (push (make-mutable-cons el) ret)
                     (push el ret)))
           (rev ret)))
c) let doesn't work with more than one variable, so withs is more general (should let even exist?). I can't tell what the difference between with and withs is, and I picked the latter when I started needing more than one variable. What is the difference between them?

d) Seems sym's not needed; removed.
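
For reference, a sketch of how make-mutable-table might look with the sym calls dropped and the w/table suggestion applied. It is untested, and it assumes the fixed make-mutable-cons above:

  (def make-mutable-table (data)
    (w/table ret                       ; w/table returns ret at the end
      (each (k v) data                 ; keys are kept exactly as read
        (= (ret k)
           (case (type v)
             table (make-mutable-table v)
             cons  (make-mutable-cons v)
                   v)))))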

-----

2 points by akkartik 3972 days ago | link

Yes, let is just a simpler form of withs, but it's worth using because it signals that you only need one var and therefore skips extra parens.

withs is the sequential form of with, where you'd like each variable available in defining later variables within the with. You have to say:

  arc> (withs (x 1 y (* x 2)) (+ x y))
..but you can use the simpler with in:

  arc> (with (x 1 y 2) (+ x y))
Not only can you, you should. Using the most general form when a simpler form will do is akin to crying wolf; it's very useful when reading code to be able to tell simpler parts from more complex parts at a single glance.
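
To make the difference concrete, here is roughly what the two forms above behave like once the macros are expanded. This is a sketch of the idea, not the literal output of the macros in arc.arc:

  ; with binds everything at once, so y's expression cannot see x
  (with (x 1 y 2) (+ x y))
  ; behaves like ((fn (x y) (+ x y)) 1 2)

  ; withs nests the bindings, so y's expression can refer to x
  (withs (x 1 y (* x 2)) (+ x y))
  ; behaves like ((fn (x) ((fn (y) (+ x y)) (* x 2))) 1)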

So use with to signal that you're defining multiple variables at once. Use withs to signal that there's a dependency between the definitions. If there's no dependency, with is more idiomatic. And if there's only one variable, let is more idiomatic. Finally, a withs with multiple bindings and a trivial body is easier to read if you move the final binding's expression into the body. This:

  (withs (x 1
          y (f x z))
    y)
is better written as:

  (withs (x 1)
    (f x z))
and therefore:

  (let x 1
    (f x z))
(and further to (f 1 z), but let's pretend x is being bound to something less trivial.)

Even if a body is a little more complex:

  (withs (x complex-expr1
          y (complex-expr2 x)
          z (complex-expr3 y))
    (car z))
it's sometimes nicer to read as:

  (withs (x complex-expr1
          y (complex-expr2 x))
    (car (complex-expr3 y)))
let, with and withs are nothing but macros, so if you avoid using them you're giving up the benefits of lisp. Might as well go back to a weaker language. In general, my idea when learning lisp was: "Since I come from an imperative background my tendency is to define intermediate variables. Therefore I will start by avoiding all temporaries, and only after I get it working will I insert the fewest possible temporaries to make the code readable." It's stood me in good stead.

-----

3 points by lark 3968 days ago | link

Thank you for writing up such a clear explanation.

There are definitely safety benefits to using the simplest form. But I would prefer to use the most general form (withs) because I won't need to worry about using the most idiomatic form or to switch from one form to another as the program changes. It frees me to worry about other things.

Being able to tell simpler parts from more complex parts at a single glance isn't as much of a concern for me. I noticed I don't go back to even read code of programs that didn't solve a big enough problem. The programs reach a dead end and don't develop further. There are more important things to worry about when writing a program.

Also note what let, with, and withs offer could be provided in other languages too. Them being implemented as macros in Arc is incidental. You're not giving up the benefits of lisp by not using them.

-----

2 points by fallintothis 3974 days ago | link

Disk serialization of variables & tables is already part of Arc 3.1's fromdisk, diskvar, disktable, and todisk:

  $ arc
  Use (quit) to quit, (tl) to return here after an interrupt.
  arc> (disktable h "/tmp/nested")
  #hash()
  arc> (= h (obj this "this" that 3 nested-hash-table (obj one "one" two "two" three 3) some-list (list 1 2 "three" 4)))
  #hash((that . 3) (some-list . (1 2 "three" 4 . nil)) (this . "this") (nested-hash-table . #hash((one . "one") (two . "two") (three . 3))))
  arc> (todisk h)
  ((nested-hash-table #hash((one . "one") (two . "two") (three . 3))) (this "this") (some-list (1 2 "three" 4)) (that 3))
  arc> (quit)
  $ cat /tmp/nested && echo
  ((nested-hash-table #hash((one . "one") (two . "two") (three . 3))) (this "this") (some-list (1 2 "three" 4)) (that 3))
  $ arc
  Use (quit) to quit, (tl) to return here after an interrupt.
  arc> (disktable nt "/tmp/nested")
  #hash((that . 3) (some-list . (1 2 "three" 4)) (this . "this") (nested-hash-table . #hash((one . "one") (two . "two") (three . 3))))
But, as I'm sure you're aware, the error with the above is that the serialized table (returned by todisk) has a literal Scheme #hash object nested inside, and when you read that back in, it's immutable:

  arc> (= (nt "this") "THIS")
  "THIS"
  arc> (nt "this")
  "THIS"
  arc> (= (nt 'some-list) (list 5 6 "seven"))
  (5 6 "seven")
  arc> (nt 'some-list)
  (5 6 "seven")
  arc> nt
  #hash((that . 3) (some-list . (5 6 "seven" . nil)) (this . "this") (nested-hash-table . #hash((one . "one") (two . "two") (three . 3))) ("this" . "THIS"))
  arc> (= ((nt 'nested-hash-table) 'one) "ONE")
  Error: "hash-set!: contract violation\n  expected: (and/c hash? (not/c immutable?))\n  given: #hash((one . \"one\") (two . \"two\") (three . 3))\n  argument position: 1st\n  other arguments...:\n   one\n   \"ONE\""
So, the actual salient operation here has less to do with immutable tables than with "deep copying" them, an operator that vanilla Arc lacks. It only has copy:

  (def copy (x . args)
    (let x2 (case (type x)
              sym    x
              cons   (copylist x) ; (apply (fn args args) x)
              string (let new (newstring (len x))
                       (forlen i x
                         (= (new i) (x i)))
                       new)
              table  (let new (table)
                       (each (k v) x 
                         (= (new k) v))
                       new)
                     (err "Can't copy " x))
      (map (fn ((k v)) (= (x2 k) v))
           (pair args))
      x2))
But we could model a deep-copy off of that:

  ; (map deep-copy xs) wouldn't work on dotted lists, hence this helper
  (def deep-copylist (xs)
    (if (no xs)
         nil
        (atom xs)
         (deep-copy xs)
        (cons (deep-copy (car xs))
              (deep-copylist (cdr xs)))))

  (def deep-copy (x . args)
    (let x2 (case (type x)
              sym    x
              char   x
              int    x
              num    x
              cons   (deep-copylist x)
              string (copy x)
              table  (let new (table)
                       (each (k v) x
                         (= (new (deep-copy k)) (deep-copy v)))
                       new)
                     (err "Can't deep copy " x))
      (map (fn ((k v)) (= (x2 (deep-copy k)) (deep-copy v)))
           (pair args))
      x2))
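A small sketch of how the two differ on a nested table, assuming the copy and deep-copy definitions above (a, b and c are just illustrative names):

  (= a (obj inner (obj x 1)))
  (= b (copy a))        ; new top-level table, but the same inner table
  (= c (deep-copy a))   ; new top-level table and a new inner table
  (= b!inner!x 2)       ; mutate through the shallow copy
  a!inner!x             ; 2, because a and b share the inner table
  c!inner!x             ; 1, because c has its own inner table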
Then we have

  arc> (disktable nt "/tmp/nested")
  #hash((some-list . (1 2 "three" 4)) (this . "this") (nested-hash-table . #hash((one . "one") (two . "two") (three . 3))) (that . 3))
  arc> (= nt (deep-copy nt))
  #hash((nested-hash-table . #hash((three . 3) (one . "one") (two . "two"))) (this . "this") (that . 3) (some-list . (1 2 "three" 4 . nil)))
  arc> (= (nt "this") "THIS")
  "THIS"
  arc> (= ((nt 'nested-hash-table) 'one) "ONE")
  "ONE"
  arc> (= (nt 'some-list) (list 5 6 "seven"))
  (5 6 "seven")
  arc> nt
  #hash((nested-hash-table . #hash((three . 3) (one . "ONE") (two . "two"))) (this . "this") (that . 3) ("this" . "THIS") (some-list . (5 6 "seven" . nil)))
You could build a macro on top of fromdisk to handle deep copying automatically.
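
One possible shape for such a macro, as an untested sketch. deep-disktable is a made-up name, and this assumes fromdisk takes (var file init load save) the way disktable uses it in Arc 3.1's arc.arc:

  ; hypothetical: like disktable, but deep-copies whatever load-table returns
  (mac deep-disktable (var file)
    `(fromdisk ,var ,file (table)
               (fn (f) (deep-copy (load-table f)))
               save-table))

It would be used like disktable, e.g. (deep-disktable nt "/tmp/nested"). Since fromdisk only loads when the variable is not already bound, the deep copy would happen once, at load time.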

I seem to remember discussion about deep copying before, but all I could find was http://arclanguage.org/item?id=16979.

-----