Arc Forum
1 point by palsecam 5613 days ago

No-down patch at http://dabuttonfactory.com/res/arc-no-down.patch

Thanks to my arrogance/guts for pushing me to try to remove 'down, because it showed me that the Arc codebase confirms my own experience of programming:

- you never use 'for directly, except in cases where you are sure the bounds are OK.

Here "directly" means outside a library {mac|fn} definition, because in a library you must validate your input anyway, if you agree w/ "Worse is not always better" (i.e., the {sys|lang|lib} writer does the hard work, not you, the user). If you don't agree, well, one problem is that it leads to inconsistencies/bugs. See below.

The "problematic" (few) occurrences of 'for only appear in arc.arc and strings.arc, which are typical library files. Not even "normal" libraries, but "core" ones: the kind where I'd strongly apply "worse is not better".

You won't see 'for used with expressions as bounds in any other file, i.e., "application" files (blog.arc, news.arc, etc.) or even other lib files. You won't even see it at all in news.arc, srv.arc, code.arc, prompt.arc. You'll see it used directly twice, here:

  blog.arc:      (for i 0 4                ; no bounds pb
  html.arc:(for i 0 255 (= (hexreps i      ; no bounds pb
- you sometimes, though rarely, also need to directly use a descending 'for ('down). Only once in all of Arc (but once = it is needed):

   news.arc:      (down id maxid* 1
Where maxid* is a global, and of the kind that is IMO nearer to a literal than to a (complex) expression, so no pb. See below.

So it's a pity that for this one time, you can't use 'for, and have to resort to yet another loop construct that is here for... non-existent problems.

- for the vast, vast majority of looping, you use higher-level loop constructs (each/repeat/etc.), so there is no problem w/ incorrect bounds, assuming the lib writer is not a moron.

----

In strings.arc:

Is it coherent that 'posmatch returns nil when pat is longer than seq, whereas 'headmatch throws an error in the same case (even stranger knowing that 'posmatch actually calls 'headmatch)?

  arc> (headmatch "abcd" "abc")
  Error: "string-ref: index 3 out of range [0, 2] for string: \"abc\""
  arc> (posmatch "abcd" "abc")
  nil
w/no-down-patch:

  arc> (headmatch "abcd" "abc")
  nil
  arc> (posmatch "abcd" "abc")
  nil
Coherent, and correct IMO. We ask if it matches. If pat is longer than seq, the answer is just "no"; it's not an error per se.

Or: how 'headmatch is "incredibly fragile", and the so-called "solid" 'for hides this fact here. Thanks, pseudo-solidity. When writing a library fn, validate your input, and don't rely on the behaviour of something inherently fragile (a raw construct).
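
To make "validate your input" concrete, here is a minimal sketch of a bounds-checked wrapper. It is an illustration only, not the code from the patch: the name safe-headmatch is hypothetical, and it assumes the stock 'headmatch from strings.arc is loaded.

```arc
; safe-headmatch is a hypothetical name, not part of Arc or of the patch.
; It answers "no" (nil) instead of indexing past the end of seq.
(def safe-headmatch (pat seq (o start 0))
  (and (<= 0 start)                       ; reject negative offsets
       (<= (+ start (len pat)) (len seq)) ; pat must fit inside seq from start
       (headmatch pat seq start)))        ; only then do the raw comparison

(safe-headmatch "abcd" "abc")   ; nil: pat is longer than seq
(safe-headmatch "abc" "abcd")   ; t
```

The check happens once, in the library fn, so callers never have to think about the bounds.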

In news.arc, I (obviously) changed:

      (down id maxid* 1
to:

      (for id maxid* 1
I feared it might not work when there are no items, so I tested this case: (nsv), then accessed localhost:8080, and there was actually no problem. I don't use news.arc, so I can't test the rest, but it should be OK. (If there is a pb, maybe just changing it to (for id maxid* 0 ...) would solve it.)

----

"You claimed it'd make the code shorter! Prove it!"

Clever, interesting test:

  arc> (let toto 0 
         (each (k v) (tokcount '("arc.arc" "strings.arc" "news.arc")) 
           (++ toto v)) 
         toto)
  14756

  arc-no-down> (let toto 0 
                 (each (k v) (tokcount '("arc.arc" "strings.arc" "news.arc")) 
                   (++ toto v)) 
                 toto)
  14749
Harder, dumber, raw `wc' test:

  $ wc -m 3.1orig/*.arc
  [...]
  198017 total

  $ wc -m 3.1nodown/*.arc
  [...]
  198017 total     # Argh, failed! It's ==, not strictly <...
----

The no-down patch was coded quickly and with nearly no testing afterwards, so there might be bugs. I hope someone proves to me that I've introduced lots of bugs; that way I could be sure all this crap at least makes someone take a look at reality (where reality is, here, some practical code, and not some books), and try to question things. One thing Arc got very right is "code.arc".

And no, telling me "it is buggy for me" doesn't count without showing some Arc code where you'd actually be hampered by the new 'for behaviour. Otherwise it's like with hygienic macros: "incredibly less fragile", but no one cares 'cause unhygienic is good enough/more powerful, provided you live in the real world.

And anyway it doesn't count, because everyone here more or less accepts that the Arc codebase is a superb piece of software (so if you don't have the same coding practice, you suck), that brevity is power, and that it is a valid codebase to test the necessity of an operator. All of this IS questionable. But too many people here are... not qualified to do so, unless they are sure their comment history will not reveal some stupid blind adoration for Arc.

I trust {my|other people's} guts & feelings, but in the end I believe only in reality, in data (and you know as well as me that code is data :-D), and not in opinions and books.



2 points by fallintothis 5613 days ago

- you never use 'for directly, except in cases where you are sure the bounds are OK.

The "problematic" (few) occurrences of 'for only appear in arc.arc and strings.arc, which are typical library files.

What makes arc.arc and strings.arc less valid examples of for usage? They're Arc programs, too. Should they not inherit the elegance they're attempting to define? (While still balancing efficiency, of course, cf. the tutorial: http://ycombinator.com/arc/tut.txt)

On the contrary, because arc.arc and strings.arc use for, I think they make perfect examples -- which would make your first statement untrue, since you had to write extra bounds-checking.

- you sometimes, though rarely, also need to directly use a descending 'for ('down). Only once in all of Arc (but once = it is needed):

So it's a pity that for this one time, you can't use 'for, and have to resort to yet another loop construct that is here for... non-existent problems.

You're ignoring that down has another purpose. As you say, the need for a descending loop is rare. But the need for for to only go in one direction is much less rare (more on that later).

for the vast, vast majority of looping, you use higher-level loop constructs (each/repeat/etc.), so there is no problem w/ incorrect bounds, assuming the lib writer is not a moron.

So you'd also want to foist the responsibility of not being a "moron" onto every user of for? If other loops are already used to avoid silly bugs, why not for?

I count at least 12 different loop constructs in arc.arc: while, loop, for, down, repeat, each, whilet, whiler, forlen, on, until, noisy-each, and arguably others like evtil and drain.

I find that adding these makes code simpler: they express (and implement) purposeful loops. That's why I can do

  (each x xs (prn x))
instead of

  (forlen i xs (prn (xs i)))
which can be done instead of

  (for i 0 (- (len xs) 1) (prn (xs i)))
which can be done instead of

  (loop (= i 0) (< i (len xs)) (++ i) (prn (xs i)))
etc. If I wanted the most general & least to remember, I'd use a goto.

When for tries to infer the direction I want to go, I need to fight it to stop from going in the opposite direction -- to me, this is inconvenient.
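
A small sketch of what I mean, under stated assumptions: bifor is a hypothetical name for a direction-inferring loop (so the stock for stays untouched), written here in terms of the Arc 3.1 for and down rather than copied from the patch. The usual "0 to length minus 1" idiom misbehaves on an empty list:

```arc
; bifor is a hypothetical stand-in for a for that infers its direction.
; Bounds are evaluated once, then dispatched to the stock for or down.
(mac bifor (v init end . body)
  (w/uniq (gi ge)
    `(with (,gi ,init ,ge ,end)
       (if (> ,ge ,gi)
           (for ,v ,gi ,ge ,@body)
           (down ,v ,gi ,ge ,@body)))))

(= xs '())

; stock for: upper bound is -1, so the body never runs
(accum acc (for i 0 (- (len xs) 1) (acc i)))     ; => nil

; direction-inferring loop: counts down from 0 to -1
(accum acc (bifor i 0 (- (len xs) 1) (acc i)))   ; => (0 -1)
```

With the single-direction for, the empty case falls out for free; with the inferring version, every such call site needs its own guard.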

Is it coherent that 'posmatch returns nil when pat is longer than seq, whereas 'headmatch throws an error in the same case (even stranger knowing that 'posmatch actually calls 'headmatch)?

I agree that headmatch has odd behavior here. But with the fixed behavior (i.e., your patch):

  arc> (load "../arc3.1/trace.arc")
  nil
  arc> (trace posmatch headmatch)
  *** tracing posmatch
  *** tracing headmatch
  nil
  arc> (posmatch "a" "abc")
  1. Trace: (posmatch "a" "abc")
  2. Trace: (headmatch "a" "abc" 0)
  2. Trace: headmatch ==> t
  1. Trace: posmatch ==> 0
  0
  arc> (posmatch "abc" "a")
  1. Trace: (posmatch "abc" "a")
  2. Trace: (headmatch "abc" "a" 0)
  2. Trace: headmatch ==> nil
  2. Trace: (headmatch "abc" "a" -1)
  2. Trace: headmatch ==> nil
  2. Trace: (headmatch "abc" "a" -2)
  2. Trace: headmatch ==> nil
  1. Trace: posmatch ==> nil
  nil
Just because the function to which you funnel input sanitizes data doesn't mean you should be supplying bad values. Further, if we add more error-checking to posmatch to avoid the redundant calls, we're adding even more complexity -- wrestling against for to get it to go just one direction.

"You claimed it'd make the code shorter! Prove it!"

I believe only in reality, in data

Then let's inspect your patch closer:

inspect-patch.arc

  (def default (file)
    (+ "../arc3.1/" file))

  (def patched (file)
    (+ "../arc-patch/" file))

  (def sexp-tokcount (sexp)
    (len (flat sexp)))

  (= for-def*
    '(mac for (v init max . body)
       (w/uniq (gi gm)
         `(with (,v nil ,gi ,init ,gm (+ ,max 1))
            (loop (assign ,v ,gi) (< ,v ,gm) (assign ,v (+ ,v 1))
              ,@body))))
     down-def*
     '(mac down (v init min . body)
        (w/uniq (gi gm)
          `(with (,v nil ,gi ,init ,gm (- ,min 1))
             (loop (assign ,v ,gi) (> ,v ,gm) (assign ,v (- ,v 1))
               ,@body))))
     new-for-def*
    '(mac for (v init end . body)
       (w/uniq (gi gm gt gf)
         `(do
            (if (> ,end ,init)
                (= ,gt < ,gf +)
                (= ,gt > ,gf -))
            (with (,v nil ,gi ,init ,gm (,gf ,end 1))
              (loop (assign ,v ,gi) (,gt ,v ,gm) (assign ,v (,gf ,v 1))
                ,@body))))))

  ; if this calculation is wrong, it should be revealed in logic-savings
  (= max-diff* (- (+ (sexp-tokcount for-def*) (sexp-tokcount down-def*))
                  (sexp-tokcount new-for-def*)))

  (def token-total (file)
    (sum cadr (tokcount (list file))))

  (def token-diff (file1 file2)
    (- (token-total file1) (token-total file2)))

  (def compare-tokcount (filename)
    (let diff (token-diff (default filename) (patched filename))
      (if (> diff 0)
            (prn "The patch saved " (plural diff "token") " in " filename)
          (< diff 0)
            (prn "The patch added " (plural (- diff) "token") " to " filename)
            (prn "The patch didn't change the token count in " filename))))

  (def maximum-savings ()
    (prn "The patch could have saved at most (caveat lector) "
         (plural max-diff* "token")
         " in arc.arc"))

  (def logic-savings ()
    (let diff (token-diff (default "arc.arc") (patched "arc.arc"))
      (if (<= diff max-diff*)
          (prn "So, by changing 'for in arc.arc, "
               (plural (- max-diff* diff) "token")
               " got added to code that used the previous version of 'for")
          (err "miscalculated the maximum number of tokens you could save"))))

  (map compare-tokcount '("arc.arc" "strings.arc" "news.arc"))
  (prn)
  (maximum-savings)
  (logic-savings)
At the REPL

  arc> (load "inspect-patch.arc")
  The patch saved 9 tokens in arc.arc
  The patch added 2 tokens to strings.arc
  The patch didn't change the token count in news.arc

  The patch could have saved at most (caveat lector) 17 tokens in arc.arc
  So, by changing 'for in arc.arc, 8 tokens got added to code that used the previous version of 'for
  nil
To explain the "caveat", I assume the most this new for could change is: (a) remove the single-direction for and down, (b) add the bidirectional for, and (c) leave any other piece of code that used for/down unchanged (save switching the word "down" to the word "for").

With these assumptions (and by inspecting the code), the assessment seems correct: arc.arc nets 8 additional tokens to stop for from going backwards. It's not that the token count is shorter from having for go both directions; it's that the code you've added to avoid for's new behavior isn't quite enough to outweigh the savings from removing down's definition.

In actuality, you'll wind up saving far less than 9 tokens because of multiple evaluation bugs:

   (mac repeat (n . body)
     `(if (> ,n 1) (for ,(uniq) 1 ,n ,@body)))
with

  arc> (sexp-tokcount '(mac repeat (n . body)
                         `(if (> ,n 1) (for ,(uniq) 1 ,n ,@body))))
  18
should be

  (mac repeat (n . body)
    (w/uniq gn
      `(let ,gn ,n (if (> ,gn 1) (for ,(uniq) 1 ,gn ,@body)))))
with

  arc> (sexp-tokcount '(mac repeat (n . body)
                         (w/uniq gn
                           `(let ,gn ,n
                              (if (> ,gn 1) (for ,(uniq) 1 ,gn ,@body))))))
  25
i.e., 7 more tokens, and

  (mac forlen (var s . body)
    `(unless (empty ,s)
       (for ,var 0 (- (len ,s) 1) ,@body)))
with

  arc> (sexp-tokcount '(mac forlen (var s . body)
                         `(unless (empty ,s)
                            (for ,var 0 (- (len ,s) 1) ,@body))))
  21
should be

  (mac forlen (var s . body)
    (w/uniq gs
      `(let ,gs ,s
         (unless (empty ,gs)
           (for ,var 0 (- (len ,gs) 1) ,@body)))))
with

  arc> (sexp-tokcount '(mac forlen (var s . body)
                         (w/uniq gs
                           `(let ,gs ,s
                              (unless (empty ,gs)
                                (for ,var 0 (- (len ,gs) 1) ,@body))))))
  28
i.e., 7 more tokens, totaling 14 more tokens, which outweighs the original figure. So, nothing is even really saved in arc.arc. Though, of course, the rewrites could be shorter with something like once-only (see towards the end of http://gigamonkeys.com/book/macros-defining-your-own.html).
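
For anyone who wants to see the multiple evaluation rather than take my word for it, here is a minimal sketch. The names repeat-unsafe, repeat-once, next-n and calls* are hypothetical, chosen so the built-in repeat is left alone, and the counts in the comments assume the stock single-direction Arc 3.1 for:

```arc
(= calls* 0)
(def next-n () (++ calls*) 3)   ; side-effecting bound: counts how often it runs

; same shape as the patched repeat: n is spliced in twice
(mac repeat-unsafe (n . body)
  `(if (> ,n 1) (for ,(uniq) 1 ,n ,@body)))

; same shape as the corrected repeat: n is bound once to a gensym
(mac repeat-once (n . body)
  (w/uniq gn
    `(let ,gn ,n (if (> ,gn 1) (for ,(uniq) 1 ,gn ,@body)))))

(repeat-unsafe (next-n) nil)
calls*                          ; => 2

(= calls* 0)
(repeat-once (next-n) nil)
calls*                          ; => 1
```

Note that new-for-def* as quoted above splices ,init and ,end in twice itself, so under the patched for the bound can be evaluated more often still.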

Further, strings.arc and news.arc did not get shorter (strings.arc even got a little longer). The only way it seems that un-patched code could get shorter is if it had to go either up or down and the order didn't matter -- unlike code in the files inspected.

Therefore, this patch can either make new code longer or make you hope that for doesn't iterate in a direction you don't want it to (as in news.arc), unless you needed to do the Arc 3.1 equivalent of

  (if (< start end)
      (for i start end ...)
      (> start end)
      (for i end start ...))
which, with this patch, could be replaced with

  (for i start end ...)
which is shorter.

As infrequently as such code occurs (0 times in the standard Arc 3.1 distribution, so far as I can tell), this does not yield big space savings. If it does occur frequently enough, it shouldn't outweigh the need for single-direction iterations, but would probably instead be made into a separate macro:

  (mac between (var bound1 bound2 . body)
    ...)
Additionally, you assert that having an extra loop construct entails an unnecessary mental burden for the programmer. I disagree. It's not a burden if its purpose is specific: if you want to repeat a block of code, use

  (repeat n ...)
instead of

  (for temp 1 n ...)
If you want to iterate over the length of a sequence, use

  (forlen i xs ...)
instead of

  (for i 0 (- (len xs) 1) ...)
Moreover, if you want to iterate upwards through a range of integers, use

  (for i start (- (len seq) (len pat)) ...)
instead of

  (if (and (>= (len seq) (len pat))
           (<= start (- (len seq) (len pat))))
      (between i start (- (len seq) (len pat))
        ...))

-----