Arc Forum
3 points by CatDancer 5694 days ago | link | parent

Here we go!

  (def match-json-string ()
    (match-char #\"
      (liststr:accum a
        (while (~match-char #\")
          (atend-err "missing close quote")
          (a (or (match-json-backslash)
                 (readc (stdin))))))))
I took shader's suggestion to use an input-port with peekc and readc. I figured that using a parameter to keep track of the parsing state was the way to go, and then realized that we already had a parameter (stdin) which did everything I needed it to. And taking Adlai's suggestion I made match-char a macro which avoids the extra fn's I had in the original.

Getting the parse position out of the function really broke the logjam. The old version was doing two things at once -- keeping track of the parse position and matching characters -- and that made it really hard to pull out bits of functionality into their own functions to make the main function shorter and clearer. Now it's much easier to tell what the function is doing! :)

Here's the rest of the code:

  (def hexdigit (c)
    (and (isa c 'char)
         (or (<= #\a c #\f) (<= #\A c #\F) (<= #\0 c #\9))))

  (def json-unicode-digits ()
    (let u (n-of 4 (readc (stdin)))
      (unless (all hexdigit u) (err "need 4 hexadecimal digits after \\u"))
      (coerce (int (coerce u 'string) 16) 'char)))

  ; json-backslash-char is the same

  (mac match-char (c . body)
    `(when (is (peekc (stdin)) ,c)
       (readc (stdin))
       ,@body))

  (def atend-err (msg)
    (unless (peekc (stdin))
      (err msg)))

  (def match-json-unicode-escape ()
    (match-char #\u (json-unicode-digits)))

  (def match-json-backslash ()
    (match-char #\\
      (atend-err "missing char after backslash")
      (or (match-json-unicode-escape)
          (json-backslash-char (readc (stdin))))))

  (def liststr (l)
    (coerce l 'string))
As you can see there's some more that could be done: a next-char function for (readc (stdin)) perhaps, and match-json-unicode-escape is so simple now that it could be inlined. That will be easy to work on :-)
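
A minimal sketch of what that cleanup might look like, assuming the definitions above are already loaded (match-char must be defined first, since it is a macro). The name next-char is only the hypothetical helper suggested above, not something that exists in Arc:

  ; hypothetical helper: read the next character from the current input
  (def next-char ()
    (readc (stdin)))

  ; match-json-unicode-escape inlined into its only caller
  (def match-json-backslash ()
    (match-char #\\
      (atend-err "missing char after backslash")
      (or (match-char #\u (json-unicode-digits))
          (json-backslash-char (next-char)))))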

Thank you everyone!



1 point by shader 5694 days ago | link

Beautiful!

Only one comment: peekc and readc default to stdin, so you can leave out all of the (stdin) calls, unless you're leaving them for readability.

Either way, the new version is much more readable, shorter, and probably faster too. Congratulations!

-----

1 point by CatDancer 5694 days ago | link

You are using some other incarnation of Arc, perhaps?

  arc> (readc)
  Error: "procedure readc: expects 1 argument, given 0"
having readc and peekc default to stdin sounds like a real good idea though! :)

-----

1 point by shader 5694 days ago | link

Wow, a bug!

According to arcfn.com and the code that it references, readc looks like it's supposed to default to stdin, but it's written in such a way as to require at least one argument. If you pass nil, it reads from stdin because of the default.

The current code is:

  (xdef readc (lambda (str)
               (let ((p (if (ar-false? str)
                            (current-input-port)
                            str)))
                 (let ((c (read-char p)))
                   (if (eof-object? c) 'nil c)))))
Which requires at least one argument. It should probably be changed so that the argument is actually optional, but I don't know how. I should read up on my mzscheme ;)

-----

1 point by shader 5694 days ago | link

I think that optional arguments in scheme are just (var default) instead of just var.

So the new code would be:

    (xdef readc (lambda ((str (current-input-port)))
                  (let ((c (read-char str)))
                    (if (eof-object? c) 'nil c))))
I'm not certain - I haven't tested it, but it looks right. Unless scheme and arc have different ideas of false, which could cause more problems. Maybe it should just be:

    (xdef readc (lambda ((str (current-input-port)))
                  (let ((p (if (ar-false? str)
                               (current-input-port)
                               str)))
                    (let ((c (read-char p)))
                      (if (eof-object? c) 'nil c)))))
Which keeps the old false test just in case.

-----

1 point by CatDancer 5694 days ago | link

I think you should test it :-)

If it turns out not to work, see writec for an example of implementing an optional argument.

-----

1 point by shader 5694 days ago | link

hmmm. It seems that mzscheme (at least the version that I'm running arc on, 360) doesn't support optional args. I guess I'll have to do it some other way.

-----

3 points by shader 5694 days ago | link

Ok, I basically copied writec like you said, and the new version that actually works is:

  (xdef 'readc (lambda str
                 (let ((c (read-char
                           (if (pair? str)
                               (car str)
                               (current-input-port)))))
                   (if (eof-object? c) 'nil c))))
Now we just need to make the same transformation to readb and peekc, and then we're done.

btw, why is there no peekb? Or any other functions that work on bytes? (outside of binary.arc in Anarki) Did I overlook them?

-----

2 points by CatDancer 5693 days ago | link

It's fairly tedious to be doing this in Scheme, isn't it? We might let Scheme handle implementing the low level readc, and then in Arc redefine readc to be a more advanced function that can take an optional argument:

  (let original readc
    (= readc (fn ((o str (stdin)))
               (original str))))

  arc> (fromstring "abc" (readc))
  #\a
That pattern could be made into a macro:

  (mac redef (name args . body)
    `(let original ,name
       (= ,name (fn ,args ,@body))))
which makes writing the enhanced version of readc look like:

  (redef readc ((o str (stdin)))
    (original str))
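
As a sketch only, and assuming the redef macro above is loaded and that peekc and readb still require a port argument, the same pattern would presumably cover the other two readers:

  ; give peekc and readb the same optional port argument as readc
  (redef peekc ((o str (stdin)))
    (original str))

  (redef readb ((o str (stdin)))
    (original str))

With these in place, (fromstring "abc" (peekc)) should look at the first character without consuming it.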

-----

1 point by absz 5693 days ago | link

redef is already in Anarki; arc.arc, line 2446:

  (mac redef (name parms . body)
    " Redefine a function.  The old function definition may be used within
      `body' as the name `old'. "
    `(do (tostring
          (let old (varif ,name nilfn)
            (= ,name (fn ,parms ,@body))))
         ,name))
It's the same as yours, except (a) it suppresses the warning on re-assigning an identifier, and (b) it calls the original function old.

-----

2 points by CatDancer 5693 days ago | link

suppressing the warning doesn't appear to be necessary with =

  arc> (def foo () 3)
  #<procedure: foo>
  arc> (= foo 4)
  4
  arc>

-----

1 point by absz 5692 days ago | link

I was wondering if that was necessary... I think redef used to use set, which is why it was there.

-----

1 point by CatDancer 5693 days ago | link

> btw, why is there no peekb? Or any other functions that work on bytes?

As part of the design process of finding the shortest Arc implementation, pg is careful not to include functions that he isn't actually using for news. This leads to some surprises (car, cdr, cadr, cddr, but no cdar?) but also avoids the cruft that builds up from implementing things that people might need some day but turn out not to.

It turns out not to be a problem, since Arc is so concise it's really easy to extend, and so people quickly implement the things they need that pg happens not to be using for news.
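
For instance, the missing cdar mentioned above takes two lines to add yourself. This is just an illustrative sketch, not a definition from arc.arc:

  ; cdar: the cdr of the car of a list
  (def cdar (xs)
    (cdr (car xs)))

  arc> (cdar '((1 2) 3))
  (2)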

I find I prefer the Arc approach, since libraries that try to provide everything I might need often have so much stuff that ironically they make it harder to implement what I actually need.

-----

2 points by Adlai 5694 days ago | link

Very useful... I'm surprised that it's not the "standard" yet. Nice job!

-----

1 point by pg 5693 days ago | link

Ok, readc, readb, and peekc now work this way.

-----

1 point by Adlai 5693 days ago | link

Probably a stupid question, but:

Does that mean that I should just manually patch ac.scm to comply with the new functions?

-----

1 point by CatDancer 5692 days ago | link

If you need the change right away you can patch ac.scm yourself, or, if you don't mind waiting, pg will eventually have a new arc3.tar containing the update.

-----

1 point by CatDancer 5694 days ago | link

I don't think it's a bug; readc has always required an argument. It would be an easy enhancement to make the argument optional though.

-----

1 point by Adlai 5694 days ago | link

It looks good! Thank you also for using my idea.

Side note: compare the "profiles" of this version, and the earlier versions -- this one has a much more "functional" profile.

I'm a bit confused about what you mean by atend:err. Do you basically mean that there would be function composition between a macro and a call to, for example, (err "missing char after backslash")? Something like

  (mac atend (alert)
    `(unless (peekc (stdin))
       ,alert))
I think I'm missing something...

EDIT: I get it now. I hadn't noticed that you only use /atend[-:]err/ at points where the next character might "correct" the error.

-----

1 point by CatDancer 5693 days ago | link

> Do you basically mean that there would be function composition between a macro, and a call

Yes, a composition, though not a function composition. Because the Arc compiler rewrites (a:b ...) as (a (b ...)) when a:b appears in the first position in an expression, it works for macros also.

Thus

  (atend:err "missing close quote")
expands into

  (atend (err "missing close quote"))
which macro expands into

  (unless (peekc (stdin))
    (err "missing close quote"))

-----

1 point by CatDancer 5694 days ago | link

Say, I just realized that I could have an "atend" that would be a macro like match-char, and then the error check could look like:

  (atend:err "missing char after backslash")
funny how I only noticed that because of how I had spelled "atend-err" :-)

-----