Arc Forumnew | comments | leaders | submitlogin
Profiler on Anarki
12 points by raymyers 6134 days ago | 16 comments
I've thrown together a basic profiler with an interface loosely based on the SBCL deterministic profiler.

http://www.sbcl.org/manual/Deterministic-Profiler.html

Lo!

  arc> (load "lib/profiler.arc")
  nil
  arc> (load "lib/treeparse.arc")
  nil
  arc> (profile many anything range many-r)
  *** redefining many
  *** redefining anything
  *** redefining range
  *** redefining many-r
  nil
  arc> (time (do ((many anything) (range 1 5000)) nil))
  time: 1512 msec.
  nil
  arc> (profiler-report)
  ((many-r 863) (range 647) (many 0) (anything 0))
  nil
  arc> (unprofile many anything range many-r)
  nil


3 points by almkglor 6134 days ago | link

nice ^^. I'll be trying to hack this into Arki's parser and see how things go.

Another parsing approach I've seen floating around is to use continuations - have a "pass" continuation and a "fail" continuation. It would still approximately be using combinators, but basic parsers would look more like this:

  (def lit (v)
    (fn (remaining pass fail)
      (if (caris remaining v) (pass (list:car remaining) (cdr remaining) nil)
                              (fail))))

  (def alt-l (parsers)
    (if parsers
      (withs ((parser . parsers) parsers
              next (alt-l parsers))
        (fn (remaining pass fail)
          (parser remaining
            pass
            (fn ()
              (next remaining pass fail)))))
      (fn (_ _ fail) (fail))))
Hmm.

-----

1 point by raymyers 6134 days ago | link

Interesting. Wouldn't the continuation approach be slower though?

-----

1 point by almkglor 6134 days ago | link

Not sure; might actually be faster, since supposedly scheme is much more continuation-friendly than lazy-evaluation-friendly. Or not, I'm not sure

Anyway I haven't managed to wrap my head around how to implement 'seq in a continuation style; also, there's the problem of how 'cut on a parser would work in a nonlazy language - supposedly as soon as the parser given to 'cut consumes anything, 'cut will somehow dispose the 'fail continuation. I'd assume 'cut will be nearer to cut-seq, which would approximately be a cut:seq -

  (def cut-seq (parser . rest)
    (let next (seq rest)
      (fn (remaining pass fail)
        (parser
          ; since the name "remaining" is reused here,
          ; a smart compiler should be able to
          ; dispose of the outer function's remaining
          (fn (parsed remaining actions)
            (next remaining
                  (fn (parsed2 remaining actions2)
                      (pass (+ parsed parsed2)
                            remaining
                            (+ actions actions2)))
                  [err "parse error!"]]))
          fail))))
Since fail is disposed upon continuing to next (which is the rest of the sequence) everything which fail holds could then be reclaimed. 'alt's pass is its own pass, but its 'fail contains a closure.

-----

1 point by almkglor 6134 days ago | link

The nice thing here too is that we completely lose the 'return structure. The current code keeps composing and decomposing it anyway. Hmm:

  (def filt (fun parser)
    (fn (remaining pass fail)
      (parser remaining
        (fn (parsed remaining actions)
          (pass (fun parsed) remaining actions))
        fail)))
If we assume that the base compiler isn't very optimizing (i.e. captures closures when it doesn't need to) we could use helper functions for new 'pass functions:

  (let new-pass
       (fn (fun pass)
         (fn (parsed remaining actions)
           (pass (fun parsed) remaining actions)))
    (def filt (fun parser)
      (fn (remaining pass fail)
        (parser remaining
                (new-pass fun pass)
                fail))))

-----

2 points by almkglor 6134 days ago | link

I've just built a continuation-approach parser combinator library, pushed on the git. It's currently faster on 'many and 'seq, but slower on 'alt. It's considerably faster on Arki's use case, however (about 33% less time).

-----

2 points by raymyers 6133 days ago | link

From the comments: "...continuation-passing style, which is hard on the eyes..."

You weren't kidding :) However, I congratulate you on your superior understanding of continuations.

-----

1 point by almkglor 6133 days ago | link

Thanks. And #2 wasn't kidding either.

That said I think the main issues slowing down the monadic treeparse are:

1. Use of 'return. I'm not sure if mzscheme can optimize this away (it's just being used as a multiple-value return, and is immediately destructured by the calling function). Heck, I'm not even sure Arc destructuring is in a form optimizable by Scheme. Using continuations, this tuple becomes the parameters of the call to the continuation, and that I'm somewhat convinced that Scheme (all Schemes) are capable of optimizing. Also, by avoiding explicit destructuring, we can be reasonably sure that Arc gives the parameter list plain to Scheme (not that I've actually studied ac.scm much).

2. Copying of 'parsed. Unfortunately, we have to copy 'parsed, because of 'actions! This is because 'sem attaches the 'parsed part to the 'actions field (via closure), and so we can't reuse this 'parsed part (alternatively we could just copy 'parsed inside 'sem, instead of copying on 'many and 'seq). IMO: drop 'actions ^^. Copying 'parsed, I suspect, may be the single most time-consuming part of the monadic treeparse.

3. Using a 'tconc cell, while neater, is less efficient (about 20% IIRC) than explicit and separate head+tail variables. My goal in optimizing treeparse was to maintain legibility while adding speed; my goal in cps_treeparse was to optimize, legibility be damned^H^H^H^H^H^H^H consigned to hell.

4. There's a few more bits and pieces that may be optimizable - for example, most of my 'filt usage is (filt [map something _] parser). If we are sure that 'parsed will not need to be reused (assured by dropping 'actions, or giving the responsibility of copying 'parsed to 'sem) then we can define a (filt-map something parser) which returns the same list as 'parser, but with their car's transformed by 'something:

  ;monadic
  (def filt-map (fun parser)
    (fn (remaining)
      ; no actions!
      (iflet (parsed remaining) (parser remaining)
        ((afn (p)
           (when p
             (zap fun (car p))
             (self cdr p)))
         parsed)
        (return parsed remaining))))

  ; cps
  (def filt-map (fun parser)
    (fn (remaining pass fail)
      (parser
        (fn (parsed parsedtl remaining)
          ((afn (p)
             (when p
               (zap fun (car p))
               (self cdr p)))
           parsed)
          (pass parsed parsedtl remaining))
        fail)))
Using filt-map would reduce consing, and especially in the cps case would remove the scan through 'parsed to find the 'parsedtl.

Further optimizations abound: another common idiom I found in my wiki parser was (filt nilfn (seq-str "something")), which basically ignores the 'parsed return value of 'seq, It may be possible to create a variant of 'seq, 'seq-nil, which wouldn't bother catenating the 'parsed return values of subparsers.

So no, I don't think continuation-passing per se causes most of the speed difference, except possibly for eliminating the 'return structure.

Another possible issue is the use of a list-like structure. 'scanner-string also conses on each 'cdr, and the total memory usage actually exceeds (coerce "string" 'cons) - it's just that scanners are sexier ^^.

For this case it may be possible to create a variant of the CPS style which takes a tuple of (string, index, end) - as explicit and separate variables of course, so that underlying scheme has a shot at optimizing it away - and use indexing, which I suspect is faster on strings (and which scanner-string ends up doing anyway). This would remove the need for consing too.

-----

1 point by almkglor 6133 days ago | link

LOL forgot to give the parser the list to parse:

  ; cps
  (def filt-map (fun parser)
    (fn (remaining pass fail)
      (parser remaining ;oops!
        (fn (parsed parsedtl remaining)
          ((afn (p)
             (when p
               (zap fun (car p))
               (self cdr p)))
           parsed)
          (pass parsed parsedtl remaining))
        fail)))

-----

1 point by almkglor 6134 days ago | link

Results of a run:

  arc> (profiler-report)
  ((many-r 1522) (wiki-link 84) (formatting 48) (close-br 19)
  (elided-white 12) (bold 10) (italics 7) (nowiki 7) (bolded-text 5) 
  (italicized-text 5) (open-br 5) (ampersand-coded-text 4) 
  (many-format 3) (nowiki-text 2) (ampersand-codes 1) (nowiki-e 0) 
  (alt-r -4) (p-alphadig -4) (seq-r -690))
  nil
O.O? T.T

-----

1 point by raymyers 6134 days ago | link

Dang, I thought I fixed that bug :P

The last line of profiler-exit is supposed to prevent that.

  (def profiler-exit ()
    (let (name ms) (pop *profiler-stack)
      (let total (- (msec) ms)
        (++ (*profiler-times name)
            (- total *profiler-child-time))
        (= *profiler-child-time
           (if *profiler-stack total 0)))))
Did you do something weird, like Control-C out of a profiled call?

-----

1 point by almkglor 6134 days ago | link

No. But if you're talking about weird stuff, well yes: bolded-text and italicized-text also refer back to formatting via delay-parser:

        (*wiki-pp italicized-text
          (seq italics
               (filt in-italics
                     (many:seq (cant-see italics) (delay-parser formatting)))
               italics))
   ...

        (*wiki-pp formatting
          (alt
            (pred [let c (car _)
                    (or (alphadig c)
                        (whitec c)
                        (in c #\, #\.))]
                  anything)
            wiki-link
            bolded-text
            italicized-text
            ampersand-coded-text
            nowiki-text
            (filt [cons (escape:car _) nil] anything)))

-----

1 point by raymyers 6133 days ago | link

Crikey! Do you think delay-parser is the issue? Let me know if you can reproduce this.

-----

1 point by almkglor 6133 days ago | link

I don't think it's 'delay-parser, per se, just the interrecursion: 'formatting calls 'bolded-text calls 'formatting. I think this is what's making the profiler give weird numbers.

-----

1 point by raymyers 6131 days ago | link

I am having trouble to replicating your problem with mutual recursion. I get the expected result with this mutually recursive test. Can you give me an example that reports incorrectly?

  (def foo (li)
    (if li (do (sleep 0.25) (bar (cdr li)) (sleep 0.25))
        (sleep 2)))

  (def bar (li) 
    (if li (do (sleep 0.25) (foo (cdr li)) (sleep 0.25))
        (sleep 1)))

  (profile foo bar)
  (foo '(a a a a a a a))
  (profiler-report)

-----

1 point by almkglor 6131 days ago | link

Hmm. I'll check again later. In the meantime, what I did in the great-grandparent post was just this, on a fresh start:

1. turn on profiling in wiki-arc/wiki-arc.arc, by setting the variable whose name escapes me at the moment to t

2. Load wiki-arc/wiki-arc.arc and (thread (asv)) the server.

3. Load a single test page, with various formatting, including '''a bolded text's attempt to ''confuse the parser''''', [[link]]s, [[sample link|links with different text]]s etc. etc.

4. (profiler-report)

I'll check later on it, and possibly also post the sample wikitext.

-----

1 point by almkglor 6131 days ago | link

Even weirder now:

  ((alt-r 2) (many-r 1) (formatting 0) (close-br 0) (italics 0) (arc-code 0) 
  (wiki-link 0) (open-br 0) (arc-tag-e 0) (bolded-text 0) (nowiki-e 0) 
  (many-format 0) (elided-white 0) (p-alphadig 0) (italicized-text 0) 
  (ampersand-coded-text 0) (ampersand-codes 0) (arc-tag 0) 
  (nowiki-text 0) (nowiki 0) (bold 0) (seq-r -3))
This from a page which Arki reports as having been generated in 1666msec. It's funny, because the total time is 0.

Can you try reproducing the above setup? Maybe it's the threading thing which is bollixing it?

-----