Arc Forumnew | comments | leaders | submitlogin
3 points by rocketnia 5138 days ago | link | parent

There also seems to be a consensus that overloading + for concatenation is a bad idea

I thought there was a majority for concatenation. XD Well, either way, I'm for it. Maybe you've heard this all before, but I'll summarize:

I've convinced myself that '+ is a sensible standard name for monoid operations, including concatenation, and that '+ and '* together should describe a field whenever that makes sense (http://arclanguage.org/item?id=12850). To appease people who are concerned with efficiency, we can always leave in the math-only version of '+ as a separate variable, or just have the '$ drop-to-Racket operator so they can get it themselves.

As far as implementation goes, the technique I recommend is making a separate two-argument version of the varargs function, so that it's easier to extend consistently (preserving the left-associativity of the varargs version). Then a double dispatch technique comes in handy, but it only has to be a simple one. I've posted some rough examples before to illustrate this strategy: http://arclanguage.org/item?id=12561 http://arclanguage.org/item?id=12858



1 point by akkartik 5138 days ago | link

"I thought there was a majority for concatenation."

I was thinking of http://arclanguage.org/item?id=12347, where it seemed like PG and everyone else didn't like '+ for concatenation :) The thread on matrices and vectors was a separate issue. I think we're all ok overloading + for adding different kinds of mathematical entities, and so on. I'm not sure what sort of monoid allows + for list concatenation.

-----

2 points by rocketnia 5138 days ago | link

I was thinking of http://arclanguage.org/item?id=12347, where it seemed like PG and everyone else didn't like '+ for concatenation :)

I was thinking of the same thread. ^_^ We must have kept score in different ways, depending on which arguments convinced us and so on.

The thread on matrices and vectors was a separate issue.

I was using 'v+ as an example of an extensible, left-associative varargs operator, just to give an idea of how to implement '+ and '* so they're extensible.

I think we're all ok overloading + for adding different kinds of mathematical entities, and so on. I'm not sure what sort of monoid allows + for list concatenation.

Well, lists are mathematical entities. If that isn't intuitive, consider how we reason about basic properties of rational numbers. It's usually easiest to break them apart into numerators and denominators and form conclusions about those as integers. But if we can reason about something in terms of two pieces it has, it makes sense for our reasoning system to support arbitrary pairs rather than just the special case of rational numbers. Those pairs can be built up into linked lists.

If your concern is more about monoids, here's the Wikipedia version: "In abstract algebra, a branch of mathematics, a monoid is an algebraic structure with a single associative binary operation and an identity element."

Concatenation is associative (a . b . c is the same regardless of whether you do a . b or b . c first) and has an identity element (a . nil and nil . a are both just a), so the set of lists (or strings, etc.) together with the concatenation operator makes a monoid. Also, real numbers are a monoid under addition, since it too is associative and has an identity element (zero).

A few accumulation and repetition utilities here and there, like '++, 'mappend, and 'summing, can be generalized to all monoids... but really, the main benefit in my book is having a consistent excuse to use '+ for concatenation. :-p

-----

2 points by rocketnia 5137 days ago | link

Oh, I keep forgetting: I'm not sure I like having '+ be implemented using just a two-argument version. Naively concatenating two things at a time is embarrassingly inefficient (http://en.wikipedia.org/wiki/Schlemiel_the_Painter%27s_algor...). So, I'm sorry I suggested it!

Instead of that approach, I've been thinking about having one function that wraps the first argument in some kind of output stream, one function that submits a new argument to that stream, and one function that gets the value accumulated in that stream. So here's yet another untested, throwaway code example illustrating the functionality I'm talking about (but not the extensibility, unless you use 'extend):

  (= orig-inside inside orig-+ +)
  
  (def +streamer (x)
    (case type.x string  (w/outstring str (disp x str) str)
                 int     x
                 num     x
      (if alist.x
        (annotate 'basic-streamer
          (obj test alist func [apply join _] args list.x))
        (err "unrecognized case"))))
  
  (def fn-add-to (segment streamer)
    (if (and (isa streamer 'output) (isa segment 'string))
      (do (disp segment streamer)
          streamer)
        (and (in type.streamer 'int 'num) (in type.segment 'int 'num))
      (orig-+ streamer segment)
        (and (isa streamer 'basic-streamer) rep.streamer!test.segment)
      (do (push segment rep.streamer!args)
          streamer)
      (err "unrecognized case")))
  
  (mac add-to (segment place)
    (w/uniq g-streamer
      `(zap (fn (,g-streamer)
              (fn-add-to ,segment ,g-streamer))
            ,place)))
  
  (def inside (x)
    (case type.x output          orig-inside.x
                 basic-streamer  (rep.x!func rep.x!args)
                 int             x
                 num             x
      (err "unrecognized case")))
  
  (def + args
    (iflet (first . rest) args
      (let s +streamer.first
        (each arg rest
          (add-to arg s))
        inside.s)
      0))
Inefficient concatenation is probably what's bogging down Penknife's parser, so I'll put this technique to the test sooner or later to see how much it helps.

-----

1 point by rocketnia 5137 days ago | link

In order to get an apostrophe in that URL (http://en.wikipedia.org/wiki/Schlemiel_the_Painter%27s_algor...), I had to escape it as %27. Even when I went to edit my post, the quote had been stripped out of the recreated input. Bug?

-----

1 point by akkartik 5137 days ago | link

You can probably do a quick performance comparison and count conses like waterhouse does in that + thread. I'd be curious to see your results.

-----

1 point by rocketnia 5137 days ago | link

I haven't found my way into cons-counting.... I know it's probably sloppy, but I allocate as though it takes constant time. Given that assumption, I'm less concerned with an overall constant-time slowdown and more concerned with allowing a '+ of N sequences of length L to take O(NL) time rather than O(N^2 L) time.

This is also a mostly[1] less limited design from an interface point of view. It's straightforward to port an extension from my other design to this one: The '+streamer and 'inside functions will be extended with (fn (x) x), and 'fn-add-to will be extended with the desired two-argument behavior. (I'd personally make a macro for this, and I'd use it to (re)implement the number-and-number case.)

[1] The exception is if you want to use '+ on streamer types themselves, since a '+streamer behavior of (fn (x) x) will cause that type to be confused with whatever other type wraps itself up as that streamer type.

-----

2 points by akkartik 5138 days ago | link

"I was thinking of the same thread. ^_^ We must have kept score in different ways, depending on which arguments convinced us and so on."

Ack, you're right. So it's just me and waterhouse against +?

I tried replacing + on anarki and got immediate pushback. join does seem a long name, especially inside prn's to generate html, but what other one-character name can we use besides +? I'm leaning back towards + again, perhaps mirroring the experience of people who've tried this before.

-----

1 point by evanrmurphy 5137 days ago | link

> So it's just me and waterhouse against +?

IIRC, pg found himself against it too, but rtm was for it.

> what other one-character name can we use besides +?

One feature I like in PHP is the use of the dot (.) for concatenation. We've already loaded up that character quite a bit here in Arc, with its use in conses, rest parameters and for the ssyntax `a.b` => `(a b)`. But concatentation is at least vaguely isomorphic to consing. I wonder...

Probably not. `+` is your best bet, IMHO.

-----

1 point by akkartik 5137 days ago | link

I should mention, just for completeness, that haskell uses ++.

-----

1 point by evanrmurphy 5137 days ago | link

Didn't know that. Could you give a quick example?

-----

2 points by akkartik 5137 days ago | link

In haskell you can designate any sequence of characters as an infix operator. Here's the definition of ++ from the prelude (http://www.haskell.org/onlinereport/standard-prelude.html):

  (++) :: [a] -> [a] -> [a]
  []     ++ ys = ys
  (x:xs) ++ ys = x : (xs ++ ys)
so [1, 2, 3] ++ [4, 5] = [1, 2, 3, 4, 5]

-----

1 point by akkartik 5138 days ago | link

:) I'm convinced.

"if we can reason about something in terms of two pieces it has, it makes sense for our reasoning system to support arbitrary pairs rather than just the special case of rational numbers. Those pairs can be built up into linked lists."

Perhaps lists are continued fractions? :)

-----