Arc Forum
Ask The Community: Lexical Scoping
2 points by nlavine 5967 days ago | 21 comments
This is a question, because I don't understand something. Lexical scoping for functions is usually considered good. People really like the fact that the symbols you use to refer to things will always reference the lexically obvious objects, not whatever happened to be bound to those symbols last.

But as far as I can tell, the equivalent idea for macros is hygiene, and that is not embraced. My question is, why not?

Is there a difference between macro hygiene and lexical scoping that I don't understand?

Are macros the one place where interfaces are easier with dynamic rather than lexical scoping?

Or (just thought of this now), does dynamic scoping seem better for macros because it's the only way to get higher-order macros? (Because you can't pass a macro to a macro, so a lot of the common functions don't have macro equivalents.)

Thanks for your help.



3 points by rntz 5964 days ago | link

The main reason is that it's simpler.

The second reason is that the only (AFAIK) example of standardized hygienic macros - Scheme - is a hell of a lot harder to use than it really needs to be.

The third reason is that unhygienic macros are more powerful, because they LET you subvert lexical scope if you want to. Sometimes macros want to mess around with variables that you didn't explicitly pass in as arguments and that are not available in the scope of the macro's definition. Rare, but it happens.

For example, what if a macro wants to define something that is not passed in as a symbol, and have it available in the scope in which the macro was invoked? All of the anaphoric macros (e.g. "aif") are of this type, and they cannot be done in a pure hygienic system.
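
As a minimal sketch of what such a macro looks like in Arc (the name my-aif is made up here so it doesn't clash with the built-in aif, and this simplified version takes exactly one else branch):

```arc
; my-aif: hypothetical, simplified anaphoric if.
; The expansion binds the symbol it on purpose, so the caller's
; branches can refer to the value of the test.
(mac my-aif (test then else)
  `(let it ,test
     (if it ,then ,else)))

; it is never passed in as an argument, yet the then-branch can use it:
(my-aif (+ 1 2) (* it 10) 'none)   ; evaluates to 30
```

The binding of it is visible to the caller's code only because the expansion is spliced in as-is; an expander that renamed every symbol the macro introduces would hide it.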

IMO, the best idea would be to have a system which is hygienic by default, but in which you can explicitly say "hey, I want to use this variable dynamically". In fact, I also think this is the way that scoping should work in general: sometimes you'd like to be able to bind a variable dynamically - for example, you want to debug a function call within a certain period of code execution, so you dynamically bind the symbol to a wrapper function. This can be emulated by setting it beforehand and unsetting it afterward, but this doesn't handle call/cc, threading, etc. correctly.
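
The "set it beforehand and unset it afterward" emulation might look like the sketch below. The names square and call-traced are hypothetical; after is Arc's unwind-protect-style macro. Because it swaps the global for everyone, including other threads, it is only an approximation of a real dynamic binding.

```arc
(def square (x) (* x x))

; call-traced: temporarily replace the global square with a wrapper
; that logs its argument, then put the original back afterward.
(def call-traced ()
  (let old square
    (= square (fn (x)
                (prn "square called with " x)
                (old x)))
    (after (square 4)          ; runs with the wrapper installed
           (= square old))))   ; restored even if the call raises an error
```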

However, lexical scope only for variables is usually "good enough", and it's simpler to implement than a hybrid system; likewise dynamic scope only for macros is usually "good enough", and it's simpler to implement than either a hybrid or a lexical system. (In fact, it used to be thought that dynamic scope for variables was "good enough", and it is easier to implement [for lisp at least: just keep a running alist-stack of variable bindings] than either lexical or hybrid scope.)

-----

1 point by nlavine 5964 days ago | link

Okay, I think I get what's going on with macros. So the idea of current macros is that they can only insert references to variables in the scope of wherever they're expanding. Therefore they do operate purely on syntax. If they want to insert references to functions, you need to make sure that the functions are bound in the expansion scope, but on the other hand, the macros are conceptually clean and really really easy to implement.

For what you're saying, I suggest the old MIT Scheme macro system as an example (before they implemented Scheme macros correctly): essentially, a macro is a function of three arguments - the form it's applied to, the environment of that form, and the environment that the macro was defined in. There is a procedure to return a "symbol" that refers specifically to a name bound in a given environment, so you can make specific references to it. The macro procedure returns some sort of code (just like normal lisp macros), but it can optionally include some of these special "symbols" that refer to bindings in a specific other environment.

That is much more complicated to implement, though, and requires environments as first-class objects.

-----

1 point by rntz 5963 days ago | link

Well, I wouldn't say "on syntax", I'd say "on code" - and code only contains symbols, not "references to variables in a specific scope"; scope is determined by context - the surrounding code, which the macro can't alter: you can't substitute code into other code while retaining the same surrounding code, that's a contradiction in terms! But this is just terminology.

The old MIT Scheme macro system seems interesting from what you say - is there any place I could go to find an implementation which has this behavior? Or at least examples of code which use it? It seems like it lets code do precisely what I said it couldn't above: contain "references to variables in a specific scope", which is pretty cool. I don't think you'd need to implement environments as first-class run-time objects, merely second-class compile-time objects, with this technique, unless you also allow macros themselves to be first-class.

-----

2 points by nlavine 5963 days ago | link

Okay, I think I'm starting to see. There is quite a big difference between the way Lisp people think of macros and the way Scheme people think of them.

From the documentation, I think that the current version of MIT scheme has this behavior, so look at http://groups.csail.mit.edu/mac/projects/scheme/. (By the way, in case you run Ubuntu, the version of MIT Scheme in the repositories is broken for me.) Look in the documentation for macros (under "Special Forms"), and it's their non-standard low-level macro system. If you're interested in stuff like that, you should also check out syntax-case, which I don't know much about, but I understand is the new, cool way to write Scheme macros. It includes hygienic and unhygienic functionality. Google search for "syntax case" and you'll get some documentation.

The more I look at it, though, the more I think that Scheme macros solve a different problem than Lisp macros. I don't know what it is yet, but it would be interesting to know.

-----

1 point by cchooper 5963 days ago | link

I think you've hit the nail on the head. Hygienic macros and unhygienic macros are very different things (unlike dynamic vs lexical scoping, which are just different ways to create a function). Lisp macros are 'true' macros (Wikipedia: "Macro: a set of instructions that is represented in an abbreviated format"). Hygienic macros are more like a new abstraction that was inspired by Lisp macros.

-----

1 point by nlavine 5962 days ago | link

Well, I'd rather not argue about what 'true' macros are, but I would point out that your definition is basically data compression for programs (which, by the way, I think is an interesting approach to take to programming language design). I'm pretty sure both types of macros and normal functions would all fall under it.

As for the hygienic vs. unhygienic difference, unhygienic macros are certainly easier to define: they rearrange source code into other source code.

The one thing I can think of that hygienic macros can do that unhygienic ones can't is that while they are rearranging source code, hygienic macros can insert references to things that aren't bound to any variable in the expansion scope. The common example I've seen for this is that it lets you protect against people redefining your variables weirdly. For instance, if you insert a reference to 'car', it means whatever 'car' meant where you defined your hygienic macro, even if 'car' has been redefined to be something crazy in the place where your macro is used. The Scheme hygienic macro system also has a way to break hygiene if you want to, so it can do everything other Lisp macros can do.
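
A small sketch of the 'car' problem with an unhygienic macro in Arc (first-of is a hypothetical name used only for illustration):

```arc
; first-of: hypothetical unhygienic macro that expands to a call to car.
(mac first-of (xs)
  `(car ,xs))

(first-of '(1 2 3))            ; uses the global car

(let car (fn (x) 'surprise)    ; a local car shadows the global one
  (first-of '(1 2 3)))         ; the expansion now calls the local car
```

With a hygienic expander, the car in the expansion would keep referring to whatever car meant where first-of was defined.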

I guess the question then is, is it useful to be able to do that?

And if you decide you want to be able to do that, are Scheme-style hygienic macros the right way to go about it?

(One option would be to just let you insert objects straight into forms, instead of symbols that you think should reference those objects. This would be fine unless you wanted to be able to set those things later, in which case you'd need some way to get at the actual underlying variable boxes.)

-----

1 point by stefano 5963 days ago | link

> That is much more complicated to implement, though, and requires environments as first-class objects.

Given the interpreted nature of Arc, first class environments shouldn't be too hard to implement, but in a compiled implementation it would be a lot more difficult.

-----

2 points by rntz 5962 days ago | link

They could not be implemented on top of the existing arc-to-scheme translator, because that's what it is: an arc-to-scheme translator, not an interpreter. Scheme doesn't have first-class environments, so we can't shove them into arc without writing a full-fledged arc interpreter.

-----

1 point by stefano 5961 days ago | link

I've had a closer look at Arc2.tar. You're right. I thought that runtime environments were handled explicitly.

-----

1 point by almkglor 5963 days ago | link

Depending on how environments are handled.

If the environments are temporaries created only while processing macros, and are discarded afterwards, they don't necessarily have to be difficult for compilers.

-----

1 point by absz 5964 days ago | link

For what it's worth, (PLT) Scheme can in fact do what you require. After all, there's an implementation of defmacro that ships with it (though it might be called something else)! And there's a way (if I recall correctly, a very long, verbose way, shoring up your second reason) to implement aif and the like in their system by "requesting a shadower" or something like that.

-----

3 points by almkglor 5967 days ago | link

I mostly agree with absz, but in thinking deeply about lisp-likes and their implementation, I've stumbled on the formal equivalency of dynamic scoping on the expression-side to lexical scoping on the program-logic-side. Try taking a look at 'ac in ac.scm - it has an 'env parameter which is formally equivalent to a dynamic scoping rule for lexical compilation.

Basically, suppose we have the following code:

  (fn (x)
    (fn (y)
      (fn (x) x)))
Now suppose that our macro/code-walker starts at the topmost branch, with an empty dynamic environment, '()

It then enters the first fn and creates a dynamic entry:

  (fn (x)           <-- I'm here: ((x . nil))
    (fn (y)
      (fn (x) x)))
Then it enters the inner fn and inserts a new dynamic entry to the top of the dynamic context:

  (fn (x)
    (fn (y)           <-- I'm here: ((y . nil) (x . nil))
      (fn (x) x)))
Finally we enter into the innermost function:

  (fn (x)
    (fn (y)
      (fn (x) x))) <-- I'm here: ((x . nil) (y . nil) (x . nil))
Note that the macro now "knows" that the "x" here is a different "x" from the one higher in the hierarchy ^^
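
A minimal sketch of such a walker in Arc, assuming parameter lists are plain lists of symbols (no optional, rest, or destructuring parameters). The names walk and ref are made up for this example; env is just a list of the names bound so far, innermost first.

```arc
; walk: return form with every bound variable reference replaced by
; (ref name index), where index is the position of the nearest
; binding of that name in env (0 = innermost).
(def walk (form env)
  (if (isa form 'sym)
       (aif (pos form env)
            (list 'ref form it)
            form)                      ; free symbol: leave it alone
      (and (acons form) (is (car form) 'fn))
       (let params (cadr form)
         ; entering a fn pushes its parameters onto the front of env
         `(fn ,params
            ,@(map [walk _ (join params env)] (cddr form))))
      (acons form)
       (map [walk _ env] form)
      form))

(walk '(fn (x) (fn (y) (fn (x) x))) nil)
; returns (fn (x) (fn (y) (fn (x) (ref x 0))))
```

At the innermost x the env is (x y x), and the lookup finds the nearest x first, which is how the walker tells the two x's apart.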

-----

1 point by nlavine 5967 days ago | link

Oh, I agree, dynamic scoping and lexical scoping are formally equivalent. For one thing, basic computer architectures are all dynamically scoped, with 2^32 (or whatever) variables, one at each memory address. Also, as you say, you can just pick one dynamic variable and use it to hold the lexical environment as you walk through a program. (Unfortunately, I don't know enough to see what the really cool implications of this are, but I'm sure it has some.)

My question is, why is lexical scoping considered better for functions, but dynamic scoping for macros? And more specifically, are these actually cases where different solutions are better, or is it an artifact of the fact that we don't know how to implement lexically scoped macros very well?

-----

1 point by bOR_ 5967 days ago | link

Sounds intriguing ;), but I'm missing the implications of your observation. Can something nifty be done with this?

-----

1 point by almkglor 5966 days ago | link

Yes: a code-walker interpreter with dynamic lookup can implement lexical scope, provided you don't pass functions around (on the interpreted-language side).

If you do pass functions around though, you need to keep separate slightly-different versions of the environment variable, and attach those to the functions you pass around on the interpreted-language side.
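
As a rough sketch of attaching the environment to functions, here is a toy evaluator in Arc. The names ev and closure are invented for this example, and it handles only symbols, literals, single-expression fn forms, and application. The environment is an alist of (name . value) pairs.

```arc
; ev: evaluate form in env, an alist of (name . value) pairs.
(def ev (form env)
  (if (isa form 'sym)
       (cdr (assoc form env))
      (atom form)
       form
      (is (car form) 'fn)
       ; snapshot the current env inside the function value
       (list 'closure (cadr form) (car (cddr form)) env)
      (let (f . args) (map [ev _ env] form)
        (let (tag params body cenv) f
          ; extend the closure's saved env, not the caller's env
          (ev body (join (map cons params args) cenv))))))

(ev '(((fn (x) (fn (y) x)) 1) 2) nil)   ; evaluates to 1
```

The inner function is returned out of the scope that bound x, but it still finds x because it carries its own copy of the environment.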

-----

3 points by stefano 5967 days ago | link

Functions can be passed around, so the lexical scope is really handy in this case. Macros, instead, are expanded in place: e.g. the macro

  (mac m (a)
    `(f ,a))
"f" is just a symbol. In hygienic macros f would be changed to some unique symbol (actually you wouldn't write such code in a hygienic macro if your intent was to make the expansion call f). In a standard macro system, you deliberately capture the symbol 'f, perhaps because it will be bound to a function at runtime. In that example f has neither dynamic nor lexical scope: it is just a symbol that the macro has put in the result of its computation. Its meaning (variable or not, lexically scoped or not) will be decided in the subsequent compilation pass.

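One way to see that the result is only a list of symbols is to ask for the expansion without evaluating it. This sketch uses Arc's macex; f does not need to be defined for it to work:

```arc
(mac m (a)
  `(f ,a))

(macex '(m 42))   ; the expansion is the plain list (f 42)
```
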
-----

1 point by nlavine 5967 days ago | link

Yes, exactly. I see why that is true. But think about a parallel function:

  (def n (f a)
    (f a))
In the function, you can pass f as an argument to call. In a macro, f is passed implicitly in the environment the macro is expanded in. The method of passing f to the macro m seems very much like using dynamic scoping to pass arguments to functions. My question is, what about a macro system where you could pass real arguments to macros? I.e., other macros (or functions, etc.)? What makes these things different?

-----

2 points by stefano 5966 days ago | link

> what about a macro system where you could pass real arguments to macros?

Like this

  (mac n (f a)
    `(,f ,a))

  (n [+ _ 1] 9) 
  ==> 10
where you pass a form that the macro splices into its expansion, or did you mean something else? Macro arguments aren't evaluated, so you can pass only forms. To pass the result of a computation in CL (not in Arc) you can use the #. read macro, which is evaluated at read time, before macro expansion:

  (n #.(computation) 9)
This is quite unusual, because macros are intended mainly to modify the syntax, so it's quite natural to make them work on the syntax itself (i.e. the forms).

-----

1 point by nlavine 5965 days ago | link

Ah, I see this. I think I have been thinking of macros differently than you have (and probably wrongly). I suppose lexical scoping for macros would make the most difference in the case where a macro expands to a function call, like this:

  (let x 0  ; call (count) to keep a count of things
    (def count () (set x (+ x 1))))

  ; count-ops: count how many lines of code you run.
  (mac count-ops body
    ; wrap the forms in do, putting a (count) call after each one
    `(do ,@(mappend (fn (form) (list form '(count)))
                    body)))

  (def foo ()
    (with (count 1 step 2)
      (count-ops ; this count-ops call fails: the inserted (count) sees the local count
         ; do some loopy stuff here
      )))
If that's not a compelling example, pretend that count-ops is inserting an interrupt check between every two lines of code. Why is dynamic scoping better for cases like these?

As for real arguments to macros, yes, I meant something like the CL stuff. You're right, though, that macros modify syntax, and I wasn't thinking about them that way. Two posts up you said that macros are "expanded in place". I think that you were thinking about the effect macros have on the code they're called on, whereas I was thinking about the call to the actual macro procedure, and passing arguments to it.

-----

3 points by absz 5967 days ago | link

The biggest reason Arc has unhygienic macros is that they are conceptually simpler in terms of implementation: the code that the macro outputs is spliced right in, with no preprocessing/wrapping/whatever. If macros were hygienic, they would have to be preprocessed somehow before being substituted in.

Whether this simplicity is better than hygiene is, as you have probably noticed, argued about, but that's the rationale.

-----

2 points by cchooper 5966 days ago | link

But what is the 'lexically obvious object' for symbols in a macro expansion?

  (= z '(list y))

  (with (x 1 y 2)
    (mac foo () z))

  (with (x 3 y 4)
    (pr (list x))
    (pr (foo)))
which expands to

  (with (x 3 y 4)
    (pr (list x))
    (pr (list y)))
To my mind, the lexically obvious value of y is 4 in that last expression, just as the lexically obvious value of x is 3. I see nothing lexically obvious about giving y the value 2, because there is no lexical reference to y in that scope.

So in my opinion, lexical scoping and unhygienic macros are both doing the lexically obvious thing. Hygienic macros introduce a whole new kind of scoping that is neither lexical nor dynamic, but looks like lexical scoping in specific circumstances.

Unhygienic macros preserve true lexical scoping while hygienic macros try to override it. So from my point of view, the question should be "If lexical scoping is a good idea, why would you want hygiene?"

-----