Arc Forum
Do programmers really need namespaces or modules?
5 points by akkartik 5159 days ago | 12 comments
I found myself rereading this old thread on namespaces (http://arclanguage.org/item?id=11971) and realized hey, arc already has protection against collisions! def warns you when you redefine a name. Isn't that enough? Is there something else namespaces buy us?

Warnings seem like a far more lightweight solution than namespaces. You only need to change one def: def. And you don't need ugly boilerplate in every. single. file.
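
To make the "warn on redefinition" idea concrete, here is a minimal sketch in Python rather than Arc, purely as an illustration. The names `registry` and `define` are made up for this example and are not how Arc's def is implemented; the point is only that a single definition form can detect a collision without any per-file declaration.

```python
import warnings

registry = {}  # hypothetical global table of definitions

def define(name, value):
    """Bind name to value, warning if the name is already taken."""
    if name in registry:
        warnings.warn(f"redefining {name}")
    registry[name] = value
    return value

define("foo", lambda: 1)
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    define("foo", lambda: 2)   # same name again: triggers the warning

print([str(w.message) for w in caught])
```

Note that the second definition still wins; the warning only tells you a collision happened.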



3 points by waterhouse 5158 days ago | link

I've had a couple of ideas...

1. Create a special form that rebinds 'assign (which is the assignment operator that '= and pretty much everything else expand into--except not things like 'scar or the equivalent (= (car x) ...)), so that instead of affecting global variables, all '= and 'def forms evaluated within this special form will modify values within a hash table. This special form can return the hash table. You'd put (load "file.arc") inside that special form.

This will probably have to be done by the Arc compiler (or something that duplicates its functionality), because global references to preexisting variables will need to be compiled normally, while global references to variables within the module will need to be compiled into references to the hash-table (or perhaps to variables in a local environment that spans the whole module), and references to local variables will need to be compiled normally. I think this will be somewhat hard to implement, but if done, then it could do exactly what hasenj asks for.

2. A hacky idea that seems like it'll work... very well, in fact. It has several caveats, which I'll get into later. Here are the pieces of the idea.

A. Scheme has a built-in (namespace-mapped-symbols) function. It returns a list of all variables that have values in the current namespace.

B. We can use this to extract a list of all Arc variables.

  (def arc-namespace ()
    (map [sym:cut string._ 1]
         (keep [is string._.0 #\_]
               ($.namespace-mapped-symbols))))
C. Now, we're going to find out what global variables the module file creates. And also what it modifies; there may be some definition overlaps (that case is what such a module system is meant to deal with). So, we're going to store the entire Arc environment (i.e. make an assoc-list or hash-table of symbol-value pairs), load the file, find out what variables were modified or created, restore the old environment, create a massive local environment in which all the newly created/modified variables are initialized to nil, load the file in that local environment, and do what we want with all the local variables that now refer to the variables created in the file (like stuff them in a hash table).
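
The snapshot, load, diff, restore sequence in step C can be sketched as follows. This is a Python analogy, not the Arc implementation: the environment is a plain dict, `stuff` is a hypothetical stand-in for loading a file, and `affected_vars` and `load_module` are illustrative helpers that only borrow their names from the discussion.

```python
def affected_vars(env, load):
    """Return names that load(env) creates or rebinds, then restore env."""
    before = dict(env)                      # shallow snapshot of the environment
    load(env)
    missing = object()                      # sentinel for "was not bound before"
    changed = sorted(k for k, v in env.items()
                     if before.get(k, missing) is not v)
    env.clear()
    env.update(before)                      # put the old bindings back
    return changed

def load_module(env, load):
    """Run load in a private copy of env; return only what it touched."""
    names = affected_vars(env, load)
    private = dict(env)
    load(private)                           # the file is evaluated a second time
    return {name: private[name] for name in names}

def stuff(ns):                              # stand-in for loading "stuff.arc"
    ns["helper"] = lambda x: x + 1
    ns["pr"] = lambda x: ("module pr", ns["helper"](x))

env = {"pr": print}
mod = load_module(env, stuff)
```

After this runs, `env` still holds only the original `pr`, while `mod` holds the module's `pr` and `helper`. The file is evaluated twice, which is the multiple-evaluation cost the design accepts.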

Details. First, note that 'eval doesn't respect local variables, and I think it kinda has to be that way, so the way we should load the file is like this:

  (eval `(with (newvar nil newvar2 nil modvar nil ...)
            ,@(readfile "stuff.arc")
            (let gtb (table)
              (= gtb!newvar newvar
                 gtb!newvar2 newvar2
                 ...)
              gtb)))
The stuff involving 'newvar will, of course, be macro-generated as well. This may be a complicated beast of a macro, but quite doable.

Less importantly, note: a) I don't think Arc has something like (let u 'ach (setf (symbol-value u) desired-value)). This is easily patchable with something like (let u 'ach (($ namespace-set-variable-value!) ($.ac-global-name u) desired-value)). You can also do (eval `(= ,u ',desired-value)) but that sucks in a few ways and it really should be put into the language. (Note that $.[name-containing-bang!] breaks due to ssyntax, so you need to put it in parentheses form.)

b) Currently, hash tables can't contain nil; with table tb, (= tb!x nil) deletes the value associated with x. This is worked around in the definition of 'defmemo by creating a "nilcache" table. Not sure if that's the best way to handle it. For the moment, I'd probably use an assoc-list--and note that 'alref doesn't allow an optional "fail-value" argument, so I'd use 'assoc and deal with it.

c) I don't think Arc has a way to undefine variables. Again, this is easily patchable: (($ namespace-undefine-variable!) ($.ac-global-name var)).

...And here is an implementation, shown in action.

http://pastebin.com/qJRepGcM

(I modified it from the above to respect macro definitions within the file. Interestingly, this leads me to load a file three times: once to extract changed variables, once to get macros, once to evaluate all definitions. I've heard of "load, compile, and run". Maybe this is like that.)

We see that creating the module in a local environment containing all the variables makes each one independent. You can just extract 'ppr, and its lexical environment will contain all its dependencies; you can throw away the rest of the hash table, and your namespace is totally unpolluted; additionally, there is no cost of doing hash table lookups (in fact, they're lexical-variable lookups) or of figuring out how to make the compiler optimize them out.

Also, take note of this: Suppose a module contains a huge amount of stuff and all you want is one little function that depends on just a couple of things. If we were, say, going to save the Arc process into a self-contained executable and distribute it, we wouldn't want to stick it with 40 MB of mostly unnecessary libraries (I'm looking at you, SBCL). Well, guess what happens if you use this local-environment method and extract only the function you want: All the unnecessary crap will have nothing referencing it and will get garbage-collected. This is so good. I believe it is the perfect solution. (Returning a hash-table and extracting the desired functions and throwing away the table also has this effect.)

Caveats. First, there are fundamental issues with this design. All toplevel expressions will be evaluated more than once (three times currently; two is easily possible, but not one without compiler power); if you want the module to do any multiple-evaluation-sensitive initialization, put that inside a function named 'initialize or similar, and the person loading the module should call (initialize) afterwards. Also, your module shouldn't generate new names when loaded multiple times. And it shouldn't contain any functions (including 'initialize) that do global assignment to variables that don't exist yet. So if you want it to do (= counter* 0) within a function, you should do a global (= counter* 'nonce). Finally, don't even think about using macros from a module (though the definitions in the module can themselves contain macros). Macros will be put into the table, and you can extract their functions with 'rep and put them into your own macros, but if their expansions are supposed to reference things from the module, that seems impossible without at least code-walking power.

Second, there are some issues that can be fixed with some surface modifications of the design. If you want to load a module and then modify anything inside it (global functions or variables), that currently will have no effect unless the module provides specialized functions to do so (the functions depend on what's in the lexical environment of the module, which you can't touch by modifying the hash table). However, it is trivial to make 'load-module insert 'get-var and 'set-var functions into the module that will get and set the lexical variables. A user of the module will have to use (ns!set-var 'x val) instead of (= ns!x val).
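
Here is a rough Python analogy of why the table alone is not enough and what accessor functions buy you. Everything in it is hypothetical: `load_counter_module` plays the role of a loaded module, and the `"get-var"` and `"set-var"` entries mimic the accessors described above.

```python
def load_counter_module():
    count = 0                       # lexical variable private to the module

    def bump():
        nonlocal count
        count += 1
        return count

    def get_var(name):
        if name == "count":
            return count
        raise KeyError(name)

    def set_var(name, value):
        nonlocal count
        if name == "count":
            count = value
        else:
            raise KeyError(name)

    return {"count": count, "bump": bump,
            "get-var": get_var, "set-var": set_var}

ns = load_counter_module()
ns["count"] = 100            # only changes the table entry; bump never sees it
first = ns["bump"]()         # 1
ns["set-var"]("count", 10)   # rebinds the closed-over variable itself
second = ns["bump"]()        # 11
```

Assigning into the table is silently ignored by `bump`, while going through the setter changes what `bump` sees.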

Issues with my implementation: It uses 'is to determine what variables were changed. If, say, your code goes (= thing* nil) and the module goes (= thing* nil), my implementation won't notice the difference and won't localize it. Likewise for (= counter* 0) and (= name* "Bob"). This could be worked around by, say, using something like "defvar" to define global variables (not to be confused with the defvar from Common Lisp, which creates special variables with dynamic scope), where 'defvar would record everything it does in a table that 'affected-vars could access. (For helpfulness, 'load-module could give warnings when attempting to load a file containing toplevel '= things.) Or: We could modify the body of '= to contain a clause like

  (if we-are-loading-a-module*
      `(do (= (things-changed* ',var) t)
           ,(proceed-as-usual))
      (proceed-as-usual))
Then 'affected-vars could set we-are-loading-a-module* to t, do its work, and set it back to nil. Then it could work perfectly (assuming one doesn't do global assignment with 'assign).
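
A small Python sketch of the difference between detecting changes by comparing values and recording every assignment. The `defvar` function and the `things_changed` set are illustrative stand-ins for the ideas above, not existing Arc definitions.

```python
things_changed = set()   # hypothetical counterpart of things-changed*

def defvar(env, name, value):
    """Assign env[name] and record the name, whatever the value is."""
    things_changed.add(name)
    env[name] = value

env = {"thing": None, "counter": 0}
before = dict(env)

defvar(env, "thing", None)     # same value as before
defvar(env, "counter", 0)      # same value as before

# Comparing old and new values finds nothing to localize...
by_comparison = {k for k in env if env[k] != before[k]}
# ...while the recorded set knows both names were assigned.
print(by_comparison, sorted(things_changed))
```

The comparison comes back empty even though the module assigned both variables, which is exactly the case the recording approach is meant to catch.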

Also, btw, it doesn't reset the 'sig function signature table. Easily fixable.

With all these issues in mind, here is my evaluation of my implementation:

1. If your module just defines functions (possibly using macros to define them) and possibly hash tables, then it is flawless and requires no boilerplate.

2. If your module creates global variables with initial values that 'is could think are identical to those already created (integers, nil, strings, symbols), there's a small chance that bad things will happen (if your code and the module's code happen to initialize the same global variable to the same value). This could be fixed in a few different ways.

3. If you want to load a module and then modify and use its global variables, then, with a fix I have described above, you can do that with a little bit of boilerplate.

4. If your module does weird stuff like modifying different variables the second time it's loaded, or performing structural modifications (= (car ...) ...) on existing global variables or the universe (writing to files), this will die badly, although in non-weird cases it could be fixed with the minor boilerplate of putting stuff in an 'initialize function.

5. If you want to export macros... bad. Sorry. Note, however, that a certain class of macros can be covered by creating functions that accept thunks, and then the person using the module can write a macro on top of it. (See 'on-err as an example; imported from Scheme.)

Summary: I agree with hasenj that returning a hash-table containing all variables in a module would be nice. How to do this? Either you could build something on top of the Arc compiler, which I expect to be hard, but it would let your implementation be perfect; or you could do something like my 'load-module system. My system is easy to implement (I did it just now, though it is the result of a lot of thought plus a bit of work), and it works pretty well (it works perfectly in the first large class of cases, almost-perfectly and fixably in the second, fails in the third but is easily fixable, and it works somewhat in the fourth and fifth cases).

-----

3 points by aw 5158 days ago | link

I have a couple thoughts, not necessarily consistent with each other :)

I'm interested in approaches where putting code into a module or a namespace is something that I do to code that I want to use, instead of something that the library author is supposed to do for me. (That is, something along the lines of hasenj's load-into-a-namespace idea mentioned in another comment).

Then as a library author, you focus on making your code as clean and simple as possible, instead of guessing how I might want to use it or what kind of module I might want to encapsulate it in.

Another thought I have is that in human communication, we often use abbreviations (pronouns, nicknames, acronyms), initially describing what we're referring to; for example, I might say "My cousin Bob" and later mention that "Bob is tall" and you know that I'm still talking about my cousin Bob instead of any of the other Bobs that we might know.

So for example in arc.arc:

  (let expander 
       (fn (f var name body)
         `(let ,var (,f ,name)
            (after (do ,@body) (close ,var))))

    (mac w/infile (var name . body)
      (expander 'infile var name body))

    (mac w/outfile (var name . body)
      (expander 'outfile var name body))
    ...
"expander" is a helper function, used only by the public macros, and it's hidden to avoid bulking up the global namespace. I forget what I was working on at the time, but I was amused to notice that I actually wanted to use expander for my own w/something macro that I was writing.

I don't know which of our dwindling supply of ASCII symbols we'd sacrifice for this, but I imagine we might have a kind of abbreviation facility:

  (def (expander{with-file-expander} f var name body)
     `(let ,var (,f ,name)
        (after (do ,@body) (close ,var))))

  (mac w/infile (var name . body)
     (expander 'infile var name body))
Here the actual name of the symbol is "with-file-expander", while in the context of this load the reader will replace the abbreviation "expander" with "with-file-expander".

-----

2 points by thaddeus 5154 days ago | link

I see many technical solutions listed out, but I didn't see any answers to your 'Isn't that enough?' question.

As some of you know, the first programming language I learned was Arc, but since then I've been working with Clojure and have been really happy to have namespaces.

For starters:

* I don't need to spend brain power coming up with stupid hacked names to prevent collisions. I can use the ones that flow naturally.

* I'm not a fan of having gobs of documentation, so for me a defvar or fn name should be meaningful yet as short as possible. I can achieve this with namespaces and lose it otherwise.

* I think having the option of a few lines at the top of your files for namespace management is far less of a problem than making your code less readable.

Also, for example, using namespaces one could normalize[1] much of the code between Arc and Clojure, such that converting programs over becomes a cinch:

i.e. You could create an arc library in Clojure that overrides Clojure names:

  (ns arc.core
    (:refer-clojure :exclude [remove]))

  (defn remove [x y]
     (code))
Then you could copy/paste arc code to a program file and direct the code via namespaces to use the arc library:

  (ns myprogram.core
    (:refer-clojure :exclude [remove])
    (:use arc.core))

  (defn myfn []
    (do-stuff
      (remove list1 list2)))

Optionally you could use the library inline:

  (arc.core/remove list1 list2)
In the end name-spacing, even though some consider it boilerplate, is a valuable tool for code management and it would be nice to see it come to arc.

[1] Obviously some arc sugar will not convert over, but you could write a program in arc knowing in advance what semantics to avoid, making the conversion work, or at least lessening the impact.

[edit:] Lol, I forgot 'remove' is actually 'rem' in arc, but you get the point.

-----

2 points by akkartik 5154 days ago | link

"I see many technical solutions listed out, but I didn't see any answers to your 'Isn't that enough?' question."

:) I noticed that as well. Thanks for the writeup.

"..having the option for a few lines at the top of your files.."

Oh, I'd love the option. It's good to have the ability to rename declarations, but I don't need it all the time. The common case isn't collisions. The common case is just one definition to a name, and when I use the name I want the definition. It seems a false economy to foist verbosity on the common case.

More concretely, I want to phrase your examples as:

  (use arc.core :exclude [remove])
No need for a namespace declaration inside arc.core, or in the caller.

Or an interactive session like this:

  > (load "a")
  > (load "b")
  warning: b.foo shadows a.foo ; now a.foo isn't available
  > (rename b.foo :to b_foo) ; a.foo becomes available again
It's really interesting to see languages that started out extremely dynamic copy namespaces and turn gradually more static. The idea of inference seems to not have rippled into namespaces. PLT added modules at some point, and suddenly you can't load a module or require a non-module, and to be able to rename declarations you had to require. every. single. module you ever use within every single module.

-----

1 point by thaddeus 5154 days ago | link

In Clojure the names from the core language are automatically imported into every namespace (even the default one, 'user'). So you don't have to use namespaces; you can just do the same thing arc does -> (load-file "filename.clj").

And you can choose to incorporate name-spacing only when you need to.

That being said, once you start crafting projects with more than one file you will end up adding a bare-minimum namespace declaration, since it's actually easier than load-file:

  i.e.
    (ns myprog.core) 
  does the same as: 
    (load-file "/myprog/core.arc")
  only it creates the names in the namespace 'myprog.core'.
Interactively at the REPL you can switch between namespaces,

  user=>(ns mynamespace)
  mynamespace=>
Or you can just load all your library files into a single namespace.

Since Clojure core libraries are already loaded, my example actually had to exclude the 'remove' causing a little bit of boilerplate:

  (ns arc.core
    (:refer-clojure :exclude [remove]))
but, you can easily stack the names of interest:

  i.e. to add more items to exclude...

  (ns arc.core
    (:refer-clojure :exclude [remove find others]))
The same can be said if you want to selectively import:

  (ns myprog.core
   (:use [somelibrary.core :only (every re-sub pull)]))
As opposed to loading everything from some extra library:

  (ns myprog.core
   (:use somelibrary.core))
In my mind it's really slick, and there's a plethora of options to manage them, should you need/want to.

> 'No need for a namespace declaration inside arc.core, or in the caller'.

None, but as stated above - you typically have one since it's easier.

[edit: None, assuming you choose to use load-file, e.g. (load-file "arc/core.clj"), which loads the code into the default namespace, or wherever you ran load-file.]

I hope all that made sense.

-----

1 point by akkartik 5154 days ago | link

"Since Clojure core libraries are already loaded, my example actually had to exclude the 'remove' causing a little bit of boilerplate."

I'm not too concerned about the verbosity when you need to exclude something. What clojure does seems fine.

I didn't realize that ns is like load, and not like PLT's module. It goes in the caller, not the callee. That's cool. But once I use:

  (ns arc.core) ; provides say find
Can I use all its declarations as just find and not arc.find?

If so that's pretty much what I want :)

-----

1 point by thaddeus 5154 days ago | link

> I didn't realize that ns is like load,

I'm not sure if Clojure is doing this for me or leiningen.

https://github.com/technomancy/leiningen

As I started right off using it.

[edit: yeah, lein is doing the load for me, but you can still just use load if you like. To run as a script Clojure uses 'java -cp ....', so your files need to be on your classpath. It's been a long time since I bothered doing it that way.]

> Can I use all its declarations as just find and not arc.find?

Yup (well, find is actually taken by clojure core, so you need to exclude it if you wanted your own version, but for everything else, which I believe was the intent of your question, you're golden).

-----

2 points by bogomipz 5157 days ago | link

How about a loader that renames all global names to gensyms by default, but with the ability to preserve the ones you want?

  (load-hidden "arc.arc"
    w/infile w/outfile expander)
This basically means you have to declare every global you want to make available from the file you load. You could even specify what names to give them:

  (load-hidden "arc.arc"
    w/infile w/outfile expander.w/close)
Nested loads would be tricky though. The outer would have to hide names that are preserved by the inner, except when they are wanted of course.

Alternatively, the loader could add a prefix (the file name?) rather than rename to gensyms.
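
A loader along these lines might look like the following Python sketch. It hides names by running the source in a private dict instead of renaming to gensyms, which is a swapped-in technique; `load_hidden` and the `"old:new"` export spelling are invented for this example.

```python
def load_hidden(source, *exports):
    """Run source privately; return only the requested names.

    An export is either "name" or "name:new_name" (a made-up spelling).
    """
    private = {}
    exec(source, private)            # every global lands in the private dict
    public = {}
    for spec in exports:
        old, _, new = spec.partition(":")
        public[new or old] = private[old]
    return public

source = '''
def expander(kind):
    return "with-" + kind

def w_infile():
    return expander("infile")
'''

mod = load_hidden(source, "w_infile", "expander:w_close")
```

The exported `w_infile` keeps working because it still resolves `expander` inside the private dict, even though the caller only sees it under the new name `w_close`.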

-----

4 points by hasenj 5158 days ago | link

It would be nice if one could load some definitions into a namespace

  (load "module.arc" ns)

  ; ns is a hashtable
  ; module.arc defines 'foo'
  ; we access it with ns!foo

  (ns!foo a b c)

-----

2 points by akkartik 5158 days ago | link

Hmm, here's an idea for lightweight namespacing. Let's create a second-class function:

  (def foo(a) :inner
    ...)
Now you can suppress the redefinition warning when a def redefines an inner.

That seems to help the problem of proliferating helper functions that fallintothis brought up (http://arclanguage.org/item?id=11996). It lets you keep helpers in the namespace without them getting in the way. And if you just don't want them polluting your namespace at all, I think you suffer from programmer's OCD :)

-----

1 point by rocketnia 5158 days ago | link

If "a def redefines an inner," won't that still break whatever depended on that inner?

Maybe a def should shadow an inner? In my pursuits, I've been thinking about something vaguely like http://c2.com/cgi/wiki?HyperStaticGlobalEnvironment (which I got to from http://lambda-the-ultimate.org/node/3991), where definitions essentially replace the binding, leaving the previous binding intact for whatever closures captured it. I'd go farther than that: I'd have certain operations just replace the environment entirely by shadowing variables, etc., usually putting the old environment somewhere we can get to it again.

Also, in my mind closures in this hyper-static approach should automatically define any not-yet-defined free variable bindings they need as they're created. This approach requires the default kind of definition to be set!, so that the not-yet-defined bindings don't just get shadowed instead of defined. That means it's hardly a hyper-static philosophy at all; it's just an approach that uses hyper-static capabilities where it needs to for namespacing (and in order to access them, there's likely to be boilerplate).

Sorry, this is sort of a ramble, I know. I don't have a lot of time right now to clean it up. ^_^;

-----

1 point by akkartik 5158 days ago | link

"If "a def redefines an inner," won't that still break whatever depended on that inner?"

Eep, you're right. I think I want something like this:

  > (= innerdef mac)
  > (innerdef foo() 3)
  > (def bar() (foo))
  > (bar)
  3
  > (def foo() 2)
  > (bar)
  3
Would that always work? It wouldn't let bar be defined before foo, which I still kinda care about. (http://arclanguage.org/item?id=12668) That's the problem with the hyperstatic idea as well (thanks for the link, btw.)
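
The behaviour in that session can be imitated with a short Python sketch, under the assumption that each definition is built against a snapshot of the bindings that exist at that moment. The `define` helper and the dict-based `env` are illustrative only.

```python
env = {}

def define(name, make):
    """Build the new value against the bindings that exist right now."""
    snapshot = dict(env)     # later definitions cannot reach into this copy
    env[name] = make(snapshot)

define("foo", lambda e: lambda: 3)
define("bar", lambda e: lambda: e["foo"]())   # bar captures today's foo
before = env["bar"]()
define("foo", lambda e: lambda: 2)            # shadows foo for later code only
after = env["bar"]()
```

Both calls to `bar` use the original `foo`, while new code sees the new one. The same snapshot is also why `bar` could not be defined before `foo`: its copy of the bindings would never contain `foo`.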

Or perhaps we should just have an explicit undef. Heh, I want to rename def back to defun just so undef is a rotation of defun.

-----