It looks like that's thanks to a March 8, 2012 commit by akkartik (https://github.com/arclanguage/anarki/commit/547d8966de76320...)... which, lol... Everything I was saying in a couple of recent threads about replacing the Arc reader to read mutable tables... I guess that's already in place. :)
"The best part is that the server should be able to reboot without losing the closures."
Might want to remember to wipe them out if you make certain changes to the code, so that you don't have to think about the effects of running old code and new code in the same system.
(Edit: Oops, I replied out of order and didn't read shawn's comment with the elisp examples before writing this.)
I suspect what pg means by it is primarily that it's tricky to do in Racket (though I'm not sure if it'd be because there are too few options or too many).
Essentially, I think it's easy to display a closure by displaying its source code and all the captured values of its source code's free variables. (Note that this might be cyclic since functions are often recursive.)
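To make that concrete, here's a toy sketch in Arc (hypothetical; Arc doesn't actually represent functions this way) where a "closure" is plain data carrying its source and its captured bindings, so printing it is just printing data:

  (def make-adder (n)
    (list 'closure '(fn (x) (+ x n)) (list (list 'n n))))

  arc> (make-adder 42)
  (closure (fn (x) (+ x n)) ((n 42)))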
But there is something tricky about it, which is: What language is the source code in? In my opinion, a language design is at its most customizable when even its built-in syntaxes are indistinguishable from user-defined ones. So the displayed format of a function would ideally involve some particular set of user-definable syntaxes. Since a language designer can't anticipate what user-defined syntaxes will exist, clearly this decision should ultimately be up to a user. But what mechanism does a user use to express this decision?
As a baseline, there's at least one straightforward choice the user can make: The one that expresses the code in terms of Arc special forms (`fn`, `assign`, `if`, `quote`, etc.) and Arc functions that aren't implemented in Arc (`car`, `apply`, etc.). In a language that isn't aiming for flawless customizability, this is enough.
Now suppose we try to generalize this so a user can choose any set of syntaxes to express things in terms of -- say, "I want the usual language, but with `=` treated as an additional built-in." If the code contains `(= x (rfn ...))`, then the macroexpander at some point needs to expand the `rfn` without expanding the `=`. That's not really viable in terms of Arc's macroexpander since we don't even know `(rfn ...)` is an expression in this context until we process `=`. So this isn't quite the right generalization; the right generalization is something trickier.
I suppose we could have every function printed in terms of its pre-macroexpansion source code, along with all the macros and other macroexpansion-time state it happened to rely on when it was first macroexpanded. That would narrow the problem down to how to print the built-in functions. And we could solve that problem in the language design by making it so nothing ever captures a built-in function as a first-class value, only as a late-bound reference to the global variable it's accessible from.
Or we could have the user specify their own macroexpander and make it so that whenever a function is printed, if the current macroexpander hasn't expanded that function yet, it does so now (just to determine how the function is printed, not how it behaves). This would let the user specify, for instance, that `assign` expands into `=` and `=` expands into itself, rather than the other way around.
These ideas are incomplete, and I think making them complete would be pretty tricky.
In Cene, I have a different take on this: If a function is printable (and not all are), then it's printable because it's a callable struct with a tag the printer knows about. It would be printed as a struct. The function implementation wouldn't be printed. (The user could look up the source code information based on the struct tag, but that's usually not printable.) There may be some exceptions at the REPL where information is printed that usually isn't available, because the REPL is essentially a debugging context and the debugger sees all. (Racket's struct inspectors express a similar debugger-sees-all principle, but I haven't seen the REPL take advantage of it.)
"[Racket] has every feature you could want. And yet it is very difficult to do even the simplest things. For example, when an error is raised in Anarki, you'll see a stack trace that points to ac.rkt rather than the actual location within the arc file that caused the error."
Is that really "the simplest things"? :-p It seems to me Arc goes out of its way to avoid interleaving source location information on s-expressions the way Racket does. Putting it back, without substantially changing the way Arc macros are written, seems to me like it would be pretty complicated, and that complication would exist whether Arc was implemented in Racket or not. (I think aw's approach is promising here.)
---
It's fun to see that Lumen is designed for compiling to other languages even though it's self-hosted. That's always how I figured Arc, or pretty much any of my languages, would have been written once they were self-hosted.
I've been trying to write Racket libraries that (among other benefits) let me implement Cene in Racket the way I'd like to implement Cene in Cene, which should make it an easier process to port it into a self-hosting Cene implementation. But I certainly don't have it working yet, the way Lumen clearly is. :)
"There's one thing you can't do with functions that you can do with data types like symbols and strings: you can't print them out in a way that could be read back in. The reason is that the function could be a closure; displaying closures is a tricky problem."
I've sometimes wondered just what the connotation of 'tricky' is here. Is it hard in some theoretic sense, or just "Arc is a prototype so we don't have this yet"?
You may have noticed one of elisp's interesting features: closures are printed out readably (when lexical-binding is enabled).
> (defun adder (n)
    (lambda (x)
      (+ x n)))
adder
> (symbol-function 'adder)
(closure (t) (n) (function (lambda (x) (+ x n))))
> (adder 42)
(closure ((n . 42) t) (x) (+ x n))
That means you can theoretically write and read closures to/from disk.
Anyone remember the "Unknown or expired link" errors in the old days of Hacker News? Arc generates closures on the fly, then stores them on the server keyed by a random ID. It sends this random ID down to the browser as links: https://www.laarc.io/x?fnid=ls1wNQuImEEYyvWtGKs1Sj. When the user visits the link, arc looks up the closure and calls it. This effectively allows Arc code to pause computation and then resume later. It's kind of like traditional node-style fs.readFile callbacks, but way more interesting because Arc uses macros to make it feel very natural to write server-side code in this style. You don't feel like you're writing nested callbacks. I've never caught myself wishing for async/await, for example.
Now, the drawback of this technique is that it starts to consume a lot of memory. The closures don't need to be stored forever, but they do need to be stored for a reasonable amount of time. Arc solves this by "harvesting" the closures, meaning it walks through the global closures table and casts out the old ones. Like a startup. (Hmm.)
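A minimal sketch of the idea (the real machinery lives in srv.arc; names like fns* and the exact bookkeeping here are from memory, so treat them as approximate):

  (= fns* (table))  ; fnid -> (closure . creation time)

  (def fnid (f)
    (let id (sym (rand-string 10))     ; random key sent to the browser
      (= (fns* id) (cons f (seconds)))
      id))

  (def call-fnid (id . args)
    (aif (fns* id)
         (apply (car it) args)
         (pr "Unknown or expired link.")))

  ; "Harvesting": cast out closures older than some threshold.
  (def harvest-fnids ((o max-age 3600))
    (each k (keys fns*)
      (when (> (- (seconds) (cdr (fns* k))) max-age)
        (wipe (fns* k)))))  ; setting a table entry to nil removes it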
The other drawback is that a server reboot will wipe all the closures. That's why you'd sometimes get "Unknown or expired link" in the middle of the day on HN. The memory usage got pretty extreme, and a periodic reboot was a quick fix (that no doubt made pg wince each time he ^C'd the server).
So, emacs lisp is very interesting here, because it illustrates that it ought to be possible to store Arc's dynamic closures on disk rather than in memory. That would solve the problem completely: you can generate as many functions as you want, and you don't need to worry about a thing till you blow through >10GB of disk space.
The best part is that the server should be able to reboot without losing the closures.
All the functions in news.arc operate either on objects in memory, like users, or on ephemeral objects, like requests. (The arc server creates a new `request` instance for each incoming connection, and it stores things like the user's cookies and the query arguments.) Both of these can already be serialized straight to disk. And the dynamic closures are almost always generated in the middle of building an HTML page -- i.e. you want to describe "build a form; here's what to do when the user submits it". That latter half is a function, which becomes a dynamic closure keyed by fnid. The form is sent straight down to the user's browser. When the form is submitted, the server picks up right where it left off. That means you can interweave your "view" code and your "model" code, in the MVC sense. And there's no need for a controller; the controller is the closure, which knows what to do thanks to lexical context.
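To give a feel for the style, a handler in that vein looks roughly like this (going from memory of app.arc/html.arc, so treat aform and single-input as approximate):

  (defop greet req
    (aform (fn (req)                        ; this fn becomes a closure keyed by fnid
             (prn "Hello, " (arg req "name") "!"))
      (single-input "Name: " 'name 20 "greet")))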
HN eventually solved the fnid problem by getting rid of them. You'll rarely see any dynamic closures on the site. My hypothesis is (a) it's faster to write features using the fnid technique, and (b) the fns can be serialized to disk.
(a) seems true, and (b) is worth exploring. As https://www.laarc.io/ gains momentum, we're hoping to preserve this technique. If it works out, users shouldn't notice any "Unknown or expired link" messages – the closures will be on disk, and they'll last a long time, to put it mildly.
It would be important to ensure that circular structures don't cause an infinite loop, and I'd be nervous about straying too far from Racket's `write` facility. For better or worse, it's a limitation of Racket that you can't `read` a table you've written. But it could be worth doing.
That's really cool, thanks for making the case for Lumen. I now have it on my todo list to determine how much the standard library of Lumen matches the names Arc uses.
On a slight tangent, ycombinator.lol is not affiliated with Y Combinator, and https://github.com/lumen-language is not affiliated with Scott Bell the creator of Lumen. I clarify this because it took me a while to figure out. Am I characterizing it right, Shawn?
Wow. Freaky fast. Thanks!
I was thinking of going even further: could Arc output tables and read them back in in that structure?
{todo:({id 1 name "get eggs" done nil} {id 8 name "fix computer" done t})}
looks so much better than the #hash() equivalent, and the difference gets extreme with nested tables. It's also much easier to think through a table structure when writing it out by hand.
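The printing half seems straightforward, at least. Here's a hypothetical sketch (print-braces isn't an existing function, and a reader for the notation would be the harder part):

  (def print-braces (x)
    (if (isa x 'table)
         (do (pr "{")
             (maptable (fn (k v)
                         (pr k " ")
                         (print-braces v)
                         (pr " "))
                       x)
             (pr "}"))
        (acons x)
         (do (pr "(")
             (each y x (print-braces y) (pr " "))
             (pr ")"))
         (write x)))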
No. I feel slightly bad for being so blunt, but Racket's language facilities have absorbed more of my mental time and energy than any other aspect of Racket.
Let's put it this way. Racket is an excellent choice for implementing languages. It has every feature you could want. And yet it is very difficult to do even the simplest things. For example, when an error is raised in Anarki, you'll see a stack trace that points to ac.rkt rather than the actual location within the arc file that caused the error. This is because, frankly, ... Well, let's just say I've redacted some expletives here. But those expletives were aimed at the fact that it's really quite difficult to translate the vision in your mind into an implementation using Racket.
Now, that being said, if you sit down with Racket and take the time to learn it, you will find that it's one of the most robust, flexible, and performant language runtimes in existence. The thread-local variable support is a killer feature. The custodian support is rock solid, which makes sandboxing trivial and super reliable. And it's the premier implementation of Scheme, so it might continue to have a thriving community for decades to come. But it took me years to become very effective with Racket. (This says more about my own shortcomings than about Racket, though!)
It's a Lisp that runs in both JS and Lua. It's self-hosted, meaning all .js and .lua code is generated by Lumen itself. Contrast this with the fact that ac.scm is written in Scheme rather than Arc. Very mysterious! I remember how electrifying it felt that first day I stumbled across it and realized the significance of what I was looking at.
It's one of the most incredible projects I've ever seen. It's so small and clear, yet does so much – a perfect example of why simple systems run circles around competitors.
There is a branch that adds Python support at https://github.com/shawwn/lumen/tree/features/python which speaks to Lumen's flexibility and power. (I wonder if there is another Lisp that you can use natively from three different languages without any FFI. And if there is, I doubt it's <3000 lines.)
At this point I use Lumen to explore new languages. E.g. I learned R by implementing a Lumen to R compilation target: https://github.com/sctb/lumen/pull/193
You can also do some things with Lumen that I'm not sure you can do with any other Lisp. I hesitate to claim that, but... Well. Wanna see a magic trick?
https://news.ycombinator.com/item?id=17958650
I think so. It definitely had the same style as https://news.ycombinator.com/login in the sense that it was a plain html page consisting solely of textarea inputs. And YC launched in 2005, which was long after pg began work on Arc in 2001. http://paulgraham.com/arcll1.html
I once replicated the YC application by writing it in Arc (similarly to how we extended Hacker News at http://laarc.io/), and the resulting app looked mostly identical in style without much tweaking.
The first question I'd ask is, what kind of language do you want to create?
That makes it a lot easier to answer a question like "Is X a good choice for Y?" It depends on Y! :-)
Then, along with asking here (which is fine), you might also want to ask the Racket folks. Go to https://racket-lang.org/ and scroll down to "Community". Then you can ask, "I'd like to create a language like Y, would Racket be a good choice? If so, how would I go about it?"
Would anyone recommend Racket for a tow-headed newbie who wants to learn how to implement their own language after running Arc for a bit? Language development is something I've wanted to do for a while, but I have no background in type theory, compilers, or anything of that sort.
"Isn't there a dialect out there that uses a different bracket to mean 'close all open parens'? So that the example above would become (foo (a b c (d e f]. I can't quite place the memory."
Oh, apparently it's Interlisp! They call the ] a super-parenthesis.
"The INTERLISP read program treats square brackets as 'super-parentheses': a right square bracket automatically supplies enough right parentheses to match back to the last left square bracket (in the expression being read), or if none has appeared, to match the first left parenthesis, e.g., (A (B (C] = (A (B (C))), (A [B (C (D] E) = (A (B (C (D))) E)."
Here's a document that goes over a variety of different notations (although their claim that "there is no opening super-parenthesis in Lisp" seems to be inaccurate, considering the above):
They favor this approach, which is also the one that best matches the way I intend for Parendown to work:
"Krauwer and des Tombe (1981) proposed _condensed labelled bracketing_ that can be defined as follows. Special brackets (here we use angle brackets) mark those initial and final branches that allow an omission of a bracket on one side in their realized markup. The omission is possible on the side where a normal bracket (square bracket) indicates, as a side-effect, the boundary of the phrase covered by the branch. For example, bracketing "[[A B] [C [D]]]" can be replaced with "[A B〉 〈C 〈D]" using this approach."
That approach includes what I would call a weak closing paren, 〉, but I've consciously left this out of Parendown. It isn't nearly as useful in a Lispy language (where lists usually begin with operator symbols, not lists), and the easiest way to add it in a left-to-right reader macro system like Racket's would be to replace the existing open paren syntax so that it anticipates and processes these weak closing parens, rather than non-invasively extending Racket's syntax with one more macro.
If you have a small number of key/value pairs and care about order then alists are both efficient and practical, but otherwise I would not give them too much weight.
So an example I can give is found in html.arc. The start-tag function uses an alist for generating tag attributes.
Notice that (pair (cdr spec)) is an alist. Now if you wanted to extend start-tag with conditional operations on the tag spec, you could bind (pair (cdr spec)) to, say, a variable 'attrs' and use Arc's built-in functions to inspect and modify it as you see fit. As you can see, a table is probably not needed when there's only a half-dozen items at most; plus, with a table you would lose the order. A flat list, on the other hand, would prevent you from pairing keys with values to do anything meaningful, like an inspection or a conditional modification.
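For instance (an illustration, not the actual html.arc source), a tag spec pairs up like this:

  arc> (= attrs (pair (cdr '(a href "http://example.com" class "foo"))))
  ((href "http://example.com") (class "foo"))
  arc> (alref attrs 'class)
  "foo"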
It's worth noting that a serious downside of alists is that there's no real means of detecting them as a data type, because an alist doesn't have one (unlike a table). You could inspect the first item and attempt to detect the type that way, but the soundness/efficiency of that logic breaks down pretty quickly in all but the simplest cases.
So, again: with a small number of pairs, where you are confident in the shape of the data and have total control of the data usage (i.e. where and how it gets passed around), alists are good; otherwise you would need a pretty good or nuanced reason to use them.
Some of pg's other ideas [1] are: "having several that share the same tail, and preserve old values".
i.e. you could have some function that accumulates pairs (where some or many have the same key but a different value). This is for when you don't want the obvious behaviour a table provides (where the last one added wins), and/or you need the previous entries. Note that you can save an alist to a file and reload it easily while still having access to the history of values for a given key, so you're able to reconstruct your operations from historical data. You just can't do that with a table.
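A tiny illustration of that shape (hypothetical session, just to show the idea):

  arc> (= history* nil)
  nil
  arc> (push (list 'x 1) history*)
  ((x 1))
  arc> (push (list 'x 2) history*)
  ((x 2) (x 1))
  arc> (alref history* 'x)
  2
  arc> (map cadr (keep [is (car _) 'x] history*))
  (2 1)

Lookup sees the latest value, but every old value is still there.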
Just to recap my opinion that I gave you over chat: the advantage alists have is that they're simple to support. Along pretty much any other axis, particularly if they grew too long, you'd be better off using some other data structure. Which one would depend on the situation.
> You've got me curious now about how this relates to Amacx :)
Why, everything! :-) E.g. I start with: what if top level variables were implemented by an Arc table, and top level variable references were implemented by an Arc macro? That is, what if top level variables were built out of lower level language axioms, instead of being built in?
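Here's a guess at the smallest version of that idea (my own sketch, not Amacx's actual code):

  (= container* (table))     ; a "container" is just a table

  (mac global (name)         ; hypothetical: a top level reference is a macro call
    `(container* ',name))

  (= (container* 'foo) 42)   ; "define" a top level variable
  (global foo)               ; a reference expands to (container* 'foo) => 42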
We end up with something that's kind of like modules, but doesn't do everything that we'd typically expect modules to do (though perhaps we could implement modules on top of them if we wanted to), and also does some things that modules don't do (for example, we can load code into a different environment where the language primitives act differently).
To give a name to this thing that is kind of like modules but different, I called them "containers", because they're something you load code into.
Are containers useful? Well, I'm guessing it would depend on whether we'd ever want to load code into different environments in our program. If we only want to load code once, and all we want is a module system, I imagine it'd probably be more straightforward to just implement a module system directly.
On the other hand, suppose we have a runtime that gives us some nifty features, but is slower than plain Arc. Now it seems like containers could turn out to be a useful idea. Perhaps I have some code that I want to load in plain Arc where it'll run fast, and other code that I want to run in the enhanced runtime and I don't mind that it's slower.
> I find that having tests allows me to start out in a sort of engineering mindset, in your terms, where I just get individual cases working one by one. But at the same time they keep me from growing too attached to a single implementation and leave me loose to try to think up more axiomatic generalizations over time.
Exactly!
This is the classic test driven development refactoring cycle: add features with tests, then refactor e.g. to remove duplicate code, and/or otherwise refactor to make the code more axiomatic.
Since "Any sufficiently complicated C or Fortran program contains an ad-hoc, informally-specified, bug-ridden, slow implementation of half of Common Lisp", one could, in theory, start with such a C or Fortran program and refactor towards an axiomatic approach until you had reinvented Lisp, with the program written in Lisp :-)
But in practice I think going the other way is sometimes necessary: that is, starting with some axioms, and seeing what can be implemented out of them.
I'm not sure why that is (why doesn't anyone keep refactoring a large C program and end up with Lisp?) but I suppose it might be because it's too cognitively difficult, or you end up at some kind of local maximum in your design, or something.
In any case, I find tests absolutely essential for working on Amacx... not just "nice to have" or "saves me a lot of time", but "impossible to do without them"!
You've got me curious now about how this relates to Amacx :)
I find that having tests allows me to start out in a sort of engineering mindset, in your terms, where I just get individual cases working one by one. But at the same time they keep me from growing too attached to a single implementation and leave me loose to try to think up more axiomatic generalizations over time. You can kinda see that in http://akkartik.name/post/list-comprehensions-in-anarki if you squint a little; I don't show my various working copies there, but the case analysis I describe does faithfully show how I started out thinking about the problem, before the clean general solution suddenly fell out in a flash of insight.
(Having tests doesn't preclude more systematic thinking about the space, and proving to myself that a program is correct. But even if I've proved a program correct to myself I still want to retain the tests. My proofs have been shown up too often in the past ^_^)
It's interesting to be taking an axiomatic approach. That is, in this case, to add to the language axioms that expressions can be labeled with their source file locations.
It might not work: it might turn out that the feature I want (to be able to track source locations through macro expansions) can't be expressed in terms of this particular set of axioms. Or, it might be that it can, but the result is a runtime too slow for me to want to use it.
But, if it does work, it has its own internal logic. What does (cdr x) mean when x has been labeled with source locations? Well, clearly, what it ought to mean is the tail of x, labeled with the source locations of the tail of x. Theorems such as (apply (fn args args) xs) ≡ xs should continue to work.
On the other end of the spectrum from an axiomatic approach is engineering. Have a list of features you want, and design a system that implements all of them. This too might fail sometimes (perhaps the features you want turn out to be incompatible, or you design yourself into a corner that's hard to get out of)... but most of the time it's more reliable, in the sense that usually we can come up with some design that implements all (or at least most!) of the features we want... even if maybe the result isn't very pretty.
The downside of engineering is design complexity. Complexity will probably at least scale linearly with the number of features, if not more likely by some power law. If we're lucky we may see some simplifications in the design along the way that we can refactor into, some axioms of the design that become apparent that we can incorporate... but most of the time, in practice, the design gets more and more complex as we add features.
Engineering is attractive because it gets things done. "I just want X, let's implement X". There are a lot of times when what I want is just to implement X, and I engineer a design, and it works out fine.
The axiomatic approach is more uncertain. Will it work? I don't know. It's also harder. Oops, ssyntax stopped working. Why? `some` stopped working. Why? `recstring` stopped working. Why? `+` stopped working. Why? Is it because my implementation of `apply` is broken, or because I broke the compiler and it's now outputting broken code, or because my runtime is broken? It could be any of these. Another day, another week of debugging.
It's also more fun. There are many macro systems. Many of them are practical. Some have features I don't care about, some are more complicated than I like, but I don't have much interest myself in engineering yet another macro system. Axioms are more interesting. Perhaps it will turn out that for this particular set of axioms, it doesn't work out for this particular feature. But then at least I know why :-)
No, you can interleave bindings and body expressions however you like, but the catch is that you can't use destructuring bindings since they look like expressions. It works like this:
  (lets) -> nil
  (lets a) -> a
  (lets a b . rest) ->
    If `a` is ssyntax or a non-symbol, we treat it as an expression:
      (do a (lets b . rest))
    Otherwise, we treat it as a variable to bind:
      (let a b (lets . rest))
The choice is almost forced in each case. It almost never makes sense to use an ssyntax symbol in a variable binding, and it almost never makes sense to discard the result of an expression that's just a variable name.
The implementation is in Lathe's arc/utils.arc.
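For reference, here's a sketch matching the spec above (my own reconstruction, not Lathe's actual code):

  (mac lets args
    (if (no args)
         nil                 ; (lets) -> nil
        (no (cdr args))
         (car args)          ; (lets a) -> a
        ; ssyntax or a non-symbol: treat it as an expression
        (or (ssyntax (car args)) (no (isa (car args) 'sym)))
         `(do ,(car args) (lets ,@(cdr args)))
        ; otherwise: treat it as a variable to bind
         `(let ,(car args) ,(cadr args) (lets ,@(cddr args)))))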