Http + web.arc: an alternative combo for web development
16 points by palsecam 5428 days ago | 7 comments
Thaddeus was asking for advice on eventually running Arc behind nginx. I believe doing so is an interesting choice.

In a way, I see a lightweight reverse-proxy as part of my (web server) OS: it does a number of low-level common tasks so that each app server doesn't have to (gzipping, logging, etc.).

And nginx is a very good reverse-proxy. I don't like Apache very much, for it's half a heavy web framework, half just a proxy. It does too much, it eats too much RAM for what I expect it to do, etc. Things like mod_perl are a plague. Nginx is better designed IMO.

But aw pointed out that because of srv.arc's poor respect for RFC 2616, running srv.arc behind a proxy is quite hard. This is bad, because (1) I have to serve several websites/domains on the same computer but I have only one port 80, (2) nginx does some things better and faster than Arc will ever do, and an app server should not have to reinvent the uninteresting parts of the wheel (logging, handling slow clients, serving static files, etc.), (3) it kills the "clients and servers don't (need to) know who is really at the other end of the wire" REST principle.

Because of this, and because I think srv.arc and app.arc suffer from too many other problems anyway (bad layering, makes too many choices for me, fucks RESTful paths, code too bloated and too specific to news.arc, etc.), I decided to write a better web dev combo for my own uses.

I use it to power dabuttonfactory.com and a couple of internal apps, and so far it has proved robust and useful to me.

http.arc is about parsing HTTP messages, and building generic "low-level" http servers and clients. It is available here: http://pastebin.com/jiXSX8yV (124 LOC). I consider the need to patch it a bug, for it should just be a choice-agnostic, blind implementation of HTTP. It is not, however, a strict implementation of RFC 2616. It can run behind a proxy or standalone.

web.arc is a web (site|app) toolkit to use on top of http.arc. It is available here: http://pastebin.com/9GmhRWqc (96 LOC). It used to be feature-full (sessions & user logins & more) but the version linked here is more lightweight. I removed a lot of stuff, because I've not yet found a good enough, generic enough solution that suits me. So I chose to extract its core into a toolbox instead of making it a limit(ed|ing) framework. Adding user sessions is quite easy by reusing the stuff defined in app.arc or whatever, if you need it.

Using this combo may require a bit more effort than using srv/app.arc (see below), but you may not have to remove stuff that is just useless to you (like ken removing /whoami and co), nor to patch it to get over the limited 'defop feature. You may have to extend it, but not throw away parts of it.

  $ arc http.arc web.arc -
   arc> (defpath / (req) (prn "Hello, World"))
   #<procedure>
   arc> (defpath /: (req id) (prn "Hello " req!ip ", you requesting post # " id))
   (("/:" ((":") #<procedure>)) ("/x/:" (("x" ":") #<procedure>)))
   arc> (defpath /user/: (req usr) 
          (htmlpage ((tag title (pr "User infos")) 
                     (icss "body { text-align: center; }"))
            (prn "User infos for user " usr)))
   (("/user/:" (("user" ":") #<procedure>)) ("/:" ((":") #<procedure>)) ("/x/:" (("x" ":") #<procedure>)))
   arc> (= httpd-handler dispatch)
   #<procedure: dispatch>
   arc> (start-httpd)
   httpd: serving on port 8080
   #<thread: start-httpd>

  $ curl http://localhost:8080
  Hello, World

  $ curl http://localhost:8080/42
  Hello 127.0.0.1, you requesting post # 42

  $ curl http://localhost:8080/user/pal
  <!doctype html><head><meta http-equiv="content-type" content="text/html; charset=utf-8"><title>User infos</title><style>body { text-align: center; }</style></head>User infos for user pal
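
For readers who want to see the idea behind the route table printed above (entries like ("/user/:" (("user" ":") #<procedure>))), here is a minimal sketch in Python rather than Arc. It is not the actual web.arc code: the names Router, register and dispatch are made up for illustration, and the handlers here take only the captured segments, not a request object. It just shows one plausible way to match a path segment by segment, with ":" capturing whatever is at that position.

```python
def split_path(path):
    """Split a URL path into its non-empty segments ("/user/pal" -> ["user", "pal"])."""
    return [seg for seg in path.split("/") if seg]


class Router:
    """Hypothetical route table: a list of (pattern segments, handler) pairs."""

    def __init__(self):
        self.routes = []

    def register(self, pattern, handler):
        self.routes.append((split_path(pattern), handler))

    def dispatch(self, path):
        """Call the first handler whose pattern matches; return None if nothing matches."""
        segs = split_path(path)
        for pattern, handler in self.routes:
            if len(pattern) != len(segs):
                continue
            args = []
            for want, got in zip(pattern, segs):
                if want == ":":
                    args.append(got)   # wildcard: capture this segment
                elif want != got:
                    break              # literal segment mismatch
            else:
                return handler(*args)  # every segment matched
        return None


if __name__ == "__main__":
    router = Router()
    router.register("/", lambda: "Hello, World")
    router.register("/user/:", lambda usr: "User infos for user " + usr)
    print(router.dispatch("/user/pal"))
```

A flat list scanned in order is enough for a handful of paths; a real dispatcher would also have to decide what happens when two patterns can match the same path.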

Inspiration: CGI.pm, Hunchentoot, web.py, in short the "anti-framework frameworks". Real frameworks are attractive at first ("OMG so easy"), but when you need more power/control, you realize you are screwed. Frameworks are for sissies. And srv/app.arc is a framework.


5 points by palsecam 5428 days ago | link

Here is a "plugin" example to handle the kind of "web best practises" I respect, those that make it faster for my customers.

  ;;; web-static.arc: module to deal with static files (CSS/JS)
  ; designed to leverage a reverse proxy to serve the files when in production

  ; currently specific to my own needs, will certainly always be.
  ; main requirements are: eternal caching when possible, CSS/JS 
  ; minification, ability to correctly handle external files we can't
  ; easily control (think images url in CSS), ** minimal overhead **
  ; (i.e: code as fucking simple as fucking possible)

  ; CSS/JS minification: we use the YUI compressor and put the
  ; files in minified-dir*.  nginx is told to look first
  ; in this directory and fallback to static-dir* if not found 
  ; (for the files "out of control")
  ;
  ; eternal caching: we set a query string "?<mtime_of_file>", and told
  ; nginx to inform the client to cache this URL for 1 year
  ; (1 year is max allowed by RFC and anyway sufficient)
  ;
  ; (wipe testing*) to activate minification and query string.
  ; doesn't matter if you do this while not being actually behind nginx,
  ; nothing will break, http.arc is still serving the static files correctly

  (= static-dir*          "res/static/"
     minified-dir*        "res/minified/"
     static-path*         "/static/"     ; URL (not filesystem) path
     code-compress-prog*  "yuicompress"  ; sh wrapper around yuicompressor.jar
     testing*             t)

  (def sendfile (fname (o mt (mimetype fname)))
    (resphead http-ok+ (copy httpd-hds* 'Content-Type mt))  
    (prfile fname))

  (register-path (string "/" static-path* "/*")  ; never reached in production
     (fn (req file)
       (aif (file-exists (string static-dir* "/" file))
            (sendfile it)
            (resp-err))))

  (def static-url (fname) 
    (string static-path* fname
      (when no.testing*
        (+ "?" (mtime (compress-ifstale (+ static-dir* "/" fname)
	      	    		        (+ minified-dir* "/" fname)))))))


  (defs csss (fname)  (css:static-url fname)
        jss  (fname)  (js:static-url fname))


  (def compress-codefile (fsrc fdest)
    (ensure-dir:dirname fdest)
    (system (+ code-compress-prog* " " fsrc ">" fdest)))

  (def compress-ifstale (fsrc fdest)
    (when (and (in (file-ext fsrc) 'js 'css)
    	       (or (~file-exists fdest) (> (mtime fsrc) (mtime fdest))))
      (compress-codefile fsrc fdest))
    fdest)

  (defmemo compress-code (str (o type 'js))
    (w/tmpname tmpf
      (w/outfile s tmpf (disp str s))
      (out-from code-compress-prog* " --type " type " < " tmpf)))

  (with (_ijs ijs  _icss icss)  ; redef web.arc ones
    (defs ijs  (str)  (_ijs (if testing* str (compress-code str)))
    	  icss (str)  (_icss (if testing* str (compress-code str 'css))))
  )


  ;; todo: 
  ; * X-Accel-Redir in 'sendfile if behind nginx.
  ;   heuristic: look if X-Real-IP present.  or make the proxy pass
  ;   a header with its name to be more correct (X-Forwarded-By)
  ;
  ; * img-compress-prog* (`optipng')?
  ;
  ; * a clean way to do the call to `yuicompress' asynchronously
  ;
  ; * gzip here to not have nginx do it on-the-fly each time?  not sure
  ;   if the gain is that valuable
  ;
  ; * 'compress-code[...] bad names?
  ;
  ; * use GG Closure compiler and not YUI, use its REST API, and therefore
  ; be obliged to make it asynchronous
  ; 
  ; * hash instead of mtime maybe.
  ; 
  ; * like for web.arc, '=once macro or init procedure so that one can use a !=
  ; path without having to change the file.

'mtime, 'file-ext and 'mimetype are defined somewhere else. 'mtime just calls the `stat' program via 'system. I don't have access to the file they're defined in right now (they are in a "files.arc" file) but I'll post it next week.
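
To make the "eternal caching" trick from the comments above concrete, here is a rough Python sketch (not Arc, and not a port of web-static.arc). The function names and parameters are invented for illustration. One deliberate simplification: it uses the mtime of the source file, whereas 'static-url above uses the mtime of the minified copy returned by 'compress-ifstale.

```python
import os


def is_stale(src, dest):
    """True if dest is missing or older than src (same test as 'compress-ifstale)."""
    return not os.path.exists(dest) or os.path.getmtime(src) > os.path.getmtime(dest)


def static_url(fname, static_dir, static_path="/static/", testing=True):
    """Build the URL for a static file.

    Outside of testing, append "?<mtime>" so the URL changes whenever the
    file changes, which lets the proxy tell clients to cache it for a year.
    """
    url = static_path + fname
    if testing:
        return url
    mtime = int(os.path.getmtime(os.path.join(static_dir, fname)))
    return f"{url}?{mtime}"
```

Since the query string changes as soon as the file is modified, clients fetch the new version immediately even though the old URL was cached for a year.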

Nginx config sample to use this with:

  root	/home/<user>/res/;

  rewrite "/static/(.*)" "/minified/$1" break;

  location /minified/ {
     internal;

     if (!-f $request_filename) {
        rewrite "/minified/(.*)" "/static/$1" break;
     }

     if ($query_string) {
     	expires	+1y;
     }	   
  }

  location /static/ {
     internal;
  }
----

Obviously, using a reverse proxy makes the need for 'setuid irrelevant (it is such a low-level syscall anyway; even plain old unix daemons should use daemontools and not do this by themselves). Nginx can be made to handle keep-alive and gzip, which are huge perf wins. Not serving the static files from the app server is so obvious, even news.ycombinator.com does this now.

The manual "wait 30 seconds, then kill the 'slow' client" handling of srv.arc is a crappy solution (but the crappy threading model asks for it): sometimes my wifi connection is so slow that I couldn't finish a POST to this forum (yes, it happened for real, I had to retry each POST several times). A reverse-proxy, by buffering and handling slow clients the right way (i.e. not killing them brutally: if they don't write for some time, it's OK, it's just an idle fd in the select() poll), removes this problem.
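
As an illustration of what "buffering" means here, a small Python sketch (not Arc, and not how nginx is actually implemented): bytes are accumulated per connection, however slowly they arrive, and the request is only handed on once it is complete. The class name RequestBuffer is hypothetical, and the sketch ignores chunked transfer encoding and pipelined requests.

```python
class RequestBuffer:
    """Accumulate bytes from one client until a whole HTTP request has arrived."""

    def __init__(self):
        self.data = b""

    def feed(self, chunk):
        """Add bytes from the client; return the full request once complete, else None."""
        self.data += chunk
        head, sep, body = self.data.partition(b"\r\n\r\n")
        if not sep:
            return None  # headers not finished yet, keep waiting
        length = 0
        for line in head.split(b"\r\n")[1:]:  # skip the request line
            name, _, value = line.partition(b":")
            if name.strip().lower() == b"content-length":
                length = int(value.strip())
        if len(body) < length:
            return None  # body still incomplete, keep waiting
        return self.data
```

The app server behind such a buffer only ever sees complete requests, so it has no need for a "kill after 30 seconds" timer of its own.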

----

Old version of web.arc (then called wf.arc) that does session and login handling: http://pastebin.com/3amqH2h8

I'll try to post an example of a login procedure as I do it with this combo next week, but like for "files.arc", I don't have access to it right now.

----

Clickable links: http.arc: http://pastebin.com/jiXSX8yV , web.arc: http://pastebin.com/9GmhRWqc

----

A basic nginx config file for proxying to an http/web.arc powered app server (add the previous sample to the server block if you use web-static.arc):

  server {
    listen           example.com:80;

    location / {
      access_log  /var/log/nginx/examplecom-access.log;
      proxy_pass  http://localhost:8080;
      proxy_set_header        X-Real-IP  $remote_addr;
      proxy_pass_header       Server;
    }
  }
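
With the proxy_set_header line above, the app server sees every connection as coming from the proxy and has to read the real client address from the X-Real-IP header. A minimal Python sketch of that lookup (not Arc; the function name, the lowercase header keys and the trusted_proxies parameter are assumptions for illustration, not part of http.arc):

```python
def client_ip(headers, peer_addr, trusted_proxies=("127.0.0.1",)):
    """Return the real client address.

    headers is assumed to be a dict with lowercased header names.
    X-Real-IP is only believed when the TCP peer is a known proxy,
    since any client can send that header itself.
    """
    if peer_addr in trusted_proxies:
        return headers.get("x-real-ip", peer_addr)
    return peer_addr
```

Checking the peer address first means the same code keeps working when the server runs standalone, without a proxy in front of it.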

-----

4 points by palsecam 5426 days ago | link

files.arc (where 'mtime and co are defined; web-static.arc depends on it) is available at http://pastebin.com/YGNZA6SG

----

An example of a login procedure, taken from one of my websites:

  (def new-login-handler ((o redir "/"))
    [let user (arg _ "user")
      (redirect 
        (if (login user (arg _ "pwd"))
    	    redir
	    (opurl:new-login-op redir "Bad credentials" user)))])

  (def login-page (req (o redir "/") (o msg) (o userval))
    (page req "Login"
      (tag (p class "err") (prt msg))
      (fnform (new-login-handler redir)
        (lblinp "Username: " "user" "text" userval)
        (br)
        (lblinp "Password: " "pwd" "password")
        (br2)
        (but "login"))
      (ijs "document.getElementById('user').focus();")))

  (def new-login-op ((o redir "/") (o msg) (o userval))
    (newop [login-page _ redir msg userval]))

  (defpath /login (req)  (login-page req))

'page is a macro on top of 'htmlpage to create a page with the look&feel of the project site.

'login is something like: (goodcrypt pwd (get-passwd-of-user user)) ('goodcrypt is in files.arc).
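
For comparison, here is what a salted password check of that general kind can look like in Python. This is only a sketch and makes no claim about what 'goodcrypt actually does: the function names, the 16-byte salt and the round count are all assumptions chosen for the example.

```python
import hashlib
import hmac
import os

SALT_LEN = 16  # bytes; arbitrary choice for this sketch


def hash_password(pwd, salt=None, rounds=100_000):
    """Return salt + PBKDF2-HMAC-SHA256 digest, suitable for storing."""
    salt = os.urandom(SALT_LEN) if salt is None else salt
    digest = hashlib.pbkdf2_hmac("sha256", pwd.encode(), salt, rounds)
    return salt + digest


def check_password(pwd, stored, rounds=100_000):
    """Recompute the digest with the stored salt and compare in constant time."""
    salt, digest = stored[:SALT_LEN], stored[SALT_LEN:]
    candidate = hashlib.pbkdf2_hmac("sha256", pwd.encode(), salt, rounds)
    return hmac.compare_digest(candidate, digest)
```

A login function in this style would fetch the stored value for the user and return the result of check_password, much like the one-liner described above.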

"op" means "operation" in my lexicon, and is for /x/... paths, stateful actions. I know, this is confusing, it's not the same notion than in srv.arc ('defop). But in my mind, even when it comes to webapps, the default is statelessness and resources-oriented, not stateful operations. People (me the first) basically only care about "resources" (informations, content), be it a rich Ajax-full webapp or a basic HTML page, anyway.

In srv.arc, the "operations" system is +/- the 'fns* / 'fnids* / 'flink / etc. stuff.

----

The complete nginx config file for the above project site:

  ##
  # Nginx configuration file for <proj> on localhost.
  #
  # Install with:
  # ln -s /home/<PROJ>/res/nginx-localsite.conf /etc/nginx/sites-enabled/<PROJ>
  ##

  server {
       listen		8030;

       root		/home/<PROJ>/res/;
       error_log	/var/log/nginx/<PROJ>-error.log;
       access_log	off;

       
       rewrite "/static/(.*)" "/minified/$1" break;

       location /minified/ {
       		internal;

		if (!-f $request_filename) {
		   rewrite "/minified/(.*)" "/static/$1" break;
		}

		if ($query_string) {
             	   expires		+1y;
		}
       }

       location /static/ {
             	internal;
       }

       location ~ "^/(favicon\.ico|robots\.txt)$" {
		expires	       		+2M;
       }


       location / {
		access_log		/var/log/nginx/<PROJ>-access.log;
       		proxy_pass		http://localhost:8020;
		proxy_set_header	X-Real-IP  $remote_addr;
		proxy_pass_header	Server;
       }
  }

Gzipping and other general config directives are defined in the main /etc/nginx/nginx.conf file and are invisibly "inherited" here.

I don't log accesses to /static/* and /favicon.ico / /robots.txt, but I do log accesses to the rest of the website (access_log directive in the "location /" block).

Nginx doc @ http://wiki.nginx.org/NginxModules

----

A comment in web.arc mentions scheme2js, more info here: http://www-sop.inria.fr/mimosa/scheme2js/ It's a Scheme to JavaScript compiler, which is smart enough to replace tail calls with a while loop or a trampoline (depending on the case), and that can do a bunch of other optimizations too (like inlining calls to +).

It is used in the HOP project (http://hop.inria.fr/) which is a framework to develop rich webapps (i.e: w/ a rich javascript-backed client GUI and w/ Ajax) using Scheme for the server and the client. It is quite impressive (the website is a demo). To Thaddeus: wtf would you want to compile Arc to CoffeeScript and not to raw Javascript directly?! If you decide to write an Arc to JS compiler, be sure to check out scheme2js!

-----

2 points by thaddeus 5426 days ago | link

-> why compile Arc to CoffeeScript?

good question. scheme2js looks better.

BTW Thanks for posting all this information! This gives me more to sink my teeth into. :)

-----

1 point by shader 5412 days ago | link

Any idea how hard it would be to port the forum app to web.arc? It would give us an interesting comparison of the two systems.

I'm partly interested because I've been working very slowly on creating an fcgi interface for arc. I wanted to run it on a shared host that allows arbitrary scripts to be run, but only as long as they don't open their own network sockets.

Unfortunately, while I got the fcgi interface to work (not fun by the way; fcgi is horrible), and simple "hello world" scripts running directly on srv.arc work fine, more complicated things like the forum have mysterious bugs. It could easily be that something is wrong with my code, but given the extreme lack of debugging tools, and the mysteriousness of the bugs, I wonder sometimes if it isn't something wrong with the arc app stack itself. It would be awesome if switching the forum over to your web stack would make it more stable.

Btw, you should also put your code in a more permanent location, like the anarki repo.

-----

1 point by aw 5412 days ago | link

I'm curious why you want to run on a shared host? Is it to save money or for some other reason?

-----

1 point by shader 5412 days ago | link

The main reason is that I already have a shared host, so I figured I might as well run arc on it, instead of paying for a separate VPS. Especially if I can get better performance/reliability than running it locally.

Also, I figured that there might be some other people interested in doing the same thing.

-----

1 point by evanrmurphy 5415 days ago | link

Exciting stuff, palsecam. Though I've been really pleased by how quickly srv + app.arc has let me get some working apps going ("OMG so easy" as you say), I do foresee needing more control in the future. Will definitely consider http + web.arc.

-----