Arc, Plan 9, and UTF-8
2 points by eekee 5956 days ago | 2 comments
Plan 9 is an operating system which derives a wide range of very powerful features from a few very simple principles. I think Plan 9 is to operating systems what Arc wants to be to languages. I'm surprised at the lack of mention of Plan 9 on these forums, but I'm not writing this post to praise Plan 9.

If you were designing an operating system which derived a wide range of powerful features from a few very simple principles, wouldn't you just hate to have to support all the different language encodings out there? That's exactly the issue the Plan 9 team faced. I'll let the following archived emails tell the rest of the story:

http://doc.cat-v.org/bell_labs/utf-8_history

The end of the story is that UTF-8 was implemented so ubiquitously in Plan 9 that you can even use UTF-8 characters in C function and variable names.

I'm writing because I can't understand why Arc is stuck with ASCII. Granted, UTF-8 makes mutable strings harder, since characters vary in width and replacing one can change the byte length of the string, but a paragraph from the emails is worth repeating:

    6) It should be possible to find the start of a character
    efficiently starting from an arbitrary location in a byte
    stream.
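
To illustrate why this property holds: in UTF-8 every continuation byte has the bit pattern 10xxxxxx, so a reader can back up from any offset until it reaches a byte that is not a continuation byte. The sketch below is only an illustration, not code from Plan 9 or Arc; the function name `char_start` is made up here, and it assumes the input is valid UTF-8.

```python
def char_start(data: bytes, i: int) -> int:
    """Return the index of the first byte of the character containing data[i].

    Hypothetical helper; assumes `data` is valid UTF-8 and 0 <= i < len(data).
    """
    # Continuation bytes look like 10xxxxxx, i.e. (byte & 0xC0) == 0x80.
    # A character is at most 4 bytes long, so this loop backs up at most 3 times.
    while i > 0 and (data[i] & 0xC0) == 0x80:
        i -= 1
    return i


# "a" is 1 byte, "é" is 2 bytes, "€" is 3 bytes, "b" is 1 byte.
sample = "a\u00e9\u20acb".encode("utf-8")
start = char_start(sample, 5)  # offset 5 is the last byte of "€"
```

Because the scan never looks more than three bytes back, the cost is constant regardless of how long the stream is.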
If the ability to modify any character in a string is still desirable, Arc could use a 32-bit representation internally. Being internal, the 32-bit representation needn't be standardised, although standardisation might be useful to some.
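
As a rough sketch of that idea (not how Arc actually stores strings), the class below keeps text in fixed 4-byte slots so that any character can be overwritten in place, and converts to UTF-8 only at the boundary. The class name `MutableString` and its methods are invented for this example, and UTF-32 little-endian is just one possible choice for the internal 32-bit form.

```python
class MutableString:
    """Hypothetical mutable string stored as fixed-width 32-bit code points."""

    WIDTH = 4  # bytes per character in the internal form

    def __init__(self, text: str):
        # "utf-32-le" writes exactly 4 bytes per code point and no byte-order mark.
        self._buf = bytearray(text.encode("utf-32-le"))

    def __len__(self) -> int:
        return len(self._buf) // self.WIDTH

    def _offset(self, i: int) -> int:
        if not 0 <= i < len(self):
            raise IndexError("character index out of range")
        return i * self.WIDTH

    def __getitem__(self, i: int) -> str:
        off = self._offset(i)
        return self._buf[off:off + self.WIDTH].decode("utf-32-le")

    def __setitem__(self, i: int, ch: str) -> None:
        if len(ch) != 1:
            raise ValueError("expected exactly one character")
        off = self._offset(i)
        # Same-size slice assignment: no other character has to move.
        self._buf[off:off + self.WIDTH] = ch.encode("utf-32-le")

    def to_utf8(self) -> bytes:
        """Convert to UTF-8 for I/O; the 32-bit form stays internal."""
        return self._buf.decode("utf-32-le").encode("utf-8")


s = MutableString("naive")
s[2] = "\u00ef"  # replace "i" with "ï", which needs 2 bytes in UTF-8
utf8_bytes = s.to_utf8()
```

The trade-off is memory: every character costs four bytes internally, even plain ASCII, in exchange for constant-time indexing and in-place modification.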


4 points by stefano 5956 days ago | link

There has been some discussion about UTF-8 since the first release of Arc. Arc should now support UTF-8 correctly.

-----

1 point by projectileboy 5944 days ago | link

I agree. Of course, there are still some ASCII-based little helper functions, but I recall seeing PG post something in Icelandic characters in this forum after the UTF-8 thing was addressed.

-----