A blog about software – researching it, developing it, and contemplating its future.

Archive for April 2008

Only 20,000 Lines

leave a comment »

A while back I posted a big ol’ post titled A Growable Languge Manifesto which argued strongly for extensible languages.

Well, I just ran across the one-year progress report from Alan Kay‘s current research group, and it’s some extremely exciting work that is all about extensible languages!

The group is the Viewpoints Research Institute, and the progress report lays out their plan to implement a complete software stack — everything from the graphics driver, to the language interpreter / compiler, to the TCP/IP stack, to the windowing and rendering system, to the IDE and programming environment — in 20,000 lines of code. Total.

As they point out, a TCP/IP stack alone in many conventional languages is more than 20,000 lines. So how can they possibly pull it off?

The answer, it turns out, is extensible languages. Specifically, they have an extensible parsing system — OMeta, cited heavily in my manifesto — which allows them to easily and quickly extend their languages. They also have a “meta-meta-language runtime and parametric compiler” named IS, which is how they actually get their languages and metalanguages into executable form.

One especially cool example is their TCP/IP stack. The TCP/IP specification has text diagrams of packet formats. So they wrote a grammar to parse those specifications directly as ASCII. And lo and behold, they could use the TCP/IP RFCs themselves to generate their source code. They also can use their parsing framework to analyze the structure of TCP/IP messages — they basically define a grammar for parsing TCP/IP messages, and action rules for handling the various cases. (OMeta lets executable code be attached to matching productions in a grammar.)

They also wrote domain-specific languages for just about every area. One example is low-level pixel compositing, basically giving them the functionality, and most of the efficiency, of a generative 2D pixel processing library such as Adobe’s Generic Image Library, cited in Bourdev and Jaaki’s LCSD 2006 paper. Another example is polygon rendering (450 lines of code that implements anti-aliased rasterization, alpha compositing, line and Bezier curve rendering, coordinate transformations, culling, and clipping). Though evidently they have yet to fully define a custom language for polygon rendering, and they hope to cut those 450 lines by “an order of magnitude”.

Basically, they take almost all parsing and optimization problems and express them directly in their extensible language, which gives them almost optimal flexibility for building every part of the system in the most “language-centric” way possible.

They have no static typing at all, which doesn’t work for me (though their line counts make a compelling argument), but there’s no reason in principle that these techniques couldn’t also apply to static type system construction.

In fact, there is work right now on intuitive language environments for creating written formal definitions of programming languages. A system like Ott lets you write semantics declarations that look like they came straight out of a POPL paper, and convert them into TeX (for printing) or Coq/Isabelle/HOL (for automated verification). I don’t know how far the implementation of Ott or Coq/Isabelle/HOL would change if Viewpoint’s techniques were aggressively applied, but I look forward to finding out!

I think this kind of programming is the wave of the future. Reducing lines of code has always been an excellent way to improve your software, and the expressiveness of a language has always shaped how succinctly you can write your code. From that perspective, it seems obvious that an extensible language would provide maximum potential increase in expressiveness, since you can tune your language to the details of your specific problem. It’s a force multiplier, or possibly a force exponentiator.

Object-oriented class hierarchies, query languages, XML schemas, document structures, network protocols, display lists, parse trees… they all share a common meta-structure that the “extensible languages” concept subsumes, and the Viewpoint project is the clearest evidence I’ve seen of that yet. It’s going to be a dizzyingly exciting next few years in the programming language world, and programming might just get a lot more interesting soon — for everybody.

There’s more to say on that topic, but I’ll save it for another post!

Written by robjellinghaus

2008/04/15 at 04:37