Archive for the ‘open source’ Category
It’s been a very interesting month despite the fact that I haven’t touched a line of Holofunk code! I want to deeply thank everyone who’s expressed excitement about this project — it has been a real thrill.
First I have a favor: if you like Holofunk, please like Holofunk’s Facebook page — that is a great way to stay in touch with this project and with other links and interesting things I discover.
In this post I want to mention a variety of other synesthetic projects that people have brought to my attention, and I want to recap the places that have been kind enough to mention Holofunk.
First and foremost, let me say that, as with my first Holofunk post, I find all of these projects very thought-provoking and impressive, and I am linking them here out of appreciation and excitement. Since I have many plans for Holofunk, I do find myself wanting to take various aspects of these projects and build them into Holofunk. I sincerely hope that the artists and engineers who have produced this work take that as the appreciation it is, rather than feeling threatened or irritated by it. There are obviously a lot of us creating new musical/visual art out there, and I hope that others are as inspired by my work as I am by theirs.
Holofunk is and will remain open source, under the very permissive Microsoft Public License, so if anyone who’s inspired me winds up wanting to make use of something I’ve done, it is entirely possible. (Please let me know if you do, though, as I’ll be very interested and pleased!)
Synesthesia On Parade
One project Beardyman mentioned to me was Imogen Heap’s musical data gloves. It took me a while to get around to looking them up, but when I eventually did I was gobsmacked:
Imogen Heap is of course a brilliant and well-known artist, and these gloves are her vision for where she wants to take her performance. Her technical partner in this project is Tom Mitchell, a Bristol professor of music who was kind enough to reply when I wrote him a gushing email.
The system he’s developed with Imogen is best documented by this paper in the proceedings of the New Interfaces for Musical Expression 2011 conference. And now I need to go off and download and read the complete proceedings, because it’s all right up Holofunk’s alley.
Tom and Imogen are using 5DT data gloves, which are $1,500 for a pair with a wireless connection, as well as a pair of AHRS position sensors (about $500 each). So their hardware is out of my hobby-only price league. I am interested in the Peregrine glove (only $150 per), but unfortunately it’s exclusively left-handed at present, though I wrote them and they said Holofunk was quite exciting and they would love to be involved, so there’s hope! Anyway for now I will stick with Wiimotes as they are cheap and relatively ubiquitous.
Latency is a huge concern for Tom — the AHRS position sensors have a 512Hz update cycle, which is extremely impressive. The Kinect will never come close to that, which again motivates sticking with some additional lower-latency controls. Plenty of people I showed Holofunk to at Microsoft want me to build a Wiimote-less version, and I probably will experiment with that — including using the Kinect beam array as the microphone — but it honestly can’t compete with a direct mike and button/glove input as far as latency goes. Darren (Beardyman) specifically mentioned how impressed he was that I’d gotten the latency right (or at least close to right) on Holofunk; evidently lots of programmers he talks to build things that are very latency-unaware, making them useless for performance. So while a pure-Kinect version would be very interesting (and obviously quite marketable!), it’s not my priority.
I am hoping to make some waves inside Microsoft as far as getting better low-latency audio support in Windows… ASIO shouldn’t be necessary at all; Windows — and Windows Phone — should be able to do low-latency audio just as well as the iPhone can! And for proof that the iPhone gets this right, here’s our friend Darren rocking the handheld looper:
The app there is evidently Everyday Looper, and dammit if it shouldn’t be possible to write that for Windows Phone 7, but I don’t think it can be done yet. This will change, by Heaven. In fact, writing this post got me to actually look the app up, and that turns up this stunningly cool video demonstrating how it works. Plenty of inspiration here too:
Good God, that’s cool.
One other project Tom mentioned is the iPhone / iPad app, SingingFingers:
That’s synesthesia in its purest form: sound becomes paint, and touching the paint lets the sound back out. I totally want to build some similar interface for Holofunk. Right now a Holofunk loop-circle is dropped wherever you let go of the Wiimote trigger while you’re recording it, but it would be immensely straightforward to instead draw a stroke along the path of your Wiimote-waving, and then animate that stroke with frequency-based colors. It would also be fascinating to allow those strokes to be scratched back and forth, though I’m not yet sure that a freeform stroke is the most usable structure for scratching.
I am sure I will turn up a colossal quantity of other excellent projects as I move forward with Holofunk, and I will certainly blog the pants off of them because it’s dizzying how much work is being done here, now that every computer and phone you touch can crank dozens of realtime tracks through it. Wonderful time to be an electronic musician, and the future is dazzling…
Holofunk Gets Press
I also very much appreciate the sites that have linked to Holofunk.
Bill Harris, an excellent sports/gaming blogger, was nice enough to mention Holofunk.
The number one Kinect hacking site on the web, KinectHacks.net, asked me to write up a description of Holofunk, which they posted. They get mad hits, so this is lovely. An experimental music/art collective in Boston, CEMMI, already contacted me as a result of the kinecthacks post!
…And now that I am surfing kinecthacks.net, I find that I might be wrong about how possible it is to do Holofunk with just Kinect. This guy seems to get a lot of pretty fast wiggle action going on here:
Getting effects like that into Holofunk is definitely on the agenda for early next year.
Still Taking It A Bit Easy
Now, all that wonderfulness having been well documented, I must confess that I am still on low hacking capacity, Holofunk-wise. And here’s where this post veers into totally off-topic territory, so you’ve been warned!
I’m a gamer, you see, and Q4 of every year is the gamer’s weak spot. I’ve been playing the heck out of Deus Ex: Human Revolution, a really excellent homage to a famous game from ten years ago. I played that game then, and I’m totally digging this one now.
Then on November 11th, the unbelievably huge game Skyrim ships. My friend Ray Lederer is one of the lead concept artists on the game (check out this video of him at work), and the game could take over a hundred hours to complete, so that’s a month and a half shot right there.
And THEN, soon after THAT, the expected-to-be-superb Batman: Arkham City game comes out. I played the first Batman game from these guys to smithereens, and I am expecting to do likewise with this one.
So… yeah… the next few months have some stiff competition. However, given how much excitement there is around Holofunk, I do plan to make these the only games I play in 2011. There are just not enough hours in the day to read, watch, listen to, or play every good book, movie, track, or game in the world, LET ALONE do any actual work of one’s own! So one has to be picky, and the above are my picks.
But Once That’s Over With…
My only specific goal for Holofunk in 2011 is to rewrite the core audio pump in C++ to get away from the evil .NET GC pauses.
Then, in 2012, I plan to get seriously down to business again, feature-wise.
The number one feature is probably going to be areas — chopping up the sound space into six or so regions, and allowing entire areas to be muted or effected as a whole. That will allow Holofunk to become useful for actual song creation, since you’ll be able to bridge into other portions of a song in a coherent way.
The second feature will probably be effects. Panning, volume, filtering, etc. — adding that stuff will do a huge amount for making Holofunk more musically interesting.
Then will come visuals — SingingFingers meets Holofunk. Should make the display radically more interesting and informative.
After that, probably scratching / loop-cutting. I have no idea what the interface will be, but being able to chop up loops and resample them is part of every worthwhile looper out there (see Everyday Looper’s awesome video above), so Holofunk has got to have it. Going to be challenging to do it with just a Wiimote, but it’s got to be possible, it’s GOT to be!
And then, most likely, video. Stenciling out Kinect video and time-synchronizing it with the loops could be all kinds of wacky fun — I cited this in my last blog post as the “live Monkey Jazz” possibility.
All that together should hopefully take only until mid-2012 or so, at which point I want to start rehearsing with it in earnest and actually performing with it. If I can’t get a slot at a TEDx conference, I’m just not trying hard enough.
Thanks as always for your interest, and stay in touch — 2012 will be an epic year! I feel much more confident saying things like that now that I’ve actually gotten this project off the ground 🙂
Dang it’s been quiet around here lately. Too quiet. One might think I had no intention of ever blogging again. Fortunately for us all, the worm has turned and it’s time to up the stakes considerably, as follows:
I mentioned in a blog some time ago that I had a pet hacking concept called Holofunk. That’s what I’ll mostly be blogging about for the rest of the year.
There has been a lot of competition for my time — I’ve got two awesome kids, three and six, which is an explanation right there; and I spent the first half of the year working on a hush-hush side project with my mentor. Now that project has wound down and Holofunk’s time has finally come.
One thing I know about my blogging style is that it works much better if I blog about a project I’m actively working on. Back in the day (e.g. 2007, still the high point for blog volume here), I was contributing to the GWT open source project, and posting like mad. Since joining Microsoft in 2008, though, I’ve done no open source hacking to speak of. That’s about to change.
Holofunk is a return to the days of public code, since I’ll be licensing the whole thing with the Microsoft Public License (that being the friendliest one legally, as well as quite compatible with my goals here). So now I can hack and talk about it again, and that’s what I intend to do. The rest of 2011 is my timeframe for delivering a reasonably credible version of Holofunk 1.0. Feel free to hassle me about it if I slack off! It never hurts motivation to have people interested.
So What The Pants Is Holofunk Anyway?
My post from last year gave it a good shot, but I think some videos will help a great deal to explain what the heck I’m thinking here. Plus it livens up this hitherto pure wall-of-text blog considerably.
First, a video from Beardyman, who is basically my muse on this project. This video is him performing live, recording himself and self-looping with two Korg KAOSS pads, while being recorded from multiple cameras. The audio is all done live. Then a friend of his edited the video (only) such that the multiple overlaid video images parallel the audio looping that he’s doing. In other words, the pictures reflect the sounds. Check it:
OK. So that’s “live looping” — looping yourself as you sing. (Beardyman is possibly the best beatboxer in the world, so he’s got a massive advantage in this artform, but hey, amateurs can play too!)
Now. Here’s a totally sweet video of a dude who’s done a whole big bunch of gesture recognition as a frontend to Ableton Live, which is pretty much the #1 electronic music software product out there:
You can see plenty of other people are all over this general “gestural performance” space! In fact, given my limited hacking bandwidth, it’s entirely possible someone else will develop something almost exactly like what I have in mind and totally beat me to it. That would be fine — if I can play with their thing, then great! But working on it myself has already been very educational and promises to get much more so.
Here’s one more Kinect-controlled Ableton phenomenon. This one is a lot more ambient in nature, and this guy is even using a Wiimote as well. He includes views of the Ableton interface:
So those are some of my inspirations here.
My concept for Holofunk, in a nutshell, is this: use a Kinect and a Wiimote to allow Beardyman-like live looping of your own singing/beatboxing, with a gestural UI to actually grab and manipulate the sounds you’ve just recorded. Imagine that dude in the second video had a microphone and was singing and recording himself even while he was dancing, and that his gestures let him manipulate the sounds he’d just made, potentially sounding a lot like that Beardyman video. That’s the idea: direct Kinect/Wiimote manipulation of the sounds and loops you’re making in realtime. If it still makes no sense, well, thanks for making the effort, and hopefully I’ll have some videos once I have something working!
Ideas Are Cheap, Champ
One thing I’ve deeply learned since starting at Microsoft is that big ideas are a dime a dozen, and without execution you’re just a bag of hot wind. So by brainstorming in public like this I run a dire risk of sounding like (or actually being) a mere poser. Let me first make very clear that all the projects above, that already actually work, are awesome and inspiring, and that I will be lucky if I can make anything half as cool as any one of them.
That said, I am going to soldier on with sharing my handwavey concepts and preliminary investigations, since it’s what I got so far. By critiquing these other projects in the context of mine, I’m only trying to be clear about what I’m thinking; I’m not claiming to have a “better idea”, just (what I think is) a different idea. And as I said, everyone else is free to jump on this concept, this is open source brainstorming right here!
The general thing I want to have, that none of the projects above have quite nailed, is a clear relationship between your gestures, your singing, the overall sound space, and the visuals. I want Holofunk to make visual and tangible sense. Loops should be separately grabbable and manipulable objects, that pulse in rhythm with the system’s “metronome”, and that have colors based on their realtime frequency. (So a bass line would be a throbbing red circle and a high-hat would be a pulsing blue ring.) It should be possible for people watching to see the sounds you are making, as you make them, and to follow what you’re doing as you add new loops and tweak existing ones. This “visual approachability” goal will hopefully also make it much easier to actually use Holofunk, not just watch it.
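To make that color scheme concrete, here’s one possible sketch of the mapping (everything here is an illustrative guess — the frequency range, the log scaling, and the exact hue endpoints would all need tuning by eye once real loops are pulsing on screen): take a loop’s dominant frequency and slide it along the spectrum, red for bass through violet for treble.

```python
import colorsys
import math

def freq_to_rgb(freq_hz, lo=60.0, hi=8000.0):
    """Map a dominant frequency onto a spectrum color: low = red, high = violet.

    The lo/hi range and the log mapping are guesses for illustration;
    the real Holofunk scheme would be tuned against actual loop spectra.
    """
    # Position the frequency on a log scale between lo and hi, clamped to [0, 1].
    t = (math.log(freq_hz) - math.log(lo)) / (math.log(hi) - math.log(lo))
    t = min(1.0, max(0.0, t))
    # HSV hue 0.0 is red; ~0.78 is violet. Full saturation and brightness.
    hue = t * 0.78
    return colorsys.hsv_to_rgb(hue, 1.0, 1.0)

bass_color = freq_to_rgb(80)     # near pure red
hihat_color = freq_to_rgb(6000)  # toward blue/violet
```

So a throbbing bass loopie would render near-red and a high-hat near-violet, which matches the “see the sounds” goal: an audience member can read the rough register of each loop at a glance.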
For an example of how this kind of thing can go off the rails, check out this video of Pixeljunk Lifelike, from a press demo at the E3 video gaming conference:
This is cool, but too abstract, as this review of the demo makes clear:
Then a man got up and began waving a Move controller, and we heard sounds. The screen showed a slowly moving kaleidoscope. I couldn’t tell how his movements impacted the music I was hearing or the images I was seeing. This went on for over 20 minutes and it felt like a lifetime.
Beardyman is also notoriously challenged to communicate what the hell he is actually doing on stage. He admits as much in this clip from him performing on Conan O’Brien (at 1:20):
My ultimate dream for Holofunk is to make it so awesomely tight that Beardyman himself could perform with it, and people could more easily understand what the hell is going on as his piece trips them out visually as well as aurally. That’s the ultimate goal here: make the audible visible, and even tangible. Holofunk.
(Now, realistically there’s no way Beardyman would actually do better with a single Wiimote than with four full KAOSS pads — he’s just got a lot more control power there. Still, let’s call it an aspirational goal.)
Ableton Might Not Cut It
I knew jack about realtime audio processing when I started researching all this last year. I actually started out by getting a copy of Ableton Live myself, since I figured that it already did all the sound processing I could possibly want, and more. People hacking it with Kinect are all over the net, too, and it’s got a very flexible external API. I fooled around with it at home, recording some tracks myself.
But the more I played with it, the more I started questioning whether it would ultimately be the right thing. Ableton was originally engineered on the “virtual synthesizer & patch kit” paradigm. It’s a track-based, instrument-based application, in which you assemble a project from loops and effects that are laid out like pluggable gadgets.
The problem is that the kind of live looping I have in mind for this project is going to have to be very fluid. Starting a new track could happen at the click of a button. Adding effects and warps is going to be very dynamic. Literally every Ableton-based performance I have seen is structured around creating a set of tracks and effects, and then manipulating the parameters of that set in realtime. Putting Kinect on top of Ableton seems to basically turn your body into a very flexible twiddler of the various knobs built into your Ableton set. The “Kin Hackt” video above shows the Ableton UI “under the hood”, but even the much more dynamic and involving “dancing DJ” above is still fundamentally manipulating a pre-recorded set of tracks (though he’s recording and looping his gestural manipulations of those tracks).
I was pretty sure that while I could get a long way with Ableton, I’d ultimately hit a wall when it came to really getting to slice up a realtime microphone track into a million little loops. So I was finding myself itching to just start writing some code, building callbacks, handling fast Fourier transforms, and just generally getting my hands directly on the samples and controlling all the audio myself. Perhaps it’s just programmer hubris, but I ultimately decided it was too risky to climb the full Ableton/Live/MAX learning curve only to perhaps finally discover it wouldn’t be flexible enough.
The second video above calls itself “live looping with Kinect and Ableton Live 8”, and it is live looping in that he’s obviously recording his own movements, such that the gestures he makes shape one of the tracks in his Ableton set, and he then loops the shaped track. Perhaps it would be trivial to add a microphone to the experience and loop a realtime-recorded track. Looks like I’ll be looking that dude up! But on my current path I’ll be building the sound processing in C# directly.
Latency Is Death: The Path To ASIO
When first firing up Ableton, with an M-Audio Fast Track Pro USB interface, I found things laggy. I would sing or beatbox into the microphone, and I would hear it back from Ableton after a noticeable delay. Just as a long-distance phone call can lead to people tripping over each other, even small amounts of latency are seriously annoying for music-making.
So latency is death. It turns out that Windows’ own sound APIs are not engineered for low latency, as they have a lot of intermediate buffering. The most common solution out there is ASIO, a sound standard from Steinberg. There is a project named ASIO4ALL which puts out what amounts to a universal USB ASIO driver, enabling you to get low-latency sound input from USB devices generally. Installing ASIO4ALL immediately fixed the latency issues with Ableton. So it’s clear that, given that I’m developing on Windows, ASIO is the way to go for low-latency sound input and output.
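To put rough numbers on why the buffering matters (the buffer sizes below are illustrative guesses, not measurements of any particular driver): each buffer in the chain contributes samples-divided-by-sample-rate worth of delay, so fat intermediate buffers add up fast.

```python
def buffer_latency_ms(frames, sample_rate=44100):
    """Milliseconds of delay contributed by one buffer of `frames` samples."""
    return 1000.0 * frames / sample_rate

# A big intermediate buffer adds clearly audible lag;
# small ASIO-style buffers stay under the threshold of annoyance.
big_buffer = buffer_latency_ms(4410)   # 100.0 ms -- obvious slapback echo
asio_style = buffer_latency_ms(256)    # ~5.8 ms -- tight enough for live looping
```

Stack two or three 100ms-class buffers between microphone and speaker and you get exactly the long-distance-phone-call effect described above.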
On the latency front, it’s also worth mentioning this awesome article on latency reduction from Gamasutra. I will be following that advice to a T.
.NET? Are you crazy?
I’m going to be writing this thing in C# on Windows and .NET. The most obvious reason for this is that I work for Microsoft and like Microsoft products. The less obvious reason is that I find C# a real pleasure to program in, and very efficient when used properly.
My boss is fond of pointing out that pointers are essentially death to performance, in that object references generally directly imply garbage collector pressure and cache thrashing, both of which are terrible. But in C#, with struct types, you can represent things much more tightly if you want. You can also avoid famous problems like allocating lambdas in hot paths.
In the particular case of Holofunk, the most critical thing to get right is the buffer management. I will need to make sure I know how much memory fragmentation I’m getting and how many buffers ahead I should allocate. My hunch is I’ll wind up allocating in 1MB chunks from .NET, and having a sub-allocator chop those up into smaller buffers I can reference with some BufferRef struct.
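To make that hunch concrete, here’s a sketch of the chunked sub-allocator idea — in Python just to show the shape of it; the real thing would be C# with struct-typed BufferRefs. Grab big chunks from the runtime rarely, hand out fixed-size slices, and recycle them, so the GC never sees per-buffer churn. All the sizes and names here are invented for illustration.

```python
class BufferAllocator:
    """Sketch of a sub-allocator: allocate 1MB chunks from the runtime,
    chop them into fixed-size sub-buffers, and recycle rather than free.
    Sizes are illustrative, not tuned."""

    CHUNK_BYTES = 1 << 20       # 1MB chunks requested from the runtime
    BUFFER_BYTES = 16 * 1024    # fixed-size slices handed to audio code

    def __init__(self):
        self.free = []          # recycled (chunk, offset) pairs

    def _new_chunk(self):
        chunk = bytearray(self.CHUNK_BYTES)
        for offset in range(0, self.CHUNK_BYTES, self.BUFFER_BYTES):
            self.free.append((chunk, offset))

    def alloc(self):
        # Only touch the runtime allocator when the free list runs dry.
        if not self.free:
            self._new_chunk()
        return self.free.pop()  # plays the role of a BufferRef

    def release(self, ref):
        self.free.append(ref)   # reuse: no garbage generated per buffer
```

The point of the design is that steady-state recording allocates nothing at all — buffers just cycle through the free list, keeping GC pauses out of the audio path.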
Anyway the point is that I know there are performance ratholes in .NET, but my day job has given me extensive experience at perf tuning C# programs generally, so I am not too concerned about it right now.
And, of course, Microsoft tools are pretty darn good compared to some of the competition. Holofunk will be an XNA app for Windows, giving me pretty much the run of the machine with a straightforward graphics API that can scale up as far as I’m likely to need. I’ve taken the classic “adapt the sample” approach to getting my XNA project off the ground, and I’m developing some minimal retained scene graph and state machine libraries.
What about Kinect?
Microsoft just released the Windows Kinect SDK beta, which is dead simple to use — maybe a page of code to get full skeletal data at 15 to 20 frames per second in C# (on my Core 2 Quad Q9300 PC from three years ago). So that’s the plan there.
It doesn’t support partial skeletal tracking, or hand recognition, or a variety of other things, and it has a fairly restrictive noncommercial license. But none of those are at all showstoppers for me, and the simplicity and out-of-the-box it-just-works factor are high enough to get me on board.
Why a Wiimote? And how?
I’ve mentioned “Wiimote” a few times. The main reason is simple: low-latency gesturing.
It’s no secret that Kinect has substantial latency — at least a tenth of a second or so, and probably more. What is latency? Death. So having Kinect be the only gestural input seems doomed to serious input lag for a music-making system. Moreover, finger recognition for Kinect is not available with the Microsoft SDK. I could be using one of the other open source robot-vision-based Kinect SDKs (there’s one from MIT that can do finger pose recognition), but that would still have large latency, and would require the Kinect to be closer to the user. I want this to be an arm-sweeping interface that you use while standing and dancing, not a shoulders-up interface that you have to remain mostly still to use.
I can’t see how to do a low-latency direct manipulation interface without some kind of low-latency clicking ability. That’s what the Wiimote provides: the ability to grab (with the trigger) and click (with the thumb), with a bunch of other buttons thrown into the bargain.
A sketch of the interaction design (I am not an interaction designer, can you tell?) is something like this:
- Initial screen: a white sphere in the center of a black field, overlaid with a simple line drawing of your skeleton. Hands are circles.
- Sing into microphone: sphere changes colors as you sing.
- The central sphere represents the sound coming from your microphone.
- (First color scheme to try: map frequencies to color spectrum, and map animated spectrum to circle, with red/low in center and violet/high around rim.)
- Reach out at screen with Wiimote hand: see skeleton track.
- Move Wiimote hand over white sphere: hand circle glows, white sphere glows.
- Pull Wiimote trigger: white sphere clones itself; cloned sphere sticks to Wiimote hand.
- The cloned sphere is a loop which you are now recording.
- Sing into microphone while holding trigger: cloned sphere and central sphere both color-animate the sound.
- Release Wiimote trigger: cloned sphere detaches from Wiimote hand and starts looping.
- Letting go of the trigger ends the loop and starts it playing by itself. The new sphere is now an independent track floating in space, represented by an animated rainbow circle.
That’s the core interaction. And the key is that the system has to respond quickly to trigger presses. You really want to be able to flick the trigger quickly to make separate consecutive loops, and less latency in that critical gesture is going to make life much simpler.
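The trigger interaction above boils down to a tiny state machine. Here’s a sketch of the record/release cycle — the name “loopie” is from this post, but everything else is invented for illustration, and the real version obviously has to run against the audio clock rather than in-memory lists:

```python
class LoopRecorder:
    """Minimal sketch of the trigger-driven loop lifecycle:
    trigger down clones the mic sphere and starts recording;
    trigger up detaches the loopie and it starts playing on its own."""

    def __init__(self):
        self.recording = False
        self.loopies = []    # completed loops, now playing independently
        self.current = None  # samples captured since trigger went down

    def trigger_down(self):
        self.recording = True
        self.current = []    # the cloned sphere: a fresh, empty loop

    def feed_samples(self, samples):
        # Called continuously with microphone input; only captured while held.
        if self.recording:
            self.current.extend(samples)

    def trigger_up(self):
        self.recording = False
        self.loopies.append(self.current)  # detach: loop starts playing
        self.current = None
```

Keeping this state machine trivially simple is what makes quick trigger-flicks for consecutive loops possible — each down/up pair is one complete loopie, no modes to back out of.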
So a Wiimote it is. Fortunately there is a .NET library for connecting a Wiimote to a PC via Bluetooth. It was written by the redoubtable Brian Peek, who, as it happens, also worked on some of the samples in the Windows Kinect SDK. This project would not be nearly as feasible without his libraries! I got a Rocketfish Micro Bluetooth Adapter at Best Buy, and the thing is shockingly tiny. With a bit of finagling (it seems to need me to reconnect the Wiimote from scratch on each boot), I was able to rope it into my XNA testbed.
You don’t really want to write a whole DSP library from scratch, do you?
Good God, no. Without Ableton Live, I need something to handle the audio. It has to play well with C#, and with ASIO. After a lot of looking around, multiple parties wound up recommending the BASS audio toolkit.
In my fairly minimal experimentation to date, BASS has Just Worked. It was able to connect to the ASIO4ALL driver and get sound from my microphone with low latency, while linked into my XNA app. So far it’s been very straightforward, and it looks like the right level of API, where I can manage my own buffering and let the library call me whenever I need to do something. It also supports all the audio effects I’m likely to need, and — should I want to actually include prerecorded samples — it can handle track acquisition from anywhere.
It also has a non-commercial license, but again, that’s fine for this project.
The Fun Begins… Now
So… that’s what I have. I feel like a model builder with parts from a new kit spread out all over the floor, and only a couple of the first pieces glued together. But I’m confident I have all the pieces.
Another thing I want to get right is recording: I want Holofunk to be able to record its performances, so you can play them back. This means not only the sounds, but the visuals. So I need an architecture that supports both free-form direct manipulation and careful time-accurate recording of both the visuals and the sounds.
Over the next six months I will be steadily chipping away at this thing. Here’s a rough order of business:
- Get Kinect skeleton data into my XNA app
- Render minimal skeleton via scene graph based on Kinect data
- Integrate Wiimote data to allow hand gesturing
- Define “sound sphere” class (I think I might call them “loopies”)
- Support grabbing, manipulating loopies (interaction / graphics only, no sound yet)
- Performance recording:
- Define core buffer management
- Implement microphone recording
- Implement buffer splitting from microphone recording
- Define “Performance” class representing an evolving performance
- Define recording mechanism for streams of positional data (to record positions of Loopies)
- Holofunk comes to life
- Couple direct manipulation UI to recording infrastructure
- Result: can grab to make a new loopie, can let it go to start it playing
If I can get to that point by the end of the year, I’ll be happy. If I can get further, I’ll be very happy. Further means:
- Ability to click loopies to select them
- Press on loopies to move them around spatially
- Some other gesture (Wii cross pad?) to apply an effect to a loopie
- Push up and wave your Wiimote arm, and it bends pitch up and down
- Push right, and it applies a frequency filter, banded by your arm position (dubstep heaven)
- Push down, and it lets you scratch back and forth in time (latency may be too high for this though)
- Hold the trigger while doing such gestures, and the effect gets recorded
- This lets you record effects on existing loopies
- Segment the screen into quarters; provide affordances for muting/unmuting a quarter of the screen, merging all loopies in that quarter, etc.
- This would let you do group operations on sets of sounds
AND THEN SOME. The possibilities are pretty clearly limitless.
My most sappily optimistic ambition here is that this all becomes a new performance medium, a new way of making music, and that many people find it approachable and enjoyable. Let’s see what happens. Thanks for reading… and stay tuned!
Last week I griped about some hassles I was having with Seam 2, Java 6, Tomcat, and JBoss Embedded. I grumpily grouched about the grim gloom of an open source world where things break and people don’t get along very well, and how non-fun it is.
And hey presto, what should happen but two Red Hatters read my kvetching and chimed in with some downright helpfulness. First, Jason Greene mentioned this workaround for the bizarre Java 6 ClassNotFoundException I mentioned. Seems there is a magic JVM flag you can provide to revert to the Java 5 classloading behavior if it causes you too much trouble. That’s good to know, and I appreciated the clue.
Second, Chris Alfonso — who’s been asking me when I’m going to get around to fixing my GWT/JSF/Seam demo to work with Seam 2 — said that he’s got a Maven 2 version of the GWTJSF demo building as an EAR that runs under JBoss with EJB3. Which is cool, because I now have a plain old Hibernate version of the Seam 2 blogging demo that deploys under vanilla Tomcat, so it’s nice to have both flavors. (Chris, can you send me your version? I’d like to poke at it some — I won’t post it without your say-so.)
So that’s the great thing about the open source community — it is a community, and people offer each other helping hands. Thanks for brightening my week, guys! You put the fun back in 🙂
One of the tough things about open source is that the world never stands still. Actually, it’s not just open source, it’s software in general, but open source exacerbates it. See, one of the main motivations for open source developers is fun and the enjoyment of sharing something cool you did. So you’d like to be able to do something cool and then have it stay cool.
But the problem is, once you get something cool working, it’s almost guaranteed that it’s going to stop working soon. Why? Because everyone else is doing other cool things. Including all the developers whose projects yours depends on. So the odds approach 100% that before long your cool thing is going to break with the latest version of FooBarLib. And broken things are no longer cool.
And fixing your cool thing, which broke through no fault of your own and for reasons you don’t agree with, is the absolute opposite of fun.
Even large developers are vulnerable to this. Right now I’m banging my head against trying to get my Seam + GWT example working with the latest 2.0 release of Seam. But Seam made a number of changes that are frustrating me rather deeply right now. For one thing, they changed the packaging for their embeddable EJB support such that you have to install the JBoss embedded EJB container directly onto your Tomcat. No more delivering a nice simple WAR file that uses Seam and EJB3, now it’s a multi-step process that leaves your Tomcat fairly JBossified.
The nominal rationale for doing this is that it lets the EJB3 services be shared by all the apps inside that Tomcat instance. So you might think, great, now my Seam+EJB3 WAR file is 6MB instead of 20MB. (Which is indeed a problem with the demo version on my page — it’s stinkin’ HUGE.) But, BUT, it turns out you can’t actually run multiple Seam apps under the Embedded EJB3 container!
Why not? Well, because Bill Burke and Gavin King, both of whom work at JBoss and both of whom are fairly bullheaded, can’t agree on whose problem it is that multiple Seam apps collide in the Embedded EJB3 component registry. Not only that, but the Embedded EJB container’s development has stalled. So, Tomcat is a seriously second-class citizen as far as Seam is concerned now. Of course from a JBoss perspective this is arguably good, because JBoss doesn’t get support revenue from Tomcat installations. For me, though, it sucks, because I don’t want my Seam / GWT example to require JBoss.
And the icing on the cake is that the Embedded JBoss container doesn’t run under Java 6. I like to run on the latest Java. It’s just a funny preference I have, I don’t know why. But JBoss does classloader tricks that Sun apparently considers invalid loopholes, which Sun closed in Java 6. This results in charming errors like:
ERROR 01-11 21:56:07,921 (AbstractController.java:incrementState:456) -Error installing to Instantiated: name=DeploymentFilter state=Described java.lang.IllegalStateException: Class not found: [Ljava.lang.String;
And JBoss and Sun are finger-pointing about who broke whom. So Seam isn’t guaranteed to run under Java 6 at all, and JBoss and the Seam team consider it Sun’s problem, not theirs.
So what am I going to do? I’ve got a conference presentation coming up and I wanted to use this demo with the latest Seam and latest GWT. But it’s starting to look like I’m going to have to sink way too much time into mucking about with Seam for reasons I’d rather not have anything to do with. I’ll probably follow the path of least resistance, which is just to roll back to Java 5 and grin and bear it. It’s definitely demotivating, though. Definitely. Demotivating.
A while ago (circa 2000), Rob Pike wrote a rant about how systems software research is becoming irrelevant, due to the difficulty in getting new operating systems adopted.
Since then, he’s gone on to work at Google, where systems software research is the lifeblood of one of the (if not the) most important Internet sites in history. And he’s happily doing plenty of systems software research that’s become fundamental to the company’s operations.
So his original concern about the irrelevance of operating system research got effectively sidelined, because the action moved from single-machine operating systems to wider distributed systems, especially as used at Google. And Google is as good as anyone at turning research ideas into production practice.
Meanwhile, Jim Waldo at Sun last year wrote another paper — a little less of a rant, but not much — about how systems software design is suffering greatly, largely from the lack of opportunity to learn from experience. Waldo makes good points about the difficulty of teaching system design except through example and experience.
His main concern is that the opportunity to learn by doing is very hard to come by. In academia, systems tend to be small and rapidly discarded, due to the need to publish frequently and produce results quickly. In industry, systems tend to be proprietary, encrusted by patents, and impossible to discuss publicly. This leaves only limited latitude for public construction or discussion of systems large enough and interesting enough to really learn from.
Waldo suggests that open source projects are one of the few ways out of this dilemma. They are in many cases fairly large in scope, they are fully visible to anyone wishing to critique, extend, or adapt them, and they provide not only a code base but (in the best cases) a community of experienced designers from whom new contributors can learn. They therefore are in some ways the best hope for spreading effective education about system design, being unencumbered by either the short-term problems of academia or the proprietary problems of industry.
Recently, coincidentally enough, some Googlers working on the Google lock service — a key part of Google’s distributed infrastructure — wrote a paper describing their experiences building a production implementation of the Paxos protocol for distributed consistency. What’s especially interesting about this paper is how neatly it both decries and embodies the very dilemma Waldo is talking about.
The Google Paxos paper has a lot of extremely interesting technical content in its own right. It’s one of my favorite types of papers — a discussion of problems encountered when trying to take compelling theory and make it into something that really works in a live system. Without that kind of effort, excellent ideas never actually get their chance to make a difference in the world, because until they’re embodied in a real system, they can’t deliver tangible value. So this paper is very useful to anyone working on implementations of the Paxos protocol — it’s exactly the kind of experience that Waldo wishes more people could learn from.
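To see why the theory-to-practice gap they describe is so wide, it helps to look at how small the textbook algorithm actually is. Here’s a minimal sketch of single-decree Paxos (my own illustrative code, not Google’s implementation); everything a production system needs — durable logs, multi-decree operation, leader election, failure detection, reconfiguration — is exactly what’s missing from it:

```python
# A toy single-decree Paxos sketch. Illustrative only: all names and
# structure are my own, and every acceptor lives in one process with
# no network, no crashes, and no persistence.

class Acceptor:
    def __init__(self):
        self.promised = -1      # highest proposal number promised so far
        self.accepted = None    # (number, value) of last accepted proposal

    def prepare(self, n):
        # Phase 1b: promise to ignore proposals numbered below n.
        if n > self.promised:
            self.promised = n
            return ('promise', self.accepted)
        return ('reject', None)

    def accept(self, n, value):
        # Phase 2b: accept unless a higher-numbered prepare was seen.
        if n >= self.promised:
            self.promised = n
            self.accepted = (n, value)
            return 'accepted'
        return 'rejected'

def propose(acceptors, n, value):
    """Run one round as a proposer; return the chosen value or None."""
    quorum = len(acceptors) // 2 + 1
    # Phase 1: collect promises from a majority.
    granted = [r for r in (a.prepare(n) for a in acceptors)
               if r[0] == 'promise']
    if len(granted) < quorum:
        return None
    # Safety rule: if any acceptor already accepted a value, we must
    # re-propose the one with the highest proposal number.
    prior = [r[1] for r in granted if r[1] is not None]
    if prior:
        value = max(prior)[1]
    # Phase 2: ask a majority to accept.
    acks = sum(1 for a in acceptors if a.accept(n, value) == 'accepted')
    return value if acks >= quorum else None

acceptors = [Acceptor() for _ in range(5)]
print(propose(acceptors, n=1, value='A'))   # prints A
print(propose(acceptors, n=2, value='B'))   # prints A: later rounds must preserve the choice
```

The second call is the whole point of the protocol: even a later proposer with a different value is forced to re-propose what was already chosen. Turning this page of code into something that survives disk corruption, partitions, and operator error on hundreds of machines is the hard part the paper documents.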
The writers themselves have the following gripes:
Despite the large body of literature in the field, algorithms dating back more than 15 years, and the experience of our team (one of us has designed a similar system before and the others have built other types of complex systems in the past), it was significantly harder to build this system than originally anticipated. We attribute this to several shortcomings in the field:
- There are significant gaps between the description of the Paxos algorithm and the needs of a real-world system. In order to build a real-world system, an expert needs to use numerous ideas scattered in the literature and make several relatively small protocol extensions. The cumulative effort will be substantial and the final system will be based on an unproven protocol.
- The fault-tolerance computing community has not developed the tools to make it easy to implement their algorithms.
- The fault-tolerance computing community has not paid enough attention to testing, a key ingredient for building fault-tolerant systems.
As a result, the core algorithms work remains relatively theoretical and is not as accessible to a larger computing community as it could be. We believe that in order to make a greater impact, researchers in the field should focus on addressing these shortcomings.
The ironies here are so deep it’s hard to know where to start. Their implementation itself is not only proprietary to Google (and not open sourced), but it also relies on many other proprietary Google systems, including the Google file system. Hence their work itself is not directly available to the wider community for development and further discussion! Their paper has a number of interesting allusions (such as exactly why they needed to make their local log writing multi-threaded) that are not followed up. Unless they write many more papers, we will never know all the details of how their system works.
They criticize the fault-tolerant systems community for not having provided a more solid experience base from which to build. Waldo’s paper makes it crystal clear exactly why this base has been lacking: where is it to come from? Not from academia; research projects in academia tend to be too short-term and too small in scope to encounter the kinds of issues the Googlers did. And not from industry; Google is not known for open sourcing its core distributed software components, yet Google is arguably ahead of anyone else in this area!
The only alternative would be a true open source project. But large-scale distributed systems are probably among the least likely to achieve real momentum as an open source project, because actually using and testing them requires substantial dedicated hardware resources (many of the failure cases the Google team encountered arise only after running on dozens or hundreds of machines), and those resources are not available to any open source projects I’m aware of.
The Googlers are part of the problem, even while their paper seeks to be part of the solution. To some extent it’s a chicken-and-egg dynamic; without access to a truly large pool of machines, and a truly demanding set of developers and applications, it’s hard to get real-world experience with creating robust distributed infrastructure — but you almost have to be inside a large business, such as Google, in order to have such access at all.
So, unfortunately, it would appear that in the near term the Googlers are doomed to disappointment in their expectations of the research community. Google itself is likely to remain the preeminent distributed systems research center in the world, and the fewer of its systems it open sources, the less assistance the rest of the world will be able to provide it.
One can only hope that several years from now, Google’s applications will have evolved so greatly on top of its base infrastructure that it will no longer consider the fundamental systems it uses — MapReduce, BigTable, GFS, Chubby — to be key competitive differentiators, and will choose to open source them all. Of course, by then Google’s real difficulties will still be with problems the rest of the world wishes they had the resources to encounter….
A coda to this: John Carmack, of id Software and Armadillo Aerospace fame, is known for open sourcing his game engines after five years or so. Recently he’s been working on mobile games and cellphone programming. Here’s a quote from a liveblog of his keynote at Quakecon last week:
Met with mobile developers at the Apple thing, all talking about how they make mistakes all the time. Carmack: “Can’t the guys who made the mistakes the first time just make the chips right this time?” Other devs: “Yeah, but most of those guys are too rich to care anymore.”
So that’s the other reason the field doesn’t make good progress… proprietary stuff gets built, developers get rich, technology gets sold and eventually back-burnered, and then it all has to get reinvented all over again. Open source: the only way to not reinvent the wheel every five years!