robjsoftware.org

A blog about software – researching it, developing it, and contemplating its future.

The Future Is Further Than You Think


I failed to give a demo to Thomas Dolby tomorrow.  He’s visiting Microsoft, but my amazing music project Holofunk is currently not working, so I’ve missed a chance to show it to one of my heroes.  (I banged on it last night and it wouldn’t go, and I’ve learned from bitter experience that if it ain’t working 48 hours before go time, it’s time to bail.)

And therein lies a tale.  If I can’t have the fun with Mr. Dolby, I can at least step back and think about exactly why not, and share that with you all, many of whom have been my supporters and fans on this project for quite a few years.  Sit back and enjoy, this is going to be a bit long.

Holofunk: Amazing, yet Broken

Holofunk at its best is a pretty awesome wave-your-hands-and-sing-and-improvise-some-surprising-music experience.  I’ve always wanted to try to get it out into the world and have lots of people play with it.

But it uses the Kinect.  The Kinect is absolutely key to the whole experience.  It’s what sees you, and knows what you’re gesturing, and knows how to record video of you and not the background.  It’s a magic piece of hardware that does things nothing else can do for the price.

And the Kinect is a failure.  As astonishing as it is, it is still not quite good enough to be really reliable.  Holofunk has always had plenty of glitches and finicky interactions because of it, and really rapid motions (like video games tend to want you to make) confuse it.  So it just hasn’t succeeded in the marketplace… there are almost no new games or indeed any software at all that use it.  It doesn’t fully work with modern Windows 10 applications, and it’s not clear when it will.

Moreover, Holofunk also uses an audio interface standard called ASIO, which is actually quite old.  Support for it varies widely in quality.  In fact, support for sound hardware in general varies widely in quality… my current miseries are because my original audio interface (a Focusrite Scarlett 6i6) got bricked by a bad firmware upgrade, with tech support apparently unable to help; and an attempted replacement, a TASCAM US-2×2, has buggy drivers that blue-screen when I run my heavily multi-threaded, USB-intensive application.

So the bottom line here is:  Holofunk was always more technically precarious than I ever realized.  It’s probably kind of a miracle that I got it working as well as I did.  In the two-plus years since my last post, I actually did quite a lot with it:

  • Took it to a weekend rave in California, and got to play it with many of my old school techno friends from the nineties.
  • Demonstrated it at the Seattle Mini Maker Faire, resulting in the video on holofunk.com.
  • Got invited to the 2015 NW Loopfest, and did performances with it and a whole lineup of other loopers in Seattle, Portland, and Lincoln City.
  • Played it at an underground art event sponsored by local luminary Esse Quam Videri, even getting people dancing with it for the first time!

But then Windows 10 was released, and I upgraded to it, and it broke a bunch of code in my graphics layer.  I don’t hold this against Microsoft — I work for Microsoft in the Windows org, and I know that as hard as Microsoft works to keep backwards compatibility, there are just a lot of technologies that simply get old.  So after some prolonged inertia in the first half of 2016, I finally managed to get the graphics fixed up… but then my audio hardware problems started.

It’s now clear to me that:

  • Holofunk is an awesome and futuristic experience that really does represent a new way to improvise music.
  • But it’s based on technologies that are fragile and/or obsolescing.
  • So in its current form, I need to realize that it’s basically my baby, and that no one else is realistically going to get their hands on it.
  • Nonetheless, it is a genuine glimpse of the future.

And all of this leads me to the second half of this post:  I realized this morning that Holofunk is turning out to be a microcosm of my whole career at Microsoft.

Inventing The Future Is Only The Beginning

It’s often said about art that it cannot exist without an audience.  That is, art is a relationship between creator and audience.

I have learned over my career that a similar truth holds for technology.  Invention is only the beginning.  Technology transfer — having people use what you’ve invented — is in some ways an even harder problem than invention.

I got hired into Microsoft to work on the Midori technical incubation project.  I started in 2008, and we beavered away on an entirely new operating system (microkernel right up to web browser) for seven years, making staggering amounts of progress.  We learned what was possible in a type-safe language with no compromises to either safety or performance.

We got a glimpse of the future.

But ultimately the piper had to be paid, and finally it was time to try to transfer all this into Windows… and the gap between what was there already, and what we had created, was unbridgeable.  It’s not enough to make something new and wonderful.  You have to be able to use that new wonderful thing together with all the old non-wonderful stuff, because technology changes slowly and piece by piece (at least if you are in a world where your existing customers matter, as Microsoft very, very much is).

So now I am working on a team that is taking Midori insights and applying them incrementally, in ways that make a difference here and now.  It’s extremely satisfying work, since even though in some sense we’re reinventing the wheel, this time we get to attach it to an actual car that’s already driving down the road!  Shipping code routinely into real products is a lot more satisfying than working in isolation on something you’re unsure will actually get used.

The critical part here is that we have confidence in what we’re doing, because we know what is possible.  The invaluable Midori glimpse of the future has given us insights that we can leverage everywhere.  This is not only true for our team; the Core CLR team is working on features in .NET that are based entirely on Midori experience, for example Span<T> for accessing arbitrary memory with maximal efficiency.
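
As a rough illustration of the kind of thing Span<T> enables, here is plain C# against the Span<T> API as it later shipped in .NET; the method names are just examples of mine, and this is not Midori code:

using System;

public static class SpanDemo
{
    // Sum the first half of an array by slicing it with Span<T>: no copy,
    // no extra allocation, just a typed view over the same memory.
    public static int SumFirstHalf(int[] data)
    {
        Span<int> firstHalf = data.AsSpan(0, data.Length / 2);
        int sum = 0;
        foreach (int value in firstHalf)
        {
            sum += value;
        }
        return sum;
    }

    // The same Span<T> type can also wrap stack-allocated memory.
    public static void FillStackBuffer()
    {
        Span<byte> buffer = stackalloc byte[16];
        buffer.Fill(0xFF);
    }
}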

So even though it is going to take a lot longer than we originally thought, the ideas are living on and making it out into the world.

Holofunk in Metamorphosis

Microsoft has learned the lessons of the Kinect with its successor, HoloLens.  The project lead on both Kinect and HoloLens is very clear that HoloLens is going to stay baking in the oven until it is irresistibly delicious.  The Kinect launched far too soon, in his current view.  It was a glimpse of the future, and it gave confidence in what was possible, but the world — and all the software that would make its potential clear — was not ready.

So now I view Holofunk as part of the original Kinect experiment.  Holofunk in its current form may never run very well (or even at all) for anyone other than me, or on anything other than the finicky hardware I hand-pick.

But now that I’ve admitted this to myself, I am starting to have a bajillion ideas for how the overall concepts can be reworked into much more accessible forms.  For instance:

  • Why have the Kinect at all?  The core interaction in Holofunk is “grab to record, let go to start looping.”  So why not make a touchscreen looper that lets you just touch the screen to record, and stop touching to start looping?  (See the sketch after this list.)
  • Moreover, why not make it a Universal Windows Platform application?  If I can get that working without the ASIO dependency, suddenly anyone with a touch-screen Windows device can run it.
  • I can also port it to HoloLens.  I can bring Unity or some other HoloLens-friendly 3D engine in, so now I will be free of the graphics-layer issues and I’ll have something that’ll work on HoloLens out of the box.
  • I can start on support for networking, so multiple people could share the same touch-screen “sound space” across multiple devices.  I really would need this for HoloLens, as part of the Holofunk concept has always been that an audience can see what you are doing, but HoloLens will have no external video connections for a projector or anything.  So a separate computer (perhaps with a Kinect!) will need to be involved to run the projector, and that means networking.
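
To make the touchscreen idea concrete, here is a minimal sketch of the "touch to record, let go to loop" interaction as UWP pointer-event handlers. ILooper and LooperSurface are hypothetical names standing in for Holofunk's audio engine and UI, not actual Holofunk types:

using Windows.Foundation;
using Windows.UI.Xaml;
using Windows.UI.Xaml.Input;

// Hypothetical stand-in for Holofunk's audio engine.
public interface ILooper
{
    void StartRecording(Point at);   // finger down: start capturing a new loop here
    void DropAndLoop(Point at);      // finger up: drop the loop here and start it playing
}

public sealed class LooperSurface
{
    private readonly ILooper _looper;

    public LooperSurface(ILooper looper, UIElement surface)
    {
        _looper = looper;
        surface.PointerPressed += OnPointerPressed;
        surface.PointerReleased += OnPointerReleased;
    }

    private void OnPointerPressed(object sender, PointerRoutedEventArgs e)
        => _looper.StartRecording(e.GetCurrentPoint((UIElement)sender).Position);

    private void OnPointerReleased(object sender, PointerRoutedEventArgs e)
        => _looper.DropAndLoop(e.GetCurrentPoint((UIElement)sender).Position);
}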

All of these ideas are much, much more feasible given my existing Holofunk code (which has a bunch of helpful stuff for managing streams of multimedia data over time), and my Holofunk design experience (which has taught me more about gestural interface design than many people have ever learned, all of which will be immediately applicable to a HoloLens version).

I’ve had a glimpse of the future.  It’s a fragile glimpse, and one which I can’t readily share.  But now that I’ve accepted that, I can look towards the next versions, which if nothing else will be much easier for the rest of the world to play with.

Holofunk is going into a cocoon, and what emerges is going to be something quite different.

Thanks to everyone who’s enjoyed and supported me in this project so far — it’s brought me many new friends and moments of genuine musical and technical joy.  I look forward to what is next… no matter how long it takes!


Written by robjellinghaus

2016/10/20 at 08:40

Posted in Uncategorized

Stuff Been Happenin’


[This post was originally written in April 2014, but is only now getting posted in July 2014. Hobby projects are like that!]

Hacking stuff, that is!  Holofunk stuff, to be precise!  Working code, to be sure!

I’ve drifted into posting more on Facebook and on one particular forum of my long acquaintance, so now I get to be a bit lazy and sum up those scraps of news a little more centrally and with more detail. 

In January I got the lead out and ported Holofunk to the new Kinect.  This required first porting it to x64.  The main casualty there was that VST support — sound effect plugin support — stopped working, for no reason I could sort out after a couple of nights.  That’s the maximum length of time I can spend on any blocking issue before I start looking for a workaround.  So now I’m just using the sound effects built into the BASS audio library, which are quite sufficient for the time being.

I did get it working again, and modulo a couple of performance issues that I’m discussing with the Kinect team, it’s pretty stunning.  I was able to get green-screened color video working in almost no time based on their sample code, for multiple players; this was impossible with the first Kinect.

Hand Pose, At Last

Much more importantly, though, I started getting hand pose data.  And the hand pose data is fast and reliable.  There are only three hand poses supported — open (all fingers spread), pointing (one or two fingers pointing, the rest in a fist), and closed (a plain fist).  It takes a little getting used to, as far as making rapid and clear transitions from one pose to another; but with just a little practice it gets very fluid.

Three hand poses is kind of like having a mouse that has a three-position switch on it… it’s not a whole lot to work with, but it’s enough.  I started brainstorming gestural interfaces, and my wife Michelle helped me take some notes:

[picture of HF brainstorming notes]

The basic idea is this:

  • You open your hand to get the app’s attention.  (This is “armed” state, internally.)
  • You can then close your hand to start recording yourself; as long as your hand is closed, the recording continues.  When you open your hand, you “drop” the recording at that spot on the screen.

That’s the most basic interaction:  make a fist to record a loop, then let go to play it.
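
Conveniently, these three poses map directly onto the Kinect v2 HandState values (Open, Closed, and Lasso for the pointing pose). Here is a minimal sketch of just this record/loop half of the gesture; ILoopRecorder is a hypothetical stand-in for Holofunk's audio layer, not an actual Holofunk type, and real code would also want to debounce the occasional noisy frame:

using Microsoft.Kinect;   // Kinect for Windows SDK 2.0

// Hypothetical stand-in for Holofunk's audio layer.
public interface ILoopRecorder
{
    void StartRecording();        // begin capturing a new loop
    void StopAndStartLooping();   // drop the loop and start it playing
}

public sealed class HandLoopGesture
{
    private readonly ILoopRecorder _recorder;
    private HandState _previous = HandState.Unknown;

    public HandLoopGesture(ILoopRecorder recorder)
    {
        _recorder = recorder;
    }

    // Call once per body frame with body.HandLeftState or body.HandRightState.
    public void OnHandState(HandState current)
    {
        if (current == HandState.Closed && _previous != HandState.Closed)
        {
            _recorder.StartRecording();          // fist: start recording
        }
        else if (current == HandState.Open && _previous == HandState.Closed)
        {
            _recorder.StopAndStartLooping();     // open hand: drop the loop and play it
        }

        if (current != HandState.Unknown && current != HandState.NotTracked)
        {
            _previous = current;
        }
    }
}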

  • You can also point, to enter “pointing mode.”  Basically, each hand has its own “pointing mode” that determines what will happen when that hand points.
  • The default “pointing mode” is “mute/unmute.”  In this mode, you point at a sound or group of sounds, and you make a fist to mute them, or open your hand to unmute them.  If you mute some muted sounds, they get deleted altogether.  This gives you the ability to bring loops in and out.
  • Another “pointing mode” is “sound effects.”  In this mode, you point at a sound or group of sounds, and then you move your hand up/down/left/right to apply one of four sound effects (one per direction).  I prototyped this interface with Holofunk 1.0 and it works OK, so I’m bringing that forwards.
  • There can be multiple “sound effects” modes with different combinations of sound effects.

I’ve implemented the “mute/unmute” behavior and it’s pretty incredible — the hand recognition is fast enough that you really feel like you’re grabbing a bunch of sounds and then shushing them by squeezing them, then opening your hand again to bring them back to life.

So how do you change modes?  My main insight here was that I wanted some kind of “chord” gesture — in a conventional interface you’d have shift-click, or control-click, or something.  So what could be a modifier for the pointing gesture?  I had already implemented radial popup menus, I just needed a way to invoke them.

What I came up with was to use body pose as a modifier.  Specifically, if you put your hand on your hip (akimbo, in other words), then when you point with the other hand, you get a popup menu that lets you pick the pointing mode for that hand.  So you just put your hand on your hip, point your other hand at “effect mode”, grab that menu item, and now that other hand is in effect mode.  It’s natural and feels quite good.  Putting your hand behind your back (rather than elbow-out akimbo) means you’ll get the system popup menu, with commands like “Delete all” and “Change tempo”.

Body pose is modal.  This is your NUI koan for the day.

This combination of hand pose (for pointing and picking), body pose (for modifying that pointing/picking), and per-hand interaction mode means that the interface is truly ambidextrous:  both hands can perform independent gestures simultaneously.  You could have one hand applying reverb/flange/chorus/delay and the other hand applying volume/pan, or one hand muting and unmuting while the other hand tweaks sound effects, or whatever you like.

Right now I have the popup menus coming up, but not interacting properly — some minor issue, I think.  Will be fixing that very soon.

My current code has a nice little hierarchical state machine for the per-hand interaction, so I have two independent state machines, one per hand.  Previously, an event — such as the user pointing — would always cause a transition to a fixed new state.  But in the new interface, pointing while the other hand is on the hip should bring up a popup menu; pointing while in mute/unmute mode should enter mute/unmute state; and so forth.

All I needed to implement this, it turned out, was a “computed transition” that would run some code to determine a target state, rather than using a fixed target state.  This was a very simple thing to add, and wound up perfectly expressing both pointing modes and body-pose modes.
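
Here is a sketch of what that looks like; the type and state names are illustrative, not Holofunk's actual ones. A transition stores a function from context to target state, and a fixed transition is just the special case that ignores the context:

using System;
using System.Collections.Generic;

public enum HandStateId { Armed, Recording, MuteUnmute, EffectMode, PopupMenu }
public enum HandEvent { Opened, Closed, Pointed }

// Whatever the transition functions need to inspect: the other hand's body
// pose, this hand's current pointing mode, and so on.
public sealed class GestureContext
{
    public bool OtherHandOnHip;
    public HandStateId PointingMode = HandStateId.MuteUnmute;
}

public sealed class HandStateMachine
{
    // Each (state, event) pair maps to a function that computes the target state.
    private readonly Dictionary<(HandStateId, HandEvent), Func<GestureContext, HandStateId>> _transitions =
        new Dictionary<(HandStateId, HandEvent), Func<GestureContext, HandStateId>>();

    public HandStateId Current { get; private set; } = HandStateId.Armed;

    // A computed transition: the target state depends on the context.
    public void AddTransition(HandStateId from, HandEvent evt, Func<GestureContext, HandStateId> compute)
    {
        _transitions[(from, evt)] = compute;
    }

    // A fixed transition is just a computed transition that ignores the context.
    public void AddTransition(HandStateId from, HandEvent evt, HandStateId to)
    {
        AddTransition(from, evt, _ => to);
    }

    public void OnEvent(HandEvent evt, GestureContext context)
    {
        Func<GestureContext, HandStateId> compute;
        if (_transitions.TryGetValue((Current, evt), out compute))
        {
            Current = compute(context);
        }
    }
}

// Usage: pointing either opens the popup menu (if the other hand is on the hip)
// or enters whatever pointing mode this hand currently has:
//
//   machine.AddTransition(HandStateId.Armed, HandEvent.Pointed,
//       ctx => ctx.OtherHandOnHip ? HandStateId.PopupMenu : ctx.PointingMode);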

Two-Handed Interaction

Now, having an ambidextrous, bilateral interface is all very well, but not all interactions involve only one hand.  Some interactions want two.  For example, using two hands to drag out a selection rectangle for sound grouping.  Or, dragging out a time distortion envelope for time mapping.

I have two independent state machines, one per hand.  Fine if the hands are independent, but what if they’re not?  Do both hands need to be in the same state, somehow?  How do you coherently use two state machines for one interaction?  It all just felt wrong and ugly and hacky, a sure sign that I needed to sleep on it some more.  When this project gets stalled, it’s either because I don’t have a working brain cell left, or because I haven’t got a clear simple picture of how it should work.  And I don’t have the spare time to write vague code and then have to debug it! 

I plan to build a state machine hierarchy in which a “body” state machine can look at both hands, and if it does not want to consume the state of both hands in a two-handed way, it can delegate the state of each hand to a lower-level, per-hand state machine. State machine delegation, in other words. I think it will work well… once I get there.
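
Here is a minimal sketch of that delegation, reusing the illustrative HandStateMachine, HandEvent, and GestureContext types from the sketch in the previous section; again, none of these are Holofunk's actual types:

public sealed class BodyStateMachine
{
    private readonly HandStateMachine _left;
    private readonly HandStateMachine _right;

    public BodyStateMachine(HandStateMachine left, HandStateMachine right)
    {
        _left = left;
        _right = right;
    }

    public void OnHandEvents(HandEvent leftEvent, HandEvent rightEvent, GestureContext context)
    {
        // A two-handed gesture (for example, both hands pointing to drag out a
        // selection rectangle) is consumed here and never reaches the per-hand machines.
        if (leftEvent == HandEvent.Pointed && rightEvent == HandEvent.Pointed)
        {
            BeginTwoHandedSelection();
            return;
        }

        // Otherwise each hand's event is delegated to that hand's own state machine.
        _left.OnEvent(leftEvent, context);
        _right.OnEvent(rightEvent, context);
    }

    private void BeginTwoHandedSelection()
    {
        // Start dragging out a selection rectangle between the two hands.
    }
}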

Written by robjellinghaus

2014/07/30 at 00:47

Posted in Holofunk


To John Carmack: Ship Gamer Goggles First, Then Put Faces And Bodies Into Cyberspace


John Carmack, CTO of Oculus Rift, tweeted:

Everyone has had some time to digest the FB deal now. I think it is going to be positive, but clearly many disagree. Much of the ranting has been emotional or tribal, but I am interested in reading coherent viewpoints about objective outcomes. What are the hazards? What should be done to guard against them? What are the tests for failure? Blog and I’ll read.

I have already blogged on this but will make this a more focused response to John specifically.

Here are my objective premises:

  • VR goggles, as currently implemented by the Rift (ACIBTR), conceal the face and prevent full optical facial / eye capture.
  • VR goggles, ACIBTR, conceal the external environment — in other words, they are VR, not AR.
  • Real-time person-to-person social contact is primarily based on nonverbal (especially facial) expression.
  • Gamer-style “alternate reality” experiences are not primarily social, and are based on ignoring the external environment.

Here are my conclusions:

  • A truly immersive social virtual environment must integrate accurate, low-latency, detailed facial and body capture.
    • Therefore such an environment can’t be fundamentally based on opaque VR goggles, and will require new technologies and sensor integrations.
  • Opaque VR goggles are, however, ideal for gamer-style experiences.
    • Gamer experiences have never had full facial/body capture, and are based on ignoring the external environment.
  • The more immersive such experiences are, the more people will want to participate in them.
    • This means that gratuitous mandatory social features, in otherwise unrelated VR experiences, would fundamentally break that immersion and would damage the platform substantially.
  • Goggle research and development will mostly directly benefit “post-goggle” augmented reality technology.

The hazards for Rift:

  1. If Facebook’s monetization strategy results in mandatory encounters with Facebook as part of all Rift experiences, this could break the primary thing that makes Rift compelling: convincing immersion in another reality that doesn’t much overlap with this one.
  2. If Facebook tries to build a Rift-based “online social environment” as we have historically known them (Second Life, Google Lively, PlayStation Home, Worlds Inc., etc., etc., etc.), it will be as niche as all those others. Most importantly, it will radically fail to achieve Facebook’s ubiquity ambitions.
    • This is because true socializing requires full facial and nonverbal bandwidth, and Rift today cannot provide that. Nor can any VR/AR technology yet created, but that’s the research challenge here!
  3. If Facebook and Rift fail to pioneer the innovation necessary to deliver true augmented social reality (including controlled perception of your actual environment, and full facial and body capture of all virtual world participants), some other company will get there first.
    • That other company, and not Facebook, will truly own the future of cyberspace.
  4. If Rift fails to initially deliver a deeply immersive alternate reality platform, it will not get developers to buy in.
    • This risk seems smallest based on Rift’s technical trajectory.

What should be done to guard against them:

  1. Facebook integration should be very easy as part of the Rift platform, but must be 100% completely developer opt-in. Any mandatory Facebook integration will damage your long-term goals (creating the first true social virtual world, requiring fundamentally new technology innovation) and will further lose you mindshare among those skeptical of Facebook.
  2. Facebook should resist the temptation to build a Rift-based virtual world. I know everyone there is itching to get started on Snow Crash, and you could certainly build a fantastic one. But it would still be fundamentally for gamers, because gamers are self-selected to enjoy surreal online places that happen to be inhabited by un-expressive avatars.
    • The world has lots of such places already; they’re called MMOGs, and the MMOG developers can do a better job putting their games into Rift than Facebook can.
  3. Facebook and Rift should immediately begin a long-term research project dedicated to post-goggle technology. Goggles are not the endgame here; in a fully social cyberspace, you’ll be able to see everyone around you (including those physically next to you), faces, bodies, and all. If you really want to put your long-term money where your mouth is, shoot for the post-goggle moon.
    • Retinal projection glasses? LCD projectors inside a pair of glasses? Ubiquitous depth cameras? Facial tracking cameras? Full environment capture? Whatever it takes to really get there, start on it immediately. This may take over a decade to finally pan out, but you have the resources to look ahead that far now. This, and nothing less, is what’s going to make VR/AR as ubiquitous as Facebook itself is today.
  4. Meanwhile, of course, ship a fantastic Rift that provides gamers — and technophiles generally — with a stunning experience they’ve never had before. Sell the hardware at just over cost. Brand it with Facebook if you like, but try to make your money back on some small flat cut of title revenue (5%? as with Unreal now?), so you get paid something reasonable whether the developer wants to integrate with Facebook or not.

Tests for failure:

  1. Mandatory Facebook integration for Rift causes developers to flee Rift platform before it ships.
  2. “FaceRift” virtual world launches; Second Life furries love it, rest of world laughs, yawns, moves on.
  3. Valve and Microsoft team up to launch “Holodeck” in 2020, combining AR glasses with six Kinect 3’s to provide a virtual world in which you can stand next to and see your actual friends; immediately sell billions, leaving Facebook as “that old web site.”
  4. Initial Rift titles make some people queasy and fail to impress the others; Rift fails to sell to every gamer with a PC and a relative lack of motion sickness.

John, you’ve changed the world several times already. You have the resources now to make the biggest impact yet, but it’s got to be both a short-term (Rift) and long-term (true social AR) play. Don’t get the two confused, and you can build the future of cyberspace. Good luck.

(And minor blast from the past:  I interviewed you at the PGL Championships in San Francisco fifteen years ago.  Cheers!)

Written by robjellinghaus

2014/03/28 at 12:25

Posted in Uncategorized

My Take On Zuckey & Luckey: VR Goggles Are (Only) For Gamers


I am watching the whole “Facebook buys Oculus Rift” situation with great bemusement.

I worked for a cyberspace company — Electric Communities — in the mid-nineties, back in the first heady pre-dot-com nineties wave of Silicon Valley VC fun.

We were building a fully distributed cryptographically based virtual world called Microcosm. In Java 1.0. On the client. In 1995. We had drunk ALL THE KOOL-AID.

[image]

(Click that for source.  For a scurrilous and inaccurate — but evocative — take on it all, read this.)

We actually got some significant parts of this working — you could host rooms and/or avatars and/or objects, and you could go from space to space using fully peer-to-peer communication. Because, you see, we envisioned that the only way to make a full cyberspace work was for it to NOT be centralized AT ALL. Instead, everyone would host their own little bits of it and they would all join together into an initially-2D-but-ultimately-3D place, with individual certificates on everything so everyone could take responsibility for their own stuff. Take that, Facebook!!!

(I still remember someone raving at the office during that job, about this new search engine called Google… the concept of “web scale” did not exist yet.)

The whole thing collapsed completely when it became clear that it was too slow, too resource-intensive, and not nearly monetizable enough. I met a few lifelong friends at that job though, quite a few of whom have gone on to great success elsewhere (Dalvik architect, Google ES6 spec team member, Facebook security guru…).

I also worked at Autodesk circa 1991, in the very very first era of VR goggles, back when they looked like this:

[picture of VR goggles from 1989]

Look familiar?  This was from 1989.  Twenty-five frickin’ years ago.

So I have a pretty large immunity to VR Kool-Aid. I actually think that Facebook is likely to just about get their money back on this deal, but they won’t really change the world. More specifically, VR goggles in general will not really change the world.

VR goggles are a fundamentally bad way to foster interpersonal interaction, because they obscure your entire face, and make it impossible to see your expression. In other words, they block facial capture. This means that they are the exact worst thing possible for Facebook, since they make you faceless to an observer.

This then means that they are best for relatively solitary experiences that transport you away from where you are. This is why they are a great early-adopter technology for the gamer geeks of the world. We are precisely the people who have *already* done all we can to transport ourselves into relatively solitary (in terms of genuine, physical proximity) otherworldly experiences. So VR goggles are perfect for those of us who are already gamers. And they will find a somewhat larger market among people who want to experience this sort of thing (kind of like super-duper 3D TVs).

But in their current form they are never going to be the thing that makes cyberspace ubiquitous. In a full cyberspace, you will have to be able to look directly at someone else *whether they are physically adjacent or not*, and you will have to see them — including their full face, or whatever full facial mapping their avatar is using — directly. This implies some substantially different display technology — see-through AR goggles a la CastAR, or nanotech internally illuminated contact lenses, or retinally scanned holograms, or direct optical neural linkage. But strapping a pair of monitors to your eyeballs? Uh-uh. Always going to be a “let’s go to the movies / let’s hang in the living room playing games” experience; never ever going to be an “inhabit this ubiquitous cyber-world with all your friends” experience.

Maybe Zuckerberg and Luckey together actually have the vision to shepherd Oculus through this goggle period and into the final Really Immersive Cyberworld. But my guess is the pressures of making enough money to justify the deal will lead to various irritating wrongnesses. Still, I expect they will ship a really great Oculus product and I may even buy one if the games are cool enough… but there will be goggle competitors, and it’s best to think of ALL opaque goggle incarnations as gamer devices first and foremost.

So why did Zuckerberg do this deal? I think it’s simple: he has Sergey Brin envy. Google has its moon-shot projects (self-driving cars, humanoid robots, Google Glass). Zuckerberg wants a piece of that. It’s more interesting than the Facebook web site, and he is able to get his company to swing $2 billion on a side project, so why not? Plus he and Luckey are an epic mutual admiration society. That psychology alone is sufficient explanation. It does lead to the amusingly absurd paradox of Facebook spending $2 billion on something that hides users’ faces, but such is our industry, and such has it ever been.

Realistically, the jury is still out on whether Oculus would have been better off going it alone (retaining the love of their community and their pure gaming focus, but needing to raise more and more venture capital to ramp up production), or going with Facebook (no more worries about money, until Facebook’s ad-based business model starts to screw everything up). The former path might have cratered before launch, or succumbed to deeper-pocketed competitors. The latter path has every chance of going wrong — if Facebook handles things as they did their web gaming efforts, it definitely will. We will see whether Zuckerberg can keep Facebook’s hands off of Oculus or not. I am sadly not sanguine… on its face, this is a bad acquisition, since it does not at all play to the technology’s gaming strengths.

It’s worth noting Imogen Heap’s dataglove project on Kickstarter.  I was skeptical that they would get funded, but their AMA on Reddit convinced me they are going about it the best way they can, and they have a clear vision for how the things should work.  So now I say, more power to them!  Go support them!  They are definitely purely community-driven, the way Oculus was until yesterday….

Written by robjellinghaus

2014/03/26 at 08:48

Life Is What Happens When You’re Making Other Plans


I didn’t exactly expect to take a nine-month blogging hiatus, but the last while has been considerably more exciting than I expected.

Help Wanted

Let me place my main request up front:  I am interested in self-publishing some software, specifically a Windows application that uses the new Kinect sensor for video and audio live looping. 

I have asked people I trust about how to go about doing this (creating an LLC or other corporation, filing taxes, etc.), and I have been told that I need to talk to an accountant and a lawyer, both of whom are licensed in Washington state. 

So: game developers, indie developers, and general friends in the Seattle area, what accountants and/or lawyers would you recommend for someone getting involved in software self-publishing?  All referrals are welcome.  Thanks very much indeed!

Where The Heck I’ve Been

First, last spring, I put sound effects into Holofunk sure enough.  I showed it at the Seattle Mini Maker Faire with some colleagues in the Kinect for Windows booth, which was awesome — managed to spiel it for seven straight hours without driving everyone crazy, so that was good!

Then, in the late spring, I went to a presentation at work about a program to send computer science workers into schools to teach AP computer science.  It’s called TEALS, for Technology Education And Literacy in Schools:  http://tealsk12.org and it is quite an amazing thing. 

During that presentation I saw a picture of Bill Gates looking at an older boy on a Teletype:

[picture of Bill Gates and an older boy at a Teletype]

As it happens, that is the same device (ASR-33) that I typed my first BASIC programs on at about the same age Bill is in that picture.  I got surprisingly choked up, and realized how big a deal computers were for me back then.  So I knew I had to sign up.  I did so, and committed to teaching at Lindbergh High School in Renton.

Soon thereafter, I re-did the Holofunk sound effects interface, and had some friends over to play with it.  And it was EPIC.  Given that the majority of them had never seen it before, everyone was creating some pretty intense noises quite quickly.  I have some very raw video (dangling cables, the whole bit) here, if you are feeling experimental.

Everyone had so much fun with it that I realized two things:

1) Holofunk can and should be an actual product.

2) Holofunk needs the new Kinect sensor.

Now, realizing this just after having committed to become a half-time one-period high school teacher for the year — on top of intense day job — is not exactly anything like the best timing.  And there were various other complications with the teaching, including my original co-teacher having to drop out of the program after school had already started, an extraordinarily stressful situation.  Fortunately it worked out for the best… but it has left me no time for anything but work, family, and teaching.

Until now!  I would not be writing this if I weren’t finally regaining some energy for hacking.  And I applied for, and got into, the new Kinect for Windows beta program.  I will be ordering the new sensor tomorrow, in fact, and hopefully receiving it sometime late this month, which is extraordinarily motivating!

I’ve already registered holofunk.com and holofunk.org, and it’s time to create Holofunk LLC (or something, subject to legal counsel).  Then it’s time to bring in the new Kinect, video looping, and hand pose detection… no more Wiimotes.  That alone should help with the Microsoft science fair judges!  I hope very much to get all this working, at least at a basic level, by mid-winter sometime (end of February?).

I am ridiculously excited for the potential of this thing.  And I am also very excited for my high school class, some of whom were initially struggling but are really doing better now.  They are actually learning.  I’ll be making a later post (or several) about it all, as I finally get my head above water.

Enjoy the fall!

Written by robjellinghaus

2013/11/13 at 00:38

Posted in Holofunk

Goodbye XNA, helloooo SharpDX


Half a year ago I posted about my XNA dilemma, and about how SharpDX looked like the best replacement. XNA = old and busted C# DirectX library from Microsoft, now officially kicked to the curb; SharpDX = amazing generated-from-C++-DirectX-sources open source C# binding to the entire DirectX API, which covers waaaaay more than XNA ever did.

Last summer I was a bit daunted by SharpDX, as the API was not very XNA-like. Alex Mutel, the primary author of SharpDX, said he was working on an XNA compatibility layer but it wasn’t ready yet. I then figured out how to get multiple window support (the primary driver at the time) working with XNA, so I set SharpDX aside for a while.

Well, Alex was as good as his word. After a fair amount of very entertaining Holofunking over the holidays (Holidayofunking?), I finally decided that with the new year would come a new 3D library underneath Holofunk. And what do you know, the SharpDX Toolkit library was ready.

Over the course of a couple of weeks, working relatively infrequently and in small chunks, I converted everything off of XNA. If you want to see a reasonably small, but realistic, XNA-to-SharpDX conversion in full, you could do a lot worse than looking at this changelist on CodePlex.

First of all I realized there was no multi-window support in SharpDX. Well, that’s fine, I posted an issue about it and offered to help with the implementation. Alex immediately responded and asked for a short while to build some support pieces. The next thing I knew (three days later), he’d implemented the whole feature. It worked fine for me the first time.

The most surprising changes were:

  • In XNA, the Rectangle constructor takes (int left, int top, int width, int height). In SharpDX, it takes (int left, int top, int right, int bottom). This led to all KINDS of amusing weirdness.  (See the small compatibility helper sketched after this list.)
  • The Point type is gone, so I switched to Vector2 with some casting to int. Probably there is a better way but I don’t yet know it.
  • The XNA Color.Transparent is equivalent to Color(0, 0, 0, 0) — i.e. premultiplied black at zero alpha. In SharpDX, it is Color(0, 0, 0, 255) — i.e. black with full alpha. Everywhere I had been using Color.Transparent I switched to “new Color(0)” to get the same effect in SharpDX.
  • In XNA, the Color constructor takes (int r, int g, int b, int a). In SharpDX, it takes either (byte r, byte g, byte b, byte a) or (float r, float g, float b, float a). This caused some of my color math to use the float overload rather than the byte overload, with ensuing hilarity.
  • I ran into a premultiplied alpha problem with PNGs, which aren’t premultiplied. I posted an issue and Alex responded immediately AGAIN. I was able to hack around it with his suggested BlendState.NonPremultiplied workaround.
  • I tried using 32-bit premultiplied BMPs in RGBA format, but ran into ANOTHER issue, which, of course, I posted. We’ll see whether Alex maintains his exemplary responsiveness.
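
For what it's worth, the Rectangle and Color.Transparent differences above boil down to a couple of one-line helpers. This is a sketch against the SharpDX API as it behaved at the time of writing (later releases may differ), and the helper names are mine:

using SharpDX;

public static class XnaCompat
{
    // XNA: Rectangle(x, y, width, height); SharpDX (as described above):
    // Rectangle(left, top, right, bottom).
    public static Rectangle RectangleFromXna(int x, int y, int width, int height)
    {
        return new Rectangle(x, y, x + width, y + height);
    }

    // XNA's Color.Transparent is premultiplied (0, 0, 0, 0); per the note above,
    // "new Color(0)" reproduces that meaning in SharpDX.
    public static readonly Color XnaTransparent = new Color(0);
}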

If you’re getting the idea that Alex is incredibly helpful, you’re right. He’s right up there with the excellent authors of the BASS audio library, about whom I’ve similarly gushed. The very best open source projects to use in YOUR open source project are the ones with active and helpful maintainers. And having an open source project of your own really helps get support, since you can show your code and they can point others at your code as an example of something cool done with their work.

So now Holofunk is fully running on SharpDX with all features maxed, multiple window and two-player support, and much faster rendering. It looks pretty much the same, but now all obstacles to video texturing, bizarre postprocessing, Fourier transforms on the GPU, etc., etc., are removed. The sky is now the limit.

[screenshot of Holofunk running on SharpDX]

Next up: sound effects! At last!

Now that I’ve gotten that big move over with, it’s time to start putting sound effects in at last! I bought a copy of Sugarbytes Turnado, the VST plugin Beardyman uses in his live rig, and it is indeed friggin’ amazing. Been having all kinds of fun exploring it.

The tricky part is that it does eat CPU, and I’m not sure how many instances of it I will be able to create with the BASS audio library I’m using. So while part of me wants an interface that lets me apply any random set of effects to each and every individual loop, that would mean one instance of Turnado per loop. In Holofunk it’s easy to create dozens of loops. So I needed some fallback story for how to manage sound effects in some more granular way.

I think what I will try first is splitting the screen into eight wedges (instead of the four quarters in the current version), with each wedge having its own instance of Turnado. So all loops in a given wedge will share a set of effects. It will be possible to select multiple wedges and wave your arms wildly to wiggle the Turnado knobs for all those wedges at once. It will also be possible to record animated knob-wiggling across a set of wedges. That, combined with effect selection, should be more than enough to get some serious madness going on, while being pretty straightforward to implement.
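
The wedge lookup itself is simple. Here is a rough sketch of mapping a loop's on-screen position to one of the eight wedges; the screen-center parameters and the class name are illustrative, not final:

using System;

public static class WedgeMap
{
    public const int WedgeCount = 8;

    // Returns which of the eight equal pie-slice wedges around the screen center
    // contains the point (x, y); all loops in that wedge would share one Turnado instance.
    public static int WedgeIndex(float x, float y, float centerX, float centerY)
    {
        double angle = Math.Atan2(y - centerY, x - centerX);
        if (angle < 0)
        {
            angle += 2 * Math.PI;
        }
        return (int)(angle / (2 * Math.PI / WedgeCount)) % WedgeCount;
    }
}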

I also have an interesting idea for a graphical interface for scratching, but I’m going to show that one in prototype form rather than blog about it🙂

Meanwhile, Elsewhere In Loopania

djtechtools.com wins the big Looping Journalism Award of the last two months. They have an excellent and long article on Beardyman’s new live performance setup, which he said on Facebook will be coming soon to a TED talk near you. CAN NOT WAIT.

They also have a video of Imogen Heap with the latest iteration of her gestural music gloves. Here’s the video for your convenience — and look, she’s got a Kinect in there!

How ridiculously cool is THAT???

This entire field of musical experimentation is moving along most awesomely, and I’m greatly enjoying my own little part in it.

So: February is the month of Sound Effects Into Holofunk. March I plan another round of beta testing, aka getting together with my Holofunk posse and partying on it. April and May, more work, and then into demo season in June with a public performance at the Maker Faire. Going to try to be ready for a 30-minute gig this time!

Expect another update in March or so. Until then, stay funky!

Written by robjellinghaus

2013/02/07 at 00:27

Posted in Holofunk

A Publication! In Public!


The team I work on doesn’t get much publicity.  (Yet.)

But recently some guys I work with submitted a paper to OOPSLA, and it was accepted, so it’s a rare chance for me to discuss it:

Uniqueness and Reference Immutability for Safe Parallelism

This is really groundbreaking work, in my opinion.  Introducing “readable”, “writable”, “immutable”, and “isolated” into C# makes it a quite different experience.

When doing language design work like this, it’s hard to know whether the ideas really hold up.  That’s one epic advantage of this particular team I’m on:  we use what we write, at scale.  From section 6 of the paper:

A source-level variant of this system, as an extension to C#, is in use by a large project at Microsoft, as their primary programming language. The group has written several million lines of code, including: core libraries (including collections with polymorphism over element permissions and data-parallel operations when safe), a webserver, a high level optimizing compiler, and an MPEG decoder. These and other applications written in the source language are performance-competitive with established implementations on standard benchmarks; we mention this not because our language design is focused on performance, but merely to point out that heavy use of reference immutability, including removing mutable static/global state, has not come at the cost of performance in the experience of the Microsoft team. In fact, the prototype compiler exploits reference immutability information for a number of otherwise-unavailable compiler optimizations….

Overall, the Microsoft team has been satisfied with the additional safety they gain from not only the general software engineering advantages of reference immutability… but particularly the safe parallelism. Anecdotally, they claim that the further they push reference immutability through their code base, the more bugs they find from spurious mutations. The main classes of bugs found are cases where a developer provided an object intended for read-only access, but a callee incorrectly mutated it; accidental mutations of structures that should be immutable; and data races where data should have been immutable or thread local (i.e. isolated, and one thread kept and used a stale reference).

It’s true, what they say there.

Here’s just a bit more:

The Microsoft team was surprisingly receptive to using explicit destructive reads, as opposed to richer flow-sensitive analyses (which also have non-trivial interaction with exceptions). They value the simplicity and predictability of destructive reads, and like that it makes the transfer of unique references explicit and easy to find. In general, the team preferred explicit source representation for type system interactions (e.g. consume, permission conversion).

The team has also naturally developed their own design patterns for working in this environment. One of the most popular is informally called the “builder pattern” (as in building a collection) to create frozen collections:

isolated List<Foo> list = new List<Foo>(); 
foreach (var cur in someOtherCollection) {
    isolated Foo f = new Foo();
    f.Name = cur.Name; // etc ... 
    list.Add(consume f); 
}
immutable List<Foo> immList = consume list;

This pattern can be further abstracted for elements with a deep clone method returning an isolated reference.

I firmly expect that eventually I will be able to share much more about what we’re doing.  But if you have a high-performance systems background, and if the general concept of no-compromises performance plus managed safety is appealing, our team *is* hiring.  Drop me a line!

Written by robjellinghaus

2012/11/02 at 14:06

Posted in Uncategorized