The Battle for the Desktop Begins!

In late 2007, Linden Lab published their viewer roadmap for 2008, a way for us residents to take a look at what they were planning to release in future versions of the official Second Life® client. It all looks very promising, until you suddenly realise that this page has not been updated since August 2008; other Wiki pages linked from there are even older.

So, what happened? The drive to “increase stability” has been mostly understood internally to mean “more quality assurance testing” and also “collecting more statistics”, or so at least it seems. Put bluntly, this means that every time a single line of code is added to the official SL viewer, it routinely goes through a series of exhaustive tests to make sure that line of code doesn’t “interfere” with the rest and bring new, hitherto unknown bugs into existence. LL has devised a complex and very lengthy procedure to run a series of tests for that.

The result? From the moment a bug is found, it can take days for a developer to submit a patch to the Public JIRA, and sometimes just a few hours, which happens when a volunteer developer (a non-LL employee) suddenly gets inspired and patches the code. But it then takes months until that patch is approved. Typically it takes around 6-14 months until a bug gets fixed, although exceptions exist at both extremes: security issues can be patched much more quickly (sometimes in hours!), and non-essential bugs or some new features can take 18 months… or, in some cases, like the introduction of Havok 4, four years.

M Linden is still keen on providing residents in 2009 with a new, light version of Second Life that will enhance the new user experience. The question is, is Linden Lab really up to it?


It’s tough to be an SL client developer at Linden Lab. I feel their frustration.

Until sometime in 2005, developing the SL client was pretty much “hack and slash”. Someone would have a great idea, it would be immediately coded and developed into a new feature, and released to the public. With just a few thousand residents around, these would try it out, complain, LL would get it fixed, then the residents would complain again, a new fix would be produced, and after two weeks of nightmares, LL would eventually get it right, and they’d start on the next set of features. At some point they started to do the tests on the Preview grid (then called “Beta grid” for testing out beta versions of the SL client).

In those days, both the SL client and the simulation software on the servers had to be in sync, so if something went wrong with the client (not enough testing!), the servers had to be rolled back as well. This became far messier as new residents (now millions, not thousands) started to use Second Life. The vast majority of them have no clue how to configure their computers properly, they have widely differing hardware, and also different expectations of how SL should work. So at some stage LL’s quick feature development had to stop.

There was a huge gap in 2006 or so when no further features were introduced, and the focus was on stability. I believe that it was back then when LL started to introduce the concept of quality assurance testing. In the very early days, this consisted of a sequence of “test sheets” that were on their Wiki, and volunteer residents would go with the Lindens on the Preview grid and check each item off and report back. It was slow and painful work. The problem is, due to the nature of LL’s codebase, you never know if something as harmless as correcting a spelling mistake on a dialogue box won’t suddenly introduce a bug in texture caching. So this means that for every single line of code you had to do all the tests — again.

In 2007, a bit of innovation entered Second Life’s development cycle. The source code was released as open source software; libSL was promptly created, and the first user-contributed patches started to appear on JIRA; later on, the first “alternate SL viewers” started to make an appearance. OnRez, from the Electric Sheep Company, was probably the first that had some real impact, thanks to its use on the CSI:NY virtual presence in SL. LL had also acquired the company that developed WindLight and this meant, for a time, that residents got a whole new engine to be delighted by; SL images and machinimas never looked the same again.

The complexity of dealing with all that change prompted LL to do a lot of things at the same time. First, the server and client code were detached from each other as much as possible, and the Heterogeneous Grid was born — a grid that can run different versions of the simulation software, but also different versions of the SL client. As time has gone by, we’re able to use a wide variety of SL clients to connect to LL’s grid, even some that are hopelessly outdated (but that might somehow work better on a specific resident’s computer for some reason). Linden Lab now has two major ways of releasing new viewers: on the Preview Grid (which has little use) and through the series of Release Candidates (which are used by far more people, but still just “thousands”, not “millions”).

2008 was a return to “stability” and little innovation, so this goes in cycles. This year, 2009, ought to be “innovation” again, and there were quite a lot of projects accumulating to be deployed. SpeedTree®, to get rid of the Linden Trees and get 21st-century tree rendering into the SL client. Shadows in SL. Flexisculpties. Full meshes. “Puppeteering” or “physical avatars”, probably even with different skeletons/meshes. The group tools, version 2.0, which would also include different sets of permissions (allowing, say, Creative Commons licenses to be tagged to an object). Alternative compilers to Mono (i.e. making LSL just one of the supported scripting languages). Client-side plugins. A new lighting system to get good avatar lighting (and get rid of those Facelights, the bane of every machinimist and SL photographer). The list of “ongoing projects” is endless, and some would definitely be introduced in the next “innovation cycle”. We know, however, that LL will skip 2009’s innovation cycle and continue to focus on “stability” for at least another year.

But… what does “stability” really mean? From a resident’s perspective, it only means longer development cycles, as more and more quality assurance tests are made each time a bug gets fixed. But on the reverse side of the coin, this means more time is spent per bug, and that fewer bugs are actually fixed. This in fact applies not only to the SL client, but to the simulation software as well: the last release of the sim software (1.24.10) was deployed in December 2007. The current version under deployment, 1.25, has been tested on the main grid in some form since October, but bugs have kept it from becoming the current version, and each time LL has promptly rolled back to the previous version. We’ll see if the “final” rollout this week will make it, or if LL will roll it back again, and work on more bugs.

In the meantime, this naturally means that the bugs introduced in December 2007 haven’t been fixed yet.

On the client side, things have been better, but not by much. The new, aggressive statistics capture has been the bane of all LL clients: main client, release candidates, OpenGrid client. For some reason, LL thought that sending a vast quantity of stats every frame or so would help them improve their knowledge of what’s going wrong on the resident’s side. Sadly, this also means that the CPU is spending almost as much time processing statistics and sending them to LL as it spends rendering scenes. How is that possible? But it’s a claim very easily sustained: just download one of the alternate clients (almost all disable statistics) and you’ll see twice the performance on the same sim, for no other apparent reason. Some of the alternate clients don’t even add much besides some patches and disabling statistics. The performance loss we get from the statistics is just incredible.

But if it actually allows LL to develop better viewers… that’s a good thing, right?