The Battle for the Desktop Begins!

In late 2007, Linden Lab published their viewer roadmap for 2008, a way for us residents to take a look at what they were planning to release in future versions of the official Second Life® client. It all looks very promising, until you suddenly realise that this page has not been updated since August 2008; other Wiki pages linked from there are even older.

So, what happened? The drive to “increase stability” has mostly been understood internally to mean “more quality assurance testing” and also “collecting more statistics”, or so at least it seems. Put bluntly, this means that every time a single line of code is added to the official SL viewer, it routinely goes through a series of exhaustive tests to make sure that line of code doesn’t “interfere” with the rest and bring new, hitherto unknown bugs into existence. LL has devised a complex and very lengthy procedure for running those tests.

The result? The time from the moment a bug is found until a developer submits a patch to the Public JIRA can be a matter of days, sometimes even just a few hours, which happens when a volunteer developer (non-LL employee) suddenly gets inspired and patches the code. But it then takes months until that patch is approved. Typically it takes around 6-14 months until a bug gets fixed, although exceptions exist at both extremes: security issues can be patched much quicker (sometimes in hours!), and non-essential bugs or some new features can take 18 months… or, in some cases, like the introduction of Havok 4, four years.

M Linden is still keen on providing residents in 2009 with a new, light version of Second Life that will enhance the new user experience. The question is, is Linden Lab really up to it?

It’s tough to be a SL client developer at Linden Lab. I feel their frustration.

Until sometime in 2005, developing the SL client was pretty much “hack and slash”. Someone would have a great idea, it would be immediately coded and developed into a new feature, and released to the public. There were just a few thousand residents; they would try it out, complain, LL would get it fixed, then the residents would complain again, a new fix would be produced, and after two weeks of nightmares, LL would eventually get it right, and they’d start on the next set of features. At some point they started to do the tests on the Preview grid (then called “Beta grid” for testing out beta versions of the SL client).

In those days, the SL client and the simulation software on the servers had to be in sync, so if something went wrong with the client (not enough testing!), the servers had to be rolled back as well. This became far messier as new residents (now millions and not thousands) started to use Second Life. The vast majority of them have no clue how to configure their computers properly, they have widely differing hardware, and also different expectations of how SL should work. So at some stage LL’s quick feature development had to stop.

There was a huge gap in 2006 or so when no further features were introduced, and the focus was on stability. I believe that it was back then when LL started to introduce the concept of quality assurance testing. In the very early days, this consisted of a sequence of “test sheets” that were on their Wiki, and volunteer residents would go with the Lindens on the Preview grid and check each item off and report back. It was slow and painful work. The problem is, due to the nature of LL’s codebase, you never know if something as harmless as correcting a spelling mistake on a dialogue box won’t suddenly introduce a bug in texture caching. So this means that for every single line of code you had to do all the tests — again.

In 2007, a bit of innovation entered Second Life’s development cycle. The source code was released as open source software; libSL was promptly created, and the first user-contributed patches started to appear on JIRA; later on, the first “alternate SL viewers” started to make an appearance. OnRez, from the Electric Sheep Company, was probably the first that had some real impact, thanks to its use on the CSI:NY virtual presence in SL. LL had also acquired the company that developed WindLight and this meant, for a time, that residents got a whole new engine to be delighted by: SL images and machinimas never looked the same again.

The complexity of dealing with all that change prompted LL to do a lot of things at the same time. First, the server and client code were detached from each other as much as possible, and the Heterogeneous Grid was born — a grid that can run different versions of the simulation software, but also different versions of the SL client. As time has gone by, we’re able to use a wide variety of SL clients to connect to LL’s grid, even some that are hopelessly outdated (but that might somehow work better on a specific resident’s computer for some reason). Linden Lab now has two major ways of releasing new viewers: on the Preview Grid (which has little use) and through the series of Release Candidates (which are used by far more people, but still just “thousands”, not “millions”).

2008 was a return to “stability” and little innovation, so this goes in cycles. This year, 2009, ought to be “innovation” again, and there were quite a lot of projects accumulating to be deployed. SpeedTree®, to get rid of the Linden Trees and get 21st-century tree rendering into the SL client. Shadows in SL. Flexisculpties. Full meshes. “Puppeteering” or “physical avatars”, probably even with different skeletons/meshes. The group tools, version 2.0, which would also include different sets of permissions (allowing, say, Creative Commons licenses to be tagged to an object). Alternative compilers targeting Mono (i.e. making LSL just one of the supported scripting languages). Client-side plugins. A new lighting system to get good avatar lighting (and get rid of those Facelights, the bane of every machinimist and SL photographer). The list of “ongoing projects” is endless, and some would definitely be introduced in the next “innovation cycle”. We know, however, that LL will skip 2009’s innovation cycle and still continue to focus on “stability” for at least another year.

But… what does “stability” really mean? From a resident’s perspective, it only means longer development cycles, as more and more quality assurance tests are made each time a bug gets fixed. On the other side of the coin, this means more time is spent per bug, and fewer bugs are actually fixed. This in fact applies not only to the SL client, but to the simulation software as well: the last release of the sim software (1.24.10) was deployed in December 2007. The current version under deployment, 1.25, has been tested on the main grid in some form since October, but bugs have kept it from becoming the current version, and each time LL has promptly rolled back to the previous version. We’ll see if the “final” rollout this week will make it, or if LL will roll it back again, and work on more bugs.

In the meantime, this naturally means that the bugs introduced in December 2007 haven’t been fixed yet.

On the client side, things have been better, but not by much. The new, aggressive statistics capture has been a bane on all LL clients: main client, release candidates, OpenGrid client. For some reason, LL thought that sending a vast quantity of stats every frame or so would improve their knowledge of what’s going wrong on the resident’s side. Sadly, this also means that the CPU is spending almost as much time processing statistics and sending them to LL as it spends rendering scenes. How is that possible? But it’s a claim very easily sustained: just download one of the alternate clients (almost all disable statistics) and you’ll see twice the performance on the same sim, for no other apparent reason. Some of the alternate clients don’t even add much besides some patches and disabling statistics. The performance loss we get from the statistics is just incredible.

But if it actually allows LL to develop better viewers… that’s a good thing, right?

Until recently, I thought as much. LL’s series of clients, after all, keeps improving over time, and that’s undeniable, even if we like to say that “in the good old days everything was better”. It was not. We just have short memories. The most vocal complainers are usually people who happened to log in to SL during one of its best phases and naturally complain that things are not as they used to be. Others just have bad memories: using the same hardware as in 2004, I can now go to places with 100 avatars, stay there for hours on end, and still get a handful of FPS. With luck, after all those hours I will even see everybody in colours, and the likelihood of that happening has steadily increased over the years. During Winterfaire 2008, I did actually enjoy the snowball fight with the Lindens for the first time; in previous years, I had to navigate among a flurry of flying snowballs at 0.3 FPS, avoiding bumping into the dozens of other residents. This year, I got enough performance on the same hardware to run around like a child and yell in glee when I actually hit some of the Lindens 🙂 I even had enough performance to capture it on video, with plenty of FPS left to spare.

None of that was possible in 2007, much less in 2004.

So, yes, there are certainly improvements. However… the question here is why we couldn’t have more improvements in stability and performance while at the same time continuing to innovate. We were always told it’s a trade-off.

Not so for the alternate viewer gang. The list quoted before is incomplete, of course, but you can see that they have some things in common, and others that are totally unique. First, almost all have applied a series of a hundred or so patches, some of which are ancient and were released by Nicholaz Beresford eons ago. The first patches fixing memory leaks (mostly resulting from sloppy programming) were released by Nicholaz in May 2007. Since then, he has released quite a lot more, and, of course, he wasn’t the only one: many, many volunteers have contributed bug fixes over the past 18 months. All those patches are on the JIRA. Many are actually very small code changes, a line or two.

Only a very few ever made it into the LL viewer (and none of the ones with the greatest impact). Most are still under review by the quality assurance team.

But the series of alternate viewers offers more than bug fixes and performance increases. Frustrated with the slow pace of innovation by LL, many volunteers have delved deep into LL’s code, found delightful bits of incomplete code, and made them into finished products. Kirstens Viewer, for instance, will soon not only release a superfast, fully patched version, which adds her own neat tricks (Skype integration, a texture “thumbnail” browser, an in-built avatar radar, so you don’t need laggy HUDs to get that), but also bring out LL’s never-released advanced lighting system and their shadow engine, projects that have allegedly been suspended or given up in favour of “more stability”. And Kirstens Viewer has been pronounced by Hamlet Au “the best SL viewer ever”. That was on January 14. On January 19, Kirsten released a new viewer with shadows. Between the 14th and the 19th, Mac versions of Kirstens Viewer were made available. And as Kirsten says, this just took her a month and a half. One person only, no quality assurance teams whatsoever, and the end result is an insanely fast client (you have to see it for yourself, nobody believes me when I tell them that Armidi’s exteriors fully rezzed in under 30 seconds at 20 FPS, on a bad grid day, with 30 or so avatars on the sim, and during peak hours), with a whole cartload of patches and bug fixes, a sleek new look, and pretty new features as well.

This is actually the rate of development that we had from LL in the glorious days of 2002-2004, but not today. Today, it means that a bug is fixed in a year or so.

Kirstens Viewer might be the latest kid on the block with its coolness, but it’s far from the only one. Imprudence will feature a completely new user interface, which will be detached from the renderer, so it won’t interfere, and you could thus create UI plugins for it. That means a major overhaul of a lot of LL code. On the other side of the coin, the Restrained Life API allows a certain number of plugins to interface with your client, without waiting for a full interface redesign. Initially designed by the BDSM community, who always wanted attachments that you couldn’t detach (a very old feature request that predates JIRA…), the ability to develop plugins and access them from LSL actually created a market for interesting non-BDSM plugins, like Mystitools’ ability to selectively allow just some of your friends to chat with you. Why? Because after that API was published, the very popular Cool Viewer promptly implemented it. And this means thousands and thousands of SL residents can now start to enjoy the pleasure of client-side plugins; thanks to the success of Mystitools overall, more and more people have downloaded the Cool Viewer just to use those extra features.

I’ve always been a fan of “unproven” technology, in the sense that I’m fine with radically new things that don’t work well, so I tend to download and install pretty much whatever I like most. I haven’t been using the regular SL viewer for ages; I tend always to use the Release Candidates, since they always have exciting new things that I prefer. On my “presentation notebook”, however, I tend to have the plain old boring outdated standard SL viewer, because I never know when one of those alternate viewers will crash during a presentation.

However, this means that my audience will never be impressed. They’ll yawn at technology developed in 2003 (I can’t use WindLight on it, the Intel card doesn’t work with it). I’ve switched to Kirstens Viewer for my presentations, just because at least my underpowered MacBook (not Pro) will be able to impress them with 20+ FPS and reasonable images. Kirstens Viewer also allowed me to finally use WindLight regularly on my usual iMac. Oh, I always loved WindLight, and the few pictures I take are always on the Ultra setting, but… these days, I cannot bear to struggle through sims at 3-6 FPS any more. I’m spoiled: that’s what I had to live with in 2004, but I have no patience for that in 2009. Kirstens Viewer is the first that allows me to have the same performance with WindLight on as I get on the standard LL viewer without WindLight on. Now that impressed me.

So what does this all mean for LL? First, you have to realise that all those viewers together just have “a few thousand” users. Most residents aren’t aware they exist. On the Macintosh Group, a recent conversation showed that almost everybody, with a handful of exceptions, had never downloaded a non-LL viewer before (although these days, most have already been ported to Mac OS X). Rumours spread that “the alternate viewers are so good because they send your credit card info and passwords to third parties”, and so they’re not used. Other rumours say that “I had a friend who had a third-party viewer and was promptly banned by LL” (in reality, the “friend” was using CopyBot to steal content and that’s why LL banned them; CopyBot is, for all purposes, a third-party viewer too). And finally, many people come from other virtual worlds, where non-official viewers are illegal and their creators pursued in courts of law. The concept that LL has created (and encouraged!) an open source community around the viewer code is totally alien to them and they can’t believe it’s true. So, with little publicity, it means that the overwhelming majority of SL users have never tried an alternate viewer, not even one from LL (i.e. the release candidates, or the “beta” viewers).

This also means a very limited number of residents actually testing them, and, in general, most of them are technologically inclined. At the very least, they know how to go to the Preferences tab to tweak their settings — pretty much what LL expected users to do in 2002-2004, and that’s why Second Life, unlike most of the competing platforms, has so many tweaking options available. However, the vast majority of the residents don’t know how to do that. Half of them don’t even know how to read the Preferences tab, because they don’t speak English and might not have a translated version of the SL client.

So this means that the number of people effectively testing the alternate viewers is very small — about the same number of residents that existed in Second Life in mid-2004 or so. From LL’s point of view, it’s probably too small a test base to validate that all those patches are actually correctly applied.

Because that’s the only reason I can give for LL repeatedly ignoring what others are doing to their code. After all, the major reason for releasing the code as open source was to make sure that they’d get free labour in reviewing and fixing bugs. Which the many volunteers promptly provided. But LL is reluctant to accept their code. Why?

The only answer I have is “because it’s not tested enough”, although it has been tested by several thousand residents. Way more people than LL has in their internal quality assurance team, and very likely way more people than actually use the release candidates.

So… there is obviously a question of internal policy. In my mind, software requires exhaustive testing in the field, a belief that I think I share with LL. Two techies running a series of tests on an empty sim will never uncover the kind of bugs that two clueless residents will find on the laggiest sim on the grid, surrounded by avatars and with 2.2 billion items on LL’s asset servers. Nothing like a “real experience” to figure out what’s wrong. But… the “real experience” presumes that bugs are fixed, patches are applied, and users are using the client, not engineers in a lab crossing items off a checklist!

I’d certainly encourage LL to review their internal processes, and at least for their Release Candidate series, be bold and experimental, fully embrace the cartloads of bug fixes and patches that have been submitted in the past two years or so, and forget their “fears” about “improperly tested code”. Open source software grows by being tested in the field. “Code fast, release often” is the old mantra that applies to this.

LL can, of course, have three approaches side-by-side. One will focus on the “main SL viewer”: an old, always-obsolete bit of code that nevertheless is reliable enough and that doesn’t change much over the years, and which gets a new release every year or two. That’s the one that ought to be the “Reliability Client”: no features, just field-tested bug fixes, nothing fancy. In parallel, develop the much-announced “SL Lite” client with all major features stripped down and as easy to use as IMVU. That’s also a good approach, and “SL Lite” might get even fewer releases. Since it probably won’t even have building tools or the scripting editor, it can be made so minimalistic that it’ll run on any computer. Graphics will be ugly, but that’s ok: as long as you can log in and move around, that’s all it needs to do.

And then, on the other side of the Lab, have the team developing fast and furiously the bleeding edge of Second Life. Work to incorporate all those developments from the alternate viewers (plugins, performance, lighting effects, features, a detached UI) and release it as a single, LL-branded viewer. Do it often. Forget quality assurance tests: you can always do them once it’s time to launch an official “stable” release, but, until then, experiment a lot. Releasing a new version every week ought to be the goal; LL managed to do that in 2005/6, with a much smaller number of developers in their teams, so there is no reason they cannot do the same now. The trick is, of course, to do the “code hack and slashing” only on the Release Candidates. Then, after a year or so, push the code to the quality assurance team, and let them test it for half a year, and if it meets with their approval, release a new “stable” version. In the meantime, of course, once the code is “frozen” for evaluation by the QA team, development can start instantly on the next batch of release candidates. There is no need to wait!

My issue at this stage is not with LL being totally at odds with that approach. After all, if the team behind the Release Candidates is not allowed to work that way, volunteer residents will, and this means that “unfinished” code and bugs and performance patches will be handled by the open source community, not by LL. But it’s still a pity. Why should millions of SL residents be deprived of the pleasures of enjoying superfast performance and all sorts of nifty tricks that way?

Unless, of course, it all has to do with the Big Announcement that is due by the end of the month, which will address the SL client and the open source development efforts behind it, but about which, sadly, nobody is willing to give us even a clue…
