No More Limits!

The Programmers’ Nightmare

Expert 3D modellers and architects, with over 30 years of experience using computer-aided 3D modelling tools, tear their hair out in frustration when dealing with Second Life’s in-built tools; but that is nothing compared to seeing programmers weep and cry at what Linden Lab provides them as a “programming language”.

Linden Scripting Language version 2 (LSL 1 was done over an afternoon and never used by any resident) is ancient. It has about the processing power of a 100 kHz CPU (that’s not MHz) and uses 16 KBytes of memory. That throws us back to 1981, when “personal computers” were kids’ toys at the dawn of ages. But worse than these basic limits are the “Linden limits” on a lot of function calls, which were deliberately “slowed down” mostly to “prevent griefing”. We’ll talk about those in a minute.

In the meantime, if you’re a programmer and curious about how people are able to do such amazing things with LSL, and eager to enter a brand new age of 3D software development — think again. I usually start my programming classes saying that “an LSL programmer spends 10% of the time writing code and 90% developing workarounds for LSL’s artificial limitations”. Yes, this means that a thousand lines of code in LSL take ten times the development time they would in any other common, modern programming language.

Imagine that you’re a Web developer and picked, say, PHP for your development environment. After a moment you suddenly figure out that every time you call a function, the whole webserver blocks for a few seconds, and nobody can view your site during that period. Baffled, you look at your code: are there any bugs there? A strange loop that is waiting for something to happen that never occurs?

You open up the manual, look up the functions used by your code, and just get a warning: calling a particular function stops your webserver for a bit. There is no plausible explanation. The language designers just felt that this function was particularly “dangerous” and, to prevent you from abusing it, they block your webserver for a bit.

You scroll down checking each and every one of your functions. And, as you expected… all of the useful functions (the ones you cannot avoid calling, or your web-based application wouldn’t work!) are — you guessed it! — artificially blocking your webserver. All in the name of “protecting your webserver from abuse”. But… what abuse? You’re in control of your webserver, you’re supposed to know what you’re doing, right?…

Going further on, you start to change your code around, just to make sure you call these “blocking” functions as little as possible. You push them into other scripts and call them remotely only when needed. And you add some extra code to keep track of what you’re doing, just to avoid, at all costs, calling those functions at the wrong time.

What happens is that suddenly you find out you have no memory left!

Now what? Well… you can obviously push some of the functions into another script and thus split the memory limits between two scripts… but then you have to add an extra layer: communication between scripts. In effect, you’ll be deploying a whole set of Remote Procedure Calls — those things that you’ve vaguely learned about in college, but that serious programmers never use inside their own code. It’s back to the drawing board, then. And you suddenly figure out that the language does not support Remote Procedure Calls natively — you have to write your own communication layer to deal with that.

While you’re doing this, new problems pop up. When you’re calling a function on another script, you suddenly have no clue where you’re calling it from. This is a synchronisation issue: there is no guarantee that you’ll return to the point where you were. Well!… computer scientists wrote whole books on synchronisation issues. They came up with nice little ideas like “semaphores” or “shared memory” to deal with those. Guess what?… Your programming language doesn’t support either. You simply don’t have anything which is globally shared across scripts. What now?
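To make the “where was I?” problem concrete, here is a minimal sketch of the kind of home-made communication layer described above. It is plain Python, not LSL, and every name in it (`MessageBus`, `Worker`, `Caller`, `call_add`) is hypothetical and exists only for illustration. The assumption is that scripts can only exchange one-way messages, so the caller has to tag each request with an ID and save its own context, because nothing else will remember it when the reply eventually arrives.

```python
from collections import deque
from itertools import count


class MessageBus:
    """Stand-in for one-way messages between scripts: no return values, no shared memory."""

    def __init__(self):
        self.queue = deque()

    def send(self, target, payload):
        self.queue.append((target, payload))

    def deliver_all(self, scripts):
        # Deliver queued messages in order; handlers may queue further messages.
        while self.queue:
            target, payload = self.queue.popleft()
            scripts[target].on_message(payload)


class Worker:
    """The script that hosts the 'remote' function (here: a trivial addition)."""

    def __init__(self, bus):
        self.bus = bus

    def on_message(self, payload):
        if payload["type"] == "request":
            reply = {"type": "reply", "id": payload["id"],
                     "result": payload["a"] + payload["b"]}
            self.bus.send(payload["reply_to"], reply)


class Caller:
    """The script that needs the result and must remember why it asked."""

    def __init__(self, bus, name):
        self.bus = bus
        self.name = name
        self._ids = count(1)
        self.pending = {}   # request id -> context saved at call time
        self.results = {}   # context -> result, filled in when replies arrive

    def call_add(self, a, b, context):
        request_id = next(self._ids)
        self.pending[request_id] = context
        self.bus.send("worker", {"type": "request", "id": request_id,
                                 "reply_to": self.name, "a": a, "b": b})
        return request_id

    def on_message(self, payload):
        if payload["type"] == "reply":
            context = self.pending.pop(payload["id"], None)
            if context is not None:
                self.results[context] = payload["result"]


bus = MessageBus()
caller = Caller(bus, "caller")
scripts = {"caller": caller, "worker": Worker(bus)}
caller.call_add(2, 3, "first")
caller.call_add(10, 20, "second")
bus.deliver_all(scripts)
print(caller.results)
```

The `pending` table is the whole trick: it stands in for the call stack that a real function call would have given you for free.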

Oh, and you can’t even store parameters on external files, much less databases. You can read from files (cumbersomely so, and it takes a lot of time), but not write to them. You can call functions on other webservers — provided you don’t call them often. More than ten per second, and you’re out: your script gets blocked, and you get no clue of what’s happening. 

If World-Wide Web applications had to deal with this sort of issue every day, we would never have seen the Web grow like it did (the Common Gateway Interface which allows programming languages to be used to develop Web applications was created around 1993 or so). We would be stuck with static HTML on our pages, and simple and primitive search functions perhaps — so long as you didn’t search a lot, of course. And with luck you’d be able to allow your users to change the colours of your web page. Not too often — or it would crash your webserver.

Well, this is the daily experience of a professional LSL programmer!

LSL programmers can bang their collective heads against the walls, but the silly limitations won’t go away, even after years. In fact, LL is quite devious in inventing new ones every day. For instance, since 2003 people have been asking for efficient inter-object communication. LL gave them a way to send text chat and to receive it at the other end — in a painfully slow way. Also, in the process, it lags the whole sim. So they improved it with “linked messages” — but they only work inside the same object. You cannot use a superfast linked message across objects, just across, well, linked prims. You can send emails to other objects — provided their key (UUID) doesn’t change (which it will, as soon as you make a copy of the object) — and it takes 10 seconds to deliver. When it does. You have no way to know. And finally, you can make HTTP requests to external webservers — that’s superfast too (almost as fast as linked messages). So LL promptly refuses to let you make more than about ten requests per second. Good enough for primitive interfaces (humans are slow to react); totally useless if you wish to develop, say, a fast-paced game that requires a lot of communication. Ah… and you can’t get in touch with in-world objects either, it’s only one-way. Well, sort of. You still have the (deprecated) XML-RPC calls, launched in June 2004 and never changed since then. These used to work blindingly fast too — in June 2004. These days, LL doesn’t even guarantee that any message sent from a remote server using XML-RPC will ever reach the object. The infrastructure supporting XML-RPC is hopelessly outdated and struggles to survive with all the creative uses that people have given to it (like, for instance, SL Exchange or OnRez…). LL recommends not using it and promises to give us new functions to do pretty much the same — in a few years. Or decades.

If you think it can’t get worse… it does. There are millions of Animation Overriders in SL. Why? Because when LL introduced custom-made animations (also in June 2004 — an excellent month for fantastic new features!) they forgot a tiny detail: how to change the default animations?

One would obviously assume that there would be a built-in preference dialogue box to quickly change them, by dragging and dropping animations on top of it. Not so!… LL forgot about them. LSL programmers to the rescue: by reading the avatar’s “animation state”, you could force it to stop the current animation and start playing a new one. All very nice (conceptually so) until… programmers started to hit the “limitations” of LSL. You don’t get a nice event to tell you when an animation changes: you have to continuously ask for it. Yes, you guessed correctly — that’s insanely laggy.

So you get all sorts of events from SL — like when you touch an object, collide with it, change its shape, colour, or ownership, or even when you drop something inside of it. There are quite a lot of events, and their purpose is always the same one: every time something changes in SL, your LSL script gets informed. You don’t need to check for a change; SL is happy to call your script when an event is waiting for you.

Except, of course, when it matters. There is no event to inform you when an animation changes. Why not? Well… “because”.
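To put a rough number on the difference, here is a small Python simulation (not LSL, and not how any real AO is written). The function names, the 250 ms polling interval and the three animation changes in one minute are all assumptions picked for illustration. It simply counts how often a polling overrider has to ask “did anything change?” compared with how often a hypothetical change event would have fired.

```python
from bisect import bisect_right


def polling_overrider(change_times_ms, duration_ms, poll_interval_ms):
    """Return (checks performed, overrides applied) for an AO that polls on a timer."""
    timeline = sorted(change_times_ms)
    checks = 0
    overrides = 0
    seen = 0  # how many changes the poller has already noticed
    for now in range(0, duration_ms, poll_interval_ms):
        checks += 1
        happened = bisect_right(timeline, now)  # changes that occurred up to 'now'
        if happened > seen:
            overrides += 1
            seen = happened
    return checks, overrides


def event_overrider(change_times_ms, duration_ms):
    """Return the number of callbacks if the (non-existent) change event were available."""
    return sum(1 for t in change_times_ms if t < duration_ms)


changes = [5000, 20000, 41000]  # assumed: three animation changes within one minute
print(polling_overrider(changes, 60000, 250))
print(event_overrider(changes, 60000))
```

Under these assumptions the poller wakes up 240 times to catch 3 changes; an event would have needed exactly 3 wake-ups. Multiply the difference by every avatar wearing an AO and the lag stops being a mystery.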

Thus, since September 2004, when the first AO was launched, the grid has been plagued with a few million AOs that lag the whole grid while they constantly check whether their owners have changed animation by chance… and, well, LL has never changed it. Why not? It’s… the Tao of Linden. Adding a new event, or, better, creating a nice, friendly, easy-to-use dialogue box on Preferences to drag and drop animations on top of it, is, well, not a priority — it is still waiting for some Linden developer to take a look at this JIRA request and do something about it. Or, worse, this one, which doesn’t rely on LSL but is a suggestion on how to change the SL client interface to allow client-side AOs (notice that Alexa Linden deemed it to be the same as the other request — it is not the same! — and simply “closed” it, to be buried and ignored under thousands of other useful requests).

Remember the dance machines? They are neat toys where a lot of avatars can select animations to dance. Guess what, a script can only animate one avatar at a time. So how do these dance machines work? They have one script for each possible avatar, and an insanely complex communication protocol to make sure you get assigned a free slot on one of (possibly) 100 scripts inside it. Well, probably not a hundred — if you place more than 40 or so scripts in a single prim, things start to misbehave. So you’ll have to split your dance scripts among several prims, and communicate across them. Wow, so much trouble for a simple device?… Oh yes. That’s what programming in LSL means: the simplest things take a lot of effort just to work around the limitations. Why didn’t LL add the possibility of animating more than one avatar from a single script?… Good question! It can’t be a lag-related issue (more scripts will lag more than one single script), so very likely it’s just because the Linden developer in charge of that bit of code hates clubs and dancing and doesn’t think it’s worth the effort.
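The bookkeeping behind such a machine can be sketched in a few lines of Python (again, not LSL). The class and method names here are hypothetical, and the sketch deliberately leaves out the hard part, which is the messaging between prims. It assumes one animation script per dancer and at most 40 scripts per prim, as described above, and only shows how slots are handed out, reused and mapped to prims.

```python
import heapq
import math


class DanceMachine:
    """Hands out one animation-script slot per avatar, spread across several prims."""

    def __init__(self, capacity, scripts_per_prim=40):
        self.scripts_per_prim = scripts_per_prim
        self.prims_needed = math.ceil(capacity / scripts_per_prim)
        self._free = list(range(capacity))  # a sorted list is already a valid heap
        self._slot_of = {}                  # avatar -> slot index

    def join(self, avatar):
        """Return the avatar's slot, or None if every script is busy."""
        if avatar in self._slot_of:
            return self._slot_of[avatar]
        if not self._free:
            return None
        slot = heapq.heappop(self._free)    # lowest-numbered idle script
        self._slot_of[avatar] = slot
        return slot

    def leave(self, avatar):
        slot = self._slot_of.pop(avatar, None)
        if slot is not None:
            heapq.heappush(self._free, slot)

    def prim_of(self, slot):
        """Which prim holds the script for this slot."""
        return slot // self.scripts_per_prim


machine = DanceMachine(capacity=100)
print(machine.prims_needed)                       # prims needed for 100 dancers
print(machine.join("Alice"), machine.join("Bob"))
machine.leave("Alice")
print(machine.join("Carol"))                      # reuses the slot Alice freed
```

In the real thing, every `join` and `leave` becomes a round of messages between the controller and the prim holding the chosen script, which is where the “insanely complex communication protocol” comes from.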

When you start entering the physical world… things get even more dramatic. Physics-enabled devices (from vehicles to weapons) are notoriously laggy and have a huge impact on sim performance. So LL has become extra-devious in limiting them all. The major reason here is to avoid griefing — so these functions all have in-built delays and limitations. That’s why it’s hard to create a super-weapon that fires a lot of bullets per second: LSL doesn’t allow it. But… there are workarounds.

In fact, a clever concept is at the kernel of LSL programming: “script farms”. The limitations are almost all “per script”, e.g. like the example for the dance machine. So what the clever LSL programmers do is just to spread a function across more scripts. You can only fire a bullet every three seconds? No problem, have 300 scripts firing bullets at the same time, and cycle among them, and you’ll be able to have a constant firing rate of 100 bullets/second — pretty impressive when you’re at the battleground. And pretty impressive on what it does to the lag on the server, too.
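The arithmetic is easy to check with a toy simulation. This is plain Python rather than LSL; the names (`Script`, `simulate_farm`) and the 10 ms controller tick are assumptions made up for this sketch, while the three-second delay and the 300 scripts come from the example above. Time is kept in integer milliseconds to avoid floating-point surprises.

```python
from dataclasses import dataclass


@dataclass
class Script:
    cooldown_ms: int       # forced delay after each shot
    ready_at_ms: int = 0   # simulated time at which this script may fire again

    def try_fire(self, now_ms):
        if now_ms < self.ready_at_ms:
            return False
        self.ready_at_ms = now_ms + self.cooldown_ms
        return True


def simulate_farm(num_scripts, cooldown_ms, duration_ms, tick_ms=10):
    """Cycle round-robin over the scripts, asking one script per tick to fire."""
    scripts = [Script(cooldown_ms) for _ in range(num_scripts)]
    shots = 0
    for step, now_ms in enumerate(range(0, duration_ms, tick_ms)):
        if scripts[step % num_scripts].try_fire(now_ms):
            shots += 1
    return shots


print(simulate_farm(1, 3000, 6000))     # one script: 2 shots in 6 seconds
print(simulate_farm(300, 3000, 6000))   # 300 scripts: 600 shots in 6 seconds
```

Six hundred shots in six seconds is the promised 100 bullets per second, and no individual script ever breaks its three-second rule. The per-script limit is respected to the letter and defeated in spirit.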

Likewise, LL has fought very aggressively against griefers’ self-replicating objects. There is the “grey goo fence”: a series of measures to try to figure out if someone is replicating too fast. If the system that analyses the pattern of self-replication thinks it’s got the signature of a griefer, the functions are blocked — because “regular” use of the rezzing feature is usually not so intense. Well, what do griefers do?… they just replicate slower but use more starting objects.

With Web-side requests, LL went even further with their paranoid measures. Here their fear was that people would use the Grid to launch distributed denial-of-service attacks on third-party servers. Imagine 25,000 sims, all launching an attack simultaneously doing hundreds of requests per second on, say, Anshe Chung’s website or Prokofy Neva’s blog (always popular targets of griefers). They would immediately crash the servers — and Linden Lab would take the blame, since the attack would have originated on their grid.

Well, to prevent this, LL limits “too many requests” per second. But they do it even more cleverly: they restrict it per avatar. So you can’t use “script farms” that way — all your objects’ requests are pooled together, and the limit is applied to all of them.
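A pooled limit of this kind might look roughly like the following Python sketch. It is not LL’s actual implementation, which is not described here; the class name, the sliding one-second window and the exact limit of ten are assumptions based on the “about ten requests per second” figure mentioned earlier. The point is only that the budget is keyed on the owner, so the number of scripts making requests makes no difference.

```python
from collections import defaultdict, deque


class OwnerRateLimiter:
    """Sliding-window limiter whose budget is shared by everything one owner runs."""

    def __init__(self, max_requests=10, window_ms=1000):
        self.max_requests = max_requests
        self.window_ms = window_ms
        self._history = defaultdict(deque)  # owner -> timestamps of accepted requests

    def allow(self, owner, script_id, now_ms):
        # script_id is deliberately ignored: the limit is per owner, not per script.
        history = self._history[owner]
        while history and now_ms - history[0] >= self.window_ms:
            history.popleft()               # forget requests outside the window
        if len(history) >= self.max_requests:
            return False
        history.append(now_ms)
        return True


limiter = OwnerRateLimiter()
# Thirty scripts owned by the same avatar all try a request at the same instant.
accepted = sum(limiter.allow("owner-1", f"script-{i}", now_ms=0) for i in range(30))
print(accepted)
```

Only ten of the thirty requests get through, however they are spread across scripts. A full window later the owner’s budget is available again.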

Of course, what do griefers do?… create a hundred alts per sim, and each launches their own attacks. It takes a bit more time, but… LSL programmers are used to workarounds, even dramatic ones.

So what we have in fact is a pretty simple language (anyone familiar with programming languages will pick up the basics in an afternoon), but that requires a daunting amount of workarounds until you start to be able to do something remotely useful with it. These workarounds take months and months to learn — many are “trade secrets” which you don’t usually get explained during scripting classes. Quite a lot are impossible to figure out from reading code — freebies tend to use few of those tricks, and the best scripters will not share the workarounds they managed to discover. So be prepared to face years of trial-and-error experimenting until you grasp the basics of “working around LSL limitations”. But a good and persistent programmer will eventually get there.

What this means is that LL has artificially created a huge gap. On one side we have the beginning programmers, the ones that are so happy to have created a door that actually opens and closes. They don’t care about complex things — they wish just to experiment with the simple ones. Things like slideshow presenters are easy to do with a few lines of code, and quickly understood by beginners.

On the Dark Side, we have the griefers, exploiting every possible hole in the architecture to, well, bring the grid down, or at least a few sims, or at the very least, annoy a few avatars. To be able to handle them, Linden Lab added a lot of complex limitations and checks so that griefers are seriously hampered in their attempts.

But are they really?… Almost everything has a workaround, it just takes ages to deal with those. So the top programmers spend ten times the normal amount of time to deal with the workarounds — but so do the griefers, who are also top programmers. Granted, the “casual user” and the “script kiddie” will probably give up quickly enough when trying to launch their Ultimate Grid Attack™, but… seriously… how many of these are around? Serious griefers are serious programmers, too; both know how to subvert LL’s limitations. They’re not really worried about how terrific LL’s limitations are: there are always ways around them, and griefers have ample time and nothing to do with it, so they can spend weeks and months figuring out a way to subvert the limitations.

In the meantime, professional scripters, earning their living in Second Life, spend countless hours hitting roadblocks and dead-ends in their desperate attempts to have LSL at least do something useful. Programming time skyrockets into the unforeseen future, as clients wait for the programmers to deliver. And sometimes there is simply no way out but to go to libopenmetaverse (formerly known as libsecondlife) and do there what you can’t do in LSL. No wonder ‘bots are more and more popular: the limitations are sometimes simply too hard to work around in LSL, or even impossible (like the famous scripts that only work when the avatar owning the object is in the same sim; if they log off, everything stops working… a silly limitation for some functions that has no plausible reason for existing, and only by having a ‘bot in the sim will things work correctly…).

Now I’m not underplaying the eternal fight between griefers and developers. There has to be a way to limit griefers’ attacks, and this mostly means making their job so hard that they give up and go elsewhere to have their laughs at our expense. But at the same time, this means that everybody else needs to suffer because of those naughty limitations…

A few, however, are really just “a whim”. I can only seriously hope that LL will start dropping those as soon as Mono is rolled out over the whole grid (almost done!) and thoroughly debugged (shouldn’t take much longer!). Others, well, they will probably never implement — like a way to get rid of the laggy animation overriders by simply adding that feature on the SL client. We’ll have to wait until someone patches the SL client to do that. Similar things will have to wait until the Mono engine allows other languages to be deployed besides LSL. Mono-based scripts are compiled server-side, so there is a higher degree of control that way — LL could, in theory, apply their clever pseudo-AI (the one that tries to identify if a running script is replicating itself too quickly and shut it down) at the compilation stage. This would mean that they could search for patterns of suspicious code and disallow its compilation at all — before harm is done. Think of how modern anti-virus software works: it looks for signatures (patterns of behaviour that are often used by virus software writers) to identify potential new viruses. LL could do something similar: any script that tries to self-replicate too quickly could be flagged for review and not compile at all, thus keeping the Grid safe.

Granted, like in the real world, this is an arms race: griefers will try to develop new griefing tools that don’t use any known methods to force the server-side Mono compiler to validate their scripts and compile them; but after such an attack, LL would be able to see what kind of new trick the griefer has come up with, and add it to their database of “signatures” — the next time a griefer uses the same trick, they won’t be able to compile that script. More interesting is that this would not require a server patch (like it does today) — just an update on the “griefer script signatures”, which might be done in real-time (i.e. all sims are forced to pull the latest database of “griefer script signatures” when compiling a new script to Mono bytecode).
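As a thought experiment only, the idea could be sketched like this in Python. Nothing here reflects how LL’s compiler works: `SignatureDB`, `may_compile` and the single regular expression are hypothetical, and a real checker would need something far smarter than pattern matching on source text. The sketch only shows the property that matters for the argument, namely that adding a signature changes what compiles without touching the checker itself.

```python
import re
from dataclasses import dataclass, field


@dataclass
class SignatureDB:
    """Named patterns of suspicious code; can be updated without changing the checker."""
    version: int = 0
    patterns: dict = field(default_factory=dict)  # name -> compiled regex

    def update(self, name, regex):
        self.patterns[name] = re.compile(regex, re.DOTALL)
        self.version += 1

    def scan(self, source):
        """Return the names of all signatures found in the script source."""
        return sorted(name for name, pattern in self.patterns.items()
                      if pattern.search(source))


def may_compile(source, db):
    """A script is allowed to compile only if no known signature matches."""
    return not db.scan(source)


# Illustrative script: an object that rezzes a copy of itself whenever it is rezzed.
replicator = ('default { on_rez(integer p) { '
              'llRezObject("copy", llGetPos(), ZERO_VECTOR, ZERO_ROTATION, 0); } }')
greeter = 'default { touch_start(integer n) { llSay(0, "hi"); } }'

db = SignatureDB()
print(may_compile(replicator, db))   # True: the trick is not known yet

# After the attack, the new trick is added to the database (hypothetical signature).
db.update("rez-inside-on_rez", r"on_rez\s*\([^)]*\)\s*\{[^}]*llRezObject")
print(may_compile(replicator, db))   # False: same trick, now refused
print(may_compile(greeter, db))      # True: ordinary scripts are unaffected
```

The obvious weakness is the one described above: a slightly different trick sails through until someone writes a signature for it.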

And that way, most silly limitations could be lifted, allowing LSL programmers to be insanely more productive.

Conclusions

We live in a limited world. Sadly for the content creators, they spend too much time working around the silly limitations — most of them created at whim by some angry developer in a bad mood; a few really outdated limitations that don’t apply any more; others just the consequences of bad design; and a few just the result of putting SL to uses that LL never dreamed of.

Overhauling the whole of Second Life is not just redesigning the architecture (which is crucial, of course, since it will allow scalability and more stability). It’s taking a good, close look at how people are actually using SL today, and what their nightmares are. Content creators — but also common users! — are tied to all sorts of limiting factors in their virtual lives, many of which don’t make any sense. The struggle against all these limits is too constraining sometimes, and the last option is just to give it up.

It’s true that the most fascinating artistic creations have come from surpassing obstacles and limitations and constraints. It’s the way our minds work: human beings are problem solvers, and “life” has often been described as a series of obstacles that we have to pass, and by doing so, we enjoy a sense of fulfilment that gives us pleasure. Well, that is true; on the other hand, if the obstacles are set too high, it leads to frustration, disappointment, and, ultimately, abandoning the attempt.

Fine-tuning how much LL can limit our creativity without leading to frustration is quite an art 🙂


Edited on June 5, 2020 to correct the link to libopenmetaverse (the ‘new’ name for libsecondlife) — Gwyn