First, a disclaimer…
I do not have any type of connection to Linden Lab (wish I had, though 🙂 ). Information collected about how the system works has three main sources: looking at the logs (SecondLife.log), reading the forums, and talking to older residents (who sometimes have exchanged ideas with Jeff, Ben, or even Philip himself).
Philip and Cory themselves have written a technical paper on the technology used. This is excellent reference material!
Most things are “obvious” for someone with some experience managing Internet server farms. This is not “magic” or “reverse-engineering” or “hacking into machines” to see how they work from the inside. Like a doctor can tell you why you’re ill by just feeling your stomach and asking some questions, a system administrator can make a pretty good guess at how things were implemented by just seeing how they work (or better even, how they fail to work). The more experienced you are, the more different solutions to the same problem you know, and the easier it is to see which one was used. It’s never easy to do something radically new from scratch when you know about “best ways” to deal with some issue. Why reinvent the wheel when someone already has done the crude work?
Since I have no way to confirm most of my assumptions, they may be wrong. Don’t take them literally. Yes, I have been wrong several times in my life. I usually just don’t admit it in public (RL or otherwise) because I like to be considered a stubborn, know-it-all loudmouth. 🙂
And finally, my natural curiosity impels me to try to understand how things work. I hate things I don’t understand, and am stubborn enough (I prefer the word “persistent”) to waste my time thinking how things could have been done.
Many, many thanks to Eggy Lippmann who reviewed this article and told me where I was completely off-track. This “version” already reflects some of his competent technical review.
Of course, I also try to update this article every so often, as new things are “discovered”, but you may rest assured that I’ll probably never get it absolutely right! 🙂
Now a warning:
This will be filled with geekspeak (yeah, Gwyn talks the lingo, too). So this will probably be BORING to most of you.
Why is this article here, then?
99.5% of the residents think LL is a crappy company which never cares for the players’ needs. Everybody knows about the bugs in SL, and some of them have been here since beta (18 months ago or so) and have never been fixed. New releases just introduce new features with more new bugs. Actually, people still keep playing the game and pay a handful of dollars every month to LL because they have so much fun – and not because the game is any good. The interface stinks. There is always lag. Sims crash a lot. People keep losing objects in inventory. And there are thousands of posts every day about feature requests and bug reports, and the Lindens don’t care about it.
Pretty picture, isn’t it? But this is what we hear every day…
The remaining 0.5% are too fascinated by the game (as a sociological, economical or technological experiment) to gripe about it. These things are “annoying”, but they are a price to pay when you become a pioneer exploring the New Frontier. And, of course, we know why things are so hard to fix.
The first thing you need to understand is that Linden Lab is NOT Microsoft with its 5000 software coders (or are there more?). They have, at most, 15 programmers, and perhaps 3 or 4 system administrators. Even taking into account that “core functionality” was developed by Philip a loooong time before beta was ever launched, that’s not much. And these guys are developing a “game” on 3 different platforms with a new software release every week or so, fixing about a dozen bugs, launching about a dozen new features, and testing them all before they release it to the public. You may not appreciate that much, since Microsoft does about the same. Yeah. Right. But they have 5000 guys working just on that. Oh, so you’re going to tell me that Linux does even better, and there is not even a company behind it? But Linux has, like, ONE MILLION programmers working on it!
So let’s keep things in perspective, shall we? Linden Lab is a small company, running 1000+ servers with a game with 100.000 angry players who complain every day with thousands of requests which are never fulfilled, with an understaffed tech team, and employees not only working overtime, but gladly replying in the forums, being online off-duty (even on weekends and holidays) helping people out, and spending as much time as they can to get things running smoothly. Microsoft doesn’t have 1% of that attitude and is the biggest software company in the world 🙂
You read about the many good things about Second Life in their Press Releases, but their marketing is targeted to a broad audience. If they would do a press release for a geeky audience, here is what they would write about it:
LL is a hosting provider (not a content provider; content is designed by residents, not employees). Their core business is setting up a technology which enables creative people to provide content and entertainment. So they are not really competitors to other MMORPGs, but rather to hosting providers. Think about Second Life as a 3D combination of WWW with chat rooms with entertainment/games.
Hosting providers sell “Web space” by megabytes of consumed hard disk (or by bandwidth). LL sells “SL space” as “land” with “prim allotments”. You pay monthly fees to have some megabytes of Web space, and in SL you pay for tier. Same concept, different words. Hosting providers can give you “virtual servers” or even whole machines, where you get some tools to manage it, but you’re pretty well on your own. LL gives you the opportunity to rent a whole sim. Actually, prices are very, very similar. Finally, some companies also sell software to manage machines (eg. operating system and some remote administration tools). In the future, LL will give you a license to run your own sim, or even open source the code.
So this means that the “grid” (how we name the Matrix in Second Life) has to be segmented into “manageable bits” so that LL may rent it (that’s where their income comes from, and NOT from players joining the game or paying for the Premium account…). As we will see, this will seriously condition the whole system architecture, and you’ll also understand better why LL used an approach which is radically different from their competitors (I’m not saying it was a good choice. It’s different, because LL is not a “gaming company”, but a “3D content hosting provider”. It’s not the same!).
The “game client” – the piece of software you install on your PC, Mac or in Linux early 2006 – is just a 3D browser and nothing more. It will work with every sim it connects to (that doesn’t amaze you? Think about how your web browser has problems connecting to different sites…). When you get the client application, “almost nothing is stored there”. Even your avatar is not stored on your computer. This is precisely the same approach used by Web browsers (ie. when you download a Web browser, there are no “pages” or “default images” or whatever which come with it… you connect to web sites, you download pages, and they get stored in a cache for later usage).
Actually, according to Eggy Lippmann, a few things are stored in your client: your agent’s inventory (just pointers to UUIDs), the “default” ground textures, default “system” animations and sounds, etc. They simply pale in comparison to what is available online (around 60 MBytes, compared to several TB online).
Why is this so? Why, unlike other games, do we have to download everything? Wouldn’t it be more effective to have already “major” textures/objects ready on our client application instead of getting it from the Internet?
The answer is: no, it’s impossible. With almost 2000 sims (one “sim” [short for simulator] is just an area, and it used to run on a single server; nowadays, the newest servers run 4 sims each), and 100.000 residents, content is several terabytes of data. Not 100 or 200 MBytes like on “conventional” MMORPGs. And the worst part is, what are the “major textures”? Every day the fashionable places (or clothes!!) change, and there is a different set of “major textures” – just like on the Web there are trends which define the “hottest” sites at the moment. Now think what would happen when SL becomes a game with a few million users and some petabytes of data! (1 petabyte = 1024 terabytes = 1024 * 1024 gigabytes)
So the reverse approach – big, fat application clients with lots of content built-in – is great for closed and controlled games, where the company has set up an environment for the players to explore, but it simply doesn’t work for the kind of company and “game” SL is.
Streaming technology fundamentals
The core of SL is streaming. It’s no wonder, since Philip Rosedale (the persona behind Philip Linden) is the guy who “wrote the book” on streaming media 🙂 We associate streaming with either video or audio, but LL was clever – they did it with textures and other stuff.
Basically they super-compress textures and deliver them with a technology not unlike RealPlayer to the application client. Just like Real (or QuickTime, or MS, or MP3 streaming…), the trick here is being able to have a single server streaming the same stuff to several “browsers” at the same time, and deal with things like packet loss, textures arriving out of order, and similar stuff. What they do is almost uncanny – you can only appreciate it with an awful connection (like I have) which usually drops packets like crazy anyway, and has fluctuating ping times. I think I actually have the worst possible connection to SecondLife – it has lag, it has noise, it has jitter, and, worse, it has a dynamic allocation of packet priority for heavy usage (this means my ISP simply drops all my packets first and lets the kiddies download their pirated DVDs just because they don’t do it as often as I play SL 🙂 ). Still, everything works smoothly in SL, most of the time!
Now, for those of you who know a little bit about streaming media, you know you usually have an “emitter” (where the media stream is encoded) and a “broadcaster” (where the stream is broadcast to the users’ application client). Most also allow for “repeaters” or “reflectors” – they are a type of broadcaster that allows for additional streaming (they connect to a broadcaster “like a client” but also allow clients to connect to them and get a “streaming feed” from them). This is, for instance, the way Akamai does it – they try to put “repeaters/reflectors” all around the world to provide the fastest streaming content to end users.
In SL, the broadcasters are the sims we connect to. So, besides holding all the terrain data (and lots more stuff that we will see shortly), the sims are also responsible for feeding us with streams of textures.
Where is the “emitter” then? Actually, the sims emit and broadcast at the same time. When a sim boots up, it talks to the asset server, looks up what it needs to load (i.e., which objects/textures are rezzed in at that particular sim), and broadcasts it to all users on that sim (and also on the neighbouring 4 sims, of course).
There are no “repeaters” or “reflectors” in SL at the moment. These could be a way to provide better service on the East Coast, Europe or Australia/Asia – just set up a few repeaters there.
It takes two to tango, but…
… things in SL get confusing very fast. Where are objects stored? Actually in several different places – according to the white papers, each object is initially stored at the place it was created, and moved (or better: copied and stored at local caches) across sims when needed. The asset server seems to have “pointers” to every object, texture, animation, sound, etc. Overall composition and current “physical location” – ie. the vertices, how prims attach to each other, which textures go onto each face, pointers to owner, group, and, finally, where it currently is stored – are actually in MySQL databases, stored inside each and every sim. Inventory has a link to the object when not rezzed in. And sims also have the position of the object when it is rezzed in (and its textures, too). Universal IDs (UUID) are keys for all those components. Think about the “asset server” as being similar to the DNS servers on the Internet – they have just pointers to things, not the things themselves.
To do this efficiently, every sim has its own “local copy” of the asset server (apparently, according to Eggy, they have squid acting as cache server, so a request for a UUID goes to the asset server, and the reply is cached locally). Of course, when you upload things, or move them across inventories, you have to make sure the asset server is properly updated.
This is also the reason why new releases usually are a bit slow at first – Eggy says that the cache is cleaned on each grid server for a new release. This means that Squid will need some time to fill up the caches!
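To make the “pointers plus local cache” idea a bit more concrete, here is a minimal sketch. Everything in it is an assumption made for illustration: the class names `AssetServer` and `SimCache`, the location string and the cache size are invented, and the real system reportedly uses Squid rather than anything like this toy code.

```python
import uuid
from collections import OrderedDict


class AssetServer:
    """Hypothetical central index: maps an asset UUID to a pointer (where the data lives)."""

    def __init__(self):
        self._index = {}
        self.lookups = 0  # counts how often the central server is actually hit

    def register(self, location):
        key = uuid.uuid4()
        self._index[key] = location
        return key

    def resolve(self, key):
        self.lookups += 1
        return self._index[key]


class SimCache:
    """Hypothetical per-sim cache sitting in front of the asset server."""

    def __init__(self, asset_server, capacity=1024):
        self._server = asset_server
        self._capacity = capacity
        self._entries = OrderedDict()

    def resolve(self, key):
        if key in self._entries:
            self._entries.move_to_end(key)  # mark as most recently used
            return self._entries[key]
        location = self._server.resolve(key)  # cache miss: ask the asset server
        self._entries[key] = location
        if len(self._entries) > self._capacity:
            self._entries.popitem(last=False)  # evict the least recently used entry
        return location

    def clear(self):
        """Empty the cache, as is said to happen on each new release."""
        self._entries.clear()


server = AssetServer()
texture = server.register("sim-042/textures/brick")  # made-up location string
cache = SimCache(server)
cache.resolve(texture)
cache.resolve(texture)
print(server.lookups)  # 1: the second request never reached the asset server
```

Calling `cache.clear()` and resolving the same key again sends the request back to the asset server, which is the slow “warming up” effect described above for new releases.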
(A question I still have is about intra-communication between grid servers and the “auxiliary” servers. According to the white papers, UDP streaming is only used for broadcasting objects to the users – ie., prims, textures, etc. The sims run Apache and Squid, so I’d guess it’s plain XML over HTTP in queries between them. This tends to agree with a paper where the people at Linden Lab thought they could do everything over an HTTP connection but gave it up because of the overhead needed to handle thousands of TCP/IP connections).
So the hardest part of SL is when you buy an object from a vendor (the vendor has a pointer to the object in inventory… then the object has to be rezzed in that sim… but it goes into the user’s directory instead and is not really displayed), attach it (suddenly the object is rezzed in), you move to a neighbouring sim (no problem here, since every avatar is listening to streaming from the 4 neighbouring sims anyway and can get the necessary data directly from them) and then use the teleporter! Oops… now you have to tell the asset server that you’re bringing all that stuff with you, inform the sim (and the neighbouring sims) that these objects are suddenly not here any more, contact the new region, ask it to get the objects’ location and textures from the original region, update inventory and asset server with the new locations, display your avatar… and hope that you got all the objects correctly positioned since you left the region you were in!
No wonder people keep losing stuff from inventory when teleporting! (fortunately, this has progressively improved over time)
And don’t forget that your avatar “comes with you”, too. It’s just a 10.000-polygon object (or thereabouts… just to have an idea, a prim cube has 6 faces, so 6 polygons if SL uses squares, or 12 polygons if it uses triangles. Older graphical engines only used triangles), highly difficult to draw correctly, and as you know, SL deals much better with avatars than with high-prim objects!
The trick, according to Eggy Lippmann, is that we actually are all “Children of Ruth”. What does this mean?
Well, according to SL Myth and Legend, Ruth supposedly “really existed” and was one of the first alpha testers. At that time, avatars were built with prims, as everything else, and later avatars were converted to meshes (which we use nowadays). How exactly the “Ruth” avatar became the “first” avatar, is not clear to our myth recorders 🙂 But in a sense she’s our “mother” – that is, ALL avatars start being Ruth, and then settings are applied on top of Ruth’s AV, so that they look like us! Weird idea, huh? Actually, this is what makes SL render avatars superquickly – the application client is something like a “superfast Ruth renderer”. It just gets a small string of settings (200 settings which you set using the Appearance interface – just a few bytes!) from the asset server, and – ta-da! Instant avatars! You just need a few textures for the clothes and skins, they are “baked” (i.e. all clothes layers and the skin are sort of “fused together” in a single texture, and sent to you), and the avatar is done (thanks to Eggy for that piece of info, I really hadn’t figured that out for myself, and I didn’t understand how SL could render so many prims so fast – after all, one sim can only hold 15.000 prims or so, and we know how long it takes to render all of those…)!
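To get a feeling for how small such a “string of settings” can be, here is a toy encoding. It assumes one byte per setting and a neutral default of 128 for Ruth; both are guesses made up for this example (only the figure of 200 settings comes from the text), and the function names are hypothetical.

```python
NUM_SETTINGS = 200  # figure quoted in the text
RUTH_DEFAULT = 128  # hypothetical neutral value for every setting


def encode_appearance(overrides):
    """Start from Ruth's defaults, apply the changed settings, pack one byte each."""
    values = [RUTH_DEFAULT] * NUM_SETTINGS
    for index, value in overrides.items():
        if not 0 <= value <= 255:
            raise ValueError(f"setting {index} out of range: {value}")
        values[index] = value
    return bytes(values)


def decode_appearance(payload):
    """Return only the settings that differ from Ruth."""
    return {i: v for i, v in enumerate(payload) if v != RUTH_DEFAULT}


me = {0: 200, 17: 40, 199: 255}  # made-up tweaks to three settings
payload = encode_appearance(me)
print(len(payload))  # 200 bytes for the whole shape under these assumptions
```

Under these assumptions the whole shape fits in 200 bytes, which is tiny next to even a single texture; the client does the expensive part by deforming the base mesh locally.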
The Sim Lottery
When a sim boots up, it’s “empty”, and it loads all the data it needs from a central server. This is a very intelligent approach, because it enables the Lindens to concentrate on a failing machine and have a replacement online very fast (the central server is also an excellent place to make backups from). If a sim fails, you just tell this central server that a new, empty machine needs the sim data, and afterwards you can do maintenance on that sim at your leisure (probably means rebooting and just formatting the machine from scratch with a default configuration). In the meantime, people will already be connecting to the “new sim”. There will be just a “glitch in the grid” for the avatars who were on the failing sim when it was “switched” to another machine – they will gracefully time out and try again by connecting to the new sim. With luck, your application client won’t even crash.
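The recovery procedure described above can be sketched in a few lines. This is only an illustration under loud assumptions: the function name `fail_over`, the host names, the region name and the dictionary layout are all made up, and nothing here is taken from Linden Lab's actual software.

```python
def fail_over(region, assignments, spares, snapshots):
    """Move a region off its failing host onto an empty spare machine.

    assignments: region name -> host currently running it
    spares:      list of empty machines waiting for work
    snapshots:   region name -> sim data kept on the central server
    All three structures are hypothetical stand-ins for illustration.
    """
    if not spares:
        raise RuntimeError("no spare machine available")
    failed_host = assignments[region]
    new_host = spares.pop(0)
    # The empty machine loads a copy of the sim data from the central server.
    state = dict(snapshots[region])
    assignments[region] = new_host
    return failed_host, new_host, state


assignments = {"RegionA": "host-17"}
spares = ["host-99"]
snapshots = {"RegionA": {"prims": 1234, "terrain": "regiona.raw"}}

failed, replacement, state = fail_over("RegionA", assignments, spares, snapshots)
print(f"{failed} can now be reformatted; RegionA runs on {replacement}")
```

The point of the design is that the failing machine holds nothing irreplaceable: once the region is reassigned, the old host can be wiped without any hurry.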
There must be hundreds of different tricks beyond this. A sim, on the other hand, very very rarely loses its objects – the worst that happens is having “lost objects” (ie. objects stored there with no entries on the asset server). Terrain data is 512×512 “raw” image data (with special channels for indicating land, water, parcels, etc. – the information is available elsewhere, you can start with the Bad Geometry Wiki for a reference), so you can easily back that up in a few bytes. And 15.000 prims is not really much. If we assume they need to have 1 item for each face – and taking into account positioning data, which is done with floating-point numbers, and trimming on the prims – the worst-case scenario seems to be a cut torus/ring with 9 faces. So, say, 20 to 30 items per face, 100.000 faces per server, each item won’t take much more than 8 bytes (probably just 4), so 20-30 MBytes to save the whole sim’s information. It’s really not much. Probably they update it dynamically: if they’re using MySQL on their Linux boxes, you can do remote synchronization – takes about 5 minutes to set up, and you get “almost instantaneous” sync’ing. Actually I would do it in a different way – have 5 MySQL databases for each sim, 1 storing objects on the current region, the remaining 4 sync’ing with the neighbouring sims. This means you’d have 4-way backup on most sims (except for islands), which is more than enough for redundancy purposes. In practice, LL does not do it this way – they just keep open connections from each neighbouring sim to your avatar (or better said, to your agent session). But they still do a remarkably good job of maintaining those databases consistent. My theory is that on the mainland, the chances of dropping an object and “losing it” are virtually nil. The worst that can happen is that you can’t update it on the asset server…
The big, big problem is backing up all those textures… if they were distributed evenly among all servers, it would mean that each has around 250 MBytes of texture data. You understand now why Linden Lab is so reluctant about “restoring backup data” on a sim, just to get your precious, lost object back.
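The back-of-the-envelope estimate above is easy to redo with a few lines of arithmetic. The function name is made up, megabytes are counted as 10^6 bytes, and the inputs are just the guesses from the text (items per face, bytes per item, faces per sim), so the result is a rough order of magnitude and nothing more.

```python
FACES_ROUNDED = 100_000       # rounded figure used in the text
FACES_WORST_CASE = 15_000 * 9  # 15,000 prims, every one a 9-face cut torus


def sim_state_bytes(items_per_face, bytes_per_item, faces=FACES_ROUNDED):
    """Estimated size of one sim's object data under the text's assumptions."""
    return faces * items_per_face * bytes_per_item


for faces in (FACES_ROUNDED, FACES_WORST_CASE):
    low = sim_state_bytes(20, 8, faces) / 1e6
    high = sim_state_bytes(30, 8, faces) / 1e6
    print(f"{faces} faces: {low:.1f} to {high:.1f} MB")
```

With 8-byte items this gives 16 to 24 MB for 100,000 faces and about 22 to 32 MB for the absolute worst case, so the “20-30 MBytes” ballpark holds; with 4-byte items everything halves.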
This random allocation of sims is called by oldtimers “the sim lottery”, because you never know when it will be your turn to get an “older” machine. Newer machines use hyperthreading CPUs, and they run 4 sims at the same time (using mostly virtual machines) but there are still a few old machines around (according to the Bad Geometry Wiki, they’re 2.6 and 2.8 GHz Pentium Xeons with 512 MB RAM). Since sims reboot once in a while, there is a fair usage of all machines, both old and new. So you won’t always be on an “old” machine even if you are in the oldest parts of the mainland!
The user server
Another key server is the user server. What exactly it holds I’m still not sure, but certainly unified login is handled on that machine. You can test it, it uses some XML-based authentication scheme (probably OpenLDAP…), and it’s used for the web servers, too. Information about everything related to yourself and your avatar is here – name, profile data about your avatar, probably the groups you’ve joined, the land you own, etc.
This is the server that you connect to when you start the game, of course, but it also does a lot of work. It has to be checked to know that you’re online, for instance. Or when you fly over land, it has to be checked to see if you belong to the proper group.
When this server crashes, something interesting happens. Of course, you can’t login any more, but people inworld will still be able to do some things. If you have an active IM session, or chat, you’ll be able to continue it – but your name will be “blank” (since everything in SL uses UUID as keys, stuff will still work using them, even if names aren’t available). This has been happening more and more rarely with the latest updates. I never tried to “break permissions” when this happens, but it should be possible. However, the user server is pretty stable, because, unlike the grid sims and the asset server, it doesn’t provide much information, it’s mostly read only. Grid sims and the asset server are constantly exchanging information back and forth, and that’s why they are much more susceptible to failure.
More work for poor sims…
Besides streaming, sims really do lots of stuff… they track all objects’ positions, they run Havok to detect collisions, they do the physics, run all scripts, give hints about lighting, and even do the weather in their spare time. That’s why the Lindens have to put some limits on all the stuff – maximum number of avatars per sim, maximum number of prims, maximum memory a script may use (16k), etc.
There is some stuff done client-side, too – handling some types of rotations and all the particle stuff. This means that a rotating object will show differently when seen by different people!
In any case… this means that the application client does not do much stuff except rendering. Since content is so dynamic – it literally can change while you turn your head! – LL seemed to prefer a “quick & dirty” approach: feed all data on objects and textures to the client and let it sort out all by itself how things should be rendered (and keep stuff in the cache in order not to ask for objects again). This explains why the rendering is quite crazy, with things appearing randomly at the client. There is some logic in it – apparently, scripted objects are drawn first, then physical objects (ie. no ghosts, no transparencies, no alphas, etc) next, and pretty much all the rest is rendered as soon as it arrives. Since broadcasting textures over UDP is done randomly, they can arrive out of order, and textures are stored in differently compressed versions, so that you may render them with increased quality when they arrive, but start with low-res textures if you have the overall information (according to the Lindens, JPEG2000 is an excellent format for this kind of behaviour). Things are fuzzy at the start but will increase in quality very quickly (or not… if the sim is too slow).
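The “fuzzy first, sharper later” behaviour can be illustrated with a toy model. It assumes that a texture is split into numbered quality layers and that the client can draw as many layers as it holds in an unbroken run starting from the coarsest one; the class name and the layering scheme are inventions for this sketch, not the actual wire format.

```python
class ProgressiveTexture:
    """Toy model of a texture whose quality layers arrive in any order."""

    def __init__(self, total_layers):
        self.total_layers = total_layers
        self._received = set()

    def on_packet(self, layer):
        """Record a layer; duplicates and out-of-order arrival are harmless."""
        if not 0 <= layer < self.total_layers:
            raise ValueError(f"unknown layer {layer}")
        self._received.add(layer)

    def renderable_layers(self):
        """Layers usable right now: the unbroken run starting at layer 0."""
        count = 0
        while count in self._received:
            count += 1
        return count


tex = ProgressiveTexture(total_layers=4)
history = []
for layer in [2, 0, 0, 1]:  # layer 2 arrives early, layer 0 arrives twice
    tex.on_packet(layer)
    history.append(tex.renderable_layers())
print(history)  # [0, 1, 1, 3]
```

Nothing can be drawn until the coarsest layer shows up, and the early arrival of layer 2 only pays off once layer 1 fills the gap. That is one plausible reason why textures seem to “snap” into focus in steps.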
Chat and IM
I’m still not sure about how these work. IM is pretty easy, it’s just a two-way channel (or several-way channel) between a group of people, and it doesn’t have any restrictions at all (ie. no need to consult the sim, or any other thing, besides the user server to see if the users are all online or not). I would think that the user server handles IM as well. According to the white papers, there is no direct client-to-client communication.
On the other hand, chat is much more difficult to do. First, it is subject to “listening distance”, so it has to be tied into the sim – the sim needs to know how far your words go, figure out who is listening in that range, and display words accordingly. That is why we get chat lag and not IM lag – a busy sim will not be able to process chat as fast as desired, and that’s why you get messages out of order (or even lose some). There is no easy way to do this: everything which needs to track sim position needs to be processed by a sim…
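Here is roughly what the sim has to do for every single line of chat. The 20 m range, the avatar names' positions and the function name are assumptions for the sake of the example; the text only says that a “listening distance” exists.

```python
import math

CHAT_RANGE_M = 20.0  # assumed listening distance; the article gives no number


def listeners_in_range(speaker, positions, chat_range=CHAT_RANGE_M):
    """Return the avatars close enough to the speaker to hear normal chat."""
    origin = positions[speaker]
    heard = []
    for name, position in positions.items():
        if name == speaker:
            continue
        if math.dist(origin, position) <= chat_range:
            heard.append(name)
    return sorted(heard)


# Made-up positions (x, y, z) in metres inside one region.
positions = {
    "Gwyn": (128.0, 128.0, 25.0),
    "Eggy": (130.0, 135.0, 25.0),
    "Philip": (10.0, 10.0, 25.0),
}
print(listeners_in_range("Gwyn", positions))  # ['Eggy']
```

The check is cheap for one message, but it needs up-to-date positions for everybody, which only the sim has. That is why chat slows down with the sim while IM does not.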
Bottlenecks and improving performance
If you have read this far, you will have noticed that this structure of the SL grid pretty much resembles how the Internet works. You have tons of individual and private Web servers, and DNS which is the “glue” which enables addresses to be converted to IPs. There are “accessory” servers like databases, email, chat, etc. but they don’t interfere with Web sites.
A similar construct is used by LL, since they want people to have their “private” sims – meaning that one sim has to handle everything pretty independently. However, inter-sim communication is easily done under this model.
Common login/user information is handled on a separate server, which makes sense – you use the same login information on the whole grid.
The problem is mainly with the asset server. You see, if one sim fails, you affect at most the 25-35 people that are on the sim at that time. If the asset server fails (or is slow), it affects everybody. Actually, the asset server is the major bottleneck on SL! Recently, they have added several asset servers running in tandem, but often they’re simply too slow.
I think that LL thought that they could minimize impact on the asset server because the streaming would be done by the sims, who would not need to do a lot of lookups on the asset server – since sims only need part of the whole asset database (theory: you have 1000 sims, so each sim just has at most 1/1000th of the database, on average). This would be probably reasonable if there weren’t many updates on the asset server! But there are attachments, inventories to update, permissions that change, scripts that are edited, and all kinds of stuff that are updated at the asset server.
I think that to improve redundancy, every time something gets updated, it’s written on the asset server, but cached almost immediately on the sim’s local copy. This is why you don’t lose objects so easily. If you delete something by accident, it’s on the asset server marked under Trash. If the asset server dies, you’ll have it either on the sim (and next time the asset server goes up again, the sim will try to sync the data with it) or at lost & found (basically, the sim has an object from you which isn’t rezzed in, which shouldn’t happen – it belongs somewhere in your inventory, but only the asset server knows where).
So the grid is pretty stable in terms of content persistence, even if glitches/lag occur very frequently. Again, this mirrors the way the WWW works – a stateless protocol, where you can connect “partially” and still get some content (say, text but no images, or some images but not all of them, etc.).
However, the bottleneck created by the asset server is almost unbearable – there are more and more objects and textures around, and it takes lots of time just to update it. When I started the game, dropping things from my inventory into an object was instantaneous (we had 10,000 residents then). Saving a script took a few seconds. Nowadays, with around 100,000 residents, the same operation takes between 8 and 30 seconds on average, and sometimes even several minutes – or times out. The problem is that there is just one centralized asset server cluster for 1000 sims! If it weren’t like that, you couldn’t bring your objects from one sim into the next…
How can you deal with this and improve overall performance?
Some older players think that “the Lindens are doing everything wrong”: there should be a redundant array of grid computers, dynamically allocating CPU timeslots as needed. Under the current model, some sims are almost empty doing nothing, while others are overcrowded. Under a redundant grid of dynamic allocation (hint: read about the Beowulf project), also called a cluster, or, if you will, a loosely connected supercomputer, things could potentially run much more smoothly. You basically just have one big computer running everything – one system to manage, one system to do backup, one system to look at to see if something is going wrong. No bottlenecks. If one of the 1000 computers goes down, well that’s too bad, it just means you lose 0.1% of the performance! That’s perfectly acceptable! And there are no “bottlenecks” since applications are uniformly distributed among all systems.
Also, you could perfectly well have all online users at the same region, if you wished, or one zillion objects just for one building in one region (assuming, of course, that your graphics card could render so many polygons fast enough!). There would be absolutely no problem with that model – that’s how supercomputers are built. Better still, teleporting would be as simple as just changing the avatar’s position. The whole concept of “sims” or “one region per sim” would not make any sense, you could design a world as large as you wished – 40.000 km x 40.000 km, if you wanted. To increase performance if things are getting laggy, you just need to add a few more computers and more bandwidth. Better: instead of the “sim lottery”, you could use all kinds of crazy hardware to hook up on the cluster. Things like Beowulf clusters handle load balancing of application CPU & memory consumption – so you could have a few 486s running there, the CPU power would NOT be wasted.
Of course, this also means rewriting SL from scratch for such a type of system. The current software base simply won’t do it. The best they can do is running 4 sims at a time on a single hardware server – that way, CPUs are shared, and unless all four sims are very busy simultaneously, you get better resource allocation among them.
But under the current model, if one sim fails, you drop 25-50 people out. If the asset server cluster fails (and yes, even with the redundancy added, this happens), no one can play the game. If the user server fails, no one can login (still happens quite often). And so on. Even if you have “empty” sims, you cannot use their CPU cycles to help “heavily saturated” sims. It seems to be a much worse model than the alternative!
Actually, the “alternative” is used by some of the competitors (they usually go the hard way about it, writing their own cluster software, instead of using open source solutions, but the principle is the same). They have much less content to worry about, so the easiest way is to replicate all data among identical machines, and split users among several machines. That’s why they usually get much better performance than we do in SL. World of Warcraft claims to have 4 million users, 1 million of them simultaneously online, and just a hundred servers to handle them all. There is some problem in getting in touch with your friends logged in on a different machine, but there are clever tricks to deal with that.
As good as this model sounds, it simply cannot be used by Second Life. Remember where we started our discussion – Linden Lab is a hosting provider, not a content provider. To be able to “host” things easily, they have to have clearly defined “borders”. One region per sim sounds like a very good compromise. As to the asset server cluster… it has to be universal, and it has to be read/write, and deal with objects that cross regions. And finally, SL simply has too much dynamic content. You can’t pack it all on the same machine and expect it to work.
Still, you could have “virtual regions” on the game… ie. “define” that this parcel of land is used by company X wishing to rent it. Why give them a “physical” machine?
The question is, LL is once more looking towards the future… and in the future, there will be several servers around the world – probably isolated islands, or clusters of islands – hosting completely private sims, licensing the software from LL, and doing maintenance on their own sims. For this to become reality, LL needs to be able to have this region/sim concept ready. Still, I’m not sure how they’ll deal with the asset server (IM is probably easy to do and easy to replicate; user server can be cached, since user data is much more read-only than write; just the asset server is a real problem here!). The only “easy way out” is having several asset servers, and contact them using a simple hash on the keys. This means that you have to apply a simple algorithm to know which asset server you should contact – say, all keys starting from “0” to “7” go into asset server #1, the ones from “8” to “f” go into asset server #2. This is a pretty simple implementation and could be easily done with a simple patch. When logging in, the application client would just need to get the hash function and a list of asset servers available 🙂 [Gwyn’s note: rumour says that this is exactly the approach they took when moving from a single server to a cluster of servers, but there is no easy way to check this from the logs]. This is vaguely similar to the concept of DNS servers outside the US (there is just one DNS system, but each country has its own server to host its own “national” data, and you know which servers to look up if you need to know other countries’ data).
Still, as there is a “mainland” (as opposed to several disconnected islands), LL could do a mixed approach. Buy an IBM mainframe designed to host 10,000 virtual Linux servers (for the cost of about 100, and the size of a mini-fridge). Since each virtual server gets its own IP address, from the perspective of the players, there would be absolutely no difference! And on this type of hardware, the mainframe’s CPUs distribute the load much better – so if you have, say, 32 CPUs, you could have 20 or 30 to handle a big event with 300+ avatars on the same region, while the remaining CPUs would run the “empty sims”. No waste of resources there – you could even have the asset server and the remaining servers running on the IBM mainframe, too. But independent private islands would still have their own, individual machines. And sims on other parts of the world – run and maintained independently from LL – would still be able to run their own software and be connected to the grid. Actually, if someone gets into the business earnestly, they could also buy their own IBM mainframe and host a “rival continent” with the same features of CPU load balancing as LL’s mainframes. Finally, I haven’t read much about it, but since you can have virtual Linux servers running on top of Linux itself, theoretically you could run those virtual servers on a few 16-CPU machines for a fraction of the cost of an IBM mainframe.
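The “simple hash on the keys” could look something like this. It is only a sketch of the idea as described in the paragraph above: the host names and the function name are made up, and whether LL really routes requests this way is, as noted, just a rumour.

```python
import uuid

ASSET_SERVERS = ["asset-1.example", "asset-2.example"]  # hypothetical host names


def pick_asset_server(key, servers=ASSET_SERVERS):
    """Choose a server from the first hex digit of the UUID.

    The 16 possible digits are split into equal, contiguous ranges,
    so with two servers "0" to "7" go to the first and "8" to "f" to the second.
    """
    first_digit = int(key.hex[0], 16)  # 0..15
    return servers[first_digit * len(servers) // 16]


low_key = uuid.UUID("7fffffff-0000-0000-0000-000000000000")
high_key = uuid.UUID("80000000-0000-0000-0000-000000000000")
print(pick_asset_server(low_key))   # asset-1.example
print(pick_asset_server(high_key))  # asset-2.example
```

Every client and every sim that knows the server list computes the same answer without asking anybody, which is what makes the scheme so cheap. The catch is that adding a server changes the ranges, so existing assets would have to be moved around; and with only one hex digit you can never use more than 16 servers.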
There are some technical solutions around. It’s up to LL to look into them and see what they can come up with!