Some background on the underlying technology
Let’s scroll back to the beginning. Linden Lab’s Second Life Grid® is a technology built on a set of fixed assumptions. There is a large array of central servers (nobody outside the ‘Lab knows what they are or how they’re configured). Then there are some 5,000+ servers in relatively similar configurations (servers of the same class have similar hardware specs, and they all run the same software), which, in turn, run the simulator software (sims), each region running on its own virtual machine (4–16 can run on a single server, although semi-official rumours claim that LL has been experimenting with higher CPU density and may one day run 64 openspace regions on a single server). All of this is automated: from the installation of the operating system to the region simulator, everything is set up with point-and-click configuration when someone buys a region from the Land Store.
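The packing described above is simple arithmetic, and it shows why the density experiments matter. A tiny illustrative sketch, using only the article’s rough figures (5,000+ servers, 4–16 region VMs each, and the rumoured 64-openspace configuration), not any official numbers:

```python
# Illustrative arithmetic only: grid capacity follows directly from
# how many region VMs are packed onto each physical server.
# The figures below are the article's rough numbers, not official ones.

def regions_hosted(servers: int, regions_per_server: int) -> int:
    """Total regions a fleet can run at a given packing density."""
    return servers * regions_per_server

for density in (4, 16, 64):  # 64 = the rumoured openspace experiment
    print(f"{density:>2} regions/server -> {regions_hosted(5000, density):,} regions")
```

At the low end that is 20,000 regions on 5,000 servers; at the rumoured openspace density, the same hardware could in principle host sixteen times as many.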
Things started to become confusing when Linden Lab began to move out of a single co-location facility. Suddenly, some sims were running outside the network where the central servers were. Apparently, the addresses of those servers are hard-coded in some way, so the solution was first to create a VPN between co-location facilities (a software-based solution that makes external servers “look like” they are on the same physical network), and, later, simply to run fibre between the co-location facilities, effectively placing the servers once more on the same physical network.
Some comments made by the engineers (like this explanation by FJ Linden, which followed a more thorough description of how the grid is physically implemented) revealed that some co-location facilities actually run some of the central asset servers: Dallas and Phoenix apparently have their own central servers, while Washington DC does not. It’s not clear whether they’re just clones of each other, kept in sync (as in an earlier implementation). Also, thanks to HTTP downloads of assets, implemented last week, the bulk of requests to the asset servers can now be pushed into the Amazon S3 cloud, saving precious bandwidth and giving Linden Lab an extra layer of reliability: Amazon S3 is far more stable than LL’s own setup, simply because storage is the core service Amazon provides, and cloud technology is currently the best-known way to provide almost infinite redundancy.
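The offloading trick is a standard HTTP pattern: rather than streaming the asset bytes itself, the asset server can answer with a redirect pointing the viewer at a copy stored in S3, so the transfer never touches LL’s bandwidth. A minimal sketch of that idea; the bucket name, asset id, and `asset_response` helper are all hypothetical, not Linden Lab’s actual API:

```python
# Hypothetical sketch of HTTP asset offloading to S3.
# The bucket name and response shape are illustrative assumptions.

ASSET_BUCKET = "example-asset-bucket"  # hypothetical S3 bucket

def asset_response(asset_id: str, mirrored_to_s3: bool) -> dict:
    """Return a simplified HTTP response for an asset download request."""
    if mirrored_to_s3:
        # 302: the viewer re-fetches the asset from S3, so the origin
        # asset cluster spends almost no bandwidth on the transfer.
        return {
            "status": 302,
            "headers": {
                "Location": f"https://{ASSET_BUCKET}.s3.amazonaws.com/{asset_id}"
            },
        }
    # Fallback: serve the bytes directly from the asset cluster.
    return {"status": 200, "headers": {"Content-Type": "application/octet-stream"}}

resp = asset_response("texture/1234-abcd", mirrored_to_s3=True)
print(resp["status"], resp["headers"]["Location"])
```

The reliability gain comes for free: once the `Location` header points at S3, availability of the actual bytes is Amazon’s problem, not LL’s.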
Nevertheless, what is important in this picture is that the entire Second Life Grid runs on a unified system. There is a single architecture for the whole grid. Of course, there are thousands of individual servers to keep track of, but they all work pretty much the same way.
Now contrast that with the rest of the Internet! Each system, each network, each computer, each node is different. Servers and active network devices come in all sorts of possible configurations. From smartphones to high-end routers, from home-run Web servers on old hardware to complex cluster solutions, massively parallel supercomputers, and cloud computing, each system is uniquely managed and each architecture is different. Google might have more servers than Amazon or Microsoft, and although Google’s specs are the same across its co-location facilities around the world, its configuration has nothing to do with what its competitors run, and vice versa. There is no “standard” way of connecting servers and networks to the Internet at large, except for one thing: they all speak the same protocols.
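That last point can be made concrete in a few lines: an HTTP/1.1 request is just standardised text on the wire, which is why a smartphone, a supercomputer, and everything in between can talk to any server regardless of hardware or configuration. A minimal sketch (example.com is a placeholder host):

```python
# The "same protocols" point in miniature: building the raw bytes of a
# minimal HTTP/1.1 GET request. Any conforming server, on any hardware,
# will understand exactly these bytes.

def http_get(host: str, path: str = "/") -> bytes:
    """Build the raw bytes of a minimal HTTP/1.1 GET request."""
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {host}",
        "Connection: close",
        "",  # blank line terminates the header block
        "",
    ]
    return "\r\n".join(lines).encode("ascii")

print(http_get("example.com").decode())
```

The protocol, not the machine, is the contract: nothing in those bytes reveals, or depends on, what is on either end of the connection.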
The difficulty Linden Lab had when they started moving servers to a different co-location facility was that their architecture was never planned to be anything but uniform. Merely having “two locations” introduced a new layer of complexity that had never been foreseen. Sims, although pretty much independent from each other, were designed to work on the same network and connect to the same central servers. Everything was designed from the ground up with one grid in mind.