Thursday, March 4, 2010

14 of 52 - Why Second Life eventually will fail...

... (and why OpenSim won't make it either).

However much I love Second Life as a resident, with its unique possibilities for developing one's creativity and meeting people from all over the world, as an IT professional this world makes me want to cry.

The symptoms of the basic architectural problems are evident everywhere: lag, inventory breakdowns, sim crossing disasters and object delivery failures. But some of these are just that: symptoms. The real problem is how Second Life is founded: it is sim-centric.

Second Life really is a one-sim game. It's quite evident that when Linden Lab first tried out their idea, they made a server capable of running one sim, and one sim only. When they had done that, they started adding kludges of code to handle adjacent sims.

There is code to handle situations when an avatar looks into another sim. There is more code to handle avatar crossings. And code to handle vehicle and object crossings. All of that code is based upon the need for several servers to transfer work from one to another, and for your viewer to connect to several servers. When you cross a sim border, your viewer actually changes its main connection to a different server. If that new server is a bit busy just then, it may delay answering your call for several seconds. The result is that your avatar drifts, something we have all experienced. If the delay is large, the viewer will lose its sync with the servers, and you will end up drifting into the ground and finally crash.

Just imagine how fast this reconnection must happen if you are driving a boat or an airplane at high speed! The strange thing is not the catastrophes that happen to you, but the fact that it works at all:-)
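To put rough numbers on that, here is a back-of-the-envelope sketch in Python. It assumes the viewer simply keeps moving you along at your last known speed until the new server answers, which is a simplification and not a description of the actual viewer code. The speeds and the 2-second delay are made-up illustration values; the only real figure is the 256 m edge length of a sim.

```python
SIM_EDGE_M = 256.0  # a Second Life region is 256 m on each side


def drift_distance(speed_m_s: float, handoff_delay_s: float) -> float:
    """Distance covered on extrapolated motion while the new server has not yet answered."""
    return speed_m_s * handoff_delay_s


def seconds_between_crossings(speed_m_s: float) -> float:
    """Time to traverse one sim in a straight line, edge to edge."""
    return SIM_EDGE_M / speed_m_s


# Hypothetical speeds (m/s) and a hypothetical 2-second handoff delay.
HANDOFF_DELAY_S = 2.0
for vehicle, speed in [("walking", 3.0), ("boat", 15.0), ("airplane", 40.0)]:
    print(
        f"{vehicle}: drifts {drift_distance(speed, HANDOFF_DELAY_S):.0f} m per crossing, "
        f"new crossing every {seconds_between_crossings(speed):.1f} s"
    )
```

Under these assumptions, the faster you go, the further you drift on each handoff and the less time the servers have before the next one is due.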

So, sim crossings are one major problem with this architecture. The second problem is that of scaling.

Second Life needs one CPU core for each full sim. This one core is dedicated to running that sim, even when there is no one there to run it for. So, the architecture scales with land, making land tier expensive (because it has to pay for idle servers) and wasting processor power that could be used elsewhere in busy places.

A much more scalable setup would be to have avatar and scripted-object servers. One such server would handle a set of avatars and their objects, so you could scale with concurrent logins instead of land. A sim would only be a set of information, accessed by any process needing it.
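The difference between the two ways of scaling can be sketched with a little arithmetic. This is only an illustration: the grid size, the number of concurrent logins and the number of avatars one core could serve are all hypothetical figures, and the sketch ignores the cost of sharing sim data between processes.

```python
import math


def cores_sim_centric(num_sims: int) -> int:
    """One dedicated core per full sim, whether anyone is there or not."""
    return num_sims


def cores_avatar_centric(concurrent_avatars: int, avatars_per_core: int) -> int:
    """Cores needed when each core serves a fixed number of logged-in avatars."""
    return math.ceil(concurrent_avatars / avatars_per_core)


# Hypothetical figures, for illustration only.
NUM_SIMS = 20_000
CONCURRENT_AVATARS = 50_000
AVATARS_PER_CORE = 40

print("sim-centric cores:   ", cores_sim_centric(NUM_SIMS))
print("avatar-centric cores:", cores_avatar_centric(CONCURRENT_AVATARS, AVATARS_PER_CORE))
```

With these made-up numbers the sim-centric grid needs a core for every sim, while the avatar-centric one needs cores only in proportion to who is actually logged in.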

There are of course complications in such a scheme. I admit I do not know enough to really criticize the Lab for choosing and upholding the architecture they have.

My proposed setup could have a huge impact on the economy of the world. Basically, tier could scale according to the use of your objects (traffic, running scripts) instead of the prims and land themselves. So a full sim for your home, used for an hour or two a day, would not be more expensive than a 2K parcel with a busy shop.
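As a minimal sketch of what usage-based tier could look like: the function below bills by avatar-hours (traffic) and script-hours rather than by land area. The rates, the visitor counts and the script counts are invented for illustration and have nothing to do with real Linden Lab pricing.

```python
def usage_tier(
    avatar_hours: float,
    script_hours: float,
    rate_per_avatar_hour: float = 0.05,  # hypothetical rate
    rate_per_script_hour: float = 0.01,  # hypothetical rate
) -> float:
    """Monthly tier based on use (traffic and running scripts), not on land or prims."""
    return avatar_hours * rate_per_avatar_hour + script_hours * rate_per_script_hour


DAYS = 30

# A full sim used as a home: one avatar, 2 hours a day, 5 scripts running only while in use.
home = usage_tier(avatar_hours=2 * DAYS, script_hours=5 * 2 * DAYS)

# A 2K parcel with a busy shop: 10 visitors on average for 12 hours a day, 50 scripts running around the clock.
shop = usage_tier(avatar_hours=10 * 12 * DAYS, script_hours=50 * 24 * DAYS)

print(f"home sim: {home:.2f}   busy shop: {shop:.2f}")
```

Land area does not appear in the formula at all, so under these assumptions the quiet full sim comes out far cheaper than the small but busy parcel.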

LL talks about "viewer 2.0" as the next main thing (LOL this was obviously written a few weeks ago:-) ). Instead, they should be doing "server 2.0": a complete rewrite of the architecture. But that is obviously never going to happen, simply because it's way too late. The coding cost would be very large, and the transfer project enormous, because the world has grown to the size it has.

So, driven by the unstoppable momentum of its own weight, this single world may head for a grinding halt, where it has used every resource available to it and just can't scale any more. (A somewhat interesting analogy to how our real world is doing: the momentum of its growth makes it seemingly impossible to change what we know must be changed to avoid an environmental disaster.)

Btw, OpenSim is unfortunately just a copy of the same flawed setup. Any OpenSim-based grid that grows will face the same challenges the Lab has had to resolve in the past few years.

Somewhat pessimistic, I know. But heck, it should be spring, but there are 15 cm of fresh snow in my yard (sigh).


iliveisl said...

very well written! the one sim per core dealio is very inefficient as you point out

imagine all the empty sims at any one moment, that's a lot of idle power. sure would be nice to load share that (of course, people would have a cow if that was on private estates because we think that if we pay for it, it is 100% ours even when not in use)

now if i was guaranteed as much as a core when i needed it, i would never be able to tell if some of my resources were in use when i did not need them

and that is part of being more responsible to our planet. if LL could use half the servers because the load was shared, imagine that energy saving!

well anyway, my two cents

thank you for a great blog post!

Cristopher Lefavre said...

Thanks Ener!

Yes, from an energy saving point of view LL should at least virtualize their servers, the way I think ReactionGrid is doing. As an alternative to rewriting the whole architecture, they could also reduce lag on the busy sims by giving them MORE than one core when needed. That is, IF the server code is compatible with that (threadsafe), something I suspect it's not:-)

Alas, it won't improve sim crossing disasters...

BUT, the server 1.38 announcement last week tells me that LL is finally doing some server cleanup on the performance side.