[scyld-users] SCYLD/MOSIX for Game Server SSI/Process Migration?

Scott Taylor scott at trinitygames.com
Tue Feb 8 13:36:11 PST 2005


Hi,

I'm new to the list, first post.  Thank you for allowing me to post.

We've begun the long and painful process of exploring various cluster 
solutions.

We run online game servers, so for us, our applications are:

- Many separate serial tasks
- Varying load per task
- Task/job/applications run all the time

We know we won't gain many of the benefits people generally expect from 
parallel systems.  This is OK for us.

Our primary objective is to make more efficient use of server space by load 
balancing on the fly, i.e. SSI with process migration.

Our game server daemons, like all others, start with a burst of CPU to 
set up, then are generally idle until players join, and then CPU use 
increases with player load.  This is where we hope process migration will 
help us to better utilize our server space.  We don't know in advance which 
of the hundreds of game servers we operate will actually have players, or 
when, so a single system image with process migration among many nodes 
appears ideal for us.  With separate systems and manual load balancing, we 
end up with many idle systems and some that are overloaded.

So far, the cluster solutions we've studied are:

MOSIX
SCYLD

If SCYLD can do what we want, and is affordable, it sure seems like the 
obvious choice due to apparent ease of installation and configuration.

However, a major concern is that a migration event will cause a "lag spike" 
on the game server daemon being migrated or other gaming processes on the 
system -- this is a real show stopper for game servers, and our users would 
not tolerate it.

Our processes can be compared to near real-time applications like streaming 
video or audio, and any hiccup is very noticeable.

In a paper written in Nov. 2002, Carlo Daffara raises this issue, and 
overcomes the problem by using iproute2 queue controls.  Here is an excerpt 
from the writing:

http://www.democritos.it/events/openMosix/papers/Openmosix4n.pdf
"Another problem appeared during testing: since the game server memory 
footprint is large (around 80 Mbytes each), we discovered that the migration 
of processes slowed down the remaining network activity, introducing 
significant packet latency (especially perceptible, since packets are very 
small). So, we used the linux iproute2 queue controls to establish a 
stochastic fair queuing discipline to the ethernet channels used for 
internode communications; this works by creating a set of network "bins" 
that host the individual network flows, marked using hashes generated from 
the originating and destination IP addresses and the other part of the 
traffic header. The individual bins are then emptied in round robin, thus 
prioritizing small packets over large transfer and not penalizing large 
transfers (like process migration)."
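
To make the quoted description concrete, here is a minimal Python sketch of 
the binning and round-robin idea.  It is a toy model under stated 
assumptions, not the kernel's sfq code: the bin count, the CRC32 hash, the 
flow tuples and port numbers, and the names bin_for and sfq_order are all 
hypothetical, and one packet is sent per bin per round regardless of size.

```python
from collections import deque
import zlib

NUM_BINS = 8  # hypothetical; kept small for illustration


def bin_for(flow, num_bins=NUM_BINS):
    """Map a flow tuple (src, dst, port) to a bin with a deterministic hash."""
    key = "|".join(str(part) for part in flow).encode()
    return zlib.crc32(key) % num_bins


def sfq_order(packets, num_bins=NUM_BINS):
    """Return packets in the order a round-robin pass over the bins sends them."""
    bins = [deque() for _ in range(num_bins)]
    for flow, payload in packets:
        bins[bin_for(flow, num_bins)].append((flow, payload))

    sent = []
    while any(bins):          # an empty deque is falsy
        for queue in bins:    # one packet per non-empty bin per round
            if queue:
                sent.append(queue.popleft())
    return sent


if __name__ == "__main__":
    migration = ("10.0.0.1", "10.0.0.2", 4660)  # hypothetical bulk transfer
    game = ("10.0.0.1", "10.0.0.3", 27015)      # hypothetical game traffic
    # 50 migration packets are queued ahead of a single game update.
    packets = [(migration, n) for n in range(50)] + [(game, "update")]
    order = sfq_order(packets)
    print("game packet sent at position", order.index((game, "update")))
```

Unless the two flows happen to hash into the same bin, the single game 
packet leaves within the first round instead of waiting behind all 50 
migration packets.  On a real Linux node the corresponding behaviour comes 
from attaching the sfq queuing discipline with the iproute2 tc tool, which 
is what the excerpt refers to.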

So, the questions raised so far in our quest are:

- Does Scyld support process migration and load balancing like [MOSIX]?
- Will the process migration event cause a hiccup as described by Daffara?
- Does our GigE network [help to] overcome this problem?
- Is it necessary (or even possible) to use the iproute2 queue controls on 
SCYLD?

I certainly would appreciate anyone's input on any of these or other related 
issues.

This is our available test hardware:

Head/Master:
Twin Xeon 2.8, 2G, 80G SATA primary for root/boot, some big RAID for the 
'common' filesystem (tbd).

Nodes: P4 3.0/800 1G, Diskless. PXE/Gigabit NIC.

Network:
Dedicated Gig. Switch, GigE/PXE in every node.

We haven't installed any O/S yet.  I'm still trying to find out how to 
obtain SCYLD.  We are waiting for a reply to an email we sent to the 
address the site lists for finding vendors.

Thank you,
---
Scott Taylor
Network Administrator
Trinity Gaming 



