[Beowulf] sun grid engine on Scyld beowulf cluster

Chris Dagdigian dag at sonsorol.org
Thu Feb 17 14:12:07 PST 2005

I know Grid Engine well but not Scyld, so forgive my ignorance if I say 
something stupid. Given the level of expertise on this list, I'm quite 
certain I'm about to make a fool of myself :)

If Scyld is presenting you with a single system image (i.e. a single Linux 
server that can farm out tasks to all those nodes), then you would 
install SGE in the same way that you would install it on a big SMP box:

1. Install the SGE qmaster and scheduler on the master node
2. Install the execution host on the master node as well

You will have only one execd, on the master node, but the queue that 
runs there can be configured with N "job slots", which control how many 
jobs can run at the same time on that machine.

Try setting the number of job slots in your single SGE queue to the 
number of nodes in your cluster. This is similar to what you would do on 
a big SMP machine: a small number of queues, each supporting a decent 
job slot count.
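
A minimal sketch of the slot change, with assumptions stated up front: it 
assumes SGE 6.x, where "qconf -mattr queue slots <N> <queue>" modifies a 
single queue attribute, and it uses a hypothetical queue name (all.q) and 
node count (16). The helper names are made up for illustration. On older 
releases you would edit the queue interactively with "qconf -mq" instead.

```python
import subprocess


def slots_command(queue, node_count):
    """Build the qconf call that sets a queue's job-slot count."""
    if node_count < 1:
        raise ValueError("node_count must be at least 1")
    # qconf -mattr queue <attribute> <value> <queue> changes one attribute.
    return ["qconf", "-mattr", "queue", "slots", str(node_count), queue]


def set_slots(queue, node_count):
    """Run the command; needs SGE manager rights on the master node."""
    subprocess.run(slots_command(queue, node_count), check=True)


if __name__ == "__main__":
    # Hypothetical values: a queue named all.q on a 16-node cluster.
    # Only prints the command; call set_slots() to actually apply it.
    print(" ".join(slots_command("all.q", 16)))
```

Printing the command first makes it easy to check the queue name and slot 
count before anything is changed.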

Then submit a bunch of jobs and see if SGE causes the master node to 
fall over under load. If it does not, then Scyld is doing its thing 
behind the scenes to migrate stuff around to the other nodes.
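
The test above could be scripted roughly as follows. This is only a 
sketch: it assumes SGE 6.x (for "qsub -b y", which submits a binary 
instead of a job script), the job names and the helper function are 
hypothetical, and /bin/sleep is a placeholder payload that exercises the 
scheduling but generates no real load. Substitute a CPU-bound program of 
your own to see whether the master node actually copes.

```python
import subprocess
import sys


def build_submit_commands(job_count, payload):
    """Return one qsub command line per test job."""
    commands = []
    for i in range(job_count):
        commands.append(
            [
                "qsub", "-b", "y",         # payload is a binary, not a script
                "-N", f"slottest{i}",      # hypothetical job name
                "-o", "/dev/null", "-j", "y",  # merge stderr, discard output
            ]
            + list(payload)
        )
    return commands


if __name__ == "__main__":
    # Placeholder payload; replace with something CPU-bound for a load test.
    commands = build_submit_commands(16, ["/bin/sleep", "60"])
    for command in commands:
        if "--submit" in sys.argv[1:]:
            subprocess.run(command, check=True)
        else:
            print(" ".join(command))
```

Run it without arguments to see the qsub lines, then with --submit on the 
master node while watching qstat and the load average there.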


billk01 wrote:

> I am in the process of installing SGE on a Scyld beowulf cluster.  As
> most people are aware, the Scyld cluster runs a complete OS (Linux) only
> on the master node and the compute nodes are simply for executing.
> During the SGE install, it requires adding the compute nodes as execute
> hosts.  I do not understand how to do this given the current setup of a
> Scyld cluster since you can't "login" to the nodes to execute the
> install script.  The script does exist on an NFS shared directory
> (cluster wide).  Has anybody else run into this problem?

