[Beowulf] sun grid engine on Scyld beowulf cluster

Reuti reuti at staff.uni-marburg.de
Wed Feb 23 00:23:07 PST 2005


Hi,

maybe this is of help:

http://noel.feld.cvut.cz/magi/sge+bproc.html

Cheers - Reuti



Quoting BillKnebel <billk01 at metrumrg.com>:

> 
> Chris,
> 
> I was able to get Grid Engine to run on the Scyld cluster by setting 
> the master (head) node as the submit, admin, and execute host. 
> Unfortunately, starting a set of jobs on the cluster results in all 
> jobs running on the head node only (if only Grid Engine commands are 
> used). Alternatively, I can integrate the Grid Engine "qsub" command 
> with some of the Scyld tools to get jobs started and then migrated (to 
> a point) across the cluster. However, I am still running into problems 
> because all of the queueing variables for Grid Engine read the head 
> node's info, and since all jobs run on the compute nodes, the head node 
> always appears to be free, which results in all jobs being started at 
> once. This is not ideal. 
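The symptom Bill describes can be illustrated with a toy model. This is a minimal sketch, not SGE code: it assumes a scheduler that starts a job whenever the load it measures on the execution host is below a threshold, and that the measured load on the head node stays near zero because the work actually runs on the compute nodes. The function names, the threshold value, and the job names are all hypothetical, chosen only for illustration.

```python
def dispatch_by_load(pending, measured_load, load_threshold=1.75):
    """Start each pending job while the measured load is under the threshold.

    measured_load is a callable that returns the load the scheduler sees
    once `running` jobs have been started. Toy model only, not SGE internals.
    """
    started = []
    for job in pending:
        if measured_load(len(started)) < load_threshold:
            started.append(job)
    return started


def head_node_view(running):
    # Assumption: jobs migrate to compute nodes, so head node load never rises.
    return 0.0


def ordinary_host_view(running):
    # Assumption: each running job adds roughly 1.0 to the host's load.
    return float(running)


jobs = [f"job{i}" for i in range(10)]

started_on_head = dispatch_by_load(jobs, head_node_view)
started_on_ordinary = dispatch_by_load(jobs, ordinary_host_view)

print(len(started_on_head))      # every job is started at once
print(len(started_on_ordinary))  # rising load throttles dispatch
```

In this model the load check never blocks anything on the head node, so some other limit (such as a job slot count) has to do the throttling.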
> 
> I am waiting on some feedback from Scyld/Penguin Computing on some 
> related issues, which will hopefully solve some of these problems. 
> 
> Bill
> 
> Chris Dagdigian wrote:
> 
> >
> > I know Grid Engine well but not Scyld, so forgive my ignorance if I 
> > say something stupid. Given the level of expertise on this list, I'm 
> > quite certain I'm about to make a fool of myself :)
> >
> > If Scyld is presenting you with a single system image (i.e., a single 
> > Linux server that can farm out tasks to all those nodes), then you 
> > would install SGE in the same way that you would install it on a big 
> > SMP box:
> >
> > 1. Install the SGE qmaster and scheduler on the master node
> > 2. Install the execution host on the master node as well
> >
> > You will only have 1 execd per queue, but each queue can be configured 
> > with N "job slots", which control how many jobs can run at the same 
> > time on the same machine.
> >
> > Try setting the number of job slots within your single SGE queue to 
> > the number of nodes in your cluster. This is similar to what you would 
> > do on a big SMP machine -- a small number of queues, each supporting a 
> > decent job slot count.
> >
> > Then submit a bunch of jobs and see if SGE causes the master node to 
> > fall over under load. If not then Scyld is doing its thing behind the 
> > scenes to migrate stuff around to the other nodes.
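Chris's suggestion amounts to capping concurrency with the queue's slot count. A minimal sketch in Python, under stated assumptions: the queue name `all.q` and the node count of 16 are hypothetical, and the `qconf -mattr queue slots ...` form should be checked against the qconf man page of the installed SGE version before use. The script only builds the command line; it does not run it.

```python
def slots_command(queue_name, node_count):
    """Build (but do not run) a qconf command that sets a queue's slot count."""
    if node_count < 1:
        raise ValueError("node_count must be at least 1")
    # Assumed form: qconf -mattr queue slots <value> <queue_name>
    return ["qconf", "-mattr", "queue", "slots", str(node_count), queue_name]


def runnable_now(pending, running, slots):
    """Return the jobs a slot-limited queue may start: at most the free slots."""
    free = max(slots - running, 0)
    return pending[:free]


# Hypothetical values: one queue on the master node, 16 compute nodes.
cmd = slots_command("all.q", 16)
print(" ".join(cmd))

jobs = [f"job{i}" for i in range(40)]
first_batch = runnable_now(jobs, running=0, slots=16)
when_full = runnable_now(jobs, running=16, slots=16)
print(len(first_batch), len(when_full))
```

With the slot count equal to the node count, the queue holds back the remaining jobs until a slot frees up, regardless of what load the master node reports.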
> >
> > -Chris
> >
> >
> >
> > billk01 wrote:
> >
> >> I am in the process of installing SGE on a Scyld Beowulf cluster.  As
> >> most people are aware, a Scyld cluster runs a complete OS (Linux) only
> >> on the master node, and the compute nodes are simply for executing.
> >> The SGE install requires adding the compute nodes as execute
> >> hosts.  I do not understand how to do this given the current setup of a
> >> Scyld cluster, since you can't "log in" to the nodes to execute the
> >> install script.  The script does exist on an NFS shared directory
> >> (cluster wide).  Has anybody else run into this problem?
> >>
> >
> >
> >
> _______________________________________________
> Beowulf mailing list, Beowulf at beowulf.org
> To change your subscription (digest mode or unsubscribe) visit
> http://www.beowulf.org/mailman/listinfo/beowulf
> 




