[Beowulf] Suggestions for scheduling software

stephen mulcahy smulcahy at aplpi.com
Tue Aug 19 02:20:50 PDT 2008


Hi,

Up to now we've been working with a 20-node cluster where we've had the 
luxury of working without any scheduling or queuing software: the 
cluster is pretty much dedicated to running a single job, which is 
manually invoked with mpirun.

We're moving to a much larger cluster in the near future and are keen to 
keep the utilisation as high as possible. On the new cluster we have to 
run 2 distinct jobs: one is a long-running job (weeks or possibly 
months) and the other is a regular short-running job (a few hours) 
which has to run at a specific time each day.

We're currently looking at using SLURM for queuing up jobs on the system 
but I'm not sure if it will meet all of our needs here. Ideally, we'd 
have a system that lets us queue up the long-running job and a series 
of short-running jobs, and that automatically suspends the long-running 
job when a short-running job is due to start, runs the short-running 
job and then restarts the long-running job.
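
To make the suspend / run / resume cycle concrete, here is a rough 
single-machine sketch in Python. It is only an illustration of the 
behaviour being asked for, not how SLURM or any other scheduler does it: 
it assumes a POSIX system, pauses one local process with SIGSTOP and 
resumes it with SIGCONT, and knows nothing about MPI ranks spread over 
many nodes. The function name and the two stand-in commands are 
hypothetical, and the "specific time each day" trigger is left out 
(something like cron would have to call this).

```python
import signal
import subprocess
import sys


def run_short_job_with_preemption(long_job, short_cmd):
    """Suspend long_job, run short_cmd to completion, then resume long_job.

    long_job is an already-running subprocess.Popen (hypothetical stand-in
    for the long-running job); short_cmd is an argv list for the short job.
    Returns the exit code of the short job.
    """
    long_job.send_signal(signal.SIGSTOP)      # pause the long-running job
    try:
        return subprocess.run(short_cmd).returncode
    finally:
        # Resume the long job even if the short job fails or raises.
        long_job.send_signal(signal.SIGCONT)


def demo():
    # Stand-ins for the real jobs: a 2-second sleeper and a one-line print.
    long_job = subprocess.Popen(
        [sys.executable, "-c", "import time; time.sleep(2)"]
    )
    short_rc = run_short_job_with_preemption(
        long_job, [sys.executable, "-c", "print('short job done')"]
    )
    long_rc = long_job.wait()                 # long job finishes after resuming
    return short_rc, long_rc
```

Calling demo() starts the stand-in long job, pauses it while the short 
job runs, and then lets it finish. On a real cluster the same pause and 
resume would have to reach every process of the parallel job on every 
node, which is the part a scheduler would need to handle.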

I expect we're not the only ones in this situation. Is SLURM the right 
tool for this job? If not, can anyone recommend other tools out there, 
preferably open source?

Thanks,

-stephen

-- 
Stephen Mulcahy, Applepie Solutions Ltd., Innovation in Business Center,
GMIT, Dublin Rd, Galway, Ireland.  +353.91.751262  http://www.aplpi.com
Registered in Ireland, no. 289353 (5 Woodlands Avenue, Renmore, Galway)


