[Beowulf] scheduler policy design
Tim Cutts
tjrc at sanger.ac.uk
Thu Apr 26 02:20:58 PDT 2007
On 26 Apr 2007, at 10:06 am, Toon Knapen wrote:
> Tim Cutts wrote:
>> The compromise we ended up with is this set of LSF queues on our
>> system (a cluster with about 1500 job slots):
>> QUEUE_NAME  PRIO  STATUS       MAX  JL/U  JL/P  JL/H  NJOBS  PEND   RUN  SUSP
>> yesterday    500  Open:Active  200    10     -     -      1     0     1     0
>> normal        30  Open:Active    -     -     -     -    281   110   171     0
>> hugemem       30  Open:Active    -     -     -     -      3     0     3     0
>> long           3  Open:Active    -     -     -     -   4022  2987  1035     0
>> basement       1  Open:Active  300   200     -     -    127     0   127     0
>> yesterday:
>> a special purpose high priority queue for the "I need it
>> yesterday" crowd. No run length limits, but very limited in terms
>> of how many slots the user can use.
>
>
> Do you have slots reserved exclusively for the 'yesterday' queue or
> for any of the other queues?
No, yesterday is just the highest-priority queue, so when a slot
becomes available anywhere, yesterday jobs tend to get it. Given the
number of job slots we have (more than 1,500) and the various limits
that are in place, the pathological corner cases which would stop a
yesterday job getting onto the system within a couple of minutes are
pretty rare (and in fact I have not yet seen it happen). Even if the
system were full of jobs running for the full 24-hour maximum, you'd
get a slot coming free on average every minute or so. I should point
out here that the vast majority of our jobs are serial,
single-processor jobs solving embarrassingly parallel problems. If we start
to get significant multi-CPU jobs we may have to rethink this strategy.
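As a rough check on the "every minute or so" figure, here is a back-of-the-envelope sketch in Python. It assumes every slot is running a job of the maximum length and that finish times are spread evenly over the 24-hour window; the function name is made up for illustration, and the 1,500 slots and 24-hour limit are the figures quoted above.

```python
def mean_minutes_between_free_slots(job_slots: int, max_runtime_hours: float) -> float:
    """Average gap, in minutes, between slots freeing up.

    Assumes all slots are occupied by maximum-length jobs whose finish
    times are spread evenly across the run-length window (an
    idealisation, not a model of any real scheduler).
    """
    if job_slots <= 0:
        raise ValueError("job_slots must be positive")
    return max_runtime_hours * 60.0 / job_slots


# Figures from the post: roughly 1,500 job slots, 24-hour maximum run length.
interval = mean_minutes_between_free_slots(job_slots=1500, max_runtime_hours=24)
print(f"{interval:.2f} minutes between free slots")  # prints "0.96 minutes between free slots"
```

In practice jobs finish well before the limit, so the real gap should be shorter than this worst-case estimate.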
The only queue which has dedicated slots is hugemem, because it's
specifically for the Altixes (and none of the other queues can send
jobs to the Altixes). We don't dedicate any other machines to
individual queues or purposes, because doing so would reduce the
cluster's throughput unless it was *extremely* carefully managed. My
personal view is that it's only worth dedicating nodes to a
particular task type if you can guarantee that there are enough of
those tasks available to keep the specialised nodes continually busy,
in which case you effectively have a second cluster dedicated to that
task.
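To put a number on the cost of dedicating nodes, a minimal sketch follows. The slot count and busy fraction in it are purely hypothetical values chosen for illustration, not measurements from our cluster, and the function name is invented for this example.

```python
def idle_slot_hours_per_day(dedicated_slots: int, busy_fraction: float) -> float:
    """Slot-hours per day that sit unused on nodes reserved for one task type.

    busy_fraction is the share of time the dedicated slots actually have
    work of that type to run (0.0 = never, 1.0 = always).
    """
    if dedicated_slots < 0:
        raise ValueError("dedicated_slots must not be negative")
    if not 0.0 <= busy_fraction <= 1.0:
        raise ValueError("busy_fraction must be between 0 and 1")
    return dedicated_slots * (1.0 - busy_fraction) * 24.0


# Hypothetical example: 100 dedicated slots that only have work half the time.
print(idle_slot_hours_per_day(dedicated_slots=100, busy_fraction=0.5))
```

The lost capacity only drops to zero when the busy fraction reaches 1.0, which is the "continually busy" condition described above.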
Tim