[Beowulf] slow jobs when run through queue

Nick Evans nick.c.evans at gmail.com
Tue Dec 5 21:47:42 PST 2017

Thanks Brian / Carl / Chris for the places to look. It turned out to be what
Chris had mentioned: they were only requesting 1 CPU but the job was trying
to use all 48 in the machine.

We resubmitted the request asking for all the CPUs, and the job ran in the
expected amount of time.
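
For anyone hitting the same thing, here is a minimal sketch of how a program
could size its worker pool from the CPUs it is actually allowed to run on
rather than the total in the machine. This is not the code from our job. It
assumes Linux and assumes the batch system pins the job to its allocated
cores (CPU affinity or cpusets); if the scheduler only enforces a CPU-time
quota, the affinity mask will not show the limit. The names usable_cpus and
pick_worker_count are made up for illustration.

```python
import os


def usable_cpus():
    """Return the number of CPUs this process is allowed to run on."""
    try:
        # Linux only: the affinity mask reflects any core pinning
        # applied to the job by the batch system.
        return len(os.sched_getaffinity(0))
    except AttributeError:
        # Platforms without sched_getaffinity: fall back to the total
        # CPU count in the machine (os.cpu_count() can return None).
        return os.cpu_count() or 1


def pick_worker_count(requested=None):
    """Clamp a requested thread/process count to the usable CPUs."""
    limit = usable_cpus()
    if requested is None:
        return limit
    return max(1, min(requested, limit))


if __name__ == "__main__":
    print("CPUs in machine:            ", os.cpu_count())
    print("CPUs usable by this process:", usable_cpus())
    print("Workers to start:           ", pick_worker_count())
```

Running this once directly on the node and once inside a 1-CPU job should
show whether the two environments differ in the way Chris described.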

Thanks again

On 6 December 2017 at 12:58, Chris Samuel <chris at csamuel.org> wrote:

> On 6/12/17 11:44 am, Nick Evans wrote:
>
>> We have found that if we submit a job to the queue then it takes a long
>> time to process, i.e. > 4 hours.
>> If we run the exact same processing directly on the compute node
>> then it is significantly faster, < 1 hour.
>
> Some quick ideas:
>
> Are you comparing a job that has asked for all cores and all RAM with
> it running directly on the node?
>
> Try using "perf top" to get an idea of what's going on with the node
> when doing the comparison runs, perhaps "perf record" too but I can
> never remember if an unprivileged user can do that.  That might shed
> some light.
>
> To me it sounds like it might be something that naively checks how
> many cores a node has and then starts that many threads/
> processes, and if the batch job only asks for a single core, or
> fewer than all, then you might end up with a lot of contention.
>
> Good luck!
> Chris
> --
>  Chris Samuel  :  http://www.csamuel.org/  :  Melbourne, VIC
> _______________________________________________
> Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
> To change your subscription (digest mode or unsubscribe) visit
> http://www.beowulf.org/mailman/listinfo/beowulf