[Beowulf] Avoiding/mitigating fragmentation of systems by small jobs?
Scott Atchley
e.scott.atchley at gmail.com
Sat Jun 9 08:22:07 PDT 2018
Hi Chris,
We have looked at this _a_ _lot_ on Titan:
A Multi-faceted Approach to Job Placement for Improved Performance on
Extreme-Scale Systems
https://ieeexplore.ieee.org/document/7877165/
The issue we have is small jobs "inside" large jobs interfering with the
larger jobs. The item that was easy to implement with our scheduler was
"Dual-Ended Scheduling". We set a threshold of 16 nodes to demarcate small
jobs. Jobs using more than 16 nodes are scheduled from the top/front of the
list, and smaller jobs are scheduled from the bottom/back of the list.
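To make the idea concrete, here is a rough Python sketch of dual-ended
placement. It is only an illustration, not our scheduler's code: it assumes
the "list" is a sorted list of free node IDs, and the allocate() helper and
its parameters are hypothetical names made up for this example.

```python
SMALL_JOB_THRESHOLD = 16  # nodes; the cutoff mentioned above


def allocate(free_nodes, requested, threshold=SMALL_JOB_THRESHOLD):
    """Hypothetical helper: choose `requested` nodes from the free list.

    Jobs larger than `threshold` take nodes from the front of the sorted
    list; jobs at or below it take nodes from the back.
    Returns (allocated, remaining).
    """
    if requested <= 0:
        raise ValueError("requested must be positive")
    if requested > len(free_nodes):
        raise ValueError("not enough free nodes")
    ordered = sorted(free_nodes)
    if requested > threshold:
        # Large job: fill from the top/front of the list.
        return ordered[:requested], ordered[requested:]
    # Small job: fill from the bottom/back of the list.
    split = len(ordered) - requested
    return ordered[split:], ordered[:split]


free = list(range(100))              # assumed: 100 free nodes, IDs 0..99
big, free = allocate(free, 32)       # large job lands on nodes 0..31
small, free = allocate(free, 4)      # small job lands on nodes 96..99
```

The point is that large and small jobs grow toward each other from opposite
ends, so small jobs only end up next to large ones when the machine is
nearly full.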
Scott
On Sat, Jun 9, 2018 at 2:56 AM, Chris Samuel <chris at csamuel.org> wrote:
> On Saturday, 9 June 2018 12:39:02 AM AEST Bill Abbott wrote:
>
> > We set PriorityFavorSmall=NO and PriorityWeightJobSize to some
> > appropriately large value in slurm.conf, which helps.
>
> I guess that helps getting jobs going (and we use something similar), but my
> question was more about placement. It's a hard one..
>
> --
> Chris Samuel : http://www.csamuel.org/ : Melbourne, VIC