[Beowulf] Avoiding/mitigating fragmentation of systems by small jobs?

Sun Jun 10 03:26:15 PDT 2018

On Sun, Jun 10, 2018 at 06:46:04PM +1000, Chris Samuel wrote:
> On Sunday, 10 June 2018 1:48:18 AM AEST Skylar Thompson wrote:
> 
> > We're a Grid Engine shop, and we have the execd/shepherds place each job in
> > its own cgroup with CPU and memory limits in place.
> 
> Slurm supports cgroups as well (and we use it extensively); the idea here
> is more to try and avoid/minimise unnecessary inter-node MPI traffic.

We have very little MPI, but if I had to solve this in GE, I would try to
fill up one node before sending jobs to another. There are two knobs for
that. The queue sort order (defaults to instance load, but can be set to a
simple sequence number) is the general one and applies to all jobs. The
allocation rule for parallel environments (defaults to round_robin, but can
be set to fill_up) is specific to multi-slot jobs.
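
To make the difference between the two allocation rules concrete, here is a
minimal Python sketch of how a fill_up-style and a round_robin-style policy
would spread an 8-slot job over three hosts. It is a toy model under stated
assumptions, not Grid Engine's actual scheduler: the host names, the slot
counts, and the functions fill_up() and round_robin() are all hypothetical,
and hosts are simply visited in the order given (standing in for the queue
sort order).

```python
# Toy model of two slot-allocation policies. This is NOT Grid Engine code;
# host names, slot counts, and function names are illustrative assumptions.

def fill_up(free_slots, needed):
    """Take as many slots as possible from each host before moving on."""
    allocation = {}
    for host, free in free_slots.items():
        if needed == 0:
            break
        take = min(free, needed)
        if take:
            allocation[host] = take
            needed -= take
    if needed:
        raise ValueError("not enough free slots")
    return allocation


def round_robin(free_slots, needed):
    """Take one slot per host per pass, cycling until the request is met."""
    remaining = dict(free_slots)  # copy so the caller's dict is untouched
    allocation = {}
    while needed:
        progressed = False
        for host in remaining:
            if needed == 0:
                break
            if remaining[host] > 0:
                remaining[host] -= 1
                allocation[host] = allocation.get(host, 0) + 1
                needed -= 1
                progressed = True
        if not progressed:
            raise ValueError("not enough free slots")
    return allocation


# Hosts are visited in dict order, standing in for the queue sort order.
hosts = {"node1": 8, "node2": 8, "node3": 8}
print(fill_up(hosts, 8))      # {'node1': 8}
print(round_robin(hosts, 8))  # {'node1': 3, 'node2': 3, 'node3': 2}
```

With fill_up the whole job lands on one node, so any MPI traffic stays
intra-node. With round_robin the same job touches all three nodes.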

Not sure of the specifics for Slurm, though.

-- 
Skylar