[Beowulf] Scheduler question -- non-uniform memory allocation to MPI

Prentice Bisbal prentice.bisbal at rutgers.edu
Thu Jul 30 11:37:42 PDT 2015


I don't want to be 'that guy', but it sounds like the root cause of this 
problem is the programs themselves. A well-written parallel program 
should balance the workload and data pretty evenly across the nodes. Is 
this software written by your own researchers, open-source, or a 
commercial program? In my opinion, your efforts would be better spent 
fixing the program(s), if possible, than finding a scheduler with the 
feature you request, which I don't think exists.

If you can't fix the software, I think you're out of luck.

I was going to suggest requesting exclusive use of nodes (whole-node 
assignment) as the easiest solution. What is the basis for the resistance?
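
To put a rough number on the waste described in the quoted message below, 
here is a back-of-the-envelope sketch in Python. The job shape is purely 
hypothetical (16 ranks, rank 0 using 32 GB, the rest 2 GB each), and the 
function name is made up for illustration. It assumes every task has to 
request the same amount of memory, so the request gets sized to the 
hungriest rank:

```python
def uniform_request_waste(per_rank_usage_mb):
    """Return (requested, used, wasted) in MB for a job where every rank
    must request the same memory, sized to the most demanding rank."""
    per_task_request = max(per_rank_usage_mb)
    requested = per_task_request * len(per_rank_usage_mb)
    used = sum(per_rank_usage_mb)
    return requested, used, requested - used


# Hypothetical 16-rank job: rank 0 needs 32 GB, the other 15 need 2 GB each.
usage_mb = [32 * 1024] + [2 * 1024] * 15
requested, used, wasted = uniform_request_waste(usage_mb)
print(f"requested {requested} MB, used {used} MB, wasted {wasted} MB")
print(f"fraction of request left idle: {wasted / requested:.0%}")
```

With those made-up numbers, close to 90% of the reserved memory sits idle, 
which is why balancing the data across ranks pays off more than any 
scheduler feature would.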


On 07/30/2015 11:34 AM, Tom Harvill wrote:
> Hi,
>
> We run SLURM with cgroups for memory containment of jobs.  When users
> request resources on our cluster many times they will specify the
> number of (MPI) tasks and memory per task.  The reality of much of the
> software that runs is that most of the memory is used by MPI rank 0
> and much less on slave processes. This is wasteful and sometimes
> causes bad outcomes (OOMs and worse) during job runs.
>
> AFAIK SLURM is not able to allow users to request a different amount
> of memory for different processes in their MPI pool.  We used to run
> Maui/Torque and I'm fairly certain that feature is not present in that
> scheduler either.
>
> Does anyone know if any scheduler allows the user to request different
> amounts of memory per process?  We know we can move to whole-node
> assignment to remedy this problem but there is resistance to that...
>
> Thank you!
> Tom
>
> Tom Harvill
> Holland Computing Center
> hcc.unl.edu
> _______________________________________________
> Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
> To change your subscription (digest mode or unsubscribe) visit 
> http://www.beowulf.org/mailman/listinfo/beowulf
