[Beowulf] Scheduler question -- non-uniform memory allocation to MPI

Christopher Samuel samuel at unimelb.edu.au
Thu Jul 30 16:13:40 PDT 2015

On 31/07/15 04:51, Tom Harvill wrote:

> Thank you for your reply.  Yes, it's 'bad' code.  It's WRF mostly.

We've also seen the same issue with NAMD, where rank 0 uses more memory
than the other ranks because it tracks information about all of them in
order to load balance correctly.  Admittedly this was on BlueGene/Q,
where you're running thousands of ranks and each node only has 1GB/core,
so if you're running 16 ranks per node you can hit that 1GB limit easily.
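To make the arithmetic concrete, here's a back-of-the-envelope sketch.
The node figures match BG/Q (16 cores, 1GB/core); the per-rank working
set and the rank-0 tracking overhead are purely hypothetical numbers for
illustration:

```python
# Back-of-the-envelope memory check for a BG/Q-style node.
# Node figures are BG/Q specs; the working-set and overhead
# numbers below are made up for illustration.
cores_per_node = 16
mem_per_core_gb = 1.0                              # 1 GB/core on BG/Q
node_mem_gb = cores_per_node * mem_per_core_gb     # 16 GB/node

ranks_per_node = 16
per_rank_share_gb = node_mem_gb / ranks_per_node   # 1 GB per rank

app_working_set_gb = 0.9    # hypothetical per-rank application usage
rank0_overhead_gb = 0.3     # hypothetical load-balancing metadata on rank 0

rank0_usage_gb = app_working_set_gb + rank0_overhead_gb
# Ordinary ranks fit; rank 0 blows past its 1 GB share.
print(app_working_set_gb <= per_rank_share_gb)   # True
print(rank0_usage_gb > per_rank_share_gb)        # True
```

With no headroom per core, even a modest amount of extra state on one
rank is enough to push that rank over the limit.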

The solution there was (IIRC) to guide the user to the SMP build so
they could run one rank per node and multithread across the node's
cores instead.
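For what it's worth, under Slurm that layout looks roughly like the
sketch below.  The node and core counts are examples, and the NAMD
+ppn / +setcpuaffinity flags apply to Charm++ SMP builds; the exact
launch line depends on how your NAMD was built:

```shell
#!/bin/bash
# Sketch: one MPI rank per node, threads within the node (Slurm).
# Counts are illustrative; adjust to your cluster and NAMD build.
#SBATCH --nodes=4
#SBATCH --ntasks-per-node=1    # one rank per node
#SBATCH --cpus-per-task=16     # give that rank all 16 cores

# SMP NAMD: 15 worker threads per rank, one core left for the
# communication thread; pin threads to cores.
srun namd2 +ppn 15 +setcpuaffinity input.namd
```

Because rank 0's bookkeeping is then shared across one process per
node rather than 16, the per-process memory ceiling is the whole
node's RAM instead of 1GB.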

 Christopher Samuel        Senior Systems Administrator
 VLSCI - Victorian Life Sciences Computation Initiative
 Email: samuel at unimelb.edu.au Phone: +61 (0)3 903 55545
 http://www.vlsci.org.au/      http://twitter.com/vlsci
