[Beowulf] slow mpi init/finalize
Michael Di Domenico
mdidomenico4 at gmail.com
Wed Oct 11 07:12:02 PDT 2017
I'm seeing issues on a Mellanox FDR10 cluster where MPI setup and
teardown take longer than I expect they should on larger rank-count
jobs. I'm only trying to run ~1000 ranks and the startup time is over
a minute. I tested this with both Open MPI and Intel MPI; both exhibit
close to the same behavior.
Has anyone else seen this, or might know how to fix it? I expect ~1000
ranks to take some time to set up, but it seems to be taking longer than
I think it should.