[Beowulf] MPI_Alltoall

Tue Apr 12 06:52:49 PDT 2005

On Tue, 2005-04-12 at 02:29 -0700, Rita Zrour wrote:
> Hello I have a question,
> when i do many  MPI_Alltoall in my program always the
> first MPI_Alltoall take too much time to be done. 
> 
> I don't know where the first communication is always
> expensive. Is that a problem of memory???????

Many MPI implementations allocate resources such as comms buffers and
descriptors lazily, so it's not unusual for the first iteration of a
loop to have to allocate these on the fly; later iterations simply
re-use the cached descriptors/handles as needed.  This isn't unique to MPI
but happens nearly everywhere in the software world.  Perhaps alltoall
exposes it more because it has more simultaneous pending send/recvs than
anything else?

Plus, of course, I assume you are actually initialising your data before
you send it.  Far too many people write "benchmarks" that just send
un-initialised mmap()ed memory and end up measuring page fault
performance rather than network bandwidth.
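
As a rough illustration of touching a buffer before timing anything, here
is a minimal sketch in plain C.  The helper name alloc_touched and the
0xA5 fill byte are my own assumptions, not anything from an MPI library.
A non-zero fill is used because some compilers may fold a malloc followed
by a zeroing memset into calloc, which can leave the pages untouched.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: allocate a buffer and write to every byte so the
 * pages are faulted in now, not during the first timed send. */
void *alloc_touched(size_t nbytes)
{
    assert(nbytes > 0);

    void *buf = malloc(nbytes);
    if (buf != NULL) {
        /* Writing to the whole buffer forces the kernel to map each page.
         * The fill value 0xA5 is arbitrary (an assumption of this sketch). */
        memset(buf, 0xA5, nbytes);
    }
    return buf;  /* caller must free(); NULL if allocation failed */
}
```

The send and receive buffers handed to MPI_Alltoall would both be
prepared this way before the first timed call.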

Proper benchmarks (for the most part) zero all data before they send it
and do a handful of warmup laps before taking any measurements.  Even
without extra allocation/faulting, simply having the data cache-hot can
make a difference to measured performance.
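
The warmup-lap idea can be sketched roughly as below.  This is plain C
and not a real MPI benchmark: time_with_warmup, bench_fn and count_call
are hypothetical names, and clock() (which measures CPU time) is only a
stand-in.  In an actual MPI program the callback would wrap the
MPI_Alltoall call, and the timing would more likely use MPI_Wtime() with
an MPI_Barrier before the timed loop.

```c
#include <assert.h>
#include <time.h>

/* Hypothetical callback type: one lap of whatever is being measured. */
typedef void (*bench_fn)(void *arg);

/* Run `warmup` untimed laps, then `laps` timed laps.
 * Returns the mean time per timed lap in seconds. */
double time_with_warmup(bench_fn fn, void *arg, int warmup, int laps)
{
    assert(fn != NULL && warmup >= 0 && laps > 0);

    /* Warmup: lets lazy allocation, page faults and cache fills happen
     * outside the measured region. */
    for (int i = 0; i < warmup; i++)
        fn(arg);

    clock_t start = clock();   /* stand-in timer, see note above */
    for (int i = 0; i < laps; i++)
        fn(arg);
    clock_t end = clock();

    return (double)(end - start) / CLOCKS_PER_SEC / laps;
}

/* Example workload (an assumption for illustration): counts its calls. */
void count_call(void *arg)
{
    ++*(int *)arg;
}
```

With 3 warmup laps and 5 timed laps the workload runs 8 times in total,
but only the last 5 contribute to the reported mean.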

Ashley,