[Beowulf] wall clock time for mpi_allreduce?
landman at scalableinformatics.com
Sun Sep 12 05:52:53 PDT 2010
On 09/10/2010 10:46 PM, xingqiu yuan wrote:
> I found that use of mpi_allreduce to calculate the global maximum and
> minimum takes very long time, any better alternatives to calculate the
> global maximum/minimum values?
There are several variations on this theme you can try, and some might
work better than others. All of them will be more verbose than simply
calling allreduce repeatedly for the vector reductions.
1) Take M vectors of length N as the vectors you are reducing (indexed
1:N in F90, or 0:N-1 in C/C++) and do a maximum and a minimum reduction
on them.
2) Take vectors of length 2, and use pair reductions. Each iteration
leaves you with half as many values as the previous generation, so this
would require something on the order of log_2(Vector_length) iterations.
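As a rough illustration of option 2, here is a minimal serial sketch in C
of the pairwise pattern. It is not MPI code: each array slot stands in for
one rank's partial {min, max} result, and the names minmax_t,
minmax_combine and pairwise_minmax are made up for this example. In a real
MPI program each combine would presumably be a send/receive between two
ranks, with a broadcast afterwards if every rank needs the answer.

```c
#include <assert.h>
#include <stddef.h>

/* One {min, max} pair per simulated rank: the "vector of length 2". */
typedef struct {
    double min;
    double max;
} minmax_t;

/* Combine two partial results into one. */
static minmax_t minmax_combine(minmax_t a, minmax_t b)
{
    minmax_t r;
    r.min = (b.min < a.min) ? b.min : a.min;
    r.max = (b.max > a.max) ? b.max : a.max;
    return r;
}

/* Pairwise reduction over p partial results, done in place.
 * In each round, slot i absorbs slot i + stride, and the stride doubles,
 * so the number of active slots halves every round.
 * On return part[0] holds the global {min, max}.
 * Returns the number of rounds performed, which is ceil(log2(p)). */
static int pairwise_minmax(minmax_t *part, size_t p)
{
    int rounds = 0;
    assert(part != NULL && p > 0);
    for (size_t stride = 1; stride < p; stride *= 2) {
        for (size_t i = 0; i + stride < p; i += 2 * stride)
            part[i] = minmax_combine(part[i], part[i + stride]);
        rounds++;
    }
    return rounds;
}
```

Seeding each slot with min == max == that rank's local value and calling
pairwise_minmax(part, p) takes 3 rounds for either 5 or 8 slots. Whether
this beats a library allreduce on a given machine is something to measure,
not assume.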
That said, while allreduce is a collective and something of a
heavyweight operation, you might be dealing with slowness due to
something else. I'd suggest placing timing calipers around the suspect
regions and measuring carefully to determine where the time is actually
being spent. Allreduce and other collectives do require
synchronization, so if something is delaying one of the ranks on its way
to the collective, the collective itself will appear slower.
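One way to set up such calipers is sketched below in plain C. The
caliper_t type and the caliper_* functions are hypothetical helpers
invented for this example, and the clock is C11 timespec_get (wall-clock,
not monotonic); inside an MPI program MPI_Wtime would be the more natural
choice. The idea is to time the wait for the slowest rank separately from
the reduction itself.

```c
#include <assert.h>
#include <time.h>

/* Hypothetical caliper: accumulates wall-clock seconds over intervals. */
typedef struct {
    double start;   /* time recorded by the last caliper_start() */
    double total;   /* accumulated seconds over all intervals */
    long   count;   /* number of completed intervals */
} caliper_t;

/* Current wall-clock time in seconds (C11 timespec_get). */
static double wall_seconds(void)
{
    struct timespec ts;
    int base = timespec_get(&ts, TIME_UTC);
    assert(base == TIME_UTC);
    (void)base;
    return (double)ts.tv_sec + 1e-9 * (double)ts.tv_nsec;
}

static void caliper_start(caliper_t *c)
{
    assert(c != NULL);
    c->start = wall_seconds();
}

static void caliper_stop(caliper_t *c)
{
    assert(c != NULL);
    c->total += wall_seconds() - c->start;
    c->count++;
}

/* Mean seconds per interval, or 0.0 if nothing has been timed yet. */
static double caliper_mean(const caliper_t *c)
{
    assert(c != NULL);
    return (c->count > 0) ? c->total / (double)c->count : 0.0;
}

/* Intended use inside the time loop (MPI calls shown only as comments):
 *
 *   caliper_start(&wait);    MPI_Barrier(comm);    caliper_stop(&wait);
 *   caliper_start(&reduce);  MPI_Allreduce(...);   caliper_stop(&reduce);
 *
 * A large mean for `wait` next to a small mean for `reduce` would suggest
 * that some rank is arriving late, not that the reduction is slow. */
```

The extra barrier is there only for diagnosis and adds its own cost, so it
should come out again once the measurements are done.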
Joseph Landman, Ph.D
Founder and CEO
Scalable Informatics, Inc.
email: landman at scalableinformatics.com
web : http://scalableinformatics.com
phone: +1 734 786 8423 x121
fax : +1 866 888 3112
cell : +1 734 612 4615