[Beowulf] Re: Home beowulf - NIC latencies
rross at mcs.anl.gov
Mon Feb 14 09:04:17 PST 2005
I don't know all the implementations well enough to comment on them
one-by-one. I'm sure that Rossen can speak to their implementation with
regard to (a) below, and others will fill in other gaps.
In general, to support (a) the implementation must either spawn a thread
or have support from the NIC to make progress (this is related to the
"Progress Rule" that people occasionally bring up). The standard *does
not* specify that progress must be made when not in an MPI_ call.
MPICH/MPICH2 do not use an extra thread (for portability one cannot assume
that threads are available!). Thus the only overlap that occurs in MPICH2
over TCP is through the socket buffers.
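To make the point above concrete, here is a minimal sketch (not from the original post) of the usual attempt at overlap; buffer size, tag, and ranks are arbitrary, and it needs an MPI installation to build (e.g. mpicc overlap.c; mpirun -np 2 ./a.out). Whether any transfer actually proceeds during compute() depends on the implementation's progress engine:

```c
/* Sketch: attempting computation/communication overlap with MPI_Isend.
 * With no progress thread and no NIC offload, the send may only advance
 * as far as the socket buffers allow until MPI_Wait is called. */
#include <mpi.h>
#include <stdlib.h>

#define N (1 << 22)                      /* a "large" message: 4M doubles */

static void compute(void) { /* placeholder for real work */ }

int main(int argc, char **argv)
{
    int rank;
    double *buf;
    MPI_Request req;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    buf = malloc(N * sizeof(double));

    if (rank == 0) {
        MPI_Isend(buf, N, MPI_DOUBLE, 1, 0, MPI_COMM_WORLD, &req);
        compute();                       /* overlap happens here -- maybe */
        MPI_Wait(&req, MPI_STATUS_IGNORE);
    } else if (rank == 1) {
        MPI_Recv(buf, N, MPI_DOUBLE, 0, 0, MPI_COMM_WORLD,
                 MPI_STATUS_IGNORE);
    }

    free(buf);
    MPI_Finalize();
    return 0;
}
```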
Making a sequence of MPI_Isends followed by an MPI_Wait go faster than a
sequence of MPI_Sends isn't hard, particularly if the messages are to
different ranks. I would guess that every implementation will provide
better performance in the case where the user tells the implementation
about all these concurrent operations and then MPI_Waits on the bunch.
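As a hedged illustration of that pattern, here is a sketch (my example, not from the post) of posting several nonblocking sends up front and then completing them together with MPI_Waitall; peer count and message size are arbitrary:

```c
/* Sketch: rank 0 posts NPEERS concurrent MPI_Isends to different ranks,
 * then waits on the bunch. The implementation can pipeline these
 * transfers instead of serializing them as back-to-back MPI_Send calls
 * would. Requires at least NPEERS + 1 ranks. */
#include <mpi.h>
#include <stdlib.h>

#define NPEERS 4
#define COUNT  (1 << 20)

int main(int argc, char **argv)
{
    int rank, size, i;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    if (size < NPEERS + 1) {             /* not enough ranks: bail out */
        MPI_Finalize();
        return 0;
    }

    if (rank == 0) {
        double *bufs[NPEERS];
        MPI_Request reqs[NPEERS];

        for (i = 0; i < NPEERS; i++) {
            bufs[i] = malloc(COUNT * sizeof(double));
            /* Tell the implementation about all the concurrent sends... */
            MPI_Isend(bufs[i], COUNT, MPI_DOUBLE, i + 1, 0,
                      MPI_COMM_WORLD, &reqs[i]);
        }
        /* ...then wait on the whole bunch at once. */
        MPI_Waitall(NPEERS, reqs, MPI_STATUSES_IGNORE);
        for (i = 0; i < NPEERS; i++)
            free(bufs[i]);
    } else if (rank <= NPEERS) {
        double *buf = malloc(COUNT * sizeof(double));
        MPI_Recv(buf, COUNT, MPI_DOUBLE, 0, 0, MPI_COMM_WORLD,
                 MPI_STATUS_IGNORE);
        free(buf);
    }

    MPI_Finalize();
    return 0;
}
```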
Hope this helps some,
Rob Ross, Mathematics and Computer Science Division, Argonne National Lab
On Mon, 14 Feb 2005, Mikhail Kuzminsky wrote:
> Let me ask a stupid question: which MPI implementations allow
> a) overlapping MPI_Isend with computation
> b) performing a set of subsequent MPI_Isend calls faster than "the
> same" set of MPI_Send calls?
> I'm asking only about sending large messages.
> I'm interested (first of all) in
> - Gigabit Ethernet w/LAM MPI or MPICH
> - Infiniband (Mellanox equipment) w/NCSA MPI or OSU MPI
> Mikhail Kuzminsky
> Zelinsky Institute of Organic Chemistry