[Beowulf] Re: Re: Home beowulf - NIC latencies

Patrick Geoffray patrick at myri.com
Wed Feb 16 02:53:03 PST 2005


Keith D. Underwood wrote:
>>Looking for overlapping is actually not that hard:
>>a) look for medium/large messages, don't waste time on small ones.
> I contend that this particular item is bad advice.  If you send a lot of
> small messages, you should use MPI_Isend there as well to give the MPI
> implementation every opportunity to do the right thing.  As we go
> forward, end-to-end acknowledgments are going to become a reality.  The

I agree. We are strongly considering acking at the lib level instead of 
at the firmware level in MX. It has many good side effects, and a few 
evil ones.

> last thing you want is to spend a round-trip delay on every message you
> send if you send a lot of them.  Yes, the implementation can copy on the
> sending side to allow the send to complete, but that wastes memory and
> time.  

If the protocol is reliable, you need to be able to resend the data if
you don't receive the ack in time. If you don't want to make a copy, you
have to wait for the ack before releasing the send buffer. For small
messages, the copy is cheaper than the round-trip time (RTT), IMHO.
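
To put rough numbers on the copy-versus-RTT trade-off, here is a minimal cost-model sketch. The round-trip time and copy bandwidth are illustrative assumptions, not measurements of MX or any particular NIC, and the function names are hypothetical.

```python
# Toy cost model: copy the payload into a retransmit buffer, or hold the
# user's send buffer until the ack arrives (one round trip).
# Both constants are illustrative assumptions, not measured values.
ASSUMED_RTT_S = 10e-6      # assumed round-trip time: 10 microseconds
ASSUMED_COPY_BW = 1e9      # assumed memory-copy bandwidth: 1 GB/s


def copy_time(size_bytes, copy_bw=ASSUMED_COPY_BW):
    """Seconds needed to copy the payload into a library-owned buffer."""
    return size_bytes / copy_bw


def cheaper_strategy(size_bytes, rtt=ASSUMED_RTT_S, copy_bw=ASSUMED_COPY_BW):
    """Return 'copy' if copying costs less than one round trip, else 'wait'."""
    return "copy" if copy_time(size_bytes, copy_bw) < rtt else "wait"


def crossover_bytes(rtt=ASSUMED_RTT_S, copy_bw=ASSUMED_COPY_BW):
    """Message size at which the copy takes as long as one round trip."""
    return rtt * copy_bw


if __name__ == "__main__":
    for size in (64, 1024, 65536, 1 << 20):
        print(size, cheaper_strategy(size))
```

Under these assumed numbers the crossover is around 10 KB, so the copy wins for small messages. The model only compares the two costs for a sender that blocks; it ignores that a nonblocking send could overlap the wait for the ack with other work.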

Are you saying that if someone uses Isend to send small messages, it's a
hint that avoiding the copy is worth it, because they are trying to
overlap and do not care about latency? Yes, that would be logical. But
then blocking Send has to hint the reverse, and you are assuming that
smart people use blocking Send because they know latency matters at
that point, whereas clueless people use it simply because it's simpler
than Isend.


Patrick Geoffray
Myricom, Inc.
