[Beowulf] Re: Re: Home beowulf - NIC latencies

Mikhail Kuzminsky kus at free.net
Mon Feb 14 07:47:15 PST 2005

In message from Rob Ross <rross at mcs.anl.gov> (Fri, 11 Feb 2005 
20:47:22 -0600 (CST)):
>Hi Isaac,
>On Fri, 11 Feb 2005, Isaac Dooley wrote:
>> >> Using MPI_Isend() allows programs to not waste CPU cycles waiting 
>> >> on the completion of a message transaction.
>> > No, it allows the programmer to express that it wants to send a 
>> > message but not wait for it to complete right now.  The API doesn't 
>> > specify semantics of CPU utilization.  It cannot, because the API 
>> > doesn't have knowledge of the hardware that will be used in the 
>> > implementation.
>> That is partially true.  The context for my comment was under your 
>> assumption that everyone uses MPI_Send(). These people, as I stated 
>> before, do not care about what the CPU does during their blocking 
>> calls.
>I think that it is completely true.  I made no assumption about 
>everyone using MPI_Send(); I'm a late-comer to the conversation. 
>I was not trying to say anything about what people making the calls 
>care about; I was trying to clarify what the standard does and does 
>not specify.  However, I agree with you that it is unlikely that 
>someone calling MPI_Send() is too worried about what the CPU 
>utilization is during the call.
>> I was trying to point out that programs utilizing non-blocking I/O 
>> have work that will be adversely impacted by CPU utilization for 
>> messaging. These are the people who care about CPU utilization for 
>> messaging. I hope this answers your prior question, at least in part.
>I agree that people using MPI_Isend() and related non-blocking calls 
>are sometimes doing so because they would like to perform some 
>computation while the communication progresses.  People also use these 
>calls to initiate a collection of point-to-point operations before 
>waiting, so that multiple communications may proceed in parallel.

Let me ask a stupid question: which MPI implementations allow you
a) to overlap MPI_Isend with computation, and
b) to perform a set of subsequent MPI_Isend calls faster than "the 
same" set of MPI_Send calls?

I'm talking only about sending large messages.
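By (a) I mean roughly the following pattern (a sketch only; compute_something() is a hypothetical placeholder for the application's own work, and whether it truly overlaps the transfer is exactly the open question):

```c
/* Pattern (a): post a non-blocking send, compute, then wait.
 * Whether the computation really proceeds while a large message
 * moves depends on the MPI implementation and the interconnect. */
#include <mpi.h>

extern void compute_something(void);  /* hypothetical application work */

void send_with_overlap(double *buf, int n, int dest, MPI_Comm comm)
{
    MPI_Request req;

    /* Post the send; the call returns before the data has moved. */
    MPI_Isend(buf, n, MPI_DOUBLE, dest, 0, comm, &req);

    compute_something();  /* work we hope overlaps the transfer */

    /* Block until the send buffer may safely be reused. */
    MPI_Wait(&req, MPI_STATUS_IGNORE);
}
```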

I'm interested (first of all) in:
- Gigabit Ethernet w/LAM MPI or MPICH
- InfiniBand (Mellanox equipment) w/NCSA MPI or OSU MPI
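One way to answer (a) empirically on any of these stacks is to time the Isend/compute/Wait sequence and compare it with the transfer and compute times measured separately (a sketch; compute_for_a_while() is a hypothetical kernel of known duration):

```c
/* If the returned time is close to max(t_comm, t_compute), the
 * implementation overlaps communication with computation; if it is
 * close to t_comm + t_compute, it does not. */
#include <mpi.h>

extern void compute_for_a_while(void);  /* hypothetical compute kernel */

double time_isend_overlap(double *buf, int n, int dest, MPI_Comm comm)
{
    MPI_Request req;
    double t0 = MPI_Wtime();

    MPI_Isend(buf, n, MPI_DOUBLE, dest, 0, comm, &req);
    compute_for_a_while();
    MPI_Wait(&req, MPI_STATUS_IGNORE);

    return MPI_Wtime() - t0;
}
```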

Mikhail Kuzminsky
Zelinsky Institute of Organic Chemistry
> The 
>implementation has no way of really knowing which of these is the 
>case.  Greg just pointed out that for small messages most 
>implementations will do the exact same thing as in the MPI_Send() 
>case anyway.  For large messages I suppose that something different 
>could be done.  In our implementation (MPICH2), to my knowledge we 
>do not differentiate.
>You should understand that the way MPI implementations are measured 
>is by their performance, not CPU utilization, so there is pressure to 
>push the former as much as possible at the expense of the latter.
>> Perhaps your applications demand low latency with no concern for 
>> the CPU during the time spent blocking. That is fine. But some 
>> applications benefit from overlapping computation and communication, 
>> and the cycles not wasted by the CPU on communication can be used 
>> productively.
>I wouldn't categorize the cycles spent on communication as "wasted"; 
>it's not like we code in extraneous math just to keep the CPU pegged :).
>Rob Ross, Mathematics and Computer Science Division, Argonne National Laboratory

More information about the Beowulf mailing list