[Beowulf] Re: Re: Home beowulf - NIC latencies

Patrick Geoffray patrick at myri.com
Wed Feb 16 00:39:17 PST 2005

Hi Keith,

Keith D. Underwood wrote:
> Inertia is a powerful thing.  Billions of dollars have been invested in
> MPI codes.  Changing that will not be easy (or cheap).  This is not as
> simple as moving from vectors to distributed memory - there wasn't
> nearly as much accumulated code then (and, it hurt back then). 

I would not drop the whole MPI standard, I would define a subset that is 
the recommended API for performance. If your code is too old, link with 
a legacy MPI lib. If it's coded with the subset, link either with a 
legacy MPI lib and it works, or link with the optimized MPI lib and see 
what the MPI implementation can deliver.

>>It's used because it's there, there is no other reason. If you don't 
>>know who sends you what in a message passing application, then you 
>>cannot get either performance or robustness. If really you cannot do 
>>otherwise (and I don't believe that), you can always use unexpected 
>>messages (post the receive after Probe()ing), That's ugly, but you get 
>>what you deserved :-)
> That just isn't true.  If I don't know how many messages I will get, or
> from whom, but I can bound it, then I should prepost those receives. 
> This is particularly true in your standard physics code that runs for
> days and does thousands of time steps. (i.e. you can maintain a circular
> queue of these things).

A few years back, I looked at a lot of real world code to see if 
triggering the communication from the receive side could be worth it, i.e. 
if most of the messages did not use ANY_SENDER. I was amazed that the 
vast majority of the messages sent across many applications used the tag 
to discriminate on the sender among other things, not the source. For 
the couple of large codes I dissected (sorry, don't remember the names 
right now), there was no rationale. I guess doing bookkeeping on the 
source and the tag was too much for the developer(s).

You can still do the receive-pull optimization and fall back on 
sender-push when you see a receive with ANY_SENDER, but if ANY_SENDER is 
the common case, that's useless. The best way to force developers to 
write code that can leverage optimization in the MPI lib is to remove 
the source of the ambiguity. So ANY_SENDER in the legacy API, not in the 
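
To make the fallback above concrete, here is a minimal sketch of the per-receive decision, not how any real MPI lib does it. ANY_SENDER is a stand-in constant for the wildcard source (the standard calls it MPI_ANY_SOURCE), and choose_protocol and pull_fraction are hypothetical names made up for illustration.

```python
# Hypothetical stand-in for the MPI wildcard source (MPI_ANY_SOURCE in the standard).
ANY_SENDER = -1


def choose_protocol(posted_source):
    """Pick which side triggers the transfer for one posted receive.

    A receive that names its source can pull from that sender; a wildcard
    receive cannot know whom to ask, so it falls back on sender-push.
    """
    if posted_source == ANY_SENDER:
        return "sender-push"
    return "receive-pull"


def pull_fraction(posted_sources):
    """Fraction of posted receives that can use receive-pull."""
    if not posted_sources:
        return 0.0
    pulls = sum(1 for s in posted_sources if choose_protocol(s) == "receive-pull")
    return pulls / len(posted_sources)


if __name__ == "__main__":
    # Two receives name a source, two use the wildcard.
    print(pull_fraction([0, 1, ANY_SENDER, ANY_SENDER]))
```

If wildcard receives dominate, pull_fraction tends to zero and the optimization buys nothing, which is the point made above.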

> The user should always expose as much opportunity for optimization as
> possible to the MPI layer.  e.g. a load-store architecture like the X1
> (not what I am advocating for MPI performance, mind you) could do
> excellent datatype processing.  You would rather the user do the
> gather/scatter themselves to prohibit the MPI from being able to do it?

In general yes, more opportunities for optimization is better. Now, 
assuming that irregular datatypes can be optimized as much as regular 
ones is wrong. The hardware can gather/scatter better than the 
application for nice long strides. However, MPI libs should print 
insults when tiny segments are used (when the scatter/gather efficiency 
collapses). The developer assumes that it's fine because he does not 
know or he does not care.
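
A rough sketch of what such a warning could look like, under a toy cost model that is purely an assumption: every segment pays a fixed overhead, expressed here as an equivalent number of bytes. The 64-byte overhead, the 0.5 threshold, and the names gather_efficiency and check_datatype are all hypothetical, not taken from any MPI implementation.

```python
import warnings


def gather_efficiency(segment_bytes, per_segment_overhead_bytes=64):
    """Share of the cost spent on payload under a fixed per-segment overhead.

    Toy model (assumption): cost per segment = payload + constant overhead.
    """
    if segment_bytes <= 0:
        raise ValueError("segment_bytes must be positive")
    return segment_bytes / (segment_bytes + per_segment_overhead_bytes)


def check_datatype(segment_bytes, min_efficiency=0.5, per_segment_overhead_bytes=64):
    """Return the modeled efficiency and warn when segments are too small."""
    eff = gather_efficiency(segment_bytes, per_segment_overhead_bytes)
    if eff < min_efficiency:
        warnings.warn(
            f"segments of {segment_bytes} bytes: modeled scatter/gather "
            f"efficiency is {eff:.2f}, consider packing by hand"
        )
    return eff


if __name__ == "__main__":
    check_datatype(4096)  # long segments: no warning in this model
    check_datatype(8)     # tiny segments: emits a warning
```

Real hardware will have a different curve, but the shape is the same: efficiency is fine for long segments and collapses for tiny ones.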

I advocate hiding the guns instead of letting the developer shoot 
himself in the foot.


Patrick Geoffray
Myricom, Inc.
