[Beowulf] Re: Re: Home beowulf - NIC latencies

Tue Feb 15 10:43:18 PST 2005

Patrick Geoffray wrote:
> A last remark. I really think that the argument of using the same 
> swiss-army-knife MPI implementation such as ScaMPI or Intel MPI or even 
> MPI/Pro to infer interconnect characteristics is even worse than 
> looking at latency and bandwidth alone. These implementations are never 
> going to be designed to use all hardware efficiently; their design is 
> either historic (Scali used to provide software for SCI alone) or 
> politically motivated (Intel is using uDapl, hummm, wonder why), or both. 

The two most important things done to optimise performance of an MPI 
implementation for a hardware platform are:
- low-level pt-2-pt communication
- collective operations

AFAIK, Myrinet's MPI (MPICH-GM), for example, does use the standard 
(partly naive) collective operations of MPICH. Considering this, plus 
the fact
- that it's not all that hard to use GM efficiently for pt-2-pt. We have 
done this in our MPI, too, with the same level of performance.
- that you probably do not know anything about ScaMPI's current internal 
design (Intel MPI is MPICH2 plus some Intel-proprietary device hacking) and 
little about its performance (if this is wrong, let us know)
- that all code apart from the device, and also the device architecture 
of MPICH-GM, are more or less 10-year-old swiss-army-knife MPICH code 
(which is not a bad thing per se)
you should maybe think again before judging the efficiency of other 
MPI implementations.

  Joachim

-- 
Joachim Worringen - NEC C&C research lab St.Augustin
fon +49-2241-9252.20 - fax .99 - http://www.ccrl-nece.de