[Beowulf] MPICH vs. OpenMPI

Fri Apr 25 03:07:28 PDT 2008

Hi Jan,

At Wed, 23 Apr 2008 20:37:06 +0200, Jan Heichler <jan.heichler at gmx.net> wrote:
> From what I saw, OpenMPI has several advantages:
>
>- better performance on multi-core systems
>  because of a good shared-memory implementation

A couple of months ago, I conducted a thorough study of the
intra-node performance of different MPIs on Intel Woodcrest and
Clovertown systems. I systematically tested point-to-point
performance between processes on a) the same die on the same
socket (sdss), b) different dies on the same socket (ddss) (not
applicable on Woodcrest, of course), and c) different dies on
different sockets (ddds). I also measured the message rate using
all 4 or 8 cores on the node. The point-to-point benchmarks used
were ping-ping and ping-pong (Scali's 'bandwidth' and
osu_latency + osu_bandwidth).

I evaluated Scali MPI Connect 5.5 (SMC), SMC 5.6, 
HP MPI 2.0.2.2, MVAPICH 0.9.9, MVAPICH2 0.9.8, Open MPI 1.1.1.

Of these, Open MPI was the slowest for all
benchmarks and all machines, up to 10 times slower than SMC 5.6.

Now, since Open MPI 1.1.1 is quite old, I just
redid the message rate measurement on an X5355
(Clovertown, 2.66 GHz). With an 8-byte message size,
OpenMPI 1.2.2 achieves 5.5 million messages per
second, whereas SMC 5.6.2 reaches 16.9 million
messages per second (using all 8 cores on the node, i.e., 8 MPI processes).

Comparing OpenMPI 1.2.2 with SMC 5.6.1 on
ping-ping latency (in usec) with an 8-byte payload yields:

mapping OpenMPI   SMC
sdss       0.95  0.18
ddss       1.18  0.12
ddds       1.03  0.12
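
As a quick sanity check, the ratios implied by the figures quoted in
this message can be worked out as below. The numbers are copied from
the text; the variable and function names are only illustrative.

```python
# Ping-ping latency in usec for an 8-byte payload: (OpenMPI 1.2.2, SMC 5.6.1)
latency_usec = {
    "sdss": (0.95, 0.18),
    "ddss": (1.18, 0.12),
    "ddds": (1.03, 0.12),
}

# Message rate in millions of messages per second: (OpenMPI 1.2.2, SMC 5.6.2)
msg_rate_millions = (5.5, 16.9)


def latency_ratios(table):
    """How many times higher the OpenMPI latency is than the SMC latency."""
    return {mapping: open_mpi / smc for mapping, (open_mpi, smc) in table.items()}


def rate_ratio(rates):
    """How many times higher the SMC message rate is than the OpenMPI rate."""
    open_mpi, smc = rates
    return smc / open_mpi


for mapping, ratio in latency_ratios(latency_usec).items():
    print(f"{mapping}: OpenMPI latency is {ratio:.1f}x that of SMC")
print(f"message rate: SMC is {rate_ratio(msg_rate_millions):.1f}x OpenMPI")
# sdss: OpenMPI latency is 5.3x that of SMC
# ddss: OpenMPI latency is 9.8x that of SMC
# ddds: OpenMPI latency is 8.6x that of SMC
# message rate: SMC is 3.1x OpenMPI
```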

So, Jan, I would be very curious to see any documentation supporting your claim above!

Thanks, Håkon
Disclaimer: I work for Scali and may be biased.