<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html><head><title></title>
<META http-equiv=Content-Type content="text/html; charset=iso-8859-1">
<meta http-equiv="Content-Style-Type" content="text/css">
<style type="text/css"><!--
body {
margin: 5px 5px 5px 5px;
background-color: #ffffff;
}
/* ---------- Text Styles ---------- */
hr { color: #000000}
body, table /* Normal text */
{
font-size: 9pt;
font-family: 'Courier New';
font-style: normal;
font-weight: normal;
color: #000000;
text-decoration: none;
}
span.rvts1 /* Heading */
{
font-size: 10pt;
font-family: 'Arial';
font-weight: bold;
color: #0000ff;
}
span.rvts2 /* Subheading */
{
font-size: 10pt;
font-family: 'Arial';
font-weight: bold;
color: #000080;
}
span.rvts3 /* Keywords */
{
font-size: 10pt;
font-family: 'Arial';
font-style: italic;
color: #800000;
}
a.rvts4, span.rvts4 /* Jump 1 */
{
font-size: 10pt;
font-family: 'Arial';
color: #008000;
text-decoration: underline;
}
a.rvts5, span.rvts5 /* Jump 2 */
{
font-size: 10pt;
font-family: 'Arial';
color: #008000;
text-decoration: underline;
}
span.rvts6
{
font-weight: bold;
color: #800000;
}
a.rvts7, span.rvts7
{
color: #0000ff;
text-decoration: underline;
}
span.rvts8
{
font-weight: bold;
color: #800000;
}
span.rvts9
{
font-weight: bold;
color: #800080;
}
/* ---------- Para Styles ---------- */
p,ul,ol /* Paragraph Style */
{
text-align: left;
text-indent: 0px;
padding: 0px 0px 0px 0px;
margin: 0px 0px 0px 0px;
}
.rvps1 /* Centered */
{
text-align: center;
}
--></style>
</head>
<body>
<p>Hello Håkon,</p>
<p><br></p>
<p>On Friday, 25 April 2008, you wrote:</p>
<p><br></p>
<p><span class=rvts6>HB> Hi Jan,</span></p>
<p><br></p>
<p><span class=rvts6>HB> At Wed, 23 Apr 2008 20:37:06 +0200, Jan Heichler <</span><a class=rvts7 href="mailto:jan.heichler@gmx.net">jan.heichler@gmx.net</a><span class=rvts8>> wrote:</span></p>
<p><span class=rvts9>>> From what I saw OpenMPI has several advantages:</span></p>
<p><br></p>
<p><span class=rvts9>>>- better performance on MultiCore Systems </span></p>
<p><span class=rvts9>>>because of good shared-memory-implementation</span></p>
<p><br></p>
<p><br></p>
<p><span class=rvts6>HB> A couple of months ago, I conducted a thorough </span></p>
<p><span class=rvts6>HB> study on intra-node performance of different MPIs </span></p>
<p><span class=rvts6>HB> on Intel Woodcrest and Clovertown systems. I </span></p>
<p><span class=rvts6>HB> systematically tested pnt-to-pnt performance </span></p>
<p><span class=rvts6>HB> between processes on a) the same die on the same </span></p>
<p><span class=rvts6>HB> socket (sdss), b) different dies on same socket </span></p>
<p><span class=rvts6>HB> (ddss) (not on Woodcrest of course) and c) </span></p>
<p><span class=rvts6>HB> different dies on different sockets (ddds). I </span></p>
<p><span class=rvts6>HB> also measured the message rate using all 4 / 8 </span></p>
<p><span class=rvts6>HB> cores on the node. The pnt-to-pnt benchmarks used </span></p>
<p><span class=rvts6>HB> were ping-ping and ping-pong (Scali's 'bandwidth' and osu_latency+osu_bandwidth).</span></p>
<p><br></p>
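<p>(For readers who haven't seen these microbenchmarks: the ping-pong pattern that osu_latency and similar tools measure looks roughly like the sketch below - not Scali's or OSU's actual code, and the iteration counts and 8-byte size are only illustrative.)</p>
<pre>
/* Ping-pong latency sketch: rank 0 sends a small message to rank 1,
 * which echoes it back; half of the averaged round-trip time is
 * reported as the one-way latency. */
#include &lt;mpi.h&gt;
#include &lt;stdio.h&gt;

int main(int argc, char **argv)
{
    const int iters = 10000, warmup = 1000, size = 8;
    char buf[8];
    int rank;
    double t0 = 0.0;

    MPI_Init(&amp;argc, &amp;argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &amp;rank);

    for (int i = 0; i &lt; iters + warmup; i++) {
        if (i == warmup)
            t0 = MPI_Wtime();            /* start timing after warm-up */
        if (rank == 0) {
            MPI_Send(buf, size, MPI_CHAR, 1, 0, MPI_COMM_WORLD);
            MPI_Recv(buf, size, MPI_CHAR, 1, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        } else if (rank == 1) {
            MPI_Recv(buf, size, MPI_CHAR, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
            MPI_Send(buf, size, MPI_CHAR, 0, 0, MPI_COMM_WORLD);
        }
    }

    if (rank == 0)
        printf("one-way latency: %.2f usec\n",
               (MPI_Wtime() - t0) * 1e6 / (2.0 * iters));

    MPI_Finalize();
    return 0;
}
</pre>
<p>Ping-ping is the same idea, except that both sides send before they receive, so the messages cross in flight.</p>
<p><br></p>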
<p><span class=rvts6>HB> I evaluated Scali MPI Connect 5.5 (SMC), SMC 5.6, </span></p>
<p><span class=rvts6>HB> HP MPI 2.0.2.2, MVAPICH 0.9.9, MVAPICH2 0.9.8, Open MPI 1.1.1.</span></p>
<p><br></p>
<p><span class=rvts6>HB> Of these, Open MPI was the slowest for all </span></p>
<p><span class=rvts6>HB> benchmarks and all machines, up to 10 times slower than SMC 5.6.</span></p>
<p><br></p>
<p><br></p>
<p>You're not going to share these benchmark results with us, are you? It would be very interesting to see them!</p>
<p><br></p>
<p><span class=rvts6>HB> Now since Open MPI 1.1.1 is quite old, I just </span></p>
<p><span class=rvts6>HB> redid the message rate measurement on an X5355 </span></p>
<p><span class=rvts6>HB> (Clovertown, 2.66GHz). On an 8-byte message size, </span></p>
<p><span class=rvts6>HB> OpenMPI 1.2.2 achieves 5.5 million messages per </span></p>
<p><span class=rvts6>HB> second, whereas SMC 5.6.2 reaches 16.9 million </span></p>
<p><span class=rvts6>HB> messages per second (using all 8 cores on the node, i.e., 8 MPI processes).</span></p>
<p><br></p>
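<p>(A rough sketch of what such a message-rate run does - in the style of osu_mbw_mr, not the actual benchmark, with made-up window and iteration counts, and assuming an even number of ranks: half of the ranks stream windows of 8-byte non-blocking sends to partners in the other half.)</p>
<pre>
/* Message-rate sketch: senders post WINDOW nonblocking 8-byte sends per
 * iteration, receivers post matching receives; the aggregate message
 * count divided by the elapsed time gives messages per second.
 * Assumes an even number of ranks. */
#include &lt;mpi.h&gt;
#include &lt;stdio.h&gt;

#define WINDOW 64
#define ITERS  1000
#define MSGSZ  8

int main(int argc, char **argv)
{
    int rank, nprocs;
    char buf[MSGSZ];
    MPI_Request req[WINDOW];

    MPI_Init(&amp;argc, &amp;argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &amp;rank);
    MPI_Comm_size(MPI_COMM_WORLD, &amp;nprocs);

    int half    = nprocs / 2;
    int sender  = (rank &lt; half);
    int partner = sender ? rank + half : rank - half;

    MPI_Barrier(MPI_COMM_WORLD);
    double t0 = MPI_Wtime();

    for (int it = 0; it &lt; ITERS; it++) {
        for (int w = 0; w &lt; WINDOW; w++) {
            if (sender)
                MPI_Isend(buf, MSGSZ, MPI_CHAR, partner, 0, MPI_COMM_WORLD, &amp;req[w]);
            else
                MPI_Irecv(buf, MSGSZ, MPI_CHAR, partner, 0, MPI_COMM_WORLD, &amp;req[w]);
        }
        MPI_Waitall(WINDOW, req, MPI_STATUSES_IGNORE);
    }

    MPI_Barrier(MPI_COMM_WORLD);
    double elapsed = MPI_Wtime() - t0;

    if (rank == 0)
        printf("aggregate rate: %.2f million msgs/s\n",
               (double)half * ITERS * WINDOW / elapsed / 1e6);

    MPI_Finalize();
    return 0;
}
</pre>
<p><br></p>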
<p><span class=rvts6>HB> Comparing OpenMPI 1.2.2 with SMC 5.6.1 on </span></p>
<p><span class=rvts6>HB> ping-ping latency (usec) on an 8-byte payload yields:</span></p>
<p><br></p>
<p><span class=rvts6>HB> mapping&nbsp;&nbsp;OpenMPI&nbsp;&nbsp;SMC</span></p>
<p><span class=rvts6>HB> sdss&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;0.95&nbsp;&nbsp;0.18</span></p>
<p><span class=rvts6>HB> ddss&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;1.18&nbsp;&nbsp;0.12</span></p>
<p><span class=rvts6>HB> ddds&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;1.03&nbsp;&nbsp;0.12</span></p>
<p><br></p>
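<p>(How the three mappings were pinned isn't stated above; one way to reproduce such placements by hand, independent of any MPI binding options, is to have each rank pin itself to an explicitly chosen core, e.g. with sched_setaffinity on Linux. The sketch below does that; which core IDs share a die or a socket is machine-specific and has to be looked up, e.g. in /proc/cpuinfo.)</p>
<pre>
/* Placement sketch: each rank pins itself to the core ID passed on the
 * command line, so sdss/ddss/ddds mappings can be reproduced by choosing
 * core IDs that share a die, share only a socket, or share neither.
 * Linux-specific. */
#define _GNU_SOURCE
#include &lt;sched.h&gt;
#include &lt;stdlib.h&gt;
#include &lt;mpi.h&gt;

int main(int argc, char **argv)
{
    int rank;

    MPI_Init(&amp;argc, &amp;argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &amp;rank);

    /* e.g. "mpirun -np 2 ./pingpong 0 2" pins rank 0 to core 0, rank 1 to core 2 */
    if (argc &gt; rank + 1) {
        cpu_set_t set;
        CPU_ZERO(&amp;set);
        CPU_SET(atoi(argv[rank + 1]), &amp;set);
        sched_setaffinity(0, sizeof(set), &amp;set);   /* 0 = calling process */
    }

    /* ... run the ping-ping / ping-pong kernel here ... */

    MPI_Finalize();
    return 0;
}
</pre>
<p><br></p>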
<p>Impressive numbers. But I never doubted that commercial MPIs are faster. </p>
<p><br></p>
<p><span class=rvts6>HB> So, Jan, I would be very curios to see any documentation of your claim above!</span></p>
<p><br></p>
<p>I did a benchmark of a customer application on an 8-node dual-socket, dual-core Opteron cluster - unfortunately I can't remember the name. </p>
<p><br></p>
<p>I used OpenMPI 1.2, mpich 1.2.7p1, mvapich 0.97-something and Intel MPI 3.0, IIRC.</p>
<p><br></p>
<p>I don't have the detailed data available but from my memory:</p>
<p><br></p>
<p>Latency was worst for mpich (just TCP/IP ;-) ), then Intel MPI, then OpenMPI, with mvapich the fastest. </p>
<p>On a single machine mpich was the worst, then mvapich, then OpenMPI - Intel MPI was the fastest. </p>
<p><br></p>
<p>The difference between mvapich and OpenMPI was quite big - Intel MPI had just a small advantage over OpenMPI. </p>
<p><br></p>
<p><br></p>
<p>Since this was not a low-level benchmark I don't know which communication pattern the application used, but it seemed to me that the shared-memory configuration of OpenMPI and Intel MPI was far better than that of the other two. </p>
<p><br></p>
<p>Cheers,</p>
<p>Jan</p>
</body></html>