[Beowulf] MPI2007 out - strange pop2 results?

Fri Jul 20 18:20:54 PDT 2007

Gilad,

And you would never compare your products against our deprecated
drivers and five-year-old hardware. ;-)

Sorry, couldn't resist. My colleagues are rolling their eyes...

Scot

On Jul 20, 2007, at 2:55 PM, Gilad Shainer wrote:

> Hi Kevin,
>
> I believe that your company has been using this list for pure marketing
> wars for a long time, so don't be surprised when someone responds.
>
> If you want to put forward technical or performance data and then draw
> conclusions from it, be sure to compare apples to apples. It is easy to
> use the results of your competitor's lower-performance device and then
> to attack his "architecture" or his entire product line. If this is not
> a marketing war, then I would be interested to know what you call a
> marketing war....
>
> G
>
>
> -----Original Message-----
> From: Kevin Ball [mailto:kevin.ball at qlogic.com]
> Sent: Friday, July 20, 2007 11:27 AM
> To: Gilad Shainer
> Cc: Brian Dobbins; beowulf at beowulf.org
> Subject: RE: [Beowulf] MPI2007 out - strange pop2 results?
>
> Hi Gilad,
>
>   Thank you for the personal attack that came, apparently, without even
> reading the email I sent.  Brian asked why the publicly available,
> independently run MPI2007 results from HP were worse on a particular
> benchmark than the Cambridge cluster MPI2007 results.  I talked about
> three contributing factors to that.  If you have other reasons you want
> to put forward, please do so based on data, rather than engaging in a
> blatant ad hominem attack.
>
>   If you want to engage in a marketing war, there are venues in which to
> do it, but I think on the Beowulf mailing list data and coherent thought
> are probably more appropriate.
>
> -Kevin
>
> On Fri, 2007-07-20 at 10:43, Gilad Shainer wrote:
>> Dear Kevin,
>>
>> You continue to set world records in providing misleading information.
>> You had previously compared Mellanox-based products on dual
>> single-core machines to the "InfiniPath" adapter on dual dual-core
>> machines and claimed that with InfiniPath there are more Gflops....
>> This latest release follows the same lines...
>>
>> Unlike QLogic InfiniPath adapters, Mellanox provides different
>> InfiniBand HCA silicon and adapters. There are 4 different silicon
>> chips, each with different size, different power, different price and
>> different performance. There is the PCI-X device (InfiniHost), the
>> single-port device that was designed for best price/performance
>> (InfiniHost III Lx), the dual-port device that was designed for best
>> performance (InfiniHost III Ex) and the new ConnectX device that was
>> designed to extend the performance capabilities of the dual-port
>> device. Each device provides different price and performance points
>> (did I say different?).
>>
>> The SPEC results that you are using for Mellanox are of the
>> single-port device. And even that device (whose list price is probably
>> half that of your InfiniPath) had better results with 8 server nodes
>> than yours.... Your comparison of InfiniPath to the Mellanox
>> single-port device should have been on price/performance and not on
>> performance. Now, if you want to really compare performance to
>> performance, why don't you use the dual-port device, or even better,
>> ConnectX? Well... I will do it for you. Every time I have compared my
>> performance adapters to yours, your adapters did not even come
>> close...
>>
>>
>> Gilad.
>>
>> -----Original Message-----
>> From: beowulf-bounces at beowulf.org [mailto:beowulf-bounces at beowulf.org]
>> On Behalf Of Kevin Ball
>> Sent: Thursday, July 19, 2007 11:52 AM
>> To: Brian Dobbins
>> Cc: beowulf at beowulf.org
>> Subject: Re: [Beowulf] MPI2007 out - strange pop2 results?
>>
>> Hi Brian,
>>
>>    The benchmark 121.pop2 is based on a code that was already
>> important to QLogic customers before the SPEC MPI2007 suite was
>> released (POP, Parallel Ocean Program), and we have done a fair amount
>> of analysis trying to understand its performance characteristics.
>> There are three things that stand out in performance analysis on pop2.
>>
>>   The first point is that the code is very demanding of the compiler.
>> There has been a fair amount of work on pop2 by the PathScale compiler
>> team, and the fact that the Cambridge submission used the PathScale
>> compiler while the HP submission used the Intel compiler accounts for
>> some (the serial portion) of the advantage at small core counts,
>> though scalability should not be affected by this.
>>
>>   The second point is that pop2 is fairly demanding of IO.  Another
>> example to look at for this is in comparing the AMD Emerald Cluster
>> results to the Cambridge results;  the Emerald cluster is using NFS
>> over GigE from a single server/disk, while Cambridge has a much more
>> optimized IO subsystem.  While on some results Emerald scales better,
>> for pop2 it scales only from 3.71 to 15.0 (4.04X) while Cambridge
>> scales from 4.29 to 21.0 (4.90X).  The HP system appears to be using
>> NFS over DDR IB from a single server with a RAID;  thus it should fall
>> somewhere between Emerald and Cambridge in this regard.
>>
>>   The first two points account for some of the difference, but by no
>> means all.  The final one is probably the most crucial.  The code pop2
>> uses a communication pattern consisting of many small/medium sized
>> (between 512 bytes and 4k) point to point messages punctuated by
>> periodic tiny (8b) allreduces.  The QLogic InfiniPath architecture
>> performs far better in this regime than the Mellanox InfiniHost
>> architecture.
>>
>>   This is consistent with what we have seen in other application
>> benchmarking;  even SDR InfiniBand based on the QLogic InfiniPath
>> architecture performs in general as well as DDR InfiniBand based on
>> the Mellanox InfiniHost architecture, and in some cases better.
>>
>>
>> Full disclosure:  I work for QLogic on the InfiniPath product line.
>>
>> -Kevin
>>
>>
>> On Wed, 2007-07-18 at 18:50, Brian Dobbins wrote:
>>> Hi guys,
>>>
>>>   Greg, thanks for the link!  It will no doubt take me a little
>>> while to parse all the MPI2007 info (even though there are only a
>>> few submitted results at the moment!), but one of the first things I
>>> noticed was that performance of pop2 on the HP blade system was
>>> beyond atrocious... any thoughts on why this is the case?  I can't
>>> see any logical reason for the scaling they have, which (being the
>>> first thing I noticed) makes me somewhat hesitant to put much stock
>>> into the results at the moment.  Perhaps this system is just a
>>> statistical blip on the radar which will fade into noise when
>>> additional results are posted, but until that time, it'd be nice to
>>> know why the results are the way they are.
>>>
>>>   To spell it out a bit, the reference platform is at 1 (ok, 0.994)
>>> on 16 cores, but then the HP blade system at 16 cores is at 1.94.
>>> Not bad there.  However, moving up we have:
>>>
>>>    32 cores - 2.36
>>>    64 cores - 2.02
>>>   128 cores - 2.14
>>>   256 cores - 3.62
>>>
>>>   So not only does it hover at 2.x for a while, but then going from
>>> 128 -> 256 it gets a decent relative improvement.  Weird.
>>>   On the other hand, the Cambridge system (with the same processors
>>> and a roughly similar interconnect, it seems) has the following
>>> scaling from 32->256 cores:
>>>
>>>    32 cores - 4.29
>>>    64 cores - 7.37
>>>   128 cores - 11.5
>>>   256 cores - 15.4
>>>
>>>   ... So, I'm mildly confused as to the first results.  Granted,
>>> different compilers are being used, and presumably there are other
>>> differences, too, but I can't see how -any- of them could result in
>>> the scores the HP system got.  Any thoughts?  Anyone from HP (or
>>> QLogic) care to comment?  I'm not terribly knowledgeable about the
>>> MPI 2007 suite yet, unfortunately, so maybe I'm just overlooking
>>> something.
>>>
>>>   Cheers,
>>>   - Brian
>>>
>>>
>>> ____________________________________________________________________
>>> __ _______________________________________________
>>> Beowulf mailing list, Beowulf at beowulf.org To change your
>>> subscription (digest mode or unsubscribe) visit
>>> http://www.beowulf.org/mailman/listinfo/beowulf
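
The scaling figures quoted in the messages above can be checked with a few lines of Python. This is only a sketch: the names HP_BLADE, CAMBRIDGE, speedup and scaling_table are made up for illustration, the scores are copied as posted in the thread, and measuring parallel efficiency relative to the 32-core run is one common convention, not something the posters themselves computed.

```python
# 121.pop2 scores as quoted in the thread (higher is better),
# keyed by core count. Nothing here comes from outside the thread.
HP_BLADE = {32: 2.36, 64: 2.02, 128: 2.14, 256: 3.62}
CAMBRIDGE = {32: 4.29, 64: 7.37, 128: 11.5, 256: 15.4}


def speedup(low_score, high_score):
    """Ratio of two scores, i.e. the "X" factor quoted in the thread."""
    return high_score / low_score


def scaling_table(scores):
    """Return (cores, score, speedup, efficiency) rows, all measured
    relative to the smallest core count present in `scores`."""
    base_cores = min(scores)
    base_score = scores[base_cores]
    rows = []
    for cores in sorted(scores):
        s = speedup(base_score, scores[cores])
        ideal = cores / base_cores  # what perfect linear scaling would give
        rows.append((cores, scores[cores], s, s / ideal))
    return rows


if __name__ == "__main__":
    for name, scores in (("HP blade", HP_BLADE), ("Cambridge", CAMBRIDGE)):
        print(name)
        for cores, score, s, eff in scaling_table(scores):
            print(f"  {cores:4d} cores  score {score:5.2f}  "
                  f"speedup {s:4.2f}x  efficiency {eff:4.0%}")
    # The two ratios given in the IO discussion (4.04X and 4.90X).
    print(f"Emerald:   {speedup(3.71, 15.0):.2f}x")
    print(f"Cambridge: {speedup(4.29, 21.0):.2f}x")
```

Run as a script, this prints one row per core count for each system, which makes the flat region in the HP results between 32 and 128 cores easy to see next to the steadier Cambridge numbers.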