[Beowulf] 512 nodes Myrinet cluster Challenges

Wed May 3 09:57:03 PDT 2006

Hi Patrick,

On Wed, 2006-05-03 at 01:54, Patrick Geoffray wrote:
> Vincent,
> 
> Vincent Diepeveen wrote:
> > Just measure the random ring latency of that 1024 nodes Myri system and 
> > compare.
> > There are several tables around with the random ring latency.
> > 
> > http://icl.cs.utk.edu/hpcc/hpcc_results.cgi
> 
> I just ran it on an 8-node dual Opteron 2.2 GHz cluster with F cards (Myrinet-2G) 
> running MX-1.1.1 on 2.6.15 (one process per node):

  Do you have results with two or four processes per node that you could
share?  We've found that for the random ring tests, those tend to be
more interesting than the one-process-per-node results and, we believe,
more relevant to application performance.  When the random ring tests
run on every process in a node, all of those processes try to talk
through the network at the same time.  This is much closer to how an
application (particularly a tightly coupled application, which is where
all of these high performance interconnects shine) will use the
network.
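
  As a rough illustration of that point, here is a small Python sketch
(not part of HPCC; the function name and the block placement of ranks
onto nodes are assumptions made for the example).  It shuffles the
ranks into a random ring and counts how many ring links each node has
to push through the network on average:

```python
import random

def off_node_links_per_node(nodes, ppn, seed=0):
    """Average number of ring links per node that cross the network.

    Assumes (hypothetically) that rank r is placed on node r // ppn.
    """
    rng = random.Random(seed)
    ranks = list(range(nodes * ppn))
    rng.shuffle(ranks)  # random ring order over all processes
    n = len(ranks)
    crossing = 0
    for i in range(n):
        a, b = ranks[i], ranks[(i + 1) % n]  # ring neighbours
        if a // ppn != b // ppn:             # neighbours on different nodes
            crossing += 1
    # Every crossing link has an endpoint on two different nodes.
    return 2 * crossing / nodes

if __name__ == "__main__":
    for ppn in (1, 2, 4):
        print(ppn, off_node_links_per_node(8, ppn))
```

  With one process per node, every node carries exactly two ring links.
With more processes per node, each node has several links active on
the same adapter at once, which is the contention described above.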

> 
> This is nowhere near the few reported Myrinet results which, I am 
> guessing, are all running GM on either C cards or D cards. There are no 
> recent results running MX on recent hardware. You can also notice that 
> there are no QsNetIII results, which would be very close to the top in 
> terms of latency.

  I think you are right that all of the Myrinet results are out of
date.  However, it is also worth noting that these are average latency
and bandwidth results; unlike some of the other benchmarks in HPCC, they
get worse as the size of the cluster grows.  Some of the difference
between your 8-node numbers and the published results from larger
systems will therefore be due to cluster size alone.
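
  One contributing factor can be worked out directly (this is my own
back-of-the-envelope model, not something HPCC reports, and it only
applies with more than one process per node).  In a randomly ordered
ring, the chance that a process's neighbour sits on the same node, and
so avoids the network, shrinks as the node count grows.  Extra switch
hops on larger fabrics are not modelled here.  The function name is
hypothetical:

```python
from fractions import Fraction

def same_node_neighbor_probability(nodes, ppn):
    """Chance that a given ring neighbour shares the process's node.

    In a uniformly random ring over nodes * ppn processes, a neighbour
    is equally likely to be any of the other processes, and ppn - 1 of
    those live on the same node.
    """
    total = nodes * ppn
    if total < 2:
        return Fraction(0)
    return Fraction(ppn - 1, total - 1)

if __name__ == "__main__":
    for nodes in (8, 64, 1024):
        p = same_node_neighbor_probability(nodes, 2)
        print(nodes, p, float(p))
```

  So an average over a small cluster includes a noticeable share of
fast intra-node exchanges, while on a large cluster nearly every
exchange goes over the interconnect.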

  I must admit some curiosity as to why there aren't more recent results
from both Myrinet and Quadrics.

> 
> I find it amusing that you have previously expressed reservation about 
> the Linpack benchmark used in the Top500 but you blindly trust the HPCC 
> results. A benchmark is useful only if it's widely used and if it is 
> properly implemented. HPCC is neither. It has many many flaws that I 
> have reported and that have been ignored so far. A major one is to 
> penalize large clusters, especially on the ring latency test.

  Judging via HPCC is, as both you and Joe point out, vastly inferior to
benchmarking your own application.  Its advantage (as with Linpack) is
that there is a central repository of results to compare against one
another.  It is better than Linpack because its tests stress the
interconnect and memory subsystems of a cluster much more strenuously.

  It would be wonderful if someone (preferably CPU/Interconnect neutral)
would sign up to set up and maintain a similar page for application
benchmark results.  I personally have spent many hours scouring the web
for such results to get a feel for how PathScale products (both our
interconnect and our compilers) are doing relative to the competition.
This type of information is not only useful for vendors, but would be
incredibly useful for customers.

  In the absence of such a central repository, we post many application
scaling and performance results on our website in the form of a white
paper.  We would be very happy if other vendors did the same, but better
still would be an independent body hosting such results.

> The key is to set the right requirements in your RFP. Naive RFPs would
> use broken benchmarks like HPCC. Smart RFPs would require benchmarking
> real application cores under reliability and performance constraints.

  Here we are absolutely in agreement.  People should benchmark on the
codes they wish to run.  Unfortunately, RFPs are not always written
well.  We often see requests for only synthetic benchmarks, or requests
with applications so unportable that it takes weeks to get them to work
anywhere but on the (now obsolete) platform the writers were previously
using, or requests for runs on node counts higher than any vendor has
available.

  Smaller purchasers usually don't have the privilege of writing an RFP
and waiting for vendors to come knocking.  They will generally have to
get in line for one (or several) of the many small test clusters around
and try to test their application.  It would be nice if there were more
information on the web for common applications, so someone who wants to
run one of them would at least have a good place to start, instead of
taking a random walk through available test clusters.

-Kevin