[Beowulf] Correct networking solution for 16-core nodes

Thu Aug 3 12:53:40 PDT 2006

On Thu, Aug 03, 2006 at 11:19:44AM +0200, Joachim Worringen wrote:

> From the numbers published by Pathscale, it seems that the simple MPI 
> latency of Infinipath is about the same whether you go via PCIe or HTX. 
> The application performance might be different, though.

No, our published number is 1.29 usec for HTX and 1.6-2.0 usec for PCI
Express. It's the message rate that's about the same.

BTW there are more HTX motherboards appearing: the 3 IBM rack-mount
Opteron servers announced this Tuesday all have HTX slots:

http://www-03.ibm.com/systems/x/announcements.html

In most HTX motherboards, a riser is used to bring out either HTX or
PCI Express, so you don't have to sacrifice anything. That's why IBM
can put HTX in _all_ of their boxes even if most won't need it,
because it doesn't take anything away except a little board space. The
existing SuperMicro boards work like this, too.

Vincent wrote:

> Only quadrics is clear about its switch latency (probably
> competitors have a worse one). It's 50 us for 1 card.

We have clearly stated that the Mellanox switch is around 200 nsec per
hop.  Myricom's number is also well known.

Mark Hahn wrote:

> I intuit (totally without rigor!) that fatter nodes do increase bandwidth
> needs, but don't necessarily change the latency picture.

Fatter nodes mean more cpus are simultaneously trying to send out
messages, so yes, there is an effect, but it's not quite latency: it's
that message rate thing that I keep on talking about.

http://www.pathscale.com/performance/InfiniPath/mpi_multibw/mpi_multibw.html
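
To make the latency vs. message rate distinction concrete, here is a
rough sketch of the arithmetic. It assumes a simple model in which all
cores on a node share one adapter's message rate evenly; the adapter
rate used is a made-up figure for illustration, not a published number
for any interconnect, and the function names are hypothetical.

```python
# Illustrative arithmetic only: the NIC rate below is a made-up figure,
# not a measurement of any real interconnect.

def per_core_message_rate(nic_msgs_per_sec, cores):
    """Messages/sec each core can send if all cores share one NIC evenly."""
    if cores < 1:
        raise ValueError("cores must be at least 1")
    return nic_msgs_per_sec / cores


def min_gap_usec(msgs_per_sec):
    """Average time between back-to-back messages, in microseconds."""
    return 1e6 / msgs_per_sec


if __name__ == "__main__":
    nic_rate = 8_000_000  # hypothetical adapter: 8 million messages/sec
    for cores in (1, 4, 16):
        rate = per_core_message_rate(nic_rate, cores)
        print(f"{cores:2d} cores: {rate:,.0f} msgs/sec per core, "
              f"one message every {min_gap_usec(rate):.3f} usec")
```

In this model the one-way latency of a single message never changes,
but the spacing between messages that each core can sustain grows
linearly with the core count, which is why a single-pair ping-pong
test will not show the problem.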

Poor scaling as nodes get faster is the dirty little secret of our
community; our standard microbenchmarks don't explore this, but
today's typical nodes have 4 or more cores.

-- greg