[Beowulf] New beowulf recommendations

Mark Hahn hahn at mcmaster.ca
Sun Sep 9 11:11:57 PDT 2007

> 1) There is an onboard Gigabit NIC which pushes the computational load
>   onto the CPU.

I doubt it.  it's fairly easy for nics to perform stateless offload,
and afaik even cheap ones do.  the result is that any nic will
give nearly the same CPU overhead.  I expect the only "onboard"-ness
here is that the nic is part of the chipset.  this matters very little,
since a gigabit nic is not going to push the limits of any current bus.

>  Our vendor states that a server card in the
>   PCIExpress slot would have better latency.  True?  Significant?

peculiar claim, since pcie actually _adds_ a small latency cost;
are they basing this on offload-type arguments?

> 2) We're considering either a Layer 2 or Layer 3 Netgear 48 port
>   switch.  The backplane bandwidths are 96Gb and 196Gb respectively,
>   and the latencies are 20us and 2us.  I don't understand how the
>   additional bandwidth can be used,

I'm guessing the l3 switch is GSM7352s, and the l2 is GSM7248.
while 20 vs 2 us is a big difference, your observed Gb latency
is still going to be ~50 us, so it's not a huge deal.

if I'm right on the switch models, I think the difference is more
generational and features.  the 7248 seems like an older-gen switch,
and lacks not just the L3 stuff but also the 10G options.

it's the 10G options that let them claim 196 Gbps for the GSM7352s,
since besides the 48 normal ports, it's got 8x SFPs and bays for
4x 10G stacking ports (which btw only adds up to 192 Gbps for me...)
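
a quick sanity check of that sum, as a sketch only: it assumes the vendor counts every port at line rate in both directions (full duplex) and that the 8 SFPs count as separate 1G ports. the function and variable names are made up for illustration. under the same assumption, a plain 48-port gigabit switch comes out at the 96 Gb quoted for the L2 model.

```python
# back-of-envelope check of the quoted backplane figures.
# ASSUMPTION: each port is counted at line rate in both directions.

def fabric_gbps(port_groups):
    """Sum (count, gbps_per_port) pairs and double for full duplex."""
    one_way = sum(count * gbps for count, gbps in port_groups)
    return 2 * one_way

l2_ports = [(48, 1)]                    # 48 gigabit ports only
l3_ports = [(48, 1), (8, 1), (4, 10)]   # plus 8 SFPs and 4x 10G ports

print(fabric_gbps(l2_ports))   # 96
print(fabric_gbps(l3_ports))   # 192
```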

I'd consider the GSM7352s mainly if I wanted to use the 10G ports
(you might verify that the ports can be used for 10G in general,
rather than only for stacking...)

> but the latency gain seems reason
>   enough for the Layer 3.  Is it worth an extra $3k?

I would guess not unless you want the additional features (routing and 

>  We are network
>   latency bound on our existing 16 node cluster, but I do not know
>   how much latency is due to the switch, nor how to find out.

well, the simplest test is to connect two nodes back-to-back and 
run a latency test.  compare versus plugged into the switch.
(gigabit ports are all auto-mdi, so you don't need a special crossover
cable for this test.)
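
a minimal sketch of such a latency test, assuming small UDP packets bounced between the two nodes. the function names, packet count and 2-second timeout are arbitrary choices for illustration, not anything standard. python's own overhead inflates the absolute numbers, so the useful figure is the difference between the back-to-back run and the through-the-switch run.

```python
import socket
import statistics
import threading
import time


def echo_server(sock, count):
    """Echo `count` datagrams back to whoever sent them."""
    for _ in range(count):
        data, peer = sock.recvfrom(2048)
        sock.sendto(data, peer)


def measure_rtt(sock, server_addr, count=1000, payload=b"x"):
    """Send small packets one at a time; return round-trip times in us."""
    sock.settimeout(2.0)   # give up rather than hang if a packet is lost
    rtts = []
    for _ in range(count):
        start = time.perf_counter()
        sock.sendto(payload, server_addr)
        sock.recvfrom(2048)
        rtts.append((time.perf_counter() - start) * 1e6)
    return rtts


def summarize(rtts):
    """Minimum and median are less noise-sensitive than the mean."""
    return {"min_us": min(rtts), "median_us": statistics.median(rtts)}


def loopback_demo(count=200):
    """Run both ends on 127.0.0.1, only to show the calling pattern."""
    server = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    server.bind(("127.0.0.1", 0))   # port 0: let the OS pick a free port
    thread = threading.Thread(
        target=echo_server, args=(server, count), daemon=True)
    thread.start()
    client = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        rtts = measure_rtt(client, server.getsockname(), count)
    finally:
        client.close()
    thread.join(timeout=5)
    server.close()
    return rtts
```

on real nodes you would run echo_server on one box (bound to its own address) and measure_rtt from the other, once with the cable direct and once through the switch. one-way latency is roughly half the round trip.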

I would guess that your current switch is about the same latency 
as the GSM7248, but that you'll measure something like 50 us 
back-to-back.  so dropping 18 us will not make a dramatic difference:
ie, 70 us vs 52 us - that's about 25%, but it's still nowhere near a "real"
interconnect (myrinet, infiniband, 10G, quadrics).
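
the arithmetic behind that comparison, spelled out. the 50 us back-to-back figure is only my guess from above, and simply adding the switch latency on top of it is a simplifying assumption.

```python
# all values in microseconds
back_to_back_us = 50.0   # guessed node-to-node latency with no switch
l2_switch_us = 20.0      # quoted latency of the Layer 2 switch
l3_switch_us = 2.0       # quoted latency of the Layer 3 switch

with_l2 = back_to_back_us + l2_switch_us   # 70.0
with_l3 = back_to_back_us + l3_switch_us   # 52.0
saving = (with_l2 - with_l3) / with_l2     # 18/70, about 0.257

print(f"{with_l2:.0f} us vs {with_l3:.0f} us, {saving:.1%} lower")
```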

you should also verify that your nics aren't currently doing some sort
of interrupt mitigation/coalescing, since that will hurt your latency.
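
one way to check this on linux, as a sketch: it assumes ethtool is installed, that the driver reports its settings through `ethtool -c`, and that the output uses "name: value" lines. the interface name "eth0" and the helper names are placeholders. treating any non-zero rx-usecs/tx-usecs as "coalescing is on" is a crude heuristic, since drivers differ in what they expose.

```python
import subprocess


def parse_coalesce(text):
    """Collect the numeric 'name: value' lines of `ethtool -c` output."""
    settings = {}
    for line in text.splitlines():
        name, sep, value = line.partition(":")
        value = value.strip()
        if sep and value.isdigit():   # skips headers, on/off and n/a lines
            settings[name.strip()] = int(value)
    return settings


def coalescing_active(settings):
    """Crude heuristic: a non-zero interrupt delay means coalescing is on."""
    return any(settings.get(key, 0) > 0 for key in ("rx-usecs", "tx-usecs"))


def read_coalesce(ifname="eth0"):
    """Run `ethtool -c <ifname>` and parse what it prints."""
    result = subprocess.run(["ethtool", "-c", ifname],
                            capture_output=True, text=True, check=True)
    return parse_coalesce(result.stdout)
```

usage would be something like coalescing_active(read_coalesce("eth0")) on each node.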

if you are truly small-packet latency-bound, and unwilling to consider
a higher-performance interconnect, I think you should contemplate putting
more cores in each box.  going from 2 cores per box to 8 or 16 will make a
big difference for smallish jobs that use a small number of nodes (even 
if you stick to plain old gigabit).
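
a rough illustration of why more cores per box helps, under a deliberately simple model: ranks are packed onto as few boxes as possible and every pair of ranks talks equally often. real jobs differ, and the function name is made up for this sketch.

```python
from math import comb


def intra_box_fraction(ranks, cores_per_box):
    """Fraction of rank pairs that share a box and so never touch the network."""
    full_boxes, leftover = divmod(ranks, cores_per_box)
    local_pairs = full_boxes * comb(cores_per_box, 2) + comb(leftover, 2)
    return local_pairs / comb(ranks, 2)


# a 16-rank job on boxes with 2, 8 and 16 cores
for cores in (2, 8, 16):
    print(cores, round(intra_box_fraction(16, cores), 3))
```

for the 16-rank job this gives about 7% of pairs staying in-box with 2 cores, about 47% with 8 cores, and all of them with 16.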
