[Beowulf] How to configure a cluster network

Mark Hahn hahn at mcmaster.ca
Thu Jul 24 22:38:27 PDT 2008

> to generate a Universal FNN.  FNNs don't really shine until you have 3 or 4
> NICs/HCAs per compute node.

depends on costs.  the marginal cost of a second IB port on a NIC 
usually seems to be fairly small.  for instance, if you have 36 
dual-port nodes, 3x 24-port switches is pretty neat for 1 hop nonblocking:
wire each node to two of the three switches (36*2 = 72 = 3*24 ports)
and any pair of nodes shares a switch.
two switches in a 1-level fabric would get 2 hops and 3:1 blocking
(18 nodes and 6 inter-switch links per switch).
if arranged in a triangle, 3x24 would get 1 hop 2:1, which might be 
an interesting design point.
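
a quick sanity check of those numbers, as a rough sketch rather than a
design tool.  the function names and the round-robin wiring are made up
for illustration, and the port split behind the triangle's 2:1 (12 nodes
per switch, 6 links toward each peer) is one reading of the layout, not
something spelled out above:

```python
from itertools import combinations

PORTS = 24   # ports per switch
NODES = 36   # compute nodes

def fnn_wiring(nodes=NODES, switches=3, nics=2):
    """Attach each node to one pair of switches, round-robin over the pairs."""
    pairs = list(combinations(range(switches), nics))
    return {n: set(pairs[n % len(pairs)]) for n in range(nodes)}

def fnn_check(wiring, switches=3):
    """Return (busiest switch's port count, True if every node pair shares a switch)."""
    load = [sum(s in sw for sw in wiring.values()) for s in range(switches)]
    shared = all(wiring[a] & wiring[b] for a, b in combinations(wiring, 2))
    return max(load), shared

def blocking_ratio(nodes_per_switch, links_to_each_peer):
    """Node ports per inter-switch link toward a single peer switch."""
    return nodes_per_switch / links_to_each_peer

wiring = fnn_wiring()
max_load, single_hop = fnn_check(wiring)

# two switches: 18 nodes each, the remaining 24 - 18 = 6 ports link the pair
two_switch = blocking_ratio(18, PORTS - 18)
# triangle (assumed split): 12 nodes each, other 12 ports go 6 + 6 to the peers
triangle = blocking_ratio(12, (PORTS - 12) // 2)

print(max_load, single_hop, two_switch, triangle)
```

with 3 switches and 2 ports per node, any two nodes must share a switch
simply because two 2-element subsets of a 3-element set always overlap.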

> Though, as others have mentioned, IB switch latency is pretty darn small,
> so latency would not be the primary reason to use FNNs with IB.

yeah, that's a good point - FNN is mainly about using "zero-order"
switching, where the node itself selects which link to use, and shows the
biggest advantage when multi-level fabrics are slow or hard to build.
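
to make "the node selects which link to use" concrete, here is a toy
sketch.  the 3-node wiring, the table layout and the lowest-switch
tie-break are all hypothetical, and this is not how any particular FNN
driver actually does it:

```python
def build_link_table(wiring):
    """Map (src, dst) to a switch both nodes attach to, or None if there is none."""
    table = {}
    for src, src_sw in wiring.items():
        for dst, dst_sw in wiring.items():
            if src != dst:
                common = src_sw & dst_sw
                # arbitrary tie-break: take the lowest-numbered shared switch
                table[(src, dst)] = min(common) if common else None
    return table

# hypothetical wiring: node id -> set of switch ids its NICs plug into
toy_wiring = {0: {0, 1}, 1: {0, 2}, 2: {1, 2}}
link_table = build_link_table(toy_wiring)

print(link_table[(0, 1)], link_table[(0, 2)], link_table[(1, 2)])
```

the point is only that the lookup happens on the sending node, before
any switch sees the packet.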

> I wonder if anyone has built a FNN using IB... or for that matter, any
> link technology other than Ethernet?

I'm a little unclear on how routing works on IB - does a node have 
something like an ethernet neighbor table that tracks which other nodes are
accessible through which port?

I think the real problem is that small IB switches have never really 
gotten cheap, even now, the way ethernet switches have.  nor have IB
cables, for that matter.

regards, mark hahn.

