[Beowulf] Infiniband modular switches

Mark Hahn hahn at mcmaster.ca
Sun Jun 15 10:44:21 PDT 2008

> Static routing is the best approach if your pattern is known. In other

sure, but how often is the pattern actually known?  I mean in general:
aren't most clusters used for multiple, shifting purposes?

> There are some vendors that use only the 24 port switches to build very
> large scale clusters - 3000 nodes and above, without any
> oversubscription, and they find it more cost effective. Using single

so the switch fabric would be a 'leaf' layer with 12 up and 12 down,
and a top layer with 24 down, right?  so 3000 nodes means 250 leaves
and 125 tops, 9000 total ports so 4500 cables.
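
the port arithmetic above can be sketched in a few lines of python. this is
only a counting sketch under the same assumptions (24-port switches, leaves
split 12 up / 12 down, tops all down); the function and field names are made
up for illustration, and it does not check whether two layers can actually be
wired into a non-blocking topology at this scale. it also counts links
directly, one cable per link, which gives 3000 host + 3000 leaf-top, so the
4500 figure is worth double-checking.

```python
import math

def count_two_layer_ports(nodes: int, radix: int = 24) -> dict:
    """Port/switch counts for a leaf layer (half up, half down) plus a top layer.

    Hypothetical helper: it only balances port counts, nothing more.
    """
    down = radix // 2                       # leaf ports facing hosts
    up = radix - down                       # leaf ports facing the top layer
    leaves = math.ceil(nodes / down)        # enough leaves to attach every host
    tops = math.ceil(leaves * up / radix)   # enough top ports for every uplink
    switches = leaves + tops
    return {
        "leaves": leaves,
        "tops": tops,
        "switches": switches,
        "switch_ports": switches * radix,
        "host_links": nodes,                # one cable per host
        "leaf_top_links": leaves * up,      # one cable per leaf uplink
    }

fabric = count_two_layer_ports(3000)
print(fabric)
```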

> enclosures is easier, but the cables are not expensive and you can use
> the smaller components.

in federated networks, I think cables wind up being 15-20% of the network
price.  for instance, if we take the simplest possible approach, and equip
this 3000-node cluster with a non-blocking federated fabric (assuming
just sdr) from colfax's current price list:

 subtotal  unit   qty  item
   375000   125  3000  ib nic
   117000    39  3000  1m host ib cables
   148500    99  1500  8m leaf-top ib cables
   900000  2400   375  24pt unman switch
  1540500  total (cable 17%)

I'm still confused about IB pricing, since street prices for nics,
cables and switches are dramatically higher than colfax's.
(to the paranoid, colfax would appear to be a mellanox shell company...)

for completeness, here's the same bom with "normal" public prices:

 subtotal  unit   qty  item
  2100000   700  3000  ib nic
   330000   110  3000  1m host ib cables
   330000   220  1500  8m leaf-top ib cables
  1500000  4000   375  24pt unman switch
  4260000  total (cable 15%)
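
to make the two boms easy to re-run with other prices, here is a minimal
python sketch that recomputes the totals and the cable share. unit prices and
quantities are copied from the tables as given; the list names, the is_cable
flag and the summarize() helper are assumptions added for illustration.

```python
# rows are (unit price, quantity, item, is_cable), copied from the two tables
COLFAX = [
    (125, 3000, "ib nic", False),
    (39, 3000, "1m host ib cables", True),
    (99, 1500, "8m leaf-top ib cables", True),
    (2400, 375, "24pt unman switch", False),
]
PUBLIC = [
    (700, 3000, "ib nic", False),
    (110, 3000, "1m host ib cables", True),
    (220, 1500, "8m leaf-top ib cables", True),
    (4000, 375, "24pt unman switch", False),
]

def summarize(bom):
    """Return (total price, fraction of that total spent on cables)."""
    total = sum(unit * qty for unit, qty, _, _ in bom)
    cables = sum(unit * qty for unit, qty, _, is_cable in bom if is_cable)
    return total, cables / total

for name, bom in (("colfax", COLFAX), ("public", PUBLIC)):
    total, cable_share = summarize(bom)
    print(f"{name}: total {total}, cable {cable_share:.0%}")
```

with the numbers from the tables this gives totals of 1540500 and 4260000,
and cable shares of 17% and 15%.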

interestingly, if nodes were about 3700 apiece (about what you'd expect
for intel dual-socket quad-core 2G/core), the interconnect at these public
prices winds up being about 28% of the total (nodes plus network) cost.
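
the 28% is the interconnect as a fraction of total spend (nodes plus
network), using the public-price total. a short python sketch of that
calculation follows; the constant and function names are illustrative only.
the same formula with the colfax total comes out to roughly 12%.

```python
NODES = 3000
NODE_PRICE = 3700              # per-node estimate from the post
PUBLIC_TOTAL = 4_260_000       # interconnect total at "normal" public prices
COLFAX_TOTAL = 1_540_500       # interconnect total at colfax prices

def interconnect_share(interconnect: float, nodes: int, node_price: float) -> float:
    """Interconnect cost as a fraction of nodes-plus-interconnect cost."""
    return interconnect / (interconnect + nodes * node_price)

public_share = interconnect_share(PUBLIC_TOTAL, NODES, NODE_PRICE)
colfax_share = interconnect_share(COLFAX_TOTAL, NODES, NODE_PRICE)
print(f"public: {public_share:.1%}, colfax: {colfax_share:.1%}")
```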
