yet another stupid network topology
Robert G. Brown
rgb at phy.duke.edu
Mon Oct 30 08:25:47 PST 2000
On Mon, 30 Oct 2000, Eugene Leitl wrote:
> For all this to make sense, we need
>
> 1) 16-port switches are much cheaper/port than 32-port switches and
> larger
I don't know about the "much cheaper" part. HP ProCurves can be had for
just about $40/port up to 80 ports. I don't know if they have enough
bisection bandwidth to support 64 nodes x 100 Mbps (32 simultaneous
bidirectional pairs), but they'd almost certainly outperform any
hypercube even if they didn't, since with a hypercube you have to worry
about latency hits, bottlenecking, and feeding multiple NICs per node.
This may be as much as twice as expensive as tiny 100Base switches
(which seem to be down to $20/port or thereabouts) but you don't have to
buy 3 NICs/node, and you don't have to mess with complex routing. OTOH,
you don't have the aggregate bandwidth you'd have in the hypercube, IF
you can organize the calculation to feed the NICs.
> 2) above cheap 16-port switches can provide full 200 MBps*16 backbone
> bandwidth
More likely the 100 Mbps*16 bisection bandwidth -- remember, it is 8
PAIRS of 100 Mbps connections going both ways;-) -- but probably yes, or
very nearly. But their store-and-forward switching latency will suck.
I presume that you're interested in transferring a few large blocks of
data between nodes, not lots of small ones, so this probably doesn't
matter much.
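To be explicit about how I'm counting, here's the arithmetic as a quick
Python sketch -- the figures are just the ones quoted in this thread,
nothing measured:

    # Back-of-envelope bandwidth accounting for a 16-port 100 Mbps switch,
    # using only the figures quoted in this thread, not measurements.
    ports = 16
    link_mbps = 100                 # Fast Ethernet, per direction

    pairs = ports // 2              # 16 ports = 8 simultaneous pairs
    one_way = pairs * link_mbps     # 800 Mbps crossing the switch one way
    both_ways = 2 * one_way         # 1600 Mbps counting both directions

    print(f"{pairs} pairs, {one_way} Mbps one way, {both_ways} Mbps duplex")
    # -> 8 pairs, 800 Mbps one way, 1600 Mbps duplex: "100 Mbps * 16" only
    #    if you count every port's transmit separately.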
> 3) we can feed 3 (or even 4) FastEthernet NICs in a node
This is a toughie. I think Don Becker once told me that with channel
bonding they still get significant gain with up to 3 100 Base NICs (as
in maybe 240-260 Mbps out of 300? Don't remember exactly), but that a
fourth is a wash. This is with a custom kernel module, though. I'd
guess that you can feed 2 and MAYBE still get gain with 3, but your
system won't be doing much else in the meantime. It probably depends on
the system, as well -- I think that it is a question of when you become
CPU bound, so newer CPUs or dual CPUs may be able to go to 4 NICs.
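If you do want to experiment with bonding, the bring-up itself is only a
few commands. Here's a sketch that just prints the classic sequence --
the interface names and address below are made up, and you need a
bonding-capable kernel plus a matching ifenslave for your kernel
version:

    # Emit the usual Linux channel-bonding bring-up as root shell commands.
    # bond0 is the virtual bonded interface; eth0-eth2 and the address are
    # assumptions for illustration, not anything from this thread.
    slaves = ["eth0", "eth1", "eth2"]   # a third NIC: diminishing returns
    print("modprobe bonding")           # load the bonding driver
    print("ifconfig bond0 192.168.1.10 netmask 255.255.255.0 up")
    for nic in slaves:
        print(f"ifenslave bond0 {nic}") # attach each physical NIC to bond0

Run the output on each node (with per-node addresses, of course) and
bond0 looks like one fat pipe to the rest of your code.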
More generally:
Is there any way you can just buy a "throwaway" 16 port candidate and
prototype? Take it home for your home beowulf if it doesn't work out?
The prototyping process may quickly resolve the question. It did for
me.
I tried to build a small (4-5 node) hypercube at home once upon a time
(before I could afford a switch) and quickly decided that switched ports
were cheap even at $220 for 8 ports (~$30/port). Just the
extra NICs alone ended up costing almost as much, and it took SO long to
build even small routing tables, which was not particularly rewarding
work. Finally, performance and stability were not what I imagined they
should be when I was done -- likely due partly to the 2.2.(small) kernel
I was using at the time (around the time of RH 6.0), but also just to
the fact that there was more to go wrong and that wrong things (like
downed systems) had a longer "range".
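To be fair, the routing itself isn't conceptually hard -- label the 2^d
nodes with d-bit numbers so that neighbors differ in exactly one bit,
and always forward toward the neighbor that fixes the lowest differing
bit -- it's just tedious to do by hand. A sketch of what the table
generation looks like (the eth<dimension> naming is my own hypothetical
convention, not anything standard):

    # Minimal e-cube routing sketch for a d-dimensional hypercube.
    # Nodes are labeled 0..2^d-1; neighbors differ in exactly one bit.
    D = 3  # 2^3 = 8 nodes, 3 NICs per node

    def next_hop(src, dst):
        """Forward toward the neighbor fixing the lowest differing bit."""
        diff = src ^ dst
        lowest = diff & -diff       # lowest set bit = dimension to cross
        return src ^ lowest

    for src in range(2 ** D):
        print(f"routes for node {src:0{D}b}:")
        for dst in range(2 ** D):
            if dst == src:
                continue
            nh = next_hop(src, dst)
            dim = (src ^ nh).bit_length() - 1   # which NIC/dimension
            print(f"  to {dst:0{D}b}: out eth{dim} via {nh:0{D}b}")

Even at d=3 that's 8 nodes times 7 entries apiece to get right, and
every added dimension doubles it -- which is exactly why it stopped
being fun.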
In a professional environment, where both the switch AND MY (or your)
TIME MESSING WITH THE NETWORK is generally paid for with other people's
money, I >>think<< you'll find that it still just isn't worth the effort.
You're talking a marginal cost of perhaps $20-50 per port (presuming
that you STILL put multiple NICs/node in to gain the higher bandwidth
with e.g. channel bonding -- if you don't, a switch is actually cheaper,
as the two extra NICs almost certainly cost more than the marginal cost
of a switched port on a large switch). Even the upper end of $50x64 for
a fairly high-end switch is only $3200. $50/node is (usually) a fairly
small fraction of the total node costs (say, strictly less than 10%, and
as little as 2% if you buy high-end nodes), and it is REASONABLE to
invest a fair fraction of the node costs in a decent network if the
problem is IPC data intensive. Add in the week or so of your extra time
that a hypercube might take to design and test and build at $100/hour --
call it another $4000 (you make $100/hour, don't you? No? Me neither,
but that's what I assume my time is worth for doing unpleasant work;-).
Note that
generally speaking the extra $3200 is somebody else's money that you are
working so hard to save and...
...on the basis of this (personal experience) I've pretty much concluded
that hypercubes are more costly than they first appear and more trouble
than they are generally worth with switch prices as low as they are.
Even when it is MY money that I'm "saving" and not somebody else's.
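If you want to plug in your own numbers, the whole comparison boils down
to a few lines -- the NIC price below is my guess, everything else comes
from the figures above:

    # Back-of-envelope switch-vs-hypercube cost comparison using the
    # figures from this post; the $30/NIC price is an assumption.
    nodes = 64
    switch_port = 50        # $/port, the high-end figure quoted above
    extra_nics = 2          # NICs per node beyond the first
    nic_cost = 30           # $/NIC for 100BaseT (assumed)
    hours, rate = 40, 100   # "a week or so" of your time at $100/hour

    switch_total = nodes * switch_port                       # $3200
    hypercube_total = nodes * extra_nics * nic_cost + hours * rate
    print(f"switch:    ${switch_total}")
    print(f"hypercube: ${hypercube_total}")  # $3840 in NICs + $4000 in time

With those assumptions the "cheap" hypercube comes out at more than
twice the price of the switch before you've pulled a single cable.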
The place they MIGHT still be an interesting idea would be for very
(relatively) cheap, few-node gigabit beowulfs -- a tiny four-node
tetrahedral beowulf, for example. Even here, by the time you buy three
gigabit NICs per node you've oversubscribed the PCI bus (32-bit/33 MHz
PCI tops out around 1 Gbps, not even one gigabit NIC's worth at full
duplex, let alone three), you're overloading the processor, and gigabit
switched ports are probably comparable in cost to the two extra NICs you
have to buy.
Naaah, I just don't like them. Cool idea on paper, and once (back when
NICs were cheap but switches were still dear) I thought that they'd make
a nifty way to build a supercheap beowulf, but I learned otherwise the
hard way.
All these meanderings of disconnected thought are strictly my own, of
course...and your own mileage might be different. Which is why I
suggested a nominal investment of a few hundred dollars in a switch to prototype
it. At least it might give you a quantitative basis for your
decision...
rgb
--
Robert G. Brown http://www.phy.duke.edu/~rgb/
Duke University Dept. of Physics, Box 90305
Durham, N.C. 27708-0305
Phone: 1-919-660-2567 Fax: 919-660-2525 email: rgb at phy.duke.edu