[Beowulf] using two separate networks for different data streams
tmattox at gmail.com
Sat Jan 28 18:39:40 PST 2006
Did someone mention FNNs? :-)
I really shouldn't even be reading the beowulf list right now, since I
have a dissertation to urgently write about those FNN things...
But I'll take a few minutes to jump into the discussion.
The use of multiple network interfaces per node in a cluster can
let you do many different things. Here are five typical goals
that I've seen for using multiple network interfaces per node:
1) Higher single-stream bandwidth
   a) round-robin Linux channel bonding (Ethernet)
   b) multi-rail (Quadrics, etc.)
   c) Trunking or EtherChannel (Cisco, etc.)
   d) Striping (Open MPI)
   (the terminology apparently changes with the technology/company/etc.)
2) Higher availability/fault-tolerance/failover
   a) various Linux channel bonding modes
   b) an "admin" network for when the fancy network fails or is overloaded
3) Higher aggregated bandwidth on multiple streams
   a) various other Linux channel bonding modes
   b) multi-homed/multiple-IP addressed servers
4) Reduced NIC contention on SMP nodes
   This is a variant of #3 that dedicates a NIC to each CPU in an SMP node
5) Lower latency via wider "single-switch-distance" connectivity
   a) Flat Neighborhood Networks (FNNs)
   b) Sparse FNNs (SFNNs)
And of course various mixtures of the above goals.
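To make goal #5 a bit more concrete, here is a minimal Python sketch of
the "single-switch-distance" idea, assuming the usual FNN property that
every pair of nodes shares at least one switch. The node names, switch
names, and the helper is_flat_neighborhood are hypothetical, made up
for illustration; this is not the actual FNN design tooling.

```python
from itertools import combinations


def is_flat_neighborhood(node_switches):
    """Return True if every pair of nodes shares at least one switch.

    node_switches maps a node name to the set of switches its NICs
    are plugged into (one switch per NIC).
    """
    return all(
        node_switches[a] & node_switches[b]  # empty set is falsy
        for a, b in combinations(node_switches, 2)
    )


# Hypothetical layout: 3 nodes with 2 NICs each, and 3 switches that
# each connect only 2 of the nodes.
fnn_layout = {
    "n0": {"sw0", "sw1"},
    "n1": {"sw1", "sw2"},
    "n2": {"sw0", "sw2"},
}

# Hypothetical counter-example: n0 and n1 have no switch in common,
# so traffic between them would need more than one switch hop.
non_fnn_layout = {
    "n0": {"sw0"},
    "n1": {"sw1"},
    "n2": {"sw0", "sw1"},
}

if __name__ == "__main__":
    print(is_flat_neighborhood(fnn_layout))
    print(is_flat_neighborhood(non_fnn_layout))
```

Note that in the first layout no single switch reaches all three nodes,
yet any two nodes are still only one switch apart.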
One reason goal #1 isn't always the best choice is packet
reordering: striping a single TCP stream across several links can
deliver segments out of order, TCP may treat that as a sign of loss
and back off, and the stream can actually go slower if you are
unlucky. I've not benchmarked it, but I've seen many reports that
GigE tends to be mostly unlucky in this regard (with the technical
details beyond the scope of this particular post).
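A toy Python model of that effect, as a rough sketch only: packets are
striped round-robin over links with different (made-up) delays, and the
receiver counts how often three duplicate ACKs in a row would make a
standard TCP sender do a fast retransmit for a packet that was never
lost. The function name, the delays, and the send spacing are all
assumptions for illustration; real stacks are considerably more subtle.

```python
def count_spurious_fast_retransmits(num_packets, link_delays, gap=1):
    """Count fast retransmits caused purely by reordering.

    Packet i is sent at time i * gap on link i % len(link_delays) and
    arrives link_delays[link] time units later. No packet is lost.
    """
    arrivals = sorted(
        (i * gap + link_delays[i % len(link_delays)], i)
        for i in range(num_packets)
    )

    expected = 0      # next in-order sequence number
    buffered = set()  # out-of-order packets held by the receiver
    dup_acks = 0      # consecutive duplicate ACKs for `expected`
    spurious = 0

    for _, seq in arrivals:
        if seq == expected:
            expected += 1
            while expected in buffered:
                buffered.remove(expected)
                expected += 1
            dup_acks = 0
        else:
            buffered.add(seq)
            dup_acks += 1
            if dup_acks == 3:  # classic duplicate-ACK threshold
                spurious += 1
    return spurious


if __name__ == "__main__":
    # Two equally fast links: packets arrive in order.
    print(count_spurious_fast_retransmits(20, [0, 0]))
    # One link noticeably slower than the other (hypothetical delays).
    print(count_spurious_fast_retransmits(20, [8, 0]))
```

In this model, equal delays keep everything in order, while enough skew
between the links makes the receiver see a run of out-of-order packets
and trip the duplicate-ACK threshold.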
And thus, the shift to goals #2 & #3, with one obvious
approach being the split into an I/O network and an MPI network.
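One simple way to realize that split, assuming each network is its own
IP subnet, is to pick the peer's address on the matching subnet and let
normal routing choose the NIC. A minimal Python sketch follows; the
subnets, the traffic class names, and the helper address_for are all
hypothetical.

```python
import ipaddress

# Hypothetical subnets, one per physical network.
NETWORKS = {
    "mpi": ipaddress.ip_network("10.0.1.0/24"),
    "io": ipaddress.ip_network("10.0.2.0/24"),
}


def address_for(node_addresses, traffic):
    """Return the node's address on the network used for `traffic`.

    node_addresses is a list of IP address strings, one per NIC.
    """
    net = NETWORKS[traffic]
    for addr in node_addresses:
        if ipaddress.ip_address(addr) in net:
            return addr
    raise LookupError(f"no address on the {traffic} network")


if __name__ == "__main__":
    node7 = ["10.0.1.7", "10.0.2.7"]  # hypothetical dual-homed node
    print(address_for(node7, "mpi"))
    print(address_for(node7, "io"))
```

The file server traffic would then connect to the "io" address and the
MPI library would be pointed at the "mpi" address, so the two kinds of
streams stay on separate wires.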
Appropriate use of Google and a Linux/Network Guru can
help you achieve various sets of those goals.
P.S. - Goal #5 should be more accessible/achievable soon by
mere mortals... :-) this summer I hope. Until then, this particular
"guru" is too busy trying to graduate to be of any direct help
with FNNs, etc.
On 1/27/06, Douglas Eadline <deadline at clustermonkey.net> wrote:
> I wonder if a dual Ethernet node would be better served by something like
> a FNN (http://aggregate.org/FNN/) Tim Mattox can probably weigh in on
Tim Mattox - tmattox at gmail.com
I'm a bright... http://www.the-brights.net/