Channel Bonding

Timothy I Mattox tmattox at engr.uky.edu
Wed Mar 28 15:20:06 PST 2001


Hello,
Please excuse beating this dead horse...

No, a "normal" switch should NOT expect to see the same MAC address
coming from more than one port at any one time.  The standards body that
formalized the ethernet standard set up a mechanism for the entire
industry to be able to guarantee that every single Ethernet NIC sold in
the world has a unique MAC address!  It is a fundamental part of how
ethernet works.

The way Channel Bonding is implemented violates this unique MAC address
per device standard, by deliberately making several NICs have the same
MAC address.  It is a very cool way to have implemented the concept so
that higher layers of the network stack could be oblivious to the fact
that the lower level packets were going out (and coming in) on different
NICs.  However, you usually can't just wire it all up and have it "just
work", but I'll get back to that in a moment...

Each time a packet with a particular source MAC address goes through
an unmanaged switch, the port it arrived on is recorded in the switch's
routing table, replacing any previous entry for that MAC address.
If you connect several NICs with the same MAC address to the switch,
at any one time it would have only one port listed for a particular
MAC address, whichever NIC was the last to send out a packet.
Granted, not all switches have to work this way, but unmanaged
switches somehow have to learn where each MAC address is located...
So, for all the posts from people saying "I tried channel bonding,
and it was much SLOWER than when I used just one NIC"... this is what
is likely going on... your switch at any one time will be sending
all the packets for a particular MAC address down ONE port, and thus
blocking, and overflowing, and just making a real mess.
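
To make the learning behaviour concrete, here is a toy Python sketch.
It is only a model of the "last port seen wins" rule described above,
not how any particular switch is implemented; the class name, the MAC
address, and the port numbers are all made up for illustration.

```python
class LearningSwitch:
    """Toy model of an unmanaged switch's MAC learning table."""

    def __init__(self):
        self.table = {}  # MAC address -> port it was last seen on

    def receive(self, src_mac, in_port):
        # Learn (or overwrite) the port for this source MAC.
        self.table[src_mac] = in_port

    def out_port(self, dst_mac):
        # Learned port, or None if the switch would have to flood.
        return self.table.get(dst_mac)


BONDED_MAC = "00:11:22:33:44:55"  # hypothetical MAC shared by both NICs

switch = LearningSwitch()
seen = []
# The bonded host transmits alternately from NICs on ports 1 and 2.
for in_port in (1, 2, 1, 2):
    switch.receive(BONDED_MAC, in_port)
    seen.append(switch.out_port(BONDED_MAC))

print(seen)               # [1, 2, 1, 2]: the single entry keeps flipping
print(len(switch.table))  # 1: never more than one port for the shared MAC
```

At any instant all traffic for the bonded host is aimed at one port,
which is the slowdown people have been reporting.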

Originally the way to make channel bonding work was to make N copies
of your network, where N is the number of NICs you intend to bond
together.  And your N networks had to be isolated from each other,
so that their spanning-tree algorithm, etc. would not get confused
by seeing the same MAC coming from multiple places.
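
A rough Python sketch of why the N isolated networks avoid the problem
follows.  It assumes packets are striped round-robin across the NICs,
with NIC i plugged into switch i; the function name, the MAC address,
and the port number are hypothetical.

```python
from itertools import cycle


def stripe(packets, n_nics):
    """Assign packets to NICs in round-robin order (NIC i is on network i)."""
    nic = cycle(range(n_nics))
    return [(next(nic), p) for p in packets]


N = 2                        # number of bonded NICs = number of networks
MAC = "00:11:22:33:44:55"    # hypothetical MAC shared by the bonded NICs
HOST_PORT = 3                # hypothetical: NIC i uses port 3 of switch i

# One learning table per isolated switch: MAC -> set of ports seen.
tables = [dict() for _ in range(N)]

assignment = stripe(range(6), N)
for net, _pkt in assignment:
    tables[net].setdefault(MAC, set()).add(HOST_PORT)

# Each switch has only ever seen the MAC on a single port.
per_network = [len(t[MAC]) for t in tables]
print(assignment)
print(per_network)
```

Because the switches are not tied together, no single learning table
ever sees the shared MAC arrive on two different ports.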

More modern/advanced/expensive switches have added the ability to
properly handle having the same MAC address appear to be connected
to several ports.  This has gone under a variety of names, but I think
the most commonly used term is "trunking".  As far as I was aware,
not all implementations of trunking can be used to connect bonded
NICs.  Please correct me if I am wrong on this aspect of trunking.
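
As a sketch of the idea only (not any vendor's implementation), a
trunking switch can be modelled as learning the MAC against a group of
ports instead of a single port.  The class, the group name "trunk0",
and the round-robin choice of member port are assumptions made for
illustration; real switches may well pick the member port differently,
for example by hashing addresses.

```python
class TrunkingSwitch:
    """Toy switch that learns MACs per trunk group instead of per port."""

    def __init__(self, trunks):
        # trunks: {group name (str): [member port numbers (int)]}
        self.members = trunks
        self.port_to_group = {
            port: group for group, ports in trunks.items() for port in ports
        }
        self.table = {}                          # MAC -> group name or port
        self._next = {group: 0 for group in trunks}

    def receive(self, src_mac, in_port):
        # Ports in a trunk are recorded as the group, so the entry is stable.
        self.table[src_mac] = self.port_to_group.get(in_port, in_port)

    def out_port(self, dst_mac):
        dest = self.table.get(dst_mac)
        if dest in self.members:
            # Spread traffic over the member ports (round-robin here).
            ports = self.members[dest]
            i = self._next[dest]
            self._next[dest] = (i + 1) % len(ports)
            return ports[i]
        return dest  # a plain port, or None if the MAC is unknown


MAC = "00:11:22:33:44:55"  # hypothetical MAC shared by the bonded NICs
sw = TrunkingSwitch({"trunk0": [1, 2]})

entries = []
for in_port in (1, 2, 1, 2):
    sw.receive(MAC, in_port)
    entries.append(sw.table[MAC])

outs = [sw.out_port(MAC) for _ in range(4)]
print(entries)  # the table entry no longer flips between ports
print(outs)     # replies are spread over both member ports
```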

My comment about VLANs not helping was this: it would seem that you
could split a switch into two (or three) parts, each as a Virtual LAN,
and then connect up your bonded NICs, one to each VLAN segment.
However, that appears not to always work either.  My guess is that the
internal routing tables in some switches with VLAN support will still do
lookups based on MAC addresses, and keep only one entry per MAC address.
I am only guessing on this since we haven't played with VLAN stuff yet
in our lab.
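
To illustrate that guess (and it is only a guess about switch
internals), the Python sketch below contrasts a learning table keyed by
MAC alone with one keyed by (VLAN, MAC).  The VLAN IDs, port numbers,
and MAC address are made up.

```python
MAC = "00:11:22:33:44:55"  # hypothetical MAC shared by the bonded NICs

# Frames from the bonded host: (VLAN ID, ingress port), one NIC per VLAN.
frames = [(10, 1), (20, 9), (10, 1), (20, 9)]

shared = {}     # one table for the whole switch, keyed by MAC only
per_vlan = {}   # separate learning per VLAN, keyed by (VLAN, MAC)
shared_history = []

for vlan, port in frames:
    shared[MAC] = port               # overwritten by whichever NIC sent last
    per_vlan[(vlan, MAC)] = port     # each VLAN keeps its own stable entry
    shared_history.append(shared[MAC])

print(shared_history)  # entry flips between the two ports
print(per_vlan)        # one stable entry per VLAN
```

A switch behaving like the first table would show the same slowdown as
a plain unmanaged switch, even with the ports split into VLANs; one
behaving like the second would not.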

Here is some OLD documentation for channel bonding:
http://www.beowulf-underground.org/doc_project/BIAA-HOWTO/Beowulf-Installation-and-Administration-HOWTO-12.html
The most recent bonding documentation I can find is in the kernel source:
/usr/src/linux/Documentation/networking/bonding.txt
However, that document ASSUMES you will be using switches that
support trunking... ignoring that channel bonding worked
before such switches existed (1994?).

Anyway, my point was that it takes a special switch to be able to
do channel bonding WITHIN one switch.  So the original question (do I
need one 16-port switch or two 8-port switches for an 8-node cluster?)
still has the same answer:  get two cheap unmanaged 8-port switches
and do NOT tie them together.  You can get 10/100 8-port switches
for less than $80... see http://www.buy.com for a few choices.

I would love to know of alternatives that would not use the
duplicated MAC address implementation of channel bonding.
Our FNN work with KLAT2 has made me look around for such alternatives,
and the closest I came across was the work to combine more than one PPP
connection (dual channel ISDN, or multiple regular modems).
It looks like we are going to have to roll our own solution when
we have time to do it...


On Wed, 28 Mar 2001, Jakob Østergaard wrote:
> On Wed, Mar 28, 2001 at 01:57:30PM -0500, Timothy I Mattox wrote:
> > Hello,
> > Unless the 16 port switch can be configured to handle it (and most can
> > not as far as I know), you would need two 8 port switches that are NOT
> > connected together.  Some high end switches have a form of trunking
> > (I'm not sure which flavor of trunking will work) that can properly
> > handle having more than one connection appear to have the same MAC
> > address.  Also, from comments here on the list, it seems that not all
> > VLAN support is created equal, so splitting a 16 port switch into two
> > VLANs won't necessarily work either.
> >
> > The fundamental problem is that channel bonding makes several NICs in
> > the same box have identical MAC addresses, and that breaks the most
> > commonly used method(s) for routing ethernet packets inside of switches,
> > since MAC addresses are supposed to be unique.
>
> At work we use an intel switch that allows "trunking" of several ports.
> however, a 24 port switch has three "groups" of eight ports each (or was it four
> groups of six ? I forgot), and you can only do trunking between ports in the
> same group, and you can only trunk once in each group. Thus, we can only create
> three (or four) trunk "sets" for each switch.
>
> This is very switch-specific - you should check the capabilities of your own
> switch.   I think this kind of capability is becoming more normal in lower
> end switches as well.
>
> However, once the switch is set up to trunk a few ports, enabling it in RedHat
> 7 with a 2.4 kernel is so easy it's almost cheating   :)  It works very well
> indeed.  The RedHat initscripts are prepared for this setup, so there's no
> special hackery needed at all.  I don't know about other distributions.
>
> It's correct that the kernel uses the same MAC on all NICs that are trunked,
> but this is what the switch expects, and it's the only sane way to do it as I
> see it.   And I don't know why VLANs got involved in this discussion at all :)

-- 
Tim Mattox - tmattox at ieee.org - http://home.earthlink.net/~timattox
   http://aggregate.org/KAOS/ - http://advogato.org/person/tmattox/




