Channel-bonding/VLANs
Mike Weller
weller at zyvex.com
Tue Jul 10 13:31:30 PDT 2001
Hello,

I'd like to update you all on my progress so far. Apparently, even
though the HP4000 Procurve switches support VLANs, having duplicate
MAC addresses on different VLANs must have been confusing the switch.
Theoretically, non-overlapping VLANs should behave as independent
switches, but in practice they don't.

I moved one VLAN over to a separate switch to avoid the confusion. For
now I'm testing with just 2 nodes (master + slave), so I'm temporarily
using a cheap $60 Linksys 5-port switch as my second VLAN. That got me
past my first hurdle: the TCP/IP connection now survives when I
channel-bond the master and slave simultaneously. However, my
bandwidth is cut in half for some strange reason. I tried several
different switches as well.
I booted both systems up on 1 NIC, and here are my netperf and FTP
results:
NETPERF:
Recv   Send   Send
Socket Socket Message Elapsed
Size   Size   Size    Time    Throughput
bytes  bytes  bytes   secs.   10^6bits/sec
65535  65535  65535   10.00   85.98
FTP:
37748736 bytes received in 3.5 seconds (1e+04 Kbytes/s)
37748736 bytes received in 3.6 seconds (1e+04 Kbytes/s)
37748736 bytes received in 3.8 seconds (9.8e+03 Kbytes/s)
That's close to the theoretical 100Mbps that the single NICs can do.
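
As a rough cross-check, the FTP figures can be converted into the same units that netperf reports. The sketch below is only illustrative: the helper name `mbit_per_sec` is made up here, and it relies on the rounded elapsed times printed by the FTP client, so the results are approximate.

```python
def mbit_per_sec(num_bytes, seconds):
    """Convert a transfer of num_bytes in `seconds` to 10^6 bits/sec."""
    return num_bytes * 8 / seconds / 1e6

# (bytes, seconds) pairs copied from the single-NIC FTP runs above
ftp_runs = [(37748736, 3.5), (37748736, 3.6), (37748736, 3.8)]

for size, secs in ftp_runs:
    print(f"{size} bytes in {secs} s = {mbit_per_sec(size, secs):.1f} Mbit/s")
```

This works out to roughly 79 to 86 Mbit/s, which is in line with the 85.98 reported by netperf.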
I verified that this was just using 1 interface with my "measure" script:
Node 0 eth0 rx 11818 tx 27604
Node 0 eth1 rx 0 tx 0
Node MASTER eth0 rx 14 tx 1
Node MASTER eth1 rx 27604 tx 11818
Node MASTER eth2 rx 0 tx 0
After bonding the 2 simultaneously, I redid my transfer tests, and
I now get approximately 50% of what I was getting before:
FTP:
37748736 bytes received in 9.2 seconds (4e+03 Kbytes/s)
NETPERF:
Recv   Send   Send
Socket Socket Message Elapsed
Size   Size   Size    Time    Throughput
bytes  bytes  bytes   secs.   10^6bits/sec
131070 131070 65535   10.10   47.43
131070 131070 65535   10.05   46.50
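
To put a number on the drop, the bonded results can be divided by the single-NIC results. This is a minimal sketch using only the figures quoted in this message; the variable names are made up for illustration.

```python
single_nic_mbit = 85.98        # netperf throughput with one NIC
bonded_mbit = [47.43, 46.50]   # netperf throughput with both NICs bonded

for value in bonded_mbit:
    print(f"{value} / {single_nic_mbit} = {value / single_nic_mbit:.0%}")

# FTP moved the same 37748736-byte file: 3.5 s before bonding, 9.2 s after,
# so the throughput ratio is the inverse ratio of the elapsed times.
ftp_ratio = 3.5 / 9.2
print(f"FTP bonded/single = {ftp_ratio:.0%}")
```

By this measure netperf retains about 54-55% of the single-NIC throughput, and FTP closer to 38%.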
[root@beowulf ~]# ./measure
Node 0 eth0 rx 8747 tx 13400
Node 0 eth1 rx 8747 tx 13398
Node MASTER eth0 rx 17 tx 2
Node MASTER eth1 rx 13221 tx 8751
Node MASTER eth2 rx 13221 tx 8751

As you can see, the TX and RX packet counts are now evenly split across
the bonded interfaces (unlike my previous problem). My only guess is a
timing issue: perhaps time is lost because packets are arriving out of
order. I really have no idea. I also tried using 2 identical Linksys
switches, but got similar results.
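
The even split can also be checked mechanically from the measure output. This is a minimal sketch that assumes the line format shown above; the function names `parse_measure` and `tx_share` are made up for illustration and are not part of the measure script.

```python
import re
from collections import defaultdict

# Output of the "measure" script after bonding, copied from above
measure_output = """\
Node 0 eth0 rx 8747 tx 13400
Node 0 eth1 rx 8747 tx 13398
Node MASTER eth0 rx 17 tx 2
Node MASTER eth1 rx 13221 tx 8751
Node MASTER eth2 rx 13221 tx 8751
"""

LINE = re.compile(r"Node (\S+) (\S+) rx (\d+) tx (\d+)")

def parse_measure(text):
    """Return {node: {interface: (rx, tx)}} from measure-style lines."""
    counters = defaultdict(dict)
    for match in LINE.finditer(text):
        node, iface, rx, tx = match.groups()
        counters[node][iface] = (int(rx), int(tx))
    return dict(counters)

def tx_share(counters, node, ifaces):
    """Fraction of the node's transmitted packets sent on each listed interface."""
    total = sum(counters[node][i][1] for i in ifaces)
    return {i: counters[node][i][1] / total for i in ifaces}

counters = parse_measure(measure_output)
print(tx_share(counters, "0", ["eth0", "eth1"]))
print(tx_share(counters, "MASTER", ["eth1", "eth2"]))
```

Both nodes come out at essentially 50/50 across their bonded interfaces, so the load balancing itself looks fine.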
Does anyone know why I'm experiencing degradation in performance?
-Mike
PS: To drop out of channel-bonding mode, all that was necessary was this
command on the master:
"ifconfig bond0 down; ifconfig eth2 down; ifconfig bond0 up"
Yesterday, I wrote:
> Hello,
>
> I sent an email to the list last week regarding configuring our HP
> Procurve 4000m switch for channel-bonding. I am still having major
> problems!!
>
> If the OS was configured for channel-bonding without any switch
> configuration, I got only 17Mbps :-( When I turned on trunking for
> certain ports on the switch, I got about 100Mbps, which was only a
> slight improvement from 1 NIC (plus, it was using SA/DA, so there was
> no node-to-node bandwidth improvement).
>
> I got a response suggesting that I should set up 2 VLANs, and have all
> eth0's on VLAN-1 and all eth1's on VLAN-2. The responder said that he
> can get 190Mbps. The trunking configuration is supposed to be for
> switch-to-switch configurations. He was using an HP 2400 (or was it
> 2424?). The manual I have is for both models, so I assume that it
> will work with mine as well.
>
> I am still having major difficulties with getting it to work with
> Scyld.
>
> I telnetted to my switch, and added VLAN1 and VLAN2. All slave eth0's
> were set to VLAN1 and eth1's to VLAN2. (Note: Master has 3 NICs, so
> eth1 was put on VLAN1 and eth2 on VLAN2). I did not configure any
> overlap between the VLANs.
>
> I had to temporarily disable channel-bonding on the master to get
> the slave to boot:
>
> /etc/init.d/beowulf stop ; ifconfig bond0 down ; ifconfig eth2 down ;
> ifconfig eth1 down ; ifconfig bond0 inet 10.0.0.1 netmask 255.255.255.0 ;
> ifconfig eth1 inet 10.0.0.1 netmask 255.255.255.0 ;
> ifenslave bond0 eth1 ; /etc/init.d/beowulf start
>
> After doing so, the slave node was able to boot up off of ETH0, by
> grabbing the image from master's ETH1.
>
> Now the question is, how am I supposed to channel-bond the nodes
> after this point?
>
> When ALL NICs were part of the same VLAN, these scripts used to work:
>
> SLAVE: (no idea if there's a better way)
> #!/bin/csh -f
> set node=$1
> modprobe --node $node bonding
> cat <<EOF > /tmp/runme
> ifconfig eth0 inet `bpstat -a $node` netmask 255.255.255.0
> ifconfig bond0 inet `bpstat -a $node` netmask 255.255.255.0
> ifenslave bond0 eth0
> ifenslave bond0 eth1
> EOF
> bpcp /tmp/runme ${node}:/tmp
> bpsh $node nohup csh -f /tmp/runme
>
> MASTER:
> ifenslave bond0 eth2
>
>
> Now that the NICs are on 2 distinct VLANs, the SLAVE script "hangs",
> which makes sense because its eth1 interface is transmitting half of
> the packets onto an isolated VLAN that MASTER's eth2 was not
> configured for yet. When I ran the MASTER line immediately after
> that, it did not remedy the problem.
>
> From the master (10.0.0.1), I could still ping the slave (10.0.0.2).
> The TCP/IP connections could not be established for some reason.
> I purposely started an FTP server on the slave so that I can test
> forming TCP/IP connections afterward. ftp'ing to the slave just
> hung, although I could sniff 2-way packets from 10.0.0.2-615 to
> 10.0.0.1-2223 (bpmaster's port). I'm guessing that the connection
> was spoiled since the slave put up this message:
>
> bproc: connect: connect failed, errno=111
> bpslave: short read - lost connection to master
> rebooting in 30 seconds
>
> "ifconfig -a" on the master and slave showed that they were properly
> bonded. I also tricked the slave into giving me a shell so that I
> could type stuff after it lost connection with the master. I did this
> by "bpsh 0 csh -f /tmp/shell"
> /tmp/shell looks like:
> tcsh < /dev/console > /dev/console
>
> From the slave, I was able to still ping the master, and "ifconfig -a"
> showed that it was properly bonded. Of course, it rebooted me in
> 30 seconds :-(
>
> At this point, I'll buy a whole new switch if it makes my life easier!
> Any ideas?
>
> -Mike
--
Michael J. Weller, M.Sc.     office: (972) 235-7881 x.242
weller at zyvex.com          cell: (214) 616-6340
Zyvex Corp., 1321 N Plano    facsimile: (972) 235-7882
Richardson, TX 75081         icq: 6180540