Troubles with Adaptec DuraLAN, SMP boxes, channel bonding
Ward Fenton
ward@zurg.amazingmedia.com
Tue Dec 14 03:52:14 1999
I've been struggling with the starfire driver and channel bonding on about
a dozen dual-processor Intel servers. Each DuraLAN card has two ports, and
our switch is an Extreme Networks Summit 48.
I'm continuously receiving errors during periods of intense network
activity. The errors below came from running "ping -f target" from two
local boxes simultaneously. When the "Something Wicked" error occurs, the
network link freezes momentarily, then resumes after a few seconds. I have
also run netperf, ftp, and scp with similar outcomes.
With netperf I've measured 192 Mbit/s of UDP bandwidth over my link.
TCP tests do not behave well and trigger my problems immediately. I've
spent days searching for clues and tricks to fix this problem, and I now
believe that I will either have to boot with the noapic option or migrate
to different hardware. A multiport or multiple-card tulip-based solution
seems to be the way to go, assuming 21143-based cards are available and
that the driver's recent support for hardware interrupt mitigation helps.
So far in my testing I'm only seeing these overruns with the starfire
driver. I've done some minor testing with 21140-based SMC EtherPower cards
without seeing the same problems.
One other small point of concern: I thought I had come across a message
regarding the kern-2.3 network drivers at cesdis.gsfc.nasa.gov which
stated that the starfire driver and others hadn't yet received some of
the most recent updates already applied to the more common drivers.
By the way, I can send some of my company's t-shirts to anyone who can
help get this thing moving.
Thanks in advance,
Ward
$ uname -a
Linux xxxxx 2.2.13ac3 #1 SMP Mon Dec 13 17:51:02 EST 1999 i686 unknown
$ cat /proc/interrupts
           CPU0       CPU1
  0:    1420115    1422245    IO-APIC-edge  timer
  1:         29         26    IO-APIC-edge  keyboard
  2:          0          0          XT-PIC  cascade
  8:          0          1    IO-APIC-edge  rtc
 12:          1          0    IO-APIC-edge  PS/2 Mouse
 13:          1          0          XT-PIC  fpu
 19:      10983      10942   IO-APIC-level  aic7xxx, aic7xxx
 20:     116547     114677   IO-APIC-level  eth0
 21:     115460     116831   IO-APIC-level  eth1
NMI:          0
ERR:          0
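Since the choice between booting with and without noapic is partly about how interrupts are delivered on an SMP box, it may help to put numbers on the per-CPU split for each NIC. The following is only a minimal sketch: the helper names irq_counts and cpu_shares are hypothetical, and the parser assumes the column layout shown in the listing above (a CPU header row, then one row per IRQ).

```python
def irq_counts(text, device):
    """Return per-CPU interrupt counts for `device`, or None if not listed."""
    lines = text.splitlines()
    if not lines:
        return None
    ncpus = len(lines[0].split())  # header row: CPU0 CPU1 ...
    for line in lines[1:]:
        _, _, rest = line.partition(":")
        fields = rest.split()
        if len(fields) < ncpus + 2:
            continue  # summary rows such as NMI and ERR
        # Everything after the controller type is a comma-separated device list.
        names = [n.strip() for n in " ".join(fields[ncpus + 1:]).split(",")]
        if device in names:
            return [int(n) for n in fields[:ncpus]]
    return None


def cpu_shares(counts):
    """Return each CPU's fraction of the total interrupt count."""
    total = sum(counts)
    return [c / total for c in counts] if total else [0.0] * len(counts)


# Rows copied from the /proc/interrupts listing in this post.
SAMPLE = """\
           CPU0       CPU1
 20:     116547     114677   IO-APIC-level  eth0
 21:     115460     116831   IO-APIC-level  eth1
NMI:          0
ERR:          0
"""

if __name__ == "__main__":
    for nic in ("eth0", "eth1"):
        print(nic, cpu_shares(irq_counts(SAMPLE, nic)))
```

Running the same function against /proc/interrupts after a noapic boot would show whether the distribution changes, which is the comparison of interest here.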
$ cat /proc/net/dev
Inter-|   Receive
 face |    bytes packets errs drop fifo frame compressed multicast
    lo:      200       4    0    0    0     0          0         0
 bond0: 16147680  163176    0    0    3     0          0         0
  eth0:  7477264   80485    0    0    1     0          0         0
  eth1:  8670416   82691    0    0    2     0          0         0
      |   Transmit
      |      bytes packets errs drop fifo colls carrier compressed
    lo:        200       4    0    0    0     0       0          0
 bond0: 1332302098 3798142    0    0    3     0       0          0
  eth0: 2813605078 1899071    0    0    1     0       0          0
  eth1: 2813664316 1899071    0    0    2     0       0          0
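The fifo column is where the overruns show up, so a small script that samples it before and after a load test can show which interface is accumulating them. This is a rough sketch, not a finished tool: parse_net_dev and fifo_deltas are hypothetical names, and the parser assumes the live file's layout of one row per interface with the eight receive counters followed by the eight transmit counters, rather than the split listing pasted above.

```python
def parse_net_dev(text):
    """Return {interface: (rx_fifo, tx_fifo)} from /proc/net/dev-style text."""
    counters = {}
    for line in text.splitlines():
        if ":" not in line:
            continue  # the two header lines carry no interface name
        name, _, rest = line.partition(":")
        fields = rest.split()
        if len(fields) < 16:
            continue  # not a full 16-column statistics row
        # Columns 0-7 are receive counters, 8-15 are transmit counters;
        # "fifo" is the fifth column of each group.
        counters[name.strip()] = (int(fields[4]), int(fields[12]))
    return counters


def fifo_deltas(before, after):
    """Return the per-interface increase in (rx_fifo, tx_fifo)."""
    return {
        name: (after[name][0] - before[name][0],
               after[name][1] - before[name][1])
        for name in after
        if name in before
    }


# The bond0 and eth0 rows from this post, joined into single lines.
SAMPLE = """\
Inter-|   Receive                                   |  Transmit
 face |bytes packets errs drop fifo frame compressed multicast|bytes packets errs drop fifo colls carrier compressed
 bond0: 16147680 163176 0 0 3 0 0 0 1332302098 3798142 0 0 3 0 0 0
  eth0: 7477264 80485 0 0 1 0 0 0 2813605078 1899071 0 0 1 0 0 0
"""

if __name__ == "__main__":
    print(parse_net_dev(SAMPLE))
```

On a live system the two samples would come from reading /proc/net/dev before and after a run of "ping -f target", then passing both results to fifo_deltas.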
$ ifconfig
bond0     Link encap:Ethernet  HWaddr 00:00:D1:DA:C6:33
          inet addr:208.51.95.74  Bcast:208.51.95.127  Mask:255.255.255.192
          UP BROADCAST RUNNING MASTER MULTICAST  MTU:1500  Metric:1
          RX packets:161089 errors:0 dropped:0 overruns:3 frame:0
          TX packets:3795172 errors:0 dropped:0 overruns:3 carrier:0
          collisions:0 txqueuelen:0

eth0      Link encap:Ethernet  HWaddr 00:00:D1:DA:C6:33
          inet addr:208.51.95.74  Bcast:208.51.95.127  Mask:255.255.255.192
          UP BROADCAST RUNNING SLAVE MULTICAST  MTU:1500  Metric:1
          RX packets:79532 errors:0 dropped:0 overruns:1 frame:0
          TX packets:1897586 errors:0 dropped:0 overruns:1 carrier:0
          collisions:0 txqueuelen:100
          Interrupt:20 Base address:0xa000

eth1      Link encap:Ethernet  HWaddr 00:00:D1:DA:C6:33
          inet addr:208.51.95.74  Bcast:208.51.95.127  Mask:255.255.255.192
          UP BROADCAST RUNNING SLAVE MULTICAST  MTU:1500  Metric:1
          RX packets:81557 errors:0 dropped:0 overruns:2 frame:0
          TX packets:1897586 errors:0 dropped:0 overruns:2 carrier:0
          collisions:0 txqueuelen:100
          Interrupt:21 Base address:0xb000
from dmesg:
starfire.c:v0.13 8/21/99 Written by Donald Becker
Undates and info at http://www.beowulf.org/linux/drivers.html
eth0: Adaptec Starfire 6915 at 0x9005a000, 00:00:d1:da:c6:33, IRQ 20.
eth0: MII PHY found at address 1, status 0x782d advertising 01e1.
eth1: Adaptec Starfire 6915 at 0x900db000, 00:00:d1:da:c6:34, IRQ 21.
eth1: MII PHY found at address 1, status 0x782d advertising 01e1.
eth0: Setting full-duplex based on MII #1 link partner capability of 41e1.
eth1: Setting full-duplex based on MII #1 link partner capability of 41e1.
eth0: Something Wicked happened! 2048101.
eth0: Something Wicked happened! 2048101.
eth1: Something Wicked happened! 2048101.
eth0: Something Wicked happened! 2048101.
eth0: Link changed: Autonegotiation advertising 01e1 partner 41e1.
eth0: Something Wicked happened! ffffffff.
eth1: Link changed: Autonegotiation advertising 01e1 partner 41e1.
eth1: Something Wicked happened! ffffffff.
eth1: Link changed: Autonegotiation advertising 01e1 partner 41e1.
eth1: Something Wicked happened! ffffffff.
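The number printed after "Something Wicked happened!" looks like a raw status word, so breaking it into individual bits makes it easier to compare against the bit definitions in starfire.c. This is a minimal sketch that assumes the value is hexadecimal (the ffffffff entries suggest so). The helper names are hypothetical, and no meaning is assigned to any bit here, since that has to come from the driver source.

```python
def set_bits(status_hex):
    """Return the positions of the bits set in a 32-bit hex status word."""
    value = int(status_hex, 16)
    return [bit for bit in range(32) if value & (1 << bit)]


def is_all_ones(status_hex):
    """True when every bit of the 32-bit status word is set."""
    return int(status_hex, 16) == 0xFFFFFFFF


if __name__ == "__main__":
    for word in ("2048101", "ffffffff"):
        print(word, set_bits(word), is_all_ones(word))
```

An all-ones word is often what a register read returns when the device did not respond at all, so the ffffffff entries may point to a different failure than the 2048101 ones. That is a general pattern rather than something confirmed for this card.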