[eepro100] eepro100, Same Old Timed Out?

Estes, Matthew mestes@microstrategy.com
Fri, 20 Oct 2000 12:36:17 -0400


I don't know whether there is anything new to try with this issue other than
switching cards, but here is my setup:
8 mail servers, 2x777MHz i686 Dell, PERC Raid Arrays (Raid-5), percraid.o
drivers, 4x eepro100 NICs (with on-board NIC), Cisco 6509 Switch/Router.
2.2.16-4 kernel (from RedHat), qmail, aic7xxx driver.
I switched from fixed 100/Full to autonegotiation after reading that a fixed
100/Full setting can cause problems.  It autonegotiates to 100/Full, but I
had to unplug the cables first to actually get it to autonegotiate.

When I run at loads of about 2.00 or so, things seem to do well.  When I
increase to loads of 8+, the machines handling both RX and TX traffic always
break (the TX-only machines never seem to break, but that could be unrelated).

I started with the eepro100 1.09j-t driver from Andrey that comes with the
2.2.16-4 kernel and got cmd_timeout errors.  I then switched to 1.11 from
Donald and got "Transmit timed out" errors.

I've tried rx-copybreak=10, rx-copybreak=1518 (I was desperate),
max_interrupt_work=800, and the "noapic" boot option.
I've thought of using someone's custom modification for changing the MTU, but
I didn't think that was related.

Note:  eth0 and USB are both on IRQ 11 and bus 0.  Could this cause a
problem?  One card timed out with only two NICs installed, on IRQ 11 and 5,
while the USB was on IRQ 10 (still bus 0).

eth0 is the default route out, while the traffic is sent to eth1; the other
cards are currently unused.
Here is the summary of logs I kept during the NIC timeouts:

Oct 19 11:19:00 guru5-exds kernel: eth0: Transmit timed out: status 0000 0010 at 14923/14954 commands 000c0000 000c0000 000c0000.
Oct 19 11:19:04 guru5-exds kernel: eth0: Transmit timed out: status 0000 0010 at 14923/14955 commands 000c0000 000c0000 000c0000.
Oct 19 11:19:06 guru5-exds kernel: eth0: Transmit timed out: status 0000 0010 at 14923/14956 commands 4001a000 000c0000 000c0000.
Oct 19 11:19:10 guru5-exds kernel: eth0: Transmit timed out: status 0000 0010 at 14923/14957 commands 0001a000 4001a000 000c0000.
Oct 19 11:19:14 guru5-exds kernel: eth0: Transmit timed out: status 0000 0010 at 14923/14958 commands 0001a000 0001a000 4001a000.
(message repeats)
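
For anyone who wants to tabulate these timeouts across several machines, here
is a rough Python sketch that pulls the numbers out of each message.  The
field names (SCB status, SCB command, dirty_tx, cur_tx) are my assumption
about what the driver prints in that order, not something taken from the
driver source, and parse_timeout is just a helper name made up for this
example.

```python
import re

# Assumed layout of the message (not checked against the driver source):
# "status <SCB status> <SCB command> at <dirty_tx>/<cur_tx> commands ..."
TIMEOUT_RE = re.compile(
    r"(?P<dev>eth\d+): Transmit timed out: status (?P<status>[0-9a-f]{4})\s+"
    r"(?P<cmd>[0-9a-f]{4}) at (?P<dirty>\d+)/(?P<cur>\d+) commands"
)

def parse_timeout(line):
    """Return the decoded fields of one timeout line, or None if no match."""
    m = TIMEOUT_RE.search(line)
    if m is None:
        return None
    dirty, cur = int(m.group("dirty")), int(m.group("cur"))
    return {
        "dev": m.group("dev"),
        "status": int(m.group("status"), 16),
        "cmd": int(m.group("cmd"), 16),
        "dirty_tx": dirty,
        "cur_tx": cur,
        # Packets queued but not yet reclaimed, if the field guess is right.
        "outstanding": cur - dirty,
    }

log = [
    "Oct 19 11:19:00 guru5-exds kernel: eth0: Transmit timed out: status 0000 "
    "0010 at 14923/14954 commands 000c0000 000c0000 000c0000.",
    "Oct 19 11:19:14 guru5-exds kernel: eth0: Transmit timed out: status 0000 "
    "0010 at 14923/14958 commands 0001a000 0001a000 4001a000.",
]

for entry in (parse_timeout(line) for line in log):
    print(entry["dev"], entry["dirty_tx"], entry["cur_tx"], entry["outstanding"])
```

In every message the first counter stays at 14923 while the second keeps
climbing, which is what you would expect if completed transmits are no longer
being reclaimed.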

eth0: Tx ring dump,  Tx queue 14972 / 14923:
eth0:   0 0001a000.
eth0:   1 0001a000.
eth0:   2 0001a000.
eth0:   3 0001a000.
eth0:   4 0001a000.
eth0:   5 0001a000.
eth0:   6 0001a000.
eth0:   7 0001a000.
eth0:   8 0001a000.
eth0:   9 0001a000.
eth0:   10 0001a000.
eth0: * 11 0001a000.
eth0:   12 0001a000.
eth0:   13 0001a000.
eth0:   14 0001a000.
eth0:   15 0001a000.
eth0:   16 0001a000.
eth0:   17 0001a000.
eth0:   18 0001a000.
eth0:   19 0001a000.
eth0:   20 0001a000.
eth0:   21 0001a000.
eth0:   22 0001a000.
eth0:   23 0001a000.
eth0:   24 0001a000.
eth0:   25 0001a000.
eth0:   26 0001a000.
eth0:   27 4001a000.
eth0:  =28 0001a000.
eth0:   29 0001a000.
eth0:   30 0001a000.
eth0:   31 0001a000.
eth0:Printing Rx ring (next to receive into 45349).
  Rx ring entry 0  00000001.
  Rx ring entry 1  00000001.
  Rx ring entry 2  00000001.
  Rx ring entry 3  00000001.
  Rx ring entry 4  c0000001.
  Rx ring entry 5  00000001.
  Rx ring entry 6  00000001.
  Rx ring entry 7  00000001.
  Rx ring entry 8  00000001.
  Rx ring entry 9  00000001.
  Rx ring entry 10  00000001.
  Rx ring entry 11  00000001.
  Rx ring entry 12  00000001.
  Rx ring entry 13  00000001.
  Rx ring entry 14  00000001.
  Rx ring entry 15  00000001.
  Rx ring entry 16  00000001.
  Rx ring entry 17  00000001.
  Rx ring entry 18  00000001.
  Rx ring entry 19  00000001.
  Rx ring entry 20  00000001.
  Rx ring entry 21  00000001.
  Rx ring entry 22  00000001.
  Rx ring entry 23  00000001.
  Rx ring entry 24  00000001.
  Rx ring entry 25  00000001.
  Rx ring entry 26  00000001.
  Rx ring entry 27  00000001.
  Rx ring entry 28  00000001.
  Rx ring entry 29  00000001.
  Rx ring entry 30  00000001.
  Rx ring entry 31  00000001.
  PHY index 1 register 0 is 3000.
  PHY index 1 register 1 is 782d.
  PHY index 1 register 2 is 02a8.
  PHY index 1 register 3 is 0154.
  PHY index 1 register 4 is 05e1.
  PHY index 1 register 5 is 41e1.
  PHY index 1 register 21 is 0000.
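
A quick arithmetic check on the ring dumps above, as a small Python sketch.
It assumes the driver indexes each ring by the running counter modulo the ring
size (32 entries are printed for both rings); that is my reading of the dump,
not something confirmed from the source.

```python
RING_SIZE = 32  # entries 0..31 are printed for both the Tx and the Rx ring

def ring_slot(counter, ring_size=RING_SIZE):
    """Map a free-running packet counter to its slot in the ring."""
    return counter % ring_size

cur_tx, dirty_tx = 14972, 14923   # from "Tx queue 14972 / 14923"
rx_next = 45349                   # from "next to receive into 45349"

print(ring_slot(dirty_tx))  # 11, the entry marked "*"
print(ring_slot(cur_tx))    # 28, the entry marked "="
print(ring_slot(rx_next))   # 5, right after Rx entry 4 (c0000001)
```

If that reading is right, "*" is the oldest unreclaimed Tx entry, "=" is the
next free one, and Rx entry 4 is the last one the card filled.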

eepro100-diag.c:v2.02 7/19/2000 Donald Becker (becker@scyld.com)
 http://www.scyld.com/diag/index.html
Index #1: Found a Intel i82557 (or i82558) EtherExpressPro100B adapter at
0xecc0.
i82557 chip registers at 0xecc0:
  00100000 371a6244 00000000 00080002 142405e1 00000000
  No interrupt sources are pending.
   The transmit unit state is 'Idle'.
   The receive unit state is 'Idle'.
  This status is unusual for an activated interface.
 The Command register has an unprocessed command 0010(?!).
Index #2: Found a Intel i82557 (or i82558) EtherExpressPro100B adapter at
0xec80.
i82557 chip registers at 0xec80:
  00000050 36a1d8e4 00000000 00080002 182541e1 000005f0
  No interrupt sources are pending.
   The transmit unit state is 'Suspended'.
   The receive unit state is 'Ready'.
  This status is normal for an activated but idle interface.
Index #3: Found a Intel i82557 (or i82558) EtherExpressPro100B adapter at
0xec40.
i82557 chip registers at 0xec40:
  00000000 00000000 00000000 00080002 183f0000 00000000
  No interrupt sources are pending.
   The transmit unit state is 'Idle'.
   The receive unit state is 'Idle'.
  This status is unusual for an activated interface.
Index #4: Found a Intel i82557 (or i82558) EtherExpressPro100B adapter at
0xccc0.
i82557 chip registers at 0xccc0:
  00000000 00000000 00000000 00080002 183f0000 00000000
  No interrupt sources are pending.
   The transmit unit state is 'Idle'.
   The receive unit state is 'Idle'.
  This status is unusual for an activated interface.
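
The unit states that eepro100-diag prints can be reproduced from the first
register word of each card.  The sketch below is in Python and rests on my
understanding of the i8255x SCB layout (status in the low 16 bits, command in
the high 16 bits, CU state in status bits 7:6, RU state in bits 5:2).  Treat
the bit positions and the command names as assumptions to be checked against
the Intel documentation.

```python
# Assumed SCB field encodings; anything not listed is reported as "Unknown".
CU_STATES = {0: "Idle", 1: "Suspended", 2: "Active"}
RU_STATES = {0: "Idle", 1: "Suspended", 2: "No resources", 4: "Ready"}
CU_COMMANDS = {0: "NOP", 1: "CU Start", 2: "CU Resume"}

def decode_scb(word0):
    """Split the first 32-bit register word into status and command fields."""
    status = word0 & 0xFFFF            # SCB status word (offset 0)
    command = (word0 >> 16) & 0xFFFF   # SCB command word (offset 2)
    return {
        "cu_state": CU_STATES.get((status >> 6) & 0x3, "Unknown"),
        "ru_state": RU_STATES.get((status >> 2) & 0xF, "Unknown"),
        "pending_cu_command": CU_COMMANDS.get((command >> 4) & 0xF, "Unknown"),
    }

print(decode_scb(0x00100000))  # card at 0xecc0
print(decode_scb(0x00000050))  # card at 0xec80
```

Under those assumptions the card at 0xecc0 decodes as Idle/Idle with a CU
Start command still sitting in the command word, and the card at 0xec80 as
Suspended/Ready with nothing pending, which agrees with the diag text.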

eepro100-diag.c:v2.02 7/19/2000 Donald Becker (becker@scyld.com)
 http://www.scyld.com/diag/index.html
Index #1: Found a Intel i82557 (or i82558) EtherExpressPro100B adapter at
0xecc0.
EEPROM contents, size 64x16:
    00: d000 69b7 1568 0503 0000 0201 4701 0000
  0x08: a089 2201 4882 100c 8086 0000 0000 0000
      ...
  0x30: 002c 0000 0000 0000 0000 0000 0000 0000
  0x38: 0000 0000 0000 0000 0000 0000 0000 81cc
 The EEPROM checksum is correct.
Intel EtherExpress Pro 10/100 EEPROM contents:
  Station address 00:D0:B7:69:68:15.
  Board assembly a08922-001, Physical connectors present: RJ45
  Primary interface chip i82555 PHY #1.
Index #2: Found a Intel i82557 (or i82558) EtherExpressPro100B adapter at
0xec80.
EEPROM contents, size 64x16:
    00: d000 69b7 b965 0503 0000 0201 4701 0000
  0x08: a089 2201 4882 100c 8086 0000 0000 0000
      ...
  0x30: 002c 0000 0000 0000 0000 0000 0000 0000
  0x38: 0000 0000 0000 0000 0000 0000 0000 ddcf
 The EEPROM checksum is correct.
Intel EtherExpress Pro 10/100 EEPROM contents:
  Station address 00:D0:B7:69:65:B9.
  Board assembly a08922-001, Physical connectors present: RJ45
  Primary interface chip i82555 PHY #1.
Index #3: Found a Intel i82557 (or i82558) EtherExpressPro100B adapter at
0xec40.
EEPROM contents, size 64x16:
    00: d000 69b7 b765 0503 0000 0201 4701 0000
  0x08: a089 2201 4882 100c 8086 0000 0000 0000
      ...
  0x30: 002c 0000 0000 0000 0000 0000 0000 0000
  0x38: 0000 0000 0000 0000 0000 0000 0000 dfcf
 The EEPROM checksum is correct.
Intel EtherExpress Pro 10/100 EEPROM contents:
  Station address 00:D0:B7:69:65:B7.
  Board assembly a08922-001, Physical connectors present: RJ45
  Primary interface chip i82555 PHY #1.
Index #4: Found a Intel i82557 (or i82558) EtherExpressPro100B adapter at
0xccc0.
EEPROM contents, size 64x16:
    00: b000 20d0 cadf 0400 0000 0201 4701 0000
  0x08: 0719 5d00 48a2 009b 1028 0000 0000 0000
      ...
  0x30: 0020 0000 0000 0000 0000 0000 0000 0000
  0x38: 0000 0000 0000 0000 0000 0000 0000 146b
 The EEPROM checksum is correct.
Intel EtherExpress Pro 10/100 EEPROM contents:
  Station address 00:B0:D0:20:DF:CA.
  Receiver lock-up bug exists. (The driver work-around *is* implemented.)
  Board assembly 07195d-000, Physical connectors present: RJ45
  Primary interface chip i82555 PHY #1.
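
As a sanity check on the EEPROM dumps, the station address can be rebuilt from
the first three 16-bit words by taking the low byte of each word first.  This
is a small Python sketch; station_address is a name made up for the example.

```python
def station_address(words):
    """Build a MAC address string from the first three EEPROM words."""
    octets = []
    for word in words[:3]:
        octets.append(word & 0xFF)         # low byte comes first
        octets.append((word >> 8) & 0xFF)  # then the high byte
    return ":".join(f"{octet:02X}" for octet in octets)

print(station_address([0xD000, 0x69B7, 0x1568]))  # 00:D0:B7:69:68:15
print(station_address([0xB000, 0x20D0, 0xCADF]))  # 00:B0:D0:20:DF:CA
```

Both results match the "Station address" lines that the diag tool printed for
Index #1 and Index #4.
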
eepro100-diag.c:v2.02 7/19/2000 Donald Becker (becker@scyld.com)
 http://www.scyld.com/diag/index.html
Index #1: Found a Intel i82557 (or i82558) EtherExpressPro100B adapter at
0xecc0.
 MII PHY #1 transceiver registers:
  3000 7809 02a8 0154 05e1 0000 0000 0000
  0000 0000 0000 0000 0000 0000 0000 0000
  0000 0000 0001 0000 0000 0000 0000 0000
  0000 0000 0000 0000 0000 0000 0000 0000.
Index #2: Found a Intel i82557 (or i82558) EtherExpressPro100B adapter at
0xec80.
 MII PHY #1 transceiver registers:
  3000 782d 02a8 0154 05e1 41e1 0003 0000
  0000 0000 0000 0000 0000 0000 0000 0000
  0203 0000 0001 1443 0000 0000 126c 0000
  0000 0000 0b10 0000 0000 0000 0000 0000.
Index #3: Found a Intel i82557 (or i82558) EtherExpressPro100B adapter at
0xec40.
 MII PHY #1 transceiver registers:
  3000 7809 02a8 0154 05e1 0000 0000 0000
  0000 0000 0000 0000 0000 0000 0000 0000
  0000 0000 0001 0000 0000 0000 0000 0000
  0000 0000 0890 0000 0000 0000 0000 0000.
Index #4: Found a Intel i82557 (or i82558) EtherExpressPro100B adapter at
0xccc0.
 MII PHY #1 transceiver registers:
  3000 7809 02a8 0154 05e1 0000 0000 0000
  0000 0000 0000 0000 0000 0000 0000 0000
  0000 0000 0001 0000 0000 0000 0000 0000
  0000 0000 08f0 0000 0000 0000 0000 0000.
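
To double-check the negotiated speed from the MII registers, here is a Python
sketch that decodes registers 1, 4 and 5 using the standard MII bit
definitions (link status and autonegotiation complete in register 1,
technology ability bits in registers 4 and 5).  The helper names are made up
for this example, and it only looks at the first six registers.

```python
BMSR_LINK_UP = 0x0004        # register 1, bit 2
BMSR_ANEG_COMPLETE = 0x0020  # register 1, bit 5

# Ability bits shared by registers 4 and 5, highest priority first.
ABILITIES = [
    (0x0100, "100baseTx-FD"),
    (0x0200, "100baseT4"),
    (0x0080, "100baseTx-HD"),
    (0x0040, "10baseT-FD"),
    (0x0020, "10baseT-HD"),
]

def negotiated_mode(advertised, partner):
    """Pick the best mode that both ends advertise, or None."""
    common = advertised & partner
    for bit, name in ABILITIES:
        if common & bit:
            return name
    return None

def link_summary(regs):
    """Summarize link state from MII registers 0..5."""
    return {
        "link_up": bool(regs[1] & BMSR_LINK_UP),
        "autoneg_complete": bool(regs[1] & BMSR_ANEG_COMPLETE),
        "mode": negotiated_mode(regs[4], regs[5]),
    }

index1 = [0x3000, 0x7809, 0x02A8, 0x0154, 0x05E1, 0x0000]
index2 = [0x3000, 0x782D, 0x02A8, 0x0154, 0x05E1, 0x41E1]
print(link_summary(index1))
print(link_summary(index2))
```

With the values above, only Index #2 reports link up with autonegotiation
complete, and its common ability works out to 100baseTx-FD.  Keep in mind that
the link status bit latches, so a single read can be stale.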

eth0      Link encap:Ethernet  HWaddr 00:D0:B7:69:68:15  
          inet addr:10.31.42.108  Bcast:10.31.47.255  Mask:255.255.248.0
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:45349 errors:0 dropped:0 overruns:0 frame:0
          TX packets:14916 errors:36 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:100 
          Interrupt:11 Base address:0xc000 
eth1      Link encap:Ethernet  HWaddr 00:D0:B7:69:65:B9  
          inet addr:10.31.42.109  Bcast:10.31.47.255  Mask:255.255.248.0
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:31293 errors:0 dropped:0 overruns:0 frame:0
          TX packets:7 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:100 
          Interrupt:10 Base address:0xe000 
lo        Link encap:Local Loopback  
          inet addr:127.0.0.1  Mask:255.0.0.0
          UP LOOPBACK RUNNING  MTU:3924  Metric:1
          RX packets:4 errors:0 dropped:0 overruns:0 frame:0
          TX packets:4 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0 
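
Since eth0 and eth1 both carry traffic, it may be worth confirming that they
sit in the same subnet.  A short Python sketch with the standard ipaddress
module, using the addresses and mask from the ifconfig output above:

```python
import ipaddress

eth0 = ipaddress.ip_interface("10.31.42.108/255.255.248.0")
eth1 = ipaddress.ip_interface("10.31.42.109/255.255.248.0")

print(eth0.network)                    # 10.31.40.0/21
print(eth0.network.broadcast_address)  # 10.31.47.255
print(eth0.network == eth1.network)    # True
```

So both interfaces are in 10.31.40.0/21, and the broadcast address agrees with
what ifconfig reports.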


           CPU0       CPU1       
  0:     210216          0          XT-PIC  timer
  1:       1004          0          XT-PIC  keyboard
  2:          0          0          XT-PIC  cascade
  5:       7220          0          XT-PIC  aacraid
  8:          1          0          XT-PIC  rtc
 10:      21363          0          XT-PIC  eth1
 11:      48328          0          XT-PIC  aic7xxx, eth0
 13:          1          0          XT-PIC  fpu
 14:          7          0          XT-PIC  ide0
NMI:          1
ERR:          0

(This next interrupt dump was taken a few seconds later.  Notice that the
eth1 and eth0 counts are no longer incrementing.  However, on some machines I
believe the cards time out but the counts continue to increment.)
           CPU0       CPU1       
  0:     217843          0          XT-PIC  timer
  1:       1481          0          XT-PIC  keyboard
  2:          0          0          XT-PIC  cascade
  5:       7326          0          XT-PIC  aacraid
  8:          1          0          XT-PIC  rtc
 10:      21363          0          XT-PIC  eth1
 11:      48328          0          XT-PIC  aic7xxx, eth0
 13:          1          0          XT-PIC  fpu
 14:          7          0          XT-PIC  ide0
NMI:          1
ERR:          0
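
Comparing two snapshots by eye gets tedious across eight machines, so here is
a rough Python sketch that diffs two /proc/interrupts dumps.  It assumes the
two-CPU XT-PIC layout shown above, and parse_interrupts and deltas are helper
names made up for the example.  Only a few lines of each snapshot are included
to keep it short.

```python
def parse_interrupts(text):
    """Map each device name to its total interrupt count across both CPUs."""
    counts = {}
    for line in text.splitlines():
        head, sep, rest = line.partition(":")
        if not sep or not head.strip().isdigit():
            continue  # skip the CPU header and the NMI/ERR lines
        fields = rest.split()
        # fields: CPU0 count, CPU1 count, controller type, device name(s)
        counts[" ".join(fields[3:])] = int(fields[0]) + int(fields[1])
    return counts

def deltas(before, after):
    """Interrupts seen between the two snapshots, per device."""
    return {name: after[name] - before[name] for name in before if name in after}

first = """\
           CPU0       CPU1
  0:     210216          0          XT-PIC  timer
  5:       7220          0          XT-PIC  aacraid
 10:      21363          0          XT-PIC  eth1
 11:      48328          0          XT-PIC  aic7xxx, eth0
NMI:          1
"""
second = """\
           CPU0       CPU1
  0:     217843          0          XT-PIC  timer
  5:       7326          0          XT-PIC  aacraid
 10:      21363          0          XT-PIC  eth1
 11:      48328          0          XT-PIC  aic7xxx, eth0
NMI:          1
"""

for name, delta in deltas(parse_interrupts(first), parse_interrupts(second)).items():
    print(f"{name:15s} {delta}")
```

The timer and aacraid counts move between the two snapshots while eth1 and the
shared aic7xxx/eth0 line show a delta of zero.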

(note: eth0 and USB are on IRQ 11 and bus 0)
PCI devices found:
  Bus  0, device   0, function  0:
    Host bridge: Unknown vendor CNB30LE PCI Bridge (rev 5).
      Medium devsel.  Master Capable.  Latency=48.  
  Bus  0, device   0, function  1:
    Host bridge: Unknown vendor CNB30LE PCI Bridge (rev 5).
      Medium devsel.  Master Capable.  Latency=48.  
  Bus  0, device   2, function  0:
    Ethernet controller: Intel 82557 (rev 8).
      Medium devsel.  Fast back-to-back capable.  IRQ 11.  Master Capable.
Latency=32.  Min Gnt=8.Max Lat=56.
      Non-prefetchable 32 bit memory at 0xfe304000 [0xfe304000].
      I/O at 0xecc0 [0xecc1].
      Non-prefetchable 32 bit memory at 0xfe200000 [0xfe200000].
  Bus  0, device   4, function  0:
    Ethernet controller: Intel 82557 (rev 8).
      Medium devsel.  Fast back-to-back capable.  IRQ 10.  Master Capable.
Latency=32.  Min Gnt=8.Max Lat=56.
      Non-prefetchable 32 bit memory at 0xfe303000 [0xfe303000].
      I/O at 0xec80 [0xec81].
      Non-prefetchable 32 bit memory at 0xfe100000 [0xfe100000].
  Bus  0, device   8, function  0:
    Ethernet controller: Intel 82557 (rev 8).
      Medium devsel.  Fast back-to-back capable.  IRQ 5.  Master Capable.
Latency=32.  Min Gnt=8.Max Lat=56.
      Non-prefetchable 32 bit memory at 0xfe302000 [0xfe302000].
      I/O at 0xec40 [0xec41].
      Non-prefetchable 32 bit memory at 0xfe000000 [0xfe000000].
  Bus  1, device   8, function  0:
    Ethernet controller: Intel 82557 (rev 8).
      Medium devsel.  Fast back-to-back capable.  IRQ 10.  Master Capable.
Latency=32.  Min Gnt=8.Max Lat=56.
      Non-prefetchable 32 bit memory at 0xfa100000 [0xfa100000].
      I/O at 0xccc0 [0xccc1].
      Non-prefetchable 32 bit memory at 0xfa000000 [0xfa000000].
  Bus  0, device  14, function  0:
    VGA compatible controller: ATI Unknown device (rev 122).
      Vendor id=1002. Device id=4759.
      Medium devsel.  Fast back-to-back capable.  Master Capable.
Latency=32.  Min Gnt=8.
      Prefetchable 32 bit memory at 0xfc000000 [0xfc000008].
      I/O at 0xe800 [0xe801].
      Non-prefetchable 32 bit memory at 0xfe301000 [0xfe301000].
  Bus  0, device  15, function  0:
    ISA bridge: Unknown vendor Unknown device (rev 79).
      Vendor id=1166. Device id=200.
      Medium devsel.  Master Capable.  No bursts.  
  Bus  0, device  15, function  1:
    IDE interface: Unknown vendor Unknown device (rev 0).
      Vendor id=1166. Device id=211.
      Medium devsel.  Master Capable.  Latency=64.  
      I/O at 0x8b0 [0x8b1].
  Bus  0, device  15, function  2:
    USB Controller: Unknown vendor Unknown device (rev 4).
      Vendor id=1166. Device id=220.
      Medium devsel.  Fast back-to-back capable.  IRQ 11.  Master Capable.
Latency=32.  Max Lat=80.
      Non-prefetchable 32 bit memory at 0xfe300000 [0xfe300000].
  Bus  1, device   2, function  0:
    PCI bridge: Intel Unknown device (rev 1).
      Vendor id=8086. Device id=962.
      Medium devsel.  Fast back-to-back capable.  Master Capable.
Latency=32.  Min Gnt=6.
  Bus  1, device   2, function  1:
    RAID storage controller: Unknown vendor Unknown device (rev 1).
      Vendor id=1028. Device id=3.
      Medium devsel.  Fast back-to-back capable.  IRQ 5.  Master Capable.
Latency=32.  
      Prefetchable 32 bit memory at 0xf0000000 [0xf0000008].
  Bus  2, device   4, function  0:
    SCSI storage controller: Adaptec Unknown device (rev 1).
      Vendor id=9005. Device id=c5.
      Medium devsel.  Fast back-to-back capable.  BIST capable.  IRQ 5.
Master Capable.  Latency=32.  Min Gnt=40.Max Lat=25.
      I/O at 0xdc00 [0xdc01].
      Non-prefetchable 64 bit memory at 0xf8fff000 [0xf8fff004].
  Bus  2, device   4, function  1:
    SCSI storage controller: Adaptec AIC-7899 (rev 1).
      Medium devsel.  Fast back-to-back capable.  BIST capable.  IRQ 11.
Master Capable.  Latency=32.  Min Gnt=40.Max Lat=25.
      I/O at 0xd800 [0xd801].
      Non-prefetchable 64 bit memory at 0xf8ffe000 [0xf8ffe004].
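
To make the IRQ sharing easier to see, here is a Python sketch that groups the
devices from the listing above by IRQ.  The table is transcribed by hand from
that listing (bus:device.function, description, IRQ); the short descriptions
are my own labels.

```python
from collections import defaultdict

# (slot, description, IRQ), copied by hand from the PCI listing.
devices = [
    ("0:2.0", "Ethernet 82557", 11),
    ("0:4.0", "Ethernet 82557", 10),
    ("0:8.0", "Ethernet 82557", 5),
    ("1:8.0", "Ethernet 82557", 10),
    ("0:15.2", "USB controller", 11),
    ("1:2.1", "RAID controller", 5),
    ("2:4.0", "Adaptec SCSI", 5),
    ("2:4.1", "Adaptec AIC-7899", 11),
]

def shared_irqs(devs):
    """Return only the IRQs that are used by more than one device."""
    by_irq = defaultdict(list)
    for slot, name, irq in devs:
        by_irq[irq].append(f"{slot} {name}")
    return {irq: names for irq, names in by_irq.items() if len(names) > 1}

for irq, names in sorted(shared_irqs(devices).items()):
    print(f"IRQ {irq}: " + ", ".join(names))
```

By this listing every NIC shares its IRQ with at least one other device: IRQ
11 has the first NIC, the USB controller and one SCSI channel, IRQ 10 has two
NICs, and IRQ 5 has a NIC, the RAID controller and the other SCSI channel.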

________________________________
Matt Estes
Senior Engineer, Operations
703.770.1652  mailto:mestes@microstrategy.com