[eepro100] Power glitch disables interface until machine is UNPLUGGED.

Mike Herrick mherrick@openreach.com
Tue Oct 22 11:45:01 2002


This weekend we had a power glitch that caused all of our PCs to reboot.
The PCs with the Tyan S2425 (Tomcat 815ef) motherboard, which has two
onboard eepro100 NICs, failed to recover in an interesting way: the PC
rebooted, but the eth0 interface could not be used.  The eth1 interface was
fine.  A soft reboot does not clear the problem, a power off/on cycle does
not clear it, and a hard reset does not clear it either!
The problem clears up only after UNPLUGGING the machine from the power
supply.
All three PCs display the same behavior.  After briefly interrupting power
on one of the PCs about 10 times, I was able to reproduce the problem.

My questions:
1) Is there any way to prevent this type of problem (e.g. a BIOS setting
or an EEPROM setting)?
2) Assuming I can't prevent it, is there a way to automatically detect
and recover from the problem (e.g. read an MII register and do some
type of reset)?

Thanks,

Mike Herrick
mherrick@openreach.com


Specifics:
Linux 2.2.13
Tyan S2425 motherboard
According to the manual: one LAN interface is provided via 82599 [sic?]
controller and the other one via Intel's ICH2 (8252EM) [sic?].
When I look at the motherboard, I see an 82559 and 82562EM.

dmesg:
...
eth0: Intel Pro/100 V Network at 0xc8000000, 00:E0:81:20:2B:72, IRQ 7.
  Board assembly 000000-000, Physical connectors present: RJ45
  Primary interface chip i82555 PHY #1.
  General self-test: passed.
  Serial sub-system self-test: passed.
  Internal registers self-test: passed.
  ROM checksum self-test: passed (0x04f4518b).
eth1: Intel i82559 rev 8 at 0xc8002000, 00:E0:81:20:2B:73, IRQ 10.
  Board assembly 567812-052, Physical connectors present: RJ45
  Primary interface chip i82555 PHY #1.
  General self-test: passed.
  Serial sub-system self-test: passed.
  Internal registers self-test: passed.
  ROM checksum self-test: passed (0x04f4518b).
eepro100.c:v1.24 7/25/2002 Donald Becker <becker@scyld.com>
  http://www.scyld.com/network/eepro100.html
...

messages:
...
Oct 20 17:40:11 localhost kernel: eth0: Transmit timed out: status 0090 0000 at 62/74 commands 000c0000 000c0000 000c0000.
Oct 20 17:40:11 localhost kernel: eth0: Restarting the chip...
Oct 20 17:40:15 localhost kernel: eth0: Transmit timed out: status 0090 0000 at 62/75 commands 000ca000 000ca000 000ca000.
Oct 20 17:40:15 localhost kernel: eth0: Restarting the chip...
Oct 20 17:40:19 localhost kernel: eth0: Transmit timed out: status 0090 0000 at 62/76 commands 000ca000 000ca000 000ca000.
Oct 20 17:40:19 localhost kernel: eth0: Restarting the chip...
Oct 20 17:40:21 localhost kernel: eth0: Transmit timed out: status 0090 0000 at 62/77 commands 000ca000 000ca000 000ca000.
Oct 20 17:40:21 localhost kernel: eth0: Restarting the chip...
...
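
One way to notice this state automatically is to watch the system log for
repeated "Transmit timed out" messages on the same interface.  The sketch
below is only an illustration: the function names and the threshold of 3
are my own assumptions, not part of the driver or of any existing tool.
It does nothing more than count the messages shown above.

```python
import re
from collections import Counter

# Matches the driver message format seen in the log excerpt above.
TIMEOUT_RE = re.compile(r"kernel: (eth\d+): Transmit timed out")

def count_tx_timeouts(log_text):
    """Return a Counter mapping interface name to timeout message count."""
    return Counter(TIMEOUT_RE.findall(log_text))

def stuck_interfaces(log_text, threshold=3):
    """List interfaces with at least `threshold` timeouts.

    The threshold is an arbitrary assumption chosen for illustration.
    """
    counts = count_tx_timeouts(log_text)
    return sorted(name for name, n in counts.items() if n >= threshold)

# Sample taken from the log excerpt in this post.
SAMPLE = """\
Oct 20 17:40:11 localhost kernel: eth0: Transmit timed out: status 0090
0000 at 62/74 commands 000c0000 000c0000 000c0000.
Oct 20 17:40:11 localhost kernel: eth0: Restarting the chip...
Oct 20 17:40:15 localhost kernel: eth0: Transmit timed out: status 0090
0000 at 62/75 commands 000ca000 000ca000 000ca000.
Oct 20 17:40:15 localhost kernel: eth0: Restarting the chip...
Oct 20 17:40:19 localhost kernel: eth0: Transmit timed out: status 0090
0000 at 62/76 commands 000ca000 000ca000 000ca000.
Oct 20 17:40:19 localhost kernel: eth0: Restarting the chip...
Oct 20 17:40:21 localhost kernel: eth0: Transmit timed out: status 0090
0000 at 62/77 commands 000ca000 000ca000 000ca000.
Oct 20 17:40:21 localhost kernel: eth0: Restarting the chip...
"""

if __name__ == "__main__":
    print(stuck_interfaces(SAMPLE))
```

A check like this only detects the symptom.  It cannot clear it, since in
my case even a hard reset did not help.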

ifconfig eth0:
eth0      Link encap:Ethernet  HWaddr 00:E0:81:20:2B:72
          BROADCAST MULTICAST  MTU:1500  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:100
          Interrupt:7

ifconfig eth1:
eth1      Link encap:Ethernet  HWaddr 00:E0:81:20:2B:73
          inet addr:208.185.50.2  Bcast:208.185.50.127  Mask:255.255.255.128
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:18691 errors:0 dropped:0 overruns:0 frame:0
          TX packets:17611 errors:0 dropped:0 overruns:0 carrier:0
          collisions:193 txqueuelen:100
          Interrupt:10 Base address:0x2000

eepro100-diag -aaeem:
eepro100-diag.c:v2.09 7/15/2002 Donald Becker (becker@scyld.com)
 http://www.scyld.com/diag/index.html
Index #1: Found a Intel i82562 Pro/100 V adapter at 0xc400.
i82557 chip registers at 0xc400:
  00000000 00000000 00000000 00080002 18200000 00000000
  No interrupt sources are pending.
   The transmit unit state is 'Idle'.
   The receive unit state is 'Idle'.
  This status is unusual for an activated interface.
EEPROM contents, size 64x16:
    00: e000 2081 722b 1a03 0000 0201 4701 0000
  0x08: 0000 0000 40f0 3010 8086 0064 ffff ffff
  0x10: ffff ffff ffff ffff ffff ffff ffff ffff
  0x18: ffff ffff ffff ffff ffff ffff ffff ffff
  0x20: ffff ffff ffff ffff ffff ffff ffff ffff
  0x28: ffff ffff ffff ffff ffff ffff ffff ffff
  0x30: 0120 4000 3003 ffff ffff ffff ffff ffff
  0x38: ffff ffff ffff 0000 ffff ffff ffff 8229
 The EEPROM checksum is correct.
Intel EtherExpress Pro 10/100 EEPROM contents:
  Station address 00:E0:81:20:2B:72.
  Board assembly 000000-000, Physical connectors present: RJ45
  Primary interface chip i82555 PHY #1.
Index #2: Found a Intel i82557/8/9 EtherExpressPro100 adapter at 0xc800.
A potential i82557 chip has been found, but it appears to be active.
Either shutdown the network, or use the '-f' flag.
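
As a cross-check that the EEPROM itself survived the glitch, the dump
above can be decoded by hand.  The sketch below makes two assumptions:
that the station address is stored in the first three words, low byte
first, and that the 16-bit sum of all 64 words should equal 0xBABA,
which is my understanding of the check the eepro100 driver applies.
The function names are hypothetical.

```python
# EEPROM words copied from the eepro100-diag output in this post.
EEPROM_DUMP = """\
    00: e000 2081 722b 1a03 0000 0201 4701 0000
  0x08: 0000 0000 40f0 3010 8086 0064 ffff ffff
  0x10: ffff ffff ffff ffff ffff ffff ffff ffff
  0x18: ffff ffff ffff ffff ffff ffff ffff ffff
  0x20: ffff ffff ffff ffff ffff ffff ffff ffff
  0x28: ffff ffff ffff ffff ffff ffff ffff ffff
  0x30: 0120 4000 3003 ffff ffff ffff ffff ffff
  0x38: ffff ffff ffff 0000 ffff ffff ffff 8229
"""

def parse_words(dump):
    """Turn the 'offset: w w w ...' lines into a flat list of 16-bit ints."""
    words = []
    for line in dump.splitlines():
        _, _, data = line.partition(":")
        words.extend(int(w, 16) for w in data.split())
    return words

def station_address(words):
    """Decode words 0..2 as a MAC address, low byte of each word first."""
    octets = []
    for w in words[:3]:
        octets += [w & 0xFF, w >> 8]
    return ":".join(f"{o:02X}" for o in octets)

def checksum_ok(words):
    """Assumed convention: all words sum to 0xBABA modulo 2**16."""
    return (sum(words) & 0xFFFF) == 0xBABA

if __name__ == "__main__":
    w = parse_words(EEPROM_DUMP)
    print(station_address(w), checksum_ok(w))
```

Both results agree with what eepro100-diag reports for this board (the
same station address and a correct checksum), so the EEPROM contents do
not look like the part that was corrupted.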


mii-diag -a --force eth0:
mii-diag.c:v2.05 7/13/2002 Donald Becker (becker@scyld.com)
 http://www.scyld.com/diag/index.html
  Using the old SIOCGMIIPHY value on PHY 1 (BMCR 0x0000).
  No MII transceiver present!.
 Basic mode control register 0x0000: Auto-negotiation disabled, with
 Speed fixed at 10 mbps, half-duplex.
 Basic mode status register 0x0000 ... 0000.
   Link status: not established.
   This transceiver is capable of <Warning! No media capabilities>.
   Unable to perform Auto-negotiation, negotiation not complete.
 Link partner information is not exchanged when in fixed speed mode.
   End of basic transceiver information.

 MII PHY #1 transceiver registers:
   0000 0000 0000 0000 0000 0000 0000 0000
   0000 0000 0000 0000 0000 0000 0000 0000
   0000 0000 0000 0000 0000 0000 0000 0000
   0000 0000 0000 0000 0000 0000 0000 0000.
 Basic mode control register 0x0000: Auto-negotiation disabled!
   Speed fixed at 10 mbps, half-duplex.
 Basic mode status register 0x0000 ... 0000.
   Link status: not established.
   Capable of <Warning! No media capabilities>.
   Unable to perform Auto-negotiation, negotiation not complete.
 This transceiver has no vendor identification.
 I'm advertising 0000:
   Advertising no additional info pages.
   Using an unknown (non 802.3) encapsulation.
 Link partner capability is 0000:.
   Negotiation did not complete.
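
The all-zero register dump above suggests one possible detection check
for question 2: treat the PHY as unreachable when every MII register
reads the same empty value.  This is only a sketch.  The function names
are hypothetical, and treating all-0xffff as a second "absent" pattern
is my assumption; it does not appear in the output above.

```python
import re

def parse_mii_registers(output):
    """Extract up to 32 register words after the 'transceiver registers:' header."""
    _, _, tail = output.partition("transceiver registers:")
    words = re.findall(r"\b[0-9a-f]{4}\b", tail)[:32]
    return [int(w, 16) for w in words]

def phy_looks_absent(regs):
    """True when every register reads 0x0000, or every one reads 0xffff.

    The 0xffff case is an assumption added for illustration.
    """
    if not regs:
        return False
    return all(r == 0x0000 for r in regs) or all(r == 0xFFFF for r in regs)

# Register dumps copied from the mii-diag output in this post.
ETH0_DUMP = """\
 MII PHY #1 transceiver registers:
   0000 0000 0000 0000 0000 0000 0000 0000
   0000 0000 0000 0000 0000 0000 0000 0000
   0000 0000 0000 0000 0000 0000 0000 0000
   0000 0000 0000 0000 0000 0000 0000 0000.
"""
ETH1_DUMP = """\
 MII PHY #1 transceiver registers:
   3000 782d 02a8 0154 05e1 0021 0000 0000
   0000 0000 0000 0000 0000 0000 0000 0000
   0400 0000 0001 0000 0000 0000 0000 0000
   0000 0000 0000 0000 0000 0000 0000 0000.
"""

if __name__ == "__main__":
    for name, dump in (("eth0", ETH0_DUMP), ("eth1", ETH1_DUMP)):
        print(name, phy_looks_absent(parse_mii_registers(dump)))
```

Whether any software reset can bring the PHY back from this state is
exactly what I do not know.  The check only tells the two cases apart.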


mii-diag -a --force eth1:
mii-diag.c:v2.05 7/13/2002 Donald Becker (becker@scyld.com)
 http://www.scyld.com/diag/index.html
  Using the old SIOCGMIIPHY value on PHY 1 (BMCR 0x3000).
 Basic mode control register 0x3000: Auto-negotiation enabled.
 You have link beat, and everything is working OK.
   This transceiver is capable of  100baseTx-FD 100baseTx 10baseT-FD 10baseT.
   Able to perform Auto-negotiation, negotiation complete.
 Your link partner is generating 10baseT link beat  (no autonegotiation).
   End of basic transceiver information.

 MII PHY #1 transceiver registers:
   3000 782d 02a8 0154 05e1 0021 0000 0000
   0000 0000 0000 0000 0000 0000 0000 0000
   0400 0000 0001 0000 0000 0000 0000 0000
   0000 0000 0000 0000 0000 0000 0000 0000.
 Basic mode control register 0x3000: Auto-negotiation enabled.
 Basic mode status register 0x782d ... 782d.
   Link status: established.
   Capable of  100baseTx-FD 100baseTx 10baseT-FD 10baseT.
   Able to perform Auto-negotiation, negotiation complete.
 Vendor ID is 00:aa:00:--:--:--, model 21 rev. 4.
   No specific information is known about this transceiver type.
 I'm advertising 05e1: Flow-control 100baseTx-FD 100baseTx 10baseT-FD 10baseT
   Advertising no additional info pages.
   IEEE 802.3 CSMA/CD protocol.
 Link partner capability is 0021: 10baseT.
   Negotiation did not complete.
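
For comparison, the working interface's basic mode status register
(0x782d) can be decoded bit by bit.  The sketch below uses the standard
MII status register bit layout as I understand it.  The table and
function names are my own, and this is not how mii-diag itself is
implemented.

```python
# Media capability bits of the MII basic mode status register (register 1).
BMSR_MEDIA_BITS = [
    (0x8000, "100baseT4"),
    (0x4000, "100baseTx-FD"),
    (0x2000, "100baseTx"),
    (0x1000, "10baseT-FD"),
    (0x0800, "10baseT"),
]
BMSR_ANEG_COMPLETE = 0x0020  # auto-negotiation finished
BMSR_ANEG_CAPABLE = 0x0008   # PHY is able to auto-negotiate
BMSR_LINK_UP = 0x0004        # link status

def decode_bmsr(bmsr):
    """Break a BMSR value into media capabilities and status flags."""
    return {
        "media": [name for bit, name in BMSR_MEDIA_BITS if bmsr & bit],
        "autoneg_capable": bool(bmsr & BMSR_ANEG_CAPABLE),
        "autoneg_complete": bool(bmsr & BMSR_ANEG_COMPLETE),
        "link_up": bool(bmsr & BMSR_LINK_UP),
    }

if __name__ == "__main__":
    print("eth1:", decode_bmsr(0x782D))  # value reported for eth1
    print("eth0:", decode_bmsr(0x0000))  # value reported for eth0
```

Decoding 0x782d this way gives the same capabilities and link state that
mii-diag prints for eth1, while 0x0000 yields no media and no link, which
matches the "No media capabilities" warning on eth0.  Note that the link
status bit latches low, so a real monitor would normally read the
register twice before trusting it.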