KNE100Tx error; urgent

Brian C. Merrell brian@patriot.net
Mon Jul 19 10:45:28 1999


Our shell server has been running this card flawlessly for four months. The
card identifies itself as 21142-based (or possibly 21143?).  A few days ago,
the machine began dropping telnet connections.  Looking at the card, the Link
and 100 lights go out for a brief moment, and the port LED on the Cisco switch
goes yellow (and stays yellow until things start working again ~30 sec later).
All network activity to and from the machine hangs.  This is happening
every 1-2 minutes.

mii-diag reports:

Basic registers of MII PHY #1:  3000 7829 0016 f831 01e1 41e1 ffff ffff.
 Basic mode control register 0x3000: Auto-negotiation enabled.
 Basic mode status register 0x7829 ... 782d.
   Link status: previously broken, but now reestablished.
 Your link partner can do 41e1: 100baseTx-FD 100baseTx 10baseT-FD 10baseT.

But a moment later, it reports

Basic registers of MII PHY #1:  3000 782d 0016 f831 01e1 41e1 ffff ffff.
 Basic mode control register 0x3000: Auto-negotiation enabled.
 You have link beat, and everything is working OK.
 Your link partner can do 41e1: 100baseTx-FD 100baseTx 10baseT-FD 10baseT.
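
The two status values above, 0x7829 and 0x782d, differ in a single bit. A
minimal sketch in Python, assuming the standard IEEE 802.3 clause 22 layout of
the basic mode status register (the names decode_bmsr and changed_bits are
made up for this illustration and are not part of mii-diag):

```python
# Bit positions in the MII basic mode status register (register 1),
# assuming the standard IEEE 802.3 clause 22 layout.
BMSR_BITS = {
    15: "100baseT4 capable",
    14: "100baseTx-FD capable",
    13: "100baseTx capable",
    12: "10baseT-FD capable",
    11: "10baseT capable",
    5: "auto-negotiation complete",
    4: "remote fault",
    3: "auto-negotiation capable",
    2: "link up",
    1: "jabber detected",
    0: "extended register set",
}


def decode_bmsr(value):
    """Return the names of all known bits that are set in a BMSR value."""
    return [name
            for bit, name in sorted(BMSR_BITS.items(), reverse=True)
            if value & (1 << bit)]


def changed_bits(before, after):
    """Return the bit positions that differ between two 16-bit values."""
    diff = before ^ after
    return [bit for bit in range(16) if diff & (1 << bit)]


first_read, second_read = 0x7829, 0x782D
print(changed_bits(first_read, second_read))   # [2]
print("link up" in decode_bmsr(first_read))    # False
print("link up" in decode_bmsr(second_read))   # True
```

The link status bit latches low: after a link failure it reads 0 once, and the
next read returns the current state. That is consistent with the
"0x7829 ... 782d" line and the "previously broken, but now reestablished"
message, so each such report means the link dropped at some point since the
register was last read.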

This is repeatable (much to my distress).  Using the verbose option:

mii-diag.c:v1.05 2/17/99  Donald Becker (becker@cesdis.gsfc.nasa.gov)
 MII PHY #1 transceiver registers:
   3000 7829 0016 f831 01e1 41e1 ffff ffff
   ffff ffff ffff ffff ffff ffff ffff ffff
   0022 ff00 4ae0 fff0 0008 ffff ffff ffff
   ffff ffff ffff ffff ffff ffff ffff ffff.
 Basic mode control register 0x3000: Auto-negotiation enabled.
 Basic mode status register 0x7829 ... 782d.
   Link status: previously broken, but now reestablished.
   This transceiver is capable of  100baseTx-FD 100baseTx 10baseT-FD 10baseT.
   Able to perform Auto-negotiation, negotiation complete.
 Your link partner can do 41e1: 100baseTx-FD 100baseTx 10baseT-FD 10baseT.

And then:

mii-diag.c:v1.05 2/17/99  Donald Becker (becker@cesdis.gsfc.nasa.gov)
 MII PHY #1 transceiver registers:
   3000 782d 0016 f831 01e1 41e1 ffff ffff
   ffff ffff ffff ffff ffff ffff ffff ffff
   0022 ff00 02d0 fff0 0010 ffff ffff ffff
   ffff ffff ffff ffff ffff ffff ffff ffff.
 Basic mode control register 0x3000: Auto-negotiation enabled.
 You have link beat, and everything is working OK.
   This transceiver is capable of  100baseTx-FD 100baseTx 10baseT-FD 10baseT.
   Able to perform Auto-negotiation, negotiation complete.
 Your link partner can do 41e1: 100baseTx-FD 100baseTx 10baseT-FD 10baseT.
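
To see exactly which transceiver registers change between the two verbose
dumps, they can be compared word by word. A rough sketch in Python; parse_dump
and diff_registers are hypothetical helper names, and the dumps are copied
from the output above:

```python
def parse_dump(text):
    """Turn whitespace-separated hex words into a list of register values."""
    return [int(word.rstrip("."), 16) for word in text.split()]


def diff_registers(before, after):
    """Map register number -> (old, new) for every register that differs."""
    return {reg: (old, new)
            for reg, (old, new) in enumerate(zip(before, after))
            if old != new}


link_was_down = parse_dump("""
   3000 7829 0016 f831 01e1 41e1 ffff ffff
   ffff ffff ffff ffff ffff ffff ffff ffff
   0022 ff00 4ae0 fff0 0008 ffff ffff ffff
   ffff ffff ffff ffff ffff ffff ffff ffff.
""")
link_is_up = parse_dump("""
   3000 782d 0016 f831 01e1 41e1 ffff ffff
   ffff ffff ffff ffff ffff ffff ffff ffff
   0022 ff00 02d0 fff0 0010 ffff ffff ffff
   ffff ffff ffff ffff ffff ffff ffff ffff.
""")

for reg, (old, new) in sorted(diff_registers(link_was_down, link_is_up).items()):
    print(f"register {reg:2d}: {old:04x} -> {new:04x}")
```

Only registers 1, 18 and 20 differ. Register 1 is the standard status
register; registers 16 and up are vendor-specific, so what 18 and 20 mean
depends on the datasheet for the particular transceiver.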

I have been working on this for two days!  We have 12 machines chugging
along with Kingston 10/100 cards in them, all using tulip chips (21142),
but the one in the shell server will not work.  I have changed NICs
three times, tried different PCI slots, tried different drivers, tried
different kernels, tried moving the hard drive to a *new machine*, and
still the goddamn thing won't work on this one machine.  I've looked on
the development site at CESDIS, tried a few of the things I saw there, and
still nothing.  This machine has been running for months with this
card/driver/kernel combination!  Has anyone had an experience like this,
and if so, could you help?

 - I've tried kernels 2.0.36 and 2.2.9
 - I've tried the 0.89 and 0.91 driver versions
 - I've tried swapping NICs (none of the three worked)
 - I've tried swapping PCI slots
 - I've tried swapping the whole machine (just changing the HDD over)
 - I've changed the Ethernet cable and the port on the router
 - I've tried chanting, crying, and begging.

Help!

-brian

Brian C. Merrell      P a t r i o t  N e t      Systems Staff
brian@patriot.net    http://www.patriot.net    (703) 277-7737
          PatriotNet ICBM address: 38.845 N, 77.3 W