[realtek] Hang of rtl8139

Rudy Zijlstra rudy@edsons.demon.nl
Sat, 02 Dec 2000 10:49:28 +0100


Hi,

Over the past few days I have been experiencing regular hangs of an
RTL8139-based NIC.

System: Alpha XL with 128 MB of RAM running SuSE Linux 6.4 / 7.0
(upgraded via FTP from 6.1).
Driver version:

Dec  2 02:35:09 silk kernel: rtl8139.c:v1.12 9/14/2000 Donald Becker, becker@scyld.com.
Dec  2 02:35:09 silk kernel:  http://www.scyld.com/network/rtl8139.html
Dec  2 02:35:09 silk kernel: eth0: RealTek RTL8139 Fast Ethernet at 0x8800, IRQ 15, 00:00:b4:a7:f5:de.

The NIC is connected to a Compaq HB2122 Fast Ethernet hub, which in
turn is connected to a Compaq Netelligent 5708 switch.
The servers that handle most of the traffic are connected to the switch.

All servers are Linux based.

I get some CRC errors on the link between hub and switch, most probably
caused by the cable.

mii-diag output while the NIC is hung:

bash-2.03# ./mii-diag -aa
Using the default interface 'eth0'.
Basic registers of MII PHY #32:  1000 782d 0000 0000 01e1 0000 0000 0000.
 Basic mode control register 0x1000: Auto-negotiation enabled.
 You have link beat, and everything is working OK.
 Your link partner does not do autonegotiation, and this transceiver type
  does not report the sensed link speed.
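
For anyone who wants to cross-check the raw register words against the mii-diag summary, here is a minimal sketch of a decoder. It assumes the standard MII register layout (IEEE 802.3 clause 22: register 0 is control, 1 is status, 4 is the advertised abilities, 5 is the link partner abilities). The function and table names are made up for illustration and are not part of mii-diag.

```python
# Bit masks for the standard MII registers (assumed IEEE 802.3 clause 22 layout).
BMCR_BITS = {
    0x8000: "reset",
    0x4000: "loopback",
    0x2000: "100Mbps selected",
    0x1000: "auto-negotiation enabled",
    0x0800: "power down",
    0x0400: "isolate",
    0x0200: "restart auto-negotiation",
    0x0100: "full duplex",
}
BMSR_BITS = {
    0x8000: "100baseT4 capable",
    0x4000: "100baseTx-FD capable",
    0x2000: "100baseTx-HD capable",
    0x1000: "10baseT-FD capable",
    0x0800: "10baseT-HD capable",
    0x0020: "auto-negotiation complete",
    0x0010: "remote fault",
    0x0008: "auto-negotiation capable",
    0x0004: "link up",
    0x0002: "jabber detected",
    0x0001: "extended register set",
}
ABILITY_BITS = {
    0x0100: "100baseTx-FD",
    0x0080: "100baseTx-HD",
    0x0040: "10baseT-FD",
    0x0020: "10baseT-HD",
}


def decode(value, table):
    """Return the names of all bits in `table` that are set in `value`."""
    return [name for mask, name in sorted(table.items(), reverse=True)
            if value & mask]


def decode_mii(words):
    """Decode the eight hex words printed by mii-diag or the driver."""
    regs = [int(w, 16) for w in words.split()]
    return {
        "control": decode(regs[0], BMCR_BITS),
        "status": decode(regs[1], BMSR_BITS),
        "advertised": decode(regs[4], ABILITY_BITS),
        "partner": decode(regs[5], ABILITY_BITS),
    }


if __name__ == "__main__":
    result = decode_mii("1000 782d 0000 0000 01e1 0000 0000 0000")
    for register, flags in result.items():
        print(register, flags)
```

With the values above, the partner entry comes out empty, which agrees with mii-diag's remark that the link partner does not do autonegotiation.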

rtl8139-diag output, NIC still hung:

bash-2.03# ./rtl8139-diag -aa
rtl8139-diag.c:v2.00 4/19/2000 Donald Becker (becker@scyld.com)
 http://www.scyld.com/diag/index.html
Index #1: Found a RealTek RTL8139 adapter at 0x8800.
The RealTek chip appears to be active, so some registers will not be read.
To see all register values use the '-f' flag.
RealTek chip registers at 0x8800
 0x000: a7b40000 0000def5 80000000 00000000 8008a03c 8008a03c 8008a03c 8008a03c
 0x020: 43fe8010 43fe8610 43fe8c10 43fe9210 43fe0000 0d000000 00007bdc 0000c07f
 0x040: 78000400 00000000 18adbac2 00000000 002c14c6 00000000 0088c100 00100400
 0x060: 1000f00f 01e1782d 00000000 00260000 006b0005 000207c8 58fab388 ad38d843.
  No interrupt sources are pending.
 The chip configuration is 0x14 0x2c, MII half-duplex mode.
EEPROM size test returned 6, 0x204a4 / 0x3fffe.
bash-2.03# ifconfig eth0 down
bash-2.03# ifconfig eth0 192.168.1.3
bash-2.03#
bash-2.03# ping 192.168.1.1
PING 192.168.1.1 (192.168.1.1): 56 data bytes
64 bytes from 192.168.1.1: icmp_seq=0 ttl=255 time=1.145 ms
64 bytes from 192.168.1.1: icmp_seq=1 ttl=255 time=0.390 ms
--- 192.168.1.1 ping statistics ---
2 packets transmitted, 2 packets received, 0% packet loss
round-trip min/avg/max = 0.390/0.767/1.145 ms
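
As a sanity check on the register dump above: the first two words at offset 0x000 should hold the station address. The sketch below assumes that rtl8139-diag prints each word as a 32-bit little-endian read and that the address occupies the first six bytes; the helper name is hypothetical.

```python
import struct


def mac_from_idr(word0, word1):
    """Rebuild the MAC address from the first two 32-bit register words.

    Assumes the words were read little-endian, so the lowest byte of
    word0 is the first byte of the address.
    """
    raw = struct.pack("<II", int(word0, 16), int(word1, 16))
    return ":".join(f"{byte:02x}" for byte in raw[:6])


if __name__ == "__main__":
    print(mac_from_idr("a7b40000", "0000def5"))
```

Under those assumptions the result matches the address the driver printed at boot, 00:00:b4:a7:f5:de, so the chip still appears to be readable while hung.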

Relevant messages found in /var/log/messages:

Dec  2 09:58:25 silk kernel: eth0: Transmit timeout, status 0d 0000 media 00.
Dec  2 09:58:25 silk kernel: eth0: Tx queue start entry 19381  dirty entry 19381, full.
Dec  2 09:58:25 silk kernel: eth0:  Tx descriptor 0 is 8008a05e.
Dec  2 09:58:25 silk kernel: eth0:  Tx descriptor 1 is 8208a05e. (queue head)
Dec  2 09:58:25 silk kernel: eth0:  Tx descriptor 2 is 8008a05e.
Dec  2 09:58:25 silk kernel: eth0:  Tx descriptor 3 is 8008a05e.
Dec  2 09:58:25 silk kernel: eth0: MII #32 registers are: 1000 782d 0000 0000 01e1 0000 0000 0000.
Dec  2 10:08:40 silk kernel: eth0: Transmit timeout, status 0d 0000 media 00.
Dec  2 10:08:40 silk kernel: eth0: Tx queue start entry 21341  dirty entry 21341, full.
Dec  2 10:08:40 silk kernel: eth0:  Tx descriptor 0 is 8008a04e.
Dec  2 10:08:40 silk kernel: eth0:  Tx descriptor 1 is 8208a04e. (queue head)
Dec  2 10:08:40 silk kernel: eth0:  Tx descriptor 2 is 8008a04e.
Dec  2 10:08:40 silk kernel: eth0:  Tx descriptor 3 is 8008a04e.
Dec  2 10:08:40 silk kernel: eth0: MII #32 registers are: 1000 782d 0000 0000 01e1 0000 0000 0000.
Dec  2 10:08:42 silk kernel: nfs: server garion not responding, timed out
Dec  2 10:08:42 silk kernel: nfs: server garion OK
Dec  2 10:10:07 silk kernel: nfs: server garion not responding, timed out
Dec  2 10:10:28 silk kernel: nfs: server garion not responding, still trying
Dec  2 10:11:09 silk kernel: nfs: server garion not responding, still trying
Dec  2 10:12:33 silk kernel: nfs: server garion not responding, still trying
Dec  2 10:13:07 silk kernel: nfs: server garion OK
Dec  2 10:15:05 silk kernel: eth0: Transmit timeout, status 0d 0000 media 00.
Dec  2 10:15:05 silk kernel: eth0: Tx queue start entry 4972  dirty entry 4972, full.
Dec  2 10:15:05 silk kernel: eth0:  Tx descriptor 0 is 8208a04e. (queue head)
Dec  2 10:15:05 silk kernel: eth0:  Tx descriptor 1 is 8008a04e.
Dec  2 10:15:05 silk kernel: eth0:  Tx descriptor 2 is 8008a04e.
Dec  2 10:15:05 silk kernel: eth0:  Tx descriptor 3 is 8008a04e.
Dec  2 10:15:05 silk kernel: eth0: MII #32 registers are: 1000 782d 0000 0000 01e1 0000 0000 0000.
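
In case it helps whoever reads the Tx descriptor values in the log, here is a rough decoder for them. The bit layout is an assumption based on the transmit status flags as I understand them from the driver source and the RTL8139 datasheet (size in bits 0-12, OWN in bit 13, underrun in bit 14, Tx OK in bit 15, early Tx threshold in bits 16-21, collision count in bits 24-27, error flags in bits 28-31). The function name and field names are mine, so please treat the output as a hint, not a diagnosis.

```python
def decode_tsd(value):
    """Split an RTL8139 transmit status word into named fields.

    The layout is assumed, not confirmed by the original report.
    """
    return {
        "size": value & 0x1FFF,                      # packet length in bytes
        "own": bool(value & 0x2000),                 # set once DMA of the packet is done
        "underrun": bool(value & 0x4000),            # Tx FIFO underrun
        "tx_ok": bool(value & 0x8000),               # transmit completed OK
        "early_tx_threshold": (value >> 16) & 0x3F,  # raw threshold field
        "collisions": (value >> 24) & 0xF,           # collision count
        "heartbeat_fail": bool(value & 0x10000000),
        "out_of_window_collision": bool(value & 0x20000000),
        "tx_aborted": bool(value & 0x40000000),
        "carrier_lost": bool(value & 0x80000000),
    }


if __name__ == "__main__":
    for word in (0x8008A05E, 0x8208A05E, 0x8008A04E, 0x8208A04E):
        print(f"{word:08x}", decode_tsd(word))
```

If this layout is right, the only difference between the "queue head" descriptor and the other three in each dump is a collision count of 2, and every descriptor has the carrier-lost bit set.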


As the above shows, the interface needed an ifconfig down and up to
recover.

Judging from /var/log/messages, the hang occurred at 10:08:42, with a
near hang again at 10:15.
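
To pull the timeout events out of a longer /var/log/messages, something like the following could be used. It is a small sketch that assumes the syslog line format shown above, with each entry on a single unwrapped line; the function name and the sample text are illustrative only.

```python
import re

# Matches e.g. "Dec  2 09:58:25 silk kernel: eth0: Transmit timeout, ..."
TIMEOUT_RE = re.compile(
    r"^\w{3}\s+\d+\s+(\d\d:\d\d:\d\d)\s+\S+\s+kernel:\s+(\w+): Transmit timeout"
)


def transmit_timeouts(log_text):
    """Return (time, interface) for every transmit timeout in the log."""
    hits = []
    for line in log_text.splitlines():
        match = TIMEOUT_RE.match(line)
        if match:
            hits.append((match.group(1), match.group(2)))
    return hits


if __name__ == "__main__":
    sample = (
        "Dec  2 09:58:25 silk kernel: eth0: Transmit timeout, status 0d 0000 media 00.\n"
        "Dec  2 10:08:40 silk kernel: eth0: Transmit timeout, status 0d 0000 media 00.\n"
        "Dec  2 10:08:42 silk kernel: nfs: server garion not responding, timed out\n"
        "Dec  2 10:15:05 silk kernel: eth0: Transmit timeout, status 0d 0000 media 00.\n"
    )
    for when, iface in transmit_timeouts(sample):
        print(when, iface)
```

On the excerpt above this lists three transmit timeouts, at 09:58:25, 10:08:40 and 10:15:05, all on eth0.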

I am willing to run an experimental driver, run commands to get driver
status, etc., but I cannot program drivers myself; I do not have enough
knowledge in that area. I would very much appreciate it if this driver
did NOT hang.

Cheers,

Rudy Zijlstra