[eepro100-bug] linux-2.0.39 / eepro100 v1.19: transmit timed out errors
Durval Menezes
scyld@tmp.com.br
Thu Jan 10 11:02:03 2002
Hello,
We are seeing strange errors while using the eepro100 v1.19 on hosts
running Linux kernel 2.0.39; the machine works as a gateway, and has
two Intel EtherExpress 10/100B cards in it.
We had to upgrade to v1.19 because the eepro100 v1.05 that came built into
this kernel had problems losing promiscuous mode in the middle of long-term
libpcap sessions (any packet-capture tool, like tcpdump, after some 4-6
hours max would simply stop seeing other machines' packets; stopping the
libpcap application and restarting it again would cure the problem until
it repeated itself).
The problem with v1.19 is that, when under somewhat heavy load (let's say
two streams of 2Mbps each through it, plus others totalling less than
64Kbps), after 1-3 hours, one of the two eepro100 interfaces (eth1) simply
stops responding to packets (ICMP echos, ARPs, etc); the other eepro100
continues to work OK (eth0): the traffic is flowing from the network
directly connected to eth0 to the network directly connected to eth1;
after 10-20 minutes, the thing apparently recovers by itself, only to
stop again after 1-3 hours (we discovered this because we were transferring
a large quantity of data between machines connected to those networks).
More details:
- At the moment the machine stops responding on eth1, it generates a lot
of warnings in the logs (see end of this message).
- When the problem is manifesting itself, tcpdumps on the machine won't show
any packets coming to eth1, even if another machine on the network sees
the packets.
- After 10-20 minutes the problem magically disappears, only to repeat itself
after 1-2 hours with the same traffic.
- The problem manifested itself in 100% of our tests: while we were running
the above-mentioned 2Mbps streams, after 1-2 hours the problem ALWAYS
occurred.
- We moved back to the v1.05 drivers (actually, restored /lib/modules/2.0.39
and rebooted) and the problem was fixed: the above streams run for 8 hours
without any interruptions.
- We even replaced the eth1 card with a brand-new one, but while we had the
v1.19 drivers, the problem remained.
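To pin down how regularly the hang recurs, the "Transmit timed out" lines can be pulled out of the syslog and the gaps between them computed. The following is only a minimal sketch: the function names are made up for illustration, the year is an assumption (syslog lines carry none; 2002 is taken from the date of this message), and the second timeout line in the sample is invented purely to exercise the code.

```python
import re
from datetime import datetime

MONTHS = {name: number for number, name in enumerate(
    "Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec".split(), start=1)}

# Matches lines such as:
#   Jan 9 15:56:43 dekkeret kernel: eth1: Transmit timed out: status ...
TIMEOUT_RE = re.compile(
    r"^(\w{3})\s+(\d+)\s+(\d\d):(\d\d):(\d\d)\s+\S+\s+kernel:\s+"
    r"(eth\d+): Transmit timed out"
)

def timeout_events(lines, year=2002):
    """Return (interface, datetime) pairs for every transmit-timeout line.

    The year is an assumption: classic syslog timestamps do not include it.
    """
    events = []
    for line in lines:
        match = TIMEOUT_RE.match(line)
        if match is None:
            continue
        month, day, hour, minute, second, iface = match.groups()
        when = datetime(year, MONTHS[month], int(day),
                        int(hour), int(minute), int(second))
        events.append((iface, when))
    return events

def gaps_in_minutes(events, iface="eth1"):
    """Minutes between consecutive timeouts on one interface."""
    times = [when for name, when in events if name == iface]
    return [(later - earlier).total_seconds() / 60
            for earlier, later in zip(times, times[1:])]

sample = [
    "Jan 9 15:56:43 dekkeret kernel: eth1: Transmit timed out: status 0090 0000 at",
    "Jan 9 15:56:43 dekkeret kernel: eth1: Restarting the chip...",
    # Hypothetical second event, NOT from the real log.
    "Jan 9 17:26:43 dekkeret kernel: eth1: Transmit timed out: status 0090 0000 at",
]
events = timeout_events(sample)
gaps = gaps_in_minutes(events)
print(gaps)  # prints [90.0]
```

Run against the full log file (for example `timeout_events(open(path))`, with `path` pointing at wherever the kernel messages are stored), this would show whether the 1-3 hour spacing is steady or drifts with load.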
So, does anybody have any inkling why this is happening, and how to fix it?
If we can provide any more info or assistance to help solve this problem,
please contact us.
Another question: where can we find the versions of the eepro100 drivers
between 1.05 and 1.19? Version 1.05 has the promiscuous-drop problem, but
is rock-solid regarding heavy traffic; some version up to and including
v1.19 has fixed this problem, but then some other version introduced the
transmit-timeout problem; we were wondering if, as an emergency measure,
we could test versions between 1.05 and 1.19 looking for one that does not
have either of those problems... Is there a CVS anywhere? If not, does someone
maintain an eepro100.c,v file, and could email it to me?
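Since each test run takes hours, the search over intermediate versions is worth doing as a bisection rather than one version at a time. The sketch below assumes that each change happened exactly once (the promiscuous-mode fix stays fixed, the transmit timeout stays broken). The version labels "A", "B" and "C" are placeholders for whatever intermediate releases turn up, and the two test functions are stand-ins for real multi-hour test runs.

```python
def first_true(versions, predicate):
    """Index of the first version for which predicate holds.

    Assumes predicate is false for all earlier versions and true for all
    later ones, so a binary search needs only about log2(n) test runs.
    Returns len(versions) if predicate never holds.
    """
    lo, hi = 0, len(versions)
    while lo < hi:
        mid = (lo + hi) // 2
        if predicate(versions[mid]):
            hi = mid
        else:
            lo = mid + 1
    return lo

def usable_versions(versions, promisc_fixed, tx_times_out):
    """Versions that have the promiscuous fix but not yet the Tx timeout."""
    fixed_at = first_true(versions, promisc_fixed)
    broken_at = first_true(versions, tx_times_out)
    return versions[fixed_at:broken_at]  # empty if the fix came too late

# Oldest to newest; "A", "B", "C" are hypothetical placeholders.
candidates = ["1.05", "A", "B", "C", "1.19"]

# Stand-ins for the real tests (a long libpcap session, a long 2Mbps
# transfer). The outcomes below are invented for illustration only.
def promisc_fixed(version):
    return candidates.index(version) >= 2

def tx_times_out(version):
    return candidates.index(version) >= 3

usable = usable_versions(candidates, promisc_fixed, tx_times_out)
print(usable)  # prints ['B']
```

If the timeout was introduced before the promiscuous fix, the slice comes back empty, meaning no intermediate version avoids both problems.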
Thanks in advance.
Best Regards,
--
Durval Menezes (scyld AT tmp DOT com DOT br, http://www.tmp.com.br/)
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=[
Jan 9 15:56:43 dekkeret kernel: eth1: Transmit timed out: status 0090 0000 at 9561987/9561989 commands 000c0000 400c0000 000ca000.
Jan 9 15:56:43 dekkeret kernel: eth1: Tx ring dump, Tx queue 9561989 / 9561987:
Jan 9 15:56:43 dekkeret kernel: eth1: 0 000ca000.
Jan 9 15:56:43 dekkeret kernel: eth1: 1 000ca000.
Jan 9 15:56:43 dekkeret kernel: eth1: 2 000ca000.
Jan 9 15:56:43 dekkeret kernel: eth1: * 3 000c0000.
Jan 9 15:56:43 dekkeret kernel: eth1: 4 400c0000.
Jan 9 15:56:43 dekkeret kernel: eth1: =5 000ca000.
Jan 9 15:56:43 dekkeret kernel: eth1: 6 000ca000.
Jan 9 15:56:43 dekkeret kernel: eth1: 7 000ca000.
Jan 9 15:56:43 dekkeret kernel: eth1: 8 000ca000.
Jan 9 15:56:43 dekkeret kernel: eth1: 9 000ca000.
Jan 9 15:56:43 dekkeret kernel: eth1: 10 000ca000.
Jan 9 15:56:43 dekkeret kernel: eth1: 11 000ca000.
Jan 9 15:56:43 dekkeret kernel: eth1: 12 000ca000.
Jan 9 15:56:43 dekkeret kernel: eth1: 13 000ca000.
Jan 9 15:56:43 dekkeret kernel: eth1: 14 000ca000.
Jan 9 15:56:43 dekkeret kernel: eth1: 15 000ca000.
Jan 9 15:56:43 dekkeret kernel: eth1: 16 000ca000.
Jan 9 15:56:43 dekkeret kernel: eth1: 17 000ca000.
Jan 9 15:56:43 dekkeret kernel: eth1: 18 000ca000.
Jan 9 15:56:43 dekkeret kernel: eth1: 19 000ca000.
Jan 9 15:56:43 dekkeret kernel: eth1: 20 000ca000.
Jan 9 15:56:43 dekkeret kernel: eth1: 21 000ca000.
Jan 9 15:56:43 dekkeret kernel: eth1: 22 000ca000.
Jan 9 15:56:43 dekkeret kernel: eth1: 23 000ca000.
Jan 9 15:56:43 dekkeret kernel: eth1: 24 000ca000.
Jan 9 15:56:43 dekkeret kernel: eth1: 25 000ca000.
Jan 9 15:56:43 dekkeret kernel: eth1: 26 000ca000.
Jan 9 15:56:43 dekkeret kernel: eth1: 27 000ca000.
Jan 9 15:56:43 dekkeret kernel: eth1: 28 000ca000.
Jan 9 15:56:43 dekkeret kernel: eth1: 29 000ca000.
Jan 9 15:56:43 dekkeret kernel: eth1: 30 000ca000.
Jan 9 15:56:43 dekkeret kernel: eth1: 31 000ca000.
Jan 9 15:56:43 dekkeret kernel: eth1:Printing Rx ring (next to receive into 30377613).
Jan 9 15:56:43 dekkeret kernel: Rx ring entry 0 00000001.
Jan 9 15:56:43 dekkeret kernel: Rx ring entry 1 00000001.
Jan 9 15:56:43 dekkeret kernel: Rx ring entry 2 00000001.
Jan 9 15:56:43 dekkeret kernel: Rx ring entry 3 00000001.
Jan 9 15:56:43 dekkeret kernel: Rx ring entry 4 00000001.
Jan 9 15:56:43 dekkeret kernel: Rx ring entry 5 00000001.
Jan 9 15:56:43 dekkeret kernel: Rx ring entry 6 00000001.
Jan 9 15:56:43 dekkeret kernel: Rx ring entry 7 00000001.
Jan 9 15:56:43 dekkeret kernel: Rx ring entry 8 00000001.
Jan 9 15:56:43 dekkeret kernel: Rx ring entry 9 00000001.
Jan 9 15:56:43 dekkeret kernel: Rx ring entry 10 00000001.
Jan 9 15:56:43 dekkeret kernel: Rx ring entry 11 00000001.
Jan 9 15:56:43 dekkeret kernel: Rx ring entry 12 c0000001.
Jan 9 15:56:43 dekkeret kernel: Rx ring entry 13 00000001.
Jan 9 15:56:43 dekkeret kernel: Rx ring entry 14 00000001.
Jan 9 15:56:43 dekkeret kernel: Rx ring entry 15 00000001.
Jan 9 15:56:43 dekkeret kernel: Rx ring entry 16 00000001.
Jan 9 15:56:43 dekkeret kernel: Rx ring entry 17 00000001.
Jan 9 15:56:43 dekkeret kernel: Rx ring entry 18 00000001.
Jan 9 15:56:43 dekkeret kernel: Rx ring entry 19 00000001.
Jan 9 15:56:43 dekkeret kernel: Rx ring entry 20 00000001.
Jan 9 15:56:43 dekkeret kernel: Rx ring entry 21 00000001.
Jan 9 15:56:43 dekkeret kernel: Rx ring entry 22 00000001.
Jan 9 15:56:43 dekkeret kernel: Rx ring entry 23 00000001.
Jan 9 15:56:43 dekkeret kernel: Rx ring entry 24 00000001.
Jan 9 15:56:43 dekkeret kernel: Rx ring entry 25 00000001.
Jan 9 15:56:43 dekkeret kernel: Rx ring entry 26 00000001.
Jan 9 15:56:43 dekkeret kernel: Rx ring entry 27 00000001.
Jan 9 15:56:43 dekkeret kernel: Rx ring entry 28 00000001.
Jan 9 15:56:43 dekkeret kernel: Rx ring entry 29 00000001.
Jan 9 15:56:43 dekkeret kernel: Rx ring entry 30 00000001.
Jan 9 15:56:43 dekkeret kernel: Rx ring entry 31 00000001.
Jan 9 15:56:43 dekkeret kernel: PHY index 1 register 0 is 3000.
Jan 9 15:56:43 dekkeret kernel: PHY index 1 register 1 is 782d.
Jan 9 15:56:43 dekkeret kernel: PHY index 1 register 2 is 02a8.
Jan 9 15:56:43 dekkeret kernel: PHY index 1 register 3 is 0150.
Jan 9 15:56:43 dekkeret kernel: PHY index 1 register 4 is 05e1.
Jan 9 15:56:43 dekkeret kernel: PHY index 1 register 5 is 0021.
Jan 9 15:56:43 dekkeret kernel: PHY index 1 register 21 is 0000.
Jan 9 15:56:43 dekkeret kernel: eth1: Restarting the chip...
Jan 9 15:56:43 dekkeret kernel: eth1: Tx ring dump, Tx queue 9561989 / 9561987:
Jan 9 15:56:43 dekkeret kernel: eth1: 0 000ca000.
Jan 9 15:56:43 dekkeret kernel: eth1: 1 000ca000.
Jan 9 15:56:43 dekkeret kernel: eth1: 2 000ca000.
Jan 9 15:56:43 dekkeret kernel: eth1: * 3 000c0000.
Jan 9 15:56:43 dekkeret kernel: eth1: 4 400c0000.
Jan 9 15:56:43 dekkeret kernel: eth1: =5 000ca000.
Jan 9 15:56:43 dekkeret kernel: eth1: 6 000ca000.
Jan 9 15:56:43 dekkeret kernel: eth1: 7 000ca000.
Jan 9 15:56:43 dekkeret kernel: eth1: 8 000ca000.
Jan 9 15:56:43 dekkeret kernel: eth1: 9 000ca000.
Jan 9 15:56:43 dekkeret kernel: eth1: 10 000ca000.
Jan 9 15:56:43 dekkeret kernel: eth1: 11 000ca000.
Jan 9 15:56:43 dekkeret kernel: eth1: 12 000ca000.
Jan 9 15:56:43 dekkeret kernel: eth1: 13 000ca000.
Jan 9 15:56:43 dekkeret kernel: eth1: 14 000ca000.
Jan 9 15:56:43 dekkeret kernel: eth1: 15 000ca000.
Jan 9 15:56:43 dekkeret kernel: eth1: 16 000ca000.
Jan 9 15:56:43 dekkeret kernel: eth1: 17 000ca000.
Jan 9 15:56:43 dekkeret kernel: eth1: 18 000ca000.
Jan 9 15:56:43 dekkeret kernel: eth1: 19 000ca000.
Jan 9 15:56:43 dekkeret kernel: eth1: 20 000ca000.
Jan 9 15:56:43 dekkeret kernel: eth1: 21 000ca000.
Jan 9 15:56:43 dekkeret kernel: eth1: 22 000ca000.
Jan 9 15:56:43 dekkeret kernel: eth1: 23 000ca000.
Jan 9 15:56:43 dekkeret kernel: eth1: 24 000ca000.
Jan 9 15:56:43 dekkeret kernel: eth1: 25 000ca000.
Jan 9 15:56:43 dekkeret kernel: eth1: 26 000ca000.
Jan 9 15:56:43 dekkeret kernel: eth1: 27 000ca000.
Jan 9 15:56:43 dekkeret kernel: eth1: 28 000ca000.
Jan 9 15:56:43 dekkeret kernel: eth1: 29 000ca000.
Jan 9 15:56:43 dekkeret kernel: eth1: 30 000ca000.
Jan 9 15:56:43 dekkeret kernel: eth1: 31 000ca000.
Jan 9 15:56:43 dekkeret kernel: eth1:Printing Rx ring (next to receive into 30377613).
Jan 9 15:56:43 dekkeret kernel: Rx ring entry 0 00000001.
Jan 9 15:56:43 dekkeret kernel: Rx ring entry 1 00000001.
Jan 9 15:56:43 dekkeret kernel: Rx ring entry 2 00000001.
Jan 9 15:56:43 dekkeret kernel: Rx ring entry 3 00000001.
Jan 9 15:56:43 dekkeret kernel: Rx ring entry 4 00000001.
Jan 9 15:56:43 dekkeret kernel: Rx ring entry 5 00000001.
Jan 9 15:56:43 dekkeret kernel: Rx ring entry 6 00000001.
Jan 9 15:56:43 dekkeret kernel: Rx ring entry 7 00000001.
Jan 9 15:56:43 dekkeret kernel: Rx ring entry 8 00000001.
Jan 9 15:56:43 dekkeret kernel: Rx ring entry 9 00000001.
Jan 9 15:56:43 dekkeret kernel: Rx ring entry 10 00000001.
Jan 9 15:56:43 dekkeret kernel: Rx ring entry 11 00000001.
Jan 9 15:56:43 dekkeret kernel: Rx ring entry 12 c0000001.
Jan 9 15:56:43 dekkeret kernel: Rx ring entry 13 00000001.
Jan 9 15:56:43 dekkeret kernel: Rx ring entry 14 00000001.
Jan 9 15:56:43 dekkeret kernel: Rx ring entry 15 00000001.
Jan 9 15:56:43 dekkeret kernel: Rx ring entry 16 00000001.
Jan 9 15:56:43 dekkeret kernel: Rx ring entry 17 00000001.
Jan 9 15:56:43 dekkeret kernel: Rx ring entry 18 00000001.
Jan 9 15:56:43 dekkeret kernel: Rx ring entry 19 00000001.
Jan 9 15:56:43 dekkeret kernel: Rx ring entry 20 00000001.
Jan 9 15:56:43 dekkeret kernel: Rx ring entry 21 00000001.
Jan 9 15:56:43 dekkeret kernel: Rx ring entry 22 00000001.
Jan 9 15:56:43 dekkeret kernel: Rx ring entry 23 00000001.
Jan 9 15:56:43 dekkeret kernel: Rx ring entry 24 00000001.
Jan 9 15:56:43 dekkeret kernel: Rx ring entry 25 00000001.
Jan 9 15:56:43 dekkeret kernel: Rx ring entry 26 00000001.
Jan 9 15:56:43 dekkeret kernel: Rx ring entry 27 00000001.
Jan 9 15:56:43 dekkeret kernel: Rx ring entry 28 00000001.
Jan 9 15:56:43 dekkeret kernel: Rx ring entry 29 00000001.
Jan 9 15:56:43 dekkeret kernel: Rx ring entry 30 00000001.
Jan 9 15:56:43 dekkeret kernel: Rx ring entry 31 00000001.
Jan 9 15:56:43 dekkeret kernel: PHY index 1 register 0 is 3000.
Jan 9 15:56:43 dekkeret kernel: PHY index 1 register 1 is 782d.
Jan 9 15:56:43 dekkeret kernel: PHY index 1 register 2 is 02a8.
Jan 9 15:56:43 dekkeret kernel: PHY index 1 register 3 is 0150.
Jan 9 15:56:43 dekkeret kernel: PHY index 1 register 4 is 05e1.
Jan 9 15:56:43 dekkeret kernel: PHY index 1 register 5 is 0021.
Jan 9 15:56:43 dekkeret kernel: PHY index 1 register 21 is 0000.
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=]
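As a reading aid for the Tx ring dump above, the following sketch parses a few of its lines and checks how the two queue counters relate to the `*` and `=` markers. The interpretation is an assumption inferred from the arithmetic, not confirmed from the driver source here: the ring is taken to have 32 slots (entries 0..31), each counter modulo 32 is taken to give a marked slot, and the low 16 bits of each word are taken to be a status field that is still zero for entries not yet completed. The function name and regular expressions are illustrative only.

```python
import re

TX_RING_SIZE = 32  # assumption: the dump lists entries 0..31

HEADER_RE = re.compile(r"Tx queue (\d+) / (\d+)")
ENTRY_RE = re.compile(r"eth\d+:\s*([*=]?)\s*(\d+)\s+([0-9a-f]{8})\.")

def parse_tx_dump(lines):
    """Return ((first, second) counters, {marker: slot}, {slot: word})."""
    counters, marks, entries = None, {}, {}
    for line in lines:
        header = HEADER_RE.search(line)
        if header:
            counters = (int(header.group(1)), int(header.group(2)))
            continue
        entry = ENTRY_RE.search(line)
        if entry:
            mark, index, word = entry.groups()
            entries[int(index)] = int(word, 16)
            if mark:
                marks[mark] = int(index)
    return counters, marks, entries

# A few lines copied from the dump above (header line unwrapped).
sample = [
    "Jan 9 15:56:43 dekkeret kernel: eth1: Tx ring dump, Tx queue 9561989 / 9561987:",
    "Jan 9 15:56:43 dekkeret kernel: eth1: 2 000ca000.",
    "Jan 9 15:56:43 dekkeret kernel: eth1: * 3 000c0000.",
    "Jan 9 15:56:43 dekkeret kernel: eth1: 4 400c0000.",
    "Jan 9 15:56:43 dekkeret kernel: eth1: =5 000ca000.",
    "Jan 9 15:56:43 dekkeret kernel: eth1: 6 000ca000.",
]

(first, second), marks, entries = parse_tx_dump(sample)

# Slots implied by the two counters, compared with the printed markers.
slot_first = first % TX_RING_SIZE
slot_second = second % TX_RING_SIZE
# Entries whose low 16 bits are still zero (assumed: not yet completed).
unfinished = sorted(i for i, word in entries.items() if (word & 0xFFFF) == 0)

print(slot_first, marks["="])
print(slot_second, marks["*"])
print(first - second, unfinished)
```

Under these assumptions the two counters land exactly on the marked slots, and the difference between them equals the number of entries with an empty status field, which suggests two queued transmit commands that the chip never completed before the timeout fired.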