[vortex-bug] eth1: transmit timed out, tx_status 00 status e000

Aaron Baird aaron@webcreate.com
Tue, 4 Sep 2001 08:53:32 -0600


This is a multi-part message in MIME format.

------=_NextPart_000_000E_01C1351F.13D096A0
Content-Type: text/plain;
	charset="iso-8859-1"
Content-Transfer-Encoding: 7bit

I have a firewall running Red Hat 7.1 (2.4.2 kernel, iptables) with two
3Com 3c905B network cards.  Our 4-workstation LAN uses this firewall as
its main gateway (with a Cisco router as the actual gateway).  About every
24 hours, or whenever I do a large, high-speed network transfer, I get the
error message shown below, and the firewall is useless until I reboot
(Ctrl+Alt+Delete).  The strange thing is that it always occurs on eth1,
never on eth0 (eth1 is connected to a public switch that the router is
also connected to).

Error Message (from log):

Jul 24 13:42:38 castor kernel: NETDEV WATCHDOG: eth1: transmit timed out
Jul 24 13:42:38 castor kernel: eth1: transmit timed out, tx_status 00 status e000.
Jul 24 13:42:38 castor kernel:   diagnostics: net 0cda media 8880 dma 000000a0.
Jul 24 13:42:38 castor kernel:   Flags; bus-master 1, dirty 61401(9) current 61417(9)
Jul 24 13:42:38 castor kernel:   Transmit list 07343440 vs. c7343440.
Jul 24 13:42:38 castor kernel:   0: @c7343200  length 8000002a status 0000002a
Jul 24 13:42:38 castor kernel:   1: @c7343240  length 8000003e status 0000003e
Jul 24 13:42:38 castor kernel:   2: @c7343280  length 8000002a status 0000002a
Jul 24 13:42:38 castor kernel:   3: @c73432c0  length 8000002a status 0000002a
Jul 24 13:42:38 castor kernel:   4: @c7343300  length 8000002a status 0000002a
Jul 24 13:42:38 castor kernel:   5: @c7343340  length 8000002a status 0000002a
Jul 24 13:42:38 castor kernel:   6: @c7343380  length 8000003e status 0000003e
Jul 24 13:42:38 castor kernel:   7: @c73433c0  length 8000002a status 8000002a
Jul 24 13:42:38 castor kernel:   8: @c7343400  length 8000002a status 8000002a
Jul 24 13:42:38 castor kernel:   9: @c7343440  length 8000003e status 0000003e
Jul 24 13:42:38 castor kernel:   10: @c7343480  length 80000227 status 00000227
Jul 24 13:42:38 castor kernel:   11: @c73434c0  length 8000003e status 0000003e
Jul 24 13:42:38 castor kernel:   12: @c7343500  length 80000227 status 00000227
Jul 24 13:42:38 castor kernel:   13: @c7343540  length 8000003e status 0000003e
Jul 24 13:42:38 castor kernel:   14: @c7343580  length 8000003e status 0000003e
Jul 24 13:42:38 castor kernel:   15: @c73435c0  length 8000002a status 0000002a
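
For anyone who wants to compare several of these dumps, here is a rough
Python sketch that pulls the numbers out of the log.  It is only an
illustration: the function and field names are made up, and it assumes each
kernel message has been rejoined onto a single line.  Treating the top bit
(0x80000000) of the length and status words as a flag rather than part of
the byte count is an assumption that has not been checked against the
driver source.  The sample lines are copied from the log above.

```python
import re

# A few lines from the log above, one kernel message per line.
SAMPLE = """\
Jul 24 13:42:38 castor kernel:   Flags; bus-master 1, dirty 61401(9) current 61417(9)
Jul 24 13:42:38 castor kernel:   7: @c73433c0  length 8000002a status 8000002a
Jul 24 13:42:38 castor kernel:   9: @c7343440  length 8000003e status 0000003e
Jul 24 13:42:38 castor kernel:   10: @c7343480  length 80000227 status 00000227
"""

FLAGS_RE = re.compile(r"dirty (\d+)\((\d+)\) current (\d+)\((\d+)\)")
ENTRY_RE = re.compile(
    r"(\d+): @([0-9a-f]+)\s+length ([0-9a-f]{8}) status ([0-9a-f]{8})"
)

# Assumption: the top bit of each word is a flag, the rest is a byte count.
TOP_BIT = 0x80000000


def parse_dump(text):
    """Return (counters, entries) extracted from a transmit-timeout dump."""
    counters = None
    entries = []
    for line in text.splitlines():
        m = FLAGS_RE.search(line)
        if m:
            dirty, dirty_slot, current, current_slot = map(int, m.groups())
            counters = {
                "dirty": dirty,
                "dirty_slot": dirty_slot,
                "current": current,
                "current_slot": current_slot,
                # Descriptors queued but not yet cleaned up.
                "outstanding": current - dirty,
            }
            continue
        m = ENTRY_RE.search(line)
        if m:
            length = int(m.group(3), 16)
            status = int(m.group(4), 16)
            entries.append({
                "index": int(m.group(1)),
                "addr": m.group(2),
                "length_bytes": length & ~TOP_BIT,
                "status_top_bit": bool(status & TOP_BIT),
            })
    return counters, entries


if __name__ == "__main__":
    counters, entries = parse_dump(SAMPLE)
    print("outstanding descriptors:", counters["outstanding"])
    for e in entries:
        print(e["index"], e["length_bytes"], e["status_top_bit"])
```

In this dump, current minus dirty is 16, the same as the number of entries
listed, so the transmit ring looks completely full at the time of the
timeout.  Entries 7 and 8 are the only ones whose status word has the top
bit set.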


My attempts at a solution:

1. Booted with the "noapic" option. Didn't work.

2. Turned off many of the devices (in the BIOS) that I am not using to free
up IRQs and remove conflicts. Didn't work.

3. I am using the 3c59x driver and thought I should switch to the 3c90x
driver, but I could not find a recent driver on 3Com's site, and then I
found this post:
http://groups.google.com/groups?hl=en&safe=off&th=f549d46c0a541d11,2.
So it seems I need to stay with the 3c59x driver.

4. Increased the number of "buckets" in ip_conntrack. Didn't work.

5. Turned off tcp_syncookies. Didn't work.

6. Upgraded from 2.4.2 kernel to 2.4.7 kernel. Didn't work.

7. Upgraded from 2.4.7 kernel to 2.4.9 kernel. Didn't work.

8. Flashed the BIOS on the EPOX-8kta3 motherboard with the most recent BIOS
and loaded fail-safe defaults.  Didn't work.

9. Moved the network cards apart (left about 3 slots in between), but did
not put either of the cards in the AGP/PCI slot.  Didn't help.

10. Swapped the two network cards (put eth1 in place of eth0, and eth0 in
place of eth1), and the same problem still occurred on eth1.  Didn't help.

11. Tried a second, identical machine (we needed two firewalls anyway), and
the same problem occurred after only 60 seconds of uptime.  Didn't work.


As you can see from the dates in the log, I have been working on this issue
for a long time.  I have posted to groups.google.com and LinuxQuestions.org
and received very little help.  In fact, the only good suggestion I received
was to upgrade from the 2.4.2 kernel to a version higher than 2.4.5.  It is
getting to the point where I either need to find an answer, use different
network cards, or go with a completely different setup.  If anyone could
suggest a solution to this problem, or recommend different network cards,
I would really appreciate the help.

Thanks,
Aaron


------=_NextPart_000_000E_01C1351F.13D096A0--