[vortex] transmit timed out - IRQ conflict?
Martin Siegert
siegert@sfu.ca
Thu, 21 Jun 2001 14:03:18 -0700
Hi there:
The problem that I am having may be related to the "NETDEV WATCHDOG:
eth0: transmit timed out" problem reported earlier, but I still
don't know how to solve it.
This is a dual AMD box (kernel 2.4.5, otherwise RH7.1) with five 3Com NICs,
three of which are used in a channel-bonded configuration. I am using the
3c59x and bonding drivers that come with the 2.4.5 kernel.
# lspci
00:00.0 Host bridge: Advanced Micro Devices [AMD]: Unknown device 700c (rev 11)
00:01.0 PCI bridge: Advanced Micro Devices [AMD]: Unknown device 700d
00:07.0 ISA bridge: Advanced Micro Devices [AMD]: Unknown device 7410 (rev 02)
00:07.1 IDE interface: Advanced Micro Devices [AMD]: Unknown device 7411 (rev 01)
00:07.3 Bridge: Advanced Micro Devices [AMD]: Unknown device 7413 (rev 01)
00:07.4 USB Controller: Advanced Micro Devices [AMD]: Unknown device 7414 (rev 07)
00:08.0 Ethernet controller: 3Com Corporation 3c905B 100BaseTX [Cyclone] (rev 30)
00:09.0 Ethernet controller: 3Com Corporation 3c905B 100BaseTX [Cyclone] (rev 30)
00:0a.0 Ethernet controller: 3Com Corporation 3c905B 100BaseTX [Cyclone] (rev 30)
00:0e.0 VGA compatible controller: ATI Technologies Inc Rage XL (rev 27)
00:0f.0 Ethernet controller: 3Com Corporation 3c980-TX [Fast Etherlink XL Server Adapter] (rev 78)
00:10.0 Ethernet controller: 3Com Corporation 3c980-TX [Fast Etherlink XL Server Adapter] (rev 78)
# ifconfig
bond0     Link encap:Ethernet HWaddr 00:50:04:9B:30:E1
          inet addr:172.16.0.1 Bcast:172.16.0.255 Mask:255.255.0.0
          UP BROADCAST RUNNING MASTER MULTICAST MTU:1500 Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:2680666 errors:0 dropped:2653 overruns:0 carrier:0
          collisions:0 txqueuelen:0
eth0      Link encap:Ethernet HWaddr 00:01:02:60:16:30
          inet addr:142.58.1.232 Bcast:142.58.1.255 Mask:255.255.255.0
          UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
          RX packets:83986 errors:0 dropped:0 overruns:0 frame:0
          TX packets:49387 errors:0 dropped:0 overruns:0 carrier:0
          collisions:1350 txqueuelen:100
          Interrupt:10 Base address:0x1400
eth1      Link encap:Ethernet HWaddr 00:50:04:9B:30:E1
          inet addr:172.16.0.1 Bcast:172.16.0.255 Mask:255.255.0.0
          UP BROADCAST RUNNING SLAVE MULTICAST MTU:1500 Metric:1
          RX packets:915955 errors:0 dropped:0 overruns:0 frame:0
          TX packets:894440 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:100
          Interrupt:5 Base address:0x1480
eth2      Link encap:Ethernet HWaddr 00:50:04:9B:30:E1
          inet addr:172.16.0.1 Bcast:172.16.0.255 Mask:255.255.0.0
          UP BROADCAST RUNNING SLAVE MULTICAST MTU:1500 Metric:1
          RX packets:915333 errors:0 dropped:0 overruns:104 frame:0
          TX packets:893120 errors:14 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:100
          Interrupt:3 Base address:0x1800
eth3      Link encap:Ethernet HWaddr 00:50:04:9B:30:E1
          inet addr:172.16.0.1 Bcast:172.16.0.255 Mask:255.255.0.0
          UP BROADCAST RUNNING SLAVE MULTICAST MTU:1500 Metric:1
          RX packets:915329 errors:0 dropped:0 overruns:106 frame:0
          TX packets:893106 errors:13 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:100
          Interrupt:3 Base address:0x1880
eth4      Link encap:Ethernet HWaddr 00:E0:81:03:0F:7D
          inet addr:172.17.0.1 Bcast:172.17.0.255 Mask:255.255.0.0
          UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
          RX packets:27 errors:0 dropped:0 overruns:0 frame:0
          TX packets:27 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:100
          Interrupt:11 Base address:0x1c00
For some reason eth2 and eth3 are sharing the same IRQ 3 (eth1, eth2 and eth3
are channel-bonded). The problem occurs at high throughput (running netpipe
at block sizes of 2097149 bytes). The following errors appear on the console
and the connection freezes completely; I have to reboot the box to recover.
Jun 21 12:02:38 test2 kernel: NETDEV WATCHDOG: eth2: transmit timed out
Jun 21 12:02:38 test2 kernel: eth2: transmit timed out, tx_status 00 status e681.
Jun 21 12:02:38 test2 kernel: diagnostics: net 0cd8 media 8880 dma 0000003a.
Jun 21 12:02:38 test2 kernel: eth2: Interrupt posted but not delivered -- IRQ blocked by another device?
Jun 21 12:02:38 test2 kernel: Flags; bus-master 1, dirty 892903(7) current 892903(7)
Jun 21 12:02:38 test2 kernel: Transmit list 00000000 vs. df25e3c0.
Jun 21 12:02:38 test2 kernel: 0: @df25e200 length 8000005e status 0001005e
... (14 more lines of the same kind)
Jun 21 12:02:38 test2 kernel: NETDEV WATCHDOG: eth3: transmit timed out
Jun 21 12:02:38 test2 kernel: eth3: transmit timed out, tx_status 00 status e681.
Jun 21 12:02:38 test2 kernel: diagnostics: net 0cc6 media 8880 dma 0000003a.
Jun 21 12:02:38 test2 kernel: eth3: Interrupt posted but not delivered -- IRQ blocked by another device?
Jun 21 12:02:38 test2 kernel: Flags; bus-master 1, dirty 892902(6) current 892902(6)
Jun 21 12:02:38 test2 kernel: Transmit list 00000000 vs. df25d380.
Jun 21 12:02:38 test2 kernel: 0: @df25d200 length 8000005e status 0001005e
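As a cross-check on the IRQ assignments, here is a minimal sketch in Python
that scans ifconfig-style output and reports every interrupt line used by more
than one interface. It is not part of the setup described above: the function
name shared_irqs and the abbreviated SAMPLE text are illustrative assumptions,
and only the interface names and Interrupt values are taken from the ifconfig
output shown earlier.

```python
import re
from collections import defaultdict


def shared_irqs(ifconfig_text):
    """Return {irq: [interfaces]} for every IRQ used by more than one interface."""
    by_irq = defaultdict(list)
    iface = None
    for line in ifconfig_text.splitlines():
        # A new interface stanza starts with "<name> Link encap:..."
        start = re.match(r"\s*(\S+)\s+Link encap:", line)
        if start:
            iface = start.group(1)
            continue
        # Physical NICs report "Interrupt:<n>"; bond0 has no such line.
        irq = re.search(r"Interrupt:(\d+)", line)
        if irq and iface is not None:
            by_irq[int(irq.group(1))].append(iface)
    return {irq: names for irq, names in by_irq.items() if len(names) > 1}


# Abbreviated copy of the ifconfig output: only the lines the parser needs.
SAMPLE = """\
bond0 Link encap:Ethernet HWaddr 00:50:04:9B:30:E1
eth0 Link encap:Ethernet HWaddr 00:01:02:60:16:30
Interrupt:10 Base address:0x1400
eth1 Link encap:Ethernet HWaddr 00:50:04:9B:30:E1
Interrupt:5 Base address:0x1480
eth2 Link encap:Ethernet HWaddr 00:50:04:9B:30:E1
Interrupt:3 Base address:0x1800
eth3 Link encap:Ethernet HWaddr 00:50:04:9B:30:E1
Interrupt:3 Base address:0x1880
eth4 Link encap:Ethernet HWaddr 00:E0:81:03:0F:7D
Interrupt:11 Base address:0x1c00
"""

print(shared_irqs(SAMPLE))  # {3: ['eth2', 'eth3']}
```

For the output above this reports only IRQ 3, shared by eth2 and eth3, which
are the two interfaces that log the transmit timeouts.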
Is there a way to force all five NICs to use different IRQs?
Or what else can I do to solve the problem?
Thanks for your help in advance!
Cheers,
Martin
========================================================================
Martin Siegert
Academic Computing Services phone: (604) 291-4691
Simon Fraser University fax: (604) 291-4242
Burnaby, British Columbia email: siegert@sfu.ca
Canada V5A 1S6
========================================================================