[vortex] Re: vortex 3c982, 3C90x etc
Thomas Mierau
tmi@wikon.de
Tue Aug 13 08:00:01 2002
On Wednesday, 7 August 2002 14:45, Bogdan Costescu wrote:
> On Tue, 6 Aug 2002, Thomas wrote:
> > I found the January (long time ago) mailing list about hangs in the ethx
> > section.
>
> Could you please post a link to the exact archived message that you refer
> to ?
>
This is one part of a whole conversation that you had:
http://www.geocrawler.com/archives/3/433/2002/1/0/7568423/
> > I have a Tyan Thunder K7X with two of these LAN controllers called 982.
> > I get exactly the same problem: every few seconds a hang in the system.
> > My average ping time is 0.1 ms and every few seconds it goes up to 200
> > ms or more.
>
> What is the network topology ?
>
Two machines on a 100 Mbit Edimax 3116RE+ switch, both machines running at
100 Mbit full duplex (FD).
> > The IRQ (/proc/interrupts) go like crazy ! For a 10 min ping I receive
> > around 10,000,000 Irq's on both CPU's!! Not too bad !
>
> Normal ping (with 1 packet every second) or 'ping -f' ? For normal ping
> the number of interrupts is certainly too high. Do you have several
> devices sharing the same interrupt line (shown in /proc/interrupts) ?
>
1 ping/sec (-f actually speeds everything up, especially the IRQs).
I attached the /proc/pci output. That tells you more than /proc/interrupts.
In fact the multiple serial card would never show up in /proc/interrupts.
It is a PCI card, though, and of course uses an IRQ. But it makes no difference
whether this card is installed or not, or shares the IRQ or not.
I attached /proc/interrupts and a second copy taken 1 min later, and the pings
with their "funny" timing etc.
> Do you have a standalone PCI network card to test instead of the on-board
> LAN controllers ? Do other devices (if any) in the computer generate
> "normal" interrupt rates or very high like this one ?
>
We tested a regular external card. The effects were the same, but I cannot
recall the number of IRQs.
> > With a hang every few seconds I also receive an error message APIC
> > error on CPU ... 02(02) and an error 0xEX81 where X=0,4,6
>
> What kernel are you running ? Based on these error messages, I guess it's
> something based on 2.4.18 (I have Tyan Tigers here which log the same APIC
> error with 2.4.18-something from RedHat, while an earlier 2.4.16 based
> kernel doesn't).
> Are you using the 3c59x driver provided with the kernel or the one from
> Scyld ?
>
We tested kernels from 2.4.18 up to 2.4.19-rc3-ac4. We tried both the
original drivers and the one from Scyld: no difference.
By the way, the APIC error occurs around every 2 minutes.
> > The problem is consistent ! the option noapic during boot sucks, as the
> > error rate is going up during Raid System active sections
>
> Could you rephrase this ? I don't quite get the meaning, sorry !
>
Very simple: when the RAID system starts some intensive work, like a backup
of the database, the errors on the IRQ for eth0 increase. Somehow
eth0 is losing data packets.
> > Please do a reply all, so that I can get the message on my other account
>
> I added vortex mailing-list to the CC list too...
/proc/interrupts at the start
           CPU0        CPU1
  0:    3163847     3164837   IO-APIC-edge   timer
  1:       2839        2811   IO-APIC-edge   keyboard
  2:          0           0   XT-PIC         cascade
  5: 2603035427  2603245593   IO-APIC-level  eth1
  6:         36          36   IO-APIC-edge   floppy
  8:          0           2   IO-APIC-edge   rtc
 11:      70888       69025   IO-APIC-level  dpti0, eth0
 12:      12580       12617   IO-APIC-edge   PS/2 Mouse
 14:          2           2   IO-APIC-edge   ide0
NMI:          0           0
LOC:    6328682     6328831
ERR:        558
MIS:       1458
/proc/interrupts after 1 min
           CPU0        CPU1
  0:    3166842     3167834   IO-APIC-edge   timer
  1:       2898        2872   IO-APIC-edge   keyboard
  2:          0           0   XT-PIC         cascade
  5: 2605498761  2605702882   IO-APIC-level  eth1
  6:         42          42   IO-APIC-edge   floppy
  8:          0           2   IO-APIC-edge   rtc
 11:      70951       69092   IO-APIC-level  dpti0, eth0
 12:      12580       12617   IO-APIC-edge   PS/2 Mouse
 14:          2           2   IO-APIC-edge   ide0
NMI:          0           0
LOC:    6334674     6334822
ERR:        558
MIS:       1458
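A minimal sketch, assuming Python, of how the two snapshots above can be compared per IRQ. Only three rows are copied in; the helper names are made up for illustration, and HZ = 100 is an assumption about the timer tick rate of these 2.4 kernels, used to turn the timer count into elapsed seconds.

```python
# Rows copied from the two /proc/interrupts snapshots above (subset only).
START = """\
0: 3163847 3164837 IO-APIC-edge timer
5: 2603035427 2603245593 IO-APIC-level eth1
11: 70888 69025 IO-APIC-level dpti0, eth0
"""
AFTER = """\
0: 3166842 3167834 IO-APIC-edge timer
5: 2605498761 2605702882 IO-APIC-level eth1
11: 70951 69092 IO-APIC-level dpti0, eth0
"""

HZ = 100  # assumption: timer interrupts per second on this kernel


def parse_interrupts(text):
    """Return {irq_label: count summed over all CPUs}."""
    totals = {}
    for line in text.splitlines():
        label, sep, rest = line.partition(":")
        if not sep:
            continue  # skip lines without an IRQ label, e.g. the CPU header
        counts = []
        for field in rest.split():
            if not field.isdigit():
                break  # first non-numeric field starts the controller name
            counts.append(int(field))
        totals[label.strip()] = sum(counts)
    return totals


def irq_deltas(before, after):
    """Interrupts gained per IRQ between the two snapshots."""
    b, a = parse_interrupts(before), parse_interrupts(after)
    return {irq: a[irq] - b[irq] for irq in b if irq in a}


deltas = irq_deltas(START, AFTER)
elapsed_s = deltas["0"] / HZ  # timer IRQ count used as a clock
for irq, n in deltas.items():
    print(f"IRQ {irq}: +{n} interrupts, about {n / elapsed_s:.0f}/s")
```

With the figures above, eth1 on IRQ 5 gains about 4.9 million interrupts in roughly 60 seconds, while the line shared by dpti0 and eth0 gains only 130.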
the ifconfig output
eth0      Link encap:Ethernet  HWaddr 00:E0:81:21:FF:A2
          inet addr:192.168.47.11  Bcast:192.168.47.255  Mask:255.255.255.0
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:125798 errors:0 dropped:0 overruns:0 frame:0
          TX packets:188956 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:100
          RX bytes:12327368 (11.7 Mb)  TX bytes:18517194 (17.6 Mb)
          Interrupt:11 Base address:0x2400

eth1      Link encap:Ethernet  HWaddr 00:E0:81:21:FF:A3
          inet addr:192.168.47.12  Bcast:192.168.47.255  Mask:255.255.255.0
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:126196 errors:0 dropped:0 overruns:0 frame:0
          TX packets:63016 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:100
          RX bytes:12366600 (11.7 Mb)  TX bytes:6164576 (5.8 Mb)
          Interrupt:5 Base address:0x2480

lo        Link encap:Local Loopback
          inet addr:127.0.0.1  Mask:255.0.0.0
          UP LOOPBACK RUNNING  MTU:16436  Metric:1
          RX packets:8 errors:0 dropped:0 overruns:0 frame:0
          TX packets:8 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:400 (400.0 b)  TX bytes:400 (400.0 b)
the ping on eth0
The replies in each block of 5 return at the same time, as their times always
differ by 1 sec. It looks pretty nasty.
PING 192.168.47.47 (192.168.47.47) from 192.168.47.11 eth0: 56(84) bytes of data.
64 bytes from 192.168.47.47: icmp_seq=1 ttl=255 time=912 ms
64 bytes from 192.168.47.47: icmp_seq=2 ttl=255 time=4912 ms
64 bytes from 192.168.47.47: icmp_seq=3 ttl=255 time=3912 ms
64 bytes from 192.168.47.47: icmp_seq=4 ttl=255 time=2912 ms
64 bytes from 192.168.47.47: icmp_seq=5 ttl=255 time=1912 ms
64 bytes from 192.168.47.47: icmp_seq=6 ttl=255 time=912 ms
64 bytes from 192.168.47.47: icmp_seq=7 ttl=255 time=4905 ms
64 bytes from 192.168.47.47: icmp_seq=8 ttl=255 time=3905 ms
64 bytes from 192.168.47.47: icmp_seq=9 ttl=255 time=2905 ms
64 bytes from 192.168.47.47: icmp_seq=10 ttl=255 time=1905 ms
64 bytes from 192.168.47.47: icmp_seq=11 ttl=255 time=905 ms
64 bytes from 192.168.47.47: icmp_seq=12 ttl=255 time=4912 ms
64 bytes from 192.168.47.47: icmp_seq=13 ttl=255 time=3911 ms
64 bytes from 192.168.47.47: icmp_seq=14 ttl=255 time=2912 ms
64 bytes from 192.168.47.47: icmp_seq=15 ttl=255 time=1912 ms
64 bytes from 192.168.47.47: icmp_seq=16 ttl=255 time=912 ms
64 bytes from 192.168.47.47: icmp_seq=17 ttl=255 time=4902 ms
64 bytes from 192.168.47.47: icmp_seq=18 ttl=255 time=3902 ms
64 bytes from 192.168.47.47: icmp_seq=19 ttl=255 time=2902 ms
64 bytes from 192.168.47.47: icmp_seq=20 ttl=255 time=1902 ms
64 bytes from 192.168.47.47: icmp_seq=21 ttl=255 time=902 ms
--- 192.168.47.47 ping statistics ---
22 packets transmitted, 21 received, 4% loss, time 21039ms
rtt min/avg/max/mdev = 902.583/2812.949/4912.193/1444.073 ms, pipe 5
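The "blocks of 5" pattern in the eth0 ping can be checked with a little arithmetic. The sketch below (Python, helper names made up for illustration) assumes ping sent one request per second, so request n left at about (n - 1) seconds, and adds each round-trip time to get the moment the reply came back. Replies whose arrival times agree within a few milliseconds are grouped together.

```python
# (icmp_seq, rtt in ms) copied from the eth0 ping output above
RTTS = [(1, 912), (2, 4912), (3, 3912), (4, 2912), (5, 1912), (6, 912),
        (7, 4905), (8, 3905), (9, 2905), (10, 1905), (11, 905),
        (12, 4912), (13, 3911), (14, 2912), (15, 1912), (16, 912),
        (17, 4902), (18, 3902), (19, 2902), (20, 1902), (21, 902)]

INTERVAL_MS = 1000  # assumption: ping's default 1 s send interval


def arrival_times(rtts, interval_ms=INTERVAL_MS):
    """Approximate arrival time of each reply: send time plus RTT."""
    return [(seq, (seq - 1) * interval_ms + rtt) for seq, rtt in rtts]


def group_by_arrival(rtts, tolerance_ms=5):
    """Group consecutive replies that arrived at (nearly) the same time."""
    groups = []
    for seq, t in arrival_times(rtts):
        if groups and abs(t - groups[-1]["arrival_ms"]) <= tolerance_ms:
            groups[-1]["seqs"].append(seq)
        else:
            groups.append({"arrival_ms": t, "seqs": [seq]})
    return groups


groups = group_by_arrival(RTTS)
for g in groups:
    print(f"t = {g['arrival_ms']:>5} ms: replies {g['seqs']}")
```

Under that assumption the replies to requests 2 to 6, 7 to 11, 12 to 16 and 17 to 21 each land together, with the bursts about 5 seconds apart.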
The ping from eth1 looks a lot better. Everything is again in blocks of 5, just
this time 4 good ones and 1 bad.
PING 192.168.47.47 (192.168.47.47) from 192.168.47.12 eth1: 56(84) bytes of data.
64 bytes from 192.168.47.47: icmp_seq=1 ttl=255 time=0.169 ms
64 bytes from 192.168.47.47: icmp_seq=2 ttl=255 time=0.180 ms
64 bytes from 192.168.47.47: icmp_seq=3 ttl=255 time=0.173 ms
64 bytes from 192.168.47.47: icmp_seq=4 ttl=255 time=645 ms
64 bytes from 192.168.47.47: icmp_seq=5 ttl=255 time=0.181 ms
64 bytes from 192.168.47.47: icmp_seq=6 ttl=255 time=0.174 ms
64 bytes from 192.168.47.47: icmp_seq=7 ttl=255 time=0.169 ms
64 bytes from 192.168.47.47: icmp_seq=8 ttl=255 time=0.173 ms
64 bytes from 192.168.47.47: icmp_seq=9 ttl=255 time=654 ms
64 bytes from 192.168.47.47: icmp_seq=10 ttl=255 time=0.177 ms
64 bytes from 192.168.47.47: icmp_seq=11 ttl=255 time=0.171 ms
64 bytes from 192.168.47.47: icmp_seq=12 ttl=255 time=0.173 ms
64 bytes from 192.168.47.47: icmp_seq=13 ttl=255 time=0.171 ms
64 bytes from 192.168.47.47: icmp_seq=14 ttl=255 time=653 ms
64 bytes from 192.168.47.47: icmp_seq=15 ttl=255 time=0.171 ms
64 bytes from 192.168.47.47: icmp_seq=16 ttl=255 time=0.174 ms
64 bytes from 192.168.47.47: icmp_seq=17 ttl=255 time=0.173 ms
64 bytes from 192.168.47.47: icmp_seq=18 ttl=255 time=0.170 ms
64 bytes from 192.168.47.47: icmp_seq=19 ttl=255 time=672 ms
64 bytes from 192.168.47.47: icmp_seq=20 ttl=255 time=0.181 ms
64 bytes from 192.168.47.47: icmp_seq=21 ttl=255 time=0.174 ms
64 bytes from 192.168.47.47: icmp_seq=22 ttl=255 time=0.174 ms
--- 192.168.47.47 ping statistics ---
22 packets transmitted, 22 received, 0% loss, time 21066ms
rtt min/avg/max/mdev = 0.169/119.485/672.358/253.133 ms
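The "4 good, 1 bad" rhythm on eth1 can be made explicit by listing which sequence numbers are slow. This is a small Python sketch; the 100 ms threshold is an arbitrary assumption chosen to separate the sub-millisecond replies from the slow ones.

```python
# RTTs in ms for icmp_seq 1..22, copied from the eth1 ping output above
RTTS_MS = [0.169, 0.180, 0.173, 645, 0.181, 0.174, 0.169, 0.173, 654,
           0.177, 0.171, 0.173, 0.171, 653, 0.171, 0.174, 0.173, 0.170,
           672, 0.181, 0.174, 0.174]

THRESHOLD_MS = 100  # assumption: anything slower counts as a "bad" reply

# Sequence numbers start at 1, list indices at 0.
spikes = [i + 1 for i, rtt in enumerate(RTTS_MS) if rtt > THRESHOLD_MS]
gaps = [b - a for a, b in zip(spikes, spikes[1:])]

print("slow replies at icmp_seq:", spikes)
print("spacing between slow replies:", gaps)
```

The slow replies are 5 requests apart each time, which matches the 5 second burst spacing seen on eth0.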
And last but not least, a list of attached PCI devices.
The eth driver used is the 3c59x from Donald Becker.
PCI devices found:
  Bus 0, device 0, function 0:
    Host bridge: Advanced Micro Devices [AMD] AMD-760 MP [IGD4-2P] System Controller (rev 17).
      Master Capable. Latency=64.
      Prefetchable 32 bit memory at 0xf8000000 [0xfbffffff].
      Prefetchable 32 bit memory at 0xf6200000 [0xf6200fff].
      I/O at 0x1010 [0x1013].
  Bus 0, device 1, function 0:
    PCI bridge: Advanced Micro Devices [AMD] AMD-760 MP [IGD4-2P] AGP Bridge (rev 0).
      Master Capable. Latency=64. Min Gnt=4.
  Bus 0, device 7, function 0:
    ISA bridge: Advanced Micro Devices [AMD] AMD-768 [Opus] ISA (rev 5).
  Bus 0, device 7, function 1:
    IDE interface: Advanced Micro Devices [AMD] AMD-768 [Opus] IDE (rev 4).
      Master Capable. Latency=64.
      I/O at 0x0 [0x7].
      I/O at 0x0 [0x3].
      I/O at 0xf000 [0xf00f].
  Bus 0, device 7, function 3:
    Bridge: Advanced Micro Devices [AMD] AMD-768 [Opus] ACPI (rev 3).
      Master Capable. Latency=64.
  Bus 0, device 9, function 0:
    I2O: Distributed Processing Technology SmartRAID V Controller (rev 1).
      IRQ 11.
      Master Capable. Latency=64. Min Gnt=1.Max Lat=1.
      Prefetchable 32 bit memory at 0xfc000000 [0xfdffffff].
  Bus 0, device 9, function 1:
    PCI bridge: Distributed Processing Technology PCI Bridge (rev 1).
      Master Capable. Latency=64. Min Gnt=4.
  Bus 0, device 16, function 0:
    PCI bridge: Advanced Micro Devices [AMD] AMD-768 [Opus] PCI (rev 5).
      Master Capable. Latency=99. Min Gnt=12.
  Bus 3, device 4, function 0:
    Serial controller: Timedia Technology Co Ltd PCI2S550 (Dual 16550 UART) (rev 1).
      IRQ 11.
      I/O at 0x2800 [0x281f].
      I/O at 0x2820 [0x282f].
      I/O at 0x2848 [0x284f].
      I/O at 0x2840 [0x2847].
      I/O at 0x2838 [0x283f].
      I/O at 0x2830 [0x2837].
  Bus 3, device 7, function 0:
    VGA compatible controller: ATI Technologies Inc Rage XL (rev 39).
      Master Capable. Latency=66. Min Gnt=8.
      Non-prefetchable 32 bit memory at 0xf5000000 [0xf5ffffff].
      I/O at 0x2000 [0x20ff].
      Non-prefetchable 32 bit memory at 0xf4001000 [0xf4001fff].
  Bus 3, device 8, function 0:
    Ethernet controller: 3Com Corporation 3c980-TX 10/100baseTX NIC [Python-T] (rev 120).
      IRQ 11.
      Master Capable. Latency=80. Min Gnt=10.Max Lat=10.
      I/O at 0x2400 [0x247f].
      Non-prefetchable 32 bit memory at 0xf4002000 [0xf400207f].
  Bus 3, device 9, function 0:
    Ethernet controller: 3Com Corporation 3c980-TX 10/100baseTX NIC [Python-T] (#2) (rev 120).
      IRQ 5.
      Master Capable. Latency=80. Min Gnt=10.Max Lat=10.
      I/O at 0x2480 [0x24ff].
      Non-prefetchable 32 bit memory at 0xf4002400 [0xf400247f].