[vortex] Re: vortex 3c982, 3C90x etc

Thomas Mierau tmi@wikon.de
Tue Aug 13 08:00:01 2002


On Wednesday, 7 August 2002 14:45, Bogdan Costescu wrote:
> On Tue, 6 Aug 2002, Thomas wrote:
> > I found the January (long time ago) mailing list about hangs in the ethx
> > section.
>
> Could you please post a link to the exact archived message that you refer
> to ?
>

This is one part of a whole conversation that you had:
http://www.geocrawler.com/archives/3/433/2002/1/0/7568423/

> > I have a Tyan Thunder K7X with two of these LAN controllers called 982.
> > I get exactly the same problem: every few seconds a hang in the system.
> > My average ping time is 0.1 ms and every few seconds it goes up to 200
> > ms or more.
>
> What is the network topology ?
>

Two machines on a 100 Mbit switch (Edimax 3116RE+), both machines running at
100 Mbit full duplex.

> > The IRQs (/proc/interrupts) go like crazy! For a 10 min ping I receive
> > around 10,000,000 IRQs on both CPUs!! Not too bad!
>
> Normal ping (with 1 packet every second) or 'ping -f' ? For normal ping
> the number of interrupts is certainly too high. Do you have several
> devices sharing the same interrupt line (shown in /proc/interrupts) ?
>

One ping per second (-f actually speeds everything up, especially the IRQs).
I attached /proc/pci, which tells you more than /proc/interrupts.
In fact the multiple serial card would never show up in /proc/interrupts.
It is a PCI card though, and of course uses an IRQ. But it makes no difference
whether this card is installed or not, or whether it shares the IRQ or not.
I also attached /proc/interrupts and a second snapshot taken 1 min later, plus
the pings with their "funny" timing, etc.


> Do you have a standalone PCI network card to test instead of the on-board
> LAN controllers ? Do other devices (if any) in the computer generate
> "normal" interrupt rates or very high like this one ?
>

We tested a regular standalone card. The effects were the same, but I cannot
recall the number of IRQs.

> > With a hang every few seconds I also receive an error message APIC
> > error on CPU ... 02(02) and an error 0xEX81 where X=0,4,6
>
> What kernel are you running ? Based on these error messages, I guess it's
> something based on 2.4.18 (I have Tyan Tigers here which log the same APIC
> error with 2.4.18-something from RedHat, while an earlier 2.4.16 based
> kernel doesn't).
> Are you using the 3c59x driver provided with the kernel or the one from
> Scyld ?
>

We tested kernels from 2.4.18 up to 2.4.19-rc3-ac4. We tried both the
original drivers and the one from Scyld: no difference.
By the way, the APIC error occurs roughly every 2 minutes.

> > The problem is consistent! The option noapic during boot sucks, as the
> >   error rate is going up during Raid System active sections
>
> Could you rephrase this ? I don't quite get the meaning, sorry !
>

Very simple: when the RAID system starts some intensive work, such as a backup
of the database, the errors on the IRQ for eth0 increase. Somehow eth0 is
losing data packets.

> > Please do a reply all, so that I can get the message on my other account
>
> I added vortex mailing-list to the CC list too...



/proc/interrupts at the start
           CPU0       CPU1
  0:    3163847    3164837    IO-APIC-edge  timer
  1:       2839       2811    IO-APIC-edge  keyboard
  2:          0          0          XT-PIC  cascade
  5: 2603035427 2603245593   IO-APIC-level  eth1
  6:         36         36    IO-APIC-edge  floppy
  8:          0          2    IO-APIC-edge  rtc
 11:      70888      69025   IO-APIC-level  dpti0, eth0
 12:      12580      12617    IO-APIC-edge  PS/2 Mouse
 14:          2          2    IO-APIC-edge  ide0
NMI:          0          0
LOC:    6328682    6328831
ERR:        558
MIS:       1458

/proc/interrupts after 1 min
           CPU0       CPU1
  0:    3166842    3167834    IO-APIC-edge  timer
  1:       2898       2872    IO-APIC-edge  keyboard
  2:          0          0          XT-PIC  cascade
  5: 2605498761 2605702882   IO-APIC-level  eth1
  6:         42         42    IO-APIC-edge  floppy
  8:          0          2    IO-APIC-edge  rtc
 11:      70951      69092   IO-APIC-level  dpti0, eth0
 12:      12580      12617    IO-APIC-edge  PS/2 Mouse
 14:          2          2    IO-APIC-edge  ide0
NMI:          0          0
LOC:    6334674    6334822
ERR:        558
MIS:       1458
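
To put numbers on the difference between the two snapshots above, here is a
minimal Python sketch that subtracts the per-IRQ counts. The function names
(parse_interrupts, deltas) are made up for illustration, only the timer, eth1
and IRQ 11 lines are copied in, and it assumes the counters did not wrap
between the two snapshots.

```python
# Counts copied from the two /proc/interrupts snapshots above (subset of lines).
START = """\
           CPU0       CPU1
  0:    3163847    3164837    IO-APIC-edge  timer
  5: 2603035427 2603245593   IO-APIC-level  eth1
 11:      70888      69025   IO-APIC-level  dpti0, eth0
"""

AFTER = """\
           CPU0       CPU1
  0:    3166842    3167834    IO-APIC-edge  timer
  5: 2605498761 2605702882   IO-APIC-level  eth1
 11:      70951      69092   IO-APIC-level  dpti0, eth0
"""


def parse_interrupts(text):
    """Return {irq_label: count summed over all CPUs} (hypothetical helper)."""
    totals = {}
    for line in text.splitlines():
        label, sep, rest = line.partition(":")
        if not sep:
            continue  # the "CPU0 CPU1" header line has no colon
        counts = []
        for field in rest.split():
            if not field.isdigit():
                break  # reached the controller / device description
            counts.append(int(field))
        if counts:
            totals[label.strip()] = sum(counts)
    return totals


def deltas(before, after):
    """Interrupts gained per IRQ between two snapshots (assumes no wrap)."""
    b, a = parse_interrupts(before), parse_interrupts(after)
    return {irq: a[irq] - b[irq] for irq in a if irq in b}


d = deltas(START, AFTER)
for irq, gained in d.items():
    print(f"IRQ {irq}: +{gained}")
```

On these figures IRQ 5 (eth1) gains 4,920,623 interrupts between the
snapshots, while IRQ 11 (dpti0, eth0) gains only 130. Assuming the timer runs
at 100 Hz, the usual setting for 2.4 kernels on x86, the 5992 timer ticks
correspond to about 60 s, which would put eth1 at roughly 82,000 interrupts
per second.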

the ifconfig output
eth0      Link encap:Ethernet  HWaddr 00:E0:81:21:FF:A2
          inet addr:192.168.47.11  Bcast:192.168.47.255  Mask:255.255.255.0
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:125798 errors:0 dropped:0 overruns:0 frame:0
          TX packets:188956 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:100
          RX bytes:12327368 (11.7 Mb)  TX bytes:18517194 (17.6 Mb)
          Interrupt:11 Base address:0x2400

eth1      Link encap:Ethernet  HWaddr 00:E0:81:21:FF:A3
          inet addr:192.168.47.12  Bcast:192.168.47.255  Mask:255.255.255.0
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:126196 errors:0 dropped:0 overruns:0 frame:0
          TX packets:63016 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:100
          RX bytes:12366600 (11.7 Mb)  TX bytes:6164576 (5.8 Mb)
          Interrupt:5 Base address:0x2480

lo        Link encap:Local Loopback
          inet addr:127.0.0.1  Mask:255.0.0.0
          UP LOOPBACK RUNNING  MTU:16436  Metric:1
          RX packets:8 errors:0 dropped:0 overruns:0 frame:0
          TX packets:8 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:400 (400.0 b)  TX bytes:400 (400.0 b)

The ping on eth0:
The replies return in blocks of 5 at the same time, since the times within a
block always differ by 1 sec. It looks pretty nasty.
PING 192.168.47.47 (192.168.47.47) from 192.168.47.11 eth0: 56(84) bytes of data.
64 bytes from 192.168.47.47: icmp_seq=1 ttl=255 time=912 ms
64 bytes from 192.168.47.47: icmp_seq=2 ttl=255 time=4912 ms
64 bytes from 192.168.47.47: icmp_seq=3 ttl=255 time=3912 ms
64 bytes from 192.168.47.47: icmp_seq=4 ttl=255 time=2912 ms
64 bytes from 192.168.47.47: icmp_seq=5 ttl=255 time=1912 ms
64 bytes from 192.168.47.47: icmp_seq=6 ttl=255 time=912 ms
64 bytes from 192.168.47.47: icmp_seq=7 ttl=255 time=4905 ms
64 bytes from 192.168.47.47: icmp_seq=8 ttl=255 time=3905 ms
64 bytes from 192.168.47.47: icmp_seq=9 ttl=255 time=2905 ms
64 bytes from 192.168.47.47: icmp_seq=10 ttl=255 time=1905 ms
64 bytes from 192.168.47.47: icmp_seq=11 ttl=255 time=905 ms
64 bytes from 192.168.47.47: icmp_seq=12 ttl=255 time=4912 ms
64 bytes from 192.168.47.47: icmp_seq=13 ttl=255 time=3911 ms
64 bytes from 192.168.47.47: icmp_seq=14 ttl=255 time=2912 ms
64 bytes from 192.168.47.47: icmp_seq=15 ttl=255 time=1912 ms
64 bytes from 192.168.47.47: icmp_seq=16 ttl=255 time=912 ms
64 bytes from 192.168.47.47: icmp_seq=17 ttl=255 time=4902 ms
64 bytes from 192.168.47.47: icmp_seq=18 ttl=255 time=3902 ms
64 bytes from 192.168.47.47: icmp_seq=19 ttl=255 time=2902 ms
64 bytes from 192.168.47.47: icmp_seq=20 ttl=255 time=1902 ms
64 bytes from 192.168.47.47: icmp_seq=21 ttl=255 time=902 ms

--- 192.168.47.47 ping statistics ---
22 packets transmitted, 21 received, 4% loss, time 21039ms
rtt min/avg/max/mdev = 902.583/2812.949/4912.193/1444.073 ms, pipe 5
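
The block pattern in the eth0 ping can be checked by reconstructing when each
reply arrived. This is a small Python sketch, assuming the requests were sent
exactly one second apart (one ping per second) and using a hypothetical helper
called arrival_batches; the RTT values are copied from the output above.

```python
# Round-trip times in ms for icmp_seq 1..21, copied from the eth0 ping above.
RTT_MS = [912, 4912, 3912, 2912, 1912, 912, 4905, 3905, 2905, 1905, 905,
          4912, 3911, 2912, 1912, 912, 4902, 3902, 2902, 1902, 902]

SEND_INTERVAL_MS = 1000  # assumption: one echo request per second


def arrival_batches(rtts, interval_ms=SEND_INTERVAL_MS, tolerance_ms=100):
    """Group sequence numbers whose replies arrived at (nearly) the same time.

    Arrival time is estimated as send time + RTT. A new batch starts when the
    arrival time moves more than tolerance_ms away from the previous reply.
    """
    batches = []
    last_arrival = None
    for index, rtt in enumerate(rtts):
        seq = index + 1
        arrival = index * interval_ms + rtt
        if last_arrival is None or abs(arrival - last_arrival) > tolerance_ms:
            batches.append([])
        batches[-1].append(seq)
        last_arrival = arrival
    return batches


batches = arrival_batches(RTT_MS)
for batch in batches:
    first = batch[0]
    arrival = (first - 1) * SEND_INTERVAL_MS + RTT_MS[first - 1]
    print(f"~{arrival} ms: seq {batch[0]}..{batch[-1]} ({len(batch)} replies)")
```

With these values the replies group as 1, 5, 5, 5, 5: after the first reply,
the rest show up together about every 5 s, as if the receive side were only
serviced once per 5 s interval.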

The ping from eth1 looks a lot better: everything is again in blocks of 5, but
this time with 4 good ones and 1 bad one.
PING 192.168.47.47 (192.168.47.47) from 192.168.47.12 eth1: 56(84) bytes of data.
64 bytes from 192.168.47.47: icmp_seq=1 ttl=255 time=0.169 ms
64 bytes from 192.168.47.47: icmp_seq=2 ttl=255 time=0.180 ms
64 bytes from 192.168.47.47: icmp_seq=3 ttl=255 time=0.173 ms
64 bytes from 192.168.47.47: icmp_seq=4 ttl=255 time=645 ms
64 bytes from 192.168.47.47: icmp_seq=5 ttl=255 time=0.181 ms
64 bytes from 192.168.47.47: icmp_seq=6 ttl=255 time=0.174 ms
64 bytes from 192.168.47.47: icmp_seq=7 ttl=255 time=0.169 ms
64 bytes from 192.168.47.47: icmp_seq=8 ttl=255 time=0.173 ms
64 bytes from 192.168.47.47: icmp_seq=9 ttl=255 time=654 ms
64 bytes from 192.168.47.47: icmp_seq=10 ttl=255 time=0.177 ms
64 bytes from 192.168.47.47: icmp_seq=11 ttl=255 time=0.171 ms
64 bytes from 192.168.47.47: icmp_seq=12 ttl=255 time=0.173 ms
64 bytes from 192.168.47.47: icmp_seq=13 ttl=255 time=0.171 ms
64 bytes from 192.168.47.47: icmp_seq=14 ttl=255 time=653 ms
64 bytes from 192.168.47.47: icmp_seq=15 ttl=255 time=0.171 ms
64 bytes from 192.168.47.47: icmp_seq=16 ttl=255 time=0.174 ms
64 bytes from 192.168.47.47: icmp_seq=17 ttl=255 time=0.173 ms
64 bytes from 192.168.47.47: icmp_seq=18 ttl=255 time=0.170 ms
64 bytes from 192.168.47.47: icmp_seq=19 ttl=255 time=672 ms
64 bytes from 192.168.47.47: icmp_seq=20 ttl=255 time=0.181 ms
64 bytes from 192.168.47.47: icmp_seq=21 ttl=255 time=0.174 ms
64 bytes from 192.168.47.47: icmp_seq=22 ttl=255 time=0.174 ms

--- 192.168.47.47 ping statistics ---
22 packets transmitted, 22 received, 0% loss, time 21066ms
rtt min/avg/max/mdev = 0.169/119.485/672.358/253.133 ms
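
The slow replies on eth1 can be picked out with a simple threshold. The sketch
below is only illustrative: the 1 ms cutoff and the name slow_replies are
assumptions, and the RTTs are copied from the eth1 output above.

```python
# Round-trip times in ms for icmp_seq 1..22, copied from the eth1 ping above.
RTT_MS = [0.169, 0.180, 0.173, 645, 0.181, 0.174, 0.169, 0.173, 654, 0.177,
          0.171, 0.173, 0.171, 653, 0.171, 0.174, 0.173, 0.170, 672, 0.181,
          0.174, 0.174]


def slow_replies(rtts, threshold_ms=1.0):
    """Return the sequence numbers whose RTT exceeds threshold_ms."""
    return [seq for seq, rtt in enumerate(rtts, start=1) if rtt > threshold_ms]


slow = slow_replies(RTT_MS)
# Distance between consecutive slow replies, in packets (seconds at 1 ping/s).
gaps = [later - earlier for earlier, later in zip(slow, slow[1:])]
print("slow seq:", slow)
print("gaps:", gaps)
```

Here the slow replies are seq 4, 9, 14 and 19, i.e. every fifth packet, which
is the same period of 5 as the blocks seen on eth0.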

And last but not least, a list of the attached PCI devices.
The eth driver used is the 3c59x from Donald Becker.
PCI devices found:
  Bus  0, device   0, function  0:
    Host bridge: Advanced Micro Devices [AMD] AMD-760 MP [IGD4-2P] System Controller (rev 17).
      Master Capable.  Latency=64.
      Prefetchable 32 bit memory at 0xf8000000 [0xfbffffff].
      Prefetchable 32 bit memory at 0xf6200000 [0xf6200fff].
      I/O at 0x1010 [0x1013].
  Bus  0, device   1, function  0:
    PCI bridge: Advanced Micro Devices [AMD] AMD-760 MP [IGD4-2P] AGP Bridge (rev 0).
      Master Capable.  Latency=64.  Min Gnt=4.
  Bus  0, device   7, function  0:
    ISA bridge: Advanced Micro Devices [AMD] AMD-768 [Opus] ISA (rev 5).
  Bus  0, device   7, function  1:
    IDE interface: Advanced Micro Devices [AMD] AMD-768 [Opus] IDE (rev 4).
      Master Capable.  Latency=64.  
      I/O at 0x0 [0x7].
      I/O at 0x0 [0x3].
      I/O at 0xf000 [0xf00f].
  Bus  0, device   7, function  3:
    Bridge: Advanced Micro Devices [AMD] AMD-768 [Opus] ACPI (rev 3).
      Master Capable.  Latency=64.  
  Bus  0, device   9, function  0:
    I2O: Distributed Processing Technology SmartRAID V Controller (rev 1).
      IRQ 11.
      Master Capable.  Latency=64.  Min Gnt=1.Max Lat=1.
      Prefetchable 32 bit memory at 0xfc000000 [0xfdffffff].
  Bus  0, device   9, function  1:
    PCI bridge: Distributed Processing Technology PCI Bridge (rev 1).
      Master Capable.  Latency=64.  Min Gnt=4.
  Bus  0, device  16, function  0:
    PCI bridge: Advanced Micro Devices [AMD] AMD-768 [Opus] PCI (rev 5).
      Master Capable.  Latency=99.  Min Gnt=12.
  Bus  3, device   4, function  0:
    Serial controller: Timedia Technology Co Ltd PCI2S550 (Dual 16550 UART) (rev 1).
      IRQ 11.
      I/O at 0x2800 [0x281f].
      I/O at 0x2820 [0x282f].
      I/O at 0x2848 [0x284f].
      I/O at 0x2840 [0x2847].
      I/O at 0x2838 [0x283f].
      I/O at 0x2830 [0x2837].
  Bus  3, device   7, function  0:
    VGA compatible controller: ATI Technologies Inc Rage XL (rev 39).
      Master Capable.  Latency=66.  Min Gnt=8.
      Non-prefetchable 32 bit memory at 0xf5000000 [0xf5ffffff].
      I/O at 0x2000 [0x20ff].
      Non-prefetchable 32 bit memory at 0xf4001000 [0xf4001fff].
  Bus  3, device   8, function  0:
    Ethernet controller: 3Com Corporation 3c980-TX 10/100baseTX NIC [Python-T] (rev 120).
      IRQ 11.
      Master Capable.  Latency=80.  Min Gnt=10.Max Lat=10.
      I/O at 0x2400 [0x247f].
      Non-prefetchable 32 bit memory at 0xf4002000 [0xf400207f].
  Bus  3, device   9, function  0:
    Ethernet controller: 3Com Corporation 3c980-TX 10/100baseTX NIC [Python-T] (#2) (rev 120).
      IRQ 5.
      Master Capable.  Latency=80.  Min Gnt=10.Max Lat=10.
      I/O at 0x2480 [0x24ff].
      Non-prefetchable 32 bit memory at 0xf4002400 [0xf400247f].