[eepro100] Packets lost

Roberto Capobianco roberto.capobianco at igi.cnr.it
Wed May 19 23:09:04 PDT 2004


Hi all,
I have ten CompactPCI stations, a Compaq server and some PCs connected by an
HP ProCurve 2524 switch.
All CompactPCI stations mount an SBS cpu board (CT7 model) with RedHat Linux
8 (kernel 2.4.20-28.8),
the Compaq server also has RedHat Linux 8 (Kernel 2.4.20-13.8) and the PCs
have Windows 2000.
The cabling is Category 5 enhanced (Cat 5e) and certified.
In general the network works well (I have no serious problems with your
driver),
but I observe a strange behaviour:
I lose packets only when doing ping -s 5000 (or any other large size) from or
to the CompactPCI nodes.
From time to time I also get "Frag reassembly time exceeded". LAN
monitoring shows a growing number of FCS errors
from the CompactPCI nodes (see the diagnostics below).
The problem is clearly localized on the CompactPCI nodes. No PCs are
affected.
Changing the switch or using your latest eepro100 driver (eepro100.c,
pci-scan.c, pci-scan.h, kern_compat.h)
does not resolve the problem.
Sleep mode is also disabled, using the command eepro100-diag -G 0 -w -w -f

Things seem to be slightly better (fewer packets lost) after changing:
RX_RING_SIZE from 32 to 1024
TX_QUEUE_LIMIT from 12 to 0
but I don't know whether these settings are correct.

Any ideas about this problem (hardware bug, driver bug ...)?
Are there any known problems with the 82559ER chipset?

Could the difference in memory region size between the NIC on the SBS board
and the one on the Compaq server (from lspci -v):
Memory at febc0000 (32-bit, non-prefetchable) [size=128K] -----> SBS boards
Memory at c6c00000 (32-bit, non-prefetchable) [size=1M] -----> Compaq Server
be correlated with my problem, in the sense of a poor hardware
implementation on the SBS boards?


Here is some detailed info about ping:

PING X.X.X.X (X.X.X.X) from X.X.X.X : 20000(20028) bytes of data.
20008 bytes from X.X.X.X: icmp_seq=1 ttl=64 time=3.79 ms
20008 bytes from X.X.X.X: icmp_seq=2 ttl=64 time=3.77 ms
20008 bytes from X.X.X.X: icmp_seq=3 ttl=64 time=3.78 ms
20008 bytes from X.X.X.X: icmp_seq=4 ttl=64 time=3.77 ms
20008 bytes from X.X.X.X: icmp_seq=5 ttl=64 time=3.96 ms
20008 bytes from X.X.X.X: icmp_seq=6 ttl=64 time=3.78 ms
20008 bytes from X.X.X.X: icmp_seq=7 ttl=64 time=3.78 ms
20008 bytes from X.X.X.X: icmp_seq=9 ttl=64 time=3.75 ms
20008 bytes from X.X.X.X: icmp_seq=10 ttl=64 time=3.77 ms
20008 bytes from X.X.X.X: icmp_seq=12 ttl=64 time=3.79 ms
20008 bytes from X.X.X.X: icmp_seq=13 ttl=64 time=3.77 ms
20008 bytes from X.X.X.X: icmp_seq=14 ttl=64 time=3.77 ms
20008 bytes from X.X.X.X: icmp_seq=15 ttl=64 time=3.77 ms
20008 bytes from X.X.X.X: icmp_seq=19 ttl=64 time=3.77 ms
20008 bytes from X.X.X.X: icmp_seq=21 ttl=64 time=3.75 ms
20008 bytes from X.X.X.X: icmp_seq=22 ttl=64 time=3.76 ms

--- X.X.X.X ping statistics ---
22 packets transmitted, 16 received, 27% loss, time 21148ms
rtt min/avg/max/mdev = 3.758/3.788/3.968/0.094 ms
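
As a cross-check of the statistics above, the following sketch lists which echo requests got no reply. The function name and the way the reply list is rebuilt are illustrative only; the sequence numbers are copied from the ping output.

```python
import re

def missing_sequences(ping_text, transmitted):
    """Return the icmp_seq values (1..transmitted) that have no reply in ping_text."""
    seen = {int(m) for m in re.findall(r"icmp_seq=(\d+)", ping_text)}
    return [seq for seq in range(1, transmitted + 1) if seq not in seen]

# Reply sequence numbers copied from the ping output above.
replies = " ".join(
    f"icmp_seq={n}"
    for n in [1, 2, 3, 4, 5, 6, 7, 9, 10, 12, 13, 14, 15, 19, 21, 22]
)
lost = missing_sequences(replies, transmitted=22)
print(lost)                                  # [8, 11, 16, 17, 18, 20]
print(f"{100 * len(lost) / 22:.0f}% loss")   # 27% loss
```

The six missing replies agree with the "22 packets transmitted, 16 received" summary line.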



Here is some detailed info using various Linux and diagnostic commands:

1) eepro100-diag -ameef

eepro100-diag.c:v2.12 4/15/2003 Donald Becker (becker at scyld.com)
 http://www.scyld.com/diag/index.html
Index #1: Found a Intel 82559ER EtherExpressPro/100+ adapter at 0xef00.
i82557 chip registers at 0xef00:
  00000050 01924000 00000000 00080002 182541e1 000005f0
  No interrupt sources are pending.
   The transmit unit state is 'Suspended'.
   The receive unit state is 'Ready'.
  This status is normal for an activated but idle interface.
EEPROM contents, size 64x16:
    00: 2000 c6ce 7810 0203 0000 0201 4701 0000  _ ___x_______G__
  0x08: ffff ffff 40a0 1050 4c53 0000 0000 0000  _____@P_SL______
      ...
  0x30: 0120 0000 0000 0000 0000 0000 0000 0000   _______________
  0x38: 0000 0000 0000 4020 0000 0000 0000 3256  ______ @______V2
 The EEPROM checksum is correct.
Intel EtherExpress Pro 10/100 EEPROM contents:
  Station address 00:20:CE:C6:10:78.
  Board assembly ffffff-255, Physical connectors present: RJ45
  Primary interface chip i82555 PHY #1.
Primary transceiver is MII PHY #1. MII PHY #1 transceiver registers:
   3100 782d 02a8 0154 05e1 41e1 0001 0000
   0000 0000 0000 0000 0000 0000 0000 0000
   0a03 0000 0001 0000 0000 0000 0000 0000
   0000 0000 0b20 0000 0000 0000 0000 0000.
Index #2: Found a Intel 82559ER EtherExpressPro/100+ adapter at 0xee80.
i82557 chip registers at 0xee80:
  00000000 00000000 00000000 00080002 183f0000 00000000
  No interrupt sources are pending.
   The transmit unit state is 'Idle'.
   The receive unit state is 'Idle'.
  This status is unusual for an activated interface.
EEPROM contents, size 64x16:
    00: 2000 c6ce 7910 0203 0000 0201 4701 0000  _ ___y_______G__
  0x08: ffff ffff 40a0 1050 4c53 0000 0000 0000  _____@P_SL______
      ...
  0x30: 0120 0000 0000 0000 0000 0000 0000 0000   _______________
  0x38: 0000 0000 0000 4020 0000 0000 0000 3156  ______ @______V1
 The EEPROM checksum is correct.
Intel EtherExpress Pro 10/100 EEPROM contents:
  Station address 00:20:CE:C6:10:79.
  Board assembly ffffff-255, Physical connectors present: RJ45
  Primary interface chip i82555 PHY #1.
Primary transceiver is MII PHY #1. MII PHY #1 transceiver registers:
   3000 7809 02a8 0154 05e1 0000 0000 0000
   0000 0000 0000 0000 0000 0000 0000 0000
   0000 0000 0001 0000 0000 0000 0000 0000
   0000 0000 0de0 0000 0000 0000 0000 0000.
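
For readers comparing the EEPROM dump with the reported station addresses, here is a small sketch of the mapping. It assumes the first three 16-bit EEPROM words hold the address with the low byte first, which is consistent with both dumps above; the helper name is made up for illustration.

```python
def station_address(words):
    """Build a MAC string from the first three 16-bit EEPROM words (low byte first)."""
    octets = []
    for word in words[:3]:
        octets += [word & 0xFF, word >> 8]
    return ":".join(f"{o:02X}" for o in octets)

print(station_address([0x2000, 0xC6CE, 0x7810]))  # 00:20:CE:C6:10:78
print(station_address([0x2000, 0xC6CE, 0x7910]))  # 00:20:CE:C6:10:79
```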


2) mii-diag -vg

mii-diag.c:v2.09 9/06/2003 Donald Becker (becker at scyld.com)
 http://www.scyld.com/diag/index.html
Using the default interface 'eth0'.
  Using the new SIOCGMIIPHY value on PHY 1 (BMCR 0x3100).
Driver general parameter settings: 3 64 20 200.
 The autonegotiated capability is 01e0.
The autonegotiated media type is 100baseTx-FD.
 Basic mode control register 0x3100: Auto-negotiation enabled.
 You have link beat, and everything is working OK.
   This transceiver is capable of  100baseTx-FD 100baseTx 10baseT-FD 10baseT.
   Able to perform Auto-negotiation, negotiation complete.
 Your link partner advertised 41e1: 100baseTx-FD 100baseTx 10baseT-FD 10baseT.
   End of basic transceiver information.

libmii.c:v2.10 4/22/2003  Donald Becker (becker at scyld.com)
 http://www.scyld.com/diag/index.html
 MII PHY #1 transceiver registers:
   3100 782d 02a8 0154 05e1 41e1 0001 0000
   0000 0000 0000 0000 0000 0000 0000 0000
   0a03 0000 0001 0000 0000 0000 0000 0000
   0000 0000 0b20 0000 0000 0000 0000 0000.
 Basic mode control register 0x3100: Auto-negotiation enabled.
 Basic mode status register 0x782d ... 782d.
   Link status: established.
   Capable of  100baseTx-FD 100baseTx 10baseT-FD 10baseT.
   Able to perform Auto-negotiation, negotiation complete.
 Vendor ID is 00:aa:00:--:--:--, model 21 rev. 4.
   Vendor/Part: Intel 82559 transceiver.
 I'm advertising 05e1: Flow-control 100baseTx-FD 100baseTx 10baseT-FD 10baseT
   Advertising no additional info pages.
   IEEE 802.3 CSMA/CD protocol.
 Link partner capability is 41e1: 100baseTx-FD 100baseTx 10baseT-FD 10baseT.
   Negotiation  completed.
  Intel 8255* PHY #1 extended management registers:
    Error counts, cleared when read:
     False carriers 0
     Link disconnects 0
     Receive errors 0
     Rx symbol errors 0.
     Rx 10Mbps Early End-Of-Frame errors 0.
     Rx 100Mbps Early End-Of-Frame errors 0.
     Tx jabber errors 0.


3) lspci -v

00:04.0 Ethernet controller: Intel Corp. 82559ER (rev 09)
        Subsystem: SBS Technologies: Unknown device 1050
        Flags: bus master, medium devsel, latency 64, IRQ 11
        Memory at febff000 (32-bit, non-prefetchable) [size=4K]
        I/O ports at ef00 [size=64]
        Memory at febc0000 (32-bit, non-prefetchable) [size=128K]
        Expansion ROM at fea00000 [disabled] [size=1M]
        Capabilities: [dc] Power Management version 2

00:05.0 Ethernet controller: Intel Corp. 82559ER (rev 09)
        Subsystem: SBS Technologies: Unknown device 1050
        Flags: bus master, medium devsel, latency 64, IRQ 11
        Memory at febfc000 (32-bit, non-prefetchable) [size=4K]
        I/O ports at ee80 [size=64]
        Memory at feba0000 (32-bit, non-prefetchable) [size=128K]
        Expansion ROM at fe900000 [disabled] [size=1M]
        Capabilities: [dc] Power Management version 2


4) uname -a

Linux lin009 2.4.20-28.8 #1 Thu Dec 18 13:05:06 EST 2003 i686 i686 i386 GNU/Linux


5) ifconfig

eth0      Link encap:Ethernet  HWaddr 00:20:CE:C6:10:78
          inet addr:150.178.34.23  Bcast:150.178.34.127  Mask:255.255.255.128
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:19328 errors:0 dropped:0 overruns:0 frame:0
          TX packets:13488 errors:0 dropped:0 overruns:4 carrier:0
          collisions:0 txqueuelen:100
          RX bytes:15202166 (14.4 Mb)  TX bytes:15329813 (14.6 Mb)
          Interrupt:11 Base address:0xa000

lo        Link encap:Local Loopback
          inet addr:127.0.0.1  Mask:255.0.0.0
          UP LOOPBACK RUNNING  MTU:16436  Metric:1
          RX packets:154 errors:0 dropped:0 overruns:0 frame:0
          TX packets:154 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:9958 (9.7 Kb)  TX bytes:9958 (9.7 Kb)


6) cat /proc/net/dev

Inter-|   Receive
 face |   bytes packets errs drop fifo frame compressed multicast
    lo:    9958     154    0    0    0     0          0         0
  eth0:15204146   19361    0    0    0     0          0         0
  eth1:       0       0    0    0    0     0          0         0

Inter-|   Transmit
 face |   bytes packets errs drop fifo colls carrier compressed
    lo:    9958     154    0    0    0     0       0          0
  eth0:15331733   13507    0    0    4     0       0          0
  eth1:       0       0    0    0    0     0       0          0



7) cat /proc/pci

  Bus  0, device   4, function  0:
    Ethernet controller: Intel Corp. 82559ER (rev 9).
      IRQ 11.
      Master Capable.  Latency=64.  Min Gnt=8.Max Lat=56.
      Non-prefetchable 32 bit memory at 0xfebff000 [0xfebfffff].
      I/O at 0xef00 [0xef3f].
      Non-prefetchable 32 bit memory at 0xfebc0000 [0xfebdffff].
  Bus  0, device   5, function  0:
    Ethernet controller: Intel Corp. 82559ER (#2) (rev 9).
      IRQ 11.
      Master Capable.  Latency=64.  Min Gnt=8.Max Lat=56.
      Non-prefetchable 32 bit memory at 0xfebfc000 [0xfebfcfff].
      I/O at 0xee80 [0xeebf].
      Non-prefetchable 32 bit memory at 0xfeba0000 [0xfebbffff].


8) lspci -v (on the Compaq Server. No problem with this card)

00:04.0 Ethernet controller: Intel Corp. 82557/8/9 [Ethernet Pro 100] (rev 08)
        Subsystem: Compaq Computer Corporation NC3163 Fast Ethernet NIC (embedded, WOL)
        Flags: bus master, medium devsel, latency 64, IRQ 10
        Memory at c6dfc000 (32-bit, non-prefetchable) [size=4K]
        I/O ports at 3000 [size=64]
        Memory at c6c00000 (32-bit, non-prefetchable) [size=1M]
        Expansion ROM at <unassigned> [disabled] [size=1M]
        Capabilities: [dc] Power Management version 2


9) output of the HP switch

Status and Counters - Port Counters - Port 19

Name :

Link Status : Up

Bytes Rx        : 106,349,228          Bytes Tx        : 95,206,835
Unicast Rx      : 72,029               Unicast Tx      : 64,736
Bcast/Mcast Rx  : 3                    Bcast/Mcast Tx  : 751

FCS Rx          : 198                  Drops Tx        : 0
Alignment Rx    : 0                    Collisions Tx   : 0
Runts Rx        : 0                    Late Colln Tx   : 0
Giants Rx       : 0                    Excessive Colln : 0
Total Rx Errors : 198                  Deferred Tx     : 0
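
To relate these counters to the ping loss, here is a rough sketch under two loud assumptions: that each frame is corrupted independently with the same probability, and that the per-frame rate can be approximated as FCS errors divided by the unicast plus broadcast/multicast frames the switch received on this port. The counters cover all traffic since they were last cleared, not only the ping run, so the result is indicative at best.

```python
def fcs_rate(fcs_errors, frames_received):
    """Approximate per-frame corruption rate seen by the switch port."""
    return fcs_errors / frames_received

def datagram_loss(per_frame_rate, fragments):
    """Chance that at least one of `fragments` frames is lost (independence assumed)."""
    return 1 - (1 - per_frame_rate) ** fragments

p = fcs_rate(198, 72029 + 3)   # counters from the port 19 table above
print(f"per-frame rate: {p:.2%}")                                   # 0.27%
print(f"14-fragment datagram loss: {datagram_loss(p, 14):.1%}")     # 3.8%
```

The point of the formula is that a fragmented datagram is lost if any single fragment is lost, so a small per-frame error rate is amplified for large pings while small pings stay almost unaffected.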


10) dmesg with debug=31 in your driver

pci-scan.c:v1.11 8/31/2002  Donald Becker <becker at scyld.com>
http://www.scyld.com/linux/drivers.html
eepro100.c:v1.28 7/22/2003 Donald Becker <becker at scyld.com>
  http://www.scyld.com/network/eepro100.html
divert: allocating divert_blk for eth0
eth0: Intel EtherExpress Pro/100+ i82559ER at 0xd08da000, 00:20:CE:C6:10:78, IRQ 11.
  Board assembly ffffff-255, Physical connectors present: RJ45
  Primary interface chip i82555 PHY #1.
  General self-test: passed.
  Serial sub-system self-test: passed.
  Internal registers self-test: passed.
  ROM checksum self-test: passed (0xdbd8681d).
divert: allocating divert_blk for eth1
eth1: Intel EtherExpress Pro/100+ i82559ER at 0xd08dc000, 00:20:CE:C6:10:79, IRQ 11.
  Board assembly ffffff-255, Physical connectors present: RJ45
  Primary interface chip i82555 PHY #1.
  General self-test: passed.
  Serial sub-system self-test: passed.
  Internal registers self-test: passed.
  ROM checksum self-test: passed (0xdbd8681d).
eth0: speedo_open() irq 11.
eth0: Done speedo_open(), status 00000090.
eth0: interrupt  status=0x2050.
 scavenge candidate 0 status 1a000.
 scavenge candidate 1 status 2a000.
 scavenge candidate 2 status 3a000.
 scavenge candidate 3 status 3a000.
 scavenge candidate 4 status 4003a000.
eth0: exiting interrupt, status=0x0050.
eth0: interrupt  status=0x2050.
 scavenge candidate 5 status 400ca000.
eth0: exiting interrupt, status=0x0050.
eth0: interrupt  status=0x2050.
 scavenge candidate 6 status 400ca000.
eth0: exiting interrupt, status=0x0050.
eth0: interrupt  status=0x2050.
 scavenge candidate 7 status 4003a000.
eth0: exiting interrupt, status=0x0050.
eth0: interrupt  status=0x2050.
 scavenge candidate 8 status 400ca000.
eth0: exiting interrupt, status=0x0050.
eth0: Interface monitor tick, chip status 0050.
eth0: interrupt  status=0x2050.
 scavenge candidate 9 status 400ca000.
eth0: exiting interrupt, status=0x0050.
eth0: Interface monitor tick, chip status 0050.
eth0: Interface monitor tick, chip status 0050.
eth0: Interface monitor tick, chip status 0050.
eth0: Interface monitor tick, chip status 0050.
eth0: Interface monitor tick, chip status 0050.
eth0: Interface monitor tick, chip status 0050.


11) dmesg with a single ping (ping -c 1 -s 20000). It seems correct.

eth0: interrupt  status=0x2050.
 scavenge candidate 10 status ca000.
 scavenge candidate 11 status ca000.
 scavenge candidate 12 status ca000.
 scavenge candidate 13 status ca000.
 scavenge candidate 14 status ca000.
 scavenge candidate 15 status ca000.
 scavenge candidate 16 status ca000.
 scavenge candidate 17 status ca000.
 scavenge candidate 18 status ca000.
 scavenge candidate 19 status ca000.
 scavenge candidate 20 status ca000.
 scavenge candidate 21 status 400ca000.
eth0: exiting interrupt, status=0x0050.
eth0: interrupt  status=0x2050.
 scavenge candidate 22 status ca000.
 scavenge candidate 23 status 400ca000.
eth0: exiting interrupt, status=0x0050.
eth0: interrupt  status=0x4050.
 In speedo_rx().
  speedo_rx() status 0000a020 len 1514.
eth0: exiting interrupt, status=0x0050.
eth0: interrupt  status=0x4050.
 In speedo_rx().
  speedo_rx() status 0000a020 len 1514.
eth0: exiting interrupt, status=0x0050.
eth0: interrupt  status=0x4050.
 In speedo_rx().
  speedo_rx() status 0000a020 len 1514.
eth0: exiting interrupt, status=0x0050.
eth0: interrupt  status=0x4050.
 In speedo_rx().
  speedo_rx() status 0000a020 len 1514.
eth0: exiting interrupt, status=0x0050.
eth0: interrupt  status=0x4050.
 In speedo_rx().
  speedo_rx() status 0000a020 len 1514.
eth0: exiting interrupt, status=0x0050.
eth0: interrupt  status=0x4050.
 In speedo_rx().
  speedo_rx() status 0000a020 len 1514.
eth0: exiting interrupt, status=0x0050.
eth0: interrupt  status=0x4050.
 In speedo_rx().
  speedo_rx() status 0000a020 len 1514.
eth0: exiting interrupt, status=0x0050.
eth0: interrupt  status=0x4050.
 In speedo_rx().
  speedo_rx() status 0000a020 len 1514.
eth0: exiting interrupt, status=0x0050.
eth0: interrupt  status=0x4050.
 In speedo_rx().
  speedo_rx() status 0000a020 len 1514.
eth0: exiting interrupt, status=0x0050.
eth0: interrupt  status=0x4050.
 In speedo_rx().
  speedo_rx() status 0000a020 len 1514.
eth0: exiting interrupt, status=0x0050.
eth0: interrupt  status=0x4050.
 In speedo_rx().
  speedo_rx() status 0000a020 len 1514.
eth0: exiting interrupt, status=0x0050.
eth0: interrupt  status=0x4050.
 In speedo_rx().
  speedo_rx() status 0000a020 len 1514.
eth0: exiting interrupt, status=0x0050.
eth0: interrupt  status=0x4050.
 In speedo_rx().
  speedo_rx() status 0000a020 len 1514.
eth0: exiting interrupt, status=0x0050.
eth0: interrupt  status=0x4050.
 In speedo_rx().
  speedo_rx() status 0000a020 len 802.
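
The frame count and lengths in this trace can be checked with a little arithmetic. The sketch below assumes an MTU of 1500 (as shown by ifconfig), a 20-byte IPv4 header without options, and that the lengths the driver prints exclude the 4-byte FCS; the function name is illustrative.

```python
ETH_HEADER = 14    # destination MAC + source MAC + EtherType, FCS not counted
IP_HEADER = 20     # IPv4 header without options
ICMP_HEADER = 8

def fragment_frame_lengths(payload, mtu=1500):
    """Ethernet frame lengths (without FCS) for one fragmented ICMP echo."""
    remaining = payload + ICMP_HEADER            # bytes carried after the IP header
    per_fragment = (mtu - IP_HEADER) // 8 * 8    # fragment data is a multiple of 8
    lengths = []
    while remaining > 0:
        chunk = min(per_fragment, remaining)
        lengths.append(chunk + IP_HEADER + ETH_HEADER)
        remaining -= chunk
    return lengths

frames = fragment_frame_lengths(20000)
print(len(frames), frames[0], frames[-1])   # 14 1514 802
```

This agrees with the trace: 14 transmit descriptors scavenged (candidates 10 to 23) and 14 received frames, thirteen of length 1514 and a final one of length 802.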


Thanks in advance.

Roberto Capobianco
Consorzio RFX - CNR di Padova
C.so Stati Uniti, 4
35127 - Camin (PD)
email: roberto.capobianco at igi.cnr.it
web: www.igi.pd.cnr.it
tel.: +39-049-8295048
fax: +39-049-8700718


