[tulip] Tx stats invalid if Tx intr on completion not set?

Bhavesh P. Davda bhavesh@avaya.com
Thu Mar 28 11:43:01 2002


The reason for my questions is this scenario:

I've got a DFE-570TX Quad NIC, serving as eth2-eth5. Kernel version
2.2.17 with lots of mods for several reasons. Tulip driver version
"tulip.c:v0.92 4/17/2000".

Occasionally the interfaces get into a state where the "TX packets"
counter gets stuck, while the "errors" and "carrier" counts keep
climbing. While in this state, I can still use the interfaces
**almost** normally, except that they appear to work really slowly
compared to before getting into this state.

The only way I can get out of this state is to "ifconfig down" all the
tulip interfaces, "rmmod tulip", "insmod tulip", and bring up all the
interfaces again. Simply "ifconfig down" and "ifconfig up" doesn't seem
to help.

Here is the output from "ifconfig" and "tulip-diag -aem" while the
interfaces were in this funny state:

eth2      Link encap:Ethernet  HWaddr 00:80:C8:CF:BB:61  
          inet addr:192.11.13.13  Bcast:192.11.13.15  Mask:255.255.255.252
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:848857 errors:9 dropped:0 overruns:0 frame:9
          TX packets:164937 errors:763531 dropped:0 overruns:5 carrier:763529
          collisions:0 txqueuelen:100 
          Interrupt:7 Base address:0xc00 

eth3      Link encap:Ethernet  HWaddr 00:80:C8:CF:BB:62  
          inet addr:198.152.255.201  Bcast:198.152.255.255  Mask:255.255.255.0
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:6771966 errors:0 dropped:0 overruns:0 frame:0
          TX packets:941411 errors:5698157 dropped:0 overruns:1 carrier:5698156
          collisions:0 txqueuelen:100 
          Interrupt:11 Base address:0x2800 

eth4      Link encap:Ethernet  HWaddr 00:80:C8:CF:BB:63  
          inet addr:172.18.18.20  Bcast:172.18.255.255  Mask:255.255.0.0
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:958106 errors:0 dropped:0 overruns:0 frame:0
          TX packets:27710 errors:212605 dropped:0 overruns:3 carrier:212605
          collisions:0 txqueuelen:100 
          Interrupt:5 Base address:0x4400 

tulip-diag.c:v2.08 5/15/2001 Donald Becker (becker@scyld.com)
 http://www.scyld.com/diag/index.html
Index #1: Found a Digital DS21143 Tulip adapter at 0xdc00.
 * A potential Tulip chip has been found, but it appears to be active.
 * Either shutdown the network, or use the '-f' flag to see all values.
Digital DS21143 Tulip chip registers at 0xdc00:
 0x00: f8a08000 ffffffff ffffffff 0d31b000 0d31b200 f0660000 b20ee002 fbfffbff
 Port selection is MII, half-duplex.
 Transmit started, Receive started, half-duplex.
  The Rx process state is 'Waiting for packets'.
  The Tx process state is 'Idle'.
  The transmit threshold is 1024.
  The NWay status register is 000000c6.
Index #2: Found a Digital DS21143 Tulip adapter at 0xd880.
 * A potential Tulip chip has been found, but it appears to be active.
 * Either shutdown the network, or use the '-f' flag to see all values.
Digital DS21143 Tulip chip registers at 0xd880:
 0x00: f8a08000 ffffffff ffffffff 0d31a800 0d31aa00 f0660000 b20e2002 fbfffbff
 Port selection is MII, half-duplex.
 Transmit started, Receive started, half-duplex.
  The Rx process state is 'Waiting for packets'.
  The Tx process state is 'Idle'.
  The transmit threshold is 128.
  The NWay status register is 000000c6.
Index #3: Found a Digital DS21143 Tulip adapter at 0xd800.
 * A potential Tulip chip has been found, but it appears to be active.
 * Either shutdown the network, or use the '-f' flag to see all values.
Digital DS21143 Tulip chip registers at 0xd800:
 0x00: f8a08000 ffffffff ffffffff 0d31a000 0d31a200 f0660000 b20ee002 fbfffbff
 Port selection is MII, half-duplex.
 Transmit started, Receive started, half-duplex.
  The Rx process state is 'Waiting for packets'.
  The Tx process state is 'Idle'.
  The transmit threshold is 1024.
  The NWay status register is 000000c6.
Index #4: Found a Digital DS21143 Tulip adapter at 0xd480.
Digital DS21143 Tulip chip registers at 0xd480:
 0x00: f8000000 ffffffff ffffffff 08400202 e3f7f0ee f0000000 b20e0000 f3fe0000
 0x40: e0000000 ffffcbf8 ffffffff 00000000 000000c6 ffff0000 fff80000 8ff00000
 Port selection is MII, half-duplex.
 Transmit stopped, Receive stopped, half-duplex.
  The Rx process state is 'Stopped'.
  The Tx process state is 'Stopped'.
  The transmit threshold is 128.
  The NWay status register is 000000c6.
EEPROM 64 words, 6 address bits.
PCI Subsystem IDs, vendor 1186, device 1112.
CardBus Information Structure at offset 00000000.
Ethernet MAC Station Address 00:80:C8:CF:BB:64.
EEPROM transceiver/media description table.
Leaf node at offset 30, default media type 0800 (Autosense).
 1 transceiver description blocks:
  Media MII, block type 3, length 13.
   MII interface PHY 0 (media type 11).
   21143 MII initialization sequence is 0 words:.
   21143 MII reset sequence is 0 words:.
    Media capabilities are 7800, advertising 01e1.
    Full-duplex map 5000, Threshold map 1800.
    No MII interrupt.
 MII PHY found at address 1, status 0x7849.
 MII PHY #1 transceiver registers:
   3100 7849 2000 5c10 01e1 0000 0004 2001
   0000 0000 0000 0000 0000 0000 0000 0000
   0200 0000 0000 0000 0000 0000 0020 0000
   0000 0001 002b 0100 0006 0f00 0000 0000.
  Internal autonegotiation state is 'Autonegotiation disabled'.

Thanks!

- Bhavesh

Donald Becker wrote:
> 
> On Thu, 28 Mar 2002, Bhavesh P. Davda wrote:
> 
> > What happens if all transmit descriptors currently queued for the 21143
> > don't have the TDES1<31> bit set for interrupt on completion?
> 
> The interrupt handler isn't called.
> 
> The Tulip driver uses this logic to reduce the number of Tx-done
> interrupts.
>         if (q_used_cnt < TX_QUEUE_LEN/2) {/* Typical path */
>                 flag = 0x60000000; /* No interrupt */
>         } else if (q_used_cnt == TX_QUEUE_LEN/2) {
>                 flag = 0xe0000000; /* Tx-done intr. */
>         } else if (q_used_cnt < TX_QUEUE_LEN) {
>                 flag = 0x60000000; /* No Tx-done intr. */
>         } else {
>                 tp->tx_full = 1;
>                 flag = 0xe0000000; /* Tx-done intr. */
>         }
>         if (entry == TX_RING_SIZE-1)
>                 flag = 0xe0000000 | DESC_RING_WRAP;
> 
> The last check ensures that we always raise an interrupt on a ring wrap.  This
> eliminates the chance that a timed transmit pattern will fill the Tx
> ring and stop the transmitter until the TxNoBuf interrupt is handled.
> 
> > The HRM seems to indicate that in that case CSR5 won't have TxIntr set.
> > Then, how does the tulip driver know about completion of these
> > transmitted frames? i.e. how does it update tx_packets and tx_bytes
> > correctly?
> 
> It's pretty easy to read the code to find this:
> 
>         if (csr5 & (TxNoBuf | TxDied | TxIntr)) {
> 
> When the Tulip runs out of packets to transmit it will immediately raise
> an interrupt.  Handling this interrupt ensures that the statistics will
> be accurate when the transmitter is idle, and lag only slightly (due to
> interrupt mitigation) when the transmitter is active.
> 
> Some of my other drivers unconditionally scavenge the Tx queue entries
> on any type of interrupt.  This approach to TxDone interrupt
> minimization works well because few protocols continuously transmit
> without receiving any packets.  But I never rely entirely on this
> approach -- the driver should not allow a skbuff to remain on the
> transmit queue long after it has been actually transmitted.  The obvious
> bad case is sending ARP packets without getting a response.  The ARP
> code wants its skbuff back in order to try again!
> 
> --
> Donald Becker                           becker@scyld.com
> Scyld Computing Corporation             http://www.scyld.com
> 410 Severn Ave. Suite 210               Second Generation Beowulf Clusters
> Annapolis MD 21403                      410-990-9993
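
For reference, the descriptor-flag selection and the CSR5 test quoted
above can be pulled out into a small standalone C sketch. This is not the
driver's actual code layout: the function names tx_desc_flag() and
tx_needs_scavenge() and the struct tx_state are made up for illustration,
and the values used for TX_RING_SIZE, TX_QUEUE_LEN, DESC_RING_WRAP and
the CSR5 bit masks are assumptions that should be checked against the
tulip.c version and the 21143 HRM actually in use.

```c
#include <stdint.h>

/* Assumed values; verify against the tulip.c version in use. */
#define TX_RING_SIZE    16
#define TX_QUEUE_LEN    10
#define DESC_RING_WRAP  0x02000000u  /* assumed end-of-ring bit in TDES1 */
#define DESC_TX_INTR    0x80000000u  /* TDES1<31>: interrupt on completion */

/* Assumed CSR5 status bits. */
#define TxIntr   0x01u  /* a descriptor with TDES1<31> set has completed */
#define TxDied   0x02u  /* transmit process stopped */
#define TxNoBuf  0x04u  /* transmitter ran out of descriptors */

/* Hypothetical stand-in for the driver's private state. */
struct tx_state {
    int tx_full;
};

/*
 * Pick the TDES1 flag bits for the next descriptor, following the logic
 * quoted above: request a Tx-done interrupt only at the half-full mark,
 * when the queue becomes full, and always on the last ring entry.
 */
static uint32_t tx_desc_flag(unsigned q_used_cnt, unsigned entry,
                             struct tx_state *tp)
{
    uint32_t flag;

    if (q_used_cnt < TX_QUEUE_LEN / 2) {
        flag = 0x60000000u;              /* typical path: no interrupt */
    } else if (q_used_cnt == TX_QUEUE_LEN / 2) {
        flag = 0xe0000000u;              /* Tx-done interrupt */
    } else if (q_used_cnt < TX_QUEUE_LEN) {
        flag = 0x60000000u;              /* no Tx-done interrupt */
    } else {
        tp->tx_full = 1;                 /* queue is now full */
        flag = 0xe0000000u;              /* Tx-done interrupt */
    }
    if (entry == TX_RING_SIZE - 1)
        flag = 0xe0000000u | DESC_RING_WRAP;  /* always interrupt on wrap */

    return flag;
}

/* Nonzero if the descriptor flag asks for an interrupt on completion. */
static int tx_wants_interrupt(uint32_t flag)
{
    return (flag & DESC_TX_INTR) != 0;
}

/*
 * Nonzero if the interrupt status says completed Tx descriptors should
 * be reclaimed and the Tx statistics updated.
 */
static int tx_needs_scavenge(uint32_t csr5)
{
    return (csr5 & (TxNoBuf | TxDied | TxIntr)) != 0;
}
```

With these assumed values, a descriptor queued while fewer than five
entries are in use carries no completion interrupt, so its statistics are
only picked up when a later TxIntr, TxNoBuf or TxDied event causes the
ring to be scavenged.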