ARP packets lost with Netgear FA310TX?

William J. Earl wje@fir.engr.sgi.com
Tue May 18 21:42:31 1999


     I have a Netgear FA310TX (PCI card with 82c168 PNIC) installed in
a system running a Linux 2.2.9 kernel, using the v0.91 version of tulip.c
as a module.
With debug=3, these are the registration messages:

May 18 11:42:46 tanoak kernel: eth0: Lite-On 82c168 PNIC rev 33 at 0xea00, 00:A0:CC:3E:0B:32, IRQ 11. 
May 18 11:42:46 tanoak kernel: eth0:  MII transceiver #1 config 3100 status 782d advertising 01e1. 
May 18 11:42:51 tanoak kernel: eth0: The transmitter stopped.  CSR5 is 2678016, CSR6 814e2002, new CSR6 814e0000. 
May 18 11:42:51 tanoak kernel: eth0: Changing PNIC configuration to half-duplex, CSR6 814e0000. 

and the ifconfig output:

eth0      Link encap:Ethernet  HWaddr 00:A0:CC:3E:0B:32  
          inet addr:150.166.40.91  Bcast:150.166.40.255  Mask:255.255.255.0
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:11980 errors:0 dropped:0 overruns:0 frame:0
          TX packets:1693 errors:4 dropped:0 overruns:0 carrier:4
          collisions:83 txqueuelen:100 
          Interrupt:11 Base address:0xea00 

My problem is that the system sometimes fails to see an ARP reply which arrives
very quickly after the request.  This trace is of the packets from tanoak,
the Linux system, doing an ARP request, and ether-babylon, an SGI MIPS/IRIX system,
providing the ARP reply.  The trace was taken on a different SGI MIPS/IRIX system.

18:15:22.03399       tanoak -> (broadcast)  ARP C Who is 150.166.40.100, ether-babylon ?
18:15:22.03462 ether-babylon -> tanoak       ARP R 150.166.40.100, ether-babylon is 8:0:69:2:49:4
18:15:23.02916       tanoak -> (broadcast)  ARP C Who is 150.166.40.100, ether-babylon ?
18:15:23.02941 ether-babylon -> tanoak       ARP R 150.166.40.100, ether-babylon is 8:0:69:2:49:4
18:15:23.24932       tanoak -> *            ARP C Who is 150.166.40.28, pcillini ?
18:15:23.24950     pcillini -> tanoak       ARP R 150.166.40.28, pcillini is 0:60:8:9c:fe:6a
18:15:24.02919       tanoak -> (broadcast)  ARP C Who is 150.166.40.100, ether-babylon ?
18:15:24.02945 ether-babylon -> tanoak       ARP R 150.166.40.100, ether-babylon is 8:0:69:2:49:4

A tcpdump on tanoak (150.166.40.91) shows the outgoing messages, and the
missing replies for ether-babylon:

18:15:21.777124 0:a0:cc:3e:b:32 ff:ff:ff:ff:ff:ff 0806 42: arp who-has 150.166.40.100 tell 150.166.40.91
18:15:21.777568 0:a0:cc:3e:b:32 ff:ff:ff:ff:ff:ff 0806 60: arp who-has 150.166.40.100 tell 150.166.40.91
18:15:22.772080 0:a0:cc:3e:b:32 ff:ff:ff:ff:ff:ff 0806 42: arp who-has 150.166.40.100 tell 150.166.40.91
18:15:22.772508 0:a0:cc:3e:b:32 ff:ff:ff:ff:ff:ff 0806 60: arp who-has 150.166.40.100 tell 150.166.40.91
18:15:22.992078 0:a0:cc:3e:b:32 0:60:8:9c:fe:6a 0806 42: arp who-has 150.166.40.28 tell 150.166.40.91
18:15:22.992442 0:60:8:9c:fe:6a 0:a0:cc:3e:b:32 0806 60: arp reply 150.166.40.28 is-at 0:60:8:9c:fe:6a
18:15:23.772078 0:a0:cc:3e:b:32 ff:ff:ff:ff:ff:ff 0806 42: arp who-has 150.166.40.100 tell 150.166.40.91
18:15:23.772507 0:a0:cc:3e:b:32 ff:ff:ff:ff:ff:ff 0806 60: arp who-has 150.166.40.100 tell 150.166.40.91

The timestamps on the two systems are not precisely synchronized.  From a more
detailed trace, I know that the IRIX system sees only 60-byte packets from
tanoak, not 42-byte packets, so I don't know what happens with the latter.
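
For what it is worth, the two lengths are at least consistent with the same
ARP request seen before and after padding to the Ethernet minimum. The sketch
below only checks that arithmetic, assuming the standard header sizes; the
constant and function names are made up for illustration, and it does not
explain why tcpdump shows both copies.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative constants (names are hypothetical, sizes are standard). */
enum {
	SKETCH_ETH_HEADER_LEN = 14, /* dst MAC (6) + src MAC (6) + ethertype (2) */
	SKETCH_ARP_IPV4_LEN   = 28, /* fixed ARP header (8) + 2 * (MAC 6 + IPv4 4) */
	SKETCH_ETH_MIN_NO_FCS = 60  /* 64-byte minimum frame minus the 4-byte FCS */
};

/* Frame length, FCS excluded, once short frames are padded to the minimum. */
static size_t sketch_padded_len(size_t len)
{
	return len < SKETCH_ETH_MIN_NO_FCS ? SKETCH_ETH_MIN_NO_FCS : len;
}
```

With these values, SKETCH_ETH_HEADER_LEN + SKETCH_ARP_IPV4_LEN gives the 42
bytes of the unpadded request, and sketch_padded_len(42) gives the 60 bytes
that the IRIX system reports.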

    As you can see, the request and reply are close in time, but not extremely
so (about 300 to 600 us.).  On other occasions, the ARP exchange works just fine.
For some reason, eth0 simply does not see some of the ARP packets.  From the 
statistics, you can see that errors are very low.  (The only errors were
from when I disconnected the cable as an experiment.)  
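
    To make the request-to-reply gaps concrete, here is a small sketch that
computes them from the timestamps in the first trace. The helper names are
hypothetical; the only assumption is that the five-digit fraction in that
trace is in units of 10 us.

```c
#include <assert.h>

/*
 * Convert a trace timestamp to microseconds.  "sec" is the seconds field
 * and "frac_10us" is the five-digit fraction written without its leading
 * zero (so .03399 is passed as 3399, to avoid a C octal literal).
 */
static long trace_us(long sec, long frac_10us)
{
	return sec * 1000000L + frac_10us * 10L;
}

/* Gap between a request and its reply, in microseconds. */
static long gap_us(long req_sec, long req_frac, long rep_sec, long rep_frac)
{
	return trace_us(rep_sec, rep_frac) - trace_us(req_sec, req_frac);
}
```

For example, gap_us(22, 3399, 22, 3462) covers the first ether-babylon
exchange and gap_us(23, 24932, 23, 24950) covers the pcillini exchange.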

    Netgear supplied a modified tulip.c with their card, based on an older
tulip.c, but that driver does not work as well as v0.91, in that it gets
errors and seems to be responsible for kernel hangs (at least in that I don't
seem to get kernel hangs with v0.91).  

    Does anyone have an idea what might be going wrong here?  

    The BSD driver for the PNIC has a workaround for a bug in the
PNIC:

/*
 * Grrrrr.
 * The PNIC chip has a terrible bug in it that manifests itself during
 * periods of heavy activity. The exact mode of failure is difficult to
 * pinpoint: sometimes it only happens in promiscuous mode, sometimes it
 * will happen on slow machines. The bug is that sometimes instead of
 * uploading one complete frame during reception, it uploads what looks
 * like the entire contents of its FIFO memory. The frame we want is at
 * the end of the whole mess, but we never know exactly how much data has
 * been uploaded, so salvaging the frame is hard.
 *
 * There is only one way to do it reliably, and it's disgusting.
 * Here's what we know:
 *
 * - We know there will always be somewhere between one and three extra
 *   descriptors uploaded.
 *
 * - We know the desired received frame will always be at the end of the
 *   total data upload.
 *
 * - We know the size of the desired received frame because it will be
 *   provided in the length field of the status word in the last descriptor.
 *
 * Here's what we do:
 *
 * - When we allocate buffers for the receive ring, we bzero() them.
 *   This means that we know that the buffer contents should be all
 *   zeros, except for data uploaded by the chip.
 *
 * - We also force the PNIC chip to upload frames that include the
 *   ethernet CRC at the end.
 *
 * - We gather all of the bogus frame data into a single buffer.
 *
 * - We then position a pointer at the end of this buffer and scan
 *   backwards until we encounter the first non-zero byte of data.
 *   This is the end of the received frame. We know we will encounter
 *   some data at the end of the frame because the CRC will always be
 *   there, so even if the sender transmits a packet of all zeros,
 *   we won't be fooled.
 *
 * - We know the size of the actual received frame, so we subtract
 *   that value from the current pointer location. This brings us
 *   to the start of the actual received packet.
 *
 * - We copy this into an mbuf and pass it on, along with the actual
 *   frame length.
 *
 * The performance hit is tremendous, but it beats dropping frames all
 * the time.
 */

...

		if (sc->pn_rx_war) {
			if ((rxstat & PN_WHOLEFRAME) != PN_WHOLEFRAME) {
				if (rxstat & PN_RXSTAT_FIRSTFRAG)
					sc->pn_rx_bug_save = cur_rx;
				if ((rxstat & PN_RXSTAT_LASTFRAG) == 0)
					continue;
				pn_rx_bug_war(sc, cur_rx);
				rxstat = cur_rx->pn_ptr->pn_status;
			}
		}
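
As I read the comment, the salvage step amounts to something like the
following. This is only my own minimal sketch of the backwards scan, not the
BSD code: the function name and arguments are hypothetical, and it assumes
the buffer was zeroed before the chip wrote into it and that the reported
frame length includes the CRC.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: locate the real frame inside a zero-filled buffer
 * that the chip has over-filled.  "framelen" is the length reported by the
 * last descriptor, CRC included.  Returns a pointer to the first byte of
 * the frame, or NULL if the reported length does not fit the uploaded data.
 */
static const unsigned char *
find_frame_start(const unsigned char *buf, size_t buflen, size_t framelen)
{
	size_t end = buflen;

	/* Scan backwards over the tail that is still all zeros. */
	while (end > 0 && buf[end - 1] == 0)
		end--;

	/* "end" is now one past the last non-zero byte, taken as frame end. */
	if (framelen == 0 || framelen > end)
		return NULL;

	/* Step back by the frame length to reach the start of the frame. */
	return buf + (end - framelen);
}
```

Note that this naive version is fooled when the last byte of the CRC happens
to be zero, so a real driver would need some extra handling for that case.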

I haven't yet worked out what would happen in tulip.c if this condition
occurs (that is, whether tulip.c would log a warning, increment an error
count, or simply discard the packet).  Any opinions in that regard?