Simplex receive problems with 100baseFx

Sun Dec 13 20:45:30 1998

Greetings,

I have a couple of machines with Digital DE500-FA (21143 - 100baseFx)
NICs configured for simplex operation (i.e. only one fibre connected, no
back channel at all). I've had to modify Donald's driver to use the old
tulip_timer call rather than the newer t21142_timer call (since NWay
obviously doesn't work with simplex :).

I've now got a problem with incoming data on the RX end of the link.
After a while (anywhere between 2000 and 100000 packets received),
incoming data just stops. netstat -ci reports that the packets are
simply being dropped. I'm running Linux 2.1.129 with large (10 MB) IP
receive buffers, and I'm getting failures in the IP fragmentation code
in the kernel and in the driver in the form of errors like:

IP: queue_glue: no memory for gluing queue c2b4cd20

(repeats ad nauseam), shortly followed by:

eth1: 21143 link change. CSR5 = f86980c0.
eth1: 21143 link status interrupt 000000c4, CSR5 = f8680000
eth1: 21143 100baseTx sensed media.

Now, according to the 21143 Hardware Reference Manual, this CSR5 value
is telling me that an abnormal interrupt was raised because no receive
buffer was available, and that the chip has therefore suspended the Rx
process.
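
A minimal sketch of that CSR5 decoding follows. The bit positions are an
assumption based on my reading of the 21143 manual (RI = bit 6, RU = bit 7,
AIS = bit 15, NIS = bit 16, receive state in bits 19:17, transmit state in
bits 22:20), and the helper names are hypothetical, not part of tulip.c.

```c
#include <stdint.h>
#include <stdio.h>

/* CSR5 bit positions as assumed from the 21143 manual.
 * Only the bits relevant to this report are listed. */
#define CSR5_RI       (1u << 6)    /* receive interrupt */
#define CSR5_RU       (1u << 7)    /* receive buffer unavailable */
#define CSR5_AIS      (1u << 15)   /* abnormal interrupt summary */
#define CSR5_NIS      (1u << 16)   /* normal interrupt summary */
#define CSR5_RS_SHIFT 17           /* receive process state, bits 19:17 */
#define CSR5_TS_SHIFT 20           /* transmit process state, bits 22:20 */
#define CSR5_RS_SUSPENDED 4u       /* RS value: suspended, no Rx buffer */

/* Extract the 3-bit receive process state. */
unsigned csr5_rx_state(uint32_t csr5)
{
	return (csr5 >> CSR5_RS_SHIFT) & 0x7u;
}

/* Extract the 3-bit transmit process state. */
unsigned csr5_tx_state(uint32_t csr5)
{
	return (csr5 >> CSR5_TS_SHIFT) & 0x7u;
}

/* Print the decoded fields of one CSR5 value (hypothetical helper). */
void csr5_dump(uint32_t csr5)
{
	printf("CSR5 %08lx: RI=%d RU=%d AIS=%d NIS=%d RS=%u TS=%u\n",
	       (unsigned long)csr5,
	       !!(csr5 & CSR5_RI), !!(csr5 & CSR5_RU),
	       !!(csr5 & CSR5_AIS), !!(csr5 & CSR5_NIS),
	       csr5_rx_state(csr5), csr5_tx_state(csr5));
}
```

Under those assumptions, f86980c0 has RU and AIS set with a receive state
of 4 (suspended), and f8680000 still shows a receive state of 4 after the
interrupt bits have been cleared.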

Next, t21142_lnk_change is called because AIS is set in CSR5, but the
chain of if / else if clauses falls through to the last else because no
other conditions are met (I have forced the media selection to
100baseFx-FD i.e. dev->if_port == 8), and this else clause resets the
card to 100baseTx!
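
To make the fall-through concrete, here is a stand-alone sketch of the
behaviour I would expect instead: a medium forced by the user survives a
link-change event, and only an unforced port drops to the default. The
struct, the media_forced flag and the function name are all hypothetical.
Media indices 7 and 8 come from the dmesg output further down; 3 for
100baseTx is an assumption about the driver's media table.

```c
/* Media indices: 7 and 8 appear in the dmesg output in this post;
 * 3 for 100baseTx is an assumption about the driver's media table. */
enum {
	MEDIA_100BASETX    = 3,
	MEDIA_100BASEFX    = 7,
	MEDIA_100BASEFX_FD = 8
};

/* Hypothetical stand-in for the driver's per-device state. */
struct fake_dev {
	int if_port;       /* currently selected medium */
	int media_forced;  /* nonzero if the user chose if_port explicitly */
};

/* Hypothetical link-change policy: keep a user-forced medium and only
 * fall back to the default when nothing was forced. Returns the medium
 * selected after the event. */
int link_change_select(struct fake_dev *dev)
{
	if (dev->media_forced) {
		/* Keep the forced medium; a real handler would
		 * reprogram the chip for dev->if_port here. */
		return dev->if_port;
	}
	/* Mirrors the final else clause described above. */
	dev->if_port = MEDIA_100BASETX;
	return dev->if_port;
}
```

With if_port = 8 and media_forced set, the function returns 8; with
nothing forced it selects 3, which is what the unmodified handler
appears to do in every case.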

I have added the following branch to t21142_lnk_change in an effort to
force 100baseFx-FD, but it still doesn't work:

else if (dev->if_port == 8) {
	printk(...)
	tp->csr6 = 0x838E0200; /* PCS=1, HBD=1, PS=1, FD=1 */
	outl(0x0000677D, ioaddr + CSR14); /* largely irrelevant since NWay
					     not enabled */
	outl(0x0300, ioaddr + CSR12); /* not MII */
	outl(tp->csr6 | 0x0002, ioaddr + CSR6); /* start Rx */
	outl(tp->csr6 | 0x2002, ioaddr + CSR6); /* start Tx */
}
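
As a cross-check of the CSR6 values used above, the sketch below names the
bits mentioned in the comment and the two start bits. The positions are
assumed from my reading of the 21143 manual, and the macro and function
names are hypothetical.

```c
#include <stdint.h>

/* CSR6 bits named in the comment above, plus the start bits.
 * Positions are assumed from the 21143 manual. */
#define CSR6_SR   0x00000002u  /* start/stop receive */
#define CSR6_FD   0x00000200u  /* full duplex */
#define CSR6_ST   0x00002000u  /* start/stop transmit */
#define CSR6_PS   0x00040000u  /* port select */
#define CSR6_HBD  0x00080000u  /* heartbeat disable */
#define CSR6_PCS  0x00800000u  /* PCS function */

/* Value to write to CSR6 to start the receiver only. */
uint32_t csr6_start_rx(uint32_t csr6)
{
	return csr6 | CSR6_SR;
}

/* Value to write to CSR6 to run both receiver and transmitter. */
uint32_t csr6_start_rxtx(uint32_t csr6)
{
	return csr6 | CSR6_SR | CSR6_ST;
}

/* Nonzero if every bit in mask is set in csr6. */
int csr6_has(uint32_t csr6, uint32_t mask)
{
	return (csr6 & mask) == mask;
}
```

With these definitions, 0x838E0200 does contain PCS, HBD, PS and FD, and
the two writes resolve to 0x838E0202 and 0x838E2202 respectively.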

Here's some background info:

bash# insmod tulip options=8 debug=3 rx_copybreak=0

dmesg output:

Found Digital DS21143 Tulip at PCI I/O address 0xb800.
tulip.c:v0.89K 8/8/98 becker@cesdis.gsfc.nasa.gov
eth1: Digital DS21143 Tulip at 0xb800, 00 00 f8 08 a8 bb, IRQ 11.
read_eeprom:
1011 500f 0000 0000 0000 0000 0000 0000
0049 0103 0000 08f8 bba8 4100 4400 3545
3030 462d 2341 0008 0000 0000 0000 0000
ac00 00ac 0000 0000 0000 0000 0000 0000
0700 0200 0488 af07 0508 2100 8880 0804
08af 0005 8021 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 d6e0
eth1:  EEPROM default media type 100baseFx.
eth1:  Index #0 - Media 100baseFx (#7) described by a 21143 SYM PHY (4) block.
eth1:  Index #1 - Media 100baseFx-FD (#8) described by a 21143 SYM PHY (4) block.
eth1: tulip_open() irq 9.
eth1: Using user-specified media 100baseFx-FD.
eth1: Done tulip_open(), CSR0 ffa04800, CSR5 f0360000 CSR6 b2862202.

Questions
---------

1. What sort of strategies should I employ to prevent the memory
exhaustion of the skbuffs on the Rx end?

2. Given that the skbuff unavailability may be temporary, how can I get
the driver to re-enable the 100baseFx-FD mode after an AIS interrupt?

-- 
Regards,

Sam.

(samm at
vsl dot
com dot au)

Senior Software Engineer,
Vision Abell Pty. Ltd.
http://www.vsl.com.au/abell/

-----------------------------------------------------------------------
Look in the mirror, and don't be tempted to equate transient domination
with either intrinsic superiority or prospects for extended survival.
				-- Stephen Jay Gould, "Life's Grandeur"