[vortex] 3c905CX-TXM

Andrew Morton andrewm@uow.edu.au
Fri, 24 Nov 2000 21:58:25 +1100


Richard Gooch wrote:
> 
> OK. Oh, and I forgot in my last message: thanks for coming up with a
> fix.

s/fix/successful experiment/

I don't know what's going on.  There's some deja vu here.  We've been
slowly incrementing that timeout as more problems like this crop up.

It underwent a large increase when we discovered that the command which
stalls the download engine can take a long time to complete when there
is a massive collision rate.  Again, this is inexplicable because the
command could and should complete immediately because of the way in
which the various threshold registers are programmed.

I'd be very surprised if 3Com have suddenly changed the ASICs after
five years, so perhaps there's some interaction with the external
transceiver or somesuch which causes the reset to take a long time.

hmm..  dhinds has changed 3c575_cb.c so that it prints out the ASIC
rev numbers. Cut, paste...

> BTW: is there any way to make these waits non-busy? This is just
> another thing that will screw over scheduling latency. If not, people
> who care about latency will be avoiding 3Com cards...

That's normally OK.  It normally terminates within 0-2 PCI cycles.
It's just these weird cases where it can take a long time.  As I recall,
timeouts >2,000 PCI cycles occurred every ten minutes or so under
absolutely wicked testing conditions.

So if you can apply the below 2.4 patch and send the debug output
after a bit of usage (insmod/rmmod/up/down/etc) that would be
helpful.  It may be necessary to special-case the RxReset with
some schedule_timeout()s.  But what worries me is:

1: We also do RxReset from within the error handler.  Possibly at
   interrupt time.  What to do there?

2: Does it affect other commands?

3: We don't know what's going on.
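
For concern 1, one possible shape is a two-phase wait: spin briefly for
the common case, then sleep between polls only when the caller is
allowed to sleep.  The following is a userspace sketch, not driver code.
fake_nic, fake_read_status(), fake_sleep() and wait_cmd_done() are all
hypothetical names, fake_sleep() merely stands in for schedule_timeout(),
and the busy-bit value is assumed to match CmdInProgress in 3c59x.c.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumed to match CmdInProgress (bit 12 of the status register). */
#define CMD_IN_PROGRESS 0x1000u

/* Hypothetical stand-in for the NIC: busy for a fixed number of reads. */
struct fake_nic {
	unsigned int busy_reads_left;
	unsigned int sleeps;	/* how often we would have slept */
};

static unsigned int fake_read_status(struct fake_nic *nic)
{
	if (nic->busy_reads_left > 0) {
		nic->busy_reads_left--;
		return CMD_IN_PROGRESS;
	}
	return 0;
}

/* Stand-in for schedule_timeout(): only counts the call. */
static void fake_sleep(struct fake_nic *nic)
{
	nic->sleeps++;
}

enum wait_result { WAIT_DONE_FAST, WAIT_DONE_SLOW, WAIT_TIMED_OUT };

/*
 * Phase 1: busy-poll up to spin_limit times (the normal, fast case).
 * Phase 2: poll up to slow_limit more times, sleeping before each
 * poll if may_sleep is true.  With may_sleep false (for example at
 * interrupt time) phase 2 is just more spinning.
 */
static enum wait_result wait_cmd_done(struct fake_nic *nic, bool may_sleep,
				      unsigned int spin_limit,
				      unsigned int slow_limit)
{
	unsigned int i;

	assert(nic != NULL);

	for (i = 0; i < spin_limit; i++) {
		if (!(fake_read_status(nic) & CMD_IN_PROGRESS))
			return WAIT_DONE_FAST;
	}

	for (i = 0; i < slow_limit; i++) {
		if (may_sleep)
			fake_sleep(nic);
		if (!(fake_read_status(nic) & CMD_IN_PROGRESS))
			return WAIT_DONE_SLOW;
	}

	return WAIT_TIMED_OUT;
}
```

In this sketch a caller in process context (open/close) would pass
may_sleep = true.  The error handler at interrupt time would have to
pass false and fall back to spinning, so the sketch does not answer
the question of what to do there.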

Incidentally, we still don't know what's going on with Berkan's NIC.
I can certainly understand it going weird if the initialisation
isn't waiting long enough for the RxReset to complete.  But it should
have worked with 3c90x.c because that driver uses a 1-second
timeout.

Berkan, could I suggest that you go back to the original driver (0.99Ra)
and increase all the loop counts?  Just do a search for 'CmdInProgress'
and replace all the magical constants (2000, 200, 600) with 2000000 and see
if it starts working.
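
As a rough illustration of that experiment, here is a userspace sketch
of a CmdInProgress poll loop that uses a single named limit instead of
scattered constants.  It also returns the number of busy polls, so a
slow completion can be spotted.  fake_status_reg, read_fake_status()
and poll_cmd_done() are hypothetical, and the busy-bit value is assumed
to match CmdInProgress in the driver.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed to match CmdInProgress (bit 12 of the status register). */
#define CMD_IN_PROGRESS 0x1000u

/* One limit in place of the separate 2000, 200 and 600 constants. */
#define WAIT_LIMIT 2000000L

/* Hypothetical status register: busy for a fixed number of reads. */
struct fake_status_reg {
	long busy_reads_left;
};

static unsigned int read_fake_status(struct fake_status_reg *reg)
{
	if (reg->busy_reads_left > 0) {
		reg->busy_reads_left--;
		return CMD_IN_PROGRESS;
	}
	return 0;
}

/*
 * Poll until the busy bit clears.  Returns the number of busy polls
 * seen before completion, or -1 if the limit was reached first.
 */
static long poll_cmd_done(struct fake_status_reg *reg, long limit)
{
	long i;

	assert(reg != NULL);

	for (i = 0; i < limit; i++) {
		if (!(read_fake_status(reg) & CMD_IN_PROGRESS))
			return i;
	}
	return -1;
}
```

With the old limit of 2000, a command that stays busy for 5000 polls
is reported as a failure.  With WAIT_LIMIT it completes and the return
value shows how long it really took.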

Richard, silly patch which shows us where the big delays are happening:


--- linux-2.4.0-test11-ac2/drivers/net/3c59x.c	Tue Nov 21 20:11:20 2000
+++ linux-akpm/drivers/net/3c59x.c	Fri Nov 24 21:53:42 2000
@@ -203,7 +203,7 @@
 #include <linux/delay.h>
 
 static char version[] __devinitdata =
-"3c59x.c:LK1.1.11 13 Nov 2000  Donald Becker and others. http://www.scyld.com/network/vortex.html " "$Revision: 1.102.2.46 $\n";
+"3c59x.c:LK1.1.11 13 Nov 2000  Donald Becker and others. http://www.scyld.com/network/vortex.html " "$Revision: 1.102.2.40 $\n";
 
 MODULE_AUTHOR("Donald Becker <becker@scyld.com>");
 MODULE_DESCRIPTION("3Com 3c59x/3c90x/3c575 series Vortex/Boomerang/Cyclone driver");
@@ -863,7 +863,7 @@
 	struct vortex_private *vp;
 	int option;
 	unsigned int eeprom[0x40], checksum = 0;		/* EEPROM contents */
-	int i;
+	int i, step;
 	struct net_device *dev;
 	static int printed_version;
 	int retval;
@@ -1025,6 +1025,13 @@
 			   dev->irq);
 #endif
 
+	EL3WINDOW(4);
+	step = (inb(ioaddr + Wn4_NetDiag) & 0x1e) >> 1;
+	printk(KERN_INFO "  product code '%c%c' rev %02x.%d date %02d-"
+		   "%02d-%02d\n", eeprom[6]&0xff, eeprom[6]>>8, eeprom[0x14],
+		   step, (eeprom[4]>>5) & 15, eeprom[4] & 31, eeprom[4]>>9);
+
+
 	if (pdev && vci->drv_flags & HAS_CB_FNS) {
 		unsigned long fn_st_addr;			/* Cardbus function status space */
 		unsigned short n;
@@ -1148,14 +1155,19 @@
 	return retval;
 }
 
-static void wait_for_completion(struct net_device *dev, int cmd)
+#define wait_for_completion(dev, cmd) _wait_for_completion(dev, cmd, __LINE__)
+
+static void _wait_for_completion(struct net_device *dev, int cmd, int line)
 {
-	int i = 4000;
+	int i;
 
 	outw(cmd, dev->base_addr + EL3_CMD);
-	while (--i > 0) {
-		if (!(inw(dev->base_addr + EL3_STATUS) & CmdInProgress))
+	for (i = 0; i < 4000000; i++) {
+		if (!(inw(dev->base_addr + EL3_STATUS) & CmdInProgress)) {
+			if (i > 1000)
+				printk("wait_for_completion: line=%d, count=%d\n", line, i);
 			return;
+		}
 	}
 	printk(KERN_ERR "%s: command 0x%04x did not complete! Status=0x%x\n",
 			   dev->name, cmd, inw(dev->base_addr + EL3_STATUS));