Crashes with 3c59x.c:v0.99L especially on DELL WS

Shon Martin Shon.Martin@oberlin.edu
Thu Aug 26 09:22:59 1999


Richard:
	That's odd. I've got a Dell WS 400, and haven't seen any problems
like the ones you've described. Have you tried running Dell's diagnostics?
Could something be wrong with the network?

-Shon

On Thu, 26 Aug 1999, Richard Black wrote:

> Date: Thu, 26 Aug 1999 09:55:53 +0100
> From: Richard Black <rjb@dcs.gla.ac.uk>
> To: linux-vortex-bug@beowulf.gsfc.nasa.gov
> Cc: rjb@dcs.gla.ac.uk
> Subject: Crashes with 3c59x.c:v0.99L especially on DELL WS
> 
> 
> Hello All,
> 
> I have a DELL WS 400 with:
> 
> 3c59x.c:v0.99L 5/28/99 Donald Becker http://cesdis.gsfc.nasa.gov/linux/drivers/vortex.html
> eth0: 3Com 3c905 Boomerang 100baseTx at 0xfcc0,  00:c0:4f:a3:10:1c, IRQ 14
>   8K word-wide RAM 3:5 Rx:Tx split, MII interface.
>   MII transceiver found at address 24, status 786f.
>   Enabling bus-master transmits and whole-frame receives.
> 
> 
> I was seeing serious problems with the 0.99H driver as shipped with the 2.2.x 
> kernel series. Specifically I was getting crashes on this class of machine 
> every day or so.  Sometimes I would get an on screen oops about bad reference 
> to address 00000070 and sometimes the machine would just freeze up.
> 
> I tracked the bad address 00000070 down to line 1660:
> 
> 	DEV_FREE_SKB(vp->tx_skb); /* Release the transfered buffer */
> 
> Sometimes this was attempting to free a NULL pointer as the skbuff, and the 
> skbuff code would then make a reference to base+0x70 as part of the freeing 
> operation.
> 
> I then changed that to:
> 
> 	if (vp->tx_skb) {
> 		DEV_FREE_SKB(vp->tx_skb); /* Release the transfered buffer */
> 		vp->tx_skb = NULL;
> 	} else
> 		printk(KERN_DEBUG "vortex would have crashed here (RJB)\n");
> 
> 
> and now what I see is that I get in syslog:
> 
> Aug 26 05:33:28 easter kernel: eth0: Too much work in interrupt, status e481.  Temporarily disabling functions (7b7e).
> Aug 26 05:34:09 easter kernel: eth0: transmit timed out, tx_status 00 status e900.
> Aug 26 05:34:14 easter kernel: eth0: transmit timed out, tx_status 00 status e000.
> Aug 26 05:34:49 easter last message repeated 7 times
> Aug 26 05:35:54 easter last message repeated 13 times
> Aug 26 05:36:44 easter last message repeated 10 times
> Aug 26 05:36:46 easter ypbind[380]: Host name lookup failure 
> Aug 26 05:36:49 easter kernel: eth0: transmit timed out, tx_status 00 status e000.
> Aug 26 05:37:24 easter last message repeated 7 times
> Aug 26 05:38:29 easter last message repeated 13 times
> Aug 26 05:39:24 easter last message repeated 11 times
> Aug 26 05:39:26 easter ypbind[380]: Host name lookup failure 
> Aug 26 05:39:29 easter kernel: eth0: transmit timed out, tx_status 00 status e000.
> Aug 26 05:40:04 easter last message repeated 7 times
> Aug 26 05:41:09 easter last message repeated 13 times
> Aug 26 05:42:14 easter last message repeated 13 times
> Aug 26 05:42:34 easter last message repeated 4 times
> Aug 26 05:42:36 easter ypbind[380]: Host name lookup failure 
> Aug 26 05:42:39 easter kernel: eth0: transmit timed out, tx_status 00 status e000.
> Aug 26 05:43:14 easter last message repeated 7 times
> Aug 26 05:44:19 easter last message repeated 13 times
> Aug 26 05:45:14 easter last message repeated 11 times
> Aug 26 05:45:17 easter ypbind[380]: Host name lookup failure 
> 
> 
> I now recognise the (7b7e): on some of the earlier crashes, it was the only 
> thing visible on the screen above the Oops message.
> 
> 
> Can anyone provide a fix for these bugs?
> 
> Thanks,
> 
> Richard.
> 