de4x5 (2.3.99-pre6) crash on Xp1000 alpha

Mr. Berkley Shands berkley@cs.wustl.edu
Fri Apr 28 14:52:15 2000



Bug report as follows:

-- 
               berkley@cs.wustl.edu	"Software Enforcer"

[1.] One line summary of the problem:
	de4x5.c: kernel fault dereferencing a NULL bus pointer during the PCI probe

[2.] Full description of the problem/report:
	The XP1000 (Monet, Alpha 21264, 512 MB) crashes on all
	2.3.99* kernels while probing the buses for network cards,
	so the kernel never finishes booting.
	This error was reported (incorrectly :-) before. Here is
	the fix...

[3.] Keywords (i.e., modules, networking, kernel):
	redhat 6.1, network, de4x5, alpha, crash

[4.] Kernel version (from /proc/version): 2.3.99-pre6

[5.] Output of Oops.. message (if applicable) with symbolic information 
     resolved (see Documentation/oops-tracing.txt)
[6.] A small shell script or example program which triggers the
     problem (if possible)
[7.] Environment
[7.1.] Software (add the output of the ver_linux script here)

-- Versions installed: (if some fields are empty or look
-- unusual then possibly you have very old versions)
Linux alleycat.arl.wustl.edu 2.2.14 #79 Fri Apr 28 08:31:48 CDT 2000 alpha unknown
Kernel modules         2.1.121
Gnu C                  egcs-2.91.66
Binutils               2.9.1.0.24
Linux C Library        2.1.2
Dynamic linker         ldd (GNU libc) 2.1.2
Procps                 2.0.5
Mount                  2.9u
Net-tools              1.53
Console-tools          1999.03.02
Sh-utils               2.0
Modules Loaded         

Apr 27 10:07:41 alleycat kernel: Linux version 2.2.14 (root@alleycat.arl.wustl.edu)
	(gcc version egcs-2.91.66 19990314/Linux (egcs-1.1.2 release)) #76 Thu Apr 27 09:50:25 CDT 2000
Apr 27 10:07:41 alleycat kernel: Booting GENERIC on Tsunami variation Monet using machine vector Monet from SRM
Apr 27 10:07:41 alleycat kernel: Command line: root=/dev/sdg4 bootdevice=sda5 bootfile=vmlinux-2.2.14
Apr 27 10:07:41 alleycat kernel: Console: colour VGA+ 80x25
Apr 27 10:07:41 alleycat kernel: Calibrating delay loop... 996.15 BogoMIPS
Apr 27 10:07:41 alleycat kernel: Memory: 514736k available

[7.2.] Processor information (from /proc/cpuinfo):
cpu                     : Alpha
cpu model               : EV6
cpu variation           : 7
cpu revision            : 0
cpu serial number       : 
system type             : Tsunami
system variation        : Monet
system revision         : 0
system serial number    : 4003DPKZ1007
cycle frequency [Hz]    : 500000000 
timer frequency [Hz]    : 1024.00
page size [bytes]       : 8192
phys. address bits      : 44
max. addr. space #      : 255
BogoMIPS                : 996.14
kernel unaligned acc    : 0 (pc=0,va=0)
user unaligned acc      : 0 (pc=0,va=0)
platform string         : COMPAQ Professional Workstation XP1000
cpus detected           : 1

[7.3.] Module information (from /proc/modules):
[7.4.] Loaded driver and hardware information (/proc/ioports, /proc/iomem)
[7.5.] PCI information ('lspci -vvv' as root)
[7.6.] SCSI information (from /proc/scsi/scsi)
[7.7.] Other information that might be relevant to the problem
       (please look in /proc and include all information that you
       think to be relevant):
       
       	It seems the XP1000 does things a bit differently than my 433au :-)
	Included is the patch to arch/alpha/mm/fault.c from Tru64 engineering,
	which was needed to survive the bug and locate the problem :-)
	
[X.] Other notes, patches, fixes, workarounds:

diff -Naur linux-2.3.99-pre6-clean/drivers/net/de4x5.c linux-2.3.99-pre6/drivers/net/de4x5.c
--- linux-2.3.99-pre6-clean/drivers/net/de4x5.c	Fri Apr 21 18:08:52 2000
+++ linux-2.3.99-pre6/drivers/net/de4x5.c	Fri Apr 28 10:54:09 2000
@@ -2299,7 +2299,10 @@
     for (walk = walk->next; walk != &dev->bus_list; walk = walk->next) {
 	struct pci_dev *this_dev = pci_dev_b(walk);
 
-	pb = this_dev->bus->number;
+	if (this_dev->bus)
+	   pb = this_dev->bus->number;	/* this_dev->bus may be null */
+	else
+	   pb = 0;			/* default in error */
 	vendor = this_dev->vendor;
 	device = this_dev->device << 8;
 	if (!(is_DC21040 || is_DC21041 || is_DC21140 || is_DC2114x)) continue;
diff -Naur linux-2.3.99-pre6-clean/arch/alpha/mm/fault.c linux-2.3.99-pre6/arch/alpha/mm/fault.c
--- linux-2.3.99-pre6-clean/arch/alpha/mm/fault.c	Mon Apr 24 17:49:21 2000
+++ linux-2.3.99-pre6/arch/alpha/mm/fault.c	Fri Apr 28 11:00:39 2000
@@ -179,8 +179,10 @@
  */
 	printk(KERN_ALERT "Unable to handle kernel paging request at "
 	       "virtual address %016lx\n", address);
-	die_if_kernel("Oops", regs, cause, (unsigned long*)regs - 16);
-	do_exit(SIGKILL);
+/*	die_if_kernel("Oops", regs, cause, (unsigned long*)regs - 16); */
+/*	do_exit(SIGKILL); */
+	regs->pc += 4;	/* WUARL allow bad IOBUS reads */
+	return;
 
 /*
  * We ran out of memory, or some other thing happened to us that made

Thank you
