Problems running Scyld kernel on a P4?

Kris Boutilier Kris.Boutilier at scrd.bc.ca
Tue Apr 17 13:44:24 PDT 2001


 After all this talk of Pentium IVs, I thought I'd take the time this
weekend to install a Scyld head node on a new Dell Dimension 8100, which
is equipped with a 1.3GHz P4... Let me tell you this: it was not a nice
experience.

 Unfortunately I don't have a Scyld-specific CD, so I planned to use a
RH6.2 CD and then apply the 27BZ-7 beowulf-related RPMs afterwards. The
first problem was with Red Hat itself: 'anaconda' kept throwing
exceptions, even with the latest installer updates. To work around that
I used a P3-based machine to install to the hard drive and then
transplanted the drive to the big machine. From there I was able to
successfully install the beowulf RPMs (using the --ignorearch flag, as
2.2.14 reports the P4 as an 'i?86' arch)... _but_ when I fire up the
Scyld kernel it throws an exception:

	{clipped}
	hda: WDC WD400BB-75AUA1, 38166Mb w/2048kB Cache, CHS=4865/255/63, UDMA(100)
	Floppy drive(s): fd0 is 1.44M
	general protection fault: 0000
	CPU:	0
	EIP:	010:[<c0112259>]
	EFLAGS:	00010002
	eax: 00000000	ebx: cff7a000	ecx: 000000c1	edx: cff7bf44
	esi: c0244c00	edi: cff663e4	ebp: cff7bf4c	esp: cff7bf1c
	ds: 0018   es: 0018   ss: 0018
	Process swapper (pid: 1, process nr:1, stackpage=cff7b000)
	Stack:	c025d9e0 00000000 cff66000 ffffffff cff7bf44 00000000 c0232000 cff663e4
		cff7a3ec c0244c00 c018c7e2 00000001 cff7bf64 c0112732 00000000 00000247
		cff7a000 c02276ec c025d200 c018e746 00000000 00000000 c0190770 c018dfd0
	Call Trace: [<c018c7e2>] [<c0112732>] [<c018e746>] [<c0190770>] [<c018dfd0>] [<c0106164>] [<c0106164>]
		    [<c0106164>] [<c010616b>] [<c0108ae8>]
	Code: 0f 32 89 45 f8 8b 45 e0 89 c6 89 50 04 41 d2 8a 56 04 89 56
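
 As a reading aid for the trace above, here is a minimal Python sketch
that pulls the ecx value and the "Code:" bytes out of the oops and
checks whether the first instruction is RDMSR (opcode 0f 32, which
reads the model-specific register whose index is in ecx). It assumes
that the "Code:" bytes start at the faulting EIP, as was usual for
kernels of this era. The helper names (parse_oops, describe) are
invented for illustration. RDMSR on an MSR the CPU does not implement
raises a general protection fault; index 0xc1 appears to be a P6-family
performance counter, so this is one plausible reading of the crash, not
a confirmed diagnosis.

```python
import re

# Excerpt of the oops above (wrapped lines rejoined, only the parts used here).
OOPS = """\
general protection fault: 0000
ecx: 000000c1	edx: cff7bf44
Code: 0f 32 89 45 f8 8b 45 e0 89 c6 89 50 04 41 d2 8a 56 04 89 56
"""

RDMSR = bytes([0x0F, 0x32])  # x86 opcode for RDMSR; MSR index is taken from ECX


def parse_oops(text):
    """Return (registers, code_bytes) parsed from a 2.2-style oops dump."""
    # Register fields look like "ecx: 000000c1".
    regs = {
        name: int(value, 16)
        for name, value in re.findall(r"\b(e[a-z]{2}):\s+([0-9a-f]{8})\b", text)
    }
    # The Code: line is a run of two-digit hex bytes on a single line.
    match = re.search(r"^Code:[ \t]*((?:[0-9a-f]{2}[ \t]*)+)", text, re.M)
    code = bytes.fromhex("".join(match.group(1).split())) if match else b""
    return regs, code


def describe(text):
    """Say whether the first dumped instruction is RDMSR and which MSR it reads."""
    regs, code = parse_oops(text)
    if code.startswith(RDMSR):
        return "faulting instruction is RDMSR, MSR index (ecx) = 0x%x" % regs["ecx"]
    return "faulting instruction is not RDMSR"


print(describe(OOPS))
# prints: faulting instruction is RDMSR, MSR index (ecx) = 0xc1
```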

 For comparative purposes I downloaded a clean copy of 2.2.17 from
kernel.org and compiled it; it booted OK, though the NIC (a 3c920) is
misbehaving. I tried 2.2.19 (also from kernel.org) and everything works
just fine.

 So the question is: how much work would it be to tailor and apply the
head-end related patches to 2.2.19? Or should I just wait patiently and
hope the next round of Scyld patches will be out soon and will resolve
my difficulties? This isn't a production system; I'm just doing this for
my own 'entertainment', so unorthodox workarounds or beta-quality
patches would be OK.

 Attached is the dmesg from the 2.2.17-clean kernel...

-------

Linux version 2.2.17-clean (root at neo) (gcc version egcs-2.91.66 19990314/Linux (egcs-1.1.2 release)) #2 SMP Mon Apr 16 19:54:38 PDT 2001
Intel MultiProcessor Specification v1.4
    IMCR and PIC compatibility mode.
OEM ID: DELL     Product ID: Dim 8100     APIC at: 0x0
Warning: BIOS table gives no I/O APIC.
Warning: switching to non APIC mode.
Processors: 1
mapped APIC to ffffe000 (00000000)
Detected 1296102 kHz processor.
Console: colour VGA+ 80x25
Calibrating delay loop... 2582.11 BogoMIPS
Memory: 257220k/261632k available (988k kernel code, 420k reserved, 2952k data, 52k init)
Dentry hash table entries: 32768 (order 6, 256k)
Buffer cache hash table entries: 262144 (order 8, 1024k)
Page cache hash table entries: 65536 (order 6, 256k)
8K L1 data cache
12K L1 instruction cache
8192K L1 instruction cache
CPU: L1 I Cache: 8204K  L1 D Cache: 8K
CPU:               Intel(R) Pentium(R) 4 CPU 1300MHz
Checking 386/387 coupling... OK, FPU using exception 16 error reporting.
Checking 'hlt' instruction... OK.
POSIX conformance testing by UNIFIX
mtrr: v1.35a (19990819) Richard Gooch (rgooch at atnf.csiro.au)
Intel machine check architecture supported.
Intel machine check reporting enabled on CPU#0.
8K L1 data cache
12K L1 instruction cache
8192K L1 instruction cache
CPU: L1 I Cache: 8204K  L1 D Cache: 8K
CPU:               Intel(R) Pentium(R) 4 CPU 1300MHz
per-CPU timeslice cutoff: 0.00 usecs.
CPU0: Intel               Intel(R) Pentium(R) 4 CPU 1300MHz stepping 07
calibrating APIC timer ... 
..... CPU clock speed is 1296.1670 MHz.
..... system bus clock speed is 0.0000 MHz.
Error: only one processor found.
PCI: PCI BIOS revision 2.10 entry at 0xfc11e
PCI: Using configuration type 1
PCI: Probing PCI hardware
Linux NET4.0 for Linux 2.2
Based upon Swansea University Computer Society NET3.039
NET4: Unix domain sockets 1.0 for Linux NET4.0.
NET4: Linux TCP/IP 1.0 for NET4.0
IP Protocols: ICMP, UDP, TCP
TCP: Hash tables configured (ehash 262144 bhash 65536)
Starting kswapd v 1.5 
Detected PS/2 Mouse Port.
Serial driver version 4.27 with no serial options enabled
ttyS00 at 0x03f8 (irq = 4) is a 16550A
pty: 256 Unix98 ptys configured
PCI_IDE: unknown IDE controller on PCI bus 00 device f9, VID=8086, DID=244b
PCI_IDE: not 100% native mode: will probe irqs later
    ide0: BM-DMA at 0xffa0-0xffa7, BIOS settings: hda:DMA, hdb:pio
    ide1: BM-DMA at 0xffa8-0xffaf, BIOS settings: hdc:DMA, hdd:pio
hda: WDC WD400BB-75AUA1, ATA DISK drive
hdc: SAMSUNG CD-ROM SC-148C, ATAPI CDROM drive
ide0 at 0x1f0-0x1f7,0x3f6 on irq 14
ide1 at 0x170-0x177,0x376 on irq 15
hda: WDC WD400BB-75AUA1, 38166MB w/2048kB Cache, CHS=4865/255/63
hdc: ATAPI 48X CD-ROM drive, 128kB Cache
Uniform CD-ROM driver Revision: 3.11
Floppy drive(s): fd0 is 1.44M
FDC 0 is a National Semiconductor PC87306
Partition check:
 hda: hda1 hda3 hda4
VFS: Mounted root (ext2 filesystem) readonly.
Freeing unused kernel memory: 52k freed
Adding Swap: 297192k swap-space (priority -1)
Creative EMU10K1 PCI Audio Driver, version 0.7, 19:56:48 Apr 16 2001
emu10k1: EMU10K1 rev 7 model 0x8022 found, IO at 0xece0-0xecff, IRQ 11
3c59x.c 15Sep00 Donald Becker and others
http://www.scyld.com/network/vortex.html
eth0: 3Com 3c905C Tornado at 0xec00,  00:b0:d0:e4:54:6a, IRQ 3
  8K byte-wide RAM 5:3 Rx:Tx split, autoselect/Autonegotiate interface.
  MII transceiver found at address 1, status   24.
  MII transceiver found at address 2, status   24.
  Enabling bus-master transmits and whole-frame receives.



