Can't boot diskless Pentium IV node

Mario Storti mstorti at intec.unl.edu.ar
Wed Mar 20 06:39:09 PST 2002


Hi all,

We have been using a diskless Beowulf cluster with 11 nodes (Pentium
III) for Computational Fluid Dynamics/Finite Element computations
for a couple of years. The operating system is RedHat 7.1 (kernel
2.4.2).

Now we are upgrading the nodes to Pentium IV processors (Rambus
memory, DFI motherboard) and we are running into some problems. First,
the boot process hung after the "NET4: Unix domain sockets 1.0/SMP
for Linux NET4.0." message. We found that the hang was caused by the
PCMCIA code, so we deactivated PCMCIA support in the config file and
recompiled. The kernel now boots as far as the "Welcome to RedHat
Linux" message and then hangs.

We verified that the correct
`/tftpboot/<ip-address>/etc/rc.sysinit' script is running and that the
system hangs at a `sleep 1' line in the script, right after echoing
the "Welcome to RedHat Linux" message. With the `sleep 1' line
commented out (just to understand what is happening), the init
sequence continues until the `action $"Mounting proc filesystem: "
mount -n -t proc /proc /proc' line. What actually hangs there is the
`initlog' line in the `action' function defined in
`/etc/rc.d/init.d/functions'. With the `initlog' line in `action'
commented out (along with the other `initlog' lines that appear
explicitly in the `rc.sysinit' script), the boot sequence progresses
further but eventually crashes, and several messages seem to indicate
that the root filesystem is mounted read-only at that stage (is this
wrong?) or that some files cannot be accessed.
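
One way to confirm whether the root filesystem really is read-only at
that point is to read the mount table directly. The following is only
a minimal sketch: it assumes `/proc' is already mounted and that `awk'
is available on the node's root, and the function name
`root_mount_mode' is hypothetical.

```shell
# Print "ro", "rw" or "unknown" for the root filesystem, based on a
# mounts-format table (device, mount point, type, options, ...).
# Argument: path to the table; defaults to /proc/mounts.
root_mount_mode() {
    awk '$2 == "/" {
             # Later entries for "/" override earlier ones (e.g. rootfs).
             mode = "unknown"
             n = split($4, opts, ",")
             for (i = 1; i <= n; i++)
                 if (opts[i] == "ro" || opts[i] == "rw")
                     mode = opts[i]
         }
         END { print (mode == "" ? "unknown" : mode) }' "${1:-/proc/mounts}"
}
```

Calling `root_mount_mode' with no argument from the point in
`rc.sysinit' where the errors appear would show the mode the kernel
reports for `/' at that stage.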

The same kernel (identical to the one we used on the Pentium III
machines, except that PCMCIA support is deactivated) boots OK on the
P III nodes.
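
For reference, the processor-family and PCMCIA settings of a build can
be listed from the kernel `.config'. This is a minimal sketch under
stated assumptions: the source tree is taken to be at
`/usr/src/linux', the option names are the ones commonly found in
2.4-series i386 trees and should be checked against the tree actually
in use, and `show_build_opts' is a hypothetical name.

```shell
# List processor-family and PCMCIA/CardBus options from a kernel .config,
# including the "# CONFIG_... is not set" lines.
# Argument: path to the .config; the default path is an assumption.
show_build_opts() {
    grep -E '^(# )?CONFIG_(M(386|486|586|586TSC|586MMX|686|PENTIUMIII|PENTIUM4|K6|K7)|PCMCIA|CARDBUS)[ =]' \
        "${1:-/usr/src/linux/.config}"
}
```

Running it against the `.config' used for the new kernel shows which
CPU family was selected and whether PCMCIA is really disabled.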

We ask:

1./ The new kernel was compiled on a P III machine. Is it correct to
compile a kernel on a P III machine and use it on a P IV machine? We
assume that it is OK, but we would like to confirm this.

2./ Why does the PCMCIA code hang on the P IV node?

3./ Any ideas about why the `rc.sysinit' script hangs at the `sleep 1'
and `initlog ...' lines? Is this caused by the root filesystem being
mounted read-only, or might it be due to errors in the NFS-mounted
filesystem?
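
To make the `initlog' workaround concrete, here is a minimal sketch of
a stand-in for `action' that runs the command directly instead of
going through `initlog'. It is not the function from
`/etc/rc.d/init.d/functions'; the name `action_nolog' and the plain
"[  OK  ]"/"[FAILED]" output are assumptions made for illustration.

```shell
# Hypothetical stand-in for `action' that bypasses initlog.
# Usage: action_nolog "Message: " command [args...]
action_nolog() {
    msg=$1
    shift
    printf '%s' "$msg"
    if "$@"; then
        echo "[  OK  ]"
        return 0
    else
        rc=$?            # exit status of the command that just failed
        echo "[FAILED]"
        return $rc
    fi
}
```

For the line where the boot stops, the call would look like
`action_nolog "Mounting proc filesystem: " mount -n -t proc /proc /proc'.
If that runs while the `initlog' version hangs, the problem is more
likely in `initlog' itself than in the mount.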

Any ideas or suggestions are welcome!!

Thanks in advance,

Mario

%%%%%%<>%%%%%%<>%%%%%%<>%%%%%%<>%%%%%%<>%%%%%%<>%%%%%%<>%%%%%%<>%%%%%%<>%%%%%%
Mario Alberto Storti                              | Fax: (54)(342) 455.09.44 |
Centro Internacional de Metodos Computacionales   | Tel: (54)(342) 455.91.75 |
  en Ingenieria - CIMEC (INTEC/CONICET-UNL)       |..........................|
INTEC, Guemes 3450 - 3000 Santa Fe, Argentina                                |
Reply: mstorti at intec.unl.edu.ar, http://venus.ceride.gov.ar/CIMEC            |



