Can't boot diskless Pentium IV node
Mario Storti mstorti at intec.unl.edu.ar
Wed Mar 20 06:39:09 PST 2002
Hi all,

We have been using a diskless Beowulf cluster with 11 nodes (Pentium III) for Computational Fluid Dynamics/Finite Element computations for a couple of years. The operating system is RedHat 7.1 (kernel 2.4.2). We are now upgrading the nodes to Pentium IV processors (Rambus memory, DFI motherboard) and are experiencing some problems.

At first the boot process hung after the "NET4: Unix domain sockets 1.0/SMP for Linux NET4.0." message. We found that the problem was caused by the PCMCIA code, so we deactivated PCMCIA support in the config file and recompiled. Now the kernel boots OK up to the "Welcome to RedHat Linux" message and then hangs.

We traced that the correct `/tftpboot/<ip-address>/etc/rc.sysinit' script is running and that the system hangs at a `sleep 1' line in the script, right after echoing the "Welcome to RedHat Linux" message. If we comment out the `sleep 1' line (just to understand what is happening), the init sequence continues until the `action $"Mounting proc filesystem: " mount -n -t proc /proc /proc' line. What actually hangs there is the `initlog' line in the `action' function defined in `/etc/rc.d/init.d/functions'. If we also comment out the `initlog' line in `action' (and the other `initlog' lines that appear explicitly in the `rc.sysinit' script), the boot sequence progresses further but eventually crashes. Several messages seem to indicate that the root filesystem is mounted read-only at that stage (is this wrong?) or that some files cannot be accessed.

The same kernel (identical to the older one we used on the Pentium III machines, but with PCMCIA support deactivated) boots OK on the P III nodes.

We ask:

1./ The new kernel was compiled on a P III machine. Is it correct to compile a kernel on a P III machine and use it on a P IV machine? We believe it is OK, but we wish to confirm this.

2./ Why does the PCMCIA code hang on the P IV node?

3./ Any ideas about why the `rc.sysinit' script hangs at the `sleep 1' and `initlog ...' lines? Is this caused by the root filesystem being mounted read-only, or might it be due to errors in the NFS-mounted filesystem?

Any ideas or suggestions are welcome!

Thanks in advance,

Mario

%%%%%%<>%%%%%%<>%%%%%%<>%%%%%%<>%%%%%%<>%%%%%%<>%%%%%%<>%%%%%%<>%%%%%%<>%%%%%%
Mario Alberto Storti                            | Fax: (54)(342) 455.09.44 |
Centro Internacional de Metodos Computacionales | Tel: (54)(342) 455.91.75 |
en Ingenieria - CIMEC (INTEC/CONICET-UNL)       |..........................|
INTEC, Guemes 3450 - 3000 Santa Fe, Argentina   | Reply: mstorti at intec.unl.edu.ar,
http://venus.ceride.gov.ar/CIMEC                |
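For question 3, one way to check whether the root filesystem really is read-only at the point of failure is to look at the last `/' entry in /proc/mounts. The following is only a minimal sketch: the function name `root_mount_mode' is hypothetical, and it assumes that /proc has already been mounted (which, given the hang at the "Mounting proc filesystem" step, may have to be done by hand) and that awk is available on the node's root filesystem.

```shell
#!/bin/sh
# root_mount_mode [FILE]
# Hypothetical helper: print "ro" or "rw" for the root filesystem.
# FILE must be in /proc/mounts format:
#   device mountpoint fstype options dump pass
# When "/" appears more than once (e.g. rootfs, then the NFS root),
# the last entry is the one currently visible, so it wins.
root_mount_mode() {
    mounts_file="${1:-/proc/mounts}"
    awk '
        $2 == "/" { opts = $4 }          # remember options of the latest "/" entry
        END {
            if (opts == "") exit 1       # no root entry found at all
            n = split(opts, o, ",")
            mode = "unknown"
            for (i = 1; i <= n; i++)
                if (o[i] == "ro" || o[i] == "rw") mode = o[i]
            print mode
        }
    ' "$mounts_file"
}
```

Calling `root_mount_mode' with no argument from a rescue shell on the node (or from a line added temporarily to `rc.sysinit') would print the mode of the current root; passing a saved copy of /proc/mounts as the argument allows the same check on another machine.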
More information about the Beowulf mailing list
