[Beowulf] moving of Linux HDD to other node: udev problem at boot
Mikhail Kuzminsky
kus at free.net
Wed Aug 19 10:12:41 PDT 2009
As was discussed here, there are NUMA problems with Nehalem on a number
of Linux distributions/kernels. I was informed that the old
OpenSuSE 10.3 default kernel (2.6.22) may work OK with Nehalem in the
sense of NUMA, i.e. it gives the right /sys/devices/system/node content.
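For reference, a minimal sketch of how the node list could be checked
from a booted system. It assumes Python is available there; the
function name numa_nodes is only illustrative and not part of any
distribution tool.

```python
import os
import re


def numa_nodes(node_dir="/sys/devices/system/node"):
    """Return the nodeN entries found under node_dir, sorted by N.

    An empty list means the directory is missing or holds no nodeN
    entries, i.e. the kernel exported no NUMA topology there.
    """
    try:
        entries = os.listdir(node_dir)
    except OSError:
        return []
    nodes = [e for e in entries if re.fullmatch(r"node\d+", e)]
    # Sort numerically so that node10 comes after node2.
    return sorted(nodes, key=lambda e: int(e[4:]))


if __name__ == "__main__":
    found = numa_nodes()
    print("NUMA nodes:", ", ".join(found) if found else "none found")
```

On a dual-socket machine with working NUMA support one would expect
two entries to be listed; a single node0, or nothing at all, would
point to the NUMA problem discussed here.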
I moved a Western Digital SATA HDD with SuSE 10.3 installed (on a dual
Barcelona server) to a dual Nehalem server with a Supermicro X8DTi
motherboard, where it is the master HDD.
But booting SuSE 10.3 on the Nehalem server was not successful. The Grub
loader (whose menu.lst configuration uses "by-id" identification of
disk partitions) works OK. But the Linux kernel boot didn't finish
successfully: the /boot/04-udev.sh script (whose task is udev
initialization; I think it comes from the initrd) does not see the root
partition (the 1st partition on the HDD)!
At boot I see these messages:
....
SCSI subsystem initialized
ACPI Exception (processor_core_0787): Processor device isn't present
....
<a set of messages about usb>
...
Trying manual resume from /dev/sda2    /* it's the swap partition */
resume device /dev/sda2 not found (ignoring)
...
Waiting for device
/dev/disk/by-id/scsi-SATA-WDC_WD<name_of_disk>-part1 ...
/* echo from udev.sh */
and then an offer to try again. After this script finishes, I
don't see any HDDs in /dev.
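A small sketch of how the by-id links could be listed and resolved.
It is meant for a system where Python is available (for example the
same disk booted on the old server, or a rescue system), not for the
initrd shell itself; the function name by_id_targets is hypothetical.

```python
import os


def by_id_targets(by_id_dir="/dev/disk/by-id"):
    """Map each link name in by_id_dir to the device node it resolves to.

    Returns an empty dict if the directory does not exist, which is
    what one would expect when udev has created no disk nodes at all.
    """
    try:
        names = sorted(os.listdir(by_id_dir))
    except OSError:
        return {}
    return {n: os.path.realpath(os.path.join(by_id_dir, n)) for n in names}


if __name__ == "__main__":
    links = by_id_targets()
    if not links:
        print("no by-id links found")
    for name, target in links.items():
        print(name, "->", target)
```

Comparing this output with the by-id names used in menu.lst shows
whether the name the boot script waits for exists at all on a given
machine.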
The BIOS setting for this SATA device is "enhanced". "Compatible" mode
gives the same result.
What may be the source of the problem? Maybe the HDD driver used by
the initrd?
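To narrow down the driver question, a sketch like the following could
show which kernel drivers sit behind a disk on a system where the disk
is visible (for example a rescue boot on the Nehalem server). It relies
on the sysfs "driver" symlinks along the device path; the function name
drivers_for_block_device and the device name "sda" are assumptions for
illustration. Whether the reported controller driver is actually
contained in the SuSE 10.3 initrd would still have to be checked
separately.

```python
import os


def drivers_for_block_device(name, sys_root="/sys"):
    """Collect driver names walking up from <sys_root>/block/<name>/device.

    The first entries belong to the disk itself, later ones to the
    parent devices such as the SATA controller. Returns an empty list
    if the device is unknown or no driver links are found.
    """
    root = os.path.realpath(sys_root)
    path = os.path.realpath(os.path.join(sys_root, "block", name, "device"))
    found = []
    while path.startswith(root) and path != root:
        link = os.path.join(path, "driver")
        if os.path.islink(link):
            # The link target ends in the driver's directory name.
            found.append(os.path.basename(os.readlink(link)))
        path = os.path.dirname(path)
    return found


if __name__ == "__main__":
    # "sda" is an assumed device name; adjust as needed.
    print("drivers for sda:", drivers_for_block_device("sda"))
```

If the controller driver reported on the Nehalem board differs from the
one used on the Barcelona server, an initrd built on the old machine
may simply not contain it.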
Mikhail Kuzminsky
Computer Assistance to Chemical Research Center
Zelinsky Institute of Organic Chemistry RAS
Moscow
PS. If I look at the content of /sys (after the udev.sh script
finishes), it is right in the NUMA sense, i.e.
/sys/devices/system/node contains normal node0 and node1.