[Beowulf] RAID question
mathog
mathog at caltech.edu
Wed Mar 18 14:00:15 PDT 2015
Jörg Saßmannshausen <j.sassmannshausen at ucl.ac.uk> wrote:
> $ smartctl -i /dev/sda -d megaraid,X
Right.
The issues have been resolved. If anybody is still curious, this is
what happened.
The disappearing files/directories were the result of a script that was
run as root which moved /boot and /bin to an obscure subdirectory
belonging to that user.
The disk errors were a red herring. The system had a Seagate USB disk
plugged into it which I was not aware of. (It was not obvious
because of the rat's nest of cables behind it.) This disk's partition
table was marked bootable - even though there was nothing on that disk
which would have supported a boot. This was the disk that was showing
up as /dev/sdb. When CentOS booted normally it automatically
mounted this disk, which is why there was no mention of it in
/etc/fstab. However, nothing was using this disk. It looks like at 30
minute intervals the OS "pinged" the device to see if it was still
there, and the enclosure/disk did not fully support whatever command was
being used for this operation, resulting in the sense error messages in
the log files. When the rescue DVD was used it saw this device,
created /dev/sda for it (yes, device names were exchanged in the two
environments) and didn't mount it.
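
For anyone chasing a similar stray device, here is a minimal sketch of
one way to check whether a block device sits on the USB bus, by
resolving its sysfs path. The function names are illustrative and it
assumes sysfs is mounted at /sys; it is not something that was run on
this system.

```shell
#!/bin/sh
# Sketch: report which /dev/sd* devices are attached via USB.
# Assumption: sysfs is mounted at /sys (true on typical Linux systems).

# Return 0 if the given resolved sysfs path passes through a USB bus.
is_usb_path() {
    case "$1" in
        */usb[0-9]*) return 0 ;;
        *) return 1 ;;
    esac
}

# Print each sd* disk and whether it looks like a USB device.
list_disks() {
    for dev in /sys/block/sd*; do
        [ -e "$dev" ] || continue      # glob matched nothing
        path=$(readlink -f "$dev")
        if is_usb_path "$path"; then
            echo "${dev##*/}: USB"
        else
            echo "${dev##*/}: not USB"
        fi
    done
}

list_disks
```

Because device names can differ between the installed OS and a rescue
environment, as they did here, a check like this would need to be run
in each environment separately.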
Long SMART tests have now been run on each of the internal disks using
smartctl commands like the one above, and all the disks are fine.
megacli also comes up clean. The USB disk is no longer plugged in,
which solved the issue of sense error messages going to
/var/log/messages.
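
A rough sketch of the kind of smartctl invocations meant above. The
device node and the megaraid device IDs are placeholders (the real IDs
come from the controller), and the script only prints the commands
rather than running them, so they can be reviewed first.

```shell
#!/bin/sh
# Sketch: build smartctl commands for disks behind a MegaRAID controller.
# Assumptions: DEV and IDS are placeholders, not the values used here.
DEV=/dev/sda
IDS="0 1"

# Print the smartctl command for one action on one megaraid device ID.
smart_cmd() {
    echo "smartctl $1 -d megaraid,$2 $DEV"
}

for id in $IDS; do
    smart_cmd "-t long" "$id"       # start a long self-test
    smart_cmd "-l selftest" "$id"   # later: read the self-test log
    smart_cmd "-H" "$id"            # overall health assessment
done
```

The long test runs in the background on the drive, so the self-test
log is only meaningful once the test has had time to finish.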
Thanks for all of the suggestions,
David Mathog
mathog at caltech.edu
Manager, Sequence Analysis Facility, Biology Division, Caltech