[Beowulf] RAID5 rebuild, remount with write without reboot?

John Hearns hearnsj at googlemail.com
Tue Sep 5 10:43:14 PDT 2017


David, I have never been in that situation. However, I have configured my
fair share of LSI controllers, so I share your pain!
(I reserve my real tears for device mapper RAID).

How about a "mount -o remount"?  Did you try that before rebooting?
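
Something along these lines is what I had in mind - untested, and the
mount points are just placeholders for your / and /home:

   # see how the kernel currently has them mounted (look for "ro")
   grep -E ' / | /home ' /proc/mounts

   # try flipping them back to read-write in place
   mount -o remount,rw /
   mount -o remount,rw /home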

I am no expert here - in the past, when I have had non-RAID systems where
a disk went read-only, the only cure was a reboot.
Someone who is more familiar with how the kernel behaves when it has
decided that a device is not writeable should please correct me.
I would guess that a rescan-scsi-bus would have no effect - the disk is
still there!
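
If anyone does want to poke at this without rebooting next time, the two
places I would look are the block-device read-only flag and the LVM
permission bit - the device and VG/LV names below are only examples:

   # 1 means the kernel has marked the device read-only (check the RAID
   # virtual drive and the LVs themselves)
   blockdev --getro /dev/sda
   blockdev --getro /dev/vg00/home
   cat /sys/block/sda/ro

   # LVM keeps its own per-LV permission flag (2nd character of lv_attr)
   lvs -o lv_name,vg_name,lv_attr
   lvchange --permission rw vg00/home

   # then try the remount again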





On 5 September 2017 at 19:28, mathog <mathog at caltech.edu> wrote:

> Short form:
>
> An 8-disk (all 2 TB SATA) RAID5 on an LSI MR-USAS2 SuperMicro controller
> (lspci shows "LSI Logic / Symbios Logic MegaRAID SAS 2008 [Falcon]")
> system was long ago configured with a small partition of one disk as /boot
> and logical volumes for / (root) and /home on a single large virtual drive
> on the RAID.  Due to disk problems and an own goal (see below) the array
> went into a degraded=1 state (as reported by megacli) and write-locked both
> root and home.  When the failed disk was replaced and the rebuild completed,
> those were both still write-locked.  "mount -a" didn't help in either
> case.  A reboot brought them up normally, but ideally that should not have
> been necessary.  Is there a method to remount the logical volumes writable
> that does not require a reboot?
>
> Long form:
>
> Periodic testing of the disks inside this array turned up pending sectors
> with this command:
>
>    smartctl -a  /dev/sda -d sat+megaraid,7
>
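> For reference, the same check can be looped over all of the controller's
> device IDs - the 0-7 range and the attributes grepped for below may need
> adjusting:
>
>    for n in 0 1 2 3 4 5 6 7 ; do
>        echo "=== megaraid device id $n ==="
>        smartctl -a /dev/sda -d sat+megaraid,$n |
>            grep -E 'Serial Number|Reallocated_Sector|Current_Pending_Sector'
>    done
>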
> A replacement disk was obtained and the usual replacement method applied:
>
> megacli -pdoffline -physdrv[64:7] -a0
> megacli -pdmarkmissing -physdrv[64:7] -a0
> megacli -pdprprmv -physdrv[64:7] -a0
> megacli -pdlocate -start -physdrv[64:7] -a0
>
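> In hindsight, before pulling a drive it is also worth double-checking what
> the controller thinks of it, using the same enclosure:slot notation as
> above:
>
>    megacli -pdinfo -physdrv[64:7] -a0 | grep -i 'firmware state'
>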
> The disk with the flashing light was physically swapped.  The smartctl was
> run again and unfortunately its values were unchanged.  I had always
> assumed that the "7" in that smartctl was a physical slot; it turns out
> that it is actually the "Device ID".  In my defense, the smartctl man page
> does a very poor job of describing this:
>
>   megaraid,N - [Linux only] the device consists of one or more SCSI/SAS
>   disks connected to a MegaRAID controller.  The non-negative integer N
>   (in the range of 0 to 127 inclusive) denotes which disk on the
>   controller is monitored.  Use syntax such as:
>
> In this system, unlike the others I had worked on previously, Device ID and
> slots were not 1:1.
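>
> The slot-to-Device-ID mapping can be dumped straight from the controller,
> e.g. (the field names come from megacli -PDList output and may vary a bit
> between versions):
>
>    megacli -pdlist -a0 | grep -E 'Slot Number|Device Id|Firmware state'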
>
> Anyway, about a nanosecond after this was discovered the disk at Device ID
> 7 was marked as Failed by the controller whereas previously it had been
> "Online, Spun Up".
> Ugh.  At that point the logical volumes were all set read-only and the OS
> became barely usable, with commands like "more" no longer functioning.
> Megacli and sshd, thankfully, still worked.  Figuring that I had nothing
> to lose, I removed the replacement disk from slot 7 and put the original,
> hopefully still good, disk back in.  That put the system into this state:
>
> slot 4 (device ID 7) failed.
> slot 7 (device ID 5) is Offline.
>
> and
>
> megacli -PDOnline -physdrv[64:7] -a0
>
> put it at:
>
> slot 4 (device ID 7) failed.
> slot 7 (device ID 5) Online, Spun Up
>
> The logical volumes were still read-only but "more" and most other
> commands now worked again.  Megacli still showed the "degraded" value as
> 1.  I'm still not clear on how the two "read-only" states differed to
> cause this change.
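>
> For what it is worth, that "degraded" count is the one in the adapter
> summary, which can be checked with something like:
>
>    megacli -adpallinfo -a0 | grep -iE 'degraded|critical|failed disks'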
>
> At that point the failed disk in slot 4 (not 7!) was replaced with the
> new disk (which had been briefly in slot 7) and it immediately began to
> rebuild.  Something on the order of 48 hours later that rebuild completed,
> and the controller set "degraded" back to 0.  However, the logical volumes
> were still read-only.  "mount -a" didn't fix it, so the system was rebooted,
> which worked.
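>
> While the rebuild was running its progress could at least be polled,
> e.g. (the enclosure:slot here being the drive under rebuild):
>
>    megacli -pdrbld -showprog -physdrv[64:4] -a0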
>
>
> We have two of these backup systems.  They are supposed to have identical
> contents but do not.  Fixing that is another item on a long todo list.
> RAID 6 would have been a better choice for this much storage, but it does
> not look like this card supports it:
>
>   RAID0, RAID1, RAID5, RAID00, RAID10, RAID50, PRL 11, PRL 11 with
>   spanning, SRL 3 supported, PRL11-RLQ0 DDF layout with no span,
>   PRL11-RLQ0 DDF layout with span
>
> That rebuild is far too long for comfort.  Had another disk failed in
> those two days, that would have been it.  Neither controller has battery
> backup, and the one in question is not even on a UPS, so a power glitch
> could be fatal too. Not a happy thought while record SoCal temperatures
> persisted throughout the entire rebuild! The systems are in different
> buildings on the same campus, sharing the same power grid.  There are no
> other backups for most of this data.
>
> Even though the controller shows this system as no longer degraded, should
> I believe that there was no data loss?  I can run checksums on all the
> files (even though it will take forever) and compare the two systems.  But
> as I said previously, the files were not entirely 1:1, so there are
> certainly going to be some files on this system which have no match on the
> other.
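>
> A dry-run rsync with checksums is one way to do that comparison - the
> host and path below are placeholders:
>
>    rsync -rnc -v /export/data/ otherhost:/export/data/
>
> That lists files which differ, or are missing on the far side, without
> copying anything; it would need to be run in both directions to catch
> files that exist only on one system.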
>
> Regards,
>
> David Mathog
> mathog at caltech.edu
> Manager, Sequence Analysis Facility, Biology Division, Caltech
> _______________________________________________
> Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
> To change your subscription (digest mode or unsubscribe) visit
> http://www.beowulf.org/mailman/listinfo/beowulf
>