[Beowulf] MD check/scrub
Bill Broadley bill at cse.ucdavis.edu
Tue Nov 13 10:03:22 PST 2007
Leif Nixon wrote:
> Reconstruction. With raid 6, you can recover from single-disk
> corruption (As opposed to *failures*, where you get read errors from a
> disk. Raid 6 can handle two simultaneous disk *failures*.).
>
> See section 4 in:
>
> http://www.kernel.org/pub/linux/kernel/people/hpa/raid6.pdf
>

I just read it.

> Just recalculating the parity blocks does give you a consistent raid
> stripe, but destroys your data (unless it actually was one of the
> parity blocks that was corrupted).

Er, that's not how I read it at all. To quote:

    In the case of data drive corruption, once the faulty drive has
    been identified, recover using the P drive in the same way as a
    one-disk erasure failure.

So you want to catch these single-disk corruptions (data or parity) as
soon as possible so they don't accumulate.

In general, if you have the redundancy at the software RAID level, it
seems best not to push too hard on the individual drive. Don't retry
excessively (and depend on the per-block checksums) or allow long
timeouts. As soon as the error hits, do a write (to remap the block).
After all, do you trust a drive to read the sector on the 10th try more
than you trust your parity calculations? If the drive error rate gets
too high, drop the drive like a hot potato and scream bloody murder so
the admin feeds you a disk asap.
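To make the distinction concrete, here is a rough Python sketch of the single-drive corruption check as I understand section 4 of the paper. It works on one byte per drive in GF(2^8) with the polynomial 0x11d and generator 2. The names (`syndromes`, `scrub_byte`) and the return format are mine, purely for illustration, and this is not a description of what any particular RAID implementation does. A real scrub would run this over whole blocks and require every nonzero byte position to point at the same drive before repairing anything.

```python
# GF(2^8) log/antilog tables, polynomial x^8+x^4+x^3+x^2+1 (0x11d), g = 2.
EXP = [0] * 510
LOG = [0] * 256
x = 1
for i in range(255):
    EXP[i] = x
    LOG[x] = i
    x <<= 1
    if x & 0x100:
        x ^= 0x11D
for i in range(255, 510):
    EXP[i] = EXP[i - 255]  # doubled table so gf_mul needs no modulo


def gf_mul(a, b):
    """Multiply two field elements."""
    if a == 0 or b == 0:
        return 0
    return EXP[LOG[a] + LOG[b]]


def syndromes(data):
    """P = xor of D_i, Q = xor of g^i * D_i (one byte per data drive)."""
    p = q = 0
    for i, d in enumerate(data):
        p ^= d
        q ^= gf_mul(EXP[i], d)
    return p, q


def scrub_byte(data, p, q):
    """Return (verdict, data, p, q) with a single corrupt drive repaired.

    verdict is "clean", "P", "Q", "D<z>" or "multiple" (hypothetical labels).
    """
    p_calc, q_calc = syndromes(data)
    p_star = p ^ p_calc  # difference between stored and recomputed P
    q_star = q ^ q_calc  # difference between stored and recomputed Q
    if p_star == 0 and q_star == 0:
        return "clean", list(data), p, q
    if q_star == 0:
        return "P", list(data), p_calc, q  # only P disagrees: rewrite P
    if p_star == 0:
        return "Q", list(data), p, q_calc  # only Q disagrees: rewrite Q
    # Both disagree: a data drive z is off by p_star, and q_star = g^z * p_star.
    z = (LOG[q_star] - LOG[p_star]) % 255
    if z >= len(data):
        return "multiple", list(data), p, q  # no single drive explains it
    fixed = list(data)
    fixed[z] ^= p_star  # same arithmetic as a one-disk erasure using P
    return "D%d" % z, fixed, p, q
```

For example, take `[0x11, 0x22, 0x33, 0x44]`, compute P and Q with `syndromes`, then flip some bits in the third byte: `scrub_byte` should report `"D2"` and hand back the original data, whereas blindly recomputing P and Q from the damaged data would give a consistent stripe with the wrong contents.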
