[Beowulf] MD check/scrub

Wed Nov 14 11:18:05 PST 2007

Leif Nixon wrote:
> Mark Hahn <hahn at mcmaster.ca> writes:
> 
>>> If they find an inconsistent stripe they don't try to identify the
>>> corrupt block. Instead, they dumbly *recompute P and Q*, which of
>>> course makes the stripe consistent, but *leaves the corrupt data in
>>> place*.
>> I'm skeptical that MD would be so willfully stupid.
> 
> Me too, but I'm looking at a trashed test file at this very moment.
> 
>> did you conclude this from reading the code, empirical testing, or
>> from the author?
> 
> Empirical testing, using raid 6 over file backed loop devices. I

RAIDs in general depend on two things:
#1 When you ask for a write and do not get an error, the write happened.
#2 Corruptions in the media don't happen.

That sounds bad, but drives are pretty reliable and have per-sector checksums,
so it's pretty unlikely for a corrupted sector to still produce the correct
checksum.  For this reason using dd to damage a single disk of a RAID is not a
realistic test of media corruption: dd writes through the drive, so all the
sector checksums will be correct, the drive reports no read errors, and the
RAID set is silently corrupted.
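
To make the dd point concrete, here is a toy model in Python.  It is only a
sketch: ToyDrive, flip_bit_on_media and demo are made-up names, and CRC32
stands in for the drive's real per-sector ECC, which works differently.  It
only shows that a write through the normal path regenerates the checksum,
while damage to the media does not.

```python
import zlib

SECTOR_SIZE = 512


class ToyDrive:
    """Hypothetical drive model: each sector is stored with a checksum.

    CRC32 is an assumption standing in for the real per-sector ECC.
    """

    def __init__(self, sectors):
        self.media = [self._encode(bytes(SECTOR_SIZE)) for _ in range(sectors)]

    @staticmethod
    def _encode(data):
        return (bytes(data), zlib.crc32(data))

    def write(self, lba, data):
        # Normal write path (what dd uses): the checksum is regenerated,
        # so the drive considers the new contents perfectly valid.
        assert len(data) == SECTOR_SIZE
        self.media[lba] = self._encode(data)

    def read(self, lba):
        data, stored = self.media[lba]
        if zlib.crc32(data) != stored:
            raise IOError(f"unrecoverable read error at sector {lba}")
        return data

    def flip_bit_on_media(self, lba, bit):
        # Simulated media corruption: the payload changes, the stored
        # checksum does not, so the next read fails loudly.
        data, stored = self.media[lba]
        damaged = bytearray(data)
        damaged[bit // 8] ^= 1 << (bit % 8)
        self.media[lba] = (bytes(damaged), stored)


def demo():
    drive = ToyDrive(sectors=4)
    drive.write(0, b"A" * SECTOR_SIZE)
    drive.write(1, b"B" * SECTOR_SIZE)

    # dd-style damage: overwrite through the normal write path.
    drive.write(0, b"X" * SECTOR_SIZE)
    dd_read_ok = drive.read(0) == b"X" * SECTOR_SIZE

    # Media-style damage: the drive notices and reports an error.
    drive.flip_bit_on_media(1, 5)
    try:
        drive.read(1)
        media_error_reported = False
    except IOError:
        media_error_reported = True
    return dd_read_ok, media_error_reported
```

In this model the dd-style overwrite reads back without any error, so the RAID
layer above has no hint which member holds the bad data, while the flipped bit
is reported by the drive and can be rebuilt from parity.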

While continuously reading from all drives and doing the parity calculation
isn't practical, I do agree that the scrub (which is fairly new, btw) should do
this.  Such a case (a corruption that still has the correct block ECC) is rare
in practice, but there's no reason for scrub not to handle it correctly.  On
the Seagate ES drive I checked, each sector is protected by 10 bits of ECC, and
they claim 1 non-recoverable error per 10E15 bits.
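
As an illustration of what "handling this correctly" could look like for
RAID 6, here is a minimal Python sketch.  It is not the MD code and I am not
claiming MD does this; compute_pq, scrub_stripe and repair_data_block are
hypothetical names.  It assumes the usual RAID-6 arithmetic (GF(2^8) with
polynomial 0x11d and generator 2) and that at most one block in the stripe is
bad.  With both P and Q available, the two syndromes are enough to point at
the corrupt data block instead of blindly recomputing parity.

```python
# GF(2^8) log/antilog tables, polynomial 0x11d, generator 2 (assumed field).
EXP = [0] * 510
LOG = [0] * 256
_x = 1
for _i in range(255):
    EXP[_i] = _x
    LOG[_x] = _i
    _x <<= 1
    if _x & 0x100:
        _x ^= 0x11D
for _i in range(255, 510):
    EXP[_i] = EXP[_i - 255]


def gf_mul(a, b):
    if a == 0 or b == 0:
        return 0
    return EXP[LOG[a] + LOG[b]]


def compute_pq(data_blocks):
    """P is the XOR of all data blocks; Q is the XOR of g^i * D_i."""
    n = len(data_blocks[0])
    p, q = bytearray(n), bytearray(n)
    for idx, block in enumerate(data_blocks):
        coeff = EXP[idx]
        for j, byte in enumerate(block):
            p[j] ^= byte
            q[j] ^= gf_mul(coeff, byte)
    return bytes(p), bytes(q)


def scrub_stripe(data_blocks, p, q):
    """Classify a stripe, assuming at most one corrupt block.

    Returns ("ok", None), ("p", None), ("q", None), ("data", z)
    or ("unknown", None).
    """
    new_p, new_q = compute_pq(data_blocks)
    ps = bytes(a ^ b for a, b in zip(p, new_p))  # P syndrome
    qs = bytes(a ^ b for a, b in zip(q, new_q))  # Q syndrome
    if not any(ps) and not any(qs):
        return "ok", None
    if not any(qs):
        return "p", None  # only P disagrees: P itself is bad
    if not any(ps):
        return "q", None  # only Q disagrees: Q itself is bad
    # Bad data block z with error E gives ps = E and qs = g^z * E,
    # so every damaged byte must yield the same z.
    candidates = set()
    for a, b in zip(ps, qs):
        if a == 0 and b == 0:
            continue
        if a == 0 or b == 0:
            return "unknown", None
        candidates.add((LOG[b] - LOG[a]) % 255)
    if len(candidates) == 1:
        z = candidates.pop()
        if z < len(data_blocks):
            return "data", z
    return "unknown", None


def repair_data_block(data_blocks, p, z):
    """Rebuild data block z from P and the other data blocks."""
    fixed = bytearray(p)
    for idx, block in enumerate(data_blocks):
        if idx != z:
            for j, byte in enumerate(block):
                fixed[j] ^= byte
    return bytes(fixed)
```

With four small data blocks, overwriting one of them after P and Q have been
computed makes scrub_stripe report ("data", z) for that block, and
repair_data_block then restores the original contents.  If more than one block
is bad, the sketch can only say "unknown", or may even point at the wrong
block, which is one reason a real scrub might be conservative.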

> *hope* there is something wrong with our methodology. We'll need to do
> a proper summary of our findings and raise the issue on the linux-raid
> mailing list.

I'll join and watch, thanks.  I would be surprised if scrubbing doesn't do the
right thing, but linux-raid is the best place to find out.