[Beowulf] Big storage

Bruce Allen ballen at gravity.phys.uwm.edu
Thu Aug 30 01:11:52 PDT 2007


Hi Jakob,

> 2) You read it. In which case the disk can see that the checksum has 
> failed. There's nothing it can do about this except give you a read 
> error on that block.
>
> A good RAID controller *may* help you with scenario (2) above; it can read the
> block from a mirror or compute it from parity (etc.), and may even re-write the
> correct block to the disk again to cure this single bad-block failure (by
> letting the disk re-allocate the block on a write as in case (1) above).

Yes, precisely.

When I buy RAID controllers, I put this requirement directly in my bid 
specifications.  I say something like the following:

------------------------------------------------------------------

During a READ operation, if the RAID controller finds an unreadable 
(uncorrectable) disk sector, then it will immediately reconstruct the 
missing data for that sector using redundant data from the rest of the 
array, and WRITE that data to the unreadable (uncorrectable) sector to 
force sector reallocation if needed by the disk.

In addition, the RAID controller will perform a continuous or regular (at 
least daily) background scan of all disk sectors to identify and repair 
any unreadable sectors as described above.

------------------------------------------------------------------
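
As a rough illustration of the two requirements above, here is a minimal 
Python sketch. It is not any vendor's firmware. It assumes a single-parity 
(XOR) layout in which the blocks at the same sector index XOR to zero across 
all disks, and it models an unreadable sector as a read error that clears 
once the sector is rewritten. The names ToyDisk, ToyArray and xor_blocks are 
hypothetical and exist only for this example.

```python
from functools import reduce


class ToyDisk:
    """Hypothetical in-memory disk. A sector listed in `bad` fails on read
    until it is rewritten, mimicking reallocation on write."""

    def __init__(self, sectors):
        self.sectors = list(sectors)
        self.bad = set()

    def read(self, i):
        if i in self.bad:
            raise IOError(f"unreadable sector {i}")
        return self.sectors[i]

    def write(self, i, data):
        # Writing lets the toy disk "reallocate" the sector.
        self.sectors[i] = data
        self.bad.discard(i)


def xor_blocks(blocks):
    """Byte-wise XOR of equal-length blocks."""
    return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*blocks))


class ToyArray:
    """Assumed layout: single parity, so any one block in a stripe is the
    XOR of the blocks at the same index on all the other disks."""

    def __init__(self, disks):
        self.disks = disks

    def read(self, d, i):
        try:
            return self.disks[d].read(i)
        except IOError:
            # Reconstruct from the rest of the array, then write the data
            # back to force reallocation of the bad sector. If a second
            # disk is unreadable at this index, the error propagates.
            peers = [dk.read(i) for k, dk in enumerate(self.disks) if k != d]
            data = xor_blocks(peers)
            self.disks[d].write(i, data)
            return data

    def scrub(self):
        """Background scan: read every sector, repair the unreadable ones.
        Returns the number of sectors repaired."""
        repaired = 0
        for i in range(len(self.disks[0].sectors)):
            for d, disk in enumerate(self.disks):
                try:
                    disk.read(i)
                except IOError:
                    self.read(d, i)
                    repaired += 1
        return repaired


# Two data disks plus one parity disk, two sectors each.
d0 = ToyDisk([b"ab", b"cd"])
d1 = ToyDisk([b"ef", b"gh"])
parity = ToyDisk([xor_blocks([d0.sectors[i], d1.sectors[i]]) for i in range(2)])
array = ToyArray([d0, d1, parity])

d0.bad.add(1)                  # simulate an unreadable sector
recovered = array.read(0, 1)   # repaired on read
d1.bad.add(0)
repaired_by_scrub = array.scrub()
```

The point of the scrub is that a bad sector gets found and rewritten while 
the rest of the stripe is still readable, rather than being discovered for 
the first time during a rebuild.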

I think it is a mistake to purchase a RAID controller without these 
features.  The absence of these features is the main reason that I don't 
think Linux software RAID is very good.

Question for the Sun/ZFS experts: Does the Sun X4500 (Thumper) do what I 
describe above?

Cheers,
 	Bruce


