[Beowulf] Big storage

Jakob Oestergaard jakob at unthought.net
Tue Aug 28 03:50:16 PDT 2007


On Mon, Aug 27, 2007 at 03:47:47PM -0700, Geoff Galitz wrote:
> 
> 
> I was under the impression that all modern disk adapters/controllers
> handle bad sectors at the hardware layer transparently.   If you are
> actually seeing bad sector and bad block errors then the disk is already
> beyond repair.  Is this not the case?

No. If a block on the disk has gone bad, the following can happen:

1) You write to it.
 In which case the disk automatically re-allocates the block somewhere else and
 writes your data perfectly - the only impact of this is possibly lower
 sequential read performance later on, if the disk needs to seek somewhere else
 to read that particular block.
2) You read it.
 In which case the disk can see that the checksum has failed. There's nothing it
 can do about this except give you a read error on that block.
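
To make the two cases concrete, here is a toy model in Python. It is only a
sketch: the ToyDisk class, its spare-sector pool and the use of CRC32 as the
per-block checksum are assumptions made for illustration, not a description
of how any real drive firmware works.

```python
import zlib


class ToyDisk:
    """Toy disk with a pool of spare sectors (illustrative only)."""

    def __init__(self, n_blocks, n_spares):
        self.store = {}   # physical sector -> (data, checksum)
        self.remap = {}   # logical block -> spare physical sector
        self.bad = set()  # physical sectors with damaged media
        self.spares = list(range(n_blocks, n_blocks + n_spares))

    def _physical(self, lba):
        return self.remap.get(lba, lba)

    def damage(self, lba):
        """Simulate media damage: flip one bit of whatever is stored there."""
        phys = self._physical(lba)
        self.bad.add(phys)
        if phys in self.store:
            data, crc = self.store[phys]
            if data:
                data = bytes([data[0] ^ 0x01]) + data[1:]
            self.store[phys] = (data, crc)  # checksum no longer matches

    def write(self, lba, data):
        phys = self._physical(lba)
        if phys in self.bad:
            # Case (1): re-allocate the block to a spare sector and write there.
            if not self.spares:
                raise OSError("no spare sectors left")
            phys = self.spares.pop(0)
            self.remap[lba] = phys
        self.store[phys] = (data, zlib.crc32(data))

    def read(self, lba):
        data, crc = self.store[self._physical(lba)]
        if zlib.crc32(data) != crc:
            # Case (2): the checksum failed, all we can do is report an error.
            raise OSError(f"read error on block {lba}")
        return data
```

With this model, reading a damaged block raises an error, while writing to it
silently moves the block to a spare sector, after which reads succeed again.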

Of course, modern disks have pretty sophisticated error detection/correction,
and I wouldn't be surprised if disks automatically moved or re-wrote blocks if
they saw more frequent non-fatal (correctable) read failures on a block. But
this is probably vendor and model specific. No matter what, you still end up in
either situation (1) or (2) from above, if you get to the point where a block
is distorted beyond repair.

A good RAID controller *may* help you with scenario (2) above; it can read the
block from a mirror or compute it from parity (etc.), and may even re-write the
correct block to the disk again to cure this single bad-block failure (by
letting the disk re-allocate the block on a write as in case (1) above).
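
The parity case can be sketched in a few lines of Python. This assumes a
simple single-parity (XOR) stripe; the function names and the convention of
using None for a block that came back with a read error are made up for the
example, and a real controller would of course write the rebuilt block back
to the failing disk rather than just return it.

```python
from functools import reduce


def xor_blocks(blocks):
    """Byte-wise XOR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def repair_stripe(stripe):
    """Rebuild at most one unreadable block in a single-parity stripe.

    stripe is a list of data blocks plus one XOR parity block; an entry of
    None marks a member that returned a read error for this stripe.
    """
    missing = [i for i, block in enumerate(stripe) if block is None]
    if not missing:
        return list(stripe)
    if len(missing) > 1:
        raise ValueError("single parity can rebuild only one block per stripe")
    survivors = [block for block in stripe if block is not None]
    repaired = list(stripe)
    # XOR of all surviving blocks (data and parity) yields the missing block.
    repaired[missing[0]] = xor_blocks(survivors)
    return repaired
```

A two-member stripe holding two identical copies behaves like a mirror here:
the XOR of the single surviving copy is simply that copy.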


-- 

 / jakob



