Disk reliability (Was: Node cloning)

Josip Loncaric josip at icase.edu
Fri Apr 6 14:34:36 PDT 2001

Trent Piepho wrote:
> When a drive block goes bad, the drive automatically remaps it to a spare
> block.  If you find bad blocks on a modern hard disk, something is wrong with
> your drive.

I've heard that, and yet 'badblocks' finds problems in 20-30% of our IDE
drives.  This leads to two possible conclusions:

(1) IDE drives have huge failure rates, or
(2) drives are not capable of detecting and remapping all problematic
blocks.

I'd pick (2), particularly because an area of the disk can fail in many
ways, not all of which would trigger automatic remapping.  We are mostly
troubled by SeekComplete errors, although I've also seen other error
messages:

kernel: hda: dma_intr: status=0x51 { DriveReady SeekComplete Error } 
kernel: hda: dma_intr: error=0x84 { DriveStatusError BadCRC } 
kernel: hda: read_intr: status=0x5b { DriveReady SeekComplete DataRequest Index Error }

It is possible that SeekComplete errors are due to some difficulty that
the drive has in tracking the servo signal in a few spots.  Not
accessing those spots gets around the problem.
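
For anyone who wants to read the status bytes in the kernel messages
above, here is a minimal Python sketch that expands a status value into
flag names.  The bit-to-name table is my assumption, based on the
standard ATA status register layout and the names the driver prints; the
function name decode_status is made up for illustration.  Under this
reading, SeekComplete is one of several flags reported alongside the
Error bit (0x01), not an error code by itself.

```python
# Assumed mapping of ATA status register bits to the names printed by the
# Linux IDE driver. Only the names visible in the quoted log lines are
# confirmed by the messages; the rest follow the ATA register layout.
STATUS_BITS = [
    (0x80, "Busy"),
    (0x40, "DriveReady"),
    (0x20, "DeviceFault"),
    (0x10, "SeekComplete"),
    (0x08, "DataRequest"),
    (0x04, "CorrectedError"),
    (0x02, "Index"),
    (0x01, "Error"),
]


def decode_status(value):
    """Return the names of all flags set in a status byte, high bit first."""
    return [name for mask, name in STATUS_BITS if value & mask]


# The two status values quoted in the kernel messages.
for raw in (0x51, 0x5B):
    print(f"status=0x{raw:02x} {{ {' '.join(decode_status(raw))} }}")
```

Both quoted values decode to the same flag lists the kernel printed,
which is a reasonable sanity check on the table.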

FYI, here is how our experience breaks down by drive model:

Seagate ST36530A:
7200rpm, UDMA, 6.5GB, made in 1998
29 drives in use, more than 6 failed and were replaced, an additional 8
have bad blocks detected by 'badblocks' program

IBM DTTA-371440/DPTA-372050/DTLA-307030:
7200rpm, UDMA, 14.4/20.5/30GB, made in 1999-2000
36 drives in use, 1 failed and was replaced, an additional 8 have bad
blocks detected by 'badblocks' program

Overall, we've had serious uncorrectable problems with over 10% of the
IDE drives we bought, while an additional 25% have problems we corrected
using 'e2fsck -c ...'.  Obviously, Linux can do more to bypass problem
areas than the drive's hardware can.
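
The percentages above can be checked from the per-model counts with a
short Python sketch.  Two assumptions are mine: the "in use" counts are
taken as the denominators, and "more than 6" is entered as 6, so the
replacement rate it prints is a lower bound.  The dictionary keys and
the helper name percent are made up for illustration.

```python
# Counts taken from the per-model breakdown. "replaced" for the Seagate
# drives is a floor, because the text says "more than 6".
fleets = {
    "Seagate ST36530A": {"in_use": 29, "replaced": 6, "bad_blocks": 8},
    "IBM DTTA/DPTA/DTLA": {"in_use": 36, "replaced": 1, "bad_blocks": 8},
}


def percent(part, whole):
    """Return part as a percentage of whole."""
    return 100.0 * part / whole


total = sum(f["in_use"] for f in fleets.values())
replaced_floor = sum(f["replaced"] for f in fleets.values())
bad_blocks = sum(f["bad_blocks"] for f in fleets.values())

print(f"drives counted:        {total}")
print(f"replaced (at least):   {percent(replaced_floor, total):.1f}%")
print(f"bad blocks, corrected: {percent(bad_blocks, total):.1f}%")

# Per-model share of drives with bad blocks found by 'badblocks'.
for model, f in fleets.items():
    print(f"{model}: {percent(f['bad_blocks'], f['in_use']):.1f}% with bad blocks")
```

With these inputs the replacement rate comes out just above 10% and the
bad-block share just under 25%, in line with the figures quoted.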

Regarding 'mkswap -c ...'

> Sure it does, from mkswap(8):
>        The old setup wastes most of this bitmap page, because zero bits denote
>        bad blocks or blocks past the end of the swap space, and a simple
>        integer suffices to indicate the size of the swap space, while the bad
>        blocks, if any, can simply be listed.

Thanks for pointing that out.  I was looking only at this part:

       -c     Check  the device (if it is a block device) for bad
              blocks before creating the swap area.  If  any  are
              found, the count is printed.

which does not mention mapping out bad blocks.  However, please note
that just after the section you quoted the author of mkswap(8) suggests
NOT using swap partitions where any bad blocks are found:

                         Nobody wants to use a  swap  space  with
       hundreds of bad blocks. (I would not even use a swap space
       with 1 bad block.)


Dr. Josip Loncaric, Research Fellow               mailto:josip at icase.edu
ICASE, Mail Stop 132C           PGP key at http://www.icase.edu./~josip/
NASA Langley Research Center             mailto:j.loncaric at larc.nasa.gov
Hampton, VA 23681-2199, USA    Tel. +1 757 864-2192  Fax +1 757 864-6134
