Disk reliability (Was: Node cloning)

Mark Hahn hahn at coffee.psychology.mcmaster.ca
Sat Apr 7 10:33:40 PDT 2001

> I've heard that, and yet 'badblocks' finds problems in 20-30% of our IDE
> drives.  This leads to two possible conclusions:
> (1) IDE drives have huge failure rates, or

this is a ridiculous overgeneralization, especially since you're 
talking about something like 6 generations of disks.  
consider that density has increased by a factor of *30*.

> (2) drives are not capable of detecting and remapping all problematic
> blocks

all bad blocks will be reflected in SMART values, afaik.
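
One way to check this on a given drive is to pull the remapping-related counters out of a SMART attribute dump. The sketch below is only an illustration: it assumes the ten-column attribute table printed by smartmontools' `smartctl -A`, and the attribute names in `WATCHED` are the ones that tool commonly reports, not something every drive (especially an early SMART implementation) will provide. The function names are hypothetical.

```python
# Attribute names assumed from typical `smartctl -A` output; drives may omit them.
WATCHED = ("Reallocated_Sector_Ct", "Current_Pending_Sector", "Offline_Uncorrectable")


def parse_smart_attributes(text):
    """Return {attribute_name: raw_value} for the rows of an attribute table."""
    attrs = {}
    for line in text.splitlines():
        fields = line.split()
        # Data rows start with a numeric attribute ID and have at least 10 columns.
        if len(fields) < 10 or not fields[0].isdigit():
            continue
        try:
            attrs[fields[1]] = int(fields[9])
        except ValueError:
            continue  # raw value is not a plain integer; skip this row
    return attrs


def remap_indicators(attrs):
    """Keep only the counters that relate to bad or remapped sectors."""
    return {name: attrs[name] for name in WATCHED if name in attrs}
```

A non-zero raw value in any of these counters would suggest the drive itself has noticed bad sectors; an empty result means the drive does not report them in this form.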

> kernel: hda: dma_intr: status=0x51 { DriveReady SeekComplete Error } 
> kernel: hda: dma_intr: error=0x84 { DriveStatusError BadCRC } 
> kernel: hda: read_intr: status=0x5b { DriveReady SeekComplete
> DataRequest Index Error } 

this has NOTHING to do with bad blocks; it's purely a cabling
(or possibly clocking) issue.  a transfer (just the transfer)
has failed its checksum, and will be retried.  it indicates
the signal was corrupted on the cable.  IDE rules:

	1. ide cables must be 18" or less.  there are some other
	constraints, within this, about how close connectors can be spaced.
	2. both ends must be plugged in (no stub).
	3. udma > 33 requires 80-conductor cable.
	4. in general, avoid mixing vendors on the same channel
	(master/slave).  of course, avoid slaves in the first place.
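
The status and error bytes in the log lines quoted above can be decoded by hand, which makes it easier to see that the drive is reporting a CRC failure on the transfer rather than a media error. The sketch below is an assumption-laden illustration: the bit names are the ones that appear in the quoted log, and the remaining table entries follow my recollection of the ATA register layout and the messages the Linux IDE driver prints. The function names are hypothetical.

```python
# ATA status register bits, highest bit first (names as printed in the log).
STATUS_BITS = [
    (0x40, "DriveReady"),
    (0x20, "DeviceFault"),
    (0x10, "SeekComplete"),
    (0x08, "DataRequest"),
    (0x04, "CorrectedError"),
    (0x02, "Index"),
    (0x01, "Error"),
]


def decode_status(value):
    """Return the names of the bits set in an IDE status byte."""
    if value & 0x80:
        return ["Busy"]  # the other bits are not meaningful while busy
    return [name for mask, name in STATUS_BITS if value & mask]


def decode_error(value):
    """Return the names of the bits set in an IDE error byte."""
    names = []
    aborted = bool(value & 0x04)
    if aborted:
        names.append("DriveStatusError")
    if value & 0x80:
        # Bit 7 means an interface CRC error when the command was aborted,
        # and a bad sector otherwise.
        names.append("BadCRC" if aborted else "BadSector")
    if value & 0x40:
        names.append("UncorrectableError")
    if value & 0x10:
        names.append("SectorIdNotFound")
    if value & 0x02:
        names.append("TrackZeroNotFound")
    if value & 0x01:
        names.append("AddrMarkNotFound")
    return names
```

With these tables, `decode_status(0x51)` and `decode_error(0x84)` give the same names as the first two kernel messages: the command was aborted with bit 7 set, which is the CRC case and not the bad-sector case.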

> Seagate ST36530A:
> 7200rpm, UDMA, 6.5GB, made in 1998
> 29 drives in use, more than 6 failed and were replaced, an additional 8
> have bad blocks detected by 'badblocks' program

notoriously hot disks, of the awfully old 2G/platter generation.
in fact, just about everything about them screams "version 1.0".
(I have one myself, still in use.  early SMART implementations like 
this one don't seem to provide any useful numbers.)

> 36 drives in use, 1 failed and was replaced, an additional 8 have bad
> blocks detected by 'badblocks' program

have you run badblocks multiple times, and compared the output?
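
A comparison like that can be scripted. The sketch below assumes each run was saved with `badblocks -o <file>`, which writes one block number per line; the function names are hypothetical. Blocks that show up in every run are likely real media defects, while blocks that appear only in some runs point toward transient transfer errors such as the cabling problem described above.

```python
def parse_badblocks(text):
    """Parse badblocks output (one block number per line) into a set of ints."""
    return {int(token) for token in text.split()}


def compare_runs(runs):
    """Split block numbers into (persistent, intermittent) sorted lists.

    `runs` is a list of sets, one per badblocks run.
    """
    if not runs:
        return [], []
    persistent = set.intersection(*runs)  # reported by every run
    seen = set.union(*runs)               # reported by at least one run
    return sorted(persistent), sorted(seen - persistent)
```

For example, if one run reports blocks 100 and 2048 and a second run reports only 100, `compare_runs` returns `([100], [2048])`: block 100 is consistently bad, and 2048 is suspect but not reproducible.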
