Disk reliability (Was: Node cloning)

Josip Loncaric josip at icase.edu
Fri Apr 6 12:27:53 PDT 2001

Greg Lindahl wrote:
> On Fri, Apr 06, 2001 at 10:11:27AM -0400, Josip Loncaric wrote:
> > We now use a combination of [7200rpm] Seagate and IBM drives, and over
> > the past two years about 20% of them have developed at least some bad
> > blocks that we had to map out using the 'e2fsck -c ...' command.
> How interesting. Centurion I and II have 2 TB of disk (mostly 5400 rpm
> IDE), and we've never had to manually do that.

"Had to" is a strong word.  We did not have to, except initially when
some drives were DOA.  Did you ever try to look for bad blocks after the
initial filesystem build?  If you see any drive related warning messages
in the system log, you may want to do "echo '-f -c' >/fsckoptions",
reboot the system, and watch what happens.  Also, do you use PIO on
these drives?  In a couple of cases, we had lower reliability when DMA
was in use, which improved once we turned DMA off (I suspect noisy
internal IDE cables and/or marginal interface electronics).  Other
drives work fine in DMA mode.  We've also tried various Linux utilities
to read the drives' S.M.A.R.T. data (we enabled this in BIOS) but the
printed interpretation of their results does not make sense (they only
distinguish Seagate from IBM drives).  

BTW, the nonrecoverable error rate of 7200rpm IDE drives is typically
rated at 1 error per 10^13 bits read, so you have a decent chance of
finding problems in a 1-2 TB disk farm (which contains about 10^13 bits).
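
To put rough numbers on that estimate, here is a minimal sketch.  It
assumes that errors are independent, that every bit in the farm is read
exactly once, and that 1 TB means 10^12 bytes.  The function name and
constants are illustrative only, and the rate is the rated figure quoted
above, not a measurement.

```python
import math

RATED_ERROR_RATE = 1e-13   # rated nonrecoverable errors per bit read (figure quoted above)
BITS_PER_TB = 8e12         # assumption: 1 TB = 10^12 bytes = 8 * 10^12 bits


def p_at_least_one_error(bits_read, error_rate=RATED_ERROR_RATE):
    """Probability of at least one nonrecoverable error in bits_read bits.

    Assumes independent errors, so P = 1 - (1 - rate)^bits.  log1p/expm1
    keep the result accurate when the rate is tiny.
    """
    return -math.expm1(bits_read * math.log1p(-error_rate))


if __name__ == "__main__":
    for tb in (1, 2):
        bits = tb * BITS_PER_TB
        expected = bits * RATED_ERROR_RATE
        print(f"{tb} TB: expected errors = {expected:.1f}, "
              f"P(at least one) = {p_at_least_one_error(bits):.2f}")
```

Under these assumptions, one full read of a 1 TB farm has roughly a 55%
chance of hitting at least one nonrecoverable error, and a 2 TB farm
roughly 80%.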

> BTW you can "mkswap -c" to mark bad blocks in swap. I don't know why
> you'd find a bad block in swap less acceptable than a bad block in the
> filesystem.

As far as I know, "mkswap -c" only checks and reports the count of bad
blocks, just in case you'd like to be told.  The man page does not say
anything about *mapping* them out.  My understanding is that swap gets
used as a straight data area with no gaps, so if you did have a bad
block, this could cause big trouble.

Correct me if I'm wrong and feel free to comment more on reliability of
various components.  This is an important topic because Beowulfs get
very hard use even though they are not built with gold-plated stuff...


Dr. Josip Loncaric, Research Fellow               mailto:josip at icase.edu
ICASE, Mail Stop 132C           PGP key at http://www.icase.edu./~josip/
NASA Langley Research Center             mailto:j.loncaric at larc.nasa.gov
Hampton, VA 23681-2199, USA    Tel. +1 757 864-2192  Fax +1 757 864-6134
