[Beowulf] Are disk MTBF ratings at all useful?

mathog mathog at caltech.edu
Fri Apr 19 16:38:09 PDT 2013

Joe Landman <landman at scalableinformatics.com> wrote

> Use AFR and warranty, ignore everything else.  MTBF does not correlate
> at all against AFR, and AFR is an objective measure.

MTBF (in hours) is just the number of hours in a year divided by the
AFR, so the two carry the same information.

The specs for a randomly selected Seagate drive are:

MTBF hours 1.4 million
AFR 0.63%

With the AFR written as a fraction (0.0063), the MTBF is
365 * 24/AFR = 1.39 x 10^6 hours, as they claim.
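
As a quick check of that arithmetic, here is a minimal Python sketch.
The function names are mine, not anything from a spec sheet, and it
assumes the simple relation above (a constant failure rate over the
whole year) rather than whatever method the vendor actually used.

```python
HOURS_PER_YEAR = 365 * 24  # 8760, ignoring leap years


def mtbf_hours(afr_fraction):
    """MTBF in hours implied by an AFR given as a fraction (0.63% -> 0.0063)."""
    return HOURS_PER_YEAR / afr_fraction


def afr_from_mtbf(mtbf):
    """AFR as a fraction implied by an MTBF given in hours."""
    return HOURS_PER_YEAR / mtbf


if __name__ == "__main__":
    # Spec sheet values quoted above: AFR 0.63%, MTBF 1.4 million hours.
    print("MTBF from AFR: %.3g hours" % mtbf_hours(0.0063))
    print("AFR from MTBF: %.2f%%" % (100 * afr_from_mtbf(1.4e6)))
    print("MTBF in years: %.0f" % (mtbf_hours(0.0063) / HOURS_PER_YEAR))
```

The two spec numbers convert into each other to within rounding, which
is the point: one is derived from the other.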

Unfortunately the MTBF is nonsense because the AFR will not
stay at 0.63%, and most likely would not be measured at 0.63% at
any point during the drive's life by the end users.  It is not even
clear if the AFR is the average over the 5 year warranty period.  It
might be the value in the "trough" of the bathtub curve.
In any case, on a really fantastically reliable
set of drives the real MTBF is perhaps 15 years, 10x lower than
the roughly 160 years (1.4 million hours) in the spec.  Hard to say,
because the disk spec sheets do not actually say
where the AFR number came from, and few people keep disks that long.

The ratings I would really like the industry to use might be called
ef1, ef5, and ef10, where each is the percent of disks that are
Expected to be Functioning (defined as: works at full rated speed,
has suffered zero data-losing events, and still has
unused blocks available) at the end of the specified number
of years.  It would be really easy to compare disks with that system.
With AFR etc., not so much.
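
To show roughly how such ratings would relate to a per-year AFR, here
is a hedged Python sketch.  The function name is hypothetical, and it
makes two assumptions that are mine: that a drive stops "functioning"
only by failing outright (the definition above is stricter, since it
also counts lost speed, data loss, and exhausted spare blocks), and
that you somehow know the AFR for each year of service, which spec
sheets do not give.

```python
def expected_functioning(yearly_afr, years):
    """Percent of drives still working after `years` years.

    yearly_afr: per-year failure fractions, one entry per year of
    service (e.g. 0.0063 for 0.63%).  These are inputs the caller
    must supply; real per-year values are not on spec sheets.
    """
    if years > len(yearly_afr):
        raise ValueError("need one AFR value per year requested")
    surviving = 1.0
    for afr in yearly_afr[:years]:
        # Each year, a fraction `afr` of the remaining drives fails.
        surviving *= 1.0 - afr
    return 100.0 * surviving


if __name__ == "__main__":
    # Optimistic case only: the spec AFR of 0.63% held constant for
    # 10 years, which is exactly what is argued above NOT to happen.
    constant = [0.0063] * 10
    for n in (1, 5, 10):
        print("ef%d = %.2f%%" % (n, expected_functioning(constant, n)))
```

Feeding in a bathtub-shaped list instead of a constant one would pull
ef5 and ef10 down, but only measured per-year data could say by how
much.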


David Mathog
mathog at caltech.edu
Manager, Sequence Analysis Facility, Biology Division, Caltech

