[Beowulf] looking for a reference on failure rates

Joe Landman landman at scalableinformatics.com
Mon Mar 7 12:31:53 PST 2005

Hi Don:

   This is excellent.  More detail than I need for this, but useful 
nonetheless.  I am familiar with/have used Arrhenius models for rate 
prediction in the past, but did not make the connection to failure rates.

   The next question this raises is the failure mode.  Each failure 
mode will likely have a different activation energy.

   I'll go grab this info (have some old stuff around).  Any similarity 
for disk drives and other components?  Of course the question of what 
the activation energy would mean for a macroscopic failure (hard disks) 
is relevant ...



Don Holmgren wrote:
> On Mon, 7 Mar 2005, Joe Landman wrote:
>>Hi folks:
>>   I am looking for a reference which describes failure rates of modern
>>computer components as a function of temperature.  The usual rule of
>>thumb is that every 10 degrees above a certain value doubles the failure
>>rate (or decreases lifetime).  I would like to look at this analysis and
>>refer to it for something I am working on.
>>   Thanks
> Joe -
> Take a look at this Test and Measurement World article for starters:
>   http://www.reed-electronics.com/tmworld/article/CA187523.html
> The rule of thumb that you mention comes from using an Arrhenius model
> to describe the relationship between temperature and failure rates.
> Arrhenius first published this equation (now named after him) in 1889
>          k(T)  =  A  exp ( -Ea / RT)
> to explain the variation of reaction rates with temperature of several
> elementary chemical reactions.   Here, k is the reaction rate, A is a
> constant, Ea is the activation energy for the reaction, R is the ideal
> gas constant, and T is the temperature in Kelvin.  It turns out that
> many semiconductor degradation mechanisms - electromigration, corrosion,
> defect growth, etc. - fit this relationship well.  Note that you'll
> usually see Boltzmann's constant (another 'k') instead of 'R' in the
> semiconductor reliability literature.  Chemists use R and express Ea in
> units of kJoule/mole, physicists and engineers tend to use k and express
> Ea in electron volts.
> In the reliability literature, you'll often see the Arrhenius model
> written in terms of time to failure, which is proportional to the inverse
> of the reaction rate.  At two different temperatures T1 and T2, the
> times to failure would be given by
>     t1 = A exp (Ea / kT1)               # k = Boltzmann's constant
>     t2 = A exp (Ea / kT2)
> and so the ratio of lifetimes is given by
>    t1/t2 = exp [ Ea/k  * (1/T1 - 1/T2) ]
> If T1 is room temperature (~ 298K), an activation energy of about
> 0.54 eV would give a doubling in failure rate at a 10 degree C higher
> temperature.
> There's a handy chemist's page at
> http://antoine.frostburg.edu/chem/senese/101/kinetics/faq/temperature-and-reaction-rate.shtml
> that will let you plug in 3 of the 4 variables (T1, T2, Ea, reaction
> rate ratio) and it will give you the fourth.
> I've got a number of semiconductor reliability texts with tables of Ea
> versus failure mechanism - I can post the references if you request,
> though they're a bit dated (15 years old).  Ea varies widely in these
> tables from about 0.3 eV to as high as 2.0 eV.  There are even some
> negative Ea's, corresponding to failure mechanisms that decelerate with
> increasing temperature.  The "factor of 2 with every 10 degrees" is only
> a very rough rule of thumb.
> Don Holmgren
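
The lifetime ratio quoted above is easy to check numerically.  What follows
is only a minimal Python sketch, not something taken from the references in
the thread.  The names lifetime_ratio, ea_for_ratio and K_B_EV are
hypothetical labels chosen for illustration, and the 298 K / 308 K pair is
just the room-temperature example from the text.

```python
import math

# Boltzmann's constant in eV/K, so Ea can be given in electron volts.
K_B_EV = 8.617333262e-5


def lifetime_ratio(ea_ev, t1_k, t2_k):
    """Return t1/t2 = exp[(Ea/k) * (1/T1 - 1/T2)] for temperatures in Kelvin."""
    return math.exp(ea_ev / K_B_EV * (1.0 / t1_k - 1.0 / t2_k))


def ea_for_ratio(ratio, t1_k, t2_k):
    """Invert the relation above: the Ea (in eV) that gives a chosen t1/t2."""
    return K_B_EV * math.log(ratio) / (1.0 / t1_k - 1.0 / t2_k)


if __name__ == "__main__":
    t1, t2 = 298.0, 308.0  # room temperature and 10 degrees warmer
    print("Ea for a factor of 2: %.3f eV" % ea_for_ratio(2.0, t1, t2))
    # Range of activation energies mentioned in the quoted message.
    for ea in (0.3, 0.54, 2.0):
        print("Ea = %.2f eV -> t1/t2 = %.2f" % (ea, lifetime_ratio(ea, t1, t2)))
```

Solving for a factor of 2 between 298 K and 308 K gives roughly 0.55 eV, in
line with the ~0.54 eV figure quoted above.  Over the same 10 degrees, the
0.3 eV and 2.0 eV extremes give ratios of roughly 1.5 and 12.5, which shows
how rough the "factor of 2" rule is.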

Joseph Landman, Ph.D
Founder and CEO
Scalable Informatics LLC,
email: landman at scalableinformatics.com
web  : http://www.scalableinformatics.com
phone: +1 734 786 8423
fax  : +1 734 786 8452
cell : +1 734 612 4615
