[Beowulf] 96 cores in silent and small enclosure

Lux, Jim (337C) james.p.lux at jpl.nasa.gov
Wed Apr 14 16:39:53 PDT 2010


Try this
http://rel.intersil.com/docs/rel/calculation_of_semiconductor_failure_rates.pdf

You might also look for MIL-HDBK-217

Of course, a paper by H.S. Blanks makes the following statement:
"Although the temperature dependence of failure rate can be very high, in most situations it is much less than that of the Arrhenius acceleration factor. It is very improbable that the temperature dependence of component failure rate can be meaningfully modelled for reliability prediction purposes or for the purpose of optimizing thermal design component layout."
(from abstract for "Arrhenius and the temperature dependence of non-constant failure rate" Quality and Reliability Engineering International, Vol 6, #4, pp259-265, 20 Mar 2007)
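
For reference, the Arrhenius acceleration factor that the abstract mentions is usually written AF = exp((Ea/k) * (1/T_low - 1/T_high)), with temperatures in kelvin. Here is a minimal sketch of that calculation for the 55 C vs 65 C case discussed in this thread. The activation energy of 0.7 eV is only an assumed illustrative value (it varies widely by failure mechanism), and the function name is made up for this example.

```python
import math

# Boltzmann constant in eV/K
BOLTZMANN_EV_PER_K = 8.617333262e-5


def arrhenius_acceleration_factor(t_low_c, t_high_c, activation_energy_ev=0.7):
    """Arrhenius acceleration factor between two temperatures given in Celsius.

    activation_energy_ev defaults to 0.7 eV, which is an ASSUMED illustrative
    value; the real figure depends on the failure mechanism and the part.
    """
    t_low_k = t_low_c + 273.15    # convert Celsius to kelvin
    t_high_k = t_high_c + 273.15
    exponent = (activation_energy_ev / BOLTZMANN_EV_PER_K) * (
        1.0 / t_low_k - 1.0 / t_high_k
    )
    return math.exp(exponent)


# Ratio of modelled failure rates at 65 C relative to 55 C
print(arrhenius_acceleration_factor(55.0, 65.0))
```

With that assumed activation energy the factor comes out at roughly 2 for a 10 C rise. Per the Blanks abstract above, the actual temperature dependence of failure rate is in most situations much less than this, so treat the number as an upper-end model figure rather than a prediction.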

You might also browse around http://www.weibull.com/ or http://www.klabs.org/


Jim


From: beowulf-bounces at beowulf.org [mailto:beowulf-bounces at beowulf.org] On Behalf Of Jon Tegner
Sent: Wednesday, April 14, 2010 1:12 AM
To: Mark Hahn
Cc: beowulf at beowulf.org
Subject: Re: Re: [Beowulf] 96 cores in silent and small enclosure


the max temp spec is not some arbitrary knob that the chip vendors
choose out of spiteful anti-green-ness. I wouldn't be surprised to see some

****************************************************************

The issue is not the temp spec of current CPUs; the problem is that it is hard to get relevant information. I haven't found anything that states that the failure rate in year 5 would be significantly higher if you operate the CPU at 65 C instead of 55 C. I'm just saying this kind of information would be valuable (and I would be glad to find it).




