[Beowulf] Cooling vs HW replacement

Alvin Oga alvin at Mail.Linux-Consulting.com
Tue Jan 18 13:26:23 PST 2005


hi ya luc

On Tue, 18 Jan 2005, Luc Vereecken wrote:

> The first summer I had a failure rate of over 60%. Some motherboards 

the normal failure rate is, say, 5% or so for the first 30 days or the first year..
	- if you lose many more systems than that, then it's a vendor parts
	problem ( wherever you or they get the parts to build the systems )

> failed, plenty of power supplies failed, I had 10 brand-new disks that ran so 
> hot at times I couldn't put my hand on them at these ambient temperatures. 
> 5 of them failed in the first 6 months, the other 5 a few months later.

the disks should be cool to the touch ... say no more than 30C
for their operating temp ( hddtemp seems to be a good measure )
	- silly things like a $3 or $15 fan will keep a disk from
	failing, and use 2 of 'em to avoid a single fan failure problem
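
a rough sketch of how the 30C rule of thumb above could be checked from a
script, in python. it assumes hddtemp prints one line per disk in the form
"/dev/sda: MODEL: 34°C"; the function names, the sample model strings and
the 30C default are made up for illustration, so adjust them to whatever
your hddtemp actually prints.

```python
import re

# Assumed hddtemp line format (one line per disk):
#   /dev/sda: ST3500320AS: 34°C
# Lines without a numeric reading (e.g. a sleeping drive) are skipped.
LINE_RE = re.compile(
    r"^(?P<dev>/dev/[^:\s]+):\s*(?P<model>.*):\s*(?P<temp>-?\d+)\s*°?\s*C$"
)


def parse_hddtemp(text):
    """Return {device: temperature_in_C} for each line with a reading."""
    readings = {}
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            readings[match.group("dev")] = int(match.group("temp"))
    return readings


def too_hot(readings, limit_c=30):
    """Return the sorted list of devices running above limit_c."""
    return sorted(dev for dev, temp in readings.items() if temp > limit_c)


if __name__ == "__main__":
    # Hypothetical sample output; replace with real hddtemp output.
    sample = (
        "/dev/sda: ST3500320AS: 34°C\n"
        "/dev/sdb: WDC WD800JB: 28°C\n"
        "/dev/sdc: WDC WD800JB: drive is sleeping\n"
    )
    for dev in too_hot(parse_hddtemp(sample)):
        print(dev, "is above 30C, check its fan")
```

the text to parse would come from running hddtemp on each node ( it usually
needs root ), for example from a cron job that mails you the hot disks.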

yup.. after an AC failure, lots of disks will die within 2-3 months
if some died during the AC failure

c ya
alvin



