[Beowulf] The fan from hell

David Mathog mathog at mendel.bio.caltech.edu
Fri Jun 10 12:29:59 PDT 2005

> Are you certain that replacing the fans will solve your heat
> problem in the long run? 

Not ALL my heat problems, just the one that results from the
fan going slower as the CPU gets hotter.

> My parallel database code runs
> continuously for >1day, up to 40 processors. In the end
> we measured that the system consumed more energy
> than what the cooling system consumed! When a friend arrived
> the other day he found the system room at 45 Celsius,
> with hot steam in the air :) I don't know what the board
> temperature was, but I'm sure they got pretty hot.

You are living dangerously.

Install lm_sensors.  Run a shell script to check the CPU temps
and fan speeds periodically with "sensors".  Have it shut
down if the machine gets too hot.  Mine shuts down preemptively
on a fan failure since burning up a CPU would be quite painful
given the low probability of being able to buy a replacement
Athlon MP 2000+.  Well, probably on eBay, but who knows for how

Motherboard Monitor can do the same thing on a Windows machine.
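
For the lm_sensors approach described above, a minimal sketch of such a
watchdog script might look like the following.  This is not the script
actually in use here.  The thresholds, the "--run" flag, and the function
names are illustrative assumptions, and the parsing assumes "sensors"
prints lines such as "CPU Temp: +45.0°C (high = ...)" and
"fan1: 3000 RPM (min = ...)", which varies with the sensor chip and the
lm_sensors version.

```shell
#!/bin/sh
# Sketch only: poll "sensors" and halt on overheating or a slow/stopped fan.
# All limits below are hypothetical; tune them for your own hardware.
MAX_TEMP_C=${MAX_TEMP_C:-70}       # shut down at or above this temperature
MIN_FAN_RPM=${MIN_FAN_RPM:-1000}   # shut down below this fan speed
INTERVAL=${INTERVAL:-30}           # seconds between checks

# Print the highest temperature (whole degrees) found on stdin.
# Assumes "label: +45.0°C (limits...)"; the limits in parentheses are ignored.
max_temp() {
    awk -F: 'NF >= 2 {
        v = $2; sub(/\(.*/, "", v)
        if (v ~ /C[ \t]*$/) {
            t = int(v + 0)
            if (!seen || t > max) { max = t; seen = 1 }
        }
    } END { if (seen) print max }'
}

# Print the lowest fan speed found on stdin.
# Assumes "label: 3000 RPM (min = ...)".  An unconnected fan header that
# reports 0 RPM will look like a failure, so disable it in sensors.conf.
min_fan() {
    awk -F: 'NF >= 2 {
        v = $2; sub(/\(.*/, "", v)
        if (v ~ /RPM[ \t]*$/) {
            r = int(v + 0)
            if (!seen || r < min) { min = r; seen = 1 }
        }
    } END { if (seen) print min }'
}

# Return 0 if the readings on stdin are within limits, 1 otherwise.
# Missing readings are treated as "no opinion", not as a failure.
readings_ok() {
    data=$(cat)
    temp=$(printf '%s\n' "$data" | max_temp)
    fan=$(printf '%s\n' "$data" | min_fan)
    if [ -n "$temp" ] && [ "$temp" -ge "$MAX_TEMP_C" ]; then return 1; fi
    if [ -n "$fan" ] && [ "$fan" -lt "$MIN_FAN_RPM" ]; then return 1; fi
    return 0
}

# Only start polling when explicitly asked to, so that the functions
# can be sourced and tried against saved "sensors" output first.
if [ "${1:-}" = "--run" ]; then
    while sensors | readings_ok; do
        sleep "$INTERVAL"
    done
    logger "sensor limit exceeded, shutting down"
    /sbin/shutdown -h now
fi
```

Run as root with "--run" (from an init script, for example) it polls until
a limit is crossed and then halts the machine.  Before trusting it, feed
it a saved copy of your own "sensors" output and check that max_temp and
min_fan pick out the values you expect.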


David Mathog
mathog at caltech.edu
Manager, Sequence Analysis Facility, Biology Division, Caltech

