[Beowulf] Cooling vs HW replacement

George Georgalis george at galis.org
Sun Jan 16 22:33:22 PST 2005

On Sun, Jan 09, 2005 at 02:09:40PM +0100, Ariel Sabiguero wrote:
>The question arises as most current hardware comes with 3 or more years 
>of warranty. During that period of time Moore twofolded twice hardware 
>performance... is it worth spending money cooling down a cluster or just 
>rebuilding it after it "burns out" (and is at least 4 times slower than 
>state-of-the art)?
>Is it worth cooling down the room to a Class A Computer room standard or 
>save the money for hardware upgrade after three years? In warm countries 
>keeping 18°C the air inside a room (PC-heated) when outside temperature 
>is 30°C average it becomes pretty expensive to pay electricity bills. It 
>is cheaper to "circulate" 30°C air and have from 40-50°C inside the chassis.

I don't have numbers or proof, but some experience and well...

Use a SAN/NAS (NFS) and keep the disks in a separate room from the CPUs.
Disk drives generate a lot of heat but, compared to on-board components,
don't really need cooling; circulated air should largely cover them.

Minimize the disk count in the CPU room and use efficient power supplies;
they won't need as much capacity since they aren't driving disks. Much
less cooling will be required.

That's about all I can say for sure. A site I know was doing that and
replacing CPUs about every 12 months, per Moore's law. Sorry, no real
numbers about actual or abusive temperatures, but I would avoid abusive
temperatures. If you have 3% failure at 65F at 3 years, and 15% failure
at 80F at 3 years, do you really think your production CPUs are going
to wait 3 years to start failing? CPU failures mean unpredictable errors
and nontrivial diagnosis and repair. A failed disk in a hot-swap mirrored
RAID array, by contrast, is trivial to detect and replace. (careful not to fry your raid
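
To put rough numbers on the point that failures won't wait until year
three, here is a minimal sketch. The 3% and 15% figures are the
hypothetical ones above, not measurements. The constant failure rate,
the 100-node cluster size, and the function names are all assumptions
made for illustration.

```python
# Illustrative only: the 3% and 15% three-year failure fractions are the
# hypothetical figures from the message, not measured data.
# Assumption: a constant hazard rate (exponential lifetimes), so failures
# accumulate from day one rather than arriving at the end of year three.


def cumulative_failure(frac_at_horizon: float, horizon_years: float,
                       t_years: float) -> float:
    """Fraction of nodes failed by t_years, given the fraction failed
    by horizon_years, under a constant hazard rate."""
    if not 0.0 <= frac_at_horizon < 1.0:
        raise ValueError("frac_at_horizon must be in [0, 1)")
    if horizon_years <= 0 or t_years < 0:
        raise ValueError("horizon_years must be > 0 and t_years >= 0")
    survival_at_horizon = 1.0 - frac_at_horizon
    # Survival decays geometrically with time under a constant hazard.
    return 1.0 - survival_at_horizon ** (t_years / horizon_years)


def expected_failures(nodes: int, frac_at_horizon: float,
                      horizon_years: float, t_years: float) -> float:
    """Expected number of failed nodes by t_years."""
    return nodes * cumulative_failure(frac_at_horizon, horizon_years, t_years)


if __name__ == "__main__":
    NODES = 100  # hypothetical cluster size
    for label, frac in (("65F", 0.03), ("80F", 0.15)):
        for year in (1, 2, 3):
            failed = expected_failures(NODES, frac, 3, year)
            print(f"{label} year {year}: {failed:.1f} nodes expected failed")
```

Under that assumption, roughly one node in 100 fails in the first year
with the 65F figure and about five with the 80F figure, so the warm
room pays its failure cost well before the three-year mark.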

If you really want to focus on efficiency and engineering, I bet one
(appropriately sized) power supply per 3 or 5 computers is a sweet spot.
They could possibly run outside the CPU room too.

// George
PS - sorry, my SMTP server doesn't accept mail from uy subnets.
Most free webmail gets through if you want to contact
me directly.

George Georgalis, systems architect, administrator Linux BSD IXOYE
http://galis.org/george/ cell:646-331-2027 mailto:george at galis.org
