Maximum room temperature
Robert G. Brown rgb at phy.duke.edu
Mon Apr 22 11:58:44 PDT 2002
- Previous message: Maximum room temperature
- Next message: Maximum room temperature
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
On Mon, 22 Apr 2002, Manel Soria wrote:

> I'm wondering what is the maximum reasonable ambient
> temperature to have in a cluster room. In our room
> with 72 nodes we have about 29-30 oC (84-86 oF).
> Is this too high ? Can this be the cause of hardware
> failures ?

Yes, it can. This is pretty high for a server room.

The best way to think of temperature and heat disposal in a cluster is to think in layers. Heat generally flows from hot to cold, at a rate proportional to the temperature difference (in kelvins or, equivalently, degrees Celsius). More specifically, the rate of flow is influenced by things like conductivities, convective flow, and radiative trapping.

The CPU core generates heat at some roughly constant rate under load. Current CPUs "can" operate at very high temperatures, on the order of 100C, although they will almost certainly operate more reliably and longer at considerably cooler core temperatures.

This heat generally flows from the CPU into the attached heatsink/fan at a rate determined by the temperature DIFFERENCE between the heatsink and the CPU. If the conductivity of the heatsink is high, and the conductivity of the interface is also high, a small temperature difference will cause a lot of heat to flow from the hotter to the cooler. The CPU is thus cooled until it isn't too much warmer than the operating temperature of the heatsink.

The heatsink then has to be cooled so that IT is cooler than the desired operating temperature of the CPU. The hotter it is, the faster it loses heat to the ambient air. The cooler the ambient air, the faster it loses heat.

Here things get a bit arcane. Air is not all that great a conductor of heat. It does have some heat capacity and will warm up when in contact with a warmer surface. Heatsinks therefore generally have lots of surface area, and fans in the case and on the heatsink itself move (hopefully cooler) air rapidly across this surface.
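A minimal sketch of the layered picture, assuming a simple steady-state model in which each layer behaves like a fixed thermal resistance in series. The function name, the 60 W load, and the C/W values are hypothetical numbers chosen for illustration, not measurements of any real CPU or heatsink.

```python
# Steady-state series thermal-resistance sketch (illustrative values only).
# Assumption: the heat flow P (watts) across a layer equals delta_T / R,
# so each layer sits P * R degrees above the layer that cools it.

def layer_temperatures(ambient_c, power_w, resistances_c_per_w):
    """Return (name, temperature) pairs from the ambient air inward.

    resistances_c_per_w lists (name, R) pairs ordered from the layer
    cooled directly by the air to the layer nearest the heat source.
    """
    temps = [("ambient air", ambient_c)]
    current = ambient_c
    for name, r in resistances_c_per_w:
        current += power_w * r  # temperature rise across this layer
        temps.append((name, current))
    return temps


# Hypothetical numbers: 60 W CPU, heatsink-to-air 0.5 C/W,
# CPU-to-heatsink interface 0.25 C/W.
layers = [("heatsink", 0.5), ("CPU core", 0.25)]

for ambient in (30.0, 20.0):
    print(f"ambient {ambient:.0f} C")
    for name, t in layer_temperatures(ambient, 60.0, layers):
        print(f"  {name}: {t:.1f} C")
```

In this simplified model the temperature rise across each layer depends only on the power and the resistance, so a change in ambient temperature shifts every layer by the same amount. Real systems deviate from this somewhat (fan speed control, airflow changes, temperature-dependent power draw).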
All things being equal, though, when the CPU produces heat at a constant rate, the heatsink/fan/air arrangement can remove heat at that rate only when the air and the heatsink have a given, approximately constant, temperature difference.

This warmed air then has to be removed from the case and replaced with cooler ambient air from the server room. The warmed air eventually has to be circulated over actively cooled (refrigerated) coils to remove the heat from the room altogether and dump it, plus all the energy required to do the cooling, into the outside air.

The cooler the room air, the cooler all the components inside your system, especially the CPU. Cooling down the room air temperature 10C should reduce the operating temperature of your CPU by very close to 10C. Most systems are probably engineered with the assumption that they will operate in air in the 68-75F temperature range (20-24C), and can probably tolerate ambient air up to about 80F (27C) without much risk.

If the ambient temperatures get much higher than this, though, your risk of catastrophic heat-induced failure starts creeping up. At around 100F/38C the risk becomes very high indeed -- close to "certain" if you try operating a system 24 hours under a high load at or above this ambient air temperature. If a system is ever operated for an extended period over 30C (86F, heading into the 90s F) it may not fail, but even if you cool it back down you may have marginally damaged components that will fail later.

An additional risk for even fairly short periods of high temperature operation is that hard disks are made of metal that expands when heated. If a disk expands too much, the write head can actually become misaligned with the tracks and your disk can be instantly and irrecoverably trashed. This can also happen if the disk is COOLED too much -- it is a bad idea to crank up a laptop after it has sat all night in a sub-zero car without letting it come to a "normal" operating temperature first...
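The rules of thumb above can be sketched as a small lookup, assuming the Fahrenheit figures in the message (75F, 80F, 100F) are taken as the band edges. The function names and band labels are hypothetical, and the bands are a paraphrase of the guidance here, not a vendor specification.

```python
# Rough ambient-temperature bands paraphrased from the rules of thumb above.
# Assumption: the band edges are the Fahrenheit figures quoted in the text.

def f_to_c(temp_f):
    """Convert degrees Fahrenheit to degrees Celsius."""
    return (temp_f - 32.0) * 5.0 / 9.0


def c_to_f(temp_c):
    """Convert degrees Celsius to degrees Fahrenheit."""
    return temp_c * 9.0 / 5.0 + 32.0


def ambient_risk(temp_c):
    """Label a room air temperature (in C) with an informal risk band."""
    if temp_c <= f_to_c(75.0):
        return "design range or cooler"
    if temp_c <= f_to_c(80.0):
        return "tolerable"
    if temp_c < f_to_c(100.0):
        return "elevated risk"
    return "very high risk"


for room_c in (22.0, 26.0, 30.0, 38.0):
    print(f"{room_c:.0f} C ({c_to_f(room_c):.0f} F): {ambient_risk(room_c)}")
```

By this rough scheme, a room at 29-30C falls in the band where the risk of heat-induced failure has started to climb.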
If I were you I'd engineer enough cooling to drop the ambient air in your cluster space by at least 5C, if not 10C, and make sure that there is enough air circulation and mixing that no systems are in local "hot spots" (where air exhausted from one system is sucked into another system, for example).

A really happy server room is one you need to wear a jacket or sweater in to be comfortable, not one that makes you want to take clothes off...;-)

   rgb

>
> Thanks.
>
> --
> ===============================================
> Dr. Manel Soria
> ETSEIT - Centre Tecnologic de Transferencia de Calor
> C/ Colom 11 08222 Terrassa (Barcelona) SPAIN
> Tf: +34 93 739 8287 ; Fax: +34 93 739 8101
> E-Mail: manel at labtie.mmt.upc.es
>
>

--
Robert G. Brown http://www.phy.duke.edu/~rgb/
Duke University Dept. of Physics, Box 90305
Durham, N.C. 27708-0305
Phone: 1-919-660-2567 Fax: 919-660-2525 email:rgb at phy.duke.edu
More information about the Beowulf mailing list
