[Beowulf] 96 cores in silent and small enclosure
hahn at mcmaster.ca
Tue Apr 13 10:37:22 PDT 2010
> I find it strange with this rather large temp range, and 55 seems very low to
> my experience. Could they possibly stand for something else? Did not find any
> description of the numbers anywhere on that address.
I think you should always worry about any temperature measured
on a system that's in the >= 65C range. as Jim mentioned, the temps
that matter are actually on-chip and not really accessible -
and we don't know what they should be anyway, or how long
the chips can tolerate particular temps. nor do we know whether
over-temp failure modes would be transient (conductivity in
semiconductors changes rapidly as a function of temperature) or
gradual (electromigration, or perhaps the solder-ball problems
nvidia had)...
the original question was about whether 60-65C is a safe operating
temperature. I think it's pretty clearly high - whether it's critical
depends on how it's measured, the specific chip's specs, etc.
but it's not the sort of operating range I'd be aiming for.
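
as a rough illustration of that rule of thumb, here is a minimal sketch that
flags any reading at or above 65C. it assumes a Linux box whose sensor
drivers expose readings under /sys/class/hwmon as temp*_input files (in
millidegrees Celsius). the function names and the WARN_C constant are
hypothetical, and these are whatever sensors the board happens to expose,
not necessarily the on-chip temps that matter.

```python
import glob
import os

# threshold taken from the rule of thumb above; an assumption, not a vendor spec
WARN_C = 65.0


def read_hwmon_temps(root="/sys/class/hwmon"):
    """Return {path: degrees C} for every temp*_input file under root."""
    temps = {}
    for path in sorted(glob.glob(os.path.join(root, "hwmon*", "temp*_input"))):
        try:
            with open(path) as f:
                # the kernel reports these values in millidegrees Celsius
                temps[path] = int(f.read().strip()) / 1000.0
        except (OSError, ValueError):
            continue  # sensor unreadable or not reporting; skip it
    return temps


def hot_sensors(temps, warn_c=WARN_C):
    """Return only the readings at or above warn_c."""
    return {p: t for p, t in temps.items() if t >= warn_c}


if __name__ == "__main__":
    for path, t in hot_sensors(read_hwmon_temps()).items():
        print(f"{path}: {t:.1f}C")
```

a flagged sensor only says the reading is high; whether it's critical still
depends on where the sensor sits and on the chip's specs.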