[Beowulf] lm_sensors and clusters and wrong intel cpu readings

John Hearns hearnsj at googlemail.com
Thu Aug 9 00:21:38 PDT 2012


Well, I don't use lm_sensors for a start!
Use the ipmitool utility to read the sensors from the BMC cards (iLO,
DRAC and so on; they all do the same job).
I don't trust the absolute calibration of the sensors. Generally
you're looking to set a limit at which to alarm or shut down, so
just take a reading with no load on the CPU and call that the
'normal' reading.
I may be wrong. YMMV.
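
As a rough sketch of the baseline idea above (not a tested monitoring
setup): the Python below parses the text printed by
`ipmitool sdr type Temperature` and flags any sensor that has risen more
than a margin above its idle reading. The helper names, the 25 C margin
and the assumed output layout ("name | id | status | entity | NN degrees C")
are illustrative assumptions; sensor names and layout vary between BMC
vendors, so check what your own boards print.

```python
import re
import subprocess

# Assumed line layout: "CPU1 Temp | 30h | ok | 3.1 | 45 degrees C".
# Lines without a numeric reading (e.g. "No Reading") are skipped.
_TEMP_LINE = re.compile(
    r"^(?P<name>[^|]+?)\s*\|.*\|\s*(?P<value>-?\d+(?:\.\d+)?)\s+degrees C\s*$"
)


def parse_temperatures(text):
    """Return {sensor name: degrees C} for every line with a numeric reading."""
    readings = {}
    for line in text.splitlines():
        match = _TEMP_LINE.match(line)
        if match:
            readings[match.group("name")] = float(match.group("value"))
    return readings


def over_margin(baseline, current, margin_c=25.0):
    """List sensors reading more than margin_c above their idle baseline.

    margin_c is a hypothetical default; pick one that suits your hardware.
    """
    return sorted(
        name
        for name, value in current.items()
        if name in baseline and value - baseline[name] > margin_c
    )


def read_bmc_temperatures():
    """Query the local BMC; needs ipmitool installed and usually root."""
    result = subprocess.run(
        ["ipmitool", "sdr", "type", "Temperature"],
        capture_output=True, text=True, check=True,
    )
    return parse_temperatures(result.stdout)
```

Typical use would be to call read_bmc_temperatures() once on an idle node,
store the result as the baseline, and then compare later readings with
over_margin(baseline, read_bmc_temperatures()).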

On 08/08/2012, Vincent Diepeveen <diep at xs4all.nl> wrote:
> hi,
>
> How do you guys monitor the CPU core temperatures?
>
> If I run lm_sensors, on every node it reads 30C higher than what
> Windows reported on the few nodes I tried comparing.
> Also, under full load it reports temperatures in the high 60s, and
> I've seen up to 78C reported.
> I'm guessing it should be 30-40C or a little over at most.
>
> The air blowing from and around the CPUs is cool. Nothing is even 'warm'.
>
> Nodes here: Supermicro X7DWE with Xeon L5420s inside. They are not
> overclocked.
>
> I also downloaded definitions for some similar motherboards. It seems
> they were uploaded for motherboards with dual-core Xeons
> and such, not for the quad-cores. None of those definitions 'corrects'
> the temperature of the quad-core Xeons; they basically just drop
> readings that are not being used.
>
> Now I bet several clusters/supercomputers had these CPUs. How did
> you solve this problem with the Intel L5420s?
>
> Maybe someone still has an lm_sensors script lying around somewhere
> that fixes it for the Intel Xeons?
>
> Thanks in advance,
> Vincent
