[Beowulf] Problems with Dell M620 and CPU power throttling
hahn at mcmaster.ca
Fri Aug 30 10:48:03 PDT 2013
> [root@r2c3n4 thermal_throttle]# ls
> core_power_limit_count core_throttle_count package_power_limit_count
> [root@r2c3n4 thermal_throttle]# cat *
> This was what led us to how the chassis was limiting power. We had been
I don't mean to be pedantic, but to me this is the CPU throttling itself,
based on its own temperature readings and power rating. The coretemp
module, judging from its modinfo, seems to be purely on-chip.
/sys/bus/platform/devices/coretemp.0 probably contains some other
stuff which might be interesting - for instance, what your *_max
> using redundancy and switched to non-redundant to try and eliminate. We
> believe that we see these messages when the CPU is throttling up in power.
I read the *_limit_count as meaning "18781048 times the core was
down-clocked because it exceeded power limits", i.e. not "throttling up",
though I suppose these things are almost symmetric...
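
One way to check that reading is to snapshot the counters twice and see
which ones grow while a job is running. A minimal sketch, assuming the
counters are exposed under /sys/devices/system/cpu/cpuN/thermal_throttle/
as in the listing quoted above; the function names are hypothetical and
the set of *_count files may differ between kernels:

```python
from pathlib import Path


def read_throttle_counts(cpu_root="/sys/devices/system/cpu"):
    """Return {cpu_name: {counter_name: value}} for each cpuN/thermal_throttle dir.

    cpu_root is a parameter so the function can be pointed at a copy of the
    tree; the default is the usual sysfs location (an assumption here).
    """
    counts = {}
    for tdir in sorted(Path(cpu_root).glob("cpu[0-9]*/thermal_throttle")):
        per_cpu = {}
        for counter in sorted(tdir.glob("*_count")):
            try:
                per_cpu[counter.name] = int(counter.read_text().strip())
            except (OSError, ValueError):
                continue  # unreadable or non-numeric file: skip it
        counts[tdir.parent.name] = per_cpu
    return counts


def count_deltas(before, after):
    """Return {(cpu, counter): increase} for counters that grew between snapshots."""
    deltas = {}
    for cpu, counters in after.items():
        for name, value in counters.items():
            grew = value - before.get(cpu, {}).get(name, 0)
            if grew > 0:
                deltas[(cpu, name)] = grew
    return deltas
```

Taking one snapshot, waiting a few seconds under load, taking another and
passing both to count_deltas() shows whether core_power_limit_count is
still climbing, which would point at power-limit down-clocking rather than
a one-off event.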
> These are E5-2670 0 @ 2.60GHz. Two per node.
So the spec is 115 W and Tcase max 80C. That's not as low a threshold
as some chips (67C seems pretty low, for instance).