[Beowulf] Power draw of cluster nodes under heavy load

Prentice Bisbal prentice.bisbal at rutgers.edu
Mon Jul 28 11:53:05 PDT 2014


On 07/28/2014 02:13 PM, Mark Hahn wrote:
>> Are any of you monitoring the power draw on your clusters? If so, can 
>> any of you provide me with some statistics on your power draw under 
>> heavy load?
>
> good question; it's something that deserves more attention and coverage.
>
> ATM, I can only provide one non-answer:
>
> http://www.sharcnet.ca/~hahn/saw-power-by-node.png
>
> this is active mixed-user load (45 unrelated users, approximately 85%
> CPU utilization due to memory scheduling and job layout constraints). 
> this is an older cluster, HP dual-socket E5440 (2.833G), whose IPMI
> happens to return nice power measures.

Thanks. That image is more helpful than you might expect: it hadn't 
occurred to me to use IPMI to report power consumption. With that, I 
could run HPL on some nodes here and take measurements.
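
As a rough sketch of the IPMI approach, assuming the nodes' BMCs support 
DCMI power readings (not every board does, and older hardware may need a 
different interface or a vendor-specific sensor), something like the 
following could poll a node while HPL runs. The host, user, and password 
arguments are placeholders, and the parsing assumes the usual 
"Instantaneous power reading: N Watts" line from ipmitool.

```python
import re
import subprocess

# Matches the "Instantaneous power reading" line printed by
# `ipmitool dcmi power reading` on BMCs that implement DCMI.
_POWER_RE = re.compile(r"Instantaneous power reading:\s*(\d+)\s*Watts")


def parse_dcmi_power(output):
    """Return the instantaneous power in watts, or None if no reading is found."""
    match = _POWER_RE.search(output)
    return int(match.group(1)) if match else None


def read_node_power(host, user, password):
    """Query one node's BMC over the LAN (arguments are placeholders)."""
    cmd = [
        "ipmitool", "-I", "lanplus",
        "-H", host, "-U", user, "-P", password,
        "dcmi", "power", "reading",
    ]
    # check=True raises CalledProcessError if ipmitool exits non-zero,
    # e.g. when the BMC is unreachable or does not support DCMI.
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return parse_dcmi_power(result.stdout)
```

Sampling this every few seconds across the nodes in a rack during an HPL 
run and summing the results would give a per-rack figure without needing 
metered PDUs.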
>
>
>> Ideally, I'm looking for the power load for a worst-case scenario, 
>> such as running HPL, on a per-rack basis.
>
> I don't understand the "per-rack" part - aren't you interested in 
> per-node?

Per-node would be even better, but I assumed most measurements would be 
taken at the PDU or circuit level, with one or two PDUs/circuits per 
rack. That seemed like the granularity most people measure at, which is 
why I asked it that way.
>
>
>> I have some numbers from a friend who lurks on this list, but the 
>> more data points I have, the better I can justify my power 
>> requirements for a new cluster purchase I'm working on.
>
> my experience is that vendors are useless in this regard: they always
> want to quote the PSU max rating, and then often don't even use the
> number right (i.e., put all the low-dissipation stuff like networking
> together, etc.).
>
> has anyone tried to rate the accuracy of vendor power calculators?
> at least a few years ago, they were absurdly inflated.

This is why I'm asking for actual, measured numbers. I read a whitepaper 
by APC or Raritan that said that if you go with the nameplate on a PDU, 
you can oversize your power requirements by a factor of 2. For HPC, I 
imagine it wouldn't be that extreme, since cluster nodes tend to run at 
100% utilization more of the time and therefore use more power. One 
vendor said they assume 60% to 90% of nameplate ratings when estimating 
power needs, which is still a pretty broad range.
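
To show how wide that 60% to 90% window is at rack scale, here is a small 
sketch of the arithmetic. The 750 W nameplate and 40 nodes per rack are 
made-up illustration values, not measurements from any real cluster.

```python
def rack_power_range(nameplate_watts, nodes_per_rack, low=0.60, high=0.90):
    """Return (low, high, nameplate) rack power estimates in watts.

    low and high are the derating fractions applied to the summed
    nameplate rating; the defaults are the vendor's 60% and 90%.
    """
    total_nameplate = nameplate_watts * nodes_per_rack
    return total_nameplate * low, total_nameplate * high, total_nameplate


# Hypothetical rack: 40 nodes, each with a 750 W nameplate rating.
low_w, high_w, nameplate_w = rack_power_range(750, 40)
print(f"nameplate: {nameplate_w / 1000:.1f} kW")
print(f"estimate:  {low_w / 1000:.1f} to {high_w / 1000:.1f} kW")
```

For those example values the estimate spans roughly 18 to 27 kW against a 
30 kW nameplate total, a 9 kW spread per rack, which is why measured 
numbers under HPL are more useful than a derating rule of thumb.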
>
> regards, mark hahn.


