[Beowulf] Power draw of cluster nodes under heavy load

Mon Jul 28 14:14:36 PDT 2014

On 07/28/2014 03:07 PM, Joe Landman wrote:
> On 7/28/14, 2:55 PM, Prentice Bisbal wrote:
>> On 07/28/2014 01:29 PM, Jeff White wrote:
>>> Power draw will vary greatly depending on many factors.  Where I am 
>>> at we currently have 16 racks of HPC equipment (compute nodes, 
>>> storage, network gear, etc.) using about 140 kVA but can use up to 
>>> 160 kVA.  A single rack with 26 compute nodes each with 64 cores 
>>> worth of AMD 6276 (Supermicro boxes) is using about 18 kW across the 
>>> PDUs, 3 phase at 240 volts, with most of the nodes at 100% CPU usage.
>>
>> Agreed there's a lot of variability. Since I don't know exactly what's 
>> going in my new space yet, I'm looking for everyone's input to come 
>> up with an average, or ballpark amount. The 5 - 10 kW one vendor 
>> specified seems waaaay too low for a rack of high-density HPC nodes 
>> running at or near 100% utilization.
>
> Seriously, don't design for average, shoot for worst case scenario. 
> Nothing sucks so much as having too low of a power or cooling budget 
> and a big new shiny that can't be fully turned on thanks to that.

This is exactly what I'm trying to do. I assume HPL will provide a worst 
case scenario, based on the average of everyone else's worst case 
scenario. I know that doesn't make sense, but I need to eliminate 
outliers that are extremely high density, like HP's new Apollo systems. 
If my systems don't have enough power to run HPL, I can't even perform 
acceptance testing!
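
As a rough sanity check on the 5 - 10 kW figure, here is a small sketch 
using only the numbers Jeff quoted above (18 kW, 26 nodes, 64 cores per 
node). It assumes the whole 18 kW is drawn by the compute nodes, which 
overstates per-node draw a little if switches share those PDUs.

```python
# Figures quoted earlier in this thread (Jeff White's rack).
RACK_KW = 18.0         # measured across the PDUs, most nodes at 100% CPU
NODES_PER_RACK = 26
CORES_PER_NODE = 64

# Assumption: all of the measured rack power goes to the compute nodes.
watts_per_node = RACK_KW * 1000 / NODES_PER_RACK
watts_per_core = watts_per_node / CORES_PER_NODE

# Upper end of the per-rack range one vendor suggested in this thread.
VENDOR_HIGH_KW = 10.0
nodes_at_vendor_high = int(VENDOR_HIGH_KW * 1000 // watts_per_node)

print(f"{watts_per_node:.0f} W per node, {watts_per_core:.1f} W per core")
print(f"{VENDOR_HIGH_KW:.0f} kW would feed about {nodes_at_vendor_high} such nodes")
```

On those numbers a 10 kW rack holds roughly half as many loaded nodes as 
Jeff's rack does today.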

>
> I can't speak to what other vendors say/do in this regard, but I can 
> say that we try to make sure we never use more than 50% of the 
> capacity of any particular PDU, and that the PDUs have enough head 
> room to be able to handle sudden loads (say one of the PDUs falling 
> over).

In engineering, they call this a safety factor. When I was in school, a 
common safety factor was something like worst case scenario + 20%, but 
applications with extreme safety considerations, like bridges or 
amusement park rides, got a much higher safety factor.
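
To make the two rules of thumb concrete (worst case + 20%, and staying 
under 50% of PDU capacity so one PDU can absorb the other's load), here 
is a minimal sketch. The 24 kW PDU rating and the even split of load 
across PDUs are assumptions for illustration, not figures from anyone's 
actual installation.

```python
def design_load_kw(worst_case_kw, safety_factor=0.20):
    """Worst-case load padded by a safety factor (0.20 means +20%)."""
    return worst_case_kw * (1.0 + safety_factor)


def pdu_utilization(load_kw, pdu_capacity_kw, pdus_online):
    """Fraction of capacity each online PDU carries.

    Assumes the load is shared evenly across the PDUs that are online.
    """
    if pdus_online < 1:
        raise ValueError("need at least one PDU online")
    return load_kw / (pdu_capacity_kw * pdus_online)


# Hypothetical rack: 18 kW worst case, fed by two PDUs rated 24 kW each.
load = design_load_kw(18.0)
normal = pdu_utilization(load, pdu_capacity_kw=24.0, pdus_online=2)
one_failed = pdu_utilization(load, pdu_capacity_kw=24.0, pdus_online=1)

print(f"design load: {load:.1f} kW")
print(f"both PDUs up: {normal:.0%} each")
print(f"one PDU down: {one_failed:.0%} on the survivor")
```

Keeping normal utilization under 50% is what guarantees the surviving 
PDU stays under 100% when its partner trips.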
>
> We've had a situation (years ago) where we were pressed not to 
> "over-spec" the power, and despite our protests, this is what was 
> installed.  First time a PDU tripped a breaker (did I mention that 
> they overloaded our original design? No? Well ...), all the load hit 
> the second PDU, full force.  This was not pretty.
>
> The cost to "over spec" is in the noise relative to the opportunity 
> cost for under spec'ing, not to mention the "additional" cost of more 
> power (and cooling ... don't forget the cooling!).

I agree. If I overspec, no one will notice, except the accountants. If I 
underspec, and we can't use the datacenter at its designed capacity, 
everyone will notice, and it will be an embarrassment for our group.
>
> You can set the maximum boundary on power pretty easily with maximum 
> draw per node and basic math.  This ignores inrush current and power, 
> but let's assume you do a phased power on (1-3 second intervals between 
> nodes).  If you want to hit all the power buttons at once, just make 
> sure you have enough headroom for that inrush.
>
> It's not a dark art per se, but be quite aggressive in what you think 
> your power draws are going to be.   Use that to set your upper bound, 
> and assume you don't want to run your PDUs to 75% capacity normally 
> (though under extreme load with half of your other PDUs offline, this 
> isn't a bad target).

I want to be very aggressive and allow excess capacity as a safety 
margin and for future growth, but we are hitting our budget limits, and some 
are trying to 'right size' our power and cooling, which I'm afraid could 
be disastrous. Some involved in the discussion have stated only 5 - 10 
kW per full rack, which is too small. Since I don't know exactly what 
systems I'm going to get from my RFP, I can't do exact calculations 
based on specific models. I could do a few different models, but that 
can be time consuming, and it's not always easy to get all that 
information from the vendors.
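
For what it's worth, the "maximum draw per node and basic math" approach 
can be run over a few placeholder configurations in a couple of lines. 
This is only a sketch: the configuration names, per-node wattages, and 
node counts below are made-up placeholders, not vendor data, and the 20% 
headroom is the textbook figure mentioned earlier, not a recommendation.

```python
# Hypothetical candidates: (label, max watts per node, nodes per rack).
# Replace these placeholders with nameplate or measured figures.
CANDIDATES = [
    ("config A", 450, 40),
    ("config B", 700, 26),
    ("config C", 500, 72),
]
HEADROOM = 0.20  # assumed safety margin on top of maximum draw


def rack_upper_bound_kw(max_node_watts, nodes, headroom=HEADROOM):
    """Upper bound on steady-state rack power in kW; ignores inrush."""
    return max_node_watts * nodes * (1 + headroom) / 1000


def phased_power_on_seconds(nodes, interval_s):
    """Time to start every node when power-on is staggered by interval_s."""
    return max(nodes - 1, 0) * interval_s


for label, watts, nodes in CANDIDATES:
    bound = rack_upper_bound_kw(watts, nodes)
    fast = phased_power_on_seconds(nodes, 1)
    slow = phased_power_on_seconds(nodes, 3)
    print(f"{label}: {bound:.1f} kW upper bound, "
          f"{fast}-{slow} s to power on at 1-3 s per node")
```

Sizing for the largest bound that comes out of the candidate list, not 
the average, is the worst-case approach Joe is arguing for.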
>
>