[Beowulf] Mark Hahn's Beowulf/Cluster/HPC mini-FAQ for newbies & some further thoughts
Prentice Bisbal
prentice.bisbal at rutgers.edu
Tue Nov 6 06:46:05 PST 2012
On 11/05/2012 08:14 PM, Christopher Samuel wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> On 05/11/12 20:02, Mark Hahn wrote:
>
>>> For serious work, the cluster and its software needs to survive
>>> power outages,
>> well, it's a cost-benefit tradeoff. my organization has no power
>> protection on any compute nodes, though we do UPSify the storage.
> Agreed - we do the same here at VLSCI. Having UPS for our IBM
> systems (iDataplex and BlueGene/Q) wouldn't make sense given our power
> around here is pretty good (touch wood) and would have had a sizeable
> impact on the cost of the data centre to house them. However, all our
> storage and infrastructure stuff like management node, etc, are on UPS.
>
> Our SGI cluster though is on UPS, it's in a different data centre
> where all racks are on UPS (DRUPS in this case) and there is no
> non-protected option.
>
For those too lazy to google 'DRUPS', I did it for you:
http://en.wikipedia.org/wiki/Diesel_rotary_uninterruptible_power_supply
At my previous employer, my cluster was only 64 nodes, so it's a bit
smaller than most of your clusters, I think. The head node had software
from the UPS vendor installed so it could listen for a signal from the
UPS. As soon as the UPS lost incoming power and switched to battery, the
head node did the following (I was using SGE, so some of this
terminology is SGE-specific):
1. Disable all queues on all execution nodes.
2. Kill all running jobs, and requeue them at the same time. Since all
queues were disabled, the jobs would sit in the scheduler queue but not
run.
3. Shut down all cluster nodes using IPMI.
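A minimal sketch of how the three steps above could be scripted on the head node. The post does not say which commands were actually used, so everything here is an assumption: the BMC hostnames, the IPMI user, and the password file path are hypothetical, and `qmod -rq` only reschedules jobs that SGE considers rerunnable. It prints the commands by default rather than running them.

```python
import shlex
import subprocess

# Hypothetical BMC hostnames; the original post does not name its nodes.
BMC_HOSTS = [f"node{i:02d}-bmc" for i in range(1, 65)]


def shutdown_plan(bmc_hosts, ipmi_user="admin", password_file="/root/.ipmipass"):
    """Return the commands for the three steps, in order, as argv lists."""
    plan = [
        # Step 1: disable every queue so the scheduler dispatches nothing new.
        ["qmod", "-d", "*"],
        # Step 2: reschedule the jobs running in every queue. They go back to
        # pending and stay there because the queues are disabled.
        ["qmod", "-rq", "*"],
    ]
    # Step 3: ask each node's BMC for a soft (ACPI) power-off.
    for host in bmc_hosts:
        plan.append([
            "ipmitool", "-I", "lanplus", "-H", host,
            "-U", ipmi_user, "-f", password_file,
            "chassis", "power", "soft",
        ])
    return plan


def run(plan, dry_run=True):
    """Print each command, or execute it when dry_run is False."""
    for argv in plan:
        if dry_run:
            print(shlex.join(argv))
        else:
            # Keep going if one node fails; the rest still need shutting down.
            subprocess.run(argv, check=False)


if __name__ == "__main__":
    run(shutdown_plan(BMC_HOSTS), dry_run=True)
```

In practice something like this would be called from whatever hook the UPS vendor's software provides for the "on battery" event.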
The head node would stay up so that if the power outage was for a brief
time, I could log into it and turn on all the other nodes. If the UPS
was running out of battery life, it would shut itself down gracefully.
On power restoration, the queues would remain disabled so that no jobs
would be started until I was confident power was permanently restored,
and there were no issues that I needed to fix before jobs could run.
Enabling the queues manually was trivial, so this worked well.
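As a rough illustration of the manual recovery described above, the sketch below builds the power-on commands and the queue re-enable command separately, so the second step only happens once the admin decides power is stable. The hostnames, IPMI user, and password file path are hypothetical; the post does not say which commands were used.

```python
import shlex


def power_on_commands(bmc_hosts, ipmi_user="admin", password_file="/root/.ipmipass"):
    """Commands to power the compute nodes back on through their BMCs."""
    return [
        ["ipmitool", "-I", "lanplus", "-H", host,
         "-U", ipmi_user, "-f", password_file,
         "chassis", "power", "on"]
        for host in bmc_hosts
    ]


def enable_queues_command():
    """Command to re-enable every SGE queue; run it only when power is trusted."""
    return ["qmod", "-e", "*"]


if __name__ == "__main__":
    # Hypothetical node names, printed for review instead of being executed.
    for argv in power_on_commands(["node01-bmc", "node02-bmc"]):
        print(shlex.join(argv))
    print(shlex.join(enable_queues_command()))
```

Keeping the two steps apart mirrors the point of the procedure: nodes can come back up early, but no job starts until the queues are enabled by hand.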
For diskful cluster nodes of reasonably small size, I think this is a
good strategy. By having the nodes shut down gracefully, you don't have
to wait for your nodes to do fsck on boot up or anything like that which
can happen with an abrupt loss of power. Imagine if I had to get on the
console of every node to press 'y' at some fsck prompt during boot up.
However, as others have pointed out, for larger clusters this just isn't
a practical approach.
Currently, I manage a Blue Gene/P, which is diskless/stateless, so the
Blue Gene itself has no backup of any kind, but the service nodes and
file system are on UPS. My area was hit very hard by Hurricane Sandy
last week, so I learned the hard way what still needs to be configured
for my Blue Gene to shut down gracefully when the UPS runs out of battery.
--
Prentice