[Beowulf] Re:UPS system for Linux cluster
cousins at umit.maine.edu
Thu Apr 30 09:07:11 PDT 2009
David Mathog wrote:
> Do you really need a UPS for the whole cluster?
> In many instances it is good enough to put a UPS on the master node and
> just use surge suppressors on the compute nodes. The up side being that
> only a small and relatively inexpensive UPS is required. The down side
> being of course that any power failure will break ongoing calculations.
> However, if your work can be check pointed a power failure will only
> wipe out the work since the last check point. If your power is
> reasonably reliable, this is a reasonable way to save a lot of money.
> Also, unless you are also buying a generator, a whole cluster UPS will
> only buy you limited up time during a power failure, so you may well
> lose the calculation despite the large UPS.

As you remark, it really depends on how reliable your power is. The vast
majority of our power outages are either just blips or last under a
minute, but even the blips are enough to reboot the computers. Generally
(for us), if power goes out for more than a couple of minutes there is a
good chance it will stay out long enough that the nodes need to be shut
down. Having the whole cluster on UPS power lets us ride through the short
outages, which saves a lot of downtime.
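
To make that rule of thumb concrete, here is a rough sketch of the
decision in Python. It is only an illustration: the function name, the
120 second ride-through window (my reading of "a couple of minutes") and
the 300 second shutdown margin are all assumptions, not values from our
site or from any UPS vendor's software.

```python
# Hypothetical policy sketch; both thresholds below are assumed values.
RIDE_THROUGH_SECONDS = 120     # "a couple of minutes" on battery
SHUTDOWN_MARGIN_SECONDS = 300  # assumed time needed for a clean shutdown


def ups_action(seconds_on_battery, battery_seconds_left):
    """Return 'ride_through' or 'shutdown' for the current outage."""
    # Never let the battery drop below what a clean shutdown needs.
    if battery_seconds_left <= SHUTDOWN_MARGIN_SECONDS:
        return "shutdown"
    # Blips and sub-minute outages: keep running on battery.
    if seconds_on_battery < RIDE_THROUGH_SECONDS:
        return "ride_through"
    # Past a couple of minutes the outage is likely to be a long one.
    return "shutdown"
```

In practice the two inputs would come from whatever monitoring your UPS
provides; how you read them depends on the hardware and is left out here.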

Whether it is one large UPS or many small ones (a la Google) depends on
your setup. One large one is easy to manage but it is expensive (and
noisy! Our Liebert emits a terrible high-frequency noise). The small ones
(one per node) are a lot cheaper, but there is a lot of clutter unless you
can do something similar to Google, and what Google uses aren't commodity
parts.