Quick survey -- UPSs on slave nodes?
Maurice Hilarius
maurice at harddata.com
Mon Feb 10 22:54:56 PST 2003
With regard to your message:
>At the University of Idaho we are preparing to order a new beowulf cluster
>and a vendor seemed to be shocked that we wanted UPSs attached to ALL of
>the nodes. He stated that most people only use UPSs on the Master Node,
>unless the cluster was used for some kind of mission critical purpose.
>So the question, do you use UPSs on your slave nodes? Arguments for not
>using UPSs on slave nodes (and vice versa)?
In general it depends on what your code is designed to do.
If it can tolerate being stopped and started again, then a UPS brings
little benefit to the operation from a computing point of view.
However, some people cannot afford that kind of interruption.
The amount of UPS facility needed to keep any reasonable size of cluster
going for anything more than a few minutes is prohibitively large and
expensive. Beyond that you have to look at substantial backup generators,
to power the cluster AND to power the air conditioning.
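To put rough numbers on that, here is a minimal sketch of the energy a UPS
bank would have to store. The 64-node cluster and the runtimes are
hypothetical, the per-node draw of roughly 167W is only inferred from the
figure later in this message that a 500W unit runs 3 typical nodes, and
inverter losses and air conditioning are ignored, so real requirements
would be higher.

```python
def battery_energy_wh(nodes, watts_per_node, minutes):
    """Energy in watt-hours needed to carry `nodes` machines for `minutes`."""
    return nodes * watts_per_node * minutes / 60.0


# ASSUMPTION: about 167 W per node, inferred from "500W unit, 3 typical nodes".
WATTS_PER_NODE = 500 / 3

# ASSUMPTION: a hypothetical 64-node cluster, not a figure from this message.
NODES = 64

if __name__ == "__main__":
    for minutes in (5, 30, 120):
        wh = battery_energy_wh(NODES, WATTS_PER_NODE, minutes)
        print(f"{minutes:4d} min of runtime: {wh / 1000:.1f} kWh of stored energy")
```

The requirement grows linearly with both node count and runtime, which is
why anything beyond a short shutdown window gets expensive quickly.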
So, given that the above yields an answer of "No" to the uptime question,
we are left with 2 electrical protection roles:
1) The concept that a UPS will protect your equipment from spikes, noise and
so on.
2) Protection from brown-outs (temporary low-voltage conditions).
A UPS that can give real protection MUST fully isolate all lines, hot,
neutral, and most importantly ground. To effectively provide some useful
protection against bad power conditions it must protect against a number of
different conditions, many of which occur simultaneously.
It is expensive to do this. There are essentially 3 major classes of UPS:
Level 3 - Standby or Off-line design:
  Protects from:
    Power failure
    Power sag
    Power spike or surge on hot or neutral lines
  Does NOT protect from:
    Undervoltage (Brownout)
    Overvoltage
    Line Noise
    Frequency variation
    Switching transients
    Harmonic distortion
  Approximate cost for a typical 650VA/400W unit, enough to run 2 or 3
  nodes: $160
Rather useless for protecting computers, I might add. All it really
protects you from is short power failures.
Level 5 - Line Interactive:
  Protects from:
    Power failure
    Power sag
    Power spike or surge on hot or neutral lines
    Undervoltage (Brownout)
    Overvoltage
  Does NOT protect from:
    Line Noise
    Frequency variation
    Switching transients
    Harmonic distortion
  Approximate cost for a typical 750VA/500W unit, enough to run 3 typical
  nodes: $270
Fairly good for this application, but not for absolutely mission critical
equipment.
Level 9 - Online UPS:
  Protects from:
    Power failure
    Power sag
    Power spike or surge on hot or neutral lines
    Undervoltage (Brownout)
    Overvoltage
    Line Noise
    Frequency variation
    Switching transients
    Harmonic distortion
  Approximate cost for a typical 700VA/490W unit, enough to run 3 typical
  nodes: $425
Probably overkill for cluster nodes. Maybe a good idea for a master node
and mass storage/arrays.
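The three lists above can be restated compactly. In these lists the level
number happens to equal how many of the nine power problems a class
handles, taken in the order given, so a small sketch can derive both
columns from one list. The names PROBLEMS, protects() and unprotected()
are made up for this illustration and are not any vendor's terminology.

```python
# The nine power problems, in the order the lists above introduce them.
PROBLEMS = [
    "power failure",
    "power sag",
    "power spike or surge",
    "undervoltage (brownout)",
    "overvoltage",
    "line noise",
    "frequency variation",
    "switching transients",
    "harmonic distortion",
]

# UPS classes as named in this message.
LEVELS = {3: "Standby / Off-line", 5: "Line Interactive", 9: "Online"}


def protects(level):
    """Problems a class of the given level handles (the first `level` items)."""
    return PROBLEMS[:level]


def unprotected(level):
    """Problems a class of the given level leaves uncovered."""
    return PROBLEMS[level:]


if __name__ == "__main__":
    for level, name in LEVELS.items():
        gaps = ", ".join(unprotected(level)) or "nothing on this list"
        print(f"Level {level} ({name}) does NOT protect from: {gaps}")
```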
See:
NO UPS can effectively protect you from really big surges and spikes, such
as a transformer failure in the power distribution, or a local lightning
strike.
There is equipment designed to do that to a great degree, but it is not
generally included in UPS equipment. The best methods of doing that kind of
line spike protection that I have seen usually involve large, centre-tapped
toroidal transformers.
See:
http://www.oneac.com/powercon.html
http://www.powervar.com/english/solutions/prod_spc_na_gg.asp
Few UPSs give any real isolation on the ground line.
As most spikes and noise occur on this line, I question the usefulness
of a UPS as a filtering device in many cases.
Batteries are expensive, and MUST be maintained and tested.
Typically a UPS after 1 year has lost at least 40% of its capacity.
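As a rough illustration of what that loss means for sizing, here is a small
sketch. It assumes runtime scales linearly with remaining battery capacity,
which is a simplification, and it borrows the 5 minute shutdown window
mentioned later in this message. The function names are only illustrative.

```python
def aged_runtime_minutes(new_runtime_min, capacity_lost=0.40):
    """Runtime left after the battery has lost `capacity_lost` of its capacity.

    ASSUMPTION: runtime is proportional to remaining capacity.
    """
    return new_runtime_min * (1.0 - capacity_lost)


def required_new_runtime(target_min, capacity_lost=0.40):
    """Runtime a new unit needs so that `target_min` is still available
    after the capacity loss."""
    return target_min / (1.0 - capacity_lost)


if __name__ == "__main__":
    print(f"5 min when new, after a 40% loss: {aged_runtime_minutes(5):.1f} min")
    print(f"New runtime needed to keep 5 min: {required_new_runtime(5):.1f} min")
```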
Ultimately it all depends on your budget.
I do not consider anything less than level 5 worthwhile.
Assuming you DO want to add level 5 protection, you are looking at roughly
$200+ per node for UPS protection.
As a typical dual processor node goes for "around" $2000 nowadays, this
means an immediate 10% cost overhead. In addition, the UPS equipment needed
to keep a full rack of 1U machines and switches up for 5 minutes of
reliable shutdown time takes about 6U, so you give up about 15% of your
rack space to UPS gear.
If you are more comfortable with UPS protection in place, and can afford
to give up 10% of your hardware budget and 15% of your space for UPS
equipment, then it is not money wasted.
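The arithmetic behind those two percentages can be sketched as follows.
The dollar figures and the 6U come from this message; the 42U rack height
is my assumption of a common full-height rack and is not stated above.

```python
UPS_COST_PER_NODE = 200.0   # "roughly $200+ per node"
NODE_COST = 2000.0          # "around" $2000 per dual processor node
UPS_RACK_UNITS = 6          # about 6U of UPS gear per rack
RACK_UNITS = 42             # ASSUMPTION: common full-height rack, not from the message

# Fraction of the hardware budget that goes to UPS protection.
cost_overhead = UPS_COST_PER_NODE / NODE_COST

# Fraction of the rack occupied by UPS gear.
rack_fraction = UPS_RACK_UNITS / RACK_UNITS

if __name__ == "__main__":
    print(f"Cost overhead: {cost_overhead:.0%}")
    print(f"Rack space used by UPS gear: {rack_fraction:.1%}")
```

With a shorter rack the space fraction is higher, so the "about 15%" figure
should be read as a ballpark.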
In our experience, the presence or absence of a UPS on the clusters we
build and support rarely has a significant effect on hardware failure
rates. If you use decent quality power supplies with good design and
components, and sufficient capacity, they can run through a fairly bad
brownout without issues. Good power supplies are worth the money, and
personally I would recommend you look seriously at ensuring that criterion
is met first.
I am sure lots of people on this list have their own experiences and
beliefs, but in quite a few years of building hardware, as a designer,
manufacturer and supplier of clusters and servers we are much more
concerned with power supply quality than line conditions.
The cases where we feel a UPS is more vital are industrial sites where the
building current is subject to severe use and conditions, and some areas,
especially rural ones, where lightning storms and power failures are a
regular occurrence.
We have built many machines for use in Japan, and there they have to deal
with lower voltage than North America, often below 100V, and a 50 cycle
power grid. That effectively reduces a typical 400W supply to about
300W. By using good power supplies we do not see problems with
that.
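A minimal sketch of that derating as a headroom check. The 0.75 factor is
just the 400W to 300W figure above; applying the same ratio to other supply
sizes, and the example node loads, are my assumptions for illustration.

```python
# Derating ratio taken from the 400W -> 300W example in this message.
LOW_LINE_DERATE = 300 / 400


def usable_watts(rated_watts, derate=LOW_LINE_DERATE):
    """Output a supply can be expected to deliver under low line voltage.

    ASSUMPTION: the same ratio applies to supplies of other ratings.
    """
    return rated_watts * derate


def has_headroom(load_watts, rated_watts, derate=LOW_LINE_DERATE):
    """True if the derated supply still covers the node's load."""
    return load_watts <= usable_watts(rated_watts, derate)


if __name__ == "__main__":
    # Hypothetical node loads against a 400W supply.
    for load in (250, 350):
        status = "OK" if has_headroom(load, 400) else "undersized"
        print(f"{load}W load on a 400W supply at low line: {status}")
```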
In general, power supplies rated for European Power Factor Correction (PFC)
specifications are able to handle much worse conditions without issues.
Sorry to be so wordy, but it is not a simple question to answer.
With our best regards,
Maurice W. Hilarius Telephone: 01-780-456-9771
Hard Data Ltd. FAX: 01-780-456-9772
11060 - 166 Avenue mailto:maurice at harddata.com
Edmonton, AB, Canada http://www.harddata.com/
T5X 1Y3