[Beowulf] RE: Capitalization Rates - How often should you replace a cluster? (resent - 1st sending wasn't posted)

Mark Hahn hahn at mcmaster.ca
Fri Jan 16 17:10:43 PST 2009


> The question was raised as "When should all these servers be upgraded or replaced again?"

3-5 years, IMO.  if you replace hardware in <3 years, you're obviously
burning money.  that's defensible sometimes, but always pretty dubious - 
or else you need an inflated sense of self-importance like, oh, say, the
dearly departed financial industry ;)

when pushing past 5 years, it's not terrible to keep the cluster running,
perhaps by sacrificing a few nodes, but it becomes increasingly unattractive
to _use_.  that is, after 5 years, "current" machines will have noticeably
better clock, flops, cache, memory bandwidth and latency, memory size, disk,
interconnect, power efficiency, etc.  there are still people who will use
a 7-year-old cluster, but they're outliers, one way or another ;)

> But there are other factors - over time the "older systems" are harder to
> maintain.... don't run newer licenses of SW products,

what is this "license" thing of which you speak?  ;)
seriously, the industry isn't changing _that_ fast.  yes, you can probably
find some software which doesn't bother providing ia32 versions, only x86_64,
but the latter has been around for quite a while (6 years?).  the big changes
have been mainly cache, cores, and memory.

> need spare parts, some of which are hard or very hard to find (e.g. old RAM modules - on eBay?!).

ddr2 has been pretty long-lived, and it's still pretty easy to find early
(low-clock) modules.  that already takes you back something like 5 years.

> Sometimes the newer technology uses less power and is cheaper to operate.... (anyone ever create a kW/MFLOP vs. time curve?  has that really gone down?)

unambiguously, W/flop has drastically improved, through both process and
architectural changes.  65W quad-core, 4 flops/cycle is pretty amazing,
considering that only 3 years ago, 95W single-core, 2 flops/cycle was
reasonable.  (dual-core at the time compromised clock fairly seriously.)
3-5 years ago, the main action was clock scaling from about 1.4 -> 3 GHz,
but that was generally more flops for more power, rather than several times
the flops for ~30% _less_ power.
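
to put rough numbers on that (the clock speeds below are illustrative
assumptions, not measurements from any particular part):

    # back-of-envelope flops/W comparison; clocks are assumed, not measured
    def gflops_per_watt(cores, flops_per_cycle, ghz, watts):
        return cores * flops_per_cycle * ghz / watts

    old = gflops_per_watt(cores=1, flops_per_cycle=2, ghz=3.0, watts=95)
    new = gflops_per_watt(cores=4, flops_per_cycle=4, ghz=2.5, watts=65)
    print(old, new, new / old)   # ~0.06 vs ~0.62 Gflops/W: roughly 10x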

> After several years (e.g. 6, 7, or 8) the system admin costs on the older systems may be higher - e.g. more labor, specialized training, unique tools ..

past 5 years is, imo, more of a museum curation effort than really
running a compute facility.

> If we know the required life is a long time, our customer insists on tracking end-of-life points and buying spares to have on hand.  That costs more for longer time spans.

you can keep doing the same thing for 6+ years, and I don't see why the costs
would blow up if you're smart about spares and/or cannibalization.  but you
have to ask yourself: is it worth using this old stuff which is so slow and
inefficient, relative to cheap new stuff?

consider a car analogy - old cars can be pretty neat.  but keeping a 1957
chevy running makes the most sense in, say, Cuba, where labor is cheap,
replacements are hard to get, and where the weather and driving conditions
make you not care so much about side-impact airbags or AWD or 40 MPG at
70 MPH or a built-in gps/handsfree/blu-ray player.

> Do operating costs really go up as a cluster ages?  What other factors are there?

well, I think the main thing is the opportunity cost of a new, faster, more
efficient cluster.  power and admin overhead don't really add up very fast -
maybe 5% of purchase cost per year.  I've heard people quote MUCH higher
numbers than that - no doubt their sysadmins make more than me, and they
might have UPS for compute nodes, etc.
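
as a rough sketch of how that accumulates (the purchase price here is a
made-up placeholder; only the 5%/year rate comes from above):

    # toy model: power + admin overhead at ~5% of purchase cost per year
    purchase = 500_000        # hypothetical cluster price
    overhead_rate = 0.05      # per year, as a fraction of purchase cost
    for year in range(1, 8):
        print(year, purchase * overhead_rate * year)
    # even after 7 years that's only ~35% of the purchase price -
    # small next to the opportunity cost of running 7-year-old hardware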

> For some the upgrade is necessary to tackle a larger or more complex
> problem - but others can just let the system churn a bit longer to get the
> answer.  Is that the real driver?

"a bit longer" is fine - I assume you mean a base2 order of magnitude ;)
the cluster closest to my chair is around .035 Gflops/W.  a modern
replacement would be about .256, or a factor of 7.4x better.  IMO, an 
honest figure of merit would be higher than this, since the new machine
would also give the answer faster.
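
the arithmetic, spelled out (efficiency numbers as above; the speedup in
the comment is a hypothetical illustration):

    old_eff = 0.035    # Gflops/W, the cluster closest to my chair
    new_eff = 0.256    # Gflops/W, a modern replacement
    print(new_eff / old_eff)    # ~7.3x less energy per flop
    # an honest figure of merit is higher still: the new machine also
    # finishes sooner, so a job that (say) runs 5x faster burns 5x fewer
    # wall-clock hours of fixed overhead on top of the per-flop savings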

> I recall that one Beowulf user facility operated both a new and an old
> production cluster, and replaced the "old one" with a "newer" one (the new
> "new one" ) on a regular basis.

my organization has a large variety of clusters, installed since 2001.  in
fact, we're just now decommissioning our original 2001 hardware (compaq
es40s).  we could have kept it going, and it still got some use.  the main
problems with it were that we have 20 other clusters and no excess staff,
and that it's been some time since anyone produced a modern distro for
alphas (which you can think of as a restatement of the staff issue).  in
the end, the fact that it was ~100 4p nodes, each providing ~6.6 Gflops
for ~600W, also factored into the decision.
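
in aggregate (per-node numbers from above):

    # es40 cluster: ~100 nodes, each ~6.6 Gflops at ~600W
    nodes = 100
    gflops = nodes * 6.6     # ~660 Gflops aggregate
    watts = nodes * 600      # ~60 kW
    print(gflops / watts)    # ~0.011 Gflops/W - several times worse than
                             # the 0.035 Gflops/W cluster mentioned above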

> Is that common?  Do most of you as users see the old system just chucked
> out as the new one is brought online?

we overlap our systems - partly because government funding is, ah, "whimsical".


