[Beowulf] Lifespan of a cluster

Andrew M.A. Cater amacater at galactic.demon.co.uk
Sun Apr 27 04:04:03 PDT 2014

On Sun, Apr 27, 2014 at 09:45:41AM +0100, Jörg Saßmannshausen wrote:
> Dear all,
> in some of the discussions here I came across the 'lifespan of a cluster' 
> argument. What I was wondering is: how long is that in HPC for number 
> crunching?
> Is it 3 years (end of warranty), 5 years (making good use of hardware) or 
> longer?

It depends: it depends very much on what you're crunching, for how many people
at what demand profile, how much data you've got ... a whole host of variables.

It also depends on which manufacturer makes your hardware. With an IBM server and
IBM disk shelves you get 3 - 5 years of full hardware support for the server and
likewise for the shelf. Individual disks may be available for ten years, but you may
end up getting factory-refurbished ones for the last few years. That comes at the high
cost of a service contract, but you do get peace of mind - and salesmen pestering you
after year 3 to make that upgrade ...

If you're running a university's set of clusters, then you may find that the 
department head who can shout loudest / has the most money gets this year's sexy
hardware and everyone plays shuffle down to the next set of hardware cast off
by the department above.

Power / cooling / rack shape and density also change: if you're still running a
working cluster or three that are 10 years old, your power efficiency is probably
not great and it is costing you more to run the cluster than the compute cycles are
worth - but the refit costs of the data centre also start to add up.

There is a good argument on ecology / power and cooling costs alone.
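
As a rough illustration of that power argument, here is a minimal sketch of a
payback calculation. Everything in it is an assumption for the sake of the
example: the function names, the PUE figure, the electricity price and the
cluster loads are made up, and it assumes the replacement delivers the same
throughput as the old kit at a lower power draw. It ignores data centre refit
costs, support contracts and staff time.

```python
HOURS_PER_YEAR = 24 * 365  # 8760, ignoring leap years


def annual_power_cost(it_load_kw, pue, price_per_kwh):
    """Yearly electricity cost for a constant IT load.

    pue (power usage effectiveness) scales the IT load up to include
    cooling and other facility overhead.
    """
    return it_load_kw * pue * HOURS_PER_YEAR * price_per_kwh


def payback_years(replacement_cost, old_kw, new_kw, pue, price_per_kwh):
    """Years until the power saving covers the replacement cost.

    Returns None when the new hardware does not save any power.
    """
    saving = (annual_power_cost(old_kw, pue, price_per_kwh)
              - annual_power_cost(new_kw, pue, price_per_kwh))
    if saving <= 0:
        return None
    return replacement_cost / saving


if __name__ == "__main__":
    # Hypothetical figures only: a 40 kW old cluster replaced by 10 kW of
    # new nodes doing the same work, in a room with a PUE of 1.8.
    years = payback_years(replacement_cost=100_000, old_kw=40, new_kw=10,
                          pue=1.8, price_per_kwh=0.12)
    print(f"Payback on power alone: {years:.1f} years")
```

If the payback period comes out shorter than the time you expect to keep the
old cluster running, the power bill alone makes a case for replacement.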

I have lived in a data centre where it was worth it to retrofit screening and
move some racks to create hot/cold aisle separation to decrease overall cooling
costs - but that decision, in itself, cost $$$.

Likewise, HPC interconnects and networking are "better" (for some values of
better) on newer technologies.

Have you asked this question of your friendly rivals at Imperial and Queen Mary? :)

> The reason behind that asking is: I got clusters here which are 10 years old, 
> and quite a number of them, and I would like to get a scheme implemented to 
> get the hardware replaced every X years with X being the 'lifespan of a 
> cluster'. One of the various options which are currently thrown around is to 
> move from my local data-centre (3 rooms, one is purely for the backup/file 
> storage and the other two for HPC) into the College shared data centre (single 
> room). IF we are doing that, I am a bit worried that I get told in 5 years 
> time (for the sake of that argument): your clusters are end of lifetime, you 
> have to get rid of them as we need space / they are consuming too much energy.
> Thus, I am looking to get some answers for: how long are clusters run 
> typically and how is that done in other shared data centres?
> The current funding situation here means it is difficult, if not impossible, to 
> get HPC hardware from funding agencies. Even if you get a bit of money, it is 
> just enough to get a new node. So most clusters are a bit organically grown 
> which makes administration difficult if you want to get really the best out of 
> what you paid for. In an ideal world, I would like to have that replaced every 
> 5 years: old kit out, new kit in. In the real world, I got to run the kit 
> until it falls apart and hope that the Principal Investigator, i.e. the owner 
> of the cluster, got some money to replace the old/broken nodes. Hence the 
> questions so I can build up a good case to change there.

Is central funding an option, from something like the old SERC [Science and
Engineering Research Council], or via joint university projects?

> I hope that makes sense to you.
> All the best from an overcast London!
> Jörg
> -- 
> *************************************************************
> Dr. Jörg Saßmannshausen, MRSC
> University College London
> Department of Chemistry
> Gordon Street
> London
> WC1H 0AJ 
> email: j.sassmannshausen at ucl.ac.uk
> web: http://sassy.formativ.net
> Please avoid sending me Word or PowerPoint attachments.
> See http://www.gnu.org/philosophy/no-word-attachments.html

