[Beowulf] Lifespan of a cluster
j.sassmannshausen at ucl.ac.uk
Sun Apr 27 01:45:41 PDT 2014
In some of the discussions here I came across the 'lifespan of a cluster'
argument. What I was wondering is: how long is that in HPC for number
Is it 3 years (end of warranty), 5 years (making good use of hardware) or
The reason for asking is: I have clusters here which are 10 years old,
and quite a number of them, and I would like to get a scheme implemented to
get the hardware replaced every X years with X being the 'lifespan of a
cluster'. One of the various options which are currently thrown around is to
move from my local data-centre (3 rooms, one is purely for the backup/file
storage and the other two for HPC) into the College shared data centre (single
room). If we do that, I am a bit worried that I will be told in 5 years'
time (for the sake of argument): your clusters are end of life, you
have to get rid of them as we need the space / they are consuming too much energy.
Thus, I am looking for some answers to: how long are clusters typically
run, and how is that handled in other shared data centres?
The current funding situation here means it is difficult, if not impossible, to
get HPC hardware from funding agencies. Even if you get a bit of money, it is
just enough to get a new node. So most clusters have grown a bit organically,
which makes administration difficult if you really want to get the best out of
what you paid for. In an ideal world, I would like to have the kit replaced every
5 years: old kit out, new kit in. In the real world, I have to run the kit
until it falls apart and hope that the Principal Investigator, i.e. the owner
of the cluster, has some money to replace the old/broken nodes. Hence the
questions, so I can build up a good case for changing that.
I hope that makes sense to you.
All the best from an overcast London!
Dr. Jörg Saßmannshausen, MRSC
University College London
Department of Chemistry
email: j.sassmannshausen at ucl.ac.uk
Please avoid sending me Word or PowerPoint attachments.