[Beowulf] recommendations for cluster upgrades

Rahul Nabar rpnabar at gmail.com
Wed May 13 11:57:33 PDT 2009

On Wed, May 13, 2009 at 10:57 AM,  <richard.walsh at comcast.net> wrote:
> if you are planning a socket-only upgrade, the result might be different.

I am not sure what exactly you mean by a "socket-only upgrade" (I'm
still getting up to speed with HPC jargon!).
But these new servers will be in addition to our existing machines. We
definitely want to retain the current crop of AMD Opteron SC1435s
since they are only a year old.

> This is an occasion to break out Excel, and crunch your numbers. In
> addition, I would consider the average processor count of your work
> load. If it typically does not exceed your current system size, running
> two clusters with different architectures as a throughput engine might
> make sense.

That is an interesting thought. In fact it has always been a tough
question for me: whether to add to an existing cluster or build a new
one. I dread heterogeneity for many reasons: jobs spanning slow
and fast nodes are a problem, and maintaining various executables for
different archs is also a pain. On the other hand, making separate
clusters means duplicating the login servers, scheduling daemons,
DHCP, etc. Node allocation also gets suboptimal when one
cluster's queue is full and the other's is not.

But grant money does not come in one huge chunk, so I guess gradual
expansions are just a fact of life. Frequently, by the time I come to an
expansion date, the previous hardware is already end-of-life at the vendor.


