[Beowulf] While the knives are out... Wulf Keepers

Mike Davis
Mon Aug 21 08:19:21 PDT 2006

Greg Lindahl wrote:
> On Thu, Aug 10, 2006 at 10:47:00AM +1000, SIM DOG wrote:
>>I recently visited a large educational institution (that shall remain
>>nameless) that hosts an excellent, world class, science research team.
>>They also have a reasonably large Beowulf environment (over 100 dual nodes).
>>Now maybe it was just the people I was talking too (management) but I
>>get the distinct impression that they treat their 'Wulf as an
>>'appliance'. It came as a great disappointment :/
Why so?
> That cluster didn't cost that much compared to half a person, unless
> the person is a grad student. Which doesn't fit their reliability
> criterion ;-)

For the most part, I think that if a cluster is run correctly, it is an 
appliance for the scientists. Their job is to produce research; mine is 
to manage clusters and SMP machines.

A problem that sometimes crops up is that these days, everyone thinks 
that they can manage a cluster (or a large SMP, for that matter) because 
they have a Linux box or maybe four 1p nodes at their house. Sometimes 
it's a real issue getting these people to understand that managing a 
machine for 1 person and managing it for 5, 50, or 500 are entirely 
different.

For example, on Friday, one of our applications analysts wanted to 
upgrade a piece of software on one of the clusters. He didn't know what 
it would affect (libraries, other installed software, users already 
using that software). After a bit of investigation, it turned out that 
the PI in question could use the version already installed (which is 
about 6 months old).

I guess that I'm rather "old school", but upgrades have to be for a 
reason other than that there's a new version. Maybe they are needed for 
features, or security, or stability. But IMO, they are seldom needed 
just because they are new.


