[Beowulf] Repeated Dell SC1435 crash / hang. How to get the vendor to resolve the issue when 20% of the servers fail in first year?

Prentice Bisbal prentice at ias.edu
Mon Apr 6 10:51:46 PDT 2009

Mark Hahn wrote:
> buying an extended warranty might help.  buying a shrink-wrapped cluster
> might help too.

Not really. My cluster was a "shrink-wrapped" cluster from Dell. It
turns out Dell hired someone from a third party to actually turn on the
cluster (for the first time) and install all the software (nothing more
than a vanilla ROCKS install, without even a queuing system!) *after*
the cluster arrived at our site.

An arrangement like this just muddies the situation even further. If I
have a software problem, do I call Dell, or the third party hired to
install the software?

I think you mean "buy a shrink-wrapped cluster from a well-respected,
cluster-specific vendor that has proven in-house cluster expertise."


