[Beowulf] Repeated Dell SC1435 crash / hang. How to get the vendor to resolve the issue when 20% of the servers fail in first year?
jan.heichler at gmx.net
Mon Apr 6 12:00:36 PDT 2009
Monday, 6 April 2009, you wrote:
PB> Mark Hahn wrote:
>> buying an extended warranty might help. buying a shrink-wrapped cluster
>> might help too.
PB> Not really. My cluster was a "shrink-wrapped" cluster from Dell. Turns
PB> out Dell hired someone from a 3rd-party to actually turn on the cluster
PB> (for the first time) and install all the software (nothing more than a
PB> vanilla ROCKS installed, without even a queuing system!) *after* the
PB> cluster arrived at our site.
What was in the "Statement of Work"? I have learnt that if you don't
specify everything you need or want, some company will submit an offer
without those features, even if one of them is crucial for the solution.
PB> An arrangement like this just muddies the situation even further. If I
PB> had a software problem, do I call cluster, or the 3rd-party hired to
PB> install the software?
What does your "Statement of Work" say about this?
Contractors can be "first point of contact" for the customer. So you
always call them - they tell you when you have to call Dell (or call
Dell for you if it is covered by the contract).
PB> I think you mean "buy a shrink-wrapped cluster from a well-respected,
PB> cluster-specific vendor that has proven in-house cluster expertise"
Right! Go for the specialists. There are some "hardware independent"
companies. They use whatever hardware you like. As a customer (of a
certain size) you can even make the big ones work with a small
specialised company. The big guys just care about the number of
servers they sell... whatever makes that happen is okay...