[Beowulf] Re: Third-party drives not permitted on new Dell servers?
landman at scalableinformatics.com
Tue Feb 16 09:09:45 PST 2010
David Mathog wrote:
> Joe Landman <landman at scalableinformatics.com> wrote
>> So along comes a drive manufacturer, with some nice looking specs on 2TB
>> (and some 1.5 and 1 TB) drives. They look great on paper. We get them
>> into our labs, and play with them, and they seem to run really well.
>> Occasional hiccup on building RAIDs, but you get that in large batches
>> of drives.
>> So now they are out in the field for months, under various loads. Some
>> in our DeltaV's, some in our JackRabbits. The units in the DeltaV's
>> seem to have a ridiculously high failure rate. This is not something we
>> see in the lab. Even with constant stress, horrific sustained workloads
>> ... they don't fail in our testing. But get these same drives out into
>> the users' hands ... and whammo.
>> Slightly different drives in our JackRabbit units, with a variety of
>> RAID controllers. Same types of issues. Timeouts, RAID fall outs, etc.
>> This is not something we see in the lab in our testing. We try
>> emulating their environments, and we can't generate the failures.
>> Worse, we get the drives back after exchanging them at our cost with new
>> replacements, only to find out, upon running diagnostics, that the
>> drives haven't failed according to the test tool. This failing drive
>> vendor refuses to acknowledge firmware bugs, effectively refuses to
>> release patches/fixes.
> While there is no doubting that these drives didn't work reliably in
> your arrays, that doesn't necessarily mean they were "defective". Just
> playing devil's advocate here, but it could be the array controller is
> using some feature where there is a bit of wiggle room in the standard,
> so that both the disk and the controller are "conforming", but they
> still won't work together reliably. In a situation like that I would
> expect the vendor to disclose the issue, so it would be clear why the
> disks had to come from A and not B. As long as the vendor explained the
> problem clearly most customers would be fine buying the preferred disks.
I agree that some devices work well with others. This is what we see.
Some do not. We have a few boxfuls of 1TB drives that don't play well.
And yes, standards do leave wiggle room. Interop testing days are
critical. A connect-a-thon is very helpful.
But the point is, just because it says SATA, you shouldn't expect that
it will work with all SATA controllers. No ... seriously. Likewise
this is true with many other components.
Some stuff doesn't play well with others.
I didn't sanction the language used; I thought it wrong. But from a
support standpoint, it can be (and often is) a nightmare. We take
ownership of as little or as much as our customers want us to.
If your name is on the box, no one appreciates a finger-pointing
exercise rather than a path to a solution.
> It's when the vendor says "you have to use OUR disks" and doesn't tell
> you why, and when, as far as you can tell, these are the same devices
> that you could buy directly from the manufacturer without the 5X markup,
> that things smell bad.
I agree with this paragraph. We won't name specific names in public, but
we do speak about our drive issues in private with our customers.
5X markup? We must be doing something wrong :/
> David Mathog
> mathog at caltech.edu
> Manager, Sequence Analysis Facility, Biology Division, Caltech
Joseph Landman, Ph.D
Founder and CEO
Scalable Informatics, Inc.
email: landman at scalableinformatics.com
web : http://scalableinformatics.com
phone: +1 734 786 8423 x121
fax : +1 866 888 3112
cell : +1 734 612 4615