[Beowulf] RE: Storage - the end of RAID?
Ellis H. Wilson III
ellis at runnersroll.com
Fri Oct 29 11:46:35 PDT 2010
On 10/29/10 14:06, Lux, Jim (337C) wrote:
>> -----Original Message-----
>> From: beowulf-bounces at beowulf.org [mailto:beowulf-bounces at beowulf.org] On Behalf Of Hearns, John
>> Sent: Friday, October 29, 2010 9:43 AM
>> To: beowulf at beowulf.org
>> Subject: [Beowulf] Storage - the end of RAID?
>>
>> Quite a perceptive article on ZDnet
>>
>> http://www.zdnet.com/blog/storage/the-end-of-raid/1154?tag=nl.e539
>>
>> Class, discuss.
>>
>
> Yes, indeed, his comments make sense.
>
> After all, the acronym was "Redundant Arrays of Inexpensive Disks"
>
> Granted, these implementations had useful side effects (e.g. improving read speed by sharing)
>
> The real question is whether drive reliability has improved commensurate with the drive capacity (that is, is the failure rate per drive basically constant, as opposed to the "bit error rate")
>
> RAID was designed to solve the "failed drive" problem, more than the "bad bit" problem. And to do it with less overhead than a "rate 1/2" code: that is, rather than store 2 copies of your data, you could store, essentially, 12/8ths copies of your data (using a Hamming code to generate 4 check bits for each 8 data bits, for instance), thereby saving money.
>
> However, if drives get cheap, then using 2 copies (or 3) isn't a big deal.

Drives (of the commodity variety) are pretty darn cheap already. I'd be
surprised if RAID 1 isn't already the better solution today (compared
with RAID 2-6), rather than only at some point in the future.
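
To put rough numbers on the "copies of your data" comparison quoted
above, here is a minimal sketch. It assumes the textbook
single-error-correcting Hamming bound (smallest r with 2^r >= m + r + 1
for m data bits) and a simple one-parity-disk layout; the function names
are made up for illustration and do not describe any particular RAID
implementation.

```python
def hamming_check_bits(data_bits: int) -> int:
    """Smallest r with 2**r >= data_bits + r + 1 (single-error-correcting Hamming code)."""
    r = 0
    while 2 ** r < data_bits + r + 1:
        r += 1
    return r


def storage_ratio_hamming(data_bits: int) -> float:
    """Stored bits per data bit when Hamming check bits are kept alongside the data."""
    return (data_bits + hamming_check_bits(data_bits)) / data_bits


def storage_ratio_single_parity(data_disks: int) -> float:
    """Stored bits per data bit with one parity disk per group of data disks."""
    return (data_disks + 1) / data_disks


def storage_ratio_mirror(copies: int) -> float:
    """Stored bits per data bit with plain mirroring (2 copies is RAID 1 style)."""
    return float(copies)


if __name__ == "__main__":
    print("Hamming, 8 data bits:", storage_ratio_hamming(8))
    print("Single parity, 4 data disks:", storage_ratio_single_parity(4))
    print("Mirror, 2 copies:", storage_ratio_mirror(2))
```

Under these assumptions, 8 data bits need 4 check bits, i.e. 12/8 = 1.5
copies, against 2.0 for a two-way mirror. The point about cheap drives
is that the gap between those ratios matters less as the cost per byte
falls.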

The major issue I see with the article is that the author declares RAID
"dead" when really he should be saying that RAID 2-6 is less preferable
than RAID 1 (but "dead" does make for a catchier article title). RAID 0
will always be around to soften the bottleneck created by the gap in
performance between CPU and disk. I would actually be surprised if, in
five years, it isn't common in big HPC to have CPU nodes talking to I/O
forwarding nodes with RAID 1 caches of SSDs in them, which in turn talk
to server nodes connected directly to LUNs (which also have RAID,
although I cannot say whether it would be 1/10/01/etc). This setup
lessens the need for tons of expensive RAM at the client or forwarding
nodes, since SSD is closer to CPU speed than disk in terms of read
latency, and it fixes some of the canonical "durability" problems in HPC.

Also, I think he would be hard-pressed to make a case against the
varieties of hybrid RAID that combine 0 and 1. In those setups, on a
failure you are basically performing a straightforward copy - and it can
happen from/to multiple disks at once. There is a slight performance
degradation, but nothing as serious as a parity-based rebuild.
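
The difference between the two kinds of rebuild can be sketched in a few
lines. This is only an illustration, assuming a single XOR parity block
per stripe and tiny in-memory "disks"; the helper names are hypothetical
and real controllers are of course far more involved.

```python
from functools import reduce


def xor_blocks(blocks):
    """Byte-wise XOR of equal-length blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def rebuild_mirror(surviving_copy: bytes) -> bytes:
    # Mirror-style rebuild: read the one surviving copy and write it out again.
    return bytes(surviving_copy)


def rebuild_parity(surviving_blocks) -> bytes:
    # Parity-style rebuild: every surviving block in the stripe (data and
    # parity) must be read and XORed together to recover the lost block.
    return xor_blocks(surviving_blocks)


if __name__ == "__main__":
    data = [b"\x01\x02", b"\x10\x20", b"\xaa\x55"]
    parity = xor_blocks(data)
    lost = data[1]

    # Parity rebuild touches all three surviving blocks.
    assert rebuild_parity([data[0], data[2], parity]) == lost
    # Mirror rebuild touches only the surviving twin.
    assert rebuild_mirror(lost) == lost
    print("both rebuilds recovered the lost block")
```

The mirror rebuild reads one disk per failed disk, while the parity
rebuild reads every other member of the stripe and computes on all of
it, which is where the heavier degradation comes from.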

I personally do not see certain versions of RAID going away anytime soon
- they are just too basic a concept for performance/redundancy to be
killed off.
ellis