[Beowulf] Software Raid and Raid in general

Joel Jaeggli joelja at darkwing.uoregon.edu
Tue Dec 13 15:15:41 PST 2005


On Tue, 13 Dec 2005, Vincent Diepeveen wrote:

> My experience with Linux software RAID is that it is very, very unreliable.
>
> Another big misconception about RAID is that striping is good for latency.
>
> RAID in general is *not* good for latency. The RAID card
> in question is going to handle all your requests sequentially
> anyway.
>
> With the same RAID controller,
> JBOD is usually more reliable than striped RAID-0.
>
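As a quick illustration of why that is: RAID-0 loses the whole array if any
single member dies, while JBOD loses only the failed disk. A minimal sketch
follows; the drive count and the 5% per-period failure probability are
assumed illustrative numbers, not figures from this thread.

# Why a stripe is less reliable than independent disks: RAID-0 loses
# *all* data if any one drive dies; JBOD loses only the dead drive.
def survival(n_drives: int, p_fail: float) -> float:
    """Probability an n-drive RAID-0 survives a period in which each
    drive independently fails with probability p_fail."""
    return (1.0 - p_fail) ** n_drives

print(survival(1, 0.05))  # 0.95  -- a single disk
print(survival(8, 0.05))  # ~0.66 -- an 8-drive stripe: one death kills all
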
> I do understand why, for really important workloads, companies use very
> expensive SCSI disks for their RAID 10 setups.
>
> If money is irrelevant compared to the importance of the (usually database)
> data, then you certainly should do that.

Money is not irrelevant to such subsystems; rather, revenue per 
transaction justifies their existence. There's a continuum between what I 
pay for NAS storage (about $2,000 per TB), what my current employer pays 
(around $10,000 per TB), and what financial services companies that value 
performance and other features over capacity pay (closer to $100,000 per 
TB); that continuum covers quite a broad range of hardware.

> However, the effective bandwidth of nearly 400MB/s that a simple SATA
> RAID5 delivers is far above the bandwidth that most RAID arrays need to
> deliver.
>
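As a rough sanity check on that 400MB/s figure: for large sequential reads,
a RAID5 array streams from all members, with one drive's worth of each
stripe spent on parity. A minimal sketch; the 8-drive count and the ~55MB/s
sustained per-drive rate are assumptions typical of 2005-era SATA drives,
not numbers from this thread.

# Back-of-the-envelope RAID5 streaming-read bandwidth estimate.
def raid5_stream_read_mb_s(n_drives: int, per_drive_mb_s: float) -> float:
    """Sequential reads touch all drives; (n-1)/n of the raw rate is data."""
    return n_drives * per_drive_mb_s * (n_drives - 1) / n_drives

print(raid5_stream_read_mb_s(8, 55.0))  # ~385 MB/s, close to the quoted 400
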
> Now RAID5 is ugly for writing, but it does tolerate the failure of a hard disk.
>
> And disks *will* fail after a while. This is where expensive SCSI disks
> do a better job, but they still fail a lot in RAID10 arrays.
>
> So in general, go for a setup with a good RAID controller card
> (don't save money on that: get at least a card with memory on
> the controller, and don't do all of that in software)
> and fill it up with cheap hard drives.
>
> If RAID5 is enough for you, go for that. Consider the price per megabyte
> of your storage when using something like that.
>
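For concreteness, here is why RAID5 is "ugly for writing", plus the price
per megabyte Vincent alludes to. The four-I/O read-modify-write cycle is
standard RAID5 behaviour for small random writes; the 80 IOPS/drive figure
is an assumption, and the drive price reuses the 93-euro/250GB Barracuda
quoted further down in this mail.

# RAID5 small-write penalty: each small random write costs four disk I/Os
# (read old data, read old parity, write new data, write new parity),
# so random-write IOPS are roughly a quarter of the raw aggregate IOPS.
def raid5_write_iops(n_drives: int, iops_per_drive: float) -> float:
    return n_drives * iops_per_drive / 4.0

# Price per megabyte of an 8-drive RAID5 built from 250GB/93-euro drives;
# one drive's worth of capacity is lost to parity.
drives, size_gb, price_eur = 8, 250, 93
usable_mb = (drives - 1) * size_gb * 1000
print(raid5_write_iops(drives, 80.0))   # ~160 IOPS for the whole array
print(drives * price_eur / usable_mb)   # ~0.0004 euro per megabyte
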
> For example, looking at one shop (probably not the cheapest), I see a
> 3ware Escalade 9500S-8 multi-lane SATA RAID controller, bulk,
> for 613 euro.
>
> That is just one example; there are many good cards from many
> companies.
>
> Want reliability, so that if a disk fails a second disk is there
> to take over, while still having the full RAID-0 write bandwidth?
>
> Take RAID 10 in that case. Use cheap SATA or even IDE disks. Use
> good IDE cables in the case of IDE disks; many cables that you can
> buy for 1 euro are not capable of delivering full IDE speed.
>
> Anyway, for a fraction of what a 2 TB business solution costs,
> you can build your own very reliable RAID10 array.
>
> For companies with a turnover of up to 50 million or so that need a fast
> RAID array, RAID10 is EXCELLENT.
>
> The huge disadvantage of expensive SCSI disks in such arrays is the huge
> price, as each disk's capacity is tiny.
>
> And this while the bandwidth a SATA array delivers is effectively far
> above what is needed.
>
> So in this case, if we build a 2TB array: for RAID10 we need only the
> price of 4TB worth of SATA disks plus the 613 euro card, and most likely
> a huge case for the computer hosting it, with many fans to cool all the
> drives :)
>
> In fact, if I look around the expensive Dutch computer shops:
> a Seagate Barracuda 7200.8 250GB goes for 93 euro.
>
> There are plenty of cheaper alternatives. In the USA it is all probably
> even cheaper, as there is no 19% sales tax there.
>
> Anyway, for a couple of thousand euros or dollars, you have a GREAT and
> reliable RAID array.
>
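Checking that arithmetic with the numbers quoted above (2TB usable needs
4TB raw under RAID10; 250GB drives at 93 euro; the 613-euro 3ware card),
and treating 1TB as 1000GB:

# RAID10 cost check using the figures from this mail.
raw_gb = 4000                  # 4TB raw buys 2TB usable in RAID10
drive_gb, drive_eur = 250, 93  # Seagate Barracuda 7200.8 price quoted above
card_eur = 613                 # 3ware Escalade 9500S-8 price quoted above

n_drives = raw_gb // drive_gb
total_eur = n_drives * drive_eur + card_eur
print(n_drives, total_eur)     # 16 drives, 2101 euro total
print(total_eur / 2000.0)      # ~1.05 euro per usable GB

So "a couple of thousand euros" checks out: roughly 2100 euro for 2TB of
mirrored, striped storage.
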
> Yet I cannot advise using software RAID. That's really ugly: bugs, bugs,
> bugs in Linux there.
>
> So for a cluster needing reasonable reliability, knowing you back up your
> data regularly, you can get huge storage for a small price and still be
> very reliable, for a small fraction of the price of a reliable A-brand
> SCSI array as delivered by Sun or some other high-end company.
>
> I recently saw them offer a 1 terabyte RAID array for 100k euro.
>
> Now that is of course more reliable than the cheap solution I quoted above,
> but not *that* much more reliable.
>
> Yet of course no manager was ever fired for doing business with Sun...
>
> Vincent
>
> At 16:12 13-12-2005 -0500, Bill Rankin wrote:
>>
>> On Dec 12, 2005, at 7:26 PM, Paul wrote:
>>
>>> I read in a post somewhere that it was not possible to use a Linux
>>> software RAID configuration for shared file storage in a cluster. I
>>> know that it is possible to use software RAID on individual compute
>>> nodes, but the post stated that software RAID would not properly
>>> support simultaneous accesses on a file server. Is this true?
>>
>> News to us - we have a fileserver with 2TB of SCSI disks (14 drives)
>> serving out NFS to our 450-node cluster.
>>
>> It's not the fastest solution in the world (see my earlier post
>> regarding NFS performance) but it does work.
>>
>>
>>>
>>> Assuming that hardware RAID is required (or at least preferable), I
>>> was wondering whether the built-in RAID on some motherboards would be
>>> adequate, or whether we need to look into a dedicated piece of hardware.
>>
>> It depends - whose "built-in" hardware RAID?  Be aware - some
>> "hardware RAIDs" are not full implementations, but rather onboard
>> hardware that supports RAID-like operations.  Without the proper
>> drivers (which often do not exist for Linux), your RAID is simply a JBOD.
>>
>>> We will have about 10-12 CPUs initially that will be connected
>>> with a giganet network. We currently have about a terabyte of
>>> storage space and are planning to serve it over NFS from a RAID 5
>>> configuration. Our applications for now will be database-intensive
>>> bioinformatics apps. I would be very interested in any comments.
>>> Thanks
>>
>> Personally, for that load I would initially go with a simple software
>> RAID setup, especially if you are heavy on file reads and light on
>> writes.  Do some analysis of your I/O loads once you have everything
>> up and *then* determine from there if you need to throw additional
>> hardware at the problem.
>>
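A minimal sketch of the kind of analysis Bill suggests, assuming a Linux
fileserver: sample /proc/diskstats twice and report the read/write mix for
one block device. The device name "sda" and the 10-second interval are
placeholders; adjust them for your hardware.

# Sample /proc/diskstats twice and report read/write throughput for a device.
# Per-device fields: major minor name reads reads_merged sectors_read
# ms_reading writes writes_merged sectors_written ms_writing ...
import time

def sectors(dev):
    with open("/proc/diskstats") as f:
        for line in f:
            fields = line.split()
            if fields[2] == dev:
                return int(fields[5]), int(fields[9])  # sectors read, written
    raise ValueError("device %r not found" % dev)

dev, interval = "sda", 10.0
r0, w0 = sectors(dev)
time.sleep(interval)
r1, w1 = sectors(dev)
# diskstats sectors are 512 bytes
print("read  %.1f KB/s" % ((r1 - r0) * 512 / 1024.0 / interval))
print("write %.1f KB/s" % ((w1 - w0) * 512 / 1024.0 / interval))

If reads dominate by a wide margin, Bill's point stands: simple software
RAID will likely do, and you can defer buying a hardware controller until
the numbers say otherwise.
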
>> Good luck,
>>
>> -bill
>>

-- 
--------------------------------------------------------------------------
Joel Jaeggli  	       Unix Consulting 	       joelja at darkwing.uoregon.edu
GPG Key Fingerprint:     5C6E 0104 BAF0 40B0 5BD3 C38B F000 35AB B67F 56B2



