[Beowulf] Software Raid and Raid in general

Gerry Creager gerry.creager at tamu.edu
Tue Dec 13 15:50:27 PST 2005

Vincent Diepeveen wrote:
> My experience with linux software raid is that it is very very unreliable.
> Another big misconception of Raid is that striping is good for latency.
> Raid in general is *not* good for latency. The raid card 
> in question is going to handle all your requests in a 
> sequential manner anyway. 
> With the same raid controller,
> JBOD usually is more reliable than striped raid-0.
> I do understand why for real important work loads companies use very
> expensive SCSI disks for their raid 10 setups. 
> If money is irrelevant to the importance of the (usually database) data,
> then you sure must do that.
> However the effective bandwidth of nearly 400MB/s which a simple s-ata
> RAID5 is delivering is far above the bandwidth that a raid array needs to
> deliver.
> Now raid5 is ugly for writing, but it lets you survive a harddisk failure.
> And they *will* fail after a while. This is where expensive SCSI disks
> do better, but they still fail a lot in raid10 arrays. 
> So in general a setup with a good raid controller card, 
> don't save money on that, get at least a card with memory on 
> the controller, don't do all that in software, 
> and fill it up with cheap harddrives.
> If raid5 is enough for you, go for that. Imagine the price a megabyte
> for your storage when using something like that. 
> For example if i look at a shop, probably not cheapest. I see a
> 3ware Escalade 9500S-8 Multi Lane SATA RAID, bulk 
> for 613 euro. 
> Just one example. There are many good cards from many
> companies.
> Want reliability, so that if a disk fails, that a second disk is there
> to take it over, and still have the full raid-0 write bandwidth?
> Take raid 10 in that case. Use cheapo s-ata or even ide disks. use
> good ide cables in case of ide disks. Many cables which you can 
> buy for 1 euro are not capable of delivering full IDE speed.
> Anyway, for a fraction of the amount of money that a business solution of 2
> TB costs, you can make your own very reliable raid10 array.
> For companies up to a million or 50 turnover who need a fast raid array,
> raid10 is EXCELLENT.
> The huge disadvantage of expensive scsi disks in such arrays is the huge
> price, as each partition is tiny.
> This where effectively the bandwidth a s-ata array delivers is far above
> what is needed.
> So in this case if we make a 2TB array from it. We need for raid10
> a price of just 4TB of s-ata disks and a card of 613 euro, and most likely
> a huge case for the computer hosting it with many fans to cool all the
> drives :)
> In fact if i look in the expensive dutch computershops around: 
> Seagate BARRACUDA 7200.8 250GB for 93 euro.
> There are plenty of cheaper alternatives. In USA it's all probably even
> cheaper as there is no 19% salestax there.
> Anyway, for a couple of thousand euros or dollars, you have a GREAT and
> reliable raid array.
> Yet i cannot advise using software raid. That's real ugly. Bugs bugs 
> bugs in linux there.
> So for a cluster for reasonable reliability, knowing you backup your data
> regularly, you can get huge storage for a small price and still be very
> reliable, for a small part of the price of a reliable a-brand scsi array,
> as delivered by Sun or whatever highend company.
> I saw them recently offer a raid array of 1 terabyte for 100k euro.
> Now that is of course more reliable than the cheapo solution i quoted above,
> but not *that* much more reliable. 
> Yet of course no manager ever was fired for doing business with Sun...
> Vincent
> At 16:12 13-12-2005 -0500, Bill Rankin wrote:
>>On Dec 12, 2005, at 7:26 PM, Paul wrote:
>>>I read in a post somewhere that it was not possible to use a Linux  
>>>software RAID configuration for shared file storage in a cluster. I  
>>>know that it is possible to use software RAID on individual compute  
>>>nodes but the post stated that software RAID would not  
>>>properly support simultaneous accesses on a file server. Is this true?
>>News to us - we have a fileserver with 2TB of SCSI disks (14 drives)  
>>serving out NFS to our 450 node cluster.
>>It's not the fastest solution in the world (see my earlier post  
>>regarding NFS performance) but it does work.
>>>Assuming that hardware RAID is required (or at least preferable) I  
>>>was wondering if the built in RAID on some motherboards would be  
>>>adequate or do we need to look into a dedicated piece of hardware.
>>It depends - whose "built-in" hardware RAID?  Be aware - some  
>>"hardware RAIDS" are not full implementations, but rather onboard  
>>hardware that can support RAID-like operations.  Without the proper  
>>drivers (which often do not exist for Linux) your RAID is simply a JBOD.
>>>We will have about 10 - 12 cpus initially that will be connected  
>>>with giganet network. We currently have about a terabyte of  
>>>storage space and are planning to mount it using NFS in a RAID 5  
>>>configuration. Our applications for now will be database intensive  
>>>bioinformatics apps. I would be very interested in any comments.  
>>Personally, for that load I would initially go with a simple software  
>>RAID setup, especially if you are heavy on file reads and light on  
>>writes.  Do some analysis of your I/O loads once you have everything  
>>up and *then* determine from there if you need to throw additional  
>>hardware at the problem.
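
On the "do some analysis of your I/O loads" point: one rough way to check 
whether a server really is read-heavy is to sample the cumulative sector 
counters in /proc/diskstats twice and compare.  A minimal sketch, assuming 
the standard whole-disk line layout (the function names and the "sda" 
device name are just mine, for illustration):

```python
def sector_counts(diskstats_text, device):
    """Return (sectors_read, sectors_written) for one device.

    Expects text in /proc/diskstats layout: major, minor, name, then
    reads completed, reads merged, sectors read, ms reading,
    writes completed, writes merged, sectors written, ...
    Shorter lines (some kernels use a reduced layout for partitions)
    are skipped.
    """
    for line in diskstats_text.splitlines():
        fields = line.split()
        if len(fields) >= 10 and fields[2] == device:
            return int(fields[5]), int(fields[9])
    raise KeyError(device)


def read_fraction(before, after, device):
    """Fraction of transferred sectors that were reads between two samples."""
    r0, w0 = sector_counts(before, device)
    r1, w1 = sector_counts(after, device)
    reads, writes = r1 - r0, w1 - w0
    total = reads + writes
    return reads / total if total else 0.0


# Two made-up samples for a hypothetical device "sda".
before = "   8       0 sda 100 0 800 0 50 0 200 0 0 0 0\n"
after = "   8       0 sda 300 0 2400 0 100 0 600 0 0 0 0\n"
print(read_fraction(before, after, "sda"))
```

On a live box you would read /proc/diskstats, wait while the real workload 
runs, read it again, and pass both strings in.  The counters are totals 
since boot, which is why two samples are needed.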

We've several RAID 50 systems configured around the 3Ware 9500 SATA 
controllers.  Reasonable performance (significantly better than our earlier 
experience with Promise cards).  We've also worked with the LSILogic 
controllers, and they work as advertised.

We've started experimenting with ATA-over-Ethernet and are seeing good 
initial results, to the point that we just snagged 20TB of RAID 50 in A-o-E 
from CoRAID (with the media coming from other sources).
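
For anyone wanting to redo the disk-count and price arithmetic from 
Vincent's note for their own sizes, here is a rough sketch.  The function 
names and the `group` parameter (disks per RAID 5 set) are mine; it 
assumes equal-sized disks and decimal units (2TB = 2000GB), counts only 
disks plus one controller, and ignores how many ports a single card 
offers.  The prices are the ones quoted above.

```python
import math


def disks_needed(usable_gb, disk_gb, level, group=4):
    """Number of equal-sized disks needed for a given usable capacity.

    level: "raid0", "raid10", "raid5" or "raid50".
    group: disks per RAID 5 set for raid50 (each set gives up one
           disk's worth of space to parity). Illustrative default.
    """
    data_disks = math.ceil(usable_gb / disk_gb)
    if level == "raid0":
        return data_disks                # no redundancy
    if level == "raid10":
        return 2 * data_disks            # every data disk is mirrored
    if level == "raid5":
        return data_disks + 1            # one disk's worth of parity
    if level == "raid50":
        # Stripe across at least two RAID 5 sets of `group` disks each.
        sets = max(2, math.ceil(data_disks / (group - 1)))
        return sets * group
    raise ValueError("unknown RAID level: %s" % level)


def array_cost(usable_gb, disk_gb, disk_price, card_price, level, group=4):
    """Disks plus one controller; case, cooling and spares not included."""
    n = disks_needed(usable_gb, disk_gb, level, group)
    return n * disk_price + card_price


# 2TB usable from 250GB disks at 93 euro each, with a 613 euro card.
for level in ("raid10", "raid5", "raid50"):
    n = disks_needed(2000, 250, level, group=5)
    print(level, n, "disks,", array_cost(2000, 250, 93, 613, level, group=5), "euro")
```

With those figures RAID 10 comes out at 16 disks, which is the "4TB of 
s-ata disks" in the quoted text, while RAID 5 and RAID 50 need 9 and 10.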


Gerry Creager -- gerry.creager at tamu.edu
Texas Mesonet -- AATLT, Texas A&M University	
Cell: 979.229.5301 Office: 979.458.4020  FAX 979.862.3983
Physical: 1700 Research Parkway, Suite 160,
College Station, TX 77843-3139
