[Beowulf] Software RAID?

Ekechi Nwokah ekechi at alexa.com
Mon Nov 26 17:42:49 PST 2007

Reposting with (hopefully) more readable formatting.


Sorry I was not quite clear. See below. 

> -----Original Message-----
> From: Bill Broadley [mailto:bill at cse.ucdavis.edu]
> Sent: Wednesday, November 21, 2007 4:35 PM
> To: Ekechi Nwokah
> Cc: Beowulf Mailing List
> Subject: Re: [Beowulf] Software RAID?
> Ekechi Nwokah wrote:
> > Hi,
> > 
> > Does anyone know of any software RAID solutions that come 
> close to the
> Yes, better even.
> > performance of a commodity RAID card such as LSI/3ware/Areca for 
> > direct-attached drives?
> Yes.  Of course there's a factor of 10 or more in the various 
> cards offered by those vendors.  Adaptec is another popular brand.
> > With the availability multi-core chips and SSE instruction sets, it 
> > would seem to me that this is doable. Would be nice to not 
> have to pay
> > for those RAID cards if I don't have to. Just wondering if anything 
> > already exists.
> Linux has software RAID built in.
> Of course there are a zillion things you didn't mention.  How 
> many drives did you want to use?  What kind? (SAS? SATA?)  If 
> you want 16 drives often you get hardware RAID hardware even 
> if you don't use it.
> What config did you want? 
> Raid-0? 1? 5? 6? Filesystem?

So let's say it's 16, but in theory it could be as high as 192: use
multiple JBOD cards that present the drives individually (as separate
LUNs, for lack of a better term), and use software RAID to do all the
things that a 3ware/Areca, etc. card would do across the total span of
drives: RAID 0/1/5/6, hot swap, SAS/SATA capability, and so on.
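For the 16-drive case the md setup is basically a one-liner; a sketch,
assuming the JBOD cards expose the drives as /dev/sdb through /dev/sdq
(device names and chunk size are hypothetical examples):

```shell
# Build a 16-drive RAID-6 array out of the JBOD-presented disks.
mdadm --create /dev/md0 --level=6 --raid-devices=16 --chunk=256 /dev/sd[b-q]

# Watch the initial resync and per-array state.
cat /proc/mdstat
```

md also handles fail/remove/add cycles and hot spares (mdadm --fail,
--remove, --add), which covers a good chunk of the hotswap feature set
mentioned above.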

> Oh, and how do you measure performance?  Bandwidth?  Seeks?
> Transactions?
> Transaction size?  Mostly read? write?

All of the above. We would be maxing out per-drive performance, say
70 MB/s reads at 100 IOPS on SATA, or 120 MB/s reads at 300 IOPS on
SAS, using 4k transaction sizes, and hopefully eliminating any queueing
bottlenecks on the hardware RAID card.

Assume that we are using RDMA as the network transfer protocol, so there
are no network interrupts on the CPUs being used to do the XORs, etc.
Right now, all the hardware cards start to drop precipitously in
performance under concurrent access, particularly read/write mixes.
Areca is the best of the bunch, but that's not saying much compared to
Tier 1 storage ASICs/FPGAs.
The idea here is twofold: eliminate the cost of the hardware RAID card,
and handle concurrent accesses better. My theory is that 8 cores would
handle concurrent array access much better than the chipsets on the
hardware cards, and that if you did the parity calculations, CRC, etc.
using the SSE instruction set you could achieve a high degree of
parallelism and performance.

I just haven't seen anything like that, and I was not aware that md
could achieve anything close to the performance of a hardware RAID card
across a reasonable number of drives (12+), let alone provide the same
feature set.

-- Ekechi
