[Beowulf] Doing i/o at a small cluster

Vincent Diepeveen diep at xs4all.nl
Fri Aug 17 05:03:39 PDT 2012


Which free or very cheap distributed file systems could I use on an
8-node cluster with QDR InfiniBand (Mellanox)?
Each node could hold a few hard drives, up to 8 or so SATA2. I could
also use some RAID cards.

I'm still working out what I need.

I'm looking into generating the 7-man EGTBs (endgame tablebases) on the
cluster. This is a big challenge.
Generating them puts a high I/O load on the system. I'm looking at around
4 GB/s of I/O, of which a bit more than 1 GB/s is writes and a bit less
than 3 GB/s is reads from the hard drives.

This would run nonstop for 3+ months, provided the CPUs can keep up with
that. Otherwise a few months more.

The 4 GB/s is the aggregate rate across the whole cluster.
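
To put the aggregate figure in per-node terms, here is a rough sketch of the arithmetic. It assumes the load splits evenly over the 8 nodes, rounds the read and write rates to 3 GB/s and 1 GB/s, and takes "3+ months" as 90 days; those roundings and the function names are mine, not measured values.

```python
# Back-of-envelope split of the aggregate I/O rate over the cluster.
# Assumptions (not measurements): even split over nodes, decimal units
# (1 GB = 1000 MB), rates rounded from the figures in the post.
NODES = 8
AGG_READ_GBS = 3.0    # "a bit less than 3 GB/s", rounded
AGG_WRITE_GBS = 1.0   # "a bit more than 1 GB/s", rounded


def per_node_mbs(aggregate_gbs, nodes=NODES):
    """Share of an aggregate rate (GB/s) that one node must sustain, in MB/s."""
    return aggregate_gbs * 1000.0 / nodes


def total_volume_pb(aggregate_gbs, days):
    """Total data moved at a constant rate over the given days, in PB."""
    seconds = days * 86400
    return aggregate_gbs * seconds / 1e6  # GB -> PB


if __name__ == "__main__":
    read_mbs = per_node_mbs(AGG_READ_GBS)
    write_mbs = per_node_mbs(AGG_WRITE_GBS)
    print(f"per node: {read_mbs:.0f} MB/s read, {write_mbs:.0f} MB/s write")
    print(f"total over 90 days: {total_volume_pb(4.0, 90):.1f} PB moved")
```

So under these assumptions each node has to sustain roughly 500 MB/s of combined disk traffic for the whole run.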

What RAID setup would you recommend here?

One problem is the combined write and read speed I need. As I understand
it, a SATA2 drive delivers roughly 133 MB/s at the outer edge of the
platters, dropping to about 33 MB/s at the inner tracks.

Is that roughly correct?

Of course there are many possible solutions. I could use some RAID cards,
or I could simply equip each node with some drives.
RAID cards are probably SATA-3 nowadays; I haven't checked the speeds there.

Total storage is somewhere between a dozen and a few dozen terabytes.

Does the filesystem automatically prefer writing at the outer edge
rather than starting at the inner tracks?
Which RAID level would you recommend for this, if any is appropriate
at all :)

How many hard drives would I need? What failure rate can I expect with
modern SATA drives under this load?
I had several drives fail in a RAID 0+1 system when generating some
EGTBs some years ago.
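
As a first guess at the drive count, here is a small sketch. It assumes purely sequential access, the 133 MB/s and 33 MB/s per-drive figures I quoted above (which I have not verified), and about 500 MB/s per node (4 GB/s spread evenly over 8 nodes). It ignores seeks, RAID overhead and mixed read/write traffic, so it is a lower bound at best.

```python
import math

# Assumed figures, taken from the estimates in this post, not benchmarks.
PER_NODE_MBS = 500.0    # 4 GB/s aggregate / 8 nodes, even split assumed
OUTER_EDGE_MBS = 133.0  # assumed best-case sequential rate per drive
INNER_EDGE_MBS = 33.0   # assumed worst-case sequential rate per drive


def drives_needed(target_mbs, per_drive_mbs):
    """Minimum number of drives whose summed sequential rate meets the target."""
    return math.ceil(target_mbs / per_drive_mbs)


if __name__ == "__main__":
    best = drives_needed(PER_NODE_MBS, OUTER_EDGE_MBS)
    worst = drives_needed(PER_NODE_MBS, INNER_EDGE_MBS)
    print(f"drives per node: {best} (outer edge) to {worst} (inner tracks)")
```

If those per-drive figures hold, the inner-track case needs more than the 8 or so drives a node can take, which is why the question about where the filesystem places data matters.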

Thanks in advance for tips/hints and suggestions!

Note that there are more questions, such as which buffer size I should
use for reads and writes. Most files get streamed.
For 2 of the files I read from, though, I read big blocks from a random
spot in the file. Each of those files is
a couple of hundred gigabytes.

I used to grab chunks of 64 KB from each file, but I don't see how to
reach gigabytes per second of I/O on
today's hardware that way.

I am now considering reading blocks of 10 MB. Which size will get me
to the maximum bandwidth the I/O
can deliver?
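
To illustrate why the block size matters for the random reads, here is a simple model: every random block costs one positioning delay plus the transfer time at the streaming rate. The 12 ms positioning time and the 100 MB/s streaming rate are assumptions for illustration only, not measurements of any particular drive, and the model ignores caching, queueing and RAID striping.

```python
# Simple per-drive model of random block reads:
#   time per block = positioning time + block size / streaming rate
# Both constants below are illustrative assumptions.
POSITIONING_S = 0.012   # assumed seek + rotational latency per random read
STREAM_BPS = 100e6      # assumed sustained streaming rate, bytes per second


def effective_mbs(block_bytes, positioning_s=POSITIONING_S, stream_bps=STREAM_BPS):
    """Effective throughput in MB/s when every block needs one reposition."""
    seconds_per_block = positioning_s + block_bytes / stream_bps
    return block_bytes / seconds_per_block / 1e6


if __name__ == "__main__":
    KIB = 1024
    MIB = 1024 * KIB
    for label, size in [("64 KB", 64 * KIB), ("1 MB", MIB),
                        ("10 MB", 10 * MIB), ("100 MB", 100 * MIB)]:
        print(f"{label:>7}: {effective_mbs(size):6.1f} MB/s per drive")
```

With these assumed numbers, 64 KB reads deliver only a few MB/s per drive, while 10 MB blocks get close to the streaming rate, and going much bigger than that gains little. The real answer still needs a benchmark on the actual hardware.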

Note that each node is busy with a bunch of files in this manner.

So if a big distributed file system doesn't work out well, I can try
JBOD on each node.

Kind Regards,
