[Beowulf] RAID for home beowulf

Tomislav Maric tomislav.maric at gmx.com
Sun Oct 4 04:08:14 PDT 2009


Mark Hahn wrote:
>> So, maybe the bold question to ask would be: what would be the best RAID
>> config for 3 HDDs and a max 6 node HPC cluster? Should I just use RAID 1
> 
> do you mean for each node?

No, the nodes are diskless. I plan to scale the cluster, and 1 TB of
storage is quite enough, even if I use 6 nodes, or 2x6 nodes. That's
what I know from my limited experience running CFD codes on a 96-core
cluster. That's the reason for thinking about RAID in the first
place: to create stable, well-performing centralized storage for the
future number of nodes (i.e. 12 nodes with 4 cores and 16 GB of RAM
each).

> 
>> for the system partitions on one disk,  and RAID 0 for the simulation
>> data placed on the same partitions on other two disks: after
>> post-processing, the data is gone anyway... and with a good backup
>> strategy, I don't have to worry about RAID0 not recovering from a disk
>> failure...
> 
> you're going to back up a raid0?

From your question, I sense it's a bad idea... :) I have no clue; this
is the first time I'm doing this.

> in any case, I think you should consider net-booting and using the node
> disks as a 3x raid0.  if the local files are really transient, then 
> your startup script can just reinitialize the local disks every boot.
> (which would leave you with a working node even after a disk failure or two!)
> that's assuming you need or can benefit from the capacity or bandwidth.
> 

OK, I want net boot because the nodes are diskless. The remaining
question is how to use 3 HDDs with RAID to get a performance boost where
I need it (e.g. /home, where the data is written) and high availability
for /, in case of a disk failure. Is this the right way of thinking?
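
To make the trade-off concrete, here is a minimal sketch of usable
capacity and tolerated disk failures for the standard RAID levels on
3 equal disks. The 500 GB disk size and the function name raid_summary
are illustrative assumptions, not values from this thread, and real
arrays lose a little more space to metadata.

```python
def raid_summary(level, n_disks, disk_gb):
    """Return (usable_gb, failures_tolerated) for n equal-sized disks."""
    if level == "raid0":
        # Striping: all capacity is usable, but any single failure loses the array.
        if n_disks < 2:
            raise ValueError("RAID 0 needs at least 2 disks")
        return n_disks * disk_gb, 0
    if level == "raid1":
        # Mirroring: every disk holds a full copy, so capacity is one disk.
        if n_disks < 2:
            raise ValueError("RAID 1 needs at least 2 disks")
        return disk_gb, n_disks - 1
    if level == "raid5":
        # Striping with distributed parity: one disk's worth of space goes to parity.
        if n_disks < 3:
            raise ValueError("RAID 5 needs at least 3 disks")
        return (n_disks - 1) * disk_gb, 1
    raise ValueError(f"unknown RAID level: {level}")


if __name__ == "__main__":
    DISK_GB = 500  # assumed disk size, for illustration only
    for level in ("raid0", "raid1", "raid5"):
        usable, tolerated = raid_summary(level, 3, DISK_GB)
        print(f"{level}: {usable} GB usable, survives {tolerated} disk failure(s)")
```

Under these assumptions, only RAID 5 gives both more than one disk of
capacity and protection against a single disk failure on 3 disks.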

>>> or better yet, don't bother booting off the local disk.  simply make your
>>> head/admin/master server reliable and net-boot.  it's likely that nodes
>>> won't be functional without the master server anyway, and net-booting
>>> doesn't mean you can't use the local disk for swap/scratch/...
>> Well, I want to configure the net boot for all diskless nodes and use
>> the master node and its RAID for performance gains with writing CFD
>> simulation data against network communication and to be able to scale
>> more easily.
> 
> I'm not sure I parse that.  net booting is orthogonal to whether or not
> you store data locally or over the net.  but yes, gigabit is somewhat 
> slower than a single modern disk, so local IO will win.
> 

Thanks.
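
For reference, the gigabit figure quoted above can be worked out
roughly as follows. This is a back-of-the-envelope sketch: it uses the
raw 1 Gbit/s line rate and ignores protocol overhead, assumes a single
server link shared evenly by all nodes, and the function names and the
10 GB result size are made up for illustration.

```python
def link_rate_mb_per_s(line_rate_bps=1e9):
    """Raw line rate in MB/s (1 MB = 10**6 bytes), ignoring protocol overhead."""
    return line_rate_bps / 8 / 1e6


def per_node_share(link_mb_per_s, n_nodes):
    """Upper bound per node when all nodes write through one server link at once."""
    return link_mb_per_s / n_nodes


def transfer_seconds(size_gb, mb_per_s):
    """Time to move size_gb (1 GB = 1000 MB) at a sustained rate of mb_per_s."""
    return size_gb * 1000 / mb_per_s


if __name__ == "__main__":
    gige = link_rate_mb_per_s()  # 125.0 MB/s at best
    print(f"gigabit line rate: {gige:.1f} MB/s")
    for nodes in (6, 12):
        share = per_node_share(gige, nodes)
        print(f"{nodes} nodes writing at once: at most {share:.1f} MB/s each")
    # Hypothetical 10 GB result set written by a single node over the link.
    print(f"10 GB over the link: at least {transfer_seconds(10, gige):.0f} s")
```

With diskless nodes, all simulation output crosses this link, so the
network rather than the RAID level may end up being the limit.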



