[Beowulf] RAID for home beowulf
Tomislav Maric
tomislav.maric at gmx.com
Sun Oct 4 04:08:14 PDT 2009
Mark Hahn wrote:
>> So, maybe the bold question to ask would be: what would be the best RAID
>> config for 3 HDDs and a max 6 node HPC cluster? Should I just use RAID 1
>
> do you mean for each node?
No, the nodes are diskless. I plan to scale the cluster, and 1 TB of
storage is quite enough, whether I use 6 nodes or 2x6 nodes. That is
what I know from my limited experience running CFD codes on a 96-core
cluster. That is the reason for thinking about RAID in the first
place: to create stable, well-performing centralized storage for the
future number of nodes (i.e. 12 nodes with 4 cores and 16 GB of RAM
each).
>
>> for the system partitions on one disk, and RAID 0 for the simulation
>> data placed on the same partitions on other two disks: after
>> post-processing, the data is gone anyway... and with a good backup
>> strategy, I don't have to worry about RAID0 not recovering from a disk
>> failure...
>
> you're going to back up a raid0?
From your question, I sense it's a bad idea... :) I have no clue; this
is the first time I'm doing this.
> in any case, I think you should consider net-booting and using the node
> disks as a 3x raid0. if the local files are really transient, then
> your startup script can just reinitialize the local disks every boot.
> (which would leave you with a working node even after a disk failure or two!)
> that's assuming you need or can benefit from the capacity or bandwidth.
>
OK, I want net boot because the nodes are diskless. The remaining
question is how to use 3 HDDs with RAID to get a performance boost where
I need it (e.g. /home, where the data is written) and HA for the /
directory in case of a disk failure. Is this the right way of thinking?
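
As a rough way to compare the options for three disks, the sketch below
computes the usable capacity and the number of disk failures that an
idealized array survives. The function name raid_summary and the 500 GB
disk size are assumptions made for illustration only; the disk size is
not stated in this thread, and real arrays lose a little capacity to
metadata.

```python
def raid_summary(level, n_disks, disk_gb):
    """Return (usable_gb, tolerated_failures) for an idealized array.

    Hypothetical helper: it ignores metadata overhead and assumes all
    disks have the same size.
    """
    if n_disks < 1:
        raise ValueError("need at least one disk")
    if level == 0:
        # Striping: all capacity is usable, any single failure loses the array.
        return n_disks * disk_gb, 0
    if level == 1:
        # Mirroring: one disk of capacity, survives the loss of all but one disk.
        return disk_gb, n_disks - 1
    if level == 5:
        if n_disks < 3:
            raise ValueError("RAID 5 needs at least 3 disks")
        # Striping with distributed parity: one disk of capacity goes to parity.
        return (n_disks - 1) * disk_gb, 1
    raise ValueError(f"unsupported RAID level: {level}")


if __name__ == "__main__":
    DISK_GB = 500  # assumed disk size, not taken from the thread
    for level in (0, 1, 5):
        usable, failures = raid_summary(level, 3, DISK_GB)
        print(f"RAID {level}: {usable} GB usable, survives {failures} failure(s)")
```

With three equal disks, RAID 5 is the only one of these levels that
combines more than one disk of capacity with tolerance of a single disk
failure, at the cost of slower writes than RAID 0.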
>>> or better yet, don't bother booting off the local disk. simply make your
>>> head/admin/master server reliable and net-boot. it's likely that nodes
>>> won't be functional without the master server anyway, and net-booting
>>> doesn't mean you can't use the local disk for swap/scratch/...
>> Well, I want to configure the net boot for all diskless nodes and use
>> the master node and its RAID for performance gains with writing CFD
>> simulation data against network communication and to be able to scale
>> more easily.
>
> I'm not sure I parse that. net booting is orthogonal to whether or not
> you store data locally or over the net. but yes, gigabit is somewhat
> slower than a single modern disk, so local IO will win.
>
Thanks.
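
To put the gigabit remark in numbers, here is a minimal sketch of the
arithmetic. The 125 MB/s figure is the raw line rate of gigabit Ethernet
(real payload throughput is lower because of protocol overhead). The
function names and the 10 GB dataset size are assumptions for
illustration; disk throughput is left as a parameter because no figure
is given in this thread.

```python
# Raw gigabit Ethernet line rate: 1e9 bits/s -> bytes/s -> MB/s (decimal MB).
GIGABIT_LINE_RATE_MB_S = 1e9 / 8 / 1e6


def transfer_seconds(size_gb, throughput_mb_s):
    """Time to move size_gb (decimal GB) at a sustained throughput in MB/s."""
    if throughput_mb_s <= 0:
        raise ValueError("throughput must be positive")
    return size_gb * 1000 / throughput_mb_s


def per_node_share(link_mb_s, n_nodes):
    """Even share of one server link when n_nodes write at the same time."""
    if n_nodes < 1:
        raise ValueError("need at least one node")
    return link_mb_s / n_nodes


if __name__ == "__main__":
    size_gb = 10  # assumed size of one simulation dump
    print("seconds over an idle gigabit link:",
          transfer_seconds(size_gb, GIGABIT_LINE_RATE_MB_S))
    print("MB/s per node with 6 nodes writing at once:",
          per_node_share(GIGABIT_LINE_RATE_MB_S, 6))
```

If all diskless nodes write to the master over a single gigabit link,
that link, and not the RAID level on the master, is likely to be the
first limit on write throughput.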