[Beowulf] cluster storage design

Joe Landman landman at scalableinformatics.com
Wed Mar 23 12:36:41 PST 2005


Hi Brian:

Brian Henerey wrote:

> Hello all,
>
> I have a 32 node cluster with 1 master and 1 data storage server with 
> 1.5 TB’s of storage. The master used to have storage:/home mounted on 
> /home via NFS. I moved the 1.5TB RAID array of storage so it was 
> directly on the master. This decreased the time it took for our 
> program to run by a factor of 4. I read somewhere that mounting the 
> data to the master via NFS was a bad idea for performance, but am not 
> sure what the best alternative is. I don’t want to have to move data 
> on/off the master each time I run a job because this will slow it down 
> as more people are using it.
>

If your problems are I/O bound, you have enough local storage on 
each compute node, and you can move the data in a reasonable amount of 
time, then local I/O will likely be the fastest solution. You have 
already discovered this when you moved to a locally attached RAID. If you 
have multiple parallel reads/writes to the data from each compute node, 
you will want some sort of distributed file system. If the master thread 
is the only one doing I/O, then you want the fast storage where it is. 
NFS effectively provides a single point of data flow, and hence is 
(generally) a limiting factor.
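If local I/O wins for your workload, a simple stage-in/stage-out wrapper around each run keeps the hot data off NFS during the computation. A minimal sketch (all paths are hypothetical; the defaults live under /tmp only so the script can be tried as a demo):

```shell
#!/bin/sh
# Stage-in/stage-out sketch. Paths are assumptions for illustration;
# in practice NFS_DATA would be the shared area and SCRATCH a
# node-local disk.
NFS_DATA=${NFS_DATA:-/tmp/shared_demo.$$}    # shared (slow) dataset area
SCRATCH=${SCRATCH:-/tmp/scratch_demo.$$}     # node-local (fast) scratch
mkdir -p "$NFS_DATA" "$SCRATCH"

cp -r "$NFS_DATA"/. "$SCRATCH"/              # stage the input in

# --- run the real job here, reading and writing under $SCRATCH ---
echo "job finished" > "$SCRATCH/run.log"     # placeholder for real output

cp "$SCRATCH/run.log" "$NFS_DATA"/           # stage the results back out
rm -rf "$SCRATCH"                            # free the local disk
```

The copy cost is paid once per run instead of once per read, which is usually the right trade when the job rereads its data.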

I presume most of your users start their runs on the head node itself 
(based on your description). If this is the case and is the method you 
want users to continue to use, then you shouldn't necessarily change 
it. If the file system is not broken, you don't need to fix it. If you 
want to increase the bandwidth to the file server, you might look at 
channel bonding several Ethernet interfaces together. This works quite 
nicely in a number of cases, and it is cheap/easy to do. I might 
recommend this if you determine you need it....
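For reference, bringing up a bonded pair under the Linux bonding driver looks roughly like the following (interface names, the address, and the balance-rr mode are all assumptions for illustration; distributions differ on where this configuration lives):

```shell
# Hypothetical two-port bond on the file server (requires root).
modprobe bonding mode=balance-rr miimon=100   # load driver, round-robin mode
ifconfig bond0 192.168.1.10 netmask 255.255.255.0 up
ifenslave bond0 eth0 eth1                     # enslave both GigE ports
```

You get close to 2x the bandwidth for streaming transfers, though a single TCP stream may not see the full benefit depending on the mode and switch.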

... and that is the critical aspect. Are your runs slowing down as more 
users start running on the system? Can you identify the culprit (NFS, 
disk I/O, buffer cache, memory pressure, network traffic, ...)? Basically 
the question is: do you need to make any changes, and if so, which 
changes make the most sense? To answer that, you need to watch how your 
system is being used and identify the hotspots, as well as talk with 
users about their needs. More often than not, simple things (kernel 
tuning/tweaking, adding more memory, etc.) can go a really long way 
toward fixing problems.
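A first pass at spotting those hotspots needs nothing beyond /proc; iostat and nfsstat (from the sysstat and nfs-utils packages) give finer-grained disk and NFS-server numbers if you need them:

```shell
# Quick hotspot check using only /proc (assumes a Linux node):
grep -E 'MemFree|SwapFree' /proc/meminfo   # memory pressure / swap headroom
cat /proc/loadavg                          # load average and run-queue size
head -4 /proc/net/dev                      # per-interface traffic counters
```

Sample these while users are actually running; a quiet-system snapshot tells you very little.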

Joe

> I know there are probably many solutions but I’m curious what the 
> people on this list do. It seems to me that SAN’s are very expensive 
> compared to just building servers with 4 x 500GB hard drives. I’ve 
> considered just launching my lam-mpi jobs from whatever storage server 
> has the appropriate data on it, but this doesn’t seem ideal.
>
> How does performance compare from having the data local on the master 
> via running it off a PVFS?
>
> Thanks in advance,
>
> Brian Henerey
>
> Systems Analyst
>
> Cardiovascular Magnetic Resonance Laboratories
>
> Washington University Medical School
>
> 660 S. Euclid Ave
>
> Campus Box 8086
>
> St. Louis, MO 63110
>
> 314-454-8368 314-454-7490(Fax)
>
>------------------------------------------------------------------------
>
>_______________________________________________
>Beowulf mailing list, Beowulf at beowulf.org
>To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
>  
>
