[Beowulf] NFS shared file system

Ted Sariyski tsariysk at craft-tech.com
Mon Dec 5 08:37:05 PST 2005


Hi Mark,

Thanks for your comments. We combined the clusters into a single one in
anticipation of jobs running on more than ~60 CPUs, and in the hope of
getting better utilization of the existing resources. While in the 3x30
state, the typical CPU load was ~1-5% and the CFD I/O load was ~700-1000
req/s. It required a lot of NFS tuning, but once tuned it worked fine.
Now I am trying to find the cheapest solution for a 90-node cluster. One
solution in the 20k-30k range we have been offered is a 3 TB Fibre
Channel nSTORE SAN with Montilio's RapidFile I/O engine. Does anybody
have experience with it? Lustre, GPFS, HP's SFS, Panasas, etc. will be
our next long-term step.

Thanks, Ted



Mark Hahn wrote:

>>Each cluster had its own head node and its own cheap, in-house built
>>RAID exported over GB NFS. Recently we combined the existing clusters
>
>when it was in the 3x30 state, did you do any measurements of the raid's
>internal performance, and performance when under "normal" load by the nodes?
>also, have you characterized the IO load of your CFD application?
>
>>into one, and the first problem we have is with the mass storage:
>>occasionally it cannot handle the IO load. My question is: if I buy a
>>commercial NAS, what are the chances that after that I'll need to replace
>>GbE with Myrinet (e.g.)? 
>
>well, the better question is why you got rid of two of the IO nodes - 
>or did you?
>
>>clusters, but from what I read in this newsgroup my understanding is
>>that 90 nodes is a small cluster, and I didn't expect scalability
>>problems at this level.
>
>the traffic here is somewhat specialized, of course - people doing 16-node
>clusters are not having any problems, and so don't speak up ;)
>
>90 nodes is clearly enough to show real scaling problems if the load is 
>reasonably intensive and comes from multiple nodes simultaneously.  is it 
>safe to assume you've done the basic first steps in tuning (lots of nfsd
>threads, perhaps also higher attribute-cache parameters on the client side,
>and probably not using the default 32K rsize/wsize?)
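For readers unfamiliar with those knobs, a sketch of the basic tuning steps
named above — the thread count, export, hostname, and timeout values are
illustrative, not taken from the original posts:

```shell
# Server side: raise the number of nfsd threads (distribution defaults
# were often as low as 8). On Red Hat-style systems this is
# RPCNFSDCOUNT in /etc/sysconfig/nfs; it can also be set on the fly:
echo 128 > /proc/fs/nfsd/threads

# Client side: 32K transfer sizes and a longer attribute-cache timeout
# (actimeo) cut per-request overhead and repeated GETATTR traffic.
# (server name, export path, and values are hypothetical)
mount -t nfs -o rsize=32768,wsize=32768,actimeo=60 \
    fileserver:/export/scratch /scratch
```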
>
>>If a commercial storage system optimized for IO is a
>>solution, what is the price I'm facing? Any recommendations?
>
>depends on what your IO goals are.  do you insist on a single filesystem
>implemented across multiple server nodes?  if so, you have to look into 
>cluster-fs things like Lustre, GPFS, HP's SFS, Panasas, etc.  the overhead
>(dollars and brains) is nontrivial.
>
>I would probably split the workload across three independent NFS servers,
>and also try some basic tuning.  these are cheap, easy to do, and will
>definitely improve performance.
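Splitting the workload across independent servers can be as simple as
mounting different exports from different hosts; a hypothetical /etc/fstab
fragment (server names and paths invented for illustration):

```shell
# Spread the I/O load over three independent NFS servers, one per
# major workload, instead of funneling everything through one box.
# (hostnames and export paths are hypothetical)
nfs1:/export/home     /home     nfs  rsize=32768,wsize=32768     0 0
nfs2:/export/scratch  /scratch  nfs  rsize=32768,wsize=32768     0 0
nfs3:/export/apps     /apps     nfs  ro,rsize=32768,wsize=32768  0 0
```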
>
>more speculative things:
>
>	- use LACP or related techniques to provide more bandwidth out 
>	of the NFS server(s).  this will probably not improve the bandwidth
>	seen by a single node, but should come close to doubling the
>	aggregate.
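A rough sketch of Linux channel bonding in 802.3ad (LACP) mode on the
server side — interface names and the address are made up, and the
switch ports must be configured for LACP as well:

```shell
# Load the bonding driver in 802.3ad (LACP) mode with link monitoring.
modprobe bonding mode=802.3ad miimon=100
ip link set bond0 up

# Enslave two gigabit NICs to the bond (names are illustrative).
ifenslave bond0 eth0 eth1

# Give the bonded interface the NFS server's address (hypothetical).
ip addr add 192.168.1.10/24 dev bond0
```

As noted above, a single TCP stream (one client) still rides one physical
link; the win is in aggregate bandwidth across many clients.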
>
>	- try out FS-Cache - this is an add-on layer being promulgated by 
>	Red Hat which creates a local disk cache to offload your NFS server.
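In the form FS-Cache eventually took (cachefiles backend plus the
cachefilesd daemon), using it looks roughly like this — a sketch, not
the exact add-on patches being discussed in 2005:

```shell
# Start the cache daemon; it manages an on-disk cache, by default
# under /var/cache/fscache (configured in /etc/cachefilesd.conf).
service cachefilesd start

# Mount the NFS export with the fsc option so reads are cached on
# local disk, reducing repeat traffic to the server.
# (server name and paths are hypothetical)
mount -t nfs -o fsc fileserver:/export/apps /apps
```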
>
