[Beowulf] how large of an installation have people used NFS with? would 300 mounts kill performance?
gmkurtzer at gmail.com
Wed Sep 9 12:32:03 PDT 2009
On Wed, Sep 9, 2009 at 10:40 AM, Rahul Nabar <rpnabar at gmail.com> wrote:
> Our new cluster aims to have around 300 compute nodes. I was wondering
> what is the largest setup people have tested NFS with? Any tips or
> comments? There seems to be no way for me to say if it will scale well or
> not. I have been warned of performance hits but how bad will they be?
> Infiniband is touted as a solution but the economics don't work out.
> My question is this:
> Assume each of my compute nodes has gigabit ethernet AND I specify
> the switch such that it can handle full line capacity on all ports.
> Will there still be performance hits as I start adding compute nodes?
> Why? Or is it unrealistic to configure a switching setup with full
> line capacities on 300 ports?
> If not NFS, then options like Lustre do exist. But the more I read about
> those the more I am convinced that those open another big can of
> worms. Besides, if NFS would work I do not want to switch.
NFS itself doesn't have any hard limits and I have seen clusters well
over a thousand nodes using it. With that said, you also need to
consider your application and user requirements and the budget
including administration costs as you architect your resource.
As an aside, generally the more specialized or non-standard the
implementation, the more pressure you will put on administration.
Keep in mind that the requirements of the system and budget need to
define the architecture of the system. NFS is a good choice and can be
suitable for systems much larger than 300 nodes. *BUT* that would
depend on what you are doing with the cluster, application IO
requirements, usage patterns, user needs, and reliability/uptime goals.
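To put a rough number on the application IO question, here is a back-of-envelope sketch. It assumes a single NFS server behind one network link, with every active client streaming at the same time, and it ignores protocol overhead, disk speed, and caching. The function name and the link speeds are illustrative assumptions, not measurements from any real cluster.

```python
def per_client_share_mb_s(server_link_gbit, active_clients, client_link_gbit=1.0):
    """Best-case MB/s per client when all active clients stream at once.

    Assumes one server link shared evenly; ignores protocol overhead,
    server disk throughput, and client-side caching.
    """
    if active_clients < 1:
        raise ValueError("active_clients must be at least 1")
    server_mb_s = server_link_gbit * 1000 / 8   # Gbit/s -> MB/s
    client_mb_s = client_link_gbit * 1000 / 8   # a client cannot exceed its own link
    return min(server_mb_s / active_clients, client_mb_s)


if __name__ == "__main__":
    for link in (1, 10):                 # hypothetical server link speeds in Gbit/s
        for clients in (10, 100, 300):   # clients doing IO simultaneously
            share = per_client_share_mb_s(link, clients)
            print(f"{link:>2} Gbit/s server link, {clients:>3} active clients: "
                  f"{share:7.2f} MB/s each")
```

Under these assumptions, a non-blocking switch does not help much if all 300 nodes hit one server link at once: the server side is the shared resource. Whether that matters depends on how often your jobs actually do IO at the same time.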
Hope that helps. ;)
Greg M. Kurtzer
Chief Technology Officer
HPC Systems Architect
Infiscale, Inc. - http://www.infiscale.com