distributed file systems
rross at mcs.anl.gov
Thu Sep 6 10:51:38 PDT 2001
There is no "best method" IMHO. PVFS is probably your best bet for a
scratch space for applications to store large data sets in, especially for
MPI-IO applications. It isn't good for home directories; it doesn't
cache, and all the metadata lookups make for very slow accesses in typical
home-directory use.
NFS is probably your best bet for home directories, but it isn't good for
large data sets or parallel access, for reasons of both performance (single
I/O node, limited protocol) and correctness (the client cache isn't
consistent and is difficult to disable).
So I would use PVFS for scratch space. I would then ask myself if I
really NEED /home on all the nodes. It's only nine nodes...you can copy
executables out quickly. If you can stand the inconvenience, performance
will be better if you just run applications off local disks.
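To make the "copy executables out" idea concrete, here is a rough sketch of what I mean, nothing more. The node names n1..n9, the destination directory /scratch/bin, and the executable name ./solver are all placeholders I made up for illustration; substitute whatever your cluster actually uses. It assumes rsync over passwordless ssh is already set up between the master and the nodes, and it defaults to a dry run that only prints the commands.

```python
import shlex
import subprocess


def build_copy_commands(executable, nodes, dest_dir="/scratch/bin"):
    """Return one rsync command (as an argv list) per node.

    dest_dir is a hypothetical local-disk directory on each node.
    """
    return [
        ["rsync", "-a", executable, f"{node}:{dest_dir}/"]
        for node in nodes
    ]


def push_executable(executable, nodes, dest_dir="/scratch/bin", dry_run=True):
    """Copy an executable to every node's local disk.

    With dry_run=True the commands are only printed, not executed.
    """
    commands = build_copy_commands(executable, nodes, dest_dir)
    for argv in commands:
        if dry_run:
            print(shlex.join(argv))
        else:
            # check=True raises CalledProcessError if a copy fails
            subprocess.run(argv, check=True)
    return commands


# Hypothetical node names for a nine-node cluster.
NODES = [f"n{i}" for i in range(1, 10)]

if __name__ == "__main__":
    push_executable("./solver", NODES)
```

Once the printed commands look right, call push_executable with dry_run=False. The copies run one after another, which should be fine for nine nodes; whether that stays tolerable on a larger cluster is a separate question.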
On Fri, 31 Aug 2001, Jon Tegner wrote:
> We have a small cluster consisting of nine nodes, and we are currently
> exporting /home from the master to every node using nfs.
> We have also tried using pvfs, using partitions from all nodes in one
> "parallel partition", something which was slower than only using
> nfs (since we don't have access to the source of the codes we use, we
> cannot write in parallel). Maybe it would be better to only use two
> nodes for every parallel partition, i.e., n1 and n2 build home1, n3
> and n4 build home2 ... ?
> Haven't tried, but it seems that afs should be slightly faster than
> nfs, see
> My question now is if you have any suggestion of a "best method" for
> a distributed file system to use in a cluster environment.
> Jon Tegner