[Beowulf] network filesystem

Robert Latham robl at mcs.anl.gov
Tue Mar 6 07:53:41 PST 2007

On Mon, Mar 05, 2007 at 11:08:28AM -0500, Mark Hahn wrote:

> writing to different sections of a file is probably wrong on any 
> networked FS, since there will inherently be obscure interactions 
> with the size and alignment of the writes vs client pagecache,

I'm rather surprised to see that sentiment on a mailing list for high
performance clusters :>

I would contend that writing to different sections of a file *must* be
supported by any file system deployed on a cluster.  How else would
you get good performance from MPI-IO?

PVFS, GPFS, and Lustre all support simultaneous writes to different
sections of a file.
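
As a rough illustration of the access pattern in question (not of MPI-IO
itself), here is a minimal sketch in which several writers each put one
block into a different section of the same file using positional writes.
Everything in it is an assumption made for illustration: threads on a
local POSIX file system stand in for MPI ranks, and the names BLOCK,
write_block, write_shared_file, and demo are hypothetical. In a real MPI
program the analogous call would be MPI_File_write_at with an offset
derived from the rank.

```python
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor

BLOCK = 4096  # bytes per writer; hypothetical block size chosen for the demo


def write_block(fd, rank):
    """Write this writer's block at its own offset, as one rank would."""
    data = bytes([ord("A") + rank]) * BLOCK
    # os.pwrite takes an explicit offset, so writers share no file pointer.
    # It may write fewer bytes than requested; a robust version would loop.
    return os.pwrite(fd, data, rank * BLOCK)


def write_shared_file(path, nwriters):
    """Have nwriters threads write disjoint, non-overlapping blocks of path."""
    fd = os.open(path, os.O_CREAT | os.O_WRONLY | os.O_TRUNC, 0o644)
    try:
        with ThreadPoolExecutor(max_workers=nwriters) as pool:
            return list(pool.map(lambda r: write_block(fd, r), range(nwriters)))
    finally:
        os.close(fd)


def demo(nwriters=4):
    """Write the file in a temporary directory and return its contents."""
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "shared.dat")
        write_shared_file(path, nwriters)
        with open(path, "rb") as f:
            return f.read()


if __name__ == "__main__":
    contents = demo(4)
    # Each writer's block should hold only that writer's fill byte.
    ok = all(
        contents[r * BLOCK:(r + 1) * BLOCK] == bytes([ord("A") + r]) * BLOCK
        for r in range(4)
    )
    print("blocks intact:", ok)
```

On a local disk this is uninteresting; the point is that each writer
touches only its own byte range, which is the pattern a cluster file
system has to handle correctly when the writers are on different nodes.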

> in my experience, people who expect it to "just work" have an
> incredibly naive model of how a network FS works (ie, write()
> produces an RPC direct to the server)

I agree that the POSIX API and consistency semantics make it difficult
to achieve high I/O rates for common scientific workloads, and that
NFS is probably not the best solution for those truly parallel workloads.

Fortunately, there are good alternatives out there.


Rob Latham
Mathematics and Computer Science Division    A215 0178 EA2D B059 8CDF
Argonne National Lab, IL USA                 B29D F333 664A 4280 315B
