[Beowulf] network filesystem

Robert Latham robl at mcs.anl.gov
Tue Mar 6 07:53:41 PST 2007


On Mon, Mar 05, 2007 at 11:08:28AM -0500, Mark Hahn wrote:

> writing to different sections of a file is probably wrong on any 
> networked FS, since there will inherently be obscure interactions 
> with the size and alignment of the writes vs client pagecache,

I'm rather surprised to see that sentiment on a mailing list for high
performance clusters :>

I would contend that writing to different sections of a file *must* be
supported by any file system deployed on a cluster.  How else would
you get good performance from MPI-IO?

PVFS, GPFS, and Lustre all support simultaneous writes to different
sections of a file.
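
As a rough sketch of the access pattern in question, assuming a POSIX
system and a local temporary file rather than a real parallel file
system: each writer below opens the shared file itself and uses
os.pwrite to fill its own non-overlapping block, much as each MPI-IO
rank would write at its own offset.  The names BLOCK, write_block,
parallel_write, and demo are made up for this example, and threads
stand in for processes on separate cluster nodes.

```python
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor

BLOCK = 4096  # bytes per writer; an arbitrary size chosen for the example


def write_block(path, rank):
    """Write this writer's block at offset rank * BLOCK using its own descriptor."""
    data = bytes([ord("A") + rank]) * BLOCK
    fd = os.open(path, os.O_WRONLY)
    try:
        # pwrite takes an explicit offset, so writers share no file position.
        return os.pwrite(fd, data, rank * BLOCK)
    finally:
        os.close(fd)


def parallel_write(path, nwriters):
    """Create the file, then let nwriters fill disjoint blocks concurrently."""
    with open(path, "wb"):
        pass  # create or truncate the shared file
    with ThreadPoolExecutor(max_workers=nwriters) as pool:
        return list(pool.map(lambda rank: write_block(path, rank), range(nwriters)))


def demo(nwriters=4):
    """Run the writers against a temporary file and return its contents."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "shared.dat")
        parallel_write(path, nwriters)
        with open(path, "rb") as f:
            return f.read()
```

On a local file system this says nothing about performance or about
client-side caching on a networked one; it only shows that the regions
are disjoint, so the writers need no coordination beyond agreeing on
offsets.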

> in my experience, people who expect it to "just work" have an
> incredibly naive model of how a network FS works (ie, write()
> produces an RPC direct to the server)

I agree that the POSIX API and consistency semantics make it difficult
to achieve high I/O rates for common scientific workloads, and that
NFS is probably not the best solution for those truly parallel workloads.

Fortunately, there are good alternatives out there.

==rob

-- 
Rob Latham
Mathematics and Computer Science Division    A215 0178 EA2D B059 8CDF
Argonne National Lab, IL USA                 B29D F333 664A 4280 315B


