[Beowulf] I/O workload of an application in distributed file system

Lombard, David N dnlombar at ichips.intel.com
Tue Nov 27 07:05:23 PST 2007

On Mon, Nov 26, 2007 at 10:06:03AM -0600, Robert Latham wrote:
> The word 'distributed' in the subject is telling... I like to make a
> distinction between 'distributed', 'cluster', and 'parallel' file
> systems.
> Distributed:  uncoordinated access among processes.  Possibly over the
> wide area. Total capacity is important, but performance is not.
> Cluster: local access only.  Maybe homedir-style accesses (lots of
> metadata operations, lots of small file creation/reading/writing --
> unpack a tarball, compile a kernel).  Also has uncoordinated access
> among many processes.
> Parallel: a high performance file system for parallel applications
> doing large amounts of I/O.  Coordinated access, likely via MPI-IO.

Hmmm.  Your "distributed" class could also require a high-performance
parallel file system.  Consider a parallel "application" transformed
into a set of embarrassingly parallel (EP) "jobs", all accessing the
same very large files.
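
To make the EP case concrete, here is a minimal sketch of what one such
job might do.  It is illustrative only: the function names, the even
byte-range split, and the choice of Python are assumptions, not
anything taken from a real application.  Each job is launched
independently with its own index and reads only its slice of the shared
file, so there is no coordination between jobs, yet the file system
still sees many clients streaming from the same large file at once.

```python
import os


def byte_range(job_index, num_jobs, file_size):
    """Return (offset, length) of the slice owned by one job.

    Hypothetical helper: splits the file into num_jobs contiguous,
    nearly equal byte ranges.
    """
    chunk = -(-file_size // num_jobs)  # ceiling division
    offset = min(job_index * chunk, file_size)
    length = min(chunk, file_size - offset)
    return offset, length


def read_slice(path, job_index, num_jobs):
    """Read this job's slice of the shared file, independently of other jobs."""
    fd = os.open(path, os.O_RDONLY)
    try:
        size = os.fstat(fd).st_size
        offset, length = byte_range(job_index, num_jobs, size)
        data = bytearray()
        # pread may return fewer bytes than requested, so loop until done.
        while len(data) < length:
            block = os.pread(fd, length - len(data), offset + len(data))
            if not block:  # unexpected end of file
                break
            data += block
        return bytes(data)
    finally:
        os.close(fd)
```

A scheduler would start num_jobs copies of this, each with a different
job_index.  Nothing here uses MPI-IO or any collective call, which is
the point: the access is uncoordinated, but the aggregate bandwidth
demand looks like that of a parallel application.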

David N. Lombard, Intel, Irvine, CA
I do not speak for Intel Corporation; all comments are strictly my own.
