[Beowulf] I/O workload of an application in distributed file system

Robert Latham robl at mcs.anl.gov
Mon Nov 26 08:06:03 PST 2007

On Thu, Nov 22, 2007 at 10:15:25AM -0500, Mark Hahn wrote:
> with that in mind, my opinion is that cluster IO testing should be
> a combination of:
> 	- parallel streaming IO to separate files - resembling a checkpoint,
> 	or an IO-intensive app reading, or an app where the user forgot to
> 	turn off debugging.
> 	- smallish metadata-heavy traffic like time(tar zxf;make;make clean).

The word 'distributed' in the subject is telling... I like to make a
distinction between 'distributed', 'cluster', and 'parallel' file
systems:

Distributed:  uncoordinated access among processes.  Possibly over the
wide area.  Total capacity is important, but performance is not.

Cluster: local access only.  Maybe homedir-style accesses (lots of
metadata operations, lots of small file creation/reading/writing --
unpack a tarball, compile a kernel).  Also has uncoordinated access
among many processes.

Parallel: a high performance file system for parallel applications
doing large amounts of I/O.  Coordinated access, likely via MPI-IO.

This is veering a bit off topic from the original question... 

I'd like to suggest that I/O to separate files, while certainly a
popular I/O workload, should be considered a legacy workload, or at
the very least not a high-performance workload.

Applications should be encouraged, if at all possible, to do their I/O
to a single large file.  Supercomputer applications, further, should do
all their I/O through either MPI-IO or a high-level library on top of
MPI-IO (parallel-HDF5, parallel-NetCDF, etc).
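The single-file approach amounts to giving each process a disjoint region of
one shared file instead of a file of its own.  A minimal sketch of that idea,
using only plain POSIX-style offset writes from the Python standard library
(a real parallel application would use MPI_File_write_at or parallel-HDF5
instead; the rank count and record size below are made up for illustration):

```python
import os

NUM_RANKS = 4      # hypothetical number of parallel processes
RECORD_SIZE = 16   # bytes each rank contributes to the checkpoint

def write_checkpoint(path, rank_data):
    """Each 'rank' writes its block at offset rank * RECORD_SIZE in one
    shared file -- the single-file analogue of N separate per-rank files."""
    fd = os.open(path, os.O_CREAT | os.O_WRONLY, 0o644)
    try:
        for rank, data in enumerate(rank_data):
            # Disjoint offsets mean no locking is needed between writers.
            os.pwrite(fd, data, rank * RECORD_SIZE)
    finally:
        os.close(fd)

if __name__ == "__main__":
    ranks = [bytes([r]) * RECORD_SIZE for r in range(NUM_RANKS)]
    write_checkpoint("checkpoint.dat", ranks)
    # The whole checkpoint is now a single file, trivially copied or archived.
    print(os.path.getsize("checkpoint.dat"))  # 64
```

The loop here stands in for what N processes would do concurrently; the point
is that the data management problem (one file versus thousands) and the
optimization opportunities (collective buffering, large contiguous requests)
both follow from the shared-file layout.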

Lots of files complicates the data management problem and eliminates
several optimization opportunities for the I/O software stack.


Rob Latham
Mathematics and Computer Science Division    A215 0178 EA2D B059 8CDF
Argonne National Lab, IL USA                 B29D F333 664A 4280 315B
