[Beowulf] I/O workload of an application in distributed file system
Robert Latham
robl at mcs.anl.gov
Mon Nov 26 08:06:03 PST 2007
On Thu, Nov 22, 2007 at 10:15:25AM -0500, Mark Hahn wrote:
> with that in mind, my opinion is that cluster IO testing should be
> a combination of:
> - parallel streaming IO to separate files - resembling a checkpoint,
> or an IO-intensive app reading, or an app where the user forgot to
> turn off debugging.
> - smallish metadata-heavy traffic like time(tar zxf;make;make clean).
The word 'distributed' in the subject is telling... I like to make a
distinction between 'distributed', 'cluster', and 'parallel' file
systems.
Distributed: uncoordinated access among processes. Possibly over the
wide area. Total capacity is important, but performance is not.
Cluster: local access only. Maybe homedir-style accesses (lots of
metadata operations, lots of small file creation/reading/writing --
unpack a tarball, compile a kernel). Also has uncoordinated access
among many processes.
Parallel: a high performance file system for parallel applications
doing large amounts of I/O. Coordinated access, likely via MPI-IO.
This is veering a bit off topic from the original question...
I'd like to suggest that I/O to separate files, while certainly a
popular I/O workload, should be considered a legacy workload, or at
the very least not a high-performance workload.
Applications should be encouraged if at all possible to do their I/O
to a single large file. Supercomputer applications, further, should do
all their I/O through either MPI-IO or a high-level library on top of
MPI-IO (parallel-HDF5, parallel-NetCDF, etc).
Lots of files complicates the data management problem and eliminates
several optimization opportunities for the I/O software stack.
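To make the single-file idea concrete, here is a minimal sketch. It is
not MPI-IO itself: it is a plain Python stand-in that uses POSIX pwrite
so each simulated rank writes its block at offset rank * block_size in
one shared file. The names write_shared and read_block, and the block
size, are hypothetical and chosen only for illustration. A real parallel
application would issue the equivalent writes through MPI-IO (for
example the collective MPI_File_write_at_all) or a library on top of it,
which is where the I/O stack gets a chance to optimize.

```python
import os
import tempfile


def write_shared(path, nranks, block_size):
    """Simulate nranks processes writing one block each into a single file.

    Assumption: rank r owns the byte range [r * block_size, (r + 1) * block_size).
    """
    fd = os.open(path, os.O_CREAT | os.O_WRONLY, 0o644)
    try:
        # Write in reverse rank order on purpose: with explicit offsets,
        # the order in which ranks write does not affect the layout.
        for rank in reversed(range(nranks)):
            payload = bytes([ord("A") + rank]) * block_size
            os.pwrite(fd, payload, rank * block_size)
    finally:
        os.close(fd)


def read_block(path, rank, block_size):
    """Read back the block owned by the given rank."""
    fd = os.open(path, os.O_RDONLY)
    try:
        return os.pread(fd, block_size, rank * block_size)
    finally:
        os.close(fd)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        shared = os.path.join(tmp, "shared.dat")
        write_shared(shared, nranks=4, block_size=8)
        print(os.path.getsize(shared), read_block(shared, 2, 8))
```

Because every block has a computed offset, the result is one file with a
predictable layout, rather than one file per process to keep track of.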
==rob
--
Rob Latham
Mathematics and Computer Science Division A215 0178 EA2D B059 8CDF
Argonne National Lab, IL USA B29D F333 664A 4280 315B