Locality and caching in parallel/distributed file systems

Joseph Landman landman at scalableinformatics.com
Tue Dec 3 03:46:29 PST 2002

On Tue, 2002-12-03 at 19:57, Andrew Fant wrote:
> Morning all,
> 	Lately, I have been thinking a lot about parallel filesystems in
> the most unrigorous way possible.  Knowing that PVFS simply stripes the
> data across the participating filesystems, I was wondering if anyone had
> tried to apply caching technology and file migration capacities to a
> parallel/distributed filesystem in a manner analogous to SGI's ccNUMA
> memory architecture.  That is, distributing files in the FS to various
> nodes, keeping track of where the accesses are coming from, and moving
> the file to another node if that is where some suitable percentage of the

cough cough <avaki> cough cough...

Distributed parallel file systems need distributed data and local-speed
access to make any sense.  I am sure others may disagree, but any file
system that has to shuttle metadata about will generally not scale well
(unless you have NUMAlink-like speed/latency, which pushes the scaling
wall way out, but it is still there).  Cluster file systems have been
the rage in the past as one of the next great things.  I advocate
waiting and seeing on this, as I have not yet seen a scalable
distributed file system (and if someone knows of one that is not too
painful, please let me know).  My definition of a scalable distributed
file system is, BTW, one that connects to every compute node and gives
local I/O speed for simultaneous reads and writes (to the same or
different files) across the single namespace.  This definition may not
be in line with others', but it is what I use to understand the issues.

The idea in building any scalable resource (net, computing, disk, etc.)
is to avoid single points of information flow.  Maintaining metadata for
file systems creates exactly such a point.  You get hot-spot formation,
and then have to do interesting gymnastics to overcome it (if it can be
overcome at all).
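
To make the hot-spot point concrete, here is a rough toy model, not a
description of how PVFS or any real file system places metadata.  It
counts metadata lookups per server when everything goes through one
server versus when paths are hashed across several.  The server names
(mds0, mds1, ...), the path layout, and the hashing scheme are all
made up for illustration.

```python
import hashlib
from collections import Counter


def central_load(paths):
    """All metadata lookups land on one (hypothetical) server, mds0."""
    return Counter({"mds0": len(paths)})


def hashed_load(paths, n_servers):
    """Spread lookups over n_servers by hashing the path name.

    The mdsN names and the SHA-1-mod-N placement are assumptions made
    for this sketch only.
    """
    load = Counter()
    for path in paths:
        digest = hashlib.sha1(path.encode()).digest()
        idx = int.from_bytes(digest[:4], "big") % n_servers
        load[f"mds{idx}"] += 1
    return load


def peak(load):
    """Lookups seen by the busiest server: the size of the hot spot."""
    return max(load.values())


# 64 compute nodes, each touching 100 distinct files.
spread = [f"/scratch/job/node{n}/file{i}"
          for n in range(64) for i in range(100)]

# The same number of lookups, but all against one shared file.
hot = ["/scratch/job/shared.dat"] * len(spread)

print("central, distinct files:", peak(central_load(spread)))
print("hashed,  distinct files:", peak(hashed_load(spread, 8)))
print("hashed,  one hot file:  ", peak(hashed_load(hot, 8)))
```

With distinct files, hashing cuts the peak load to a fraction of the
total.  With one shared file, every lookup still lands on a single
server, so the hot spot is exactly as large as in the central case.
That is the sort of situation where the gymnastics start.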

Data motion is rapidly becoming one of the hardest issues to deal with. 
Good thread start there Andy!

Joseph Landman, Ph.D.
Scalable Informatics LLC
email:   landman at scalableinformatics.com
  web:   http://scalableinformatics.com
voice:  +1 734 612 4615
  fax:  +1 734 398 5774

More information about the Beowulf mailing list