[Beowulf] PVFS on 80 proc (40 node) cluster

Robert Latham robl at mcs.anl.gov
Mon Nov 1 08:46:44 PST 2004

On Sun, Oct 31, 2004 at 10:14:44PM -0500, Brian Smith wrote:
> PVFS2 has much improved fault tolerance over PVFS1 in that there can be
> redundant file nodes whereas with PVFS1, if one node dropped dead, your
> FS was toast.

Just wanted to point out that through shared storage, 'heartbeat', and
enough hardware, you can have redundant PVFS1 and PVFS2 nodes.  We do
not at this time have *software* redundancy.  It's an area of active
research, though.

Please don't let the lack of software redundancy scare you off!  Many
many sites have run PVFS and not found reliability to be a problem.
Your application can do its I/O, writing out checkpoints or reading
datafiles or whatever I/O it does to PVFS.  After your application
runs, move the data to tape or long-term storage at your leisure.
PVFS is fast scratch space, and as long as you treat it as such,
everything should work just fine.
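
To make the "move it off scratch afterwards" step concrete, here is a
minimal sketch of a post-run drain script.  It is only an illustration,
not anything shipped with PVFS: the function names and the directory
paths in the usage note are hypothetical, and it assumes the archive
target is an ordinary mounted filesystem rather than a tape system with
its own tools.

```python
import hashlib
import shutil
from pathlib import Path


def sha256sum(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def drain_scratch(scratch_dir, archive_dir):
    """Copy each file under scratch_dir to archive_dir, verify it, then
    delete the scratch copy.  Returns the list of archived paths.

    Both directory arguments are hypothetical placeholders for a PVFS
    mount point and a long-term storage location.
    """
    scratch = Path(scratch_dir)
    archive = Path(archive_dir)
    archived = []
    for src in sorted(p for p in scratch.rglob("*") if p.is_file()):
        dst = archive / src.relative_to(scratch)
        dst.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src, dst)  # copies data and timestamps
        # Only remove the scratch copy once the archive copy checks out.
        if sha256sum(src) != sha256sum(dst):
            raise IOError(f"checksum mismatch for {src}")
        src.unlink()
        archived.append(dst)
    return archived
```

You would run something like drain_scratch("/mnt/pvfs2/myrun",
"/archive/myrun") once the job has finished (both paths made up for the
example).  Verifying before deleting means a failed copy leaves the
scratch data in place, so the script can simply be rerun.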

> If you go to their web site, there should be plenty of documentation on
> how to set it up.  

Yes.  Also, feel free to take up this discussion on the PVFS
mailing lists. 


Rob Latham
Mathematics and Computer Science Division    A215 0178 EA2D B059 8CDF
Argonne National Labs, IL USA                B29D F333 664A 4280 315B
