[Beowulf] PVFS on 80 proc (40 node) cluster
Robert Latham
robl at mcs.anl.gov
Mon Nov 1 08:46:44 PST 2004
On Sun, Oct 31, 2004 at 10:14:44PM -0500, Brian Smith wrote:
> PVFS2 has much improved fault tolerance over PVFS1 in that there can be
> redundant file nodes whereas with PVFS1, if one node dropped dead, your
> FS was toast.
Just wanted to point out that through shared storage, 'heartbeat', and
enough hardware, you can have redundant PVFS1 and PVFS2 nodes. We do
not at this time have *software* redundancy. It's an area of active
research, though.
Please don't let the lack of software redundancy scare you off! Many
many sites have run PVFS and not found reliability to be a problem.
Your application can do its I/O, writing out checkpoints or reading
datafiles or whatever I/O it does to PVFS. After your application
runs, move the data to tape or long-term storage at your leisure.
PVFS is fast scratch space, and as long as you treat it as such,
everything should work just fine.
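
As a rough illustration of the "scratch first, archive afterwards" workflow
described above, here is a minimal Python sketch of a stage-out step that
could run after a job finishes. Everything in it is an assumption made for
illustration and not part of PVFS: the function name stage_out, the "*.ckpt"
file pattern, and the example paths are all hypothetical. It copies each
matching file from the scratch directory to an archive directory, compares
checksums, and deletes the scratch copy only once the archived copy matches.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def stage_out(scratch_dir, archive_dir, pattern="*.ckpt"):
    """Move files matching `pattern` from scratch to long-term storage.

    Hypothetical helper: the name, the default pattern, and the directory
    layout are assumptions for this sketch, not anything PVFS provides.
    """
    scratch = Path(scratch_dir)
    archive = Path(archive_dir)
    archive.mkdir(parents=True, exist_ok=True)

    moved = []
    for src in sorted(scratch.glob(pattern)):
        if not src.is_file():
            continue
        dst = archive / src.name
        shutil.copy2(src, dst)  # copy contents and timestamps
        if sha256_of(src) != sha256_of(dst):
            dst.unlink()  # discard the bad copy, keep the scratch original
            raise OSError(f"checksum mismatch while archiving {src}")
        src.unlink()  # scratch copy is no longer needed
        moved.append(dst)
    return moved


if __name__ == "__main__":
    # Example paths only; substitute your own mount point and archive area.
    for path in stage_out("/mnt/pvfs/myjob", "/archive/myjob"):
        print("archived", path)
```

The scratch copy is removed only after verification, so an interrupted or
failed stage-out leaves the original data where the application wrote it.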
> If you go to their web site, there should be plenty of documentation on
> how to set it up.
Yes. Also, feel free to take up this discussion on the PVFS
mailing lists.
==rob
--
Rob Latham
Mathematics and Computer Science Division A215 0178 EA2D B059 8CDF
Argonne National Labs, IL USA B29D F333 664A 4280 315B