distributed storage
Robert Ross rross at mcs.anl.gov
Sat Jun 1 20:31:07 PDT 2002
- Previous message: License Server Redundancy [WAS: Re: LSF and Hyperthreading]
- Next message: Power controlers... on another note.
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
Hi,

What do you want to do with all that space? Store large datasets? Store usenet news?

The expected workload has a HUGE impact on your choices, because it says things about a) how many nodes files need to be spread across to get the bandwidth you need, and b) what the access patterns to the file system might look like.

You've mentioned that you are concerned about availability in the presence of node failures. That makes good sense. And for PVFS, if a node that has your file's data goes down, then you can't get to that file. That's a little different than your description, and the difference could be important.

If, for example, you really just want to use the space and it's ok to only get a single link's worth of bandwidth, then you could set up a PVFS volume that puts files on single servers. Then any given file has at most two nodes on which it is dependent (metadata server and the node on which the data is stored). Most data would still be available in most failure cases.

Likewise you could set up the system to stripe to only two I/O servers and so on.

It's not the ideal solution by any means, but it is a good tradeoff between availability and performance given the packages available. There aren't too many choices at the "free" price point. Cross-mounted NFS volumes would be the other one that I can think of :).

Rob
---
Rob Ross, Mathematics and Computer Science Division, Argonne National Lab

On Thu, 30 May 2002, Yudong Tian wrote:

> Hi everyone!
> I'd like to hear your expert opinion on how to effectively use the
> distributed storage capacity of a cluster. Suppose you have hundreds
> of nodes, each with a local hard disk, thus the accumulated capacity
> is huge. Also you have a large amount of data to be stored. How can we
> reliably take advantage of the storage on the nodes?
> I checked out PVFS. It is nice, but it seems to me the reliability
> is an issue -- if one of the storage nodes goes down, the whole FS is
> not available (correct me if I am wrong). CXFS and GFS are not our
> options because they cost $$. Any suggestions on how we can best use
> the capacity, while having no problems when one or two nodes crash?
>
> Thanks.
> -------------------------------------------------------
> Falun Dafa: Truthfulness Benevolence Forbearance
> http://www.falundafa.org
> -------------------------------------------------------
> Yudong Tian, Ph.D. GSFC/NASA (301) 504-5825
>
>
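
The availability side of the stripe-width tradeoff discussed in the reply can be put into rough numbers. The sketch below is only a toy model, not PVFS code: the function name file_availability and its parameters are made up for illustration, and it assumes that each file's I/O servers are picked uniformly at random, that the metadata server stays up, and that a file is unreadable as soon as any one of its I/O servers is down.

```python
from math import comb


def file_availability(num_io_servers, stripe_width, failed_io_servers):
    """Probability that one file is still readable (toy model).

    Assumptions (illustrative, not taken from PVFS itself):
      * the file is striped across `stripe_width` distinct I/O servers
        chosen uniformly at random out of `num_io_servers`;
      * exactly `failed_io_servers` I/O servers are down;
      * the metadata server is up;
      * losing any server that holds part of the file makes it unreadable.
    """
    if not 1 <= stripe_width <= num_io_servers:
        raise ValueError("stripe_width must be between 1 and num_io_servers")
    if not 0 <= failed_io_servers <= num_io_servers:
        raise ValueError("failed_io_servers must be between 0 and num_io_servers")

    healthy = num_io_servers - failed_io_servers
    # Number of ways to place the stripe entirely on healthy servers,
    # divided by the number of ways to place it at all.
    # math.comb returns 0 when stripe_width > healthy.
    return comb(healthy, stripe_width) / comb(num_io_servers, stripe_width)


if __name__ == "__main__":
    servers, failed = 100, 2
    for width in (1, 2, 8, servers):
        share = file_availability(servers, width, failed)
        print(f"stripe width {width:3d}: {share:.1%} of files expected readable")
```

Under these assumptions, with 100 I/O servers and 2 of them down, single-server placement leaves about 98% of files readable, two-way striping about 96%, and striping every file across all 100 servers leaves none readable. Bandwidth moves the other way, so the workload still decides which width is acceptable.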
More information about the Beowulf mailing list
