Need comments about cluster file systems

hanzl at noel.feld.cvut.cz hanzl at noel.feld.cvut.cz
Fri Nov 15 07:49:54 PST 2002


Just a few clarifications of my not-so-carefully drafted words.

I wrote:
> (For sure others will point you to PVFS, which IMHO makes sense only
> if network card is quicker than local disk.)

I had no intention of insulting any of the PVFS authors, far from it. I
highly value their work and their nice and polite manner in any
discussion related to PVFS. I took the high regard for PVFS for granted,
so much so that I did not mention it when I wanted to refer
to just one specific situation: people on the beowulf mailing list often
recommend PVFS without regard to the particular needs of the person
asking the question.

I expected that the original poster would receive a bunch of messages
with a good enough description of PVFS, as usually happens, and I just
wanted to add my diff against them: the information that certain usage
patterns may be served even better than PVFS can serve them now. As he
mentioned local filesystems, I guessed his usage patterns might benefit
from a persistent file cache, which may act in a manner similar to
hand-distributing files but with much less administrative hassle.

As I live in a timezone with only a short overlap with US working hours
and it takes hours for messages to get through the beowulf mailing list, I
reacted even before I had seen these anticipated messages.

My apologies for any bad feelings I caused by this.

Walter B. Ligon III wrote:
> If your application is such that you know where you data needs to be
> before hand and you can run your computations on the same node, then
> you don't need anything more than the local file system. ...  If the
> data you need is on the local disk, PVFS gives you local disk speeds.
> If its not, you are limited by the network speed.  There is no way
> around that.

Well, I think there is a way around it: if one can force regular usage
patterns (e.g. by placing processes that repeatedly access particular
data on the same node again and again where possible), a persistent
file cache could help a lot. Compared to using local filesystems only,
one could save a lot of human work.
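
To make the idea concrete, here is a minimal sketch of a node-local
persistent file cache. It is only an illustration, not how Coda, PVFS
or any Scyld component works: the class name PersistentFileCache, the
hit/miss counters and the hash-based naming are all hypothetical, the
cache is read-only, and it never checks whether the shared copy has
changed.

```python
import hashlib
import os
import shutil


class PersistentFileCache:
    """Toy node-local cache; cached copies survive across job runs.

    Hypothetical illustration only: read-only, no invalidation.
    """

    def __init__(self, cache_dir):
        self.cache_dir = cache_dir
        os.makedirs(cache_dir, exist_ok=True)
        self.hits = 0
        self.misses = 0

    def _local_path(self, shared_path):
        # Key the cached copy by a hash of the shared (network) path.
        key = hashlib.sha1(shared_path.encode("utf-8")).hexdigest()
        return os.path.join(self.cache_dir, key)

    def open(self, shared_path):
        """Return a read-only file object backed by the local disk."""
        local = self._local_path(shared_path)
        if os.path.exists(local):
            # Data was fetched by an earlier access or job: local speed.
            self.hits += 1
        else:
            # First access on this node: pay the network cost once.
            self.misses += 1
            tmp = local + ".tmp"
            shutil.copyfile(shared_path, tmp)
            os.replace(tmp, local)  # publish the finished copy
        return open(local, "rb")
```

If a scheduler keeps sending the processes that read a given file to
the same node, only the first access pays the network cost; later
runs find the copy already in cache_dir, without anybody distributing
files by hand.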

> Our newest version of PVFS is designed so that things like that can
> be added as modules.

Thanks for the hint, I will look into this possibility.


Donald Becker wrote:
> > I am looking for any working opensource solution for persistent file
> > chaching.
> 
> Ours is Open Source, but we don't document it or provide it separately.
> It's an internal part of our system, not visible to users.

Well, then I do not feel too guilty that I did not know about it.

It is perfectly legal for Open Source (GPL?) software not to be on any
public server, of course. If it is GPL, it is also perfectly legal for
any Scyld customer to put it on a public server. So, can anybody
tell me: can I download it somewhere? At least some fragment?

Or are you referring to the BProc mechanisms, or some improved version
of them?

> I just think that "working and actively being improved" should get
> extra credit.

Agreed, of course. Any wording of mine that did not look like this is my
fault, sorry. I am looking for technical solutions, not for opportunities
to glorify vaporware.


> It's frustrating to hear people talk about how wonderful InterMezzo and
> Lustre _are_, and dismiss PVFS and GFS. Software that is not quite
> finished is always better and faster than software that already
> exists.  It only loses speed and features when reality looms.

I said Coda, not Lustre. Coda is a reality. It is not my fault that
some things related to Lustre upset some people (for good reason,
probably).


Best Regards

Vaclav


