[Beowulf] distributed file storage solution?
bill at cse.ucdavis.edu
Mon Dec 11 17:53:58 PST 2006
Eric Thibodeau wrote:
> You can look into OpenAFS but be warned that you have to know infrastructure software quite well (LDAP+kerberos). It's cross-platform, can be distributed but don't think it's up to multiple writes on different mirrors though.
Indeed. There are many tough compromises in distributed filesystems, because
there are many conflicting goals. Coherency vs performance is a big one: you
pretty much get one or the other. Locking is another ugly one: databases
and some applications assume byte range locking, which is sometimes available,
sometimes not. Many unix programs assume posix locking, which again is only
sometimes available. So, unfortunately, it's easy to ask for a distributed
filesystem which does not exist.
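To make the locking point concrete, here is a minimal sketch (Python on a unix system) of the kind of range lock such applications take through fcntl. The helper names try_byte_range_lock and probe_directory are made up for illustration, and the offsets are arbitrary.

```python
import fcntl
import os
import tempfile


def try_byte_range_lock(fd, start, length):
    """Try a non-blocking exclusive POSIX lock on bytes [start, start + length)."""
    try:
        fcntl.lockf(fd, fcntl.LOCK_EX | fcntl.LOCK_NB, length, start, os.SEEK_SET)
    except OSError:
        # Either another process holds a conflicting lock, or the
        # filesystem does not support this kind of locking.
        return False
    # Release the range again so the probe leaves no lock behind.
    fcntl.lockf(fd, fcntl.LOCK_UN, length, start, os.SEEK_SET)
    return True


def probe_directory(dirpath):
    """Create a scratch file in dirpath and report whether a range lock works."""
    with tempfile.NamedTemporaryFile(dir=dirpath) as scratch:
        scratch.write(b"\0" * 128)
        scratch.flush()
        return try_byte_range_lock(scratch.fileno(), 32, 64)
```

Calling probe_directory("/some/mount/point") on a mounted filesystem gives a quick yes/no. Note that a successful call only shows the client accepted the request; it does not prove the lock is enforced against other clients of the same filesystem.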
I'll provide my current brain dump on the various pieces I've been tracking.
I'm sure there are some inaccuracies included, but hopefully they are small
ones. As always, comments and corrections are welcome.
A high level overview of OpenAFS:
* OpenAFS is distributed, but not p2p.
* performs well (assuming cache friendliness, and a single peer accessing
the same files/directories)
* scales well (for reads, because RO volumes can be replicated)
* has a universal namespace
* places little trust in a peer (getting root on a client != ability to
read all files)
* allows for transparent volume migration (the client doesn't complain when a
volume is migrated)
* perfect coherency (via a subscription model)
* It also supports linux, OSX, and Windows (among others).
* relatively complex.
NFS in contrast:
* Isn't distributed (unless you count automount)
* has loose coherency (poll based)
* No replication (corrections?)
* Doesn't scale easily
* Volume migration isn't easy (nfs4 claims to enable this, I've yet to see it
demonstrated in the real world).
* Is mostly unix specific (Microsoft had an NFS client but MS EoL'd it?)
* relatively simple
* client server
* scales extremely well, seems popular on the largest of clusters.
* Can survive hardware failures assuming more than 1 block server is connected
to each set of disks
* unix only.
* relatively complex.
* Client server
* scales well
* can not survive a block server death.
* unix only
* relatively simple.
* designed for use within a cluster.
* claims scalability to billions of users
* Highly available/byzantine fault tolerant
* in prototype stage
* Requires use of an API (AFAIK it is not available as a transparently mounted
filesystem).
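The difference between "perfect coherency (via a subscription model)" and "loose coherency (poll based)" in the lists above can be sketched with a toy model. This is not the AFS or NFS wire protocol; the classes Server, CallbackCache and PollingCache are hypothetical, and time is passed in explicitly as a plain number to keep the example deterministic.

```python
class Server:
    """Toy file server that remembers which caches hold a copy of each key."""

    def __init__(self):
        self.data = {}
        self.subscribers = {}  # key -> set of caches to notify on change

    def read(self, key, cache=None):
        if cache is not None:
            self.subscribers.setdefault(key, set()).add(cache)
        return self.data.get(key)

    def write(self, key, value):
        self.data[key] = value
        # Notify every subscribed cache that its copy is now stale.
        for cache in self.subscribers.pop(key, set()):
            cache.invalidate(key)


class CallbackCache:
    """Subscription model: trust the cached copy until the server says otherwise."""

    def __init__(self, server):
        self.server = server
        self.entries = {}

    def read(self, key):
        if key not in self.entries:
            self.entries[key] = self.server.read(key, cache=self)
        return self.entries[key]

    def invalidate(self, key):
        self.entries.pop(key, None)


class PollingCache:
    """Poll model: trust the cached copy for ttl time units, then refetch."""

    def __init__(self, server, ttl):
        self.server = server
        self.ttl = ttl
        self.entries = {}  # key -> (value, time fetched)

    def read(self, key, now):
        hit = self.entries.get(key)
        if hit is None or now - hit[1] >= self.ttl:
            hit = (self.server.read(key), now)
            self.entries[key] = hit
        return hit[0]
```

After a write on the server, the CallbackCache sees the new value on its next read, while the PollingCache keeps returning the old value until its ttl expires. That window of staleness is the price of not having the server track its clients.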
So the end result (from my skewed perspective) is:
* NFS is hugely popular and easy, but not very secure (at least by default)
and has poor coherency. For things like sharing /home within a cluster it
works reasonably well. Seems most appropriate for LAN usage. Diskless to most
implies NFS (and works well within a cluster or LAN).
* Lustre and PVFS2 are popular for sharing files in larger clusters
where more than a single file server's worth of bandwidth is required.
Both, I believe, scale well with bandwidth but only allow for a single
metadata server, so they will ultimately scale only as far as a single
machine for metadata intensive workloads (such as lock intensive, directory
intensive, or file creation/deletion intensive workloads). Granted, this
also allows for exotic hardware solutions (like solid state storage) if you
really need the performance.
* AFS is popular for internet wide file service, researchers love the ability
to run an application that requires 100 different libraries anywhere in the
world. Sysadmins love it because they can migrate volumes without having
to notify users or schedule downtime. I believe performance is usually
somewhat less than NFS within a cluster (because of higher overhead), and
usually significantly better outside a cluster (better caching and
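The single metadata server point in the Lustre/PVFS2 item above can be put as a back-of-envelope model. This is a rough sketch with made-up parameters, not a measurement of either filesystem: the function name cluster_file_rate and every number passed to it are assumptions for illustration.

```python
def cluster_file_rate(n_data_servers, per_server_mb_s, file_size_mb,
                      mds_ops_per_s, ops_per_file=1):
    """Files per second under a simple two-limit model.

    Bandwidth grows with the number of data servers, while metadata
    throughput is capped by the one metadata server.
    """
    bandwidth_limit = n_data_servers * per_server_mb_s / file_size_mb
    metadata_limit = mds_ops_per_s / ops_per_file
    return min(bandwidth_limit, metadata_limit)
```

With large files the bandwidth term is the smaller one, so doubling n_data_servers doubles the rate. With many tiny files the metadata term is the smaller one, and adding data servers changes nothing; only a faster metadata server helps.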
I'm less familiar with the various commercial filesystems like ibrix.
Hopefully others will expand and correct the above.