[Beowulf] PetaBytes on a budget, take 2

Eugen Leitl eugen at leitl.org
Sat Jul 23 04:21:44 PDT 2011


On Fri, Jul 22, 2011 at 10:53:38PM -0700, Greg Lindahl wrote:
> On Fri, Jul 22, 2011 at 09:05:11AM +0200, Eugen Leitl wrote:
> 
> > Additional advantage of zfs is that it can deal with the higher
> > error rate of consumer or nearline SATA disks (though it can do
> > nothing against enterprise disks' higher resistance to vibration),
> > and also with silent bit rot with periodic scrubbing (you can
> > make Linux RAID scrub, but you can't make it checksum).
> 
> And you can have a single zfs filesystem over 100s of nodes with
> petabytes of data? This thread has had a lot of mixing of single-node

I'm not sure how well pNFS (NFS 4.1) works on top of zfs. Does anybody
use this in production?

> filesystems with cluster filesystems, it leads to a lot of confusion.
> 
> Hadoop has checksums and maybe scrubbing, and the NoSQL database that
> we wrote at blekko has both plus end-to-end checksums; it's hard to
> imagine anyone writing a modern storage system without those features.
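
To make the checksum-plus-scrub idea concrete, here is a minimal sketch
in Python. It is only an illustration under my own assumptions: the
helper names (checksum, write_blocks, scrub) are hypothetical, the
"store" is just an in-memory list, and this is not how zfs, Hadoop or
the blekko database actually implement it.

```python
import hashlib


def checksum(block: bytes) -> str:
    """Return a SHA-256 digest for one block of data."""
    return hashlib.sha256(block).hexdigest()


def write_blocks(blocks):
    """Store each block together with the checksum computed at write time."""
    return [(bytearray(b), checksum(b)) for b in blocks]


def scrub(store):
    """Re-read every block and return the indices that fail verification."""
    return [
        i
        for i, (data, digest) in enumerate(store)
        if checksum(bytes(data)) != digest
    ]


# Hypothetical usage: three blocks, then one silently flipped bit.
store = write_blocks([b"alpha", b"beta", b"gamma"])
store[1][0][0] ^= 0x01  # simulate bit rot in the second block
print(scrub(store))
```

A plain RAID scrub can only notice that copies or parity disagree; with a
stored checksum per block the scrub can also tell which copy is the bad
one, which is what makes repair from a good replica possible.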

Speaking of which, is there an easy and reliable open source option
for Linux that scales up to some 100 nodes on GBit Ethernet?

There are plenty listed at
https://secure.wikimedia.org/wikipedia/en/wiki/List_of_file_systems#Distributed_parallel_fault-tolerant_file_systems
but which of them fit the above requirements?

-- 
Eugen* Leitl http://leitl.org
______________________________________________________________
ICBM: 48.07100, 11.36820 http://www.ativel.com http://postbiota.org
8B29F6BE: 099D 78BA 2FD3 B014 B08A  7779 75B0 2443 8B29 F6BE


