[Beowulf] PetaBytes on a budget, take 2

Greg Lindahl lindahl at pbm.com
Fri Jul 22 22:53:38 PDT 2011

On Fri, Jul 22, 2011 at 09:05:11AM +0200, Eugen Leitl wrote:

> Additional advantage of zfs is that it can deal with the higher
> error rate of consumer or nearline SATA disks (though it can do
> nothing against enterprise disks' higher resistance to vibration),
> and also with silent bit rot with periodic scrubbing (you can
> make Linux RAID scrub, but you can't make it checksum).

And you can have a single zfs filesystem over 100s of nodes with
petabytes of data? This thread has mixed up single-node filesystems
with cluster filesystems a lot, and that leads to confusion.

Hadoop has checksums and maybe scrubbing, and the NoSQL database that
we wrote at blekko has both plus end-to-end checksums; it's hard to
imagine anyone writing a modern storage system without those features.
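
To make the two terms concrete, here is a minimal sketch of checksumming
plus scrubbing in a toy in-memory store. It is an illustration only, not how
Hadoop, zfs, or the blekko database implement it; the class and method names
(ChecksummedStore, put, get, scrub) are hypothetical, and CRC32 is just a
convenient stand-in for whatever checksum a real system would pick.

```python
import zlib


class ChecksummedStore:
    """Toy in-memory block store; every name here is hypothetical."""

    def __init__(self):
        # key -> (payload bytes, CRC32 recorded at write time)
        self.blocks = {}

    def put(self, key, data):
        # Record the checksum alongside the data when it is written.
        self.blocks[key] = (bytes(data), zlib.crc32(data))

    def get(self, key):
        # Verify on every read, so a corrupt block is never returned silently.
        data, crc = self.blocks[key]
        if zlib.crc32(data) != crc:
            raise IOError(f"checksum mismatch on block {key!r}")
        return data

    def scrub(self):
        # Walk all blocks, including ones nobody has read lately, and
        # report the keys whose contents no longer match their checksum.
        return [key for key, (data, crc) in self.blocks.items()
                if zlib.crc32(data) != crc]


store = ChecksummedStore()
store.put("a", b"hello")
store.put("b", b"world")

# Simulate silent bit rot: flip one bit of block "b" behind the store's back.
data, crc = store.blocks["b"]
store.blocks["b"] = (bytes([data[0] ^ 0x01]) + data[1:], crc)

bad = store.scrub()  # keys of blocks that failed verification
```

The read-time check only catches rot in blocks that happen to be read; the
periodic scrub is what finds it in cold data while a good replica may still
exist elsewhere. An end-to-end variant would have the client compute and
verify the checksum itself rather than trusting the storage node.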

-- greg

