Hi Andrew,

> I am looking at Lustre 2.4 currently and have it working in a test 
> environment (actually with minimal shouting and grinding of
> teeth).


> Taking a holistic approach: What does ZFS mean for HPC? I am
> excited about on the fly data compression and snapshotting however
> for most scientific data I suppose dedupe is a bit pointless.

I've played with ZFS under Linux (both with FUSE and via the ZFS
On Linux project kernel module) and whilst it is interesting the
licensing makes it a nightmare to integrate, so I've given up on it.
It just causes too much hassle when upgrading kernels.

I'm also not sure if dedup is as unimportant as we might hope it to be
- we have a lot of separate projects here and I wouldn't mind betting
there is a non-trivial amount of duplication of data between them. At
$JOB-1 I know there were developers of various applications who kept
*lots* of duplicate copies of their SVN source trees around.. :-(
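
If you wanted to put a rough number on that before betting, something
like the sketch below might do. It hashes fixed-size blocks of every
file under a directory and counts the bytes that repeat. The function
name, the 128 KiB block size and the use of SHA-256 are all my own
assumptions for illustration; this is not how ZFS dedup is implemented,
just a way to get a ballpark figure.

```python
import hashlib
import os


def duplicate_block_bytes(root, block_size=128 * 1024):
    """Return (total_bytes, duplicate_bytes) for regular files under root.

    A block counts as a duplicate if an identical block (same SHA-256
    digest) has already been seen anywhere under root.
    """
    seen = set()
    total = 0
    duplicate = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Skip symlinks, sockets, device nodes and the like.
            if os.path.islink(path) or not os.path.isfile(path):
                continue
            with open(path, "rb") as fh:
                while True:
                    block = fh.read(block_size)
                    if not block:
                        break
                    total += len(block)
                    digest = hashlib.sha256(block).digest()
                    if digest in seen:
                        duplicate += len(block)
                    else:
                        seen.add(digest)
    return total, duplicate
```

Calling duplicate_block_bytes("/some/project/dir") gives two byte
counts; the ratio of the second to the first is a crude upper bound on
what block-level dedup could save at that block size.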

For about the same length of time (since before it got merged in to
the mainline) I've been using btrfs and whilst it is still quite young
(commit 4204617d142c0887e45fda2562cb5c58097b918e from David Sterba
recently removed the word "experimental" from the kernel config help
for what will be the 3.13 release) I've found it quite usable and it
gives a number of the benefits of ZFS without the integration pain.
RAID 5/6 style support is certainly experimental, though RAID-1
support has been there from the early days.

> Are raid controllers a thing of the past?

I don't think so - as in most things it's horses for courses and I
think there's likely to still be cases where they are appropriate.

> how often do bits flip anyway?

Good question - how will you know unless you're checking? :-)

> What impact does this have?

Unintentional fuzz testing of HPC applications..

> Does checksumming save our data?

Well that will depend on whether your setup is to just detect
corruption, or be able to correct it too.
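
To make that distinction concrete, here is a toy sketch (not how any
particular filesystem does it). It keeps a CRC32 alongside each block:
with a single copy a bad block can only be reported, whereas with a
second copy (RAID-1 style) a read can fall back to the good copy and
repair the bad one. All the names here are made up for illustration.

```python
import zlib


class CorruptBlock(Exception):
    """Raised when no copy of the block matches its stored checksum."""


def store(data, copies=1):
    """Return a record holding `copies` replicas of data plus its CRC32."""
    return {
        "crc": zlib.crc32(data),
        "replicas": [bytearray(data) for _ in range(copies)],
    }


def read(record):
    """Return verified data, repairing bad replicas from a good one."""
    good = None
    for replica in record["replicas"]:
        if zlib.crc32(bytes(replica)) == record["crc"]:
            good = bytes(replica)
            break
    if good is None:
        # Detection only: we know the data is bad but cannot fix it.
        raise CorruptBlock("checksum mismatch on every copy")
    # Correction: overwrite any replica that fails verification.
    for replica in record["replicas"]:
        if zlib.crc32(bytes(replica)) != record["crc"]:
            replica[:] = good
    return good
```

With copies=1 a flipped bit turns into a CorruptBlock error, which at
least beats silently feeding garbage to the application; with copies=2
the same flip is repaired on read.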

