[Beowulf] Big storage

Leif Nixon nixon at nsc.liu.se
Thu Sep 13 13:19:34 PDT 2007

Loic Tortay <tortay at cc.in2p3.fr> writes:

> According to Bruce Allen:
>> This thread has been evolving, but I'd like to push it back a bit.
>> Earlier in the thread you pointed out the CERN study on silent data
>> corruption:
>> http://fuji.web.cern.ch/fuji/talk/2007/kelemen-2007-C5-Silent_Corruptions.pdf
> Actually, I was not the one who pointed out this study but I can't
> remember who did.

That was me, actually. We both saw the presentation at the last HEPiX
meeting, though. (We have already established we were there. 8^) )

> We are not using fsprobe on our X4500.
> There are two reasons:
>  . ZFS has built-in error detection (through "zpool scrub") and we are
>    (maybe naively) relying on this to detect and correct data corruption
>    which would be otherwise silent;

It *would* be interesting to see if the ZFS checksumming lives up to
its promises.

>  . due to some ZFS limitation (there are some :-) fsprobe does not
>    work reliably with ZFS.
> I'll try to be as concise as possible on the last point.
> In order to make sure that data are actually written to/read from disk
> and not from cache, fsprobe (optionally) uses Direct I/O (buffer
> cache bypass).
> Since Direct I/O is not supported by ZFS, you can't actually be certain
> that you're reading from disk and not from the cache (although you can
> get "some" guarantee that you actually write to the disk using "data
> synchronous" writes -- aka O_DSYNC or the "fsync()" family of POSIX
> functions).
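
For illustration, the write-then-verify cycle described above can be
sketched roughly as follows in Python. This is not fsprobe's actual
code: the function name `probe`, its parameters and the 0xA5 fill
pattern are all made up for the example. It assumes a POSIX system
where O_DSYNC is available, and falls back to a plain write followed by
fsync() where it is not.

```python
import hashlib
import os


def probe(path, size=1 << 20, pattern=b"\xa5"):
    """Write a known pattern with synchronous writes, read it back,
    and report whether the digests match. Hypothetical helper, not fsprobe."""
    data = pattern * size
    expected = hashlib.sha256(data).hexdigest()

    # O_DSYNC asks the kernel to complete each write to stable storage
    # before returning; getattr() keeps the sketch importable where the
    # flag does not exist.
    flags = os.O_WRONLY | os.O_CREAT | os.O_TRUNC | getattr(os, "O_DSYNC", 0)
    fd = os.open(path, flags, 0o600)
    try:
        view = memoryview(data)
        while view:
            written = os.write(fd, view)  # may write fewer bytes than asked
            view = view[written:]
        os.fsync(fd)  # belt and braces: flush data and metadata
    finally:
        os.close(fd)

    # Without Direct I/O this read may be served from cache, which is
    # exactly the limitation discussed above.
    with open(path, "rb") as f:
        actual = hashlib.sha256(f.read()).hexdigest()
    return expected == actual
```

A matching digest here only shows that the data survived the round trip
through whatever layer answered the read. It says nothing about the
on-disk copy unless the cache is bypassed or dropped first.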

I still think it would be interesting to see how often one gets data
corruption from sources other than disk errors (presuming ZFS is
perfect). Data corruption is data corruption even if it's from bad
cache memory.

I will try to get fsprobe deployed on as much of the Nordic LHC
storage as possible.

Leif Nixon                       -            Systems expert
National Supercomputer Centre    -      Linkoping University
