[Beowulf] Big storage
Leif Nixon
nixon at nsc.liu.se
Fri Sep 14 02:21:14 PDT 2007
Loic Tortay <tortay at cc.in2p3.fr> writes:
> During the last HEPiX meeting, Peter Kelemen mentioned something told
> to him by a ZFS developer (Jeff Bonwick, if I'm not mistaken) about
> data corrupted by a Fibre Channel HBA during transfer between disk and
> host. ZFS, reportedly, detected (and corrected) the corruption.
> Of course a ZFS developer may be biased.
AFAIU, ZFS is designed specifically to handle such situations, but I'd
like to see large-scale tests over a range of different hardware.
> I'm probably mis-remembering some of the technical details about this,
> since they seem quite unlikely now (something about the laser beam
> being somehow "corrupted", but I think this would be detected by the
> Fibre Channel link protocols or upper-layer checksums).
Yeah, I guess it should. But we recently lost 11 TB of data due to an FC
switch port silently trashing a small proportion of the data passing
through it. (Quite possibly ZFS would have saved us.) And I've seen
three similar incidents at other places in the last few months. So I
have turned up my cynicism knob a few more notches.
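
To illustrate the kind of end-to-end check being discussed, here is a
minimal sketch, not ZFS's actual implementation. It assumes the checksum
is computed on the host and kept apart from the data, so that damage
introduced anywhere on the path (HBA, switch port, cable, disk) shows up
on read. The names write_block, read_block, flip_bit and ChecksumError
are hypothetical, and plain dicts stand in for the storage path.

```python
import hashlib


class ChecksumError(Exception):
    """Raised when data read back does not match its stored checksum."""


def write_block(store, checksums, key, data):
    # Compute the checksum on the host, before the data enters the
    # storage path, and keep it separately from the data itself.
    checksums[key] = hashlib.sha256(data).hexdigest()
    store[key] = data


def read_block(store, checksums, key):
    # Recompute the checksum on the host after the data has come back
    # through the storage path, and compare with the stored value.
    data = store[key]
    if hashlib.sha256(data).hexdigest() != checksums[key]:
        raise ChecksumError("silent corruption detected in block %r" % key)
    return data


def flip_bit(data, bit_index):
    # Simulate a component that silently flips one bit in transit.
    damaged = bytearray(data)
    damaged[bit_index // 8] ^= 1 << (bit_index % 8)
    return bytes(damaged)


if __name__ == "__main__":
    store, checksums = {}, {}
    write_block(store, checksums, "blk0", b"eleven terabytes, in miniature")
    store["blk0"] = flip_bit(store["blk0"], 3)
    try:
        read_block(store, checksums, "blk0")
    except ChecksumError as err:
        print(err)
```

This only detects the damage. Repairing it, as in the reported ZFS
case, additionally needs a redundant copy of the block to read from.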
--
Leif Nixon - Systems expert
------------------------------------------------------------
National Supercomputer Centre - Linkoping University
------------------------------------------------------------