[Beowulf] Surviving a double disk failure

Chris Samuel csamuel at vpac.org
Sun Apr 19 02:14:44 PDT 2009

----- "David Mathog" <mathog at caltech.edu> wrote:

> Thankfully I don't have to do this myself, not having data anywhere
> near that size to cope with, but it seems to me that backing up a
> nearly full 16TB RAID is likely to be a painful, expensive, exercise. 

Well, if it's growing slowly to that level, as opposed to
just appearing as one big lump, then you're probably going
to be OK as long as you've got a robot and the software to
run it.

The restore, on the other hand, is another kettle of fish.

We found that with XFS the limiting factor was its
less than stellar file creation, and our users have *lots*
of small files (and lots of old software builds that they've
forgotten about, grr).

What took almost a week to restore onto our IBM system (XFS
with an internal journal) took about 24 hours onto our new
system where we spec'd the server box to have a pair of SAS
drives (HW mirrored) to take external journals.
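
For anyone wanting to try a similar layout, here is a minimal sketch of
creating and mounting an XFS filesystem with an external journal. The
device names, mount point and log size are hypothetical placeholders,
not the actual configuration described above. By default the script
only prints the commands; it runs them only if DRY_RUN=0 is set.

```shell
#!/bin/sh
# Sketch only: every device name, path and size below is an assumed placeholder.
DATA_DEV="${DATA_DEV:-/dev/sdb1}"    # large RAID volume holding the data (assumed)
LOG_DEV="${LOG_DEV:-/dev/sdc1}"      # partition on the mirrored SAS pair (assumed)
MOUNT_POINT="${MOUNT_POINT:-/data}"  # where the filesystem gets mounted (assumed)
LOG_SIZE="${LOG_SIZE:-128m}"         # journal size (assumed)

# Print each command by default. Setting DRY_RUN=0 executes it instead,
# and mkfs.xfs will then destroy whatever is on DATA_DEV.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Create the filesystem with its log (journal) on the separate device.
run mkfs.xfs -l "logdev=${LOG_DEV},size=${LOG_SIZE}" "${DATA_DEV}"

# A filesystem with an external log needs the log device named at mount time too.
run mount -t xfs -o "logdev=${LOG_DEV}" "${DATA_DEV}" "${MOUNT_POINT}"
```

The point of the separate device is that journal writes for all those
file creations no longer compete with data writes on the main array.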

Very happy with that decision.

Christopher Samuel - (03) 9925 4751 - Systems Manager
 The Victorian Partnership for Advanced Computing
 P.O. Box 201, Carlton South, VIC 3053, Australia
VPAC is a not-for-profit Registered Research Agency
