[Beowulf] Surviving a double disk failure
Joshua Baker-LePain
jlb17 at duke.edu
Fri Apr 10 13:48:41 PDT 2009
On Fri, 10 Apr 2009 at 1:15pm, David Mathog wrote
> Thankfully I don't have to do this myself, not having data anywhere near
> that size to cope with, but it seems to me that backing up a nearly full
> 16TB RAID is likely to be a painful, expensive, exercise.
>
> Going with tape first...
>
> The fastest tape drives that I know of are Ultrium 4's at 120 MB/s. In
> theory that could copy 1GB every 8.3 seconds, 1TB every 8300 seconds
> (AKA 138 minutes, or a bit over 2 hours), and for that 16 TB data set,
> about 37 hours. Except that there is no tape with that
> capacity; the max listed is still 800 GB, so it would take 20 tapes. And
> really obtaining a sustained 120MB/s from the RAID to the tape is likely
> extremely challenging. In any case, it looks like this calls for a tape
> robot of some sort, with many drives in it. Not cheap. On the plus
> side, transporting a box of 20 tape cartridges to "far away" is not
> particularly difficult, and they are fairly impervious to abuse during
> shipment.
*snip*
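The tape arithmetic quoted above is easy to re-check. Here is a minimal Python sketch, assuming decimal units (1 TB = 1000 GB = 1,000,000 MB) and a drive that really streams at its rated speed the whole time, which the quoted text already notes is optimistic. The function name and its parameters are made up for illustration.

```python
import math

def tape_backup_estimate(data_tb, drive_mb_per_s=120.0, tape_gb=800.0):
    """Return (hours, tapes) needed to stream data_tb terabytes to tape.

    Assumes decimal units and a single drive running at full rated speed,
    with no time counted for tape changes.
    """
    data_mb = data_tb * 1_000_000            # TB -> MB
    hours = data_mb / drive_mb_per_s / 3600  # seconds -> hours
    tapes = math.ceil(data_tb * 1000 / tape_gb)
    return hours, tapes

hours, tapes = tape_backup_estimate(16)
print(f"{hours:.1f} hours, {tapes} tapes")   # 37.0 hours, 20 tapes
```

So a single drive at the full 120 MB/s would need roughly 37 hours and 20 cartridges for the 16 TB set; any shortfall in sustained throughput stretches that proportionally.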
> Since all of the obvious options are so slow, I expect most sites are
> doing incremental backups. Which is fine, until the day comes when one
> has to restore the entire data array from two years' worth of
> incremental backups. Or maybe folks carry the tape incremental backups
> to the offsite backup RAID and apply them there?
>
> Is there an easier/faster/cheaper way to do all of this?
I currently back up a bit more than 16TB to an LTO3 library and don't find
it that painful. I use AMANDA and break the data down into bite-size
chunks. AMANDA handles spreading these chunks out over the whole backup
cycle, so that each night's backup is about the same size (and so takes
about the same amount of time). Each "chunk" gets a full dump once per
cycle and incrementals in between. Yes, there is some bookkeeping to be
done when the "chunks" change drastically in size. But it really does
mostly run itself. A full restore would indeed take some time, but this
is academia. We have time.
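To illustrate the idea of spreading full dumps over the cycle so each night is about the same size, here is a small Python sketch. It is NOT AMANDA's actual planner, which also accounts for incrementals, changing sizes, and more; it is just a greedy "largest chunk to the emptiest night" balancer. The chunk names and sizes are hypothetical.

```python
import heapq

def schedule_full_dumps(chunk_sizes_gb, nights_in_cycle):
    """Assign each chunk's full dump to one night of the cycle.

    Greedy balancing: take chunks from largest to smallest and put each
    one on the night that currently has the least data scheduled.
    Returns {night_index: [chunk names]}.
    """
    # Min-heap of (scheduled GB so far, night index).
    heap = [(0.0, night) for night in range(nights_in_cycle)]
    heapq.heapify(heap)
    plan = {night: [] for night in range(nights_in_cycle)}
    by_size = sorted(chunk_sizes_gb.items(), key=lambda kv: kv[1], reverse=True)
    for name, size in by_size:
        load, night = heapq.heappop(heap)
        plan[night].append(name)
        heapq.heappush(heap, (load + size, night))
    return plan

# Hypothetical chunks (sizes in GB) spread over a 3-night cycle.
chunks = {"home": 900, "scratch": 700, "proj_a": 600,
          "proj_b": 500, "proj_c": 300, "archive": 200}
plan = schedule_full_dumps(chunks, 3)
for night, names in plan.items():
    total = sum(chunks[n] for n in names)
    print(f"night {night}: {names} ({total} GB)")
```

With these made-up numbers the nightly totals come out within 100 GB of each other, which is the "each night's backup is about the same size" effect described above.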
As for cost, no, the library wasn't cheap (though it wasn't ridiculously
expensive either -- again, academia). Once that's acquired,
though, adding capacity (tapes) is cheap and easy.
--
Joshua "been using AMANDA for a *long* time" Baker-LePain
QB3 Shared Cluster Sysadmin
UCSF