[Beowulf] Big storage

Tue Aug 28 00:44:19 PDT 2007

Andrew Piskorski wrote:
> 
> I believe Garth's whole point is that your assumption above is often
> NOT true.  He also seemed to imply that this is a function of the
> interaction between the block-level RAID implementation and the file
> system, as his Panasas file system reputedly fixes this scary, "one
> small unrecoverable read during array rebuild kills your entire disk
> volume" failure mode.

Panasas is an object-based filesystem. Individual files are stored 
either as a mirror (small files) or as RAID-5 across the filesystem.
If you are in the middle of a reconstruction process and THEN run into 
bad blocks, only the file that contains those bad blocks is affected.
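
To make the difference concrete, here is a toy model, not Panasas code. The
function names, the file layout and the (blade, block) tuples are all made up
for illustration. It assumes the worst case from the quoted text for
block-level RAID (any unrecoverable read during rebuild loses the whole
volume) and compares it with per-file redundancy, where only files that touch
a bad block are lost.

```python
# Toy model only: names and layout are hypothetical, not Panasas internals.

def files_lost_per_file_raid(files, bad_blocks):
    """Per-file redundancy: a file is lost only if one of the blocks it
    needs for reconstruction is unreadable.

    files: dict mapping file name -> set of (blade, block) locations
           on the surviving blades.
    bad_blocks: set of (blade, block) locations that fail to read.
    """
    return sorted(name for name, blocks in files.items() if blocks & bad_blocks)


def files_lost_block_level_raid(files, bad_blocks):
    """Worst case for block-level RAID as described in the quoted text:
    any unrecoverable read during rebuild fails the whole volume."""
    return sorted(files) if bad_blocks else []


# Three files, each with components on surviving blades 1 and 2.
files = {
    "a.dat": {(1, 10), (2, 10)},
    "b.dat": {(1, 11), (2, 11)},
    "c.dat": {(1, 12), (2, 12)},
}
bad = {(2, 11)}  # one unreadable block found during reconstruction

print(files_lost_per_file_raid(files, bad))
print(files_lost_block_level_raid(files, bad))
```

In this model the single bad block costs one file in the per-file case and
all three files in the block-level worst case.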

Panasas also has "distributed sparing" which is kind of interesting.
Instead of allocating a complete spare disk (or disks) in a RAID array,
the filesystem allocates a certain percentage of space as spare space - 
a high water mark if you will. So there is no need to allocate 
individual spare blades, and the user can choose how much spare capacity 
is desirable. Say you have ten storage blades and specify one blade's 
worth of redundancy: then 10% of space is allocated as spare. With 20 
blades and the same one blade's worth, 5% is allocated.
When a reconstruction process starts, this spare space is used up.
When a failed blade is replaced with a new one, the filesystem migrates 
data onto it over time, until the N% sparing is restored.
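
The sparing arithmetic above is just spare blades divided by total blades. A
minimal sketch follows. The function names and the 2 TB blade size are
assumptions for illustration, not anything from the Panasas tools.

```python
from fractions import Fraction


def spare_fraction(total_blades, spare_blades=1):
    """Fraction of total space reserved as distributed spare when the
    user asks for `spare_blades` worth of redundancy."""
    if total_blades <= 0 or not 0 <= spare_blades <= total_blades:
        raise ValueError("need 0 <= spare_blades <= total_blades, total_blades > 0")
    return Fraction(spare_blades, total_blades)


def usable_capacity(total_blades, blade_capacity_tb, spare_blades=1):
    """Capacity left for data after the spare reservation, in TB.
    The blade capacity is a made-up input, not a real product figure."""
    raw = total_blades * blade_capacity_tb
    return raw * (1 - spare_fraction(total_blades, spare_blades))


for blades in (10, 20):
    pct = float(spare_fraction(blades)) * 100
    print(f"{blades} blades, one blade's worth of spare: {pct:.0f}% reserved")
```

With hypothetical 2 TB blades, usable_capacity(10, 2) gives 18 TB out of 
20 TB raw.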