[Beowulf] Big storage

Loic Tortay tortay at cc.in2p3.fr
Thu Aug 30 07:14:35 PDT 2007


According to Bruce Allen:
[...]
>
> ------------------------------------------------------------------
>
> During a READ operation, if the RAID controller finds an unreadable
> (uncorrectable) disk sector, then it will immediately reconstruct the
> missing data for that sector using redundant data from the rest of the
> array, and WRITE that data to the unreadable (uncorrectable) sector to
> force sector reallocation if needed by the disk.
>
> In addition, the RAID controller will perform a continuous or regular (at
> least daily) background scan of all disk sectors to identify and repair
> any unreadable sectors as described above.
>
> ------------------------------------------------------------------
>
> I think it is a mistake to purchase a RAID controller without these
> features.  The absence of these features is the main reason that I don't
> think Linux software RAID is very good.
>
> Question for the Sun/ZFS experts: Does the Sun X4500 (Thumper) do what I
> describe above?
>
The 6 disk controllers in the X4500 are not RAID controllers but simple
SATA disk controllers.

All RAID operations are done in software by ZFS (which acts as both a
filesystem and a volume manager).

ZFS has a "scrub" command that does a background scan of a pool (a set
of disks). It is not run automatically by default, but this can be
automated very easily.
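
As a rough sketch of how the scrub could be automated (for instance from
a daily or weekly cron job), the small Python wrapper below just calls
"zpool scrub" on a pool. Only the "zpool scrub <pool>" command comes from
ZFS; the helper names and the pool name "tank" are made up for
illustration.

```python
import subprocess

def scrub_command(pool):
    """Build the argument list that starts a scrub of one pool."""
    return ["zpool", "scrub", pool]

def start_scrub(pool, run=subprocess.run):
    """Start a scrub of 'pool'.

    'run' is injectable so the wrapper can be exercised on a machine
    without ZFS; by default it really executes the command.
    """
    return run(scrub_command(pool), check=True)

# Example call (hypothetical pool name), e.g. from a cron-driven script:
#   start_scrub("tank")
```

The scrub itself runs in the background inside ZFS, so the wrapper
returns as soon as the scrub has been started; "zpool status" can then be
used to follow its progress.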

The "scrub" also verifies the parity/checksum of the data blocks.
There is an example of this in one of the ZFS demos on the OpenSolaris web
site: <http://www.opensolaris.org/os/community/zfs/demos/selfheal/>.

Data found to be damaged during a "scrub" is corrected/repaired
(immediately) using the various parity/checksum/mirror information
available (see the "zpool" man page on the OpenSolaris web site).
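
To illustrate the idea of checksum-driven repair, here is a toy model in
Python of a mirrored block with a stored checksum. It is only a sketch of
the principle, not ZFS's actual implementation: the function names are
invented, and SHA-256 from hashlib is used simply as a convenient
stand-in for the checksum.

```python
import hashlib

def checksum(data):
    """Checksum of one block (SHA-256 used as a stand-in)."""
    return hashlib.sha256(data).hexdigest()

def scrub_mirror(copies, expected):
    """Verify every copy of a block against its stored checksum and
    overwrite damaged copies with an intact one.

    Returns the list of indices that had to be repaired.
    """
    good = next((c for c in copies if checksum(c) == expected), None)
    if good is None:
        raise ValueError("no intact copy left, block is unrecoverable")
    repaired = []
    for i, c in enumerate(copies):
        if checksum(c) != expected:
            copies[i] = good          # rewrite the damaged copy
            repaired.append(i)
    return repaired

block = b"important data"
stored = checksum(block)
mirror = [block, b"important dat\x00"]   # second copy silently corrupted
print(scrub_mirror(mirror, stored))      # prints [1]
print(mirror[0] == mirror[1])            # prints True
```

The point of the checksum is that it tells you which copy is wrong; a
plain mirror without checksums can only tell that the two copies differ.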

ZFS also has more redundancy than just RAID-5 or RAID-6 (when using
"raidz2" in the latter case), especially in the newer ZFS versions.
The "ditto blocks", for instance, allow you to have multiple copies of
the same data or metadata.

Another nice feature is the "parity check on read" (similar to what DDN
disk controllers do).
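
The general principle of checking parity on every read can be sketched
with a toy single-parity (RAID-5 style) stripe in Python. This is an
illustration under simplifying assumptions (equally sized blocks, plain
XOR parity, invented function names), not a description of how ZFS or
the DDN controllers implement it.

```python
from functools import reduce

def xor_blocks(blocks):
    """Bytewise XOR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*blocks))

def read_stripe(data_blocks, parity):
    """Return the stripe's data only if it agrees with the stored parity."""
    if xor_blocks(data_blocks) != parity:
        raise IOError("parity mismatch on read")
    return b"".join(data_blocks)

def rebuild_block(data_blocks, parity, missing):
    """Reconstruct one known-bad block from the other blocks plus parity."""
    others = [b for i, b in enumerate(data_blocks) if i != missing]
    return xor_blocks(others + [parity])

stripe = [b"abcd", b"efgh", b"ijkl"]
parity = xor_blocks(stripe)
print(read_stripe(stripe, parity))       # parity agrees, data is returned
print(rebuild_block(stripe, parity, 1))  # block 1 rebuilt from the rest
```

Note that plain parity only reveals that the stripe is inconsistent; to
know which block to rebuild, something else (a disk error, or a per-block
checksum as in the previous example) has to point at the bad block.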

I am not a Sun employee or shareholder; we just have many X4500s (soon
more than 100, probably around 130 by the end of the year).

We've had X4500s in "production" for almost a year and they have
proved to be very reliable, very fast and, overall, pretty cheap.


Loïc.
-- 
| Loïc Tortay <tortay at cc.in2p3.fr> -     IN2P3 Computing Centre     |


