[Beowulf] Big storage
Jerome, Ron Ron.Jerome at nrc-cnrc.gc.ca
Thu Apr 17 05:49:58 PDT 2008
For what it's worth, I have 4 of those Supermicro 16-drive chassis, each
having a single Areca 1160 card. They have been running without issue for
about a year now (touch wood). I also just built a 48-drive box using an
AIC chassis and 3 Areca 1261 cards, but that has not been put into service
yet.

_________________________________________
Ron Jerome
National Research Council Canada
M-2, 1200 Montreal Road, Ottawa, Ontario K1A 0R6
Government of Canada
_________________________________________

> -----Original Message-----
> From: beowulf-bounces at beowulf.org [mailto:beowulf-bounces at beowulf.org]
> On Behalf Of Bruce Allen
> Sent: April 17, 2008 2:39 AM
> To: Gerry Creager
> Cc: beowulf at beowulf.org; Bruce Allen
> Subject: Re: [Beowulf] Big storage
>
> Hi Gerry,
>
> > Areca replacement; RAID rebuild (usually successful); backup; Areca
> > replacement with 3Ware controller or CoRAID (or JetStor) shelf; create
> > new RAID instance; restore from backup.
> >
> > Let's just say we lost confidence.
>
> I understand. Was this with 'current generation' controllers and firmware
> or was this two or three years ago? It's my impression that (when used
> with compatible drives and drive backplanes) the latest generation of
> Areca hardware is quite stable.
>
> Cheers,
> Bruce
>
> > Bruce Allen wrote:
> >> What was needed to fix the systems? Reboot? Hardware replacement?
> >>
> >> On Wed, 16 Apr 2008, Gerry Creager wrote:
> >>
> >>> We've had two fail rather randomly. The failures did cause disk
> >>> corruption but it wasn't an undetected/undetectable sort. They started
> >>> throwing errors to syslog, then fell over and stopped accessing disks.
> >>>
> >>> gerry
> >>>
> >>> Bruce Allen wrote:
> >>>> Hi Gerry,
> >>>>
> >>>> So far the only problem we have had is with one Areca card that had a
> >>>> bad 2GB memory module. This generated lots of (correctable) single bit
> >>>> errors but eventually caused real problems. Could you say something
> >>>> about the reliability issues you have seen?
> >>>>
> >>>> Cheers,
> >>>> Bruce
> >>>>
> >>>> On Wed, 16 Apr 2008, Gerry Creager wrote:
> >>>>
> >>>>> We've used AoE (CoRAID hardware) with pretty good success (modulo one
> >>>>> RAID shelf fire that was caused by a manufacturing defect and dealt
> >>>>> with promptly by CoRAID). We've had some reliability issues with Areca
> >>>>> cards but no data corruption on the systems we've built that way.
> >>>>>
> >>>>> gerry
> >>>>>
> >>>>> Bruce Allen wrote:
> >>>>>> Hi Xavier,
> >>>>>>
> >>>>>>>>>> PPS: We've also been doing some experiments with putting
> >>>>>>>>>> OpenSolaris+ZFS on some of our generic (Supermicro + Areca)
> >>>>>>>>>> 16-disk RAID systems, which were originally intended to run Linux.
> >>>>>>
> >>>>>>>>> I think that DESY proved some data corruption with such
> >>>>>>>>> configuration, so they switched to OpenSolaris+ZFS.
> >>>>>>
> >>>>>>>> I'm confused. I am also talking about OpenSolaris+ZFS. What did
> >>>>>>>> DESY try, and what did they switch to?
> >>>>>>
> >>>>>>> Sorry, I am indeed not clear. As far as I know, DESY found data
> >>>>>>> corruption using Linux and Areca cards. They moved from Linux to
> >>>>>>> OpenSolaris and ZFS, avoiding other corruption. This has been
> >>>>>>> discussed in the HEPiX storage workgroup. However, I cannot speak on
> >>>>>>> their behalf at all. I'll try to get you in touch with someone more
> >>>>>>> aware of this issue, as my statements lack figures.
> >>>>>>
> >>>>>> I think that would be very interesting to the entire Beowulf mailing
> >>>>>> list, so please suggest that they respond to the entire group, not
> >>>>>> just to me personally. Here is an LKML thread about silent data
> >>>>>> corruption:
> >>>>>> http://kerneltrap.org/mailarchive/linux-kernel/2007/9/10/191697
> >>>>>>
> >>>>>> So far we have not seen any signs of data corruption on Linux+Areca
> >>>>>> systems (and our data files carry both internal and external
> >>>>>> checksums, so we would be sensitive to this).
> >>>>>>
> >>>>>> Cheers,
> >>>>>> Bruce
> >>>>>> _______________________________________________
> >>>>>> Beowulf mailing list, Beowulf at beowulf.org
> >>>>>> To change your subscription (digest mode or unsubscribe) visit
> >>>>>> http://www.beowulf.org/mailman/listinfo/beowulf
