Disk reliability (Was: Node cloning)

kragen at pobox.com
Sun May 27 01:33:32 PDT 2001


Josip Loncaric <josip at icase.edu> writes:
> JackM wrote:
> > You can try using hdparm to turn the DMA off.  Of course, it does slow
> > down data transfer rates considerably.
> 
> As Mark said, BadCRC only means that the transfer was retried.  If a few
> BadCRC messages are the only problem, I would not turn off DMA.

What size of CRC is being used?  If it's a 32-bit CRC and the errors
are likely to involve several bits, I think the chance of any given
corrupted transfer slipping through uncaught is only about one in four
billion.  How long that lasts depends on how often transfers go bad:
four billion microseconds is about seventy minutes, four billion
milliseconds is about a month and a half, and four billion seconds is
about 125 years.
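
For anyone who wants to check that arithmetic, here is a minimal
Python sketch.  It assumes an idealized 32-bit CRC that misses a
random multi-bit error with probability 2**-32, which may not hold
for the CRC the IDE hardware actually uses; the function name and the
three error rates are just illustrative choices of mine.

```python
# Assumption: an idealized 32-bit CRC lets a random multi-bit error
# through with probability 2**-32 (about one in four billion).
CRC_BITS = 32
TRANSFERS_PER_MISS = 2 ** CRC_BITS  # expected bad transfers per uncaught error

SECONDS_PER_MINUTE = 60
SECONDS_PER_DAY = 86400
SECONDS_PER_YEAR = 365.25 * SECONDS_PER_DAY


def seconds_until_expected_miss(bad_transfers_per_second):
    """Mean time, in seconds, before one bad transfer slips past the CRC."""
    return TRANSFERS_PER_MISS / bad_transfers_per_second


# Hypothetical error rates: one bad transfer per microsecond,
# per millisecond, and per second.
per_us = seconds_until_expected_miss(1e6)
per_ms = seconds_until_expected_miss(1e3)
per_s = seconds_until_expected_miss(1.0)

print("one bad transfer per microsecond: %.1f minutes" % (per_us / SECONDS_PER_MINUTE))
print("one bad transfer per millisecond: %.1f days" % (per_ms / SECONDS_PER_DAY))
print("one bad transfer per second:      %.1f years" % (per_s / SECONDS_PER_YEAR))
```

Using exactly 2**32 rather than a round four billion, this comes out
to roughly 72 minutes, 50 days, and 136 years respectively.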
