[Beowulf] Tyan S2882
gebhardt at hrz.uni-marburg.de
Fri Sep 29 04:30:52 PDT 2006
Thanks for your reply!
On Thursday 28 September 2006 16:02, you wrote:
> I bet if you decode the MCE it will say uncorrectable ECC memory error.
You'd win that bet.
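
For anyone following along, here is a minimal sketch of what "decoding the MCE"
amounts to at the bit level. It assumes the raw 64-bit MCi_STATUS value has
already been pulled out of the machine check log by hand. The function names
are made up for illustration, and only the generic architectural x86 flag bits
are covered, not the vendor-specific ECC fields, so treat it as a rough aid
rather than a replacement for a proper decoding tool.

```python
# Generic x86 machine-check status flags (bit positions in MCi_STATUS).
# Vendor-specific fields (e.g. ECC syndrome bits) are deliberately left out.
MCI_STATUS_BITS = {
    "VAL": 63,    # register contains a valid error
    "OVER": 62,   # an earlier error was overwritten
    "UC": 61,     # error was NOT corrected by hardware
    "EN": 60,     # error reporting was enabled for this error
    "MISCV": 59,  # MCi_MISC holds additional information
    "ADDRV": 58,  # MCi_ADDR holds a valid address
    "PCC": 57,    # processor context may be corrupt
}


def decode_mci_status(status: int) -> dict:
    """Return the generic flag bits of a raw MCi_STATUS value as booleans."""
    return {name: bool((status >> bit) & 1) for name, bit in MCI_STATUS_BITS.items()}


def mca_error_code(status: int) -> int:
    """Return the low 16 bits, which carry the MCA error code."""
    return status & 0xFFFF


def is_uncorrected(status: int) -> bool:
    """True if the entry is valid and flags an uncorrected error."""
    flags = decode_mci_status(status)
    return flags["VAL"] and flags["UC"]


if __name__ == "__main__":
    # Hypothetical value for illustration only, not taken from any real log.
    example = (1 << 63) | (1 << 61) | (1 << 60) | (1 << 58)
    for name, is_set in decode_mci_status(example).items():
        print(f"{name:5s} {'set' if is_set else 'clear'}")
    print("uncorrected:", is_uncorrected(example))
```

A valid entry with UC set is the "uncorrectable" case discussed here; whether it
is specifically an ECC memory error depends on the bank and on vendor-specific
bits that this sketch does not look at.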
> memtest86 doesn't see correctable memory errors.
As far as I can remember, memtest86 includes tests that also detect
correctable ECC errors.
> It sounds like you have a pile of correctable (soft?) memory errors that
> occasionally become uncorrectable.
Yes, we do. But about 75% of our nodes have never shown correctable ECC errors,
and some of those have crashed anyway. On the other hand, we have nodes with a
bunch of correctable ECC errors that have been stable since the first day.