[Beowulf] mcelog output, interpretation?

Mark Hahn hahn at mcmaster.ca
Mon Aug 18 15:03:05 PDT 2008


> /var/log/messages.  One of them had 29 machine checks logged, all of
> them variants of this:

Did the variants differ only in the address, and were those addresses near each other?
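
A rough way to answer that, assuming you have already copied the address values out of the mcelog records by hand. The 4 KiB grouping size, the helper name and the sample addresses are all made up for illustration; none of them come from the logs under discussion.

```python
from collections import Counter

def group_by_region(addresses, region_bytes=4096):
    """Count how many addresses fall into each aligned region."""
    # Round each address down to the start of its region, then tally.
    return Counter(addr // region_bytes * region_bytes for addr in addresses)

# Made-up addresses, NOT taken from any real log.
sample = [0x1F3A2C040, 0x1F3A2C080, 0x1F3A2CFC0, 0x2B0001000]

for base, count in sorted(group_by_region(sample).items()):
    print(f"{base:#x}: {count}")
```

If nearly all the events land in one or two regions, the variants are plausibly the same weak spot being hit repeatedly rather than unrelated faults.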

> These had built up at about 1 per month over the last couple of years.

1/month is not a concern, IMO.  The main reason to run mcelog is to
avoid reaching the point where corrected MCEs are frequent enough that
you start hitting uncorrectable or undetected ones.  1/month is a very
low rate.
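
For reference, the arithmetic behind that rate, as a small sketch. The count of 29 is from the quoted message; taking "a couple of years" to mean 24 months is an assumption.

```python
def events_per_month(count, months):
    """Average number of logged events per month over the period."""
    return count / months

# 29 machine checks (from the quoted message) over an ASSUMED 24 months.
rate = events_per_month(29, 24)
print(f"{rate:.1f} corrected errors/month")
```

What matters more than the average is whether the rate is climbing, so it is worth recomputing this over shorter recent windows as well.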

> There seems to be an issue with the Northbridge, but exactly what that

NB is used here in a very general sense: it refers to the on-chip
memory controller, not a literal external chip.  I don't think video
is involved at all.


