[Beowulf] Seeing ECC errors since upgraded from Opteron 246 to 275

Mark Hahn hahn at mcmaster.ca
Fri Aug 1 11:25:54 PDT 2008


> So I have 2 DL145-G2 nodes with 2 single-core 246 / 4GB each, and 2
> DL145-G2 nodes with 2 dual-core 275 / 4GB each.

it's worth making sure you have a current BIOS installed.

> 07/28/2008 | 17:52:23 | Memory #0x02 | Uncorrectable ECC | Asserted

it may also be useful to run mcelog, which will tell you about 
any ongoing _correctable_ ECC activity.
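
if you want to see whether the uncorrectable events cluster on one DIMM, a rough sketch like the following could tally SEL-style lines by memory sensor. it assumes the five-field "date | time | sensor | event | state" layout of the line quoted above; the function name tally_ecc_events is made up for illustration, and it only reads text you have already saved, it does not talk to the BMC or to mcelog.

```python
from collections import Counter


def tally_ecc_events(lines):
    """Count asserted ECC events per (sensor, event) pair.

    Assumes each line looks like the SEL entry quoted in this thread:
    date | time | sensor | event | state
    """
    counts = Counter()
    for line in lines:
        # Drop any leading mail-quote marker ("> ") before splitting.
        fields = [f.strip() for f in line.lstrip("> ").split("|")]
        if len(fields) != 5:
            continue  # not in the assumed five-field layout; skip it
        _date, _time, sensor, event, state = fields
        if "ECC" in event and state == "Asserted":
            counts[(sensor, event)] += 1
    return counts


if __name__ == "__main__":
    sample = [
        "07/28/2008 | 17:52:23 | Memory #0x02 | Uncorrectable ECC | Asserted",
    ]
    for (sensor, event), n in tally_ecc_events(sample).items():
        print(f"{sensor}: {event} x{n}")
```

if one sensor number dominates the counts, that slot is the first place to try swapping memory.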


