Problem with Alpha and strange message.
Greg Lindahl
lindahl at conservativecomputer.com
Mon Oct 1 09:12:46 PDT 2001
On Mon, Oct 01, 2001 at 09:35:07AM -0600, Carlos Lopez wrote:
> Hello, we have a 4-node cluster with Alphas, and lately I've been
> receiving the following messages from the console of the master node:
>
> Sep 18 15:56:47 master kernel: TSUNAMI machine check: vector=0x630
> pc=0xfffffc0000333410 code=0x100000086
This is an Alpha question, not really related to Beowulf. Here's a
table that describes what the machine check vectors mean (the
vector=0x630 in your message corresponds to code 630):
Code  Reason                   Example or Common Cause
====  ======                   =======================
620   System Correctable       correctable errors in the memory subsystem,
                               eg single bit ECC errors, detected async to
                               processor execution
630   Processor Correctable    correctable cache and TLB errors, detected
                               internally by the processor
660   System Uncorrectable     unrecoverable memory errors
670   Processor Uncorrectable  unrecoverable cache or TLB errors, or
                               read of a non-existent I/O space location
If you frequently get 630's, I'd advise replacing the CPU.
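To judge how frequent "frequently" is, something like the following rough
sketch could be run over the kernel log. It assumes the codes in the table
are the hex values of the vector= field, and that the log lines look like
the one quoted above. The names MACHINE_CHECK_VECTORS, VECTOR_RE and
classify_machine_check are made up for this example and are not part of
any existing tool.

```python
import re

# Vector codes and reasons, copied from the table above.
# Assumption: the table codes are the hex value of the "vector=" field.
MACHINE_CHECK_VECTORS = {
    0x620: "System Correctable",
    0x630: "Processor Correctable",
    0x660: "System Uncorrectable",
    0x670: "Processor Uncorrectable",
}

# Assumed log format, based only on the single message quoted above.
VECTOR_RE = re.compile(r"machine check: vector=0x([0-9a-fA-F]+)")


def classify_machine_check(line):
    """Return the reason for a machine check log line, or None if no match."""
    match = VECTOR_RE.search(line)
    if match is None:
        return None
    vector = int(match.group(1), 16)
    return MACHINE_CHECK_VECTORS.get(vector, "unknown vector")


line = "Sep 18 15:56:47 master kernel: TSUNAMI machine check: vector=0x630"
print(classify_machine_check(line))
```

Counting how many lines come back as "Processor Correctable" over a week
or so would give a rough idea of whether the CPU is on its way out.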
g