[Beowulf] NMI (Non maskable interrupts)
Steven Truong
midair77 at gmail.com
Mon Mar 17 11:49:14 PDT 2008
Dear all,
We recently bought some dual quad-core AMD Barcelona nodes with Asus
KFSN4-DRE motherboards and installed Rocks Cluster 4.3, CentOS 5.1 on
these machines.
What we found irks us: the number of NMIs being generated looks high.
# cat /proc/interrupts
           CPU0       CPU1       CPU2       CPU3       CPU4       CPU5       CPU6       CPU7
  0:   11468997   11550969   11551029   11550982   11550374   11549932   11549991   11553108   IO-APIC-edge   timer
  8:          0          0          0          0          0          0          0          0   IO-APIC-edge   rtc
  9:          0          0          0          0          0          0          0          0   IO-APIC-level  acpi
177:          0          0          0          0          0          0          0          0   IO-APIC-level  ohci_hcd
185:          0          0          0          0          0          0          0          0   IO-APIC-level  ehci_hcd
193:          0     149313        392      38104      36736          0          1      77870   IO-APIC-level  libata
201:          0          0          0          0          0          0          0          0   IO-APIC-level  libata
233:   29715082          0          0          0          0          0          0          0   PCI-MSI        eth0
NMI:     658519     686187     682474     687981     690017     689957     685692     588203
LOC:   92324693   92324694   92324693   92324692   92324692   92324687   92324689   92324681
ERR:          0
MIS:          0
# uptime
11:38:50 up 10 days, 16:27, 1 user, load average: 7.99, 7.98, 7.99
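To put the NMI counts in perspective, they can be divided by the uptime
to get a per-CPU rate. The following is only a rough sketch: the helper
names (parse_counts, nmi_rates) are made up for illustration, the sample
text is an excerpt of the output above, and the uptime is taken from
"up 10 days, 16:27". With these numbers it works out to roughly 0.6 to
0.75 NMIs per second per CPU.

```python
# Sketch: per-CPU NMI rate from /proc/interrupts text and an uptime in seconds.
# parse_counts and nmi_rates are hypothetical helper names, not part of any tool.

# Excerpt of the /proc/interrupts output quoted in this post.
SAMPLE = """\
           CPU0       CPU1       CPU2       CPU3       CPU4       CPU5       CPU6       CPU7
  0:   11468997   11550969   11551029   11550982   11550374   11549932   11549991   11553108   IO-APIC-edge   timer
NMI:     658519     686187     682474     687981     690017     689957     685692     588203
LOC:   92324693   92324694   92324693   92324692   92324692   92324687   92324689   92324681
"""

# "up 10 days, 16:27" converted to seconds.
UPTIME = 10 * 86400 + 16 * 3600 + 27 * 60


def parse_counts(text, label):
    """Return the per-CPU counters of the row whose first field is `label:`."""
    for line in text.splitlines():
        fields = line.split()
        if fields and fields[0] == label + ":":
            counts = []
            for field in fields[1:]:
                if not field.isdigit():
                    break  # stop at trailing text such as "IO-APIC-edge timer"
                counts.append(int(field))
            return counts
    raise KeyError(label)


def nmi_rates(text, uptime_seconds):
    """Return NMIs per second for each CPU, averaged over the whole uptime."""
    return [count / uptime_seconds for count in parse_counts(text, "NMI")]


if __name__ == "__main__":
    for cpu, rate in enumerate(nmi_rates(SAMPLE, UPTIME)):
        print(f"CPU{cpu}: {rate:.2f} NMI/s")
```

On a live node the same functions could be fed the contents of
/proc/interrupts and the first field of /proc/uptime instead of the
pasted values. The rate is only an average since boot; it does not say
what is raising the NMIs.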
From my understanding, NMIs are undesirable because the processors have
to handle them right away, which might degrade the performance of the
nodes. From what I have read, NMIs are usually generated by bad hardware
or memory problems, and I would like to know how to find out what is
causing them. Could you please point me in the right direction for
finding out more about this?
Thank you.