[Beowulf] Dolphin PCI adapters suddenly invisible

Cameron Abrams cfa22 at drexel.edu
Thu Sep 15 17:30:37 PDT 2005


I have a 16-node dual-Athlon cluster with Dolphin/SCI that has run 
beautifully for two years.  After a recent power cycle, 13 of 15 Dolphin 
adapters were simply not recognized by the motherboards.  Each compute 
node has a Tyan Tiger MPX 2446N-4M (BIOS 4.03), dual Athlon MP 2200+, and 
the Dolphin d334-xxx (64-bit/66 MHz) with 2 ins and 2 outs.  I opened no 
cases and changed no cables; the adapters just became invisible from one 
day to the next.

I have picked one node at random to diagnose in detail.  So far, I have 
flashed the motherboard BIOS (both upgrade and downgrade), tried 
different PCI slots, and tried other PCI cards in the same slots.  Other 
cards are recognized on the bus just fine; only the Dolphin adapter is 
not.  I swapped an adapter from a bad node into a good node, and the 
good node found it.  The BIOS settings on the good node are identical to 
those on all the bad nodes.
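
For anyone wanting to reproduce the "is the adapter on the bus at all?" 
check on each node, here is a minimal sketch in Python.  It parses the 
output of `lspci -n` and lists devices from one PCI vendor.  The vendor 
ID 11c8 is an assumption (it is the ID I believe the pci.ids database 
lists for Dolphin Interconnect Solutions); confirm it on a node where 
the adapter is still visible.  The function names are hypothetical.

```python
import re
import subprocess

# Assumption: 11c8 is the PCI vendor ID for Dolphin Interconnect Solutions.
# Verify on a working node before trusting the result.
ASSUMED_VENDOR_ID = "11c8"

# Matches a "vendor:device" pair of 4-digit hex IDs, e.g. "8086:1237".
ID_PAIR = re.compile(r"^([0-9a-fA-F]{4}):([0-9a-fA-F]{4})$")


def find_vendor_devices(lspci_n_output, vendor_id=ASSUMED_VENDOR_ID):
    """Return (slot, device_id) pairs for `lspci -n` lines matching vendor_id."""
    matches = []
    for line in lspci_n_output.splitlines():
        fields = line.split()
        if not fields:
            continue
        slot = fields[0]
        # Skip the slot and take the first field shaped like vendor:device.
        for field in fields[1:]:
            pair = ID_PAIR.match(field)
            if pair:
                if pair.group(1).lower() == vendor_id.lower():
                    matches.append((slot, pair.group(2).lower()))
                break
    return matches


def scan_local_node(vendor_id=ASSUMED_VENDOR_ID):
    """Run `lspci -n` on this node and report matching devices."""
    result = subprocess.run(
        ["lspci", "-n"], capture_output=True, text=True, check=True
    )
    return find_vendor_devices(result.stdout, vendor_id)
```

An empty list from scan_local_node() on a node would correspond to the 
symptom described above: the card is not enumerated on the PCI bus at 
all, as opposed to being present but lacking a driver.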

I am stumped.  Has anyone ever experienced anything like this?
-- 
Cameron F Abrams, PhD
Assistant Professor
Department of Chemical and Biological Engineering
Drexel University
Philadelphia, Pennsylvania  USA
(v) 215-895-2231 (f) 215-895-5837


