[Beowulf] Tyan S2882 + Chenbro incompatibility (and solution)

Bill Broadley bill at cse.ucdavis.edu
Thu Dec 2 13:48:10 PST 2004

I was installing a new head node for a cluster: Tyan S2882 motherboard,
dual Opteron, dual 3ware cards (installed on separate PCI-X buses),
and sixteen 400 GB SATA disks.

The head node seemed mostly fine (fast, no error messages), but occasionally
during burn-in it would hang.  At that point I couldn't turn the machine
off with the power switch.  When unplugged and plugged back in, the machine
would immediately spin up all fans and light the link lights on both network
interfaces, but would not boot: no video sync, nothing on the serial console.

Troubleshooting involved removing the RAID cards, which fixed the problem,
but it turned out the culprit wasn't the cards themselves but the screws
holding the PCI-X cards in.  Further exploration showed that the motherboard
has 12 mounting holes but the case had 13 standoffs.  One standoff, directly
under the two RAID cards, did not have a matching motherboard hole.  After
reproducing the failure several dozen times I confirmed that this was the
problem.

A large pair of channel locks removed the offending mounting post.  After
a careful wipe-down of the inside of the case (to remove any conductive
particles) the server seems to be working well.

I'm not sure if I should blame Chenbro, Tyan, or the person who assembled
the machine, but in any case I figured I'd let people know about it.
I've heard occasional stories about flaky Tyan motherboards and it's
possible this is a contributing factor.

So if you have any flaky Tyan boards, especially in a Chenbro enclosure,
make sure that your mounting posts and motherboard mounting holes line up.

Bill Broadley
Computational Science and Engineering
UC Davis

