[Beowulf] Re: recommendation on crash cart for a cluster room: full cluster KVM is not an option I suppose?

Greg Lindahl lindahl at pbm.com
Fri Oct 9 14:27:45 PDT 2009

On Fri, Oct 09, 2009 at 05:02:35PM -0400, Mark Hahn wrote:

> I'm not sure why you keep going on about bios updates.  they're easy and
> automatable, so not a real issue for clusters.  no offense!


I don't think you've had a look at that many vendors, right? Mostly a
single vendor?

I recently updated the BIOS on hundreds of SuperMicro nodes, because the
existing BIOS had a bug where a node with 64 GB of RAM wouldn't boot, and
we were upgrading memory to 64 GB.

Our reseller has a magic tool that lets them generate a
BIOS-flashing boot disk with non-standard default settings. But this
tool does not allow AHCI to be turned on, so I had to change that setting
by hand on every single node after flashing. My crash cart came in quite handy.

On another note, dmidecode running against SuperMicro's Phoenix BIOS
correctly indicates whether ECC is turned on in the BIOS; that is how I
found 2 nodes which were incorrectly configured. But a newer node
with Nehalem has an AMI BIOS, and there dmidecode always reports ECC off.
Oh, well. Neither one captures serial numbers, but I use the MAC address
as a motherboard serial number.
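
For anyone who wants to script that check, here is a rough sketch in
Python, not what I actually ran. It assumes the "Error Correction Type"
line that dmidecode prints for the physical memory array, that the
script runs as root, and that the interface is called eth0; the function
names are made up for illustration. It is only as trustworthy as the
BIOS tables, so on a BIOS that always reports ECC off it will
report off too.

```python
import re
import subprocess


def ecc_types(dmidecode_text):
    """Collect every 'Error Correction Type' value in dmidecode output."""
    pattern = r"^[ \t]*Error Correction Type:[ \t]*(.+)$"
    return [m.group(1).strip()
            for m in re.finditer(pattern, dmidecode_text, re.MULTILINE)]


def ecc_enabled(dmidecode_text):
    """True only if at least one memory array is listed and none of them
    reports 'None', 'Unknown' or 'Other' as its error correction type."""
    types = ecc_types(dmidecode_text)
    off = ("none", "unknown", "other")
    return bool(types) and all(t.lower() not in off for t in types)


def mac_as_serial(mac):
    """Normalize a MAC address into a stand-in motherboard serial number."""
    return mac.strip().lower().replace(":", "")


def read_node(iface="eth0"):
    """Query the local node; needs root for dmidecode. 'eth0' is an
    assumption, substitute whatever the onboard NIC is called."""
    text = subprocess.run(["dmidecode", "-t", "memory"],
                          capture_output=True, text=True, check=True).stdout
    with open(f"/sys/class/net/{iface}/address") as f:
        mac = f.read()
    return mac_as_serial(mac), ecc_enabled(text)
```

Run read_node() on each node (over ssh or from your batch system) and
collect the (serial, ecc) pairs in one place; any node that comes back
False is a candidate for a visit with the crash cart.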

-- greg

