[Beowulf] Re: recommendation on crash cart for a cluster room:fullcluster KVM is not an option I suppose?

Buccaneer for Hire. buccaneer at rocketmail.com
Fri Oct 9 05:07:15 PDT 2009

--- On Thu, 10/8/09, Mark Hahn <hahn at mcmaster.ca> wrote:

> From: Mark Hahn <hahn at mcmaster.ca>
> Subject: Re: [Beowulf] Re: recommendation on crash cart for a cluster room:fullcluster KVM is not an option I suppose?
> To: "Beowulf Mailing List" <beowulf at beowulf.org>
> Date: Thursday, October 8, 2009, 10:44 PM
> > We have found that vendors are very helpful there. Setting the node to PXE
> > is a big help, specially for new types of nodes. We require the latest
> > firmware on the box (so I don't have to spend countless hours upgrading
> > firmware), and a spreadsheet with Mac address/Asset information. We use a
> > script to upload the info the DB.  Connect power, gigE,10gE,serial.  Then on
> > box and it starts installing...
> donno.  I can see that it would be easier to gain vendor cooperation
> at the beginning, before they have the cash, or soon enough after
> acceptance that they can still remember it ;)
> OTOH, I think we all need _ongoing_ mechanisms to handle these 
> issues, since they WILL crop up again during the cluster's lifespan.
> I'd probably not burn my "vendor capital" on this stuff.  then
> again, maybe it's use it or lose it...
> my first wish would be for some way to automate BIOS settings,
> node properties like MAC and SNs are easy enough to gather yourself. 
> flashing BIOS versions is easy enough too via pxe - but again,
> only as long as the flash doesn't fubar your settings...

While I agree that sometimes we have to work to get on the same page, I see a vendor as a partner in my success.  If the vendor has a different idea, we don't have a problem seeking new vendors.  If that vendor cannot help you reach your goals, why squander the company's money?

It is also best to discuss this during the sales cycle.  If we buy X number of nodes, I don't think I should be spending 2, 3, or 4 hours on each of them upgrading all the firmware before I can put them to work. I consider that absurd.

We have also never had an issue with a vendor getting this information to us.  They have to capture it anyway. In fact, we send them a spreadsheet with the hostnames and they fill in the rest of the information. We get the spreadsheet back, I run a script for that vendor, and the DB is populated in a second.  After that, as the nodes are placed in the rack and connected, each one starts to install as soon as power is applied!!!  On paper. :)

But it works way more often than it doesn't.
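
For anyone wanting to try the spreadsheet-to-DB step, here is a minimal sketch in Python of what such a loader might look like. It is not the script we actually run. The column names (hostname, mac, serial, asset_tag), the "nodes" table, and the use of SQLite are all assumptions for illustration; substitute whatever your vendor's spreadsheet and your provisioning database really use.

```python
import csv
import io
import re
import sqlite3


def normalize_mac(raw):
    """Return a MAC as lowercase colon-separated hex, or raise ValueError.

    Vendors format MACs differently (00-1A-..., 001a.2b3c..., etc.), so
    strip every non-hex character before checking the length.
    """
    digits = re.sub(r"[^0-9A-Fa-f]", "", raw).lower()
    if len(digits) != 12:
        raise ValueError(f"bad MAC address: {raw!r}")
    return ":".join(digits[i:i + 2] for i in range(0, 12, 2))


def load_nodes(csv_file, conn):
    """Load vendor rows into a hypothetical 'nodes' table; return the row count."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS nodes ("
        "hostname TEXT PRIMARY KEY, "
        "mac TEXT UNIQUE NOT NULL, "
        "serial TEXT, "
        "asset_tag TEXT)"
    )
    # Validate every row first so a bad MAC aborts before anything is written.
    rows = [
        (
            r["hostname"].strip(),
            normalize_mac(r["mac"]),
            r["serial"].strip(),
            r["asset_tag"].strip(),
        )
        for r in csv.DictReader(csv_file)
    ]
    with conn:  # one transaction: all rows are committed, or none are
        conn.executemany(
            "INSERT OR REPLACE INTO nodes VALUES (?, ?, ?, ?)", rows
        )
    return len(rows)


# Example with made-up data standing in for the vendor's exported CSV.
sample = io.StringIO(
    "hostname,mac,serial,asset_tag\n"
    "node001,00-1A-2B-3C-4D-5E,SN123,A1001\n"
    "node002,001a.2b3c.4d5f,SN124,A1002\n"
)
conn = sqlite3.connect(":memory:")
count = load_nodes(sample, conn)
```

Once the MACs are in the DB, the DHCP/PXE configuration can be generated from it, which is what lets a node start installing the first time it powers on.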

