[Beowulf] A couple of interesting comments

Tim Cutts tjrc at sanger.ac.uk
Fri Jun 6 11:35:25 PDT 2008


On 6 Jun 2008, at 6:45 pm, Perry E. Metzger wrote:

>
> Bill Broadley <bill at cse.ucdavis.edu> writes:
>>> 2.  BIOS had a couple of interesting defaults, including warn on
>>> keyboard error (Keyboard?  Not intentionally.  This is a compute
>>> node, and should never require a keyboard.  Ever.)  We also find the
>>> BIOS is set to boot from hard disk THEN PXE. But due to item 1,
>>> above, we never can fail over to PXE unless we load up a keyboard
>>> and monitor, and hit F12 to drop to PXE.
>>
>> Very strange standard for a server, let alone a cluster node.
>
> I would be less disturbed about such things if it was trivial to alter
> the BIOS settings in a semi-automated way -- say by booting some
> standalone program, or loading a file from a USB thumb drive. Then you
> could just go up to each box with a USB thumb drive, turn it on, and
> have it fix itself in a consistent way. However, the fact that you
> can't generally automate fixing BIOS settings makes all of this far
> more annoying.
>
> Anyone have any cool tricks for how to consistently set the BIOS on
> large numbers of boxes without requiring steps that humans can screw
> up easily?

Nope.  :-)  This is, in my view, one of the major disadvantages of PC
clusters: the crappy old BIOS that we're stuck with.

Here, we mostly get around this problem by using blade servers rather
than pizza boxes, or at least by using pizza boxes which have some form
of command line access to a lights-out management processor that
allows us to set the boot order, such as those on HP ProLiants and Sun
X**** servers.

So with c-Class blades from HP, for example, I don't really have a
problem - once the chassis is configured, I make them all PXE boot by
ssh'ing into the Onboard Administrator and typing:

set server boot first pxe all
poweron server all

Bingo, all 16 machines PXE boot at about 1 second intervals.  Job's a
good'un.  As Joe says, you get what you pay for.  I don't think I've
*ever* had to futz around with BIOS settings on any recent blade server.
(I used to have to on our old RLX blade servers, which periodically got
confused and lost all their CMOS settings, which then had to be fixed
by hand in the BIOS.)  With the IBM and HP stuff we use now, it's very
rare indeed.
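
For more than one chassis, the two Onboard Administrator commands
quoted above could be scripted rather than typed.  What follows is only
a sketch, not a tested procedure: it assumes the Onboard Administrator
will accept a single command passed on the ssh command line (the post
only describes an interactive session), and the hostnames, the
"Administrator" login name and the function names are all hypothetical
placeholders.  It defaults to a dry run, so nothing is contacted unless
you ask for it.

```python
import subprocess

# The two commands quoted in the post, in the order they were typed.
OA_COMMANDS = ["set server boot first pxe all", "poweron server all"]


def build_ssh_argv(host, command, user="Administrator"):
    """Return the argv for running one command on one chassis over ssh.

    The user name is a placeholder; use whatever account your Onboard
    Administrator is configured with.
    """
    return ["ssh", f"{user}@{host}", command]


def pxe_boot_chassis(hosts, dry_run=True):
    """Plan (and optionally run) the PXE-boot commands for each chassis.

    Returns the list of argv lists in execution order.  With
    dry_run=True nothing is executed.  Assumption: the Onboard
    Administrator accepts a command as an ssh argument.
    """
    planned = []
    for host in hosts:
        for command in OA_COMMANDS:
            argv = build_ssh_argv(host, command)
            planned.append(argv)
            if not dry_run:
                # check=True stops at the first chassis that reports an error.
                subprocess.run(argv, check=True)
    return planned


if __name__ == "__main__":
    # Hypothetical chassis names; print the plan without touching anything.
    for argv in pxe_boot_chassis(["oa1.example.org", "oa2.example.org"]):
        print(" ".join(argv))
```

Keeping the commands in one list means every chassis gets the same
thing in the same order, which is the point of the exercise: no steps
for a human to get wrong.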

Tim




