[Beowulf] SATA II - PXE+NFS - diskless compute nodes

John Hearns john.hearns at streamline-computing.com
Sun Dec 10 00:17:14 PST 2006

Buccaneer for Hire. wrote:
>> I personally like the idea of putting one admin server in each rack.
>> they don't have to be fancy servers, by any means.
> *LOLOL*  At first I was guilty of the one thing I am always getting on the
> other guys for-thinking too literally. I was going to say there is no room in
> the rack.  Of course, the server would not have to even be on the same
> side of the room.  :)

Depends on your network layout of course - but we Beowulf types like 
nice flat networks anyway.

> dhcp is not the problem (it is only critical during kickstart and for laptops moved
> in on a temporary basis).
You'll have to go the dhcp route for booting with an NFS root.
You won't regret it anyway.
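
As a rough sketch of what the dhcp side of an NFS-root boot involves, the
snippet below renders one ISC dhcpd host entry per node. The node name, MAC
and IP addresses, the export path and the "pxelinux.0" boot file are all
made-up examples, and the helper name dhcpd_host_entry is mine; adjust to
whatever your boot server really uses.

```python
def dhcpd_host_entry(name, mac, ip, boot_server, nfs_root):
    """Render one ISC dhcpd 'host' stanza for a PXE-booted, NFS-root node.

    All argument values are site-specific; nothing here comes from a real
    cluster configuration.
    """
    return "\n".join([
        f"host {name} {{",
        f"  hardware ethernet {mac};",
        f"  fixed-address {ip};",
        # TFTP server the PXE ROM fetches its boot loader from
        f"  next-server {boot_server};",
        # Boot loader file name (assumed to be PXELINUX here)
        '  filename "pxelinux.0";',
        # NFS export the node mounts as its root filesystem
        f'  option root-path "{boot_server}:{nfs_root}";',
        "}",
    ])


if __name__ == "__main__":
    # Hypothetical node, addresses and export path
    print(dhcpd_host_entry("node001", "00:11:22:33:44:55",
                           "10.0.1.1", "10.0.0.1", "/export/nodes/root"))
```

Generating the stanzas from a node list keeps 1024 entries consistent, which
is most of the battle.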

> tftp was a problem because of xinetd.  We bought 1024
> dual opt nodes in 16 racks.  When we received the first 6 racks we triggered them
> all for install-it did not work as expected. 

Always stagger booting by a few seconds between nodes.
Stops power surges (OK, unlikely) but more importantly gives all those 
little daemons time to shift their packets out to the interface.
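
A minimal sketch of staggered booting, assuming you have some way to power a
node on remotely (IPMI, a managed PDU, whatever). The power_on callable, the
node names and the five second gap are placeholders, not anything from a real
setup.

```python
import time


def boot_schedule(nodes, gap_seconds=5.0):
    """Return (node, offset_in_seconds) pairs, one node every gap_seconds."""
    return [(node, i * gap_seconds) for i, node in enumerate(nodes)]


def boot_staggered(nodes, power_on, gap_seconds=5.0, sleep=time.sleep):
    """Call power_on(node) for each node, pausing gap_seconds between nodes.

    power_on is a caller-supplied function (hypothetical here); sleep can be
    swapped out for testing.
    """
    for i, node in enumerate(nodes):
        if i > 0:
            sleep(gap_seconds)  # let dhcp/tftp drain before the next node
        power_on(node)


if __name__ == "__main__":
    # Print the plan only; no node is actually powered on here.
    for node, offset in boot_schedule(["node001", "node002", "node003"]):
        print(f"{node}: power on at t+{offset:.0f}s")
```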

>>> So now to figure out my next step.  I will need local space for logs and data/temp data files.
>> why would you want logs local?

> We have huge data sets, huge scratch data, huge library data (travel time sets)
> and I worry about network traffic.
I would be thinking about your data storage and transport module.
Give thought to a parallel filesystem, Panasas would be good, or Lustre.
Or maybe iSCSI servers for the huge library data (if it is read only, 
then each of these admin nodes per rack could double up as an iSCSI 
server. Mirror the data between admin nodes, and rejig the fstab on a 
per-rack basis???)
Also, motherboards have two GigE ports - if you are not using the second
for MPI, it could be a storage network.
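
To make the per-rack fstab idea concrete, here is a sketch that works out
which rack a node lives in and emits the matching fstab line. Everything
specific is an assumption for illustration: the "r07n12" host naming scheme,
the filesystem label "library-rNN" on each rack's volume, the ext3 type and
the /data/library mount point.

```python
import re


def rack_of(hostname):
    """Extract the rack number from a name like 'r07n12' (hypothetical scheme)."""
    match = re.fullmatch(r"r(\d+)n(\d+)", hostname)
    if match is None:
        raise ValueError(f"unrecognised node name: {hostname!r}")
    return int(match.group(1))


def library_fstab_line(hostname, mount_point="/data/library"):
    """Build a read-only fstab entry pointing at this rack's copy of the data.

    The LABEL naming is an assumption; _netdev tells mount to wait for the
    network before attempting the mount.
    """
    device = f"LABEL=library-r{rack_of(hostname):02d}"
    return "\t".join([device, mount_point, "ext3", "ro,_netdev", "0", "0"])


if __name__ == "__main__":
    print(library_fstab_line("r07n12"))
```

Run at boot from the NFS-root image, something like this lets every node
share one image while still mounting from its own rack's admin node.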

For huge scratch data - you have local disks.
Either write a script to format the disk when you boot the node in
NFS-root, so that the disk has a swap, a /tmp for scratch space and a
local /var if you don't want to use a network syslog server.
Or leave the install as-is and mount the swap, tmp and var partitions.
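
The first option might look roughly like the sketch below. It assumes the
disk is /dev/sda and is already partitioned as swap, /tmp and /var in that
order, and that ext3 is what you want; none of that comes from a real node.
It defaults to a dry run that only prints the commands, since the real thing
wipes the partitions.

```python
import subprocess


def local_disk_commands(disk="/dev/sda"):
    """Commands to set up swap, /tmp and /var on partitions 1-3 of disk.

    The partition layout is an assumption; the disk must already be
    partitioned that way.
    """
    swap, tmp, var = f"{disk}1", f"{disk}2", f"{disk}3"
    return [
        ["mkswap", swap],
        ["swapon", swap],
        ["mkfs.ext3", "-q", tmp],
        ["mount", tmp, "/tmp"],
        ["chmod", "1777", "/tmp"],   # world-writable with sticky bit
        ["mkfs.ext3", "-q", var],
        ["mount", var, "/var"],
    ]


def prepare_local_disk(disk="/dev/sda", dry_run=True):
    """Print the commands (dry run) or execute them; returns the command list."""
    commands = local_disk_commands(disk)
    for argv in commands:
        if dry_run:
            print(" ".join(argv))
        else:
            subprocess.run(argv, check=True)  # stop at the first failure
    return commands


if __name__ == "__main__":
    prepare_local_disk()  # dry run: prints commands, touches nothing
```

Note that a freshly formatted /var is empty, so the boot script would also
have to recreate whatever directories your daemons expect under it.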

      John Hearns
      Senior HPC Engineer
      Streamline Computing,
      The Innovation Centre, Warwick Technology Park,
      Gallows Hill, Warwick CV34 6UW
      Office: 01926 623130 Mobile: 07841 231235
