[Beowulf] SATA II - PXE+NFS - diskless compute nodes

Sun Dec 10 09:03:56 PST 2006

>> *LOLOL*  At first I was guilty of the one thing I am always getting on the
>> other guys for: thinking too literally. I was going to say there is no room in
>> the rack.  Of course, the server would not even have to be on the same
>> side of the room.  :)
>
> Depends on your network layout of course - but we Beowulf types like 
> nice flat networks anyway.

Certainly much simpler if there is no technical reason not to.  In our case we have
a /20 of non-RFC1918 space which is its own separate VLAN.  With NIS/DNS/NTP/etc
traffic, the amount of bandwidth used for broadcasting is low.
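
For a sense of scale, here is a small sketch of what a /20 gives you as a single flat broadcast domain. The 10.0.0.0/20 prefix is only a placeholder I made up for illustration (our real block is non-RFC1918 and not shown here), and the function name is hypothetical.

```python
import ipaddress


def describe_flat_network(cidr):
    """Summarise the size of one flat subnet given in CIDR notation."""
    net = ipaddress.ip_network(cidr)
    return {
        "netmask": str(net.netmask),
        "broadcast": str(net.broadcast_address),
        "total_addresses": net.num_addresses,
        # Subtract the network and broadcast addresses.
        "usable_hosts": net.num_addresses - 2,
    }


if __name__ == "__main__":
    # Placeholder prefix, not the real address block from this thread.
    for key, value in describe_flat_network("10.0.0.0/20").items():
        print(f"{key}: {value}")
```

A /20 holds 4096 addresses, so 1024 compute nodes plus servers fit in one VLAN with room to spare.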

>> dhcp is not the problem (it is only critical during kickstart and for laptops moved
>> in on a temporary basis).
>
> You'll have to go the dhcp route for booting with an NFS root.
> You won't regret it anyway.

I am researching it. One of the guys has been wanting to change all machines
to boot using DHCP, while I have been resistant, believing one always stacks
the cards in favor of the least amount of impact when an "issue" occurs.
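
As a rough sketch of what the DHCP side of a PXE plus NFS-root boot could look like, the snippet below generates one host stanza per node. It assumes ISC dhcpd syntax and PXELINUX as the boot loader, neither of which is stated in this thread. The function name, host names, MAC, addresses and export path are all made up for illustration.

```python
def dhcp_host_stanza(name, mac, ip, tftp_server, nfs_root):
    """Build one ISC dhcpd 'host' block for a PXE-booted, NFS-root node.

    nfs_root is the value handed to the client as the DHCP root-path
    option, for example "server-ip:/exported/path".
    """
    return (
        f"host {name} {{\n"
        f"  hardware ethernet {mac};\n"
        f"  fixed-address {ip};\n"
        f"  next-server {tftp_server};\n"      # TFTP server holding the boot loader
        f'  filename "pxelinux.0";\n'          # assumes PXELINUX is in use
        f'  option root-path "{nfs_root}";\n'  # where the node finds its root fs
        f"}}\n"
    )


if __name__ == "__main__":
    # Hypothetical node; every value below is a placeholder.
    print(dhcp_host_stanza(
        "node001", "00:11:22:33:44:55", "10.0.0.11",
        "10.0.0.1", "10.0.0.1:/export/nfsroot",
    ))
```

Pinning each node to a fixed address by MAC keeps the "least impact" property: the address a node gets never depends on lease state.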

>>  tftp was a problem because of xinetd.  We bought 1024
>> dual opt nodes in 16 racks.  When we received the first 6 racks we triggered them
>> all for install; it did not work as expected.
>
> Always stagger booting by a few seconds between nodes.
> Stops power surges (OK, unlikely) but more importantly gives all those 
> little daemons time to shift their packets out to the interface.

Will look into that. I believe the power system should allow for that.
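
A minimal sketch of the staggering idea, assuming some per-node power-on hook exists. The `power_on` callable is hypothetical (it might wrap a PDU or management-controller command), and the 5-second default is just my reading of "a few seconds".

```python
import time


def stagger_schedule(node_names, delay_seconds=5.0, batch_size=1):
    """Return (node, start_offset_seconds) pairs for a staggered boot.

    Nodes are released in batches of batch_size, with delay_seconds
    between consecutive batches.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    return [
        (name, (index // batch_size) * delay_seconds)
        for index, name in enumerate(node_names)
    ]


def boot_staggered(node_names, power_on, delay_seconds=5.0,
                   batch_size=1, sleep=time.sleep):
    """Call power_on(node) for each node, pausing between batches.

    power_on is a caller-supplied (hypothetical) hook; sleep can be
    replaced in tests so nothing actually waits.
    """
    elapsed = 0.0
    for name, offset in stagger_schedule(node_names, delay_seconds, batch_size):
        if offset > elapsed:
            sleep(offset - elapsed)
            elapsed = offset
        power_on(name)


if __name__ == "__main__":
    nodes = [f"node{i:03d}" for i in range(1, 5)]
    for name, offset in stagger_schedule(nodes, delay_seconds=5, batch_size=2):
        print(f"{name} starts at t+{offset}s")
```

With 1024 nodes in 16 racks (64 per rack), starting one node every 5 seconds brings up a rack in a little over five minutes; a larger batch size shortens that at the cost of more simultaneous tftp requests.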

[snip]

> Errrrrrrrr....
> I would be thinking about your data storage and transport module.
> Give thought to a parallel filesystem, Panasas would be good, or Lustre.
> Or maybe iSCSI servers for the huge library data (if it is read only, 
> then each of these admin nodes per rack could double up as an iSCSI 
> server. Mirror the data between admin nodes, and rejig the fstab on a 
> per-rack basis???)

I have been pushing for a long time for us to focus on a standard inside
the cluster and right now it is EMC.  I have already tried others but I have
almost 300TB of data and 400TB of space (the new EMC came in) and I
can just start moving things round.  But I always have a plan B (and then
C.)

[snip]

> For huge scratch data - you have local disks.
> Either write a script to format the disk when you boot the node in 
> NFS-root, so that the disk has a swap, a /tmp for scratch space and a local /var 
> if you don't want to use a network syslog server.
> Or leave the install as-is and mount the swap, tmp and var partitions.

That's the direction I am thinking. Over the holidays I will start working on
a plan with my grid documentation.
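
To make the second option concrete, here is a sketch of the fstab entries an NFS-root node might add for its local disk. The device name, the partition order (1 = swap, 2 = /tmp, 3 = /var) and the ext3 filesystem type are assumptions for illustration, not the layout of any real node here.

```python
def local_fstab_lines(disk="/dev/sda", fs_type="ext3"):
    """Return fstab lines mounting local swap, /tmp and /var.

    Assumes a hypothetical layout: partition 1 is swap, partition 2
    is /tmp (scratch), partition 3 is /var.
    """
    return [
        # device      mountpoint  type       options   dump pass
        f"{disk}1  swap  swap  defaults  0 0",
        f"{disk}2  /tmp  {fs_type}  defaults  0 2",
        f"{disk}3  /var  {fs_type}  defaults  0 2",
    ]


if __name__ == "__main__":
    print("\n".join(local_fstab_lines()))
```

Everything else stays on the NFS root, so a dead local disk costs the node its scratch space and swap but not its ability to boot.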
