[Beowulf] SATA II - PXE+NFS - diskless compute nodes
Buccaneer for Hire.
buccaneer at rocketmail.com
Sun Dec 10 09:03:56 PST 2006
>> *LOLOL* At first I was guilty of the one thing I am always getting on the
>> other guys for: thinking too literally. I was going to say there is no room in
>> the rack. Of course, the server would not even have to be on the same
>> side of the room. :)
>
> Depends on your network layout of course - but we Beowulf types like
> nice flat networks anyway.
Certainly much simpler if there is no technical reason not to. In our case we have
a /20 of non-RFC1918 space on its own separate VLAN. Even with NIS/DNS/NTP/etc.
traffic, the amount of bandwidth consumed by broadcasts is low.
>> dhcp is not the problem (it is only critical during kickstart and for laptops
>> brought in on a temporary basis).
>
> You'll have to go the dhcp route for booting with an NFS root.
> You won't regret it anyway.
I am researching it. One of the guys has been wanting to change all machines
to boot using DHCP, while I have been resistant, believing one should always stack
the cards in favor of the least impact when an "issue" occurs.
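For what it's worth, here is a minimal sketch of the dhcpd.conf this would take,
using fixed leases so addressing stays just as deterministic as static config.
The subnet, hostname and MAC below are placeholders, not our real non-RFC1918 /20:

    # dhcpd.conf sketch - PXE boot with pinned addresses (all values are examples)
    subnet 10.1.0.0 netmask 255.255.240.0 {
        option routers 10.1.0.1;
        next-server 10.1.0.2;        # TFTP server handing out the bootloader
        filename "pxelinux.0";
    }
    host node001 {
        hardware ethernet 00:11:22:33:44:55;   # placeholder MAC
        fixed-address 10.1.1.1;                # one fixed lease per node
    }

With no dynamic range declared, only known MACs ever get a lease, which keeps
the "least impact" property when something goes wrong.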
>> tftp was a problem because of xinetd. We bought 1024 dual-Opteron nodes in
>> 16 racks. When we received the first 6 racks we triggered them all for
>> install; it did not work as expected.
>
> Always stagger booting by a few seconds between nodes.
> Stops power surges (OK, unlikely) but more importantly gives all those
> little daemons time to shift their packets out to the interface.
Will look into that; I believe the power system should allow for it.
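On the xinetd side, the usual culprit when hundreds of nodes PXE at once is
xinetd's default rate limiting. A hedged example of loosening it for in.tftpd
(the paths and limits below are assumptions, tune to taste):

    # /etc/xinetd.d/tftp - raise xinetd's default throttles for mass PXE
    service tftp
    {
        socket_type  = dgram
        protocol     = udp
        wait         = yes
        user         = root
        server       = /usr/sbin/in.tftpd
        server_args  = -s /tftpboot
        cps          = 1000 2        # the default (50/sec) trips on a rack at once
        instances    = UNLIMITED
        per_source   = UNLIMITED
        disable      = no
    }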
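And the staggering itself can be as simple as a loop in whatever powers the
nodes on; a sketch assuming IPMI and hostnames like node0001-ipmi (both are
assumptions, our PDUs may want something different):

    # power nodes on a few seconds apart (hostnames/credentials are placeholders)
    for n in $(seq -w 1 1024); do
        ipmitool -H node${n}-ipmi -U admin -P password chassis power on
        sleep 3
    done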
[snip]
> Errrrrrrrr....
> I would be thinking about your data storage and transport module.
> Give thought to a parallel filesystem, Panasas would be good, or Lustre.
> Or maybe iSCSI servers for the huge library data (if it is read only,
> then each of these admin nodes per rack could double up as an iSCSI
> server. Mirror the data between admin nodes, and rejig the fstab on a
> per-rack basis???)
I have been pushing for a long time for us to focus on a standard inside
the cluster, and right now that standard is EMC. I have already tried others,
but I have almost 300TB of data and 400TB of space (the new EMC came in), so I
can just start moving things around. But I always have a plan B (and then
a plan C).
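If we do try the per-rack idea quoted above, I imagine the fstab rejig would
look something like this on a rack-7 node (hostnames and paths are made up):

    # fstab fragment, rack-7 compute node (all names hypothetical)
    admin-r7:/export/library   /library   nfs   ro,hard,intr,tcp   0 0
    # mirror lives on the next rack's admin node as a manual fallback:
    # admin-r8:/export/library /library   nfs   ro,hard,intr,tcp   0 0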
[snip]
> For huge scratch data - you have local disks.
> Either write a script to format the disk when you boot the node in
> NFS-root: give the disk a swap partition, a /tmp for scratch space, and a
> local /var if you don't want to use a network syslog server.
> Or leave the install as-is and mount the existing swap, tmp and var partitions.
That's the direction I am thinking. Over the holidays I will start working on
a plan with my grid documentation.
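A rough sketch of the format-on-boot approach, assuming the local disk is
/dev/sda and using a filesystem label as the "already formatted" marker (the
layout, sizes, and sfdisk input format are all assumptions):

    #!/bin/sh
    # one-time local disk setup for an NFS-root node (sketch, not production)
    DISK=/dev/sda
    if ! e2label ${DISK}3 2>/dev/null | grep -q localvar; then
        # carve swap, /tmp and /var out of the local disk
        printf ',4GiB,S\n,16GiB,L\n,,L\n' | sfdisk ${DISK}
        mkswap ${DISK}1
        mkfs.ext3 ${DISK}2
        mkfs.ext3 -L localvar ${DISK}3    # the label marks this disk as done
        # (a real script would also seed /var with a skeleton directory tree)
    fi
    swapon ${DISK}1
    mount ${DISK}2 /tmp && chmod 1777 /tmp
    mount ${DISK}3 /var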