Scyld, local access to nodes, and master node as compute node
Brian C Merrell
brian at patriot.net
Thu May 24 11:07:04 PDT 2001
On Thu, 24 May 2001, Sean Dilda wrote:
> Is there any reason the program itself can't run itself in the special
> way they want? Anything you can do with rlogin or rsh can be done with
> bpsh, except for an interactive shell. However, this can be mimicked
> through bpsh. If you can give me some idea of what they are wanting to
> do, I might be able to help you find a way to do it without requiring an
> interactive shell. Scyld clusters are designed to run background jobs
> on all of the slave nodes, not to run login services for users on the
> slave nodes.
>
Hmmm. I guess this warrants some background info.
The cluster is not a new cluster. It was previously built by someone else
who is now gone. The cluster master node crashed, taking the system and
most of their data with it. I am now trying to rebuild the cluster. The
cluster previously used RH6.1 stock and followed more of a NOW model than
a beowulf model, although all the hardware was dedicated to the cluster,
not on people's desks. I'm now trying to use Scyld's distro to bring the
cluster back up. I'm pretty happy with it, and managed to get the master
node up with a SCSI software RAID array, and a few test nodes up with boot
floppies. Seems fine to me. BUT....
There are three reasons that they want to be able to rlogin to the
machines: 1) There are a number of people with independent projects who
use the cluster. They are used to being able to simply log in to the
master, rlogin to a node, and start their projects on one or more nodes,
so that they take up only a chunk of the cluster. 2) At least one
researcher was previously able to, and wants to continue to, log in to
separate nodes and run slightly different (and sometimes
non-parallelizable) programs on his data. 3) They have code that they
would rather not change.
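For the first two cases, a thin wrapper around bpsh might come close to what they are used to. This is only a rough sketch, not something I have tested on the cluster: it assumes bpsh is invoked as "bpsh <node number> <command>", and the mapping of L001 to node 0 (and so on up to L024) is my own guess at how the old names would line up with BProc node numbers. The function names are made up for illustration.

```python
import re
import subprocess

# Legacy host names look like L001 ... L024.
NODE_NAME = re.compile(r"^L(\d{3})$")
NODE_COUNT = 24  # size of this cluster


def node_number(name, first_node=0):
    """Map a legacy name such as 'L001' to a BProc node number.

    Assumption: L001 corresponds to node `first_node` (0 by default).
    """
    match = NODE_NAME.match(name)
    if match is None:
        raise ValueError("not a node name: %r" % name)
    index = int(match.group(1))
    if not 1 <= index <= NODE_COUNT:
        raise ValueError("node out of range: %r" % name)
    return first_node + index - 1


def bpsh_command(name, program_args):
    """Build the argument list that runs a program on one node via bpsh."""
    return ["bpsh", str(node_number(name))] + list(program_args)


def run_on_node(name, program_args):
    """Run the program on the named node and return its exit status."""
    return subprocess.call(bpsh_command(name, program_args))
```

With something like this, run_on_node("L007", ["./myprog", "data.in"]) would start the program on a single node from the master, so each person could still claim just a chunk of the cluster without an interactive login on the node.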
> It is possible to use BProc with a full install on every slave node
> however, this reduces a lot of the easy administration features we're
> trying to put into our distro.
>
I just set this up, and realize what you mean. I had to statically define
IP addresses, users, etc. At first it wasn't a pain, but I realized after
the first two that doing all 24 would be. Even though it is now possible
to rlogin to different nodes, it wasn't what I was hoping for. I imagine
it will be particularly unpleasant when software upgrades need to be
performed. :(
I'm still hoping to find some happy medium, but I'm going to present these
options to the group and see what they think. The problem is that they
are mathematicians and physicists, not computer people. They really don't
want to have to change, even though it seems to be the same.
Also one thing I'm still trying to find a solution to: how can the nodes
address each other? Previously they used a hosts file that had listings
for L001-L024 (and they would like to keep it that way). I guess with the
floppy method they don't have to, because the BProc software maps node
numbers to IP addresses,
-brian
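
P.S. If the old names have to stay, the hosts file entries for L001-L024 could at least be generated rather than typed by hand. A minimal sketch, assuming the nodes get consecutive addresses; the starting address 192.168.1.100 is purely a placeholder, not the real range on this cluster, and hosts_entries is a made-up helper name.

```python
import ipaddress


def hosts_entries(first_ip="192.168.1.100", count=24, prefix="L"):
    """Return hosts-file lines mapping consecutive IPs to L001, L002, ...

    Assumption: node addresses are consecutive, starting at `first_ip`.
    """
    base = ipaddress.IPv4Address(first_ip)
    return [
        "%s\t%s%03d" % (str(base + i), prefix, i + 1)
        for i in range(count)
    ]


def hosts_text(**kwargs):
    """Join the entries into a block that can be appended to a hosts file."""
    return "\n".join(hosts_entries(**kwargs)) + "\n"
```

The output of hosts_text() would be appended to the hosts file wherever name lookups need to work.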