Scyld, local access to nodes, and master node as compute node
Sean Dilda
agrajag at scyld.com
Thu May 24 11:37:28 PDT 2001
On Thu, 24 May 2001, Brian C Merrell wrote:
> On Thu, 24 May 2001, Sean Dilda wrote:
>
> > Is there any reason the program can't run itself in the special
> > way they want? Anything you can do with rlogin or rsh can be done
> > with bpsh, except for an interactive shell. However, that can be
> > mimicked through bpsh. If you can give me some idea of what they
> > want to do, I might be able to help you find a way to do it
> > without requiring an interactive shell. Scyld clusters are
> > designed to run background jobs on all of the slave nodes, not to
> > run login services for users on the slave nodes.
> >
>
> Hmmm. I guess this warrants some background info.
>
> The cluster is not a new cluster. It was previously built by someone else
> who is now gone. The cluster master node crashed, taking the system and
> most of their data with it. I am now trying to rebuild the cluster. The
> cluster previously used RH6.1 stock and followed more of a NOW model than
> a Beowulf model, although all the hardware was dedicated to the cluster,
> not on people's desks. I'm now trying to use Scyld's distro to bring the
> cluster back up. I'm pretty happy with it, and managed to get the master
> node up with a SCSI software RAID array, and a few test nodes up with boot
> floppies. Seems fine to me. BUT....
>
> There are three reasons they want to be able to rlogin to the
> machines: 1) A number of people with independent projects use the
> cluster. They are used to simply logging in to the master,
> rlogin'ing to a node, and starting their projects on one or more
> nodes, so that they take up only a chunk of the cluster. 2) At
> least one researcher was previously able to log in to separate
> nodes to run slightly different (and sometimes non-parallelizable)
> programs on his data, and wants to keep doing so. 3) They have
> code that they would rather not change.
Ok, I understand now. All of these things can be handled with bpsh.
Do you think these people will be happy with doing something like
'rsh <node> <command>' instead of rsh'ing in to get a shell and then
running the command? If so, you could probably get away with just
symlinking /usr/bin/rsh to /usr/bin/bpsh.
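
For example, something like this (just a sketch; note that bpsh takes
a BProc node number where rsh takes a hostname, so this only helps if
your users are willing to address nodes by number):

    # back up the real rsh, then point 'rsh' at bpsh
    mv /usr/bin/rsh /usr/bin/rsh.real
    ln -s /usr/bin/bpsh /usr/bin/rsh

    # this now runs 'hostname' on slave node 3 via bpsh
    rsh 3 hostname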
>
> > It is possible to use BProc with a full install on every slave
> > node; however, this defeats a lot of the easy administration
> > features we've been trying to put into our distro.
> >
>
> I just set this up, and realize what you mean. I had to statically define
> IP addresses, users, etc. At first it wasn't a pain, but I realized after
> the first two that doing all 24 would be. Even though it is now possible
> to rlogin to different nodes, it wasn't what I was hoping for. I imagine
> it will be particularly unpleasant when software upgrades need to be
> performed. :(
This is one of the advantages of our software. It is set up in such a
way that you don't have to do so much work to keep the slave nodes up
to date.
>
> I'm still hoping to find some happy medium, but I'm going to present these
> options to the group and see what they think. The problem is that they
> are mathematicians and physicists, not computer people. They really
> don't want to have to change, even though the end result seems to
> be the same.
>
> Also, one thing I'm still trying to find a solution to: how can the
> nodes address each other? Previously they used a hosts file with
> listings for L001-L024 (and they would like to keep it that way). I
> guess with the floppy method they don't have to, because the BProc
> software maps node numbers to IP addresses.
Perhaps you could write some sort of rsh replacement script that
turns the L001-L024 names into BProc node numbers, then calls bpsh.
Would that be a happy medium?
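
A minimal sketch of such a wrapper, assuming the L001-L024 names map
straight onto BProc node numbers 0-23 (adjust the arithmetic if your
numbering differs):

    #!/bin/sh
    # rsh-style wrapper: translate an LNNN hostname into a BProc
    # node number and hand the command off to bpsh.
    # Assumes L001 -> node 0, L002 -> node 1, and so on.
    host="$1"; shift
    case "$host" in
        L[0-9][0-9][0-9])
            # strip the leading 'L' and any leading zeros, then
            # subtract one to get the zero-based node number
            num=`echo "$host" | sed 's/^L0*//'`
            node=`expr "$num" - 1`
            exec bpsh "$node" "$@"
            ;;
        *)
            echo "usage: $0 LNNN command [args...]" >&2
            exit 1
            ;;
    esac

Dropped in as /usr/bin/rsh (or earlier in the PATH), users could keep
typing 'rsh L005 <command>' exactly as they do now.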