newbie: rsh and pvm problems
Robert G. Brown rgb at phy.duke.edu
Wed Sep 13 05:19:52 PDT 2000
- Previous message: newbie: rsh and pvm problems
- Next message: ScaLAPACK RPMS available at Scyld FTP site
On Tue, 12 Sep 2000, Georgia Southern Beowulf Cluster Project wrote:

> Hello,
>
> I'm part of a small undergrad team building a poor-man's beowulf out of
> surplus computers (P200 to P166 for nodes). Each of the nodes are diskless
> and use the etherboot package to get a kernel image and boot an NFS-mounted
> root partition (unique for each node) and NFS-mount a /home partition that
> is shared across several computers (nodes, a server, and workstations). Our
> problem is that we can start pvm on our server, but it will not allow us to
> add any of the workstations or nodes. Also, the nodes will not start pvm
> (saying that a pvmd.<uid> file is present, but it is not there, honest).
> I've made sure that rsh works across all nodes, server, and workstations in
> user space and that pvm works on the workstations and the server. The
> .rhosts files allow for each computer to access each other without any other
> authentication. Additionally, since all computers share a single /home
> directory, every computer shares the same .rhosts files. One oddity is that
> the workstation will add the server under pvm, but not vice versa. I hope
> someone can enlighten me and that the info above is specific enough without
> being too overbearing. I find it easier to write more instead of less. All
> suggestions are welcome.

The latest version of pvm has a debug mode that tells you exactly where
it fails and why. At a guess, you are failing for one of two reasons.

The most likely one is that key pvm environment variables or paths
aren't set when your server attempts to rsh pvmd on the clients. This
can easily happen within /bin/bash or /bin/sh because environments are
not passed by rsh and sh init files in /etc are typically not executed
on rsh's either.
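
One way to test that first guess is to look at the environment a
non-interactive remote shell actually sees. The following is a minimal
Python sketch, not part of pvm: the helper names (parse_env,
missing_vars, remote_env) and the host name "node1" are made up for
illustration, and it assumes that running `env` through your remote
shell prints plain KEY=VALUE lines.

```python
import subprocess

# Variable names taken from this message; adjust to your installation.
REQUIRED = ("PVM_ROOT", "PVM_RSH")


def parse_env(text):
    """Turn the output of `env` (KEY=VALUE lines) into a dict."""
    env = {}
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        if sep:  # skip lines that contain no '=' at all
            env[key] = value
    return env


def missing_vars(env_text, required=REQUIRED):
    """Return the required variable names that are unset or empty."""
    env = parse_env(env_text)
    return [name for name in required if not env.get(name)]


def remote_env(host, remote_shell="rsh"):
    """Run `env` on `host` through a non-interactive remote shell.

    `remote_shell` is whatever command you use (rsh or ssh); it must
    already work without a password prompt for this to return.
    """
    result = subprocess.run([remote_shell, host, "env"],
                            capture_output=True, text=True, timeout=30)
    result.check_returncode()
    return result.stdout


# Hypothetical usage, with "node1" standing in for one of your nodes:
#   print(missing_vars(remote_env("node1")))
```

If the list that comes back is not empty, the remote pvmd is being
started without those variables, whatever your interactive login shell
shows.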

The less likely one is that /tmp isn't writable on your nodes (but is on
your server) or that you are making the mistake of sharing a single
writable /tmp across several clients, so a race condition is created
that prevents more than one client from starting up. Obviously, every
client needs its own writable /tmp.

My personal suggestion is to:

a) dump rsh (which sucks in so many ways anyway) in favor of ssh, which
is now at last totally legal since RSA jumped the gun and put the key
encryption patent in the public domain. Duke was anticipating this and
is already moving to totally eliminate rsh and telnet and ftp within the
entire campus network. ssh is measurably more expensive than rsh, but
the extra expense is almost certainly irrelevant to pvm -- so it takes
you 5 seconds to spawn and start up a large job instead of 1 or 2, who
cares (as long as the large job runs for a few thousand seconds or more,
your marginal cost is way under a percent).

If you use ssh, you can create /etc/environment and put

  PVM_ROOT=/usr/share/pvm3
  XPVM_ROOT=/usr/share/pvm3/xpvm
  PVM_RSH=/usr/bin/ssh

in it, and these variables will then be set for all users for all ssh
invocations. You will need to learn to set up ssh so that password-free
ssh works across all clients but that is really not difficult.

b) Be sure you get a 3.4 pvm revision that is later than (IIRC) February
of this year as it has the new debug features and supports the PVM_RSH
variable. If you get the 6.2 PowerTools pvm RPM, it has all of this
stuff and creates a stub shell script in /usr/bin to eliminate path
problems. You still have to create the /etc/environment file or make
sure all users have these variables set in their .???rc file for their
shell of choice. I'd recommend doing this whether or not you go for
ssh.

c) If you want to get on the bleeding edge, visit the scyld.com website
(Scyld is also the host of the beowulf.org website) and check out their
"bproc" offering.
This is the beowulf-specific, extremely low overhead alternative to rsh.
It makes no particular attempt to be secure (in the sense of encrypting
traffic, etc.), but it eliminates most of the overhead of a remote shell
and has lots of potential for fabulosity. I suspect that inside a year
or two it will evolve into the glue that converts a pile of PC's into a
"true" supercomputer with something like a unified operating system.

Curiously, when I check out this website myself I can see links to
beostatus and beosetup under their software link but cannot find bproc
itself. I'm sure it is there somewhere, though.

   rgb

>
> Thank you,
>
> Wes Wells
>
> <><><><><><><><><><><><><><><><><><>
> Georgia Southern University
> Beowulf Cluster Project
> gscluster at hotmail.com
>
> _________________________________________________________________________
> Get Your Private, Free E-mail from MSN Hotmail at http://www.hotmail.com.
>
> Share information about yourself, create your own public profile at
> http://profiles.msn.com.
>
>
> _______________________________________________
> Beowulf mailing list
> Beowulf at beowulf.org
> http://www.beowulf.org/mailman/listinfo/beowulf
>

--
Robert G. Brown                        http://www.phy.duke.edu/~rgb/
Duke University Dept. of Physics, Box 90305
Durham, N.C. 27708-0305
Phone: 1-919-660-2567  Fax: 919-660-2525     email:rgb at phy.duke.edu
