newbie: still pvm problems.
Georgia Southern Beowulf Cluster Project
gscluster at hotmail.com
Mon Sep 18 07:03:08 PDT 2000
Hello,
I sent an earlier e-mail entitled "newbie: rsh and pvm problems" that
should be available in last week's mailings. With your help I've been able
to solve the rsh problem. It turned out that the wrong permissions
were set on a couple of PAM files. However, pvm will not start on my nodes.
These nodes use the etherboot package to boot from a floppy, and NFS mount
their root filesystem from a server node. Each node has a unique filesystem
with the exception of the /home directory, which they share with the server,
all other nodes, and our development workstations. This helps make sure
that user profiles and files are consistent within the cluster. Also, the
/tmp directory has permissions 1777 so anyone can write to it. In testing
I've set my $PVM_TMP to point to /tmp/<username> so that I can avoid
seeing other users' pvmd and pvml files. This is all to describe my setup.
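
To make the setup above easy to verify on each node, here is a minimal
sketch of a check, not part of my actual configuration. The function name
tmp_dir_report is made up for illustration. It only reports whether a
directory exists and what its permission bits are, so /tmp can be compared
against 1777 and /tmp/<username> can be confirmed to exist.

```python
import os
import stat


def tmp_dir_report(path):
    """Return (exists, mode) for a directory.

    mode is the permission bits including the sticky bit (for example
    0o1777), or None when path is missing or is not a directory.
    tmp_dir_report is a hypothetical helper, not part of PVM.
    """
    try:
        st = os.stat(path)
    except OSError:
        return False, None
    if not stat.S_ISDIR(st.st_mode):
        return False, None
    return True, stat.S_IMODE(st.st_mode)


if __name__ == "__main__":
    # Check /tmp and whatever $PVM_TMP points to (falls back to /tmp).
    for candidate in ("/tmp", os.environ.get("PVM_TMP", "/tmp")):
        exists, mode = tmp_dir_report(candidate)
        print(candidate, exists, oct(mode) if mode is not None else None)
```

Running it on the server and on a node should show whether the two see the
same permissions for the same paths.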
Now to the problem.
When I log in to a node directly (at its keyboard and monitor, not remotely)
and try to start pvm by just typing "pvm", it prints the following lines:
libpvm [pid #] /tmp/<username>/pvmd.<uid>: No such file or directory
libpvm [pid #]: Console: can't start pvmd
My directory exists and is empty, which I know because I had just created
it. Furthermore, my pvml.<uid> file is created with the following contents:
pvmd[pid #] date time mksocs() socket loclsock: Invalid argument
pvmd[pid #] date time pvmbailout(0)
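
One way to narrow this down is to test socket creation outside of pvm. The
following is a minimal sketch under an assumption: that the "loclsock" in
the message is a Unix-domain socket created under the temp directory, which
is a guess on my part and not confirmed. The function name
check_unix_socket is made up for illustration. It reports which step fails,
socket() or bind(), for a given directory.

```python
import os
import socket


def check_unix_socket(directory):
    """Try to create and bind a Unix-domain socket inside directory.

    Returns (True, "ok") on success, or (False, "<step>: <error>") where
    <step> is "socket" or "bind". check_unix_socket is a hypothetical
    helper, not part of PVM.
    """
    path = os.path.join(directory, "socktest.%d" % os.getpid())
    try:
        sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    except OSError as err:
        # Failing here would point at the kernel rather than the filesystem.
        return False, "socket: %s" % err
    try:
        sock.bind(path)
    except OSError as err:
        # Failing here would point at the directory or its filesystem.
        return False, "bind: %s" % err
    finally:
        sock.close()
        # Remove the socket file that a successful bind leaves behind.
        if os.path.lexists(path):
            os.unlink(path)
    return True, "ok"


if __name__ == "__main__":
    print(check_unix_socket(os.environ.get("PVM_TMP", "/tmp")))
```

If this fails at the socket step on a node but works on the server, the
difference is more likely in the node's kernel than in NFS or the
permissions of /tmp.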
When starting pvm remotely by adding the node to an already running daemon
(say, adding a node from our server node) I receive the following message:
PVM Daemon Files Found on <node>!
And it proceeds to tell me to delete any present pvmd.<uid> files or socket
files. However, nothing is present in my /tmp/<username> directory or the
/tmp directory (both having been freshly cleared before trying this). Could
these errors be the result of NFS, or is there another file that is causing
the problems? I created the nodes by tarring the /bin, /sbin, /lib, and other
directories of our server node into a template directory, and all other
nodes hard link to its executables and libraries.
Additionally, I use the bash/bash2 shell and all of my pvm variables are
declared in the .bashrc file in my $HOME directory.
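
To double-check that no daemon files are left behind, the search can be
scripted so it looks in both /tmp and $PVM_TMP under the same account that
pvm runs as. This is a minimal sketch; the function name
find_pvm_leftovers is made up for illustration, and it only looks for the
two file names mentioned above, pvmd.<uid> and pvml.<uid>.

```python
import os


def find_pvm_leftovers(directories, uid):
    """Return sorted paths of pvmd.<uid> / pvml.<uid> files in directories.

    find_pvm_leftovers is a hypothetical helper, not part of PVM.
    """
    names = ("pvmd.%d" % uid, "pvml.%d" % uid)
    found = set()
    for directory in directories:
        for name in names:
            path = os.path.join(directory, name)
            # lexists also reports socket files and dangling symlinks.
            if os.path.lexists(path):
                found.add(path)
    return sorted(found)


if __name__ == "__main__":
    dirs = ["/tmp", os.environ.get("PVM_TMP", "/tmp")]
    print(find_pvm_leftovers(dirs, os.getuid()))
```

Running it over rsh as well as at the console would also show whether
$PVM_TMP from .bashrc is set the same way in both cases.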
Any help or guidance is very much appreciated.
Thank you,
Wes Wells
<><><><><><><><><><><><><><><><><><>
Georgia Southern University
Beowulf Cluster Project
gscluster at hotmail.com