newbie: still pvm problems.
Georgia Southern Beowulf Cluster Project gscluster at hotmail.com
Mon Sep 18 07:03:08 PDT 2000
Hello,

I sent a previous e-mail entitled "newbie: rsh and pvm problems" that should be available in last week's mailings. With your help I've been able to solve the rsh problem: the wrong permissions were set on a couple of PAM files. However, pvm still will not start on my nodes.

First, my setup. The nodes use the etherboot package to boot from a floppy and NFS-mount their root filesystem from a server node. Each node has its own filesystem, with the exception of the /home directory, which is shared with the server, all other nodes, and our development workstations. This keeps user profiles and files consistent within the cluster. The /tmp directory has permissions 1777, so anyone can write to it. For testing I've set $PVM_TMP to point to /tmp/<username> so that I don't see other users' pvmd and pvml files.

Now to the problem. When I log in to a node at the console (not remotely, but with keyboard and monitor) and try to start pvm by just typing "pvm", it prints the following lines:

    libpvm [pid #] /tmp/<username>/pvmd.<uid>: No such file or directory
    libpvm [pid #]: Console: can't start pvmd

My directory exists and is empty; I know this because I had just created it. Furthermore, my pvml.<uid> file is created with the following contents:

    pvmd[pid #] date time mksocs() socket loclsock: Invalid argument
    pvmd[pid #] date time pvmbailout(0)

When I start pvm remotely by adding the node to an already running daemon (say, adding a node from our server node), I receive the following message:

    PVM Daemon Files Found on <node>!

It then tells me to delete any existing pvmd.<uid> files or socket files. However, nothing is present in my /tmp/<username> directory or in the /tmp directory (both were freshly cleaned out before trying this). Could these errors be the result of NFS, or is there another file that is causing the problems?
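The checks described above can be repeated with a short script. This is only a sketch, assuming a bash-compatible shell: the $PVM_TMP location and the pvmd.<uid> and pvml.<uid> file names are taken from the message, while the variable names user, uid, and leftovers are illustrative and not part of PVM.

```shell
# Sketch only: mirrors the setup described in this message.
# PVM_TMP=/tmp/<username> comes from the text; the variable names
# user, uid, and leftovers are illustrative, not part of PVM.
user="${USER:-$(id -un)}"
uid="$(id -u)"
export PVM_TMP="/tmp/$user"

# Make sure the per-user directory exists.
mkdir -p "$PVM_TMP"

# Look for leftover daemon and log files in both candidate locations.
leftovers=0
for dir in /tmp "$PVM_TMP"; do
    for f in "$dir/pvmd.$uid" "$dir/pvml.$uid"; do
        if [ -e "$f" ]; then
            ls -la "$f"
            leftovers=$((leftovers + 1))
        fi
    done
done
echo "leftover PVM files for uid $uid: $leftovers"

# Show which filesystem backs the directory (NFS or local).
df "$PVM_TMP"
```

Running it on the node itself and again from the server shows whether both sides see the same directory contents and the same backing filesystem.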
I created the nodes from the /bin, /sbin, /lib, and other directories of our server node, tar'ed into a template directory; the executables and libraries in that template are hard-linked by all other nodes. Additionally, I use the bash/bash2 shell, and all of my pvm variables are declared in the .bashrc file in my $HOME directory.

Any help or guidance is very much appreciated.

Thank you,
Wes Wells

<><><><><><><><><><><><><><><><><><>
Georgia Southern University
Beowulf Cluster Project
gscluster at hotmail.com
