[Beowulf] PVM does not spawn to other nodes
Robert G. Brown
rgb at phy.duke.edu
Fri Aug 13 10:50:24 PDT 2004
On Wed, 4 Aug 2004, Patrick Begou wrote:
> Where is your slave program located?
> It should be in $HOME/pvm3/bin/$PVM_ARCH
> $HOME is your home directory on the node (it could be NFS mounted)
> $PVM_ARCH is the node architecture: LINUX for an ia32 Linux node, RS6K for
> an IBM RS6000 node, etc.
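The location rule quoted above can be checked mechanically on each node. What follows is only a minimal sketch in Python, not part of PVM itself: the helper names and the program name "slave" are made up for illustration, and it assumes PVM_ARCH is either set in the environment or passed in explicitly.

```python
import os
from pathlib import Path


def expected_slave_path(program, home=None, pvm_arch=None):
    """Build $HOME/pvm3/bin/$PVM_ARCH/<program> (hypothetical helper)."""
    home = Path(home) if home is not None else Path.home()
    arch = pvm_arch if pvm_arch is not None else os.environ.get("PVM_ARCH")
    if not arch:
        # Assumption: without PVM_ARCH we cannot guess the directory name.
        raise ValueError("PVM_ARCH is not set; pass pvm_arch explicitly")
    return home / "pvm3" / "bin" / arch / program


def slave_is_installed(program, home=None, pvm_arch=None):
    """True if the program exists at the expected path and is executable."""
    path = expected_slave_path(program, home, pvm_arch)
    return path.is_file() and os.access(path, os.X_OK)


if __name__ == "__main__":
    # "slave" and "LINUX" are placeholders; substitute your own values.
    print(expected_slave_path("slave", pvm_arch="LINUX"))
    print(slave_is_installed("slave", pvm_arch="LINUX"))
```

Since $HOME may or may not be shared over NFS, a check like this is only meaningful when run on the node where the task is supposed to be spawned.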
There are also some oddities in PVM associated with specific network
interfaces. For example, I have been able to reproducibly get error
messages like this (and failed tasks) when trying to run a PVM job
STARTING from my laptop connected by wireless to my home network.
Note how odd this is. The interface is wireless, sure, but as far as
userspace software is concerned this is just another eth* interface --
it has an IP address, etc., and in all ways behaves just like TCP/IP over
ethernet, just slower and with more randomly dropped packets as I move
around.
To make matters worse, I can -- reproducibly -- USE the laptop as a
SLAVE node in the same computations. Needless to say, ssh is enabled
correctly in both directions, the same distro is on all systems, same
version of PVM, PVM starts perfectly on wireless and wired nodes...
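For anyone who wants to rule out the ssh side before blaming the network hardware, a non-interactive login test in each direction is one way to do it. This is a rough sketch only, assuming the OpenSSH client is installed and that remote startup goes over ssh; the function names are invented here and the host name "node1" is a placeholder.

```python
import subprocess


def ssh_check_command(host, timeout_s=5):
    """Command line for a password-free, non-interactive ssh test."""
    return [
        "ssh",
        "-o", "BatchMode=yes",               # fail instead of prompting
        "-o", f"ConnectTimeout={timeout_s}",
        host,
        "true",                              # trivial remote command
    ]


def can_ssh(host, timeout_s=5, run=subprocess.run):
    """True if the remote command exits with status 0."""
    try:
        result = run(
            ssh_check_command(host, timeout_s),
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
    except OSError:
        # e.g. no ssh client on this machine
        return False
    return result.returncode == 0


if __name__ == "__main__":
    # "node1" is a placeholder; run this on both ends, naming the other.
    print(can_ssh("node1"))
```

A True result in both directions only shows that logins work. It says nothing about whatever else PVM does on the interface, which is where the failure described here seems to live.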
I went as deep into the cause of failure as PVM permits with its debug
options and could determine only that some networking command was
failing on the wireless interface (only). I'm >>guessing<< that this is
due to a lack of support for e.g. multicasting or the like at the level
of the wireless access point itself, but it is very difficult to know
for sure without trying to debug pvm at the source code level.
So I don't know why you're getting the message, but it could be related
to hardware -- a switch or NIC that doesn't implement some required
feature that isn't terribly well documented...
Robert G. Brown http://www.phy.duke.edu/~rgb/
Duke University Dept. of Physics, Box 90305
Durham, N.C. 27708-0305
Phone: 1-919-660-2567 Fax: 919-660-2525 email:rgb at phy.duke.edu