[Beowulf] errors while testing machines
akhtar Rasool
akhtar_samo at yahoo.com
Fri Dec 10 21:10:44 PST 2004
After extracting MPICH in /usr/local, I ran the following steps:
1- tcsh
2- ./configure --with-comm=shared --prefix=/usr/local
3- make
4- make install
5- util/tstmachines
The fifth step failed with this error:
Errors while trying to run rsh 192.168.0.25 -n /bin/ls /usr/local/mpich/mpich-1.2.5.2/mpichfoo
Unexpected response from 192.168.0.25:
> /bin/ls: /usr/local/mpich/mpich-1.2.5.2/mpichfoo: No such file or directory
The ls test failed on some machines.
This usually means that you do not have a common filesystem on all of the machines in your machines list; MPICH requires this for mpirun (it is possible to handle this in a procgroup file; see the ...)
Other possible problems include:
- The remote shell command rsh does not allow you to run ls.
  See the documentation about remote shell and rhosts.
- You have a common filesystem, but with inconsistent names.
  See the documentation on the automounter fix.
1 error was encountered while testing the machines list for LINUX.
Only these machines seem to be available:
host1
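Judging from the error output above, the ls test runs /bin/ls on a marker file through rsh on each listed machine and expects the same path to come back. A minimal sketch of that kind of check follows, so it can be repeated by hand. The function name check_common_fs and its arguments are hypothetical and are not part of MPICH; the sketch also assumes one host per line in the machines file, optionally followed by ":n".

```shell
#!/bin/sh
# Hypothetical helper (not part of MPICH): for every host in a machines
# file, list a marker file through a remote shell and compare the output
# with the expected path. The output is compared rather than the exit
# status, since the exit status of the remote command may not be
# passed back by the remote shell.
#
# Usage: check_common_fs RSH_CMD MACHINES_FILE MARKER_PATH
check_common_fs() {
    rsh_cmd=$1
    machines=$2
    marker=$3
    failed=0
    while read -r host rest || [ -n "$host" ]; do
        # Skip blank lines and comment lines.
        case $host in
            ''|'#'*) continue ;;
        esac
        # Assumption: entries may look like "host:2"; keep the host part.
        host=${host%%:*}
        # -n keeps rsh from reading stdin, as in the failing command above.
        out=$($rsh_cmd "$host" -n /bin/ls "$marker" 2>&1 </dev/null)
        if [ "$out" != "$marker" ]; then
            echo "Unexpected response from $host: $out"
            failed=$((failed + 1))
        fi
    done < "$machines"
    # The return value is the number of hosts that failed the check.
    return "$failed"
}
```

With the paths from this post, a call might look like: check_common_fs rsh machines.LINUX /usr/local/mpich/mpich-1.2.5.2/mpichfoo (the marker file has to exist on the server first). A host that reports "No such file or directory" does not see the same directory at that path.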
Since this is only a two-node cluster, host1 is the server on which MPICH is being installed, and 192.168.0.25 is the client.
rsh logs in freely on both nodes.
On the server side, the file machines.LINUX contains:
-192.168.0.25
-host1
Kindly help
Akhtar