[Beowulf] errors while testing machines

akhtar Rasool akhtar_samo at yahoo.com
Fri Dec 10 21:10:44 PST 2004


After extracting MPICH in /usr/local, I ran the following steps:

 

1- tcsh               

2- ./configure --with-comm=shared --prefix=/usr/local

3-  make

4-  make install

5-  util/tstmachines

The fifth step failed with this error:

Errors while trying to run rsh 192.168.0.25 -n /bin/ls /usr/local/mpich/mpich-1.2.5.2/mpichfoo

Unexpected response from 192.168.0.25:

> /bin/ls: /usr/local/mpich/mpich-1.2.5.2/mpichfoo: No such file or directory

The ls test failed on some machines.

This usually means that you do not have a common filesystem on all of the machines in your machines list; MPICH requires this for mpirun (it is possible to handle this in a procgroup file; see the……)

Other possible problems include:

The remote shell command rsh does not allow you to run ls.

See the documentation about remote shell and rhosts.

You have a common filesystem, but with inconsistent names.

See the documentation on the automounter fix.

1 error were encountered while testing the machines list for LINUX

only these machines seem to be available

host1
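
For reference, the ls test above can be repeated by hand. The following is only a rough sketch of my understanding of what tstmachines checks, not its actual code: it runs /bin/ls on the test path on each host and compares the reply with the path itself, because rsh does not reliably pass back the remote exit status. The function names list_hosts and check_path_on_hosts, the RSH variable, and the assumed machines file format (one host per line, optional ":n" suffix, "#" comments) are my own assumptions.

```shell
#!/bin/sh
# Hypothetical helper: print the host entries of a machines file,
# dropping '#' comments, blank lines and any ':n' processor-count suffix.
list_hosts() {
    sed -e 's/#.*//' -e 's/[[:space:]]//g' -e 's/:.*//' -e '/^$/d' "$1"
}

# Hypothetical helper: run "/bin/ls <path>" on every host through $RSH
# (defaults to rsh) and report the hosts whose reply is not the path itself.
check_path_on_hosts() {
    machines_file=$1
    path=$2
    failed=0
    for host in $(list_hosts "$machines_file"); do
        # The exit status is ignored on purpose; only the reply is compared.
        out=$(${RSH:-rsh} "$host" -n /bin/ls "$path" 2>&1) || true
        if [ "$out" = "$path" ]; then
            echo "ok      $host"
        else
            echo "FAILED  $host"
            echo "    response: $out" >&2
            failed=$((failed + 1))
        fi
    done
    # Zero only when every host returned the expected reply.
    return "$failed"
}
```

After sourcing these functions on the server, the call would be something like: check_path_on_hosts machines.LINUX /usr/local/mpich/mpich-1.2.5.2/mpichfoo (the test file has to exist on the server first). A host reported as FAILED cannot see that path under the same name.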

 


 

 

    

This is only a two-node cluster: host1 is the server on which MPICH is being installed, and 192.168.0.25 is the client.

rsh logs in freely on both nodes.

On the server side, the file "machines.LINUX" contains:

-192.168.0.25

-host1
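
In case the leading dashes are literally in the file and not just list markers in this message (I am not sure which), a quick check like the sketch below would list such entries with their line numbers. The function name find_dashed_entries is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: print, with line numbers, every entry in a machines
# file that begins with '-' (optionally preceded by whitespace).
# grep exits non-zero when no such entry exists.
find_dashed_entries() {
    grep -n '^[[:space:]]*-' "$1"
}
```

Running find_dashed_entries machines.LINUX prints nothing if every entry is a bare hostname or address.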

Kindly help.

   

 

Akhtar


		