[Beowulf] problem with execution of cpi in two node cluster

Vinodh gvinodh1980 at yahoo.co.in
Tue Jan 11 23:27:08 PST 2005

I set up a two-node cluster with mpich2-1.0.

The master node is named aarya.
The slave node is named desktop2.

I enabled passwordless ssh.

In mpd.hosts, I included the names of both nodes.

The command mpdboot -n 2 works fine.

The command mpdtrace lists the names of both machines.

I copied the example program cpi to /home/vinodh/ on
both nodes.

Running mpiexec -n 2 cpi gives this output:

Process 0 of 2 is on aarya
Process 1 of 2 is on desktop2
aborting job:
Fatal error in MPI_Bcast: Other MPI error, error
MPI_Bcast(821): MPI_Bcast(buf=0xbfffbf28, count=1, MPI_INT, root=0, MPI_COMM_WORLD) failed
MPIDI_CH3_Progress_wait(207): an error occurred while handling an event returned by MPIDU_Sock_Wait()
[ch3:sock] failed to connnect to remote process
MPIDU_Socki_handle_connect(767): connection failure (set=0,sock=1,errno=113:No route to host)
rank 0 in job 1  aarya_40878   caused collective abort of all ranks
  exit status of rank 0: return code 13

However, the other example, hellow, works fine.

Please let me know why the cpi program fails with this error.
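
One way to narrow down the errno=113 failure in the output above might be
to try a plain TCP connection between the two nodes outside of MPI. The
following is only a minimal sketch, assuming Python is available on both
nodes. The helper name try_connect and the port number in the usage
comment are hypothetical and are not part of mpich2.

```python
import socket


def try_connect(host, port, timeout=3.0):
    """Attempt one TCP connection to (host, port).

    Returns (ok, errno_value, message). errno_value is None on success
    and may also be None for some failures such as a timeout.
    On Linux, errno 113 is EHOSTUNREACH ("No route to host").
    """
    try:
        # create_connection resolves the name and opens a TCP socket
        with socket.create_connection((host, port), timeout=timeout):
            return True, None, "connected"
    except OSError as exc:
        return False, exc.errno, str(exc)


# Hypothetical usage, run on aarya while some test listener is bound to
# port 5000 on desktop2 (the port number is an assumption):
#
#   print(socket.gethostbyname("desktop2"))
#   print(try_connect("desktop2", 5000))
```

If this reports errno 113 for an arbitrary port while ssh between the
nodes still works, that would point toward packet filtering or name
resolution on one of the nodes rather than toward cpi itself, but this
is a guess and not a confirmed diagnosis.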

G. Vinodh Kumar

