[Beowulf] Mpich problem
loc duong ding
mambom1902 at yahoo.com
Sun Jul 13 06:29:16 PDT 2008
Dear G. Vinodh Kumar,

I read about your problem on the Internet. Your original message was:
Hi,

I set up a two-node cluster with MPICH2 1.0. The master node is named aarya
and the slave node is named desktop2. I enabled passwordless ssh between the
nodes and listed both node names in mpd.hosts. The command mpdboot -n 2 works
fine, and mpdtrace reports both machines. I copied the example program cpi to
/home/vinodh/ on both nodes.
Running mpiexec -n 2 cpi gives the following output:
Process 0 of 2 is on aarya
Process 1 of 2 is on desktop2
aborting job:
Fatal error in MPI_Bcast: Other MPI error, error stack:
MPI_Bcast(821): MPI_Bcast(buf=0xbfffbf28, count=1, MPI_INT, root=0, MPI_COMM_WORLD) failed
MPIR_Bcast(229):
MPIC_Send(48):
MPIC_Wait(308):
MPIDI_CH3_Progress_wait(207): an error occurred while handling an event returned by MPIDU_Sock_Wait()
MPIDI_CH3I_Progress_handle_sock_event(1053): [ch3:sock] failed to connnect to remote process kvs_aarya_40892_0:1
MPIDU_Socki_handle_connect(767): connection failure (set=0,sock=1,errno=113:No route to host)
rank 0 in job 1 aarya_40878 caused collective abort of all ranks
exit status of rank 0: return code 13
However, the other example program, hellow, works fine. Could you let me know
why only the cpi program fails?
Regards,
G. Vinodh Kumar
At present I have the same problem after installing MPICH2 1.0.7. I think you
have solved it by now; could you please tell me how you fixed it?
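In case it helps to narrow things down: as far as I can tell, hellow only
prints a line from each rank and never sends data between the ranks, while
cpi's first collective call is the MPI_Bcast shown in the error stack above.
The errno 113 ("No route to host") therefore seems to point at the TCP
connection from one node to the other being blocked, for example by a
firewall on one of the machines, rather than at the cpi program itself.
I think a broadcast-only test along the lines of the sketch below would hit
the same error if the node-to-node sockets are the problem (this is only my
rough sketch, not one of the MPICH examples; the file name bcast_test.c is my
own choice):

    /* bcast_test.c -- minimal broadcast-only sketch (not an MPICH example).
     * Rank 0 broadcasts one integer to every rank, which forces the same
     * kind of inter-node socket connection that cpi needs for MPI_Bcast. */
    #include <stdio.h>
    #include <mpi.h>

    int main(int argc, char *argv[])
    {
        int rank, size, namelen;
        int value = 0;
        char name[MPI_MAX_PROCESSOR_NAME];

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);
        MPI_Get_processor_name(name, &namelen);

        if (rank == 0)
            value = 42;   /* only the root rank knows the value at first */

        /* the same call that fails in the error stack above */
        MPI_Bcast(&value, 1, MPI_INT, 0, MPI_COMM_WORLD);

        printf("rank %d of %d on %s received value %d\n",
               rank, size, name, value);

        MPI_Finalize();
        return 0;
    }

If a test like this, compiled with mpicc and started with mpiexec -n 2,
aborts with the same errno 113 when the two ranks land on different nodes
but runs fine with both ranks on a single machine, that would confirm the
connection between aarya and desktop2 is being blocked.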
I look forward to your reply.
Thank you in advance.
Sincerely,
Duong Dinh Loc