[Beowulf] mpich2 error
Ru-Zhen Li
r.li at qmul.ac.uk
Tue Jan 31 02:54:49 PST 2006
Dear all,
I keep getting the error message below and cannot work out why. Has anybody had a similar experience? Thanks!
aborting job:
Fatal error in MPI_Barrier: Other MPI error, error stack:
MPI_Barrier(406): MPI_Barrier(MPI_COMM_WORLD) failed
MPIR_Barrier(76):
MPIC_Sendrecv(161):
MPIC_Wait(321):
MPIDI_CH3_Progress_wait(209): an error occurred while handling an event returned by MPIDU_Sock_Wait()
MPIDI_CH3I_Progress_handle_sock_event(489):
connection_recv_fail(1836):
MPIDU_Socki_handle_read(658): connection failure (set=0,sock=1,errno=104:Connection reset by peer)
rank 9 in job 1 cn117_42770 caused collective abort of all ranks
exit status of rank 9: killed by signal 9
rank 7 in job 1 cn117_42770 caused collective abort of all ranks
exit status of rank 7: killed by signal 9
rank 10 in job 1 cn117_42770 caused collective abort of all ranks
exit status of rank 10: return code 13
rank 11 in job 1 cn117_42770 caused collective abort of all ranks
exit status of rank 11: killed by signal 9