[Beowulf] mpich2 error

Ru-Zhen Li r.li at qmul.ac.uk
Tue Jan 31 02:54:49 PST 2006


Dear all,

I keep getting the error message below and can't work out why. Has anybody had a similar experience? Thanks!

aborting job:
Fatal error in MPI_Barrier: Other MPI error, error stack:
MPI_Barrier(406): MPI_Barrier(MPI_COMM_WORLD) failed
MPIR_Barrier(76):
MPIC_Sendrecv(161):
MPIC_Wait(321):
MPIDI_CH3_Progress_wait(209): an error occurred while handling an event returned by MPIDU_Sock_Wait()
MPIDI_CH3I_Progress_handle_sock_event(489):
connection_recv_fail(1836):
MPIDU_Socki_handle_read(658): connection failure (set=0,sock=1,errno=104:Connection reset by peer)
rank 9 in job 1  cn117_42770   caused collective abort of all ranks
  exit status of rank 9: killed by signal 9
rank 7 in job 1  cn117_42770   caused collective abort of all ranks
  exit status of rank 7: killed by signal 9
rank 10 in job 1  cn117_42770   caused collective abort of all ranks
  exit status of rank 10: return code 13
rank 11 in job 1  cn117_42770   caused collective abort of all ranks
  exit status of rank 11: killed by signal 9
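
For anyone triaging a similar log, the per-rank exit lines can be tallied with a short script. This is only a minimal sketch, assuming the "exit status of rank N: ..." format shown above; the helper name `summarize_exit_status` is hypothetical and not part of MPICH2.

```python
import re
from collections import defaultdict

# Matches lines such as "  exit status of rank 9: killed by signal 9"
# or "  exit status of rank 10: return code 13" (format assumed from the log above).
EXIT_LINE = re.compile(
    r"exit status of rank (\d+): (killed by signal|return code) (\d+)"
)


def summarize_exit_status(log_text):
    """Group ranks by how they exited (hypothetical helper, not an MPICH2 API)."""
    summary = defaultdict(list)
    for match in EXIT_LINE.finditer(log_text):
        rank, kind, value = match.groups()
        summary[(kind, int(value))].append(int(rank))
    return dict(summary)


if __name__ == "__main__":
    # A two-line excerpt of the log above, used only as sample input.
    sample = (
        "  exit status of rank 9: killed by signal 9\n"
        "  exit status of rank 10: return code 13\n"
    )
    for (kind, value), ranks in summarize_exit_status(sample).items():
        print(f"{kind} {value}: ranks {ranks}")
```

Run over the full log, this would separate rank 10 (return code 13) from ranks 7, 9 and 11 (killed by signal 9). That may hint that rank 10 is the one to look at first, but that is a guess rather than a diagnosis.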