Dear All,<br>
<pre>I keep getting the error message below and can't figure out why. Has anybody had a similar experience?<br><br>Fatal error in MPI_Barrier: Other MPI error, error stack:<br>MPI_Barrier(406): MPI_Barrier(MPI_COMM_WORLD) failed<br>
MPIR_Barrier(76):<br>MPIC_Sendrecv(152):<br>MPIC_Wait(321):<br>MPIDI_CH3_Progress_wait(209): an error occurred while handling an event returned by MPIDU_Sock_Wait()<br>MPIDI_CH3I_Progress_handle_sock_event(489):<br>connection_recv_fail(1836):
<br>MPIDU_Socki_handle_read(658): connection failure (set=0,sock=2,errno=104:Connection reset by peer)<br>aborting job:<br><br>On 7 nodes, though, the job runs fine with no errors.<br><br>Can you help me?<br><br>Thanks!<br></pre>