[bproc] MPI chokes
Jag
agrajag at linuxpower.org
Thu Mar 15 07:44:48 PST 2001
On Thu, 15 Mar 2001, Arthur H. Edwards wrote:
> Erik Arjan Hendriks wrote:
>
> > On Wed, Mar 14, 2001 at 04:44:29PM -0700, Art Edwards wrote:
> >
> >> I've installed Scyld on a small cluster and I'm trying to
> >> run the test programs that come with beompi
> >>
> >> The codes run on one node. However, when I try to run
> >> on multiple nodes I get the following error
> >>
> >> jarrett/home/edwardsa>mpirun -np 2 pi3p
> >> p0_28682: p4_error: net_create_slave: bproc_rfork: -1
> >> p4_error: latest msg from perror: Invalid argument
> >> jarrett/home/edwardsa>bm_list_28683: p4_error: interrupt SIGINT: 2
> >>
<snip>
> >
> > BProc doesn't use host names anywhere, so nothing involving
> > hostnames will affect whether or not an rfork works.
> >
> > There's some other MPI issue going on here.
> >
> > - Erik
> >
>
> Thanks for the reply. The program dies in the PMPI_INIT phase. What
> should I be doing to figure this out?
Based on the error messages in your previous message, it looks like
mpirun is trying to rfork to a node that is down. What does the output of
'bpstat' on your cluster look like?
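
The two pieces of the error that point this way are the bproc_rfork
return value and the perror text. As a rough sketch only, here is a bit
of Python that picks them out of the quoted output; the helper name and
the regular expressions are illustrative assumptions that match just the
two line shapes shown in this thread, not an official p4 or BProc log
format.

```python
import re

# Error lines copied from the mpirun output quoted earlier in this thread.
LOG = """\
p0_28682: p4_error: net_create_slave: bproc_rfork: -1
p4_error: latest msg from perror: Invalid argument
"""


def summarize_p4_errors(text):
    """Return (failing_call, return_code, perror_message) found in text.

    Hypothetical helper: it only recognizes the two line shapes above.
    Any field that is not found is returned as None.
    """
    call = code = reason = None
    for line in text.splitlines():
        # e.g. "p4_error: net_create_slave: bproc_rfork: -1"
        m = re.search(r"p4_error: \w+: (\w+): (-?\d+)\s*$", line)
        if m:
            call, code = m.group(1), int(m.group(2))
        # e.g. "p4_error: latest msg from perror: Invalid argument"
        m = re.search(r"latest msg from perror: (.+)$", line)
        if m:
            reason = m.group(1).strip()
    return call, code, reason


print(summarize_p4_errors(LOG))
```

A failing bproc_rfork together with "Invalid argument" is what makes me
want to see the node states from bpstat before looking anywhere else.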
Jag