[bproc]MPI chokes

Erik Arjan Hendriks hendriks at hendriks.cx
Wed Mar 14 21:16:06 PST 2001


On Wed, Mar 14, 2001 at 04:44:29PM -0700, Art Edwards wrote:
> I've installed Scyld on a small cluster and I'm trying to
> run the test programs that come with beompi
> 
> The codes run on one node. However, when I try to run
> on multiple nodes I get the following error
> 
> jarrett/home/edwardsa>mpirun -np 2 pi3p
> p0_28682:  p4_error: net_create_slave: bproc_rfork: -1
>     p4_error: latest msg from perror: Invalid argument
> jarrett/home/edwardsa>bm_list_28683:  p4_error: interrupt SIGINT: 2
> 
> I have asked about this in a previous message, so here
> are two more specific questions.
> 
> The master node has a hostname that is not node0. The first
> slave node, as far as beosetup is concerned, is node0. Is this a problem?

In BProc's terms, the nodes are numbered 0 through n-1.  The front end
is node -1.
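
As a rough illustration of that numbering only (a sketch, not BProc
code: describe_node, num_slaves and FRONT_END are made-up names for
this example and are not part of any BProc API):

```python
# Assumption for illustration: a cluster with `num_slaves` slave nodes,
# numbered 0 through num_slaves - 1, plus the front end at -1.
FRONT_END = -1


def describe_node(node: int, num_slaves: int) -> str:
    """Label a node number under the numbering scheme described above."""
    if num_slaves < 0:
        raise ValueError("num_slaves must be non-negative")
    if node == FRONT_END:
        return "front end"
    if 0 <= node < num_slaves:
        return f"slave node {node}"
    raise ValueError(f"node {node} is outside -1..{num_slaves - 1}")


if __name__ == "__main__":
    # With 4 slaves, the valid node numbers are -1, 0, 1, 2, 3.
    for n in range(-1, 4):
        print(n, describe_node(n, 4))
```

So the first slave being "node0" while the master has some other
hostname is consistent with this scheme: the master is simply node -1.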
 
> When beompi assigns nodes does it look at a machines file?
> Should I install a HOSTNAME file on each slave?

BProc doesn't use any host names anywhere so nothing involving
hostnames will affect whether or not an rfork works.

There's some other MPI issue going on here.

- Erik



