[bproc] MPI chokes

Arthur H. Edwards edwards@icantbelieveimdoingthis.com
Thu, 15 Mar 2001 08:46:55 -0700


Erik Arjan Hendriks wrote:

> On Wed, Mar 14, 2001 at 04:44:29PM -0700, Art Edwards wrote:
> 
>> I've installed Scyld on a small cluster and I'm trying to
>> run the test programs that come with beompi.
>> 
>> The codes run on one node. However, when I try to run
>> on multiple nodes I get the following error:
>> 
>> jarrett/home/edwardsa>mpirun -np 2 pi3p
>> p0_28682:  p4_error: net_create_slave: bproc_rfork: -1
>>     p4_error: latest msg from perror: Invalid argument
>> jarrett/home/edwardsa>bm_list_28683:  p4_error: interrupt SIGINT: 2
>> 
>> I have asked about this in a previous message, so here
>> are two more specific questions.
>> 
>> The master node has a hostname that is not node0. The first
>> slave node, as far as beosetup is concerned, is node0. Is this a problem?
> 
> In BProc's terms, the nodes are numbered 0 through n-1.  The front end
> is node -1.
>  
> 
>> When beompi assigns nodes does it look at a machines file?
>> Should I install a HOSTNAME file on each slave?
> 
> BProc doesn't use any host names anywhere, so nothing involving
> hostnames will affect whether or not an rfork works.
> 
> There's some other MPI issue going on here.
> 
> - Erik
> 
> 
> 

Thanks for the reply. The program dies in the PMPI_INIT phase. What 
should I be doing to figure this out?

Art Edwards
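
For reference, the numbering Erik describes above (slave nodes 0 through
n-1, front end -1) can be sketched as follows. This is only an
illustration in Python under that stated convention; FRONT_END and
describe_node are hypothetical names, not part of BProc or beompi, and
the sketch makes no BProc calls.

```python
# Illustration only: models the node numbering described in the reply
# (front end is -1, slave nodes are 0 through n-1). FRONT_END and
# describe_node are hypothetical names, not BProc or beompi identifiers.
FRONT_END = -1


def describe_node(node_id: int, num_nodes: int) -> str:
    """Classify a node id for a cluster with num_nodes slave nodes."""
    if node_id == FRONT_END:
        return "front end"
    if 0 <= node_id < num_nodes:
        return f"slave node {node_id}"
    return "invalid"


if __name__ == "__main__":
    # A two-slave cluster: valid ids are -1, 0 and 1.
    for node_id in (-1, 0, 1, 2):
        print(node_id, describe_node(node_id, 2))
```

Under this convention a master whose hostname is not node0 is expected,
since node 0 is the first slave rather than the front end.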