[bproc] MPI chokes
Arthur H. Edwards,1,505-853-6042,505-256-0834
edwards at icantbelieveimdoingthis.com
Thu Mar 15 08:15:48 PST 2001
Jag wrote:
> On Thu, 15 Mar 2001, Arthur H. Edwards,1,505-853-6042,505-256-0834 wrote:
>
>
>> Erik Arjan Hendriks wrote:
>>
>>
>>> On Wed, Mar 14, 2001 at 04:44:29PM -0700, Art Edwards wrote:
>>>
>>>
>>>> I've installed Scyld on a small cluster and I'm trying to
>>>> run the test programs that come with beompi.
>>>>
>>>> The codes run on one node. However, when I try to run
>>>> on multiple nodes, I get the following error:
>>>>
>>>> jarrett/home/edwardsa>mpirun -np 2 pi3p
>>>> p0_28682: p4_error: net_create_slave: bproc_rfork: -1
>>>> p4_error: latest msg from perror: Invalid argument
>>>> jarrett/home/edwardsa>bm_list_28683: p4_error: interrupt SIGINT: 2
>>>>
>>>
> <snip>
>
>>> BProc doesn't use any host names anywhere so nothing involving
>>> hostnames will affect whether or not an rfork works.
>>>
>>> There's some other MPI issue going on here.
>>>
>>> - Erik
>>>
>>
>> Thanks for the reply. The program dies in the PMPI_INIT phase. What
>> should I be doing to figure this out?
>
> Based on the error messages from your previous message, it looks like it
> is trying to rfork to a node that is down. What does the output of
> 'bpstat' on your cluster look like?
>
>
> Jag
Here is the output from bpstat:
jarrett/home/edwardsa>bpstat
Node  Address        Status
0     192.168.1.100  up
1     192.168.1.101  up
2     192.168.1.102  up
3     192.168.1.103  up
4     192.168.1.104  up
5     192.168.1.105  up
6     192.168.1.106  up
7     192.168.1.107  down
8     192.168.1.108  down
9     192.168.1.109  down
10    192.168.1.110  down
11    192.168.1.111  down
12    192.168.1.112  down
13    192.168.1.113  down
14    192.168.1.114  down
15    192.168.1.115  down
16    192.168.1.116  down
17    192.168.1.117  down
18    192.168.1.118  down
19    192.168.1.119  down
20    192.168.1.120  down
21    192.168.1.121  down
22    192.168.1.122  down
23    192.168.1.123  down
24    192.168.1.124  down
25    192.168.1.125  down
26    192.168.1.126  down
27    192.168.1.127  down
28    192.168.1.128  down
29    192.168.1.129  down
30    192.168.1.130  down
31    192.168.1.131  down
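
For anyone who wants to check a listing like this from a script rather than by eye, here is a minimal sketch in Python. It assumes the three-column "Node Address Status" layout shown above; other bpstat versions may print something different. The function names parse_bpstat and up_nodes are made up for illustration and are not part of BProc or Scyld.

```python
def parse_bpstat(text):
    """Parse 'Node Address Status' style text into (node, address, status) tuples.

    Assumes the three-column layout pasted in this message; the function
    name is hypothetical, not part of any BProc tool.
    """
    rows = []
    for line in text.splitlines():
        fields = line.split()
        # Skip the header, shell prompts, and anything that is not a
        # three-column row starting with a node number.
        if len(fields) != 3 or not fields[0].isdigit():
            continue
        rows.append((int(fields[0]), fields[1], fields[2]))
    return rows


def up_nodes(rows):
    """Return the node numbers whose status column reads 'up'."""
    return [node for node, _addr, status in rows if status == "up"]


# A shortened sample in the same layout as the listing above.
sample = """\
Node Address Status
0 192.168.1.100 up
1 192.168.1.101 up
7 192.168.1.107 down
"""

rows = parse_bpstat(sample)
print("up nodes:", up_nodes(rows))
print("down count:", sum(1 for _n, _a, status in rows if status == "down"))
```

To use it on a live system, one could pipe the text of 'bpstat' into parse_bpstat instead of the sample string. The sketch only reports what the listing says; it does not by itself explain why bproc_rfork fails.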
Art Edwards