[bproc] MPI chokes

Jag agrajag@linuxpower.org
Thu, 15 Mar 2001 08:10:41 -0800


On Thu, 15 Mar 2001, Arthur H. Edwards wrote:

> > Based on the error messages from your previous message, it looks like it
> > is trying to rfork to a node that is down.  What does the output of
> > 'bpstat' on your cluster look like?
> >
> >
> > Jag
>
> Here is the output from bpstat
>
> jarrett/home/edwardsa>bpstat
> Node    Address         Status
> 0       192.168.1.100   up
> 1       192.168.1.101   up
> 2       192.168.1.102   up
> 3       192.168.1.103   up
> 4       192.168.1.104   up
> 5       192.168.1.105   up
> 6       192.168.1.106   up
> 7       192.168.1.107   down
> 8       192.168.1.108   down
> 9       192.168.1.109   down

<snip>

OK. You seem to be running Scyld's PREVIEW release (27BZ-6).  At the
end of January, Scyld put out an actual release (27BZ-7).  The 27BZ-7
release included updated software, including updates to beompi,
which is Scyld's MPI package.

I never tried to run MPI programs on the preview release, but my guess
is that it is getting confused by all the "down" nodes.  I've played
with MPI on the 27BZ-7 release and have had no problems when there were
down nodes, so I recommend that you upgrade to the latest release.
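
If you want a quick way to see which nodes are up before launching a
job, something like the following may help. It is only a rough sketch:
it assumes bpstat prints the three-column "Node / Address / Status"
layout shown in your message, and it works on a pasted copy of that
text rather than calling bpstat itself. The function names
(parse_bpstat, nodes_with_status) are made up for this example and are
not part of bproc or beompi.

```python
# Sketch: split bpstat-style output into "up" and "down" node lists.
# Assumption: the layout matches the listing quoted in this thread
# (a header row, then "<node> <address> <status>" per line).

SAMPLE = """\
Node    Address         Status
0       192.168.1.100   up
1       192.168.1.101   up
2       192.168.1.102   up
3       192.168.1.103   up
4       192.168.1.104   up
5       192.168.1.105   up
6       192.168.1.106   up
7       192.168.1.107   down
8       192.168.1.108   down
9       192.168.1.109   down
"""


def parse_bpstat(text):
    """Return {node_number: (address, status)} from bpstat-style text."""
    nodes = {}
    for line in text.splitlines():
        fields = line.split()
        # Skip the header row and any line that is not
        # "<integer> <address> <status>".
        if len(fields) != 3 or not fields[0].isdigit():
            continue
        nodes[int(fields[0])] = (fields[1], fields[2])
    return nodes


def nodes_with_status(nodes, status):
    """Return the sorted node numbers whose status equals `status`."""
    return sorted(n for n, (_addr, s) in nodes.items() if s == status)


if __name__ == "__main__":
    parsed = parse_bpstat(SAMPLE)
    print("up:  ", nodes_with_status(parsed, "up"))
    print("down:", nodes_with_status(parsed, "down"))
```

On the ten lines quoted above this reports nodes 0 through 6 as up and
7 through 9 as down.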

Also, the reason you have so many "down" nodes is that you gave it a
large IP range to use for slave nodes.  If you want fewer "down" nodes
(which are really just nodes that don't exist), run the beosetup
program, click on preferences, and adjust the IP range so that it
contains exactly as many IPs as you have slave nodes.
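
To work out where the range should end, the arithmetic is just "first
address plus (number of nodes minus one)". Here is a small sketch of
that using Python's standard ipaddress module. The helper name
slave_ip_range is hypothetical, and the count of 7 is only my
assumption based on nodes 0 through 6 showing "up" in the part of the
listing you posted; use your real node count.

```python
import ipaddress


def slave_ip_range(first_ip, node_count):
    """Return (first, last) addresses of a range holding node_count IPs."""
    if node_count < 1:
        raise ValueError("node_count must be at least 1")
    first = ipaddress.IPv4Address(first_ip)
    # The range is inclusive, so the last address is count - 1 past the first.
    last = first + (node_count - 1)
    return str(first), str(last)


if __name__ == "__main__":
    # Assumed values: first slave address from the bpstat listing,
    # and 7 slave nodes (nodes 0-6).
    print(slave_ip_range("192.168.1.100", 7))
    # prints ('192.168.1.100', '192.168.1.106')
```

Those two addresses are what you would then enter as the range in the
beosetup preferences.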

Hope this helps,


Jag
