[Beowulf] perl with OpenMPI gotcha?

David Mathog mathog at caltech.edu
Fri Nov 20 13:43:08 PST 2020


I'm hoping one of you has been to the end of this road already and can 
point out what is going wrong.

I have some perl scripts, carried along for a couple of decades now, 
which use PVM to start simple jobs on the compute nodes, wait for them 
to finish (listing jobs as they close out), and then clean up.  Since 
this is the only thing PVM is used for, it seemed like it might be 
(way past) time to migrate that to MPI, specifically OpenMPI 4.0.1, 
which is what is on the cluster.
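
For context, the pattern those scripts implement is roughly the one 
sketched below.  This is only a bash illustration with local stand-in 
jobs, not the actual PVM code: the function name run_jobs and the job 
commands are made up for the example, and on the cluster each command 
would be started on a compute node rather than locally.

```shell
#!/bin/bash
# Hypothetical sketch of the start / wait / report / clean-up pattern.
# run_jobs and the sample commands are illustrative names only.

run_jobs() {
    # Each argument is one command line, run as a separate background job.
    local workdir
    workdir=$(mktemp -d) || return 1
    local i=0 cmd
    for cmd in "$@"; do
        i=$((i + 1))
        (
            # Run the job, then report as soon as it closes out.
            bash -c "$cmd" > "$workdir/job$i.out" 2>&1
            echo "job $i finished with status $?"
        ) &
    done
    wait                  # block until every background job has exited
    rm -rf "$workdir"     # clean up the scratch directory
    echo "all $i jobs done"
}

# Stand-in jobs; completion messages appear in the order jobs finish.
run_jobs "sleep 0.2" "true" "false"
```

The per-job messages come out in completion order, not launch order, 
which is the "listing jobs as they close out" behavior described above.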

Apparently there are tricks required, or the test script does not run 
on a single standalone machine, or perhaps OpenMPI is not configured 
right.

There are already modules for OpenMPI and bioperl, and I decided to 
install Parallel::MPI::Simple into the latter, since it holds all the perl
modules which were not installed with dnf on this CentOS 8 system.  Like so:

   module load bioperl
   module load OpenMPI
   cd /usr/common/src/perl_modules
   cpanm -l $ROOT_BIOPERL Parallel::MPI::Simple 2>&1 \
     | tee install_perl_parallel_mpi_simple_2020_11_20.log

(no errors or warnings).

There is a little test program "ic.pl" which comes with 
Parallel::MPI::Simple; however, just invoking it reveals that it cannot 
find Simple.so.  I have been down this road before with Perl and MPI 
with the "Maker" program - some libraries must be preloaded or they 
just will not be found by Perl.  Once that is done all the missing 
library and symbol errors go away.  But it still does not run:

LD_PRELOAD=/usr/common/modules/el8/x86_64/software/bioperl/1.7.7-CentOS-vanilla/lib/perl5/x86_64-linux-thread-multi/auto/Parallel/MPI/Simple/Simple.so:/opt/ompi401/lib/libmpi.so \
  $ROOT_BIOPERL/lib/perl5/x86_64-linux-thread-multi/Parallel/MPI/ic.pl
[poweredge:04423] *** An error occurred in MPI_Send
[poweredge:04423] *** reported by process [603979777,0]
[poweredge:04423] *** on communicator MPI_COMM_WORLD
[poweredge:04423] *** MPI_ERR_RANK: invalid rank
[poweredge:04423] *** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
[poweredge:04423] ***    and potentially your MPI job)


Any idea what might be wrong here?

Also, searching turned up very little information on using MPI with perl.
(Lots on using MPI with other languages of course.)
The Parallel::MPI::Simple module is itself almost a decade old.
We have a batch manager but I would prefer not to use it in this case.
Is there some library/method other than MPI which people typically use 
these days for this sort of compute cluster process control with Perl 
from the head node?

Thanks,

David Mathog




