[Beowulf] perl with OpenMPI gotcha?

Fri Nov 20 13:58:10 PST 2020

On 11/20/20 4:43 PM, David Mathog wrote:

[...]

>
> Also, searching turned up very little information on using MPI with perl.
> (Lots on using MPI with other languages of course.)
> The Parallel::MPI::Simple module is itself almost a decade old.
> We have a batch manager but I would prefer not to use it in this case.
> Is there some library/method other than MPI which people typically use 
> these days for this sort of compute cluster process control with Perl 
> from the head node?

I can't say I've ever used Perl and MPI.  I suppose it is doable, but if 
you were doing it, I'd recommend encapsulating it with FFI::Platypus 
(https://metacpan.org/pod/FFI::Platypus).

This, however, doesn't seem to be your problem per se.  Your problem 
sounds like "how do I launch a script on N compute nodes at once, and 
wait for it to complete".

If I have that correct, then you want to learn about pdsh 
(https://github.com/chaos/pdsh and info here: 
https://www.rittmanmead.com/blog/2014/12/linux-cluster-sysadmin-parallel-command-execution-with-pdsh/ 
).

I write most of my admin scripts in perl, and you can use pdsh as a 
function within them.
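
As a rough sketch of what "pdsh as a function" can look like: the sub 
names below (pdsh_run, parse_pdsh_output) are made up for illustration and 
are not part of pdsh.  The sketch assumes pdsh is on the PATH of the head 
node and that it prefixes each output line with "host: ", which is its 
default behaviour.  Only stdout is captured; pdsh's own error messages go 
to stderr.

```perl
use strict;
use warnings;

# Hypothetical helper: split pdsh's "host: text" lines into
# a hash of host => [ lines of output from that host ].
sub parse_pdsh_output {
    my (@lines) = @_;
    my %by_host;
    for my $line (@lines) {
        chomp $line;
        next unless $line =~ /^([^:\s]+): ?(.*)$/;
        push @{ $by_host{$1} }, $2;
    }
    return \%by_host;
}

# Hypothetical helper: run one command on a list of nodes via pdsh
# and return the parsed output.  The list form of open avoids a
# local shell, so the remote command is passed through untouched.
sub pdsh_run {
    my ($nodes, $remote_cmd) = @_;
    my @cmd = ('pdsh', '-w', join(',', @$nodes), $remote_cmd);
    open(my $pdsh, '-|', @cmd) or die "cannot run pdsh: $!\n";
    my @lines = <$pdsh>;
    close($pdsh);
    return parse_pdsh_output(@lines);
}
```

A call such as pdsh_run(\@nodes, 'grep o2ib /proc/mounts') would then 
return a hash reference keyed by node name, blocking until pdsh has 
finished on all nodes.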

However ...

MCE::Loop is your friend.

Combine that with something like this:

     my $mounts = `ssh -o ConnectTimeout=20 $node grep o2ib /proc/mounts`;

and you can get pdsh-like control directly in Perl without invoking pdsh.

The general template looks like this:

    #!/usr/bin/env perl

    use strict;
    use warnings;
    use MCE::Loop;

    MCE::Loop->init(
        max_workers => 25, chunk_size => 1
    );

    # file with one node name per line
    my $nfile = shift or die "usage: $0 nodefile\n";

    # grab file contents into @nodes array
    open(my $fh, '<', $nfile) or die "cannot open $nfile: $!\n";
    chomp(my @nodes = <$fh>);
    close($fh);

    # looping over nodes, max_workers at a time
    mce_loop {
        my ($mce, $chunk_ref, $chunk_id) = @_;
        # do stuff to node $_
    } @nodes;

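This will run up to 25 copies (max_workers) of the loop body in parallel, 
each handling one node at a time (chunk_size => 1), until the @nodes 
array is exhausted.  The mce_loop call returns once every node has been 
processed, which gives you the "wait for it to complete" part.  
Incorporate the ssh bit above in the "do stuff" area, and you get 
basically what I think you are asking for.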

FWIW, I've been using this pattern for a few years, most recently on 
large supers over the past few months.
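
Putting the two pieces together, a minimal sketch might look like the 
following.  The sub names (ssh_command, run_on_nodes) and the BatchMode 
option are additions made for illustration, not part of the template 
above.  It assumes passwordless ssh to the nodes and uses MCE's gather 
to send each node's exit status and output back to the parent process.

```perl
use strict;
use warnings;
use MCE::Loop;

# Hypothetical helper: build the command run for one node.
# BatchMode=yes makes ssh fail instead of prompting for a password.
sub ssh_command {
    my ($node, $remote_cmd) = @_;
    return ('ssh', '-o', 'BatchMode=yes', '-o', 'ConnectTimeout=20',
            $node, $remote_cmd);
}

# Hypothetical helper: run one command per node, max_workers at a
# time, and return node => { status => ..., output => ... }.
# $builder is a code ref that maps a node name to a command list.
sub run_on_nodes {
    my ($nodes, $builder, $max_workers) = @_;

    MCE::Loop->init(max_workers => $max_workers // 25, chunk_size => 1);

    my %result = mce_loop {
        my $node = $_;              # chunk_size => 1, so $_ is one node
        my @cmd  = $builder->($node);
        my ($status, $output) = (-1, '');   # -1: command could not start
        if (open(my $pipe, '-|', @cmd)) {   # list form: no local shell
            local $/;                       # slurp all output at once
            $output = <$pipe> // '';
            close($pipe);
            $status = $? >> 8;              # exit status of the command
        }
        MCE->gather($node, { status => $status, output => $output });
    } @$nodes;

    MCE::Loop->finish;
    return \%result;
}
```

Calling run_on_nodes(\@nodes, sub { ssh_command($_[0], 'grep o2ib 
/proc/mounts') }) would then block until every node has answered or 
timed out.  Gathered results arrive in no particular order, which is why 
they are keyed by node name.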

>
> Thanks,
>
> David Mathog

-- 
Joe Landman
e: joe.landman at gmail.com
t: @hpcjoe
w: https://scalability.org
g: https://github.com/joelandman
l: https://www.linkedin.com/in/joelandman
