[Beowulf] [EXTERNAL] Re: perl with OpenMPI gotcha?

Benson Muite benson_muite at emailplus.org
Sat Nov 21 03:59:32 PST 2020


GNU Parallel (http://www.gnu.org/software/parallel/) might allow for
similar workflows.
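
A rough sketch of what that could look like, assuming passwordless ssh to
every node and a plain text file with one hostname per line. The function
name, the PARALLEL hook and the MAX_JOBS default are placeholders of mine,
not anything from this thread; -j, --tag, {} and :::: are standard GNU
Parallel syntax.

```shell
#!/bin/sh
# Sketch: run one command on every node listed in a file via GNU Parallel.
PARALLEL="${PARALLEL:-parallel}"   # hypothetical hook, override for dry runs
MAX_JOBS="${MAX_JOBS:-25}"         # hypothetical default job limit

parallel_on_nodes() {
    nodefile=$1; shift
    # -j N   : run at most N jobs at once
    # --tag  : prefix each output line with the node it came from
    # {}     : replaced by one line of the node file
    # ::::   : read the arguments from the file that follows
    "$PARALLEL" -j "$MAX_JOBS" --tag ssh {} "$@" :::: "$nodefile"
}
```

Usage would be something like `parallel_on_nodes nodes.txt uptime`, which
returns once every job has finished.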

On 11/21/20 3:56 AM, Lux, Jim (US 7140) via Beowulf wrote:
> If Joe has interpreted your need correctly, I’ll second the suggestion 
> of pdsh – it’s simple, it works pretty well, it’s “transport” 
> independent (I use it to manage a cluster of beagleboards over WiFi).  
> Typically I wind up with a shell script on the head node and some shell 
> scripts on the worker nodes, and the head node script fires pdsh, which 
> starts the worker bee scripts.
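
A head-node script along those lines might look roughly like this. The node
list, the worker path and the PDSH hook are made-up placeholders, and I am
assuming ssh as the transport here (-R selects the pdsh rcmd module, -w the
target hosts).

```shell
#!/bin/sh
# Head-node launcher sketch: pdsh starts the same worker script everywhere.
PDSH="${PDSH:-pdsh}"                          # hypothetical hook for dry runs
NODES="${NODES:-bb[01-04]}"                   # hypothetical pdsh hostlist
WORKER="${WORKER:-/usr/local/bin/worker.sh}"  # hypothetical script on each node

launch_workers() {
    # pdsh prefixes each output line with "host: " and returns once the
    # remote command has finished on every node.
    "$PDSH" -R ssh -w "$NODES" "$WORKER" "$@"
}
```

Calling `launch_workers` with extra arguments passes them through to the
worker script on each node.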
> 
> *From: *Beowulf <beowulf-bounces at beowulf.org> on behalf of Joe Landman 
> <joe.landman at gmail.com>
> *Date: *Friday, November 20, 2020 at 2:03 PM
> *To: *"beowulf at beowulf.org" <beowulf at beowulf.org>
> *Subject: *[EXTERNAL] Re: [Beowulf] perl with OpenMPI gotcha?
> 
> On 11/20/20 4:43 PM, David Mathog wrote:
> 
> [...]
> 
> 
>     Also, searching turned up very little information on using MPI with
>     perl.
>     (Lots on using MPI with other languages of course.)
>     The Parallel::MPI::Simple module is itself almost a decade old.
>     We have a batch manager but I would prefer not to use it in this case.
>     Is there some library/method other than MPI which people typically
>     use these days for this sort of compute cluster process control with
>     Perl from the head node?
> 
> I can't say I've ever used Perl and MPI.  I suppose it is doable, but if 
> you were doing it, I'd recommend encapsulating it with FFI::Platypus 
> (https://metacpan.org/pod/FFI::Platypus).
> 
> This, however, doesn't seem to be your problem per se.  Your problem 
> sounds like "how do I launch a script on N compute nodes at once, and 
> wait for it to complete".
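
Stated that way, the problem can be sketched with nothing but background
jobs and wait. This is only an illustration: run_on_all and the RUNNER hook
are hypothetical names, ssh with the ConnectTimeout option is an assumed
transport, and there is no limit on how many jobs start at once.

```shell
#!/bin/sh
# Sketch: start one job per node in the background, then wait for them all.
# RUNNER is a hypothetical hook; anything that accepts "<node> <command...>"
# can be substituted for ssh, which is handy for dry runs.
RUNNER="${RUNNER:-ssh -o ConnectTimeout=20}"

run_on_all() {
    nodefile=$1; shift
    pids=""
    while IFS= read -r node; do
        [ -n "$node" ] || continue           # skip blank lines
        # stdin comes from /dev/null so the job cannot eat the node list
        $RUNNER "$node" "$@" </dev/null &
        pids="$pids $!"
    done < "$nodefile"

    failed=0
    for pid in $pids; do
        wait "$pid" || failed=1              # remember any non-zero exit
    done
    return "$failed"
}
```

`run_on_all nodes.txt hostname` returns 0 only if the command succeeded on
every node listed in nodes.txt.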
> 
> If I have that correct, then you want to learn about pdsh 
> (https://github.com/chaos/pdsh 
> and info here: 
> https://www.rittmanmead.com/blog/2014/12/linux-cluster-sysadmin-parallel-command-execution-with-pdsh/ 
> ).
> 
> I write most of my admin scripts in perl, and you can use pdsh as a 
> function within them.
> 
> However ...
> 
> MCE::Loop is your friend.
> 
> Combine that with something like this:
> 
>      $mounts=`ssh -o ConnectTimeout=20 $node grep o2ib /proc/mounts`;
> 
> and you can get pdsh-like control directly in Perl without invoking pdsh.
> 
> The general template looks like this:
> 
>     #!/usr/bin/env perl
> 
>     use strict;
>     use warnings;
>     use MCE::Loop;
> 
>     MCE::Loop->init(
>         max_workers => 25, chunk_size => 1
>     );
> 
>     my $nfile = shift or die "usage: $0 nodefile\n";
> 
>     # read the node list, one hostname per line
>     open(my $fh, '<', $nfile) or die "cannot open $nfile: $!\n";
>     chomp(my @nodes = <$fh>);
>     close($fh);
> 
>     # loop over the nodes, at most max_workers at a time
>     mce_loop {
>         my ($mce, $chunk_ref, $chunk_id) = @_;
>         my $node = $_;    # chunk_size => 1, so $_ is a single node
>         # do stuff to node $node
>     } @nodes;
> 
> This will run 25 copies (max_workers) of the loop body over the @nodes 
> array.  Incorporate the ssh bit above in the #do stuff area, and you get 
> basically what I think you are asking for.
> 
> FWIW, I've been using this pattern for a few years, most recently on 
> large supers over the past few months.
> 
> 
>     Thanks,
> 
>     David Mathog
> 
> 
> 
>     _______________________________________________
>     Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
>     To change your subscription (digest mode or unsubscribe) visit
>     https://beowulf.org/cgi-bin/mailman/listinfo/beowulf
> 
> 
> -- 
> 
> Joe Landman
> 
> e:joe.landman at gmail.com
> 
> t: @hpcjoe
> 
> w:https://scalability.org
> 
> g:https://github.com/joelandman
> 
> l:https://www.linkedin.com/in/joelandman
> 
> 
> 


