[Beowulf] multi-threading vs. MPI

Kozin, I (Igor) i.kozin at dl.ac.uk
Tue Dec 11 07:44:14 PST 2007


I'm no expert in CPMD, but
- AFAIK CPMD is not meant to run as pure OpenMP; the OpenMP parallelization is auxiliary to MPI.
- In the paper "Dual-level parallelism for ab initio molecular dynamics: Reaching teraflop performance with the CPMD code", the developers (Jürg Hutter and Alessandro Curioni) compare the performance of pure MPI and mixed MPI+OpenMP; while pure MPI wins on small problem sizes, MPI+OpenMP is more efficient on the largest problem.
You can argue whether one could have written a more efficient pure MPI code, or whether there was a problem with the IBM pSeries 690 cluster, but it seems quite obvious that in order to achieve good scaling on 1000s of cores the programming difficulties are bound to advance to a new level, and complexity will generally grow.
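For what it's worth, the dual-level structure can be sketched in a few lines of C. The kernel below (a sum of squares over a chunked array, with the size N and the FUNNELED threading level chosen arbitrarily) is made up for illustration and has nothing to do with CPMD itself; it only shows the division of labour: MPI ranks own coarse chunks and do the inter-node communication, while OpenMP threads share the loop work inside each rank.

#include <mpi.h>
#include <omp.h>
#include <stdio.h>
#include <stdlib.h>

#define N 1000000   /* total problem size (arbitrary, for the sketch only) */

int main(int argc, char **argv)
{
    int provided, rank, nranks;

    /* FUNNELED: only the master thread makes MPI calls */
    MPI_Init_thread(&argc, &argv, MPI_THREAD_FUNNELED, &provided);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &nranks);

    /* coarse level: each MPI rank owns a contiguous chunk of the data
       (remainder ignored for brevity) */
    int chunk = N / nranks;
    double *x = malloc(chunk * sizeof(double));
    double local = 0.0, global = 0.0;

    /* fine level: OpenMP threads split the rank's chunk among themselves */
    #pragma omp parallel for reduction(+:local)
    for (int i = 0; i < chunk; i++) {
        x[i] = (double)(rank * chunk + i);
        local += x[i] * x[i];
    }

    /* back on the master thread: inter-node communication via MPI */
    MPI_Reduce(&local, &global, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);
    if (rank == 0)
        printf("%d ranks x %d threads, sum of squares = %g\n",
               nranks, omp_get_max_threads(), global);

    free(x);
    MPI_Finalize();
    return 0;
}

Typically such a code would be compiled with something like mpicc -fopenmp and launched with one MPI rank per node and OMP_NUM_THREADS set to the number of cores per node.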


-----Original Message-----
From: beowulf-bounces at beowulf.org [mailto:beowulf-bounces at beowulf.org] On Behalf Of Martin Siegert
Sent: 11 December 2007 00:28
To: Jones de Andrade
Cc: beowulf at beowulf.org
Subject: Re: [Beowulf] multi-threading vs. MPI

Hi Jones,

On Sat, Dec 08, 2007 at 06:50:29PM -0300, Jones de Andrade wrote:
> 
>    Hi all.
>    I usually just keep an eye on this list, but this discussion really
>    caught my attention.
>    Could someone clarify a bit better *why* OpenMP would be such a bad
>    performer in comparison to MPI?
>    Moreover, concerning your tests, Dr. Siegert, could you please show us
>    a bit more? I mean, for example, the scaling you observed when
>    increasing the number of cores.

As I said, I will post the full results in a few weeks, we are not quite
done yet.

>    I really don't immediately understand how OpenMP can perform so much
>    worse than MPI on an SMP machine, given that not having to communicate
>    with the other 31 cores (in the CPMD case, all the huge matrices that
>    would have to be exchanged) should at least make things a bit easier.

The way code is parallelized is different when using MPI and OpenMP.
OpenMP code usually uses loop-level parallelization, i.e., it is
very fine grained. In the cases that I have mentioned, MPI uses domain
decomposition - very coarse grained.
All I am saying is that, for the codes I have seen, the MPI way of
parallelizing the code is by far more efficient than the OpenMP way.
That does not mean that you could not use domain decomposition with
OpenMP; it just appears that this is usually not done. I speculate
that this may be a consequence of the (perceived?) ease of programming
with OpenMP. If you do domain decomposition, you probably end up
writing code that looks very similar to the MPI code even if you use
OpenMP.
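
To make the distinction concrete, here is a toy illustration (a 1D Jacobi-style relaxation invented for this example; it is not code from CPMD or WRF, and the function names are made up). The first version is the typical fine-grained, loop-level OpenMP style; the second does a coarse-grained decomposition in OpenMP, where each thread owns a fixed slab for the whole run.

#include <omp.h>

/* (a) Loop-level (fine-grained) OpenMP: parallelism lives only inside
 *     the loop; threads fork, split the iterations, and join at the
 *     loop's implicit barrier. */
void sweep_loop_level(double *unew, const double *uold,
                      const double *f, int n)
{
    #pragma omp parallel for
    for (int i = 1; i < n - 1; i++)
        unew[i] = 0.5 * (uold[i - 1] + uold[i + 1]) + f[i];
}

/* (b) Coarse-grained OpenMP "domain decomposition": each thread owns a
 *     fixed slab of the grid for all sweeps, much like an MPI rank
 *     would; the barrier plays the role of the halo exchange/sync. */
void sweep_domain_decomp(double *unew, double *uold,
                         const double *f, int n, int sweeps)
{
    #pragma omp parallel
    {
        int t  = omp_get_thread_num();
        int nt = omp_get_num_threads();
        int lo = 1 + t * (n - 2) / nt;        /* this thread's slab   */
        int hi = 1 + (t + 1) * (n - 2) / nt;

        for (int s = 0; s < sweeps; s++) {
            for (int i = lo; i < hi; i++)
                unew[i] = 0.5 * (uold[i - 1] + uold[i + 1]) + f[i];
            #pragma omp barrier               /* "halo exchange" point */
            #pragma omp single
            { double *tmp = uold; uold = unew; unew = tmp; }
            /* implicit barrier after single keeps the sweeps in step */
        }
    }
}

Version (b) already mirrors what an MPI rank does with its own subdomain: replace the barrier with a halo exchange of the slab edges (e.g. MPI_Sendrecv) and the shared-pointer swap with per-rank buffers, and you essentially have the MPI code.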

Cheers,
Martin

-- 
Martin Siegert
Head, Research Computing
WestGrid Site Lead
Academic Computing Services                phone: 778 782-4691
Simon Fraser University                    fax:   778 782-4242
Burnaby, British Columbia                  email: siegert at sfu.ca
Canada  V5A 1S6

>    But all that is from the point of view of a young "amateur".  ;)
>    Thanks a lot in advance,
>    Jones
> 
>    On Dec 8, 2007 5:55 PM, Martin Siegert <[1] siegert at sfu.ca> wrote:
> 
>      Over the last months I have done quite a bit of benchmarking of
>      applications. One of the aspects we are interested in is the
>      performance of applications that are available in MPI, OpenMP and
>      hybrid versions. So far we have looked at WRF and CPMD; we'll
>      probably look at POP as well.
>      MPI vs. OpenMP on an SMP (64-core Power5):
>      walltime for the cpmd benchmark on 32 cores:
>      MPI: 93.13s   OpenMP: 446.86s
>      Results for WRF on the same platform are similar.
>      In short: the performance of the OpenMP code isn't even close to
>      that of the MPI code.
>      We also looked at the hybrid versions of these codes on clusters.
>      The differences in run times are in the 1% range - less than the
>      accuracy of the measurement.
>      Thus, if you have the choice, why would you even look at anything
>      other than MPI? Even if the programming effort for OpenMP is lower,
>      the performance penalty is huge.
>      That's my conclusion drawn from the cases we've looked at.
>      If anybody knows of applications where the OpenMP performance comes
>      close to the MPI performance, and of applications where the hybrid
>      performance is significantly better than the pure MPI performance,
>      then I would love to hear from you. Thanks!
>      Cheers,
>      Martin
>      --
>      Martin Siegert
>      Head, Research Computing
>      WestGrid Site Lead
>      Academic Computing Services                phone: 778 782-4691
>      Simon Fraser University                    fax:   778 782-4242
>      Burnaby, British Columbia                  email: [2]siegert at sfu.ca
>      Canada  V5A 1S6
>      On Fri, Dec 07, 2007 at 10:56:14PM -0600, Gerry Creager wrote:
>      > WRF has been under development for 10 years.  It's got an OpenMP
>      > flavor, an MPI flavor and a hybrid one.  We still don't have all
>      > the bugs worked out of the hybrid so that it can handle large,
>      > high-resolution domains without being slower than the MPI version.
>      > And, yeah, the OpenMP geeks working on this... and the MPI folks,
>      > are good.
>      >
>      > Hybrid isn't easy and isn't always foolproof.  And, as another
>      > thought, OpenMP isn't always the best solution to the problem.
>      >
>      > gerry
>      >
>      > [3]richard.walsh at comcast.net wrote:
>      > > -------------- Original message ----------------------
>      > >From: Toon Knapen <[4]toon.knapen at gmail.com>
>      > >>Greg Lindahl wrote:
>      > >>>In real life (i.e. not HPC), everyone uses message passing
>      > >>>between nodes.  So I don't see what you're getting at.
>      > >>>
>      > >>Many on this list suggest that using multiple MPI processes on
>      > >>one and the same node is superior to MT approaches, IIUC.
>      > >>However, I have the impression that almost the whole industry is
>      > >>looking into MT to benefit from multi-core without even
>      > >>considering message passing. Why is that so?
>      > >
>      > >I think what Greg and others are really saying is that if you
>      > >have to use a distributed memory model (MPI) as a first-order
>      > >response to meet your scalability requirements, then the extra
>      > >coding effort and complexity required to create a hybrid code
>      > >may not be a good performance return on your investment.  If on
>      > >the other hand you only need to scale within a single SMP node
>      > >(with cores and sockets on a single board growing in number,
>      > >this returns more performance than in the past), then you may be
>      > >able to avoid using MPI and choose a simpler model like OpenMP.
>      > >If you have already written an efficient MPI code, then (with
>      > >some exceptions) the performance gain divided by the hybrid
>      > >coding effort may seem small.
>      > >
>      > >Development in an SMP environment is easier.  I know of a number
>      > >of sites that work this way.  The experienced algorithm folks
>      > >work up the code in OpenMP on, say, an SGI Altix or Power6 SMP,
>      > >then they get a dedicated MPI coding expert to convert it later
>      > >for scalable production operation on a cluster.  In this
>      > >situation, they do end up with hybrid versions in some cases.
>      > >In non-HPC or smaller workgroup contexts your production code
>      > >may not need to be converted.
>      > >
>      > >Cheers,
>      > >
>      > >rbw
>      > >
>      > >--
>      > >
>      > >"Making predictions is hard, especially about the future."
>      > >
>      > >Niels Bohr
>      > >
>      > >--
>      > >
>      > >Richard Walsh
>      > >Thrashing River Consulting--
>      > >5605 Alameda St.
>      > >Shoreview, MN 55126
>      > >
>      > >Phone #: 612-382-4620
>      > >
>      > >_______________________________________________
>      > >Beowulf mailing list, [5]Beowulf at beowulf.org
>      > >To change your subscription (digest mode or unsubscribe) visit
>      > >[6]http://www.beowulf.org/mailman/listinfo/beowulf
>      >
>      > --
>      > Gerry Creager -- [7]gerry.creager at tamu.edu
>      > Texas Mesonet -- AATLT, Texas A&M University
>      > Cell: 979.229.5301 Office: 979.458.4020 FAX: 979.862.3983
>      > Office: 1700 Research Parkway Ste 160, TAMU, College Station, TX 77843
>      > _______________________________________________
>      > Beowulf mailing list, [8]Beowulf at beowulf.org
>      > To change your subscription (digest mode or unsubscribe) visit
>      > [9]http://www.beowulf.org/mailman/listinfo/beowulf
>      --
>      Martin Siegert
>      Head, Research Computing
>      WestGrid Site Lead
>      Academic Computing Services                phone: 778 782-4691
>      Simon Fraser University                    fax:   778 782-4242
>      Burnaby, British Columbia                  email: [10]siegert at sfu.ca
>      Canada  V5A 1S6
>      _______________________________________________
>      Beowulf mailing list, [11]Beowulf at beowulf.org
>      To  change  your  subscription  (digest  mode or unsubscribe) visit
>      [12]http://www.beowulf.org/mailman/listinfo/beowulf
> 
> References
> 
>    1. mailto:siegert at sfu.ca
>    2. mailto:siegert at sfu.ca
>    3. mailto:richard.walsh at comcast.net
>    4. mailto:toon.knapen at gmail.com
>    5. mailto:Beowulf at beowulf.org
>    6. http://www.beowulf.org/mailman/listinfo/beowulf
>    7. mailto:gerry.creager at tamu.edu
>    8. mailto:Beowulf at beowulf.org
>    9. http://www.beowulf.org/mailman/listinfo/beowulf
>   10. mailto:siegert at sfu.ca
>   11. mailto:Beowulf at beowulf.org
>   12. http://www.beowulf.org/mailman/listinfo/beowulf

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf



