[Beowulf] Really efficient MPIs??

Nathan Moore ntmoore at gmail.com
Wed Nov 28 08:21:02 PST 2007


I've not tried their respective MPI libraries, but as a general rule, the
people who manufacture the chips have the best idea of how to optimize a
given library.  (There are obvious counter-examples, such as GotoBLAS and
FFTW.)

That said, have you tried for Intel:
http://www.intel.com/cd/software/products/asmo-na/eng/308295.htm

or for AMD:  http://developer.amd.com/devtools.jsp (they link to HP's MPI)

As a side note, IBM uses a slightly modified version of MPICH for Blue Gene.

Nathan


On Nov 28, 2007 9:48 AM, Christian Bell <christian.bell at qlogic.com> wrote:

> But the main point with MPI implementations, more than usual with
> shared memory, is to run your application.
>
> For 2 different MPI shared-memory implementations that show equal
> performance on point-to-point microbenchmarks, you can measure very
> different performance in applications (mostly at the bandwidth-bound
> level).
>
> Microbenchmarks assume senders and receivers are always synchronized
> in time and report memory copy performance for memory copies that go
> mostly through the cache.  Memory transfers that are mostly out of
> cache are rarely tuned for or even measured.
>
> Microbenchmarks also never have the receivers actually consume the
> data that's received or have senders re-reference the data sent for
> computation.  The cost of these application-level memory accesses is
> greatly determined by where in the memory hierarchy the MPI
> implementation left the data to be computed on.  And finally, a given
> implementation will have very different performance characteristics
> on Opteron versus Intel, few-core versus many-core and point-to-point
> versus collectives.
>
> It's safe to assume that most if not all MPIs try to do something
> about shared memory but I wouldn't be surprised if each of them can
> top out on some performance curve on some specific system.
>
>
>    . . christian
>
> On Wed, 28 Nov 2007, amjad ali wrote:
>
> > Hello,
> >
> > Clusters with multicore nodes are quite common today, and the cores
> > within a node share memory.
> >
> > Which implementations of MPI (commercial or free) make automatic and
> > efficient use of shared memory for message passing within a node? That
> > is, which MPI libraries automatically communicate over shared memory
> > instead of the interconnect when the processes are on the same node?
> >
> > regards,
> > Ali.
>
> > _______________________________________________
> > Beowulf mailing list, Beowulf at beowulf.org
> > To change your subscription (digest mode or unsubscribe) visit
> > http://www.beowulf.org/mailman/listinfo/beowulf
>
>
> --
> christian.bell at qlogic.com
> (QLogic Host Solutions Group, formerly Pathscale)
>
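
Christian's point about in-cache versus out-of-cache copies, and about
receivers that actually consume the data, can be illustrated with a rough
sketch. This is not an MPI benchmark: it assumes a plain in-process buffer
copy is a fair stand-in for the copy a shared-memory MPI transport performs,
and the function name `measure`, the buffer sizes, and the byte-count
"consumer" are all made up for illustration. Absolute numbers will vary by
machine; only the relative trend is of interest.

```python
import time


def measure(nbytes, total_bytes=256 * 1024 * 1024, consume=False):
    """Return approximate copy bandwidth in GB/s for buffers of nbytes.

    Hypothetical helper: copies a buffer repeatedly until roughly
    total_bytes have been moved. With consume=True the "receiver" also
    reads every byte after each copy, as a real application would.
    """
    repeats = max(1, total_bytes // nbytes)
    src = bytearray(b"\x01") * nbytes   # non-zero fill so pages are really backed
    dst = bytearray(nbytes)
    dst[:] = src                        # warm-up copy: fault in the destination pages
    touched = 0
    start = time.perf_counter()
    for _ in range(repeats):
        dst[:] = src                    # bulk copy into the "receive" buffer
        if consume:
            touched += dst.count(1)     # receiver scans the data it was handed
    elapsed = time.perf_counter() - start
    return nbytes * repeats / max(elapsed, 1e-9) / 1e9


if __name__ == "__main__":
    # Assumed sizes: 32 KiB should fit in cache on most CPUs, 64 MiB should not.
    for label, size in (("32 KiB", 32 * 1024), ("64 MiB", 64 * 1024 * 1024)):
        for consume in (False, True):
            mode = "copy + consume" if consume else "copy only"
            print(f"{label:>7}  {mode:<15} {measure(size, consume=consume):8.2f} GB/s")
```

A typical point-to-point microbenchmark resembles the small, copy-only case.
An application that streams large messages and then computes on them is
closer to the large, copy-and-consume case, which is why two MPIs that tie
on the former can differ on the latter.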



-- 
- - - - - - -   - - - - - - -   - - - - - - -
Nathan Moore
Assistant Professor, Physics
Winona State University
AIM: nmoorewsu
- - - - - - -   - - - - - - -   - - - - - - -