[Beowulf] Really efficient MPIs??
Nathan Moore ntmoore at gmail.com
Wed Nov 28 08:21:02 PST 2007
I've not tried their respective MPI libraries, but as a general rule, the
people who manufacture the chips have the best idea of how to optimize a
given library. (There are obvious counter-examples, such as GotoBLAS and
FFTW.)

That said, have you tried, for Intel:
http://www.intel.com/cd/software/products/asmo-na/eng/308295.htm
or, for AMD:
http://developer.amd.com/devtools.jsp
(they link to HP's MPI)?

As a side note, IBM uses a slightly modified version of MPICH for Blue Gene.

Nathan

On Nov 28, 2007 9:48 AM, Christian Bell <christian.bell at qlogic.com> wrote:

> But the main point with MPI implementations, more than usual with
> shared memory, is to run your application.
>
> For 2 different MPI shared-memory implementations that show equal
> performance on point-to-point microbenchmarks, you can measure very
> different performance in applications (mostly at the bandwidth-bound
> level).
>
> Microbenchmarks assume senders and receivers are always synchronized
> in time and report memory copy performance for memory copies that go
> mostly through the cache. Memory transfers that are mostly out of
> cache are rarely tuned for or even measured.
>
> Microbenchmarks also never have the receivers actually consume the
> data that's received or have senders re-reference the data sent for
> computation. The cost of these application-level memory accesses is
> largely determined by where in the memory hierarchy the MPI
> implementation left the data to be computed on. And finally, a given
> implementation will have very different performance characteristics
> on Opteron versus Intel, few-core versus many-core, and point-to-point
> versus collectives.
>
> It's safe to assume that most if not all MPIs try to do something
> about shared memory, but I wouldn't be surprised if each of them can
> top out on some performance curve on some specific system.
>
>     . . christian
>
> On Wed, 28 Nov 2007, amjad ali wrote:
>
> > Hello,
> >
> > Clusters with multicore nodes are quite common today, and the
> > cores within a node share memory.
> >
> > Which implementations of MPI (commercial or free) make automatic
> > and efficient use of shared memory for message passing within a node?
> > (That is, which MPI libraries automatically communicate over shared
> > memory instead of the interconnect on the same node?)
> >
> > regards,
> > Ali.
>
> > _______________________________________________
> > Beowulf mailing list, Beowulf at beowulf.org
> > To change your subscription (digest mode or unsubscribe) visit
> > http://www.beowulf.org/mailman/listinfo/beowulf
>
> --
> christian.bell at qlogic.com
> (QLogic Host Solutions Group, formerly Pathscale)

--
- - - - - - - - - - - - - - - - - - - - - - -
Nathan Moore
Assistant Professor, Physics
Winona State University
AIM: nmoorewsu
- - - - - - - - - - - - - - - - - - - - - - -
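
Christian Bell's point above, that microbenchmarks mostly report copies
that stay in cache, can be roughly illustrated without MPI at all. The
sketch below is not from the thread and is not an MPI benchmark: it only
times plain in-process buffer copies at a few sizes. The function name
copy_bandwidth, the buffer sizes, and the total_bytes default are all
assumptions chosen for illustration, and interpreter overhead will blur
the numbers for the smallest buffers.

```python
import time


def copy_bandwidth(nbytes, total_bytes=1 << 28):
    """Return an approximate copy rate in bytes/second for nbytes buffers.

    Hypothetical helper: it repeats a plain buffer-to-buffer copy until
    roughly total_bytes have been moved, then divides by elapsed time.
    """
    src = bytearray(nbytes)
    dst = bytearray(nbytes)
    reps = max(1, total_bytes // nbytes)

    dst[:] = src  # warm-up copy so first-touch page faults are not timed

    start = time.perf_counter()
    for _ in range(reps):
        dst[:] = src  # same-length slice assignment copies the buffer
    elapsed = max(time.perf_counter() - start, 1e-9)  # avoid divide-by-zero

    return reps * nbytes / elapsed


if __name__ == "__main__":
    # Sizes are guesses meant to land inside a typical cache (32 KiB),
    # near the edge of one (1 MiB), and well outside it (64 MiB).
    for size in (32 * 1024, 1 << 20, 64 << 20):
        rate = copy_bandwidth(size)
        print(f"{size:>10d} bytes: {rate / 1e9:.2f} GB/s")
```

If the rate drops noticeably at the largest size, that is the
out-of-cache regime the quoted message says is rarely measured. A real
comparison of MPI libraries would still need to run the application
itself, as the thread recommends.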
