[Beowulf] Using Autoparallel compilers or Multi-Threaded libraries with MPI

Tom Elken tom.elken at qlogic.com
Thu Nov 29 11:26:45 PST 2007

The SPEC HPG (High Performance Group) is having discussions about using
a hybrid of MPI and thread-level parallelism on the SPEC MPI2007
benchmark suite.  We have a separate OpenMP suite (SPEC OMP2001), so we
chose not to allow the source-code complications of hybrid MPI/OpenMP
parallelism in this benchmark suite with "MPI" foremost in its name.

I was wondering how many people use either auto-parallel compiler
features or multi-threaded math libraries (Goto, MKL, ACML, etc.) to
provide some thread-level parallelism on a cluster where you primarily
use MPI to achieve your parallel execution.*

Have you used compiler auto-parallel features mixed with MPI with
success on your clusters?

Have you used multi-threaded math or scientific libraries mixed with MPI
with success on your clusters?

If you just want to 'reply' to me alone with simple Yes/No answers, I
will report a summary of the results to this list and to the SPEC HPG.
If you have success or failure stories that might be useful to the
Beowulf list, please 'reply-all'.  

Tom Elken,
member SPEC HPG committee
*  For example, if an auto-parallelizing compiler could find effective
4-way thread-level parallelism in an MPI code and you were running on a
cluster of 8 nodes, each with two quad-core CPUs (64 cores total), you
might choose to run with 16 MPI ranks and set your NUM_THREADS
environment variable (e.g. OMP_NUM_THREADS) to 4, so that all 64 cores
of the cluster execute work with reasonable efficiency.
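The footnote's arithmetic can be sketched as follows. The node and core
counts and the 4-way thread split come from the example above; the
mpirun command shown in the comment is a hypothetical launcher
invocation (actual flags vary by MPI implementation), and OMP_NUM_THREADS
is one common way to set the thread count per rank.

```python
import os

nodes = 8
cores_per_node = 8       # two quad-core CPUs per node
threads_per_rank = 4     # thread-level parallelism found by the compiler

total_cores = nodes * cores_per_node          # 8 nodes * 8 cores = 64
mpi_ranks = total_cores // threads_per_rank   # 64 cores / 4 threads = 16

# A launch script would export the thread count before starting MPI,
# e.g. (hypothetical command line; flags vary by MPI implementation):
#   OMP_NUM_THREADS=4 mpirun -np 16 ./mpi_app
os.environ["OMP_NUM_THREADS"] = str(threads_per_rank)

print(mpi_ranks, os.environ["OMP_NUM_THREADS"])
```

The same split generalizes: ranks = total cores / threads per rank, so
the product of MPI ranks and threads per rank covers the whole machine.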
