[Beowulf] RE: programming multicore clusters
lindahl at pbm.com
Thu Jun 14 20:33:15 PDT 2007
On Thu, Jun 14, 2007 at 02:04:59PM -0700, Joseph Han wrote:
> Is running a program using OpenMP on a SMP/multi-core box more efficient than
> an MPI code with an implementation using localhost optimization?
One good example comes from codes which have both pure MPI and hybrid
MPI/OpenMP implementations. There's published data from John
Michalakes showing that MM5 is faster in pure MPI mode.
In fact I've never seen a bid involving pure MPI and hybrid codes
where hybrid was faster.
There are some unusual cases where hybrid can be a win:
* Codes with extreme load imbalance, like NASA's CFD code Overflow.
But it's hard to tell how a good MPI implementation of Overflow would
perform; if it turns out that it's simply unusually OpenMP
friendly, that's not really a useful datapoint.
* Codes where a pure MPI code runs out of decomposition, but
the hybrid code doesn't.
* Codes where there's a big read-only database that can be shared
within a node. But you can share between MPI processes using System V
shared memory segments, or you can mmap the database as a file, which
Hybrid can be a lose when the MPI interconnect hardware benefits from
being driven from multiple cores.
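For the read-only database case in the list above, here is a minimal sketch of the mmap route, assuming the database is a flat file of fixed-size records. The record layout and the names write_database, map_database, read_record and demo are hypothetical, made up for illustration. Two mappings in one process stand in for two MPI ranks on a node; in a real run each rank would call map_database itself.

```python
import mmap
import os
import struct
import tempfile

# Hypothetical layout: one little-endian float64 per record.
RECORD = struct.Struct("<d")


def write_database(path, values):
    """Create a toy database file (stand-in for the real read-only data)."""
    with open(path, "wb") as f:
        for v in values:
            f.write(RECORD.pack(v))


def map_database(path):
    """Map the whole file read-only and return the mapping."""
    with open(path, "rb") as f:
        # Length 0 maps the entire file; the mapping stays valid
        # after the file object is closed.
        return mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)


def read_record(db, index):
    """Read record `index` straight out of the mapping, without copying the file."""
    return RECORD.unpack_from(db, index * RECORD.size)[0]


def demo():
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        write_database(path, [0.5 * i for i in range(1000)])
        # Two independent mappings stand in for two MPI ranks on one node.
        rank0 = map_database(path)
        rank1 = map_database(path)
        result = (read_record(rank0, 10), read_record(rank1, 999))
        rank0.close()
        rank1.close()
        return result
    finally:
        os.remove(path)
```

Because every process maps the same file read-only, the operating system can typically back all the mappings with a single set of physical pages, so the database is resident once per node rather than once per rank.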
All in all, hybrid programming has been an incredible waste of time,
ranking up there with HDF among the all-time failures in HPC.