[Beowulf] Shared memory

Vincent Diepeveen diep at xs4all.nl
Mon Jun 27 15:23:51 PDT 2005

At 12:25 PM 6/27/2005 +0100, Kozin, I \(Igor\) wrote:
>> But back to the original question: instead of using OpenMP and MPI at 
>> once (which I see in the same way as Mark), I'd suggest to 
>> compile MPICH 
>> with shared memory support and setup the machinefile 
>> accordingly. Then 
>> you can easy look for any speed improvement using shared memory.
>> CU - Reuti
>I think MPI/OpenMP has its niche. Choosing OpenMP or MPI or 
>mixed MPI/OpenMP is also about a choice of appropriate/most suitable 
>algorithm as well. However I must agree that pure MPI seems 
>most suitable if the target architecture is Opteron.

That depends on the network card and the nature of the application in question.

Sometimes 'shmem'-type libraries kick butt. They're cheaper than MPI because you
avoid a lot of MPI calls and checks, which lock/unlock and waste processor
time on the receiving node.

Of course that is not generic code; it won't run on *all* high-end network cards.

Generally speaking, non-generic code runs faster.

>BTW, "taskset" worked fine with MPI but could not get a grip on OpenMP
>threads on a dual core. 

>It is straightforward to make STREAM a mixed OpenMP/MPI code
>from an MPI source and use it as a test. On the other
>hand the conclusion of
>is that mixed MPI/OpenMP model performs well if the job
>is not memory bandwidth bounded (however this presentation
>seems to be a bit dated.)

Experience shows that programs using the latest algorithms, when run on
clusters, definitely need either big bandwidth or low-latency
network cards.

Sometimes you can rewrite a problem from being latency-bound to being
bandwidth-bound, and/or vice versa.

There is a really tiny number of applications that can profit neither from low
latency nor from big bandwidth.

In that case your statement above would suggest that MPI is useless, which it
is not :)

>Unfortunately I can't recommend a simple established code or benchmark 
>which would allow transparent comparison of MPI versus OpenMP/MPI.

I don't feel such a comparison would reveal much. By now MPI is simply a
very generic standard for shipping data from one node to another (though the
exact implementation of the function calls differs from one manufacturer to
another).

It's relatively easy to build a cluster using MPI.

