[Beowulf] More cores/More processors/More nodes?
deadline at clustermonkey.net
Tue Oct 3 07:37:30 PDT 2006
-- snipped some good advice --
>> However, I do not understand what happens when you have
>> multi-processor/multi-core nodes in a cluster. Do you just use MPI
>> (with each thread using its own non-shared memory) or is there any
>> way to do "mixed-mode" programming which takes advantage of shared
>> memory within a node (like, an MPI/OpenMP hybrid?).
> The first is the easiest. MPI takes advantage of shared memory within
> the node.
> The hybrid model is a lot more work for the programmer, and often is
> slower than pure MPI. And it hurts interconnect performance because you
> usually end up with just 1 core driving the interconnect.
This is a non-obvious result that many find hard to believe: MPI
within a single node may be faster than a shared-memory/threaded
approach (of course, it all depends on the application). Furthermore,
in some recent NAS Parallel Benchmarks runs on quad-core Xeons
(dual-socket motherboard, 8 cores per motherboard), LAM-MPI/tcp did
better than LAM-MPI/sysv or LAM-MPI/usysv. I have not done any tuning
to see if it helps. I should have the hardware back soon, but I am
not allowed to give hard numbers just yet, sorry.
Hybrid models also start becoming very hardware-specific, and if
the payoff is not that great, then you *may* have spent a lot of
time making your code less portable.
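
To make the two options concrete, here is a rough sketch (not taken from any real job script) of how the same cores get carved up under pure MPI versus a simple hybrid. The names `Layout` and `layout` and the 4-node count are made up for illustration; the 8 cores per node matches the Xeon boxes mentioned above. It assumes one worker per core and, for the hybrid case, the common arrangement of one MPI rank per node with the remaining cores used as threads.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Layout:
    ranks_per_node: int    # MPI processes started on each node
    threads_per_rank: int  # threads (e.g. OpenMP) inside each process
    total_ranks: int       # size of the MPI job across the cluster


def layout(nodes: int, cores_per_node: int, hybrid: bool) -> Layout:
    """Hypothetical helper: one worker per core, pure MPI or simple hybrid."""
    if nodes < 1 or cores_per_node < 1:
        raise ValueError("nodes and cores_per_node must be positive")
    if hybrid:
        # Assumption: one rank per node, every core runs a thread of it.
        ranks_per_node, threads_per_rank = 1, cores_per_node
    else:
        # Pure MPI: every core runs its own single-threaded rank.
        ranks_per_node, threads_per_rank = cores_per_node, 1
    return Layout(ranks_per_node, threads_per_rank, nodes * ranks_per_node)


if __name__ == "__main__":
    # 4 nodes is an arbitrary example size; 8 cores per node as in the post.
    for hybrid in (False, True):
        name = "hybrid" if hybrid else "pure MPI"
        print(name, layout(nodes=4, cores_per_node=8, hybrid=hybrid))
```

In the hybrid layout there is a single rank per node, so if only that rank's main thread makes MPI calls, one core ends up driving the interconnect, which is the effect described in the quoted reply. Other hybrid splits (say, one rank per socket) are possible and are not covered by this sketch.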
These are very good questions, by the way. Multi-core is
changing some things.