[Beowulf] multi-threading vs. MPI

Mark Hahn hahn at mcmaster.ca
Mon Dec 10 18:01:30 PST 2007


> Generally speaking, I find scientists/engineers generally "get" OpenMP more 
> easily than MPI.  They have to work less hard to get some benefit from OpenMP 
> than MPI.
>
> This above statement I expect to generate great deals of heat, which is a

I don't think that many would disagree.  OpenMP presents a programming model
that, for simple codes, is very close to plain old serial.  getting "some"
benefit is very easy.
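
(to illustrate - this sketch is mine, not code from the thread - the serial
loop only gains a pragma; compile with something like gcc -fopenmp:)

    #include <stdio.h>
    #include <omp.h>

    int main(void)
    {
        static double a[1000000];
        double sum = 0.0;

        /* the only change from the serial version is the pragma */
        #pragma omp parallel for reduction(+:sum)
        for (int i = 0; i < 1000000; i++) {
            a[i] = i * 0.5;
            sum += a[i];
        }

        printf("sum = %g (threads: %d)\n", sum, omp_get_max_threads());
        return 0;
    }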

the issue, though, is whether it's practical and efficient to mix both.
I think the answer is no - sort of a corollary of the following:

 	"Debugging is twice as hard as writing the code in the first place.
 	Therefore, if you write the code as cleverly as possible, you are,
 	by definition, not smart enough to debug it."  (Brian W. Kernighan)

doing just OpenMP or MPI well is hard enough - at the edge of most people's
ability.  debugging it is therefore beyond their ability, and a hybrid of the
two only compounds the problem ;)
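
(for reference, "mixing both" usually looks something like the sketch below -
again my own illustration, not code from the thread: MPI between ranks,
OpenMP threads inside each rank, two models to keep straight in one source
file.  build with something like mpicc -fopenmp:)

    #include <stdio.h>
    #include <mpi.h>
    #include <omp.h>

    int main(int argc, char **argv)
    {
        int provided, rank, nranks;

        /* ask for thread support; FUNNELED = only the master thread calls MPI */
        MPI_Init_thread(&argc, &argv, MPI_THREAD_FUNNELED, &provided);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &nranks);

        double local = 0.0, global = 0.0;

        /* threads share memory within the rank ... */
        #pragma omp parallel for reduction(+:local)
        for (int i = 0; i < 1000000; i++)
            local += 1.0 / (i + 1 + rank);

        /* ... and ranks communicate explicitly */
        MPI_Reduce(&local, &global, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);

        if (rank == 0)
            printf("%d ranks x %d threads, sum = %g\n",
                   nranks, omp_get_max_threads(), global);

        MPI_Finalize();
        return 0;
    }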

> channeling an ex-US president, we need to define what "better" means.  Faster 
> execution on model problems?  Faster benchmarking?  Faster development, ease 
> of code testing/debugging/management?

in my world, a code (or trivial variants) is run many times, and consumes a
lot of CPU resources, so I encourage people to write efficient code even if it
takes more effort.  the relative value of the compute hardware is different
in, for instance, an engineering company.

> What we see going forward are desktops with 4-16 cores (biased as this is 
> what we are doing/selling) and a shared memory system.

I'm skeptical about how quickly the market will reward 8-core chips.

> (non-NUMA) for Intel.  Intel is going to NUMA as far as I have seen at SC07 
> and elsewhere (and Intel folks, please do step in and let me know if I am

there's really no choice: you simply can't scale a flat memory system.

> issues.  The streams code is an example of a "trivial" (sorry John) code 
> which operates in OpenMP very nicely.

it's embarrassingly parallel, that's all.  I don't think John would disagree.
of course, stream-in-MPI scales even better ;)
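
(the triad kernel in STREAM is roughly the following - paraphrased from
memory rather than quoted from the actual benchmark source - each iteration
touches distinct elements, no dependencies, which is why it parallelizes so
trivially under OpenMP:)

    #define N 10000000

    static double a[N], b[N], c[N];

    void triad(double scalar)
    {
        /* no communication, no shared writes - embarrassingly parallel */
        #pragma omp parallel for
        for (long i = 0; i < N; i++)
            a[i] = b[i] + scalar * c[i];
    }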


