[Beowulf] MPI Processes + Auto Vectorization
amjad11 at gmail.com
Mon Nov 30 12:24:34 PST 2009
Suppose we run a parallel MPI code with 64 processes on a cluster of, say, 16
nodes. Each cluster node has a multicore CPU with, say, 4 cores.
So every one of the 64 cores on the cluster is running one process. The program
is SPMD, meaning all processes have the same workload.
Now, if we enable auto-vectorization when compiling the code (for example,
with the Intel compilers), will there be any benefit (an efficiency or
scalability improvement) from the auto-vectorization? Or will we get the
same performance as without auto-vectorization in this example case?
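To make the question concrete, here is a minimal sketch in C of the kind of per-process loop I have in mind. The function name local_axpy and the idea that each rank owns a local slice of the arrays are assumptions for illustration only, not taken from any real code. The compiler vectorizes loops inside each process, so it works per core and does not depend on how many MPI processes are launched. Whether it gives a measurable speedup presumably depends on whether the loop is compute-bound or limited by memory bandwidth shared among the cores of a node.

```c
#include <stddef.h>

/* Hypothetical per-rank kernel: computes y[i] = a * x[i] + y[i] over the
 * slice of data owned by one MPI process. There is no MPI call in here;
 * the loop is purely local work, which is what the compiler can vectorize.
 * `restrict` promises that x and y do not overlap, which helps the
 * compiler prove the loop iterations are independent. */
void local_axpy(size_t n, double a,
                const double *restrict x, double *restrict y)
{
    for (size_t i = 0; i < n; ++i) {
        y[i] = a * x[i] + y[i];
    }
}
```

Each of the 64 processes would call such a routine on its own data between communication steps, so any gain from vectorization would show up in the local compute time of every process rather than in the communication time.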
How can we really get benefit in performance improvement with