[Beowulf] MPI Processes + Auto Vectorization

Mon Nov 30 12:24:34 PST 2009

Hi,
Suppose we run a parallel MPI code with 64 processes on a cluster of, say,
16 nodes. Each cluster node has a multicore CPU with, say, 4 cores.

Now each of the 64 cores on the cluster is running one process. The program
is SPMD, meaning all processes have the same workload.

Now suppose we had enabled auto-vectorization while compiling the code (for
example with the Intel compilers). Will there be any benefit
(efficiency/scalability improvement) from having auto-vectorized code? Or
will we get the same performance as without auto-vectorization in this
example case?

How can we actually get a performance improvement from auto-vectorization?
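
To make the questions concrete, here is a minimal sketch (in C, with the MPI
calls left out) of the kind of per-process loop I have in mind. The names
chunk_begin and axpy_chunk and the even block decomposition are only
illustrative assumptions, not taken from a real code; in an actual MPI program
rank and nranks would come from MPI_Comm_rank and MPI_Comm_size.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: first index of the block owned by `rank` when n
 * elements are split as evenly as possible across `nranks` processes.
 * Calling it with rank == nranks returns n (one past the last block). */
static size_t chunk_begin(size_t n, int rank, int nranks)
{
    assert(nranks > 0 && rank >= 0 && rank <= nranks);
    size_t base = n / (size_t)nranks;   /* minimum block size */
    size_t rem  = n % (size_t)nranks;   /* first `rem` ranks get one extra */
    size_t r    = (size_t)rank;
    return r * base + (r < rem ? r : rem);
}

/* Per-process kernel (SPMD): every rank runs the same loop on its own block.
 * The loop body has no dependence between iterations and the pointers are
 * declared restrict, which is the kind of loop a compiler may be able to
 * auto-vectorize. */
static void axpy_chunk(size_t n, int rank, int nranks, double a,
                       const double *restrict x, double *restrict y)
{
    size_t lo = chunk_begin(n, rank, nranks);
    size_t hi = chunk_begin(n, rank + 1, nranks);
    for (size_t i = lo; i < hi; ++i)
        y[i] += a * x[i];
}
```

In the example above, each of the 64 processes would call axpy_chunk with its
own rank, so whatever the compiler does to the inner loop applies within every
process independently of how many processes there are.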

Thank you.