> On two-socket, dual core Opteron systems, we often see that throughput is
> enhanced by using two times two processes per node.

Just for the obvious reason (memory bandwidth)?