[Beowulf] performance tweaks and optimum memory configs for a Nehalem

David N. Lombard dnlombar at ichips.intel.com
Tue Aug 11 08:40:56 PDT 2009

On Mon, Aug 10, 2009 at 01:02:51PM -0700, Rahul Nabar wrote:
> On Mon, Aug 10, 2009 at 2:09 PM, Joshua Baker-LePain<jlb17 at duke.edu> wrote:
> > Well, as there are only 8 "real" cores, running a computationally intensive
> > process across 16 should *definitely* do worse than across 8.

Some workloads benefit materially from SMT, some are neutral, and some
degrade.  For those that degrade, simply not oversubscribing the physical
cores gives the best performance.
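
To avoid oversubscribing, you first need the physical core count rather
than the logical CPU count.  A minimal sketch, assuming a Linux system
whose /proc/cpuinfo carries "physical id" and "core id" fields (true on
typical x86 kernels, but not guaranteed everywhere); the function name
is my own invention:

```python
def count_physical_cores(cpuinfo_text):
    """Count unique (socket, core) pairs in /proc/cpuinfo-style text.

    Assumes each logical CPU stanza has "physical id" and "core id"
    lines and that stanzas are separated by blank lines.
    """
    cores = set()
    physical_id = core_id = None
    for line in cpuinfo_text.splitlines():
        if not line.strip():
            # A blank line ends one logical CPU's stanza.
            if physical_id is not None and core_id is not None:
                cores.add((physical_id, core_id))
            physical_id = core_id = None
            continue
        key, _, value = line.partition(":")
        key, value = key.strip(), value.strip()
        if key == "physical id":
            physical_id = value
        elif key == "core id":
            core_id = value
    # Handle a final stanza that is not followed by a blank line.
    if physical_id is not None and core_id is not None:
        cores.add((physical_id, core_id))
    return len(cores)


if __name__ == "__main__":
    with open("/proc/cpuinfo") as f:
        print(count_physical_cores(f.read()))
```

With SMT enabled, the result should be half the number of "processor"
entries; use it as the upper bound on the thread or rank count for
workloads that degrade under SMT.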

> >                                                               However, it's
> > not so surprising that you're seeing peak performance with 2-4 threads.
> >  Nehalem can actually overclock itself when only some of the cores are busy
> > -- it's called Turbo Mode.  That *could* be what you're seeing.
> That could very well be it! Is there any way to test if the CPU has
> overclocked itself?

There's an application note on the subject at:

Be aware this document is very technical, talking about MSRs & performance counters.
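
For a rough idea of what an MSR-based check can look like, here is a
sketch of one common approach, which is not necessarily the method the
application note describes: sample the IA32_MPERF (0xE7) and IA32_APERF
(0xE8) counters twice while the core is busy and scale the nominal
frequency by the ratio of their deltas.  It assumes Linux with the msr
driver loaded (modprobe msr) and root access; base_mhz is a value you
supply, and the function names are my own:

```python
import os
import struct
import time

IA32_MPERF = 0xE7  # counts at the fixed reference frequency while in C0
IA32_APERF = 0xE8  # counts at the actual core frequency while in C0
_MASK64 = (1 << 64) - 1


def read_msr(cpu, reg):
    """Read one 64-bit MSR via the Linux msr driver (needs root)."""
    fd = os.open(f"/dev/cpu/{cpu}/msr", os.O_RDONLY)
    try:
        return struct.unpack("<Q", os.pread(fd, 8, reg))[0]
    finally:
        os.close(fd)


def effective_mhz(base_mhz, mperf0, aperf0, mperf1, aperf1):
    """Scale the nominal frequency by delta(APERF) / delta(MPERF).

    base_mhz is the nominal (non-turbo) frequency, supplied by the
    caller.  Deltas are taken modulo 2**64 to tolerate counter wrap.
    """
    d_mperf = (mperf1 - mperf0) & _MASK64
    d_aperf = (aperf1 - aperf0) & _MASK64
    if d_mperf == 0:
        raise ValueError("MPERF did not advance; was the core idle?")
    return base_mhz * d_aperf / d_mperf


def sample_cpu(cpu, base_mhz, interval=1.0):
    """Sample both counters on one CPU across an interval."""
    m0, a0 = read_msr(cpu, IA32_MPERF), read_msr(cpu, IA32_APERF)
    time.sleep(interval)
    m1, a1 = read_msr(cpu, IA32_MPERF), read_msr(cpu, IA32_APERF)
    return effective_mhz(base_mhz, m0, a0, m1, a1)
```

A result above base_mhz on a loaded core suggests the core ran above
its nominal clock during the interval.  Both counters only advance
while the core is executing, so run the benchmark pinned to the CPU
being sampled.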

> Or can I turn the "turbo mode" off and check?

That would work, but...  Alternately, take a look at

David N. Lombard, Intel, Irvine, CA
I do not speak for Intel Corporation; all comments are strictly my own.

