[Beowulf] performance tweaks and optimum memory configs for a Nehalem
hahn at mcmaster.ca
Mon Aug 10 10:04:56 PDT 2009
>> this is on the machine which reports 16 cores, right? I'm guessing
>> that the kernel is compiled without numa and/or ht, so enumerates virtual
>> cpus first. that would mean that when otherwise idle, a 2-core
>> proc will get virtual cores within the same physical core. and that your 8c
>> test is merely keeping the first socket busy.
> No. On both machines. The one reporting 16 cores and the other
> reporting 8. i.e. one hyperthreaded and the other not. Both having 8
> physical cores.
> What is bizarre is I tried using -np 16. That ought to definitely
> utilize all cores, right? I'd have expected the 16 core performance to
> be the best. But no, the performance peaks at a smaller number of
I think I would still suspect a kernel misconfiguration, since if the kernel
isn't aware of the memory/core/socket topology, it probably makes quite
poor, affinity-oblivious allocations. this is the machine where numactl
doesn't do anything sensible, right?
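
one way to check the enumeration guess is to ask the kernel which logical
cpus it thinks share a physical core. a rough sketch, assuming a Linux
kernel that exposes topology/thread_siblings_list under
/sys/devices/system/cpu (older kernels may only provide the thread_siblings
bitmask, which this does not parse). the function names are mine, purely
for illustration:

```python
import glob
import os
import re


def parse_cpulist(text):
    """Expand a kernel cpulist string such as '0-3,8' into a sorted list of ints."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            lo, hi = part.split("-")
            cpus.update(range(int(lo), int(hi) + 1))
        else:
            cpus.add(int(part))
    return sorted(cpus)


def group_by_core(siblings):
    """Reduce {cpu: cpulist string} to the sorted set of distinct sibling groups."""
    groups = {tuple(parse_cpulist(s)) for s in siblings.values()}
    return sorted(groups)


def read_siblings(root="/sys/devices/system/cpu"):
    """Read thread_siblings_list for every cpuN directory (assumed sysfs layout)."""
    out = {}
    pattern = os.path.join(root, "cpu[0-9]*", "topology", "thread_siblings_list")
    for path in glob.glob(pattern):
        cpu = int(re.search(r"cpu(\d+)/topology", path).group(1))
        with open(path) as f:
            out[cpu] = f.read()
    return out


if __name__ == "__main__":
    # One line per physical core, listing the logical cpus that share it.
    for group in group_by_core(read_siblings()):
        print(group)
```

if the groups come out as adjacent pairs like (0, 1), (2, 3), the virtual
cpus are enumerated first and a small job can land on two threads of one
core. if every group has a single entry, the kernel sees no hyperthreading
at all.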