[Beowulf] bizarre scaling behavior on a Nehalem

Wed Aug 12 09:32:08 PDT 2009

Mikhail Kuzminsky wrote:
> In message from Craig Tierney <Craig.Tierney at noaa.gov> (Tue, 11 Aug 2009
> 11:40:03 -0600):
>> Rahul Nabar wrote:
>>> On Mon, Aug 10, 2009 at 12:48 PM, Bruno
>>> Coutinho<coutinho at dcc.ufmg.br> wrote:
>>>> This is often caused by cache competition or memory bandwidth
>>>> saturation.
>>>> If it were cache competition, going from 4 to 6 threads would make
>>>> it worse.
>>>> As the code became faster with DDR3-1600 and much slower with Xeon
>>>> 5400,
>>>> this code is memory bandwidth bound.
>>>> Tweaking CPU affinity to keep threads from jumping among cores
>>>> will not
>>>> help much, as the big bottleneck is memory bandwidth.
>>>> For this code, CPU affinity will only help on NUMA machines, by keeping
>>>> memory accesses in local memory.
>>>>
>>>>
>>>> If the machine has enough bandwidth to feed the cores, it will scale.
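
A rough way to probe the memory-bandwidth argument above is to time a large
buffer copy. This is only a sketch, not a substitute for STREAM: the function
name, the 64 MB buffer size and the "one read plus one write per byte"
accounting are my assumptions, and the result is a single-process figure.

```python
import time

def copy_bandwidth_gbs(nbytes=64 * 1024 * 1024, repeats=5):
    """Best-of-N copy rate in GB/s, counting one read and one write per byte.

    Hypothetical helper for illustration; the buffer should be much larger
    than the last-level cache so the copy actually goes to main memory.
    """
    src = bytearray(nbytes)
    dst = bytearray(nbytes)
    best = float("inf")
    for _ in range(repeats):
        t0 = time.perf_counter()
        dst[:] = src  # bulk copy done in C, no per-byte interpreter overhead
        best = min(best, time.perf_counter() - t0)
    return 2 * nbytes / best / 1e9

if __name__ == "__main__":
    print(f"single-process copy rate: {copy_bandwidth_gbs():.1f} GB/s")
```

To say anything about scaling, one would run several copies of this at once,
each on its own core, and check whether the summed rate keeps rising with the
number of processes or flattens out.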
>>>
>>> Exactly! But I thought the big advance with the Nehalem was that
>>> it removed the CPU<->Cache<->RAM bottleneck. So if the code scaled
>>> with the AMD Barcelona then it would continue to scale with the
>>> Nehalem, right?
>>>
>>> I'm posting a copy of my scaling plot here if it helps.
>>>
>>> http://dl.getdropbox.com/u/118481/nehalem_scaling.jpg
>>>
>>> To remove most possible confounding factors, this particular Nehalem
>>> plot was produced with the following settings:
>>>
>>> Hyperthreading OFF
>>> 24GB memory, i.e. 6 banks of 4GB, the optimum memory configuration
>>> X5550
>>>
>>> Even if we attributed the bizarre performance of the 4 node case
>>> to the Turbo effect, what is most confusing is how the 8 core data
>>> point could be so much slower than the corresponding 8 core point on an
>>> old AMD Barcelona.
>>>
>>> Something's wrong here that I just do not understand. BTW, any other
>>> VASP users here? Anybody have any Nehalem experience?
>>>
>>
>> Rahul,
>> What are you doing to ensure that you have both memory and processor
>> affinity enabled?
>> Craig
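
For the processor half of that question, a minimal Linux-only sketch of
pinning a process and reading back its CPU mask is below. The helper name
pin_to_cpus is made up for illustration, and which core ids sit on which
socket is machine-specific, so the choice of "first allowed CPU" is arbitrary.

```python
import os

def pin_to_cpus(cpus):
    """Pin the calling process to the given CPU ids and return the new mask.

    Hypothetical helper; relies on os.sched_setaffinity, available on Linux.
    """
    os.sched_setaffinity(0, set(cpus))  # pid 0 means "this process"
    return os.sched_getaffinity(0)

if __name__ == "__main__":
    allowed = sorted(os.sched_getaffinity(0))
    print("allowed before:", allowed)
    # Pin to the first allowed CPU only, purely as a demonstration.
    print("allowed after: ", sorted(pin_to_cpus(allowed[:1])))
```

This covers CPU affinity only. Memory placement has to be set separately, for
example by launching under numactl with --cpunodebind and --membind, or
through whatever binding options the MPI launcher in use provides.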
> 
> As I mentioned here in the "numactl&SuSE11.1" thread, some kernels
> behave wrongly on Nehalem (bad /sys/devices/system/node directory
> content). This bug is present, in particular, in the default OpenSuSE 11
> kernels (2.6.27.7-9 and 2.6.29-6), and (as was written in the
> corresponding thread discussion) in the FC11 2.6.29 kernel.
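
A quick way to see what a given kernel reports under /sys/devices/system/node
is sketched below. The function names are mine, and the post does not say what
the "bad" content looked like, so this only prints what is there. On a
two-socket Nehalem with NUMA enabled one would expect two nodes, each listing
its own cores.

```python
import glob
import os
import re

def parse_cpulist(text):
    """Expand a kernel cpulist string such as '0-3,8-11' into a sorted list."""
    cpus = []
    for part in text.strip().split(","):
        if not part:
            continue
        lo, _, hi = part.partition("-")
        cpus.extend(range(int(lo), int(hi or lo) + 1))
    return sorted(cpus)

def numa_nodes(root="/sys/devices/system/node"):
    """Return {node_id: [cpu ids]} from sysfs; empty if nothing is found."""
    nodes = {}
    for path in glob.glob(os.path.join(root, "node[0-9]*")):
        node_id = int(re.sub(r"\D", "", os.path.basename(path)))
        try:
            with open(os.path.join(path, "cpulist")) as f:
                nodes[node_id] = parse_cpulist(f.read())
        except OSError:
            nodes[node_id] = []  # node directory exists but cpulist is unreadable
    return nodes

if __name__ == "__main__":
    for node_id, cpus in sorted(numa_nodes().items()):
        print(f"node{node_id}: cpus {cpus}")
```

An empty result, a single node on a two-socket box, or nodes with no CPUs
would all be worth comparing against what numactl --hardware prints.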
> 
> I found that in this situation disabling NUMA in the BIOS only
> increases STREAM throughput. Therefore I think this problem (Rahul's) is
> not due to BIOS settings. Unfortunately I have no data about VASP itself.
> 
> I'm curious: does anybody have kernels that work "normally" with Nehalem,
> in the sense of NUMA? AFAIK older 2.6 kernels (from SuSE 10.3)
> work OK, but I didn't check. Maybe an error in NUMA support is the cause
> of Rahul's problem?
> 

What do you mean by "normally"?  I am running CentOS 5.3 with 2.6.18-128.2.1
right now on a 448 node Nehalem cluster.  So far I am happy with how things work.
The original CentOS 5.3 kernel, 2.6.18-128.1.10, had bugs in Nehalem support
where nodes would randomly start running slowly.  Upgrading the kernel
fixed that.  But that performance problem was all or nothing; I don't recall
it exhibiting itself in the way that Rahul described.

Craig

> Mikhail       
>>
>>
>>> -- 
>>> Rahul
>>> _______________________________________________
>>> Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
>>> To change your subscription (digest mode or unsubscribe) visit
>>> http://www.beowulf.org/mailman/listinfo/beowulf
>>>
>>
>>
>> -- 
>> Craig Tierney (craig.tierney at noaa.gov)

-- 
Craig Tierney (craig.tierney at noaa.gov)