[Beowulf] Again about NUMA (numactl and taskset)

Thu Jun 26 09:10:43 PDT 2008

In message from Håkon Bugge <Hakon.Bugge at scali.com> (Thu, 26 Jun 2008 
11:16:17 +0200):

numastat statistics before the Gaussian-03 run (OpenMP, 8 threads on 8 
cores; the job requires 512 MB of shared memory plus a little more, so 
it would fit in the memory of either node: I have 8 GB per node, with 
just under 6 GB free on node0 and just over 7 GB free on node1):

node0:
numa_hit 14594588
numa_miss 0
numa_foreign 0
interleave_hit 14587
local_node 14470168
other_node 124420

node1:
numa_hit 11743071
numa_miss 0
numa_foreign 0
interleave_hit 14584
local_node 11727424
other_node 15647
-------------------------------------------
Statistics after the run:

node0:
numa_hit 15466972
numa_miss 0
numa_foreign 0
interleave_hit 14587
local_node 15342552
other_node 124420

node1:
numa_hit 12960452
numa_miss 0
numa_foreign 0
interleave_hit 14584
local_node 12944805
other_node 15647
-------------------------------------------

Unfortunately, I don't know exactly what these lines mean! :-(
(BTW, does anybody know?)
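
As a partial answer: the layout above appears to match the per-node files 
/sys/devices/system/node/node*/numastat on Linux, and the numastat(8) man 
page describes the counters. Here is a minimal Python sketch that parses 
that layout and annotates each counter. The names COUNTER_MEANING and 
parse_numastat are hypothetical helpers of my own, and the descriptions are 
paraphrased from the man page, not taken from this thread.

```python
# Counter meanings, paraphrased from the numastat(8) man page.
COUNTER_MEANING = {
    "numa_hit": "allocated on this node, as intended",
    "numa_miss": "allocated on this node although another node was preferred",
    "numa_foreign": "intended for this node but allocated on another node",
    "interleave_hit": "interleaved allocation that landed on this node as intended",
    "local_node": "allocated on this node while the process was running on it",
    "other_node": "allocated on this node while the process ran on another node",
}


def parse_numastat(text):
    """Parse 'name value' lines (one counter per line) into a dict of ints."""
    counters = {}
    for line in text.splitlines():
        parts = line.split()
        # Skip blank lines, node headers and anything that is not 'name number'.
        if len(parts) == 2 and parts[1].isdigit():
            counters[parts[0]] = int(parts[1])
    return counters


# node0 counters before the run, copied from the listing above.
sample = """\
numa_hit 14594588
numa_miss 0
numa_foreign 0
interleave_hit 14587
local_node 14470168
other_node 124420
"""

node0 = parse_numastat(sample)
for name, value in node0.items():
    print(f"{name:15} {value:>10}  # {COUNTER_MEANING.get(name, 'unknown')}")
```

In these figures local_node + other_node equals numa_hit + numa_miss on 
each node, which fits the reading that local_node/other_node split the same 
allocations by where the requesting process was running.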

But intuitively it looks (given the increase in the numa_hit and 
local_node values) as though RAM was allocated from BOTH nodes (and 
more RAM was allocated from node1's memory; node1 initially had more 
free RAM).

This contradicts my expectation of "continuous" RAM allocation from 
the RAM of a single node!
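
To make the comparison explicit, here is a small Python sketch that 
subtracts the "before" counters from the "after" counters. The numbers are 
copied from the two listings above; the helper name counter_deltas is 
hypothetical. Only the counters that changed or matter for the argument are 
included.

```python
# Counters copied from the numastat listings before and after the run.
before = {
    "node0": {"numa_hit": 14594588, "local_node": 14470168, "other_node": 124420},
    "node1": {"numa_hit": 11743071, "local_node": 11727424, "other_node": 15647},
}
after = {
    "node0": {"numa_hit": 15466972, "local_node": 15342552, "other_node": 124420},
    "node1": {"numa_hit": 12960452, "local_node": 12944805, "other_node": 15647},
}


def counter_deltas(before, after):
    """Return after - before for every counter of every node."""
    return {
        node: {name: after[node][name] - value for name, value in counters.items()}
        for node, counters in before.items()
    }


deltas = counter_deltas(before, after)
for node, delta in deltas.items():
    print(node, delta)
```

With these figures, numa_hit and local_node grow by 872384 on node0 and by 
1217381 on node1, while other_node does not change on either node. The 
counters are system-wide and cumulative, so the deltas include allocations 
by every process active during the run, not only Gaussian. The unchanged 
other_node values would at least be consistent with each thread getting its 
memory from the node it was running on.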

Mikhail Kuzminsky,
Computer Assistance to Chemical Research
Zelinsky Institute of Organic Chemistry
Moscow 

>At 18:34 25.06.2008, Mikhail Kuzminsky wrote:
>>Let me now assume the following situation. I have an OpenMP-parallelized 
>>application whose number of processes equals the number of CPU 
>>cores per server. Let me also assume that this application does not use 
>>too much virtual memory, so all the real memory used can be placed in 
>>the RAM of *one* node.
>>This is not an abstract question: a lot of the Gaussian-03 jobs we have 
>>fit this situation, and all 8 cores of a dual-socket quad-core 
>>Opteron server will be "well loaded".
>>
>>Is it right that all the application memory (without using numactl) 
>>will be allocated (by the Linux kernel) on *one* node?
>
>Guess the answer is, it depends. The memory will be allocated on the 
>node where the thread first touching it is running. But you could use 
>numastat to investigate the issue.
>
>
>Håkon