On 08/03/11 16:26, Mark Hahn wrote:

> if the app doesn't control this (or you with numactl),
> then you should expect performance to lie somewhere
> between the two extremes (fully local vs fully remote).
> the kernel does make some effort at keeping things local
> - and for that matter, avoiding moving a process among
> multiple cores/sockets.

It's also worth mentioning the point about "NUMA diffusion"
through swapping that David Singleton from ANU made on the
hwloc-devel list:


# Unless it has changed very recently, Linux swapin_readahead
# is the main culprit in messing with NUMA locality on that
# platform. Faulting a single page causes 8 or 16 or whatever
# contiguous pages to be read from swap. An arbitrary contiguous
# range of pages in swap may not even come from the same process
# far less the same NUMA node. My understanding is that since
# there is no NUMA info with the swap entry, the only policy
# that can be applied to is that of the faulting vma in the
# faulting process. The faulted page will have the desired NUMA
# placement but possibly not the rest. So swapping mixes
# different process' NUMA policies leading to a "NUMA diffusion
# process".
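
To make that mechanism concrete, here is a toy model in Python.
It is only a sketch under my own assumptions, not how the
kernel is implemented: the function name simulate_readahead,
the aligned readahead window, the strict interleaving of two
processes' pages in swap and the in-order faulting are all
made up for illustration. Every page pulled in by a readahead
is placed on the node wanted by the faulting page's policy.

```python
def simulate_readahead(num_slots=64, window=8, nodes=2):
    """Toy model of swap readahead mixing NUMA placement.

    Hypothetical setup: swap slot i holds a page whose owner
    wants it on node (i % nodes), i.e. pages from different
    processes are interleaved in swap.
    """
    home = [i % nodes for i in range(num_slots)]
    placed = [None] * num_slots  # node each page lands on after swap-in

    for slot in range(num_slots):
        if placed[slot] is not None:
            continue  # already brought in by an earlier readahead
        # Only the faulting page's policy is known at fault time.
        faulting_node = home[slot]
        # Read the whole aligned window around the faulting slot.
        start = slot - slot % window
        for s in range(start, min(start + window, num_slots)):
            if placed[s] is None:
                placed[s] = faulting_node

    misplaced = sum(1 for h, p in zip(home, placed) if h != p)
    return misplaced, num_slots


if __name__ == "__main__":
    print(simulate_readahead(window=8))  # (32, 64)
    print(simulate_readahead(window=1))  # (0, 64)
```

With a window of 8, half of the pages in this (deliberately
worst-case) layout end up on the wrong node; with a window of
1, i.e. no readahead, none do.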

Keep in mind that ANU runs its systems with swap so that
they can suspend a job, page the entire thing out and start
a new, higher priority job.  Running without swap isn't
really an option for them.

