[Beowulf] big read triggers migration and slow memory IO?

Thu Jul 9 14:27:40 PDT 2015

On 09-Jul-2015 11:54, James Cuff wrote:
> http://blog.jcuff.net/2015/04/of-huge-pages-and-huge-performance-hits.html

Well, that seems to be it, but not quite with the same symptoms you 
observed.  khugepaged never showed up, and "perf top" never revealed 
_spin_lock_irqsave.  Instead this is what "perf top" shows in my tests:

(hugepage=always, when migration/# process observed)
  89.97%  [kernel]       [k] compaction_alloc
   1.21%  [kernel]       [k] compact_zone
   1.18%  [kernel]       [k] get_pageblock_flags_group
   0.75%  [kernel]       [k] __reset_isolation_suitable
   0.57%  [kernel]       [k] clear_page_c_e

(hugepage=always, when events/# process observed)
  85.97%  [kernel]       [k] compaction_alloc
   0.84%  [kernel]       [k] compact_zone
   0.65%  [kernel]       [k] get_pageblock_flags_group
   0.64%  perf           [.] 0x000000000005cff7

(hugepage=never)
  29.86%  [kernel]       [k] clear_page_c_e
  21.88%  [kernel]       [k] copy_user_generic_string
  12.46%  [kernel]       [k] __alloc_pages_nodemask
   5.70%  [kernel]       [k] page_fault

This is good, because "perf top" shows that the underlying issue is
compaction_alloc and compact_zone in both cases, even though top shows
migration/# in one case and, when the process is locked to a cpu,
events/# in the other.

Switching hugepage always->never seems to make things work right away.  
Switching hugepage never->always seems to take a while to break.  In 
order to get it to start failing many of the big files involved must be 
copied to /dev/null again, even though they were presumably already in 
file cache.
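
For anyone who wants to repeat the "copy to /dev/null" step, something
like the following might do. It is only a sketch: the function name
time_reread and the directory argument are made up for illustration, and
it is not the exact command used in the tests above. It reads every
regular file under a directory to /dev/null and prints the elapsed wall
clock time in whole seconds, so a run with hugepage=always can be
compared against a run with hugepage=never.

```shell
# Hypothetical helper: read all regular files under a directory to
# /dev/null and print the elapsed wall clock time in whole seconds.
time_reread() {
    dir=$1
    start=$(date +%s)
    # cat each file; the data itself is discarded, only the IO matters
    find "$dir" -type f -exec cat {} + > /dev/null
    end=$(date +%s)
    echo $((end - start))
}
```

Run it as "time_reread /some/big/data/dir" (path is a placeholder) while
watching "perf top" in another terminal.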

Searched for "compaction_alloc" and "compact_zone" and found a 
suggestion here

https://structureddata.github.io/2012/06/18/linux-6-transparent-huge-pages-and-hadoop-workloads/

to do:

echo never > /sys/kernel/mm/redhat_transparent_hugepage/defrag

(transparent_hugepage is a link to redhat_transparent_hugepage).
Reenabled hugepage and reproduced the painfully slow IO, set defrag to 
"never" and the IO was fast again, even though hugepage was still 
enabled.

So on my machine the problem seems to be with hugepage defrag 
specifically.  Disabling just that is sufficient to resolve the issue; 
it isn't necessary to take out all of hugepage.  Will let
it run that way for a while and see if anything else shows up.
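
A small sketch for checking which settings are active before and after
the change. The helper names thp_active and thp_report and the THP_DIR
variable are my own inventions for illustration. It assumes the sysfs
files mark the active choice with square brackets (for example
"[always] madvise never"), and that the directory is reachable as
/sys/kernel/mm/transparent_hugepage, which on this CentOS 6 machine is
the link to redhat_transparent_hugepage mentioned above.

```shell
# Assumed location; override THP_DIR if your kernel uses another path.
THP_DIR=${THP_DIR:-/sys/kernel/mm/transparent_hugepage}

# Print the bracketed (active) value from one sysfs file,
# e.g. "always madvise [never]" -> "never".
thp_active() {
    sed -n 's/.*\[\([^]]*\)].*/\1/p' "$1"
}

# Report the active value of "enabled" and "defrag", skipping any
# file that is not readable.
thp_report() {
    for f in enabled defrag; do
        if [ -r "$THP_DIR/$f" ]; then
            echo "$f: $(thp_active "$THP_DIR/$f")"
        fi
    done
}

# To disable only defrag (as root), as described above:
#   echo never > "$THP_DIR/defrag"
```

Calling thp_report before and after the echo should show "defrag"
change while "enabled" stays as it was.  Note that a value written this
way is lost at reboot unless it is reapplied.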

For future reference:

CentOS release 6.6 (Final)
kernel 2.6.32-504.23.4.el6.x86_64
Dell Inc. PowerEdge T620/03GCPM, BIOS 2.2.2 01/16/2014
48 Intel Xeon CPU E5-2695 v2 @ 2.40GHz  (in /proc/cpuinfo)
RAM 529231456 kB (in /proc/meminfo)

Thanks all!

David Mathog
mathog at caltech.edu
Manager, Sequence Analysis Facility, Biology Division, Caltech