[Beowulf] Large Dell, odd IO delays

Michael Di Domenico mdidomenico4 at gmail.com
Thu Feb 15 04:51:35 PST 2018


On Wed, Feb 14, 2018 at 6:44 PM, Kilian Cavalotti
<kilian.cavalotti.work at gmail.com> wrote:
> On Wed, Feb 14, 2018 at 2:26 PM, David Mathog <mathog at caltech.edu> wrote:
>> Checked the hugepage settings and found a difference there.  The two systems
>> that don't do this have  /sys/kernel/mm/redhat_transparent_hugepage/defrag
>>
>> always madvise [never]
>>
>> whereas the system with the issue has:
>>
>> [always] madvise never
>
> THP defragmentation is definitely something that has bitten us in the
> past, when under memory pressure, and we now default to [madvise]
> pretty much everywhere (we're too timid to disable it entirely).

i'll second this.  i've seen serious disk performance problems with
transparent hugepages (THP) enabled, and i now disable THP on all of
our machines.
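
for anyone wanting to compare nodes, here is a minimal sketch (not
something from this thread) of reading the active defrag mode, i.e. the
bracketed word in the sysfs output quoted above.  the function names are
hypothetical.  the redhat_ path is the one David quoted; the second path
is the mainline location, and which one exists depends on the kernel.

```python
from pathlib import Path

# Candidate sysfs locations. The first is the path quoted in this thread
# (Red Hat kernels of that era); the second is the mainline kernel path.
THP_DEFRAG_PATHS = (
    "/sys/kernel/mm/redhat_transparent_hugepage/defrag",
    "/sys/kernel/mm/transparent_hugepage/defrag",
)


def active_thp_mode(text):
    """Return the bracketed (active) value, e.g. 'never' for 'always madvise [never]'."""
    for token in text.split():
        if token.startswith("[") and token.endswith("]"):
            return token[1:-1]
    return None  # no bracketed token found


def read_thp_defrag(paths=THP_DEFRAG_PATHS):
    """Return the active defrag mode from the first readable path, else None."""
    for p in paths:
        try:
            return active_thp_mode(Path(p).read_text())
        except OSError:
            continue  # path missing or unreadable on this kernel
    return None
```

calling read_thp_defrag() on each node and comparing the results should
show the same difference David found by hand.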

the way i found it: when doing large IO with transparent hugepages
enabled, the khugepaged process shoots right to the top of a top
display, and the IO delays looked the same as what you describe.
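
if watching top is not convenient, the same signal can be sampled from
/proc.  this is a rough sketch under a few assumptions: the kernel
thread's name is khugepaged, /proc/<pid>/comm is available on the
kernel in question, and the helper names are made up for illustration.
it sums utime and stime (fields 14 and 15 of /proc/<pid>/stat, per
proc(5)); sampling it twice during heavy IO and comparing the values
shows how much CPU the thread is burning.

```python
import os


def cpu_ticks_from_stat(stat_line):
    """Return utime + stime (in clock ticks) from a /proc/<pid>/stat line."""
    # The comm field is wrapped in parentheses and may contain spaces,
    # so split on the last ')' before splitting the remaining fields.
    _, _, rest = stat_line.rpartition(")")
    fields = rest.split()
    # fields[0] is the state (field 3), so utime (14) and stime (15)
    # sit at indexes 11 and 12.
    return int(fields[11]) + int(fields[12])


def khugepaged_cpu_seconds(proc="/proc"):
    """Return total CPU seconds used by khugepaged, or None if not found."""
    ticks_per_sec = os.sysconf("SC_CLK_TCK")
    for entry in os.listdir(proc):
        if not entry.isdigit():
            continue  # not a process directory
        try:
            with open(os.path.join(proc, entry, "comm")) as f:
                if f.read().strip() != "khugepaged":
                    continue
            with open(os.path.join(proc, entry, "stat")) as f:
                return cpu_ticks_from_stat(f.read()) / ticks_per_sec
        except OSError:
            continue  # process exited or file unreadable
    return None
```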

