[Beowulf] Feedback on large pages in Linux

Tue Jul 25 13:17:04 PDT 2006

> the memory access pattern.  The main reason is that the Opteron only has
> eight 2-Mbyte TLB entries, compared to 512 4-Kbyte TLB entries (see

which seems great to me: 8 x 2 MB entries cover up to 16 MB without a
TLB miss, vs only 2 MB for 512 x 4 KB entries...
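
to make the arithmetic explicit, here is a minimal sketch of the "TLB
reach" calculation. the entry counts are the ones quoted above; the
function name tlb_reach is just made up for illustration.

```python
KB = 1024
MB = 1024 * KB

def tlb_reach(entries, page_size):
    """Bytes addressable without a TLB miss: entries * page size."""
    return entries * page_size

# Entry counts as quoted for the Opteron: 512 x 4-Kbyte, 8 x 2-Mbyte.
small_reach = tlb_reach(512, 4 * KB)
large_reach = tlb_reach(8, 2 * MB)

print(small_reach // MB, "MB with 4k pages")   # 2 MB with 4k pages
print(large_reach // MB, "MB with 2M pages")   # 16 MB with 2M pages
```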

> below).  So, an app that accesses lots of little regions of memory
> scattered all over the place will probably be hurt by using large
> pages.

I find that statement a bit misleading; consider a case where I'm
iterating through a 16M region, touching 1 word at 4k strides.
that's 4096 distinct 4k pages against 512 TLB entries, so small pages
would thrash badly, whereas 8x2M pages cover the whole region.
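
a rough way to check this is a toy TLB model. the sketch below assumes
a single fully-associative TLB with LRU replacement, which is a
simplification (the real 4k L2 TLB is 4-way, and there is an L1 in
front of it); count_tlb_misses is a hypothetical helper, not a
measurement of real hardware.

```python
from collections import OrderedDict

KB = 1024
MB = 1024 * KB

def count_tlb_misses(entries, page_size, region, stride, passes):
    """Count misses in a toy fully-associative LRU TLB (an assumption)."""
    tlb = OrderedDict()          # page number -> True, oldest entry first
    misses = 0
    for _ in range(passes):
        for addr in range(0, region, stride):
            page = addr // page_size
            if page in tlb:
                tlb.move_to_end(page)        # hit: mark most recently used
            else:
                misses += 1
                tlb[page] = True
                if len(tlb) > entries:
                    tlb.popitem(last=False)  # evict least recently used
    return misses

# Two passes over a 16 MB region, one touch every 4 KB.
small = count_tlb_misses(512, 4 * KB, 16 * MB, 4 * KB, passes=2)
large = count_tlb_misses(8, 2 * MB, 16 * MB, 4 * KB, passes=2)
print(small, large)   # 8192 8
```

in this model every access misses with 4k pages, since the 4096 pages
cycle through 512 entries, while the 2M case only takes the 8
compulsory misses on the first pass.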

> Anybody know if recent Intel processors have the same issue?

I don't really see how it could be avoided...

> obtained using cpuid instruction on an Opteron 146...
>
> L1 2-Mbyte TLB:
>    DTLB entries       = 8
>    ITLB entries       = 8
>    DTLB associativity = full
>    ITLB associativity = full
>
> L1 4-Kbyte TLB:
>    DTLB entries       = 32
>    ITLB entries       = 32
>    DTLB associativity = full
>    ITLB associativity = full
>
> L2 4-Kbyte TLB:
>    DTLB entries       = 512
>    ITLB entries       = 512
>    DTLB associativity = 4
>    ITLB associativity = 4

the Intel doc I looked at listed up to 128 entries for 4k pages and
64 entries for 2M or 4M pages. it didn't seem to address Core 2,
though, which probably has more than the Pentium M.