[Beowulf] Woodcrest - Shared L2 cache
Renato S. Silva
rssr at lncc.br
Wed Aug 16 08:39:31 PDT 2006
Hi Folks
Does anyone have information about how the L2 cache is shared between the two cores?
Thanks
Renato Silva
Richard Walsh wrote:
> Mark Hahn wrote:
>
>>>> Good point which makes perfect sense to me.
>>>> Given that the theoretical maximum is actually 21.3 GB/s
>>>> the real maximum Triad number must be 21.3/3 = 7.1 GB/s.
>>>
>>
>> I don't get this - triad does two reads and one write.
>> if you don't use non-temporal stores (the 'nt' versions of mov),
>> then the write also implies a read for write-allocate
>> (filling the cache line).
>> without non-temporal stores, the peak theoretical number reported by
>> stream should be 3*peak/4. the 4 is because there are 3r+1w,
>> and the 3 because stream doesn't give credit for write-allocate.
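As a rough check of the accounting above, here is a minimal Python sketch. The function name `stream_triad_ceiling` and its parameters are made up for illustration; the only inputs taken from the thread are the 21.3 GB/s peak and the 3-of-4 transfer ratio.

```python
def stream_triad_ceiling(peak_gbs, write_allocate=True):
    """Upper bound (GB/s) on what STREAM can report for triad.

    peak_gbs is the theoretical memory bandwidth. Illustrative helper,
    not part of STREAM itself.
    """
    counted = 3  # STREAM credits 2 reads + 1 write per element
    # With write-allocate the store first reads the destination line,
    # so the memory system actually moves 3 reads + 1 write.
    moved = 4 if write_allocate else 3
    return peak_gbs * counted / moved


if __name__ == "__main__":
    peak = 21.3  # GB/s, theoretical maximum quoted in the thread
    print("with write-allocate:   ", stream_triad_ceiling(peak, True))
    print("without write-allocate:", stream_triad_ceiling(peak, False))
```

With write-allocate the ceiling is three quarters of the peak; stores that skip the allocate read leave the ceiling at the full peak. Neither case gives peak/3.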
>
> That looks right. So, one socket, with write allocate, >>should<< show:
>
> 10.5 GB/sec * .75 or 7.875 GBytes/sec
>
> and two sockets 15.75 GBytes/sec. The problem could be related
> to competitive/ineffective use of the shared L2 cache or a bottleneck
> in the North bridge. It would seem that a look at how the performance
> grows as you add cores within versus across sockets should reveal this.
>
> Two cores on separate sockets should show higher numbers if it's
> an L2 cache issue. If they are the same as those for 2 cores on one
> socket then you have a problem with the North bridge or getting
> full bandwidth from the FB-DIMMs.
> A complication in this test could be that in the one core per socket case
> the whole L2 cache is allocated to a single core. Watching performance
> change as the array sizes grow should reveal this.
> rbw
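The comparison described above could be scripted roughly as follows. This is only a sketch: `diagnose`, `triad_working_set_bytes`, `fits_in_l2`, the 5% tolerance and the default 4 MB L2 size are assumptions for illustration, not anything stated in the thread. The bandwidth figures would have to come from real STREAM runs pinned to the chosen cores.

```python
def diagnose(same_socket_gbs, separate_sockets_gbs, tolerance=0.05):
    """Apply the thread's reasoning to two triad measurements (GB/s).

    same_socket_gbs:      2 cores on one socket
    separate_sockets_gbs: 1 core on each of two sockets
    tolerance:            relative difference treated as "the same" (assumed)
    """
    if same_socket_gbs <= 0 or separate_sockets_gbs <= 0:
        raise ValueError("bandwidths must be positive")
    gain = (separate_sockets_gbs - same_socket_gbs) / same_socket_gbs
    if gain > tolerance:
        return "separate sockets faster: consistent with an L2 cache issue"
    if abs(gain) <= tolerance:
        return "about the same: look at the north bridge or FB-DIMM bandwidth"
    return "same socket faster: not covered by the reasoning in the thread"


def triad_working_set_bytes(n, element_bytes=8):
    """Bytes touched by triad over three arrays of n elements each."""
    return 3 * n * element_bytes


def fits_in_l2(n, l2_bytes=4 * 1024 * 1024):
    """True if the triad working set fits in an L2 of l2_bytes (assumed size)."""
    return triad_working_set_bytes(n) <= l2_bytes


if __name__ == "__main__":
    # Sweep array sizes to see where the working set leaves the cache.
    for n in (10_000, 100_000, 1_000_000, 10_000_000):
        print(n, triad_working_set_bytes(n), fits_in_l2(n))
```

Only array sizes whose working set is well beyond the cache say anything about memory bandwidth, which is why the one-core-per-socket case needs the sweep.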
>
>
>>
>>> Then how do you explain a dual opteron with two 6.4GB/sec (peak)
>>> memory system, 12.8GB/sec total per node managing 9-10GB/sec?
>>>
>>> 12.8/3=4.26GB/sec. People are seeing well over twice that.
>>
>>
>> since pathscale uses non-temporal stores, the peak really should be 12.8,
>> so achieving 9-10 is decent but not paradoxical. (the peak would
>> correspond to 1.07 Gflops, significantly below the peak theoretical
>> pipeline rate of 2*clock flops...)
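The 1.07 Gflops figure can be reproduced with a few lines of Python. This is a sketch that assumes 8-byte elements and the usual triad kernel, a[i] = b[i] + q*c[i], which does one multiply and one add per element. The helper name `triad_gflops` is invented for illustration.

```python
def triad_gflops(bandwidth_gbs, element_bytes=8):
    """Gflops sustained by triad at a given STREAM-counted bandwidth (GB/s)."""
    bytes_per_element = 3 * element_bytes  # 2 loads + 1 store, as STREAM counts
    flops_per_element = 2                  # one multiply, one add
    return bandwidth_gbs * flops_per_element / bytes_per_element


if __name__ == "__main__":
    peak = 12.8  # GB/s, dual Opteron node total from the thread
    print("Gflops at peak bandwidth:", round(triad_gflops(peak), 2))
    for measured in (9.0, 10.0):  # range reported in the thread
        print(measured, "GB/s is", round(100 * measured / peak, 1), "% of peak")
```

On these numbers the reported 9-10 GB/s is roughly 70 to 78 percent of the 12.8 GB/s peak.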