[Beowulf] Woodcrest Memory bandwidth

Stu Midgley sdm900 at gmail.com
Mon Aug 14 16:35:29 PDT 2006


Sorry, forgot to reply all... don't you hate Gmail's interface sometimes?


What is the memory latency of the Woodcrest machines?  I ask because
memory latency largely determines the memory bandwidth you can achieve.

If Intel hasn't made any improvements in latency, then the limited
number of outstanding loads an x86-64 core can keep in flight will limit
the bandwidth regardless of the MB/s you throw at it.
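
To put rough numbers on that argument: by Little's Law, the sustained
bandwidth of one core is bounded by (outstanding loads x cache line size)
/ latency.  A minimal Python sketch follows.  The function name and the
example figures (8 loads in flight, 64-byte lines, 100 ns latency) are
illustrative assumptions, not measured Woodcrest values.

```python
def max_bandwidth_mb_s(outstanding_loads, line_bytes, latency_ns):
    """Little's Law bound on bandwidth: concurrency * bytes per load / latency.

    All inputs are hypothetical placeholders; substitute measured values.
    Returns MB/s, with 1 MB = 10**6 bytes.
    """
    if latency_ns <= 0:
        raise ValueError("latency_ns must be positive")
    bytes_per_ns = outstanding_loads * line_bytes / latency_ns
    # 1 byte/ns equals 10**9 bytes/s, which is 1000 MB/s.
    return bytes_per_ns * 1000.0


if __name__ == "__main__":
    # Assumed figures for illustration only.
    for latency in (100.0, 50.0):
        bound = max_bandwidth_mb_s(outstanding_loads=8, line_bytes=64,
                                   latency_ns=latency)
        print(f"latency {latency:.0f} ns -> at most {bound:.0f} MB/s per core")
```

With the number of loads in flight fixed, halving the latency doubles the
bound, which is why the latency figure matters more than the peak MB/s
of the memory channels.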



On 8/15/06, Richard Walsh <rbw at ahpcrc.org> wrote:
> Joe Landman wrote:
>
>  >4-threads
>  >
>  >Copy:        6645.4125       0.0965       0.0963       0.0976
>  >Scale:       6994.6233       0.0916       0.0915       0.0917
>  >Add:         6373.0207       0.1508       0.1506       0.1509
>  >Triad:       6710.7522       0.1432       0.1431       0.1433
>  >
>  >I may have been Bill's 10 GB/s source, and that may have been a mixup
>  >on my part.
>
> 10 GB/sec of course comes from the advertised bandwidth off a single socket.
>
> Yes, this is quite disappointing because the "on-paper" numbers from each
> socket to the Northbridge are nicely balanced with the 4-channel FB-DIMM
> numbers.  Then there is all the discussion of the advantages of the
> shared L2 cache and the shared-cache-intelligent pre-fetch engines and
> cool memory disambiguation.  Seemingly irrelevant, I guess, if the
> Northbridge is still underdesigned.
>
> Is it possible that the compilers are just not ready to effectively use
> some of these features?  On the other hand, STREAM is sufficiently
> simple that these features probably do not come into play anyway.  The
> real application benchmarks with some quantity of locality look better.
>
> Anyone working on compilers care to comment on what the bottleneck
> really is?
>
> rbw



-- 
Dr Stuart Midgley
sdm900 at gmail.com


