[Beowulf] Woodcrest Memory bandwidth

Peter Kjellstrom cap at nsc.liu.se
Tue Aug 15 10:21:02 PDT 2006


On Tuesday 15 August 2006 17:25, Richard Walsh wrote:
> Mark Hahn wrote:
> >>> Good point which makes perfect sense to me.
> >>> Given that the theoretical maximum is actually 21.3 GB/s
> >>> the real maximum Triad number must be 21.3/3 = 7.1 GB/s.
> >
> > I don't get this - triad does two reads and one write.
> > if you don't use store-through ('nt' versions of mov),
> > then the write also implies a read for write-allocate
> > (filling the cache line).
> > without store-through, the peak theoretical number reported by
> > stream should be 3*peak/4.  the 4 is because there are 3r+1w,
> > and the 3 because stream doesn't give credit for write-allocate.
>
> That looks right.  So, one socket, with write allocate, >>should<< show:
>
>       10.5 GB/sec * .75 or 7.875 GBytes/sec
>
> and two sockets 15.75 GBytes/sec.  The problem could be related
> to competitive/ineffective use of the shared L2 cache or a bottleneck
> in the North bridge.  It would seem that a look at how the performance
> grows as you add cores within versus across sockets should reveal this.
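
For reference, the write-allocate accounting quoted above can be sketched as a few lines of Python. This is only an illustration of the arithmetic in the thread: the function name and its parameters are made up here, and the 10.5 GB/s per-socket peak is the figure Richard uses, not something measured.

```python
# Sketch of the 3*peak/4 rule quoted above (names are hypothetical).

def stream_triad_ceiling(peak_gb_s, write_allocate=True):
    """Upper bound on the Triad rate stream can report for a given bus peak.

    stream credits Triad with 2 reads + 1 write = 3 transfers per element.
    With write-allocate the destination cache line is also read before it
    is written, so the bus carries 4 transfers but only 3 are credited.
    """
    counted = 3
    actual = 4 if write_allocate else 3
    return peak_gb_s * counted / actual

# 10.5 GB/s per socket is the peak assumed in the quoted message.
one_socket = stream_triad_ceiling(10.5)
two_sockets = 2 * one_socket

print("one socket, write-allocate :", one_socket, "GB/s")
print("two sockets, write-allocate:", two_sockets, "GB/s")
print("one socket, nt stores      :", stream_triad_ceiling(10.5, write_allocate=False), "GB/s")
```

With non-temporal stores (write_allocate=False) the credited and actual traffic are the same, so the ceiling is simply the bus peak.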

Here you go (Dell 2950 with 8 modules, stream compiled with icc-9.1 -O3):

[root@tbox3 streamd]# hostname ; date ; for i in 1 2 3 4 5 ; do export OMP_NUM_THREADS=$i ; ./streamd | egrep "Total memory re|Number of Th|Function |Copy:|Scale:|Add:|Triad:"; done
tbox3
Fri Aug 11 17:59:22 CEST 2006
Total memory required = 457.8 MB.
Number of Threads requested = 1
Function      Rate (MB/s)   Avg time     Min time     Max time
Copy:        3945.5494       0.0812       0.0811       0.0813
Scale:       2914.9758       0.1098       0.1098       0.1099
Add:         3227.5618       0.1488       0.1487       0.1489
Triad:       3219.5307       0.1492       0.1491       0.1493
Total memory required = 457.8 MB.
Number of Threads requested = 2
Function      Rate (MB/s)   Avg time     Min time     Max time
Copy:        4324.2058       0.0741       0.0740       0.0742
Scale:       2999.9626       0.1068       0.1067       0.1069
Add:         3309.2733       0.1451       0.1450       0.1452
Triad:       3309.7031       0.1451       0.1450       0.1452
Total memory required = 457.8 MB.
Number of Threads requested = 3
Function      Rate (MB/s)   Avg time     Min time     Max time
Copy:        5422.5441       0.0590       0.0590       0.0590
Scale:       4102.8364       0.0780       0.0780       0.0781
Add:         4487.2464       0.1070       0.1070       0.1070
Triad:       4487.7465       0.1070       0.1070       0.1070
Total memory required = 457.8 MB.
Number of Threads requested = 4
Function      Rate (MB/s)   Avg time     Min time     Max time
Copy:        6023.2969       0.0532       0.0531       0.0533
Scale:       4862.4855       0.0658       0.0658       0.0659
Add:         5264.1973       0.0912       0.0912       0.0913
Triad:       5268.1782       0.0911       0.0911       0.0911
Total memory required = 457.8 MB.
Number of Threads requested = 5
Function      Rate (MB/s)   Avg time     Min time     Max time
Copy:        5504.9004       0.0582       0.0581       0.0582
Scale:       4318.9044       0.0786       0.0741       0.1147
Add:         4705.1016       0.1042       0.1020       0.1216
Triad:       4705.2885       0.1038       0.1020       0.1184
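
To make the scaling easier to read, the filtered output above can be parsed and normalized against the single-thread run. The following is a rough sketch, not part of the original test: the function names are hypothetical, the sample text is just the thread-count and Triad lines copied from the run above, and the comparison against the quoted 15.75 GB/s two-socket figure assumes that GB there means 1000 MB.

```python
import re

# Thread counts and Triad lines copied from the run above (other kernels omitted).
SAMPLE = """\
Number of Threads requested = 1
Triad:       3219.5307       0.1492       0.1491       0.1493
Number of Threads requested = 2
Triad:       3309.7031       0.1451       0.1450       0.1452
Number of Threads requested = 3
Triad:       4487.7465       0.1070       0.1070       0.1070
Number of Threads requested = 4
Triad:       5268.1782       0.0911       0.0911       0.0911
Number of Threads requested = 5
Triad:       4705.2885       0.1038       0.1020       0.1184
"""

def parse_stream(text):
    """Return {threads: {kernel: rate_mb_s}} from egrep-filtered stream output."""
    results = {}
    threads = None
    for line in text.splitlines():
        m = re.match(r"Number of Threads requested = (\d+)", line)
        if m:
            threads = int(m.group(1))
            results[threads] = {}
            continue
        m = re.match(r"(Copy|Scale|Add|Triad):\s+([\d.]+)", line)
        if m and threads is not None:
            results[threads][m.group(1)] = float(m.group(2))
    return results

def triad_scaling(results):
    """Triad rate of each run relative to the 1-thread run."""
    base = results[1]["Triad"]
    return {n: r["Triad"] / base for n, r in sorted(results.items()) if "Triad" in r}

results = parse_stream(SAMPLE)
scaling = triad_scaling(results)

# Assumption: the quoted 15.75 GB/s two-socket ceiling, taken as 15750 MB/s.
CEILING_MB_S = 15750.0

for n, speedup in scaling.items():
    rate = results[n]["Triad"]
    print(f"{n} threads: {rate:9.1f} MB/s  x{speedup:.2f} vs 1 thread  "
          f"{100 * rate / CEILING_MB_S:.0f}% of quoted ceiling")
```

Note that OMP_NUM_THREADS alone does not say which cores or sockets the threads landed on, so this only shows how the rate grows with thread count, not the within-socket versus across-socket split.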

> Two cores on separate sockets should show higher numbers if it's
> an L2 cache issue.  If they are the same as those for 2 cores on one
> socket then you have a problem with the North bridge or getting
> full bandwidth from the FB-DIMMs.
>
> A complication in this test could be that in the one core per socket case
> the whole L2 cache is allocated to a single core.  Watching performance
> change as the array sizes grow should reveal this.
>
> rbw