[Beowulf] AMD64 results...

Kozin, I (Igor) i.kozin at dl.ac.uk
Thu Dec 16 05:27:33 PST 2004


Hi Bill, very interesting results.
 
> Ah, got icc-8.1 to cooperate, dual 2.2 Ghz opteron+pc3200+2.4 kernel,
> 915.5MB array:
> -O1
> Function      Rate (MB/s)   Avg time     Min time     Max time
> Copy:        2285.8039       0.2640       0.2800       0.3200
> Scale:       2206.9798       0.2690       0.2900       0.3000
> Add:         2341.5554       0.3740       0.4100       0.4200
> Triad:       2181.9031       0.4060       0.4400       0.4800
> 
> -O2
> Function      Rate (MB/s)   Avg time     Min time     Max time
> Copy:        2370.4856       0.2570       0.2700       0.3400
> Scale:       2285.8280       0.2670       0.2800       0.3400
> Add:         2461.6513       0.3710       0.3900       0.4600
> Triad:       2285.8229       0.3920       0.4200       0.5000

Please note that your "average time" is sometimes less than the "min time".
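A quick way to spot such rows is to compare the columns directly. The sketch below hard-codes the -O1 and -O2 timings quoted above; the helper name `inconsistent_rows` is made up for illustration and is not part of STREAM.

```python
# Timings quoted above, in seconds: (kernel, avg, min, max).
ROWS = {
    "-O1": [
        ("Copy", 0.2640, 0.2800, 0.3200),
        ("Scale", 0.2690, 0.2900, 0.3000),
        ("Add", 0.3740, 0.4100, 0.4200),
        ("Triad", 0.4060, 0.4400, 0.4800),
    ],
    "-O2": [
        ("Copy", 0.2570, 0.2700, 0.3400),
        ("Scale", 0.2670, 0.2800, 0.3400),
        ("Add", 0.3710, 0.3900, 0.4600),
        ("Triad", 0.3920, 0.4200, 0.5000),
    ],
}


def inconsistent_rows(rows):
    """Return the kernels whose average time lies outside [min, max]."""
    return [name for name, avg, tmin, tmax in rows if not (tmin <= avg <= tmax)]


for flag, rows in ROWS.items():
    print(flag, inconsistent_rows(rows))
```

A true average can never fall below the minimum, so any kernel this reports points at a problem in how the times were measured or accumulated.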
 
> -O3 
> Function      Rate (MB/s)   Avg time     Min time     Max time
> Copy:        2461.5867       0.2730       0.2600       0.3400
> Scale:       2370.4237       0.2910       0.2700       0.3500
> Add:         2526.3684       0.4050       0.3800       0.4800
> Triad:       2341.5151       0.4320       0.4100       0.5100
> 
> The strange thing is they are 32 bit binaries, despite being built
> on a 64 bit os on a 64 bit hardware.

How do you know they are not 64-bit? From what I see, they are.
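One way to settle this is to look at the ELF header of the binary: the fifth byte (EI_CLASS) is 1 for a 32-bit object and 2 for a 64-bit one. A minimal sketch follows; the function name `elf_class` is an illustrative choice, and the binary paths are whatever you pass on the command line.

```python
import sys


def elf_class(path):
    """Return 32 or 64 for an ELF file, or None if the file is not ELF."""
    with open(path, "rb") as f:
        header = f.read(5)
    # Every ELF file starts with the magic bytes 0x7f 'E' 'L' 'F'.
    if len(header) < 5 or header[:4] != b"\x7fELF":
        return None
    # EI_CLASS: 1 = ELFCLASS32, 2 = ELFCLASS64.
    return {1: 32, 2: 64}.get(header[4])


if __name__ == "__main__":
    # Usage: python elf_class.py <binary> [<binary> ...]
    for path in sys.argv[1:]:
        print(path, elf_class(path))
```

The `file` utility reports the same information if it is installed.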

My system: quad 2.2 GHz Opteron, 9 GB, SLES 9, kernel 2.4.21.
It seems my memory is a bit slower than yours.
      PARAMETER (n=32000000,offset=0,ndim=n+offset,ntimes=50)
i.e. using 732 MB
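For reference, the footprint follows from the array length. The sketch below assumes the standard STREAM layout of three arrays of 8-byte double precision values and counts MB as 2^20 bytes; the function name is made up for illustration.

```python
BYTES_PER_WORD = 8  # assumption: DOUBLE PRECISION arrays
N_ARRAYS = 3        # assumption: the three STREAM arrays a, b, c


def stream_footprint_mib(n, offset=0):
    """Total size of the STREAM arrays in units of 2**20 bytes."""
    return N_ARRAYS * BYTES_PER_WORD * (n + offset) / 2**20


print(f"n = 32,000,000: {stream_footprint_mib(32_000_000):.1f} MB")
```

Under the same assumptions, the 915.5 MB array quoted earlier would correspond to n = 40,000,000, though that value is inferred and not stated in the quoted message.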

pathscale 1.4 -O3
Function      Rate (MB/s)   Avg time     Min time     Max time
Copy:       3555.5778      0.1444      0.1440      0.1450
Scale:      3483.0084      0.1473      0.1470      0.1480
Add:        3588.8372      0.2142      0.2140      0.2150
Triad:      3605.6772      0.2134      0.2130      0.2140

ifort -O3 -xW
Copy:       3657.1588      0.1458      0.1400      0.1500
Scale:      3657.1588      0.1475      0.1400      0.1500
Add:        3490.9503      0.2273      0.2200      0.2300
Triad:      3339.1509      0.2317      0.2300      0.2400

> Not sure why the timer is so lousy,
> I had to make the array large to get a reasonably accurate time:

This is indeed another interesting point. I'd really like to understand it.
In addition, when I re-run STREAM, the rates vary quite a bit from run to run
despite the high loop count (50) and the very small std dev within a run
(min & max are pretty close).

E.g. two more runs with ifort -O3 -xW:
Copy:       2560.0146      0.2094      0.2000      0.2200
Scale:      2560.0146      0.2094      0.2000      0.2200
Add:        2477.4389      0.3219      0.3100      0.3300
Triad:      2400.0023      0.3285      0.3200      0.3300

Copy:       3657.1588      0.1454      0.1400      0.1500
Scale:      3657.1588      0.1473      0.1400      0.1500
Add:        3490.9503      0.2256      0.2200      0.2300
Triad:      3339.1371      0.2300      0.2300      0.2300
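The ifort numbers look consistent with a coarse timer. The sketch below assumes that STREAM derives each rate from the minimum time as bytes moved per second (16 bytes per element for Copy and Scale, 24 for Add and Triad, with MB = 10^6 bytes), and that the timer advances in steps of roughly 0.01 s, as the min and max columns suggest. The names used are illustrative.

```python
N = 32_000_000  # array length from the PARAMETER statement above

# Bytes moved per array element for each kernel (8-byte words assumed).
BYTES_PER_ELEMENT = {"Copy": 16, "Scale": 16, "Add": 24, "Triad": 24}


def rate_mb_s(kernel, min_time, n=N):
    """Rate in MB/s (MB = 1e6 bytes) for a given minimum time in seconds."""
    return 1e-6 * BYTES_PER_ELEMENT[kernel] * n / min_time


# One or two timer ticks of difference in the minimum time shift the rate a lot.
for t in (0.14, 0.15, 0.20):
    print(f"Copy, min time {t:.2f} s: {rate_mb_s('Copy', t):.1f} MB/s")
```

With these assumptions the formula gives values close to the reported rates for the listed min times, which would explain why the rates land on a few discrete values. It does not explain why the minimum time itself moves between 0.14 s and 0.20 s from one run to the next.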

Igor
 
> I played around with various mentioned optimizations (including -xW)
> on the manpage, I never managed a 64 bit binary with icc-8.1 though.
> The man page has numerous i32em and em64t references.
> 
> 
> 
> 
> -- 
> Bill Broadley
> Computational Science and Engineering
> UC Davis
> _______________________________________________
> Beowulf mailing list, Beowulf at beowulf.org
> To change your subscription (digest mode or unsubscribe) 
> visit http://www.beowulf.org/mailman/listinfo/beowulf
> 

I. Kozin  (i.kozin at dl.ac.uk)
CCLRC Daresbury Laboratory
tel: 01925 603308
http://www.cse.clrc.ac.uk/disco
