[Beowulf] AMD64 results...
Kozin, I (Igor)
i.kozin at dl.ac.uk
Thu Dec 16 05:27:33 PST 2004
Hi Bill, very interesting results.
> Ah, got icc-8.1 to cooperate, dual 2.2 Ghz opteron+pc3200+2.4 kernel,
> 915.5MB array:
> -O1
> Function      Rate (MB/s)   Avg time   Min time   Max time
> Copy:           2285.8039     0.2640     0.2800     0.3200
> Scale:          2206.9798     0.2690     0.2900     0.3000
> Add:            2341.5554     0.3740     0.4100     0.4200
> Triad:          2181.9031     0.4060     0.4400     0.4800
>
> -O2
> Function      Rate (MB/s)   Avg time   Min time   Max time
> Copy:           2370.4856     0.2570     0.2700     0.3400
> Scale:          2285.8280     0.2670     0.2800     0.3400
> Add:            2461.6513     0.3710     0.3900     0.4600
> Triad:          2285.8229     0.3920     0.4200     0.5000
Please note that in the -O1 and -O2 runs above your "Avg time" is less than your "Min time".
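That check is easy to script. A minimal sketch in Python (the tuples are copied from the quoted -O1 and -O3 tables; the function name is made up for illustration) that lists the kernels whose average falls outside [min, max]:

```python
# Each row is (kernel, avg, min, max) in seconds, copied from the quoted tables.
rows_o1 = [
    ("Copy", 0.2640, 0.2800, 0.3200),
    ("Scale", 0.2690, 0.2900, 0.3000),
    ("Add", 0.3740, 0.4100, 0.4200),
    ("Triad", 0.4060, 0.4400, 0.4800),
]
rows_o3 = [
    ("Copy", 0.2730, 0.2600, 0.3400),
    ("Scale", 0.2910, 0.2700, 0.3500),
    ("Add", 0.4050, 0.3800, 0.4800),
    ("Triad", 0.4320, 0.4100, 0.5100),
]


def inconsistent_rows(rows):
    """Return the kernels for which min <= avg <= max does not hold."""
    return [name for name, avg, tmin, tmax in rows if not (tmin <= avg <= tmax)]


print("-O1:", inconsistent_rows(rows_o1))
print("-O3:", inconsistent_rows(rows_o3))
```

An average taken over the same samples as the minimum can never be smaller than it, so any kernel this flags points at the timing or the bookkeeping rather than at the memory system.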
> -O3
> Function      Rate (MB/s)   Avg time   Min time   Max time
> Copy:           2461.5867     0.2730     0.2600     0.3400
> Scale:          2370.4237     0.2910     0.2700     0.3500
> Add:            2526.3684     0.4050     0.3800     0.4800
> Triad:          2341.5151     0.4320     0.4100     0.5100
>
> The strange thing is they are 32 bit binaries, despite being built
> on a 64 bit os on a 64 bit hardware.
How do you know they are not 64-bit? From what I see, it is.
My system: quad 2.2 GHz Opteron, 9 GB, SLES 9, 2.4.21 kernel.
It seems my memory is a bit slower than yours.
PARAMETER (n=32000000,offset=0,ndim=n+offset,ntimes=50)
i.e. using 732 MB
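The 732 MB figure follows from the PARAMETER line if one assumes the usual STREAM layout of three double precision (8-byte) arrays of ndim elements each, with MB meaning 2^20 bytes. A quick sketch of that arithmetic (function and argument names are illustrative, not taken from the STREAM source):

```python
def stream_footprint_mib(n, offset=0, arrays=3, bytes_per_elem=8):
    """Total size of the STREAM arrays in MiB (2**20 bytes).

    Assumes `arrays` arrays of ndim = n + offset elements,
    each element `bytes_per_elem` bytes wide.
    """
    ndim = n + offset
    return arrays * ndim * bytes_per_elem / 2**20


# Values from the PARAMETER line: n=32000000, offset=0
print(f"{stream_footprint_mib(32_000_000):.1f} MiB")
```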
pathscale 1.4 -O3
Function      Rate (MB/s)   Avg time   Min time   Max time
Copy:           3555.5778     0.1444     0.1440     0.1450
Scale:          3483.0084     0.1473     0.1470     0.1480
Add:            3588.8372     0.2142     0.2140     0.2150
Triad:          3605.6772     0.2134     0.2130     0.2140
ifort -O3 -xW
Function      Rate (MB/s)   Avg time   Min time   Max time
Copy:           3657.1588     0.1458     0.1400     0.1500
Scale:          3657.1588     0.1475     0.1400     0.1500
Add:            3490.9503     0.2273     0.2200     0.2300
Triad:          3339.1509     0.2317     0.2300     0.2400
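For reference, the rates in these tables are consistent with STREAM's usual accounting: bytes moved per kernel divided by the minimum time, with Copy and Scale touching two arrays and Add and Triad three. A small sketch, assuming 8-byte elements and MB = 10^6 bytes (names are illustrative):

```python
# Number of arrays read or written per loop iteration for each kernel.
ARRAYS_TOUCHED = {"Copy": 2, "Scale": 2, "Add": 3, "Triad": 3}


def stream_rate_mb_s(kernel, n, min_time, bytes_per_elem=8):
    """Bandwidth in MB/s (MB = 1e6 bytes) derived from the minimum time in seconds."""
    bytes_moved = ARRAYS_TOUCHED[kernel] * bytes_per_elem * n
    return 1.0e-6 * bytes_moved / min_time


n = 32_000_000  # from the PARAMETER line
print(f"pathscale Copy:  {stream_rate_mb_s('Copy', n, 0.1440):.1f} MB/s")
print(f"pathscale Triad: {stream_rate_mb_s('Triad', n, 0.2130):.1f} MB/s")
print(f"ifort Copy:      {stream_rate_mb_s('Copy', n, 0.1400):.1f} MB/s")
```

The results agree with the reported rates to within the rounding of the printed times, which also shows that the rate column depends only on the "Min time" column.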
> Not sure why the timer is so lousy,
> I had to make the array large to get a reasonably accurate time:
This is indeed another interesting point. I'd really like to understand it.
In addition, when I re-run stream the rates vary quite a bit from run to run, despite
the high loop count (50) and the very small spread within a run (min & max are pretty close).
E.g. two more runs with ifort -O3 -xW:
Function      Rate (MB/s)   Avg time   Min time   Max time
Copy:           2560.0146     0.2094     0.2000     0.2200
Scale:          2560.0146     0.2094     0.2000     0.2200
Add:            2477.4389     0.3219     0.3100     0.3300
Triad:          2400.0023     0.3285     0.3200     0.3300

Function      Rate (MB/s)   Avg time   Min time   Max time
Copy:           3657.1588     0.1454     0.1400     0.1500
Scale:          3657.1588     0.1473     0.1400     0.1500
Add:            3490.9503     0.2256     0.2200     0.2300
Triad:          3339.1371     0.2300     0.2300     0.2300
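One thing visible in the ifort numbers: every min and max time is a multiple of 0.01 s, which would be consistent with a timer tick of about 10 ms (an assumption read off the output, not verified). The sketch below shows how far a one-tick error could move the Copy rate, and compares that band with the slow run (min time 0.20 s). Names are illustrative.

```python
def rate_bounds_mb_s(bytes_moved, measured_time, tick):
    """Range of rates (MB/s, MB = 1e6 bytes) if the true time is within one tick."""
    lo = 1.0e-6 * bytes_moved / (measured_time + tick)
    hi = 1.0e-6 * bytes_moved / (measured_time - tick)
    return lo, hi


COPY_BYTES = 2 * 8 * 32_000_000  # two 8-byte arrays of n = 32,000,000 elements
TICK = 0.01  # assumed timer granularity in seconds

lo, hi = rate_bounds_mb_s(COPY_BYTES, 0.14, TICK)
slow_rate = 1.0e-6 * COPY_BYTES / 0.20
print(f"fast run: {lo:.0f} to {hi:.0f} MB/s, slow run: {slow_rate:.0f} MB/s")
```

Under that assumption the slow run falls well below the band around the fast run, so timer granularity alone would not account for the run-to-run difference.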
Igor
> I played around with various mentioned optimizations (including -xW)
> on the manpage, I never managed a 64 bit binary with icc-8.1 though.
> The man page has numerous i32em and em64t references.
>
>
>
>
> --
> Bill Broadley
> Computational Science and Engineering
> UC Davis
I. Kozin (i.kozin at dl.ac.uk)
CCLRC Daresbury Laboratory
tel: 01925 603308
http://www.cse.clrc.ac.uk/disco