We have a project that I have timed on various CPUs and compilers. It
makes heavy use of FFTW (single precision).

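Running i386 code natively does very well for us on the Opteron (221
seconds versus 189 seconds for x86_64 code), unlike on the Itanium 2
(1515 seconds versus 234 seconds for IA64 code).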

*Itanium 2 1.0 GHz on Red Hat Linux Advanced Workstation release 2.1AW*
234 seconds (IA64 code and ECC, -fno-alias -IPF_fma -Ob2 -ipo -restrict)
1515 seconds (i386 code and GCC)

*AMD Opteron 1.4 GHz on United Linux 1.0*
189 seconds (gcc, x86_64 code)
221 seconds (gcc, i386 code)

*Athlon 1.5 GHz (AMD Athlon(tm) MP Processor 1800+) on Red Hat 8.0*
190 seconds (gcc, i386 code)

*Intel Xeon 1.7 GHz on Red Hat 8.0*
267 seconds (gcc, i386 code)

*Sun Intel Xeon 2.8 GHz on Red Hat 7.3*
150 seconds (gcc, i386 code)

*Alpha 667 MHz*
430 seconds (Compaq compiler for both, optimized code)

*Intel Xeon 2.2 GHz on Red Hat 7.3*
208 seconds (gcc, i386 code)
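
For anyone who wants the i386 penalty as a ratio, here is a minimal sketch that divides the i386 time by the native time for the two machines where both were measured. The numbers are copied from the timings above; the dictionary layout and the function name `i386_slowdown` are just illustrative choices. Note that the Itanium 2 pair also differs in compiler (ECC versus GCC), so its ratio is not purely an instruction-set effect.

```python
# Timings in seconds, copied from the results above.
# "native" is the IA64 (ECC) or x86_64 (gcc) build; "i386" is the gcc i386 build.
timings = {
    "Itanium 2 1.0 GHz": {"native": 234.0, "i386": 1515.0},
    "Opteron 1.4 GHz": {"native": 189.0, "i386": 221.0},
}


def i386_slowdown(native_seconds, i386_seconds):
    """Return how many times longer the i386 build runs than the native build."""
    return i386_seconds / native_seconds


# Ratio per machine; values above 1.0 mean the i386 build is slower.
slowdowns = {
    cpu: i386_slowdown(t["native"], t["i386"]) for cpu, t in timings.items()
}

for cpu, ratio in slowdowns.items():
    print(f"{cpu}: i386 build takes {ratio:.2f}x the native time")
```

On these figures the Opteron ratio stays close to 1, while the Itanium 2 ratio is several times larger.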


