SSE & compilers

Pedro Díaz Jiménez pdiaz88 at
Tue Aug 21 09:28:17 PDT 2001


The Intel compiler does a pretty good job optimizing for the Pentium family 
(it can also compile/optimize for the Itanium, but I don't have such a beast 
to test it on). Right now it is in beta, and a trial-ware version can be 
downloaded from Intel's webpage (available until September).

An unfinished review of the C/C++ compiler (it will probably be finished this 
October; not too much spare time here :-) can be found here

In the comments there is a link to another review, covering both the Fortran 
and C/C++ compilers, from Computational Battery

From the tests I've made so far (little ones: mandelbrot, sorting, (alleged) 
RC4 cipher, factorials, and a couple of real-world apps: lame and bzip2), the 
speed gain in C code is significant. You can instruct the compiler to generate 
MMX and SSE code, and decide whether to retain compatibility with chips that 
lack MMX/SSE. None of those tests exercises an interesting feature of the 
compiler yet: profile-based optimization (they will, someday).
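To make the MMX/SSE point concrete, here is a minimal sketch of what hand-written SSE code with a plain-C fallback can look like. It is only an illustration, not what the Intel compiler generates: the function name add_floats is made up for this example, and it assumes a compiler that provides the <xmmintrin.h> intrinsics and defines the __SSE__ macro when SSE is enabled (gcc-style compilers do; others may not). The choice is made at compile time, so the same source still builds for chips without SSE.

```c
#include <stddef.h>

#if defined(__SSE__)
#include <xmmintrin.h>   /* SSE intrinsics: __m128, _mm_add_ps, ... */
#endif

/* Hypothetical helper: element-wise sum, out[i] = a[i] + b[i]. */
void add_floats(const float *a, const float *b, float *out, size_t n)
{
    size_t i = 0;
#if defined(__SSE__)
    /* SSE path: four packed single-precision floats per iteration. */
    for (; i + 4 <= n; i += 4) {
        __m128 va = _mm_loadu_ps(a + i);   /* unaligned loads */
        __m128 vb = _mm_loadu_ps(b + i);
        _mm_storeu_ps(out + i, _mm_add_ps(va, vb));
    }
#endif
    /* Scalar path: the leftover tail, or everything on non-SSE builds. */
    for (; i < n; i++)
        out[i] = a[i] + b[i];
}
```

Calling add_floats(a, b, out, 6) on six-element arrays runs one SSE step for the first four elements and the scalar loop for the last two; on a build without SSE the scalar loop does all six.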

Other tests not yet included in the review are my implementation of the IBM 
MARS (AES finalist) cipher (25 Mbps gcc optimized vs. 40 Mbps icc optimized, 
all in ECB mode) and testing with the PovRay raytracer (not significant 
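For readers wondering how an ECB-mode throughput figure like the ones above might be obtained, here is a rough sketch. It is not the benchmark used for those numbers. Assumptions: "Mbps" is read as megabits per second, toy_encrypt_block is a made-up XOR stand-in (not MARS, not secure), and all function names are hypothetical. The only cipher-specific fact used is the 128-bit block size required of the AES candidates.

```c
#include <stddef.h>
#include <stdint.h>
#include <time.h>

#define BLOCK_BYTES 16   /* 128-bit blocks, as for all AES candidates */

/* Hypothetical stand-in for a real block cipher: NOT MARS, not secure. */
static void toy_encrypt_block(uint8_t *block, const uint8_t *key)
{
    for (int i = 0; i < BLOCK_BYTES; i++)
        block[i] ^= key[i];
}

/* ECB mode: each block is encrypted independently with the same key.
 * A trailing partial block, if any, is left untouched. */
void ecb_encrypt(uint8_t *buf, size_t len, const uint8_t *key)
{
    for (size_t off = 0; off + BLOCK_BYTES <= len; off += BLOCK_BYTES)
        toy_encrypt_block(buf + off, key);
}

/* Convert a byte count and an elapsed time into megabits per second. */
double throughput_mbps(size_t bytes, double seconds)
{
    return (double)bytes * 8.0 / 1e6 / seconds;
}

/* Time one pass over buf with clock(); returns 0 if the run was too
 * short for the clock resolution to measure. */
double measure_ecb_mbps(uint8_t *buf, size_t len, const uint8_t *key)
{
    clock_t start = clock();
    ecb_encrypt(buf, len, key);
    double secs = (double)(clock() - start) / CLOCKS_PER_SEC;
    return secs > 0.0 ? throughput_mbps(len, secs) : 0.0;
}
```

Under the megabit reading, 5,000,000 bytes encrypted in one second is 40 Mbps, and going from 25 to 40 Mbps is a 1.6x speedup. In practice the buffer should be large enough (or the pass repeated enough times) for clock() to register a meaningful interval.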

The compiler lacks some GNU gcc extensions, making it difficult or impossible 
to test some of the programs I wanted to (GnuPG and MySQL). Because of this 
it is also impossible to compile the Linux kernel right now (Intel says it 
will fix this in the final release).
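As an illustration of the kind of construct that causes this trouble, here is a small sketch using two well-known GNU C extensions, statement expressions and __typeof__, with an ISO C fallback. I am not claiming these are the specific extensions the beta is missing; the macro MAX_OF and the function largest are made up for the example.

```c
#if defined(__GNUC__)
/* GNU C only: statement expression plus __typeof__.
 * Each argument is evaluated exactly once. */
#define MAX_OF(a, b) \
    ({ __typeof__(a) a_ = (a); __typeof__(b) b_ = (b); a_ > b_ ? a_ : b_; })
#else
/* ISO C fallback: portable, but one argument is evaluated twice,
 * so arguments with side effects must be avoided. */
#define MAX_OF(a, b) ((a) > (b) ? (a) : (b))
#endif

/* Hypothetical user of the macro: largest element of v[0..n-1], n >= 1. */
int largest(const int *v, int n)
{
    int best = v[0];
    for (int i = 1; i < n; i++)
        best = MAX_OF(best, v[i]);
    return best;
}
```

Code written only against the first branch will not build on a compiler without those extensions; projects that want to be portable have to carry a fallback like the second branch.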

Take a look at where you will find the 
results of the benchmarks and an interesting Intel paper about the 
optimization features in their compiler


Guys, there is an interesting poll at Planet Cluster (homepage). I'd 
appreciate your votes.

Last, but not least, Rajkumar Buyya (IEEE TFCC) and I are preparing a 
cluster FAQ. It is unfinished, and can be found here:
(yeah, I should make a link or something :-)

The FAQ is about clusters in general, not specific to Beowulf clusters. We 
would appreciate your comments and suggestions.

On Tuesday 21 August 2001 13:19, Dan Kirkpatrick wrote:
> 1. What do you know about SSE ? Apparently the p III and P 4 have extra
> hardware for fp work which is ignored by current compilers but can buy big
> factors in speed...
> we can code in assembler for it pretty easily if the PIII processors have
> it.  Comments?
> 2. Apparently there are several optimizing compilers out there (like
> portland) which do better than gcc.  Any suggestions?  Information on
> costs?
> Thanks!
> Dan
> =======================================================
> Dan Kirkpatrick                   dkirk at
> Computer Systems Manager
> Department of Physics
> Syracuse University, Syracuse, NY
>    Fax:(315) 443-9103
> =======================================================

-- 

 * Pedro Diaz Jimenez: pdiaz88 at, pdiaz at 
 * GPG KeyID: E118C651                              
 * Fingerprint: 1FD9 163B 649C DDDC 422D  5E82 9EEE 777D E118 C65
 * Clustering & H.P.C. news and documentation       
 * "La sabiduria me persigue, pero yo soy mas rapido"


