[Beowulf] bizarre scaling behavior on a Nehalem

Tom Elken tom.elken at qlogic.com
Thu Aug 13 16:37:07 PDT 2009


> On Behalf Of Christian Bell
> On Aug 12, 2009, at 11:14 AM, Bill Broadley wrote:
> 
> > Is it really necessary for dynamic arrays
> >  to be substantially slower than static?
> 
> Yes -- with pointers, the compiler assumes (by default) that the
> pointers can alias each other, which can prevent aggressive
> optimizations that are otherwise possible with arrays. 
 ...
>  I remember stacking half a dozen pragmas over a
> 3-line loop on a Cray C compiler years ago to ensure that accesses
> were suitably optimized (or in this case, vectorized).

To add some details to what Christian says, the HPC Challenge version of STREAM uses dynamic arrays and is hard to optimize.  I don't know what's best with current compiler versions, but you could try some of these that were used in past HPCC submissions with your program, Bill:

PathScale 2.2.1 on Opteron:
Base OPT flags: -O3 -OPT:Ofast:fold_reassociate=0 
STREAMFLAGS=-O3 -OPT:Ofast:fold_reassociate=0 -OPT:alias=restrict:align_unsafe=on -CG:movnti=1

Intel C/C++ Compiler 10.1 on Harpertown CPUs:
Base OPT flags: -O2 -xT -ansi-alias -ip -i-static

Intel recently used
Intel C/C++ Compiler 11.0.081 on Nehalem CPUs:
	 -O2 -xSSE4.2 -ansi-alias -ip
and got good STREAM results in their HPCC submission on their Endeavor cluster.
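
To make the aliasing point concrete, here is a minimal sketch of a STREAM-style triad over dynamically allocated arrays, with the pointers marked C99 restrict. It is not taken from the HPCC source; the names triad_restrict and triad_checksum are made up for illustration. The restrict qualifier is a source-level promise that the three arrays do not overlap, which, as I understand it, is roughly the assumption that an option like -OPT:alias=restrict asks the compiler to make globally. Whether it actually helps on a given compiler and CPU is something you would have to measure.

```c
#include <assert.h>
#include <stdlib.h>

/* Triad kernel: a[i] = b[i] + scalar * c[i].
 * "restrict" promises the compiler that a, b and c never overlap,
 * so it is free to vectorize the loop without alias checks. */
void triad_restrict(double *restrict a, const double *restrict b,
                    const double *restrict c, double scalar, size_t n)
{
    for (size_t i = 0; i < n; i++)
        a[i] = b[i] + scalar * c[i];
}

/* Hypothetical helper: allocates three dynamic arrays of n doubles,
 * runs the kernel once, and returns the sum of a[].
 * Returns -1.0 if any allocation fails. */
double triad_checksum(size_t n)
{
    assert(n > 0);

    double *a = malloc(n * sizeof *a);
    double *b = malloc(n * sizeof *b);
    double *c = malloc(n * sizeof *c);
    if (a == NULL || b == NULL || c == NULL) {
        free(a);
        free(b);
        free(c);
        return -1.0;
    }

    for (size_t i = 0; i < n; i++) {
        b[i] = 1.0;
        c[i] = 2.0;
    }

    triad_restrict(a, b, c, 3.0, n);   /* each a[i] becomes 1.0 + 3.0 * 2.0 */

    double sum = 0.0;
    for (size_t i = 0; i < n; i++)
        sum += a[i];

    free(a);
    free(b);
    free(c);
    return sum;
}
```

Calling triad_checksum(n) from a small main() and timing it with and without the restrict qualifiers (same compiler flags otherwise) would be one way to see how much the aliasing assumption matters for your build. Passing overlapping arrays to a restrict-qualified function is undefined behavior, so this only applies when the arrays really are distinct.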

-Tom

> 
> 
> 	. . christian