[Beowulf] Again about NUMA (numactl and taskset)

Vincent Diepeveen diep at xs4all.nl
Tue Jun 24 14:28:46 PDT 2008

What would interest me is a description of how you get your information
on how instructions pair and which instruction sequences are weak on a
given processor.

For example, the by now old AMD K8 had the property that an integer
multiplication blocks all other execution units during the first and
last cycle of its latency.

Even so, this still kicks butt compared to Intel's implementation of
integer multiplication (proof of this statement: most of GMP's functions
are dominated by integer multiplication, and the AMD K8 already murders
Core 2 as a result). This kind of thing is totally crucial to know when
building a compiler.

Did this information already sit in PathScale's database of "processor

If so, where did you get the knowledge?


On Jun 24, 2008, at 11:07 PM, Greg Lindahl wrote:

> On Tue, Jun 24, 2008 at 10:21:01PM +0200, Vincent Diepeveen wrote:
>> The PG compiler and especially the PathScale compiler do rather well
>> at benchmarks, especially the latter, yet on our codes they're real
>> ugly. Maybe they do better for floating-point-oriented workloads,
>> which doesn't describe game tree search.
>
> There are certainly unusual codes out there, and PathScale has gotten
> a lot of examples sent in by customers, thanks to the "if we're slower
> than someone else, it's a bug" philosophy. This allowed us to improve
> the compiler on a lot of non-benchmark codes.
>
> In your case, I'd suggest that you use pathopt to search for better
> flags.
>
> -- greg
