[Beowulf] Re: computing on Altix? (Andrew Piskorski)
hahn at physics.mcmaster.ca
Mon Sep 12 13:19:30 PDT 2005
> >Google for "superlinear speedup". Most likely, as you split up your
> >fixed problem size among more processors, more and more of it fits
> >into the processor cache, where it runs much faster due to fewer main
> >memory accesses.
also google for "strong scaling" and contrast to "weak scaling".
the former assumes a fixed problem size and a range of ncpus;
the latter assumes a fixed problem *per* cpu. I suspect you'll have
a hard time showing superlinear speedup under weak scaling ;)
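
to make the contrast concrete, here is a minimal sketch of a toy cost model. it is not a measurement of any real machine. the names (run_time, strong_scaling_speedup, weak_scaling_speedup), the 4x penalty for out-of-cache data, and the 9-unit cache are all assumptions picked for illustration, and communication cost is ignored entirely.

```python
def run_time(work_per_cpu, cache_size, fast=1.0, slow=4.0):
    """Toy model: data that fits in cache costs `fast` per unit,
    the remainder costs `slow` per unit (both values are assumed)."""
    in_cache = min(work_per_cpu, cache_size)
    return in_cache * fast + (work_per_cpu - in_cache) * slow


def strong_scaling_speedup(total_work, ncpus, cache_size):
    """Fixed total problem size, split evenly over ncpus."""
    t1 = run_time(total_work, cache_size)
    tn = run_time(total_work / ncpus, cache_size)
    return t1 / tn


def weak_scaling_speedup(work_per_cpu, ncpus, cache_size):
    """Fixed problem size per cpu: each cpu's working set, and so its
    cache behaviour, is the same no matter how many cpus are used."""
    t1 = run_time(work_per_cpu, cache_size)
    tn = run_time(work_per_cpu, cache_size)
    return ncpus * t1 / tn


if __name__ == "__main__":
    for n in (1, 2, 4, 8):
        print(n,
              strong_scaling_speedup(36, n, 9),
              weak_scaling_speedup(9, n, 9))
```

in this model the strong-scaling speedup exceeds ncpus once the per-cpu share drops toward the cache size, while the weak-scaling figure can never exceed ncpus, because the per-cpu working set does not shrink.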
> This cache effect is quite profound on Altix since some of these have
> something like 9 MB cache per processor. You can see this result on
that's the irony: the it2 really works well when data is all in-cache,
or can somehow be prefetch+streamed so that cache misses don't happen.
once you start missing, performance becomes unexceptional - you can
easily see this by looking at SpecFP results. there, the it2's excellent
scores are mainly due to extremely high results in the 2-3 very smallest
around here, it's mainly serial monte-carlo jobs that are so small that
they're always in-cache. so the "high-end" (and expensive) it2 is best
suited for the lowest-end jobs...