[Beowulf] SPEC CPU 2006 released

Robert G. Brown rgb at phy.duke.edu
Mon Sep 4 13:18:07 PDT 2006

On Mon, 4 Sep 2006, Joe Landman wrote:

> The most important benchmark is the one that uses the same code you use in 
> the way you are going to use it.

True (and I generally agree with everything else Joe said as well,
except that I think he meant a one-dimensional projection, or even
two, of a multidimensional function, not really a zero-dimensional
one).

> Anything else is an entropy generator.

Well, no, not really.  They can reduce the entropy of the decision
space; they just don't live in the exact same projectional subspace as
the problem itself, so at most they yield (hopefully important)
components of the expected performance.  How important those components
are depends in large part on how close the projections are.  A Monte
Carlo benchmark CAN be a good predictor for other similar Monte Carlo
programs, not just the one that generated the benchmark.  Code that is
bound by a given STREAM-like linear operation CAN be well predicted by
STREAM.  The more mixed the code, and the less like any of the
component benchmarks, the less useful they are.
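
To make the projection idea concrete, here is a minimal sketch in
Python (all numbers are hypothetical, not from any measured system) of
the prediction a single component benchmark buys you when the
projection overlaps well: a genuinely bandwidth-bound kernel's runtime
is roughly its memory traffic divided by the measured STREAM triad
bandwidth.

# Minimal sketch: predicting a bandwidth-bound kernel's runtime from a
# measured STREAM triad number.  All figures below are hypothetical.

def predicted_runtime(bytes_moved, triad_bw):
    """If the code is truly stream-bound, runtime ~ traffic / bandwidth."""
    return bytes_moved / triad_bw

# A triad-like sweep a[i] = b[i] + s*c[i] over 10^8 doubles does two
# reads and one write of 8 bytes each, i.e. ~24 bytes per element.
n = 10**8
traffic = 24.0 * n      # bytes moved per sweep
triad_bw = 5.0e9        # hypothetical measured triad bandwidth, bytes/s
print("predicted sweep time: %.3f s" % predicted_runtime(traffic, triad_bw))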

And to be fair, although people tend to quote e.g. "SPECint" or
"SPECfp" performance (which are cumulative/projective), one can go in
and look at performance on the component benchmarks, which is perhaps
more useful if one of them has a good projective overlap with your
task.

None of which changes my agreement with you regarding open source
solutions and performing a fairly systematic search for a good
constellation of macro benchmarks inside a harness that can test YOUR
application in CONCERT with the constellation, perhaps building a
functional performance relationship out of the resulting performance
spectrum.  That is, it is quite possible to build a "benchmarking
tool" that you can run on your own application, suitably encapsulated,
alongside a set of micro and macro benchmarks.  Over time, and with
the accumulation of some "experience", the results can be transformed
into a multivariate map of the covariance of the distinct benchmark
results with your application's performance, i.e. a predictive map.
This map can then be used to extrapolate performance even on hardware
where the benchmark result vector is known but your application's
performance cannot, for some reason, be directly measured.  It would
probably be quite accurate with as few as 2-3 distinct hardware
platforms as model input.

(Just a little hint on a way to proceed to generate a genuinely NEW kind
of "benchmarking tool" -- one that is good for something!)


> [/soap box]

Robert G. Brown	                       http://www.phy.duke.edu/~rgb/
Duke University Dept. of Physics, Box 90305
Durham, N.C. 27708-0305
Phone: 1-919-660-2567  Fax: 919-660-2525     email: rgb at phy.duke.edu
