energy costs
Richard Walsh
rbw at ahpcrc.org
Wed Mar 12 06:42:50 PST 2003
Mark Hahn wrote:
>> PS Pentium 4 sustained performance from memory is about
>> 5% of peak (stream triad).
>
>that should be 50%, I think.
>
Nope ... not "from memory".
A 2.8 GHz P4 using SSE2 instructions can deliver two
64-bit floating point results per clock or 5.6 Gflops
peak performance at this clock. The stream triad (a
from-memory, multiply-add operation) for a 2.8 GHz
P4 produces only 200 Mflops (see stream website). The
arithmetic is then:
200/5600 = .0357 or 3.57% (so 5% is a gift)
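The arithmetic above can be sketched in a few lines of Python. The function names are purely illustrative, and the inputs are just the figures quoted in this message (2.8 GHz, two results per clock, 200 Mflops on triad), not new measurements.

```python
def peak_mflops(clock_ghz, flops_per_clock):
    """Theoretical peak in Mflops: clock rate times results per clock."""
    return clock_ghz * 1000.0 * flops_per_clock


def fraction_of_peak(sustained_mflops, peak):
    """Sustained rate expressed as a fraction of theoretical peak."""
    return sustained_mflops / peak


# Figures quoted in this message, not measured here.
p4_peak = peak_mflops(2.8, 2)
p4_fraction = fraction_of_peak(200.0, p4_peak)
print(f"peak = {p4_peak:.0f} Mflops, sustained = {p4_fraction:.2%} of peak")
```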
This is a worst-case, from-memory scenario which
allows for little or no cache re-use. It asks the
question, "what part of peak can my memory sub-system
sustain?" On the Cray X1 (and most vector machines),
the same worst-case, from-memory, scenario yields 25%
of peak. This is why Cray is still making custom vector
computers.
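For anyone unfamiliar with the benchmark, the triad kernel is a[i] = b[i] + q*c[i]. Below is a minimal pure-Python sketch of it. It is far too slow to measure anything and is only meant to show the operation and its flop-to-byte ratio. The helper names are hypothetical, and the 8-byte word size assumes 64-bit operands; the real STREAM benchmark is distributed as C and Fortran source.

```python
def triad(b, c, q):
    """Stream-style triad: a[i] = b[i] + q * c[i] (one multiply, one add)."""
    return [bi + q * ci for bi, ci in zip(b, c)]


FLOPS_PER_ELEMENT = 2        # one multiply plus one add
BYTES_PER_ELEMENT = 3 * 8    # load b[i], load c[i], store a[i]; 64-bit words assumed


def mflops_from_bandwidth(mbytes_per_s):
    """Flop rate the triad can sustain at a given memory bandwidth (MB/s)."""
    return mbytes_per_s * FLOPS_PER_ELEMENT / BYTES_PER_ELEMENT


a = triad([1.0, 2.0, 3.0], [4.0, 5.0, 6.0], 0.5)
# a == [3.0, 4.5, 6.0]
```

With 2 flops for every 24 bytes moved, the 200 Mflops quoted above would correspond to roughly 2400 MB/s of triad bandwidth under this counting, which is why the result is set by the memory sub-system and not by the floating point units.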
As you suggest, the P4 will (as does the Cray X1) do
much better when cache re-use is a significant
factor.
Regards,
rbw
#---------------------------------------------------
# Richard Walsh
# Project Manager, Cluster Computing, Computational
# Chemistry and Finance
# netASPx, Inc.
# 1200 Washington Ave. So.
# Minneapolis, MN 55415
# VOX: 612-337-3467
# FAX: 612-337-3400
# EMAIL: rbw at networkcs.com, richard.walsh at netaspx.com
# rbw at ahpcrc.org
#
#---------------------------------------------------
# "Without mystery, there can be no authority."
# -Charles DeGaulle
#---------------------------------------------------