[Beowulf] AMD64 results...
hahn at physics.mcmaster.ca
Wed Dec 15 20:35:32 PST 2004
> > They are all below. Executive summary is that the AMD barely beats
> > (real) clock speed scaling compared to the P2 for stream. I suspect
sure - stream is normally a DRAM test, not a CPU test.
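for reference, a rough sketch of a triad-style kernel in the spirit of stream (this is not the benchmark's own source; the function names and the byte-counting helper are made up for illustration). each iteration does two loads and one store for a single multiply and add, so once the arrays are much larger than cache the loop mostly waits on memory:

```c
#include <assert.h>
#include <stddef.h>

/* triad-style kernel: a[i] = b[i] + scalar * c[i].
   per element: 2 loads, 1 store, 1 multiply, 1 add. */
void triad(double *a, const double *b, const double *c,
           double scalar, size_t n)
{
    assert(a != NULL && b != NULL && c != NULL);
    for (size_t i = 0; i < n; i++)
        a[i] = b[i] + scalar * c[i];
}

/* bytes moved by one call, counting 2 reads + 1 write of a double
   per element (write-allocate traffic is ignored here).
   divide by elapsed seconds to get a bandwidth figure. */
double triad_bytes(size_t n)
{
    return 3.0 * sizeof(double) * (double)n;
}
```

timing triad() over arrays well beyond the cache size and dividing triad_bytes(n) by the elapsed time gives a number that should track the memory system far more than the core clock.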
> Double registers only help if you need them. Most codes won't
> automatically utilize native 64 bit ints or pointers to any
> significant advantage.
indeed, going 64b often costs a noticeable overhead in code size
expansion and inflation of space to store pointers.
the real appeal of x86-64 is that you get twice as many registers.
yes, being able to actually use more than about 2.5 GB is nice,
and important to some people. but almost any real code will take
advantage of having twice as many registers (integer and SIMD).
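to put a number on the pointer inflation, here is a small sketch using a hypothetical doubly-linked list node (the struct and the node_bytes helper are illustrative assumptions, and it assumes a 4-byte int with the struct padded up to pointer alignment):

```c
#include <assert.h>
#include <stddef.h>

/* hypothetical list node: two pointers plus a 32-bit payload */
struct node {
    struct node *next;
    struct node *prev;
    int value;
};

/* size such a node would have for a given pointer width in bytes,
   assuming a 4-byte int and padding up to pointer alignment */
size_t node_bytes(size_t ptr_bytes)
{
    assert(ptr_bytes > 0);
    size_t raw = 2 * ptr_bytes + 4;
    /* round up to the next multiple of the pointer size */
    return (raw + ptr_bytes - 1) / ptr_bytes * ptr_bytes;
}
```

under those assumptions node_bytes(4) is 12 and node_bytes(8) is 24, so a pointer-heavy structure like this doubles in size when built as 64b, and the same cache holds half as many nodes.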
> or with a 2.6 Kernel (which is better about ensuring that pages and the
> process acting on the page is on the same cpu).
don't forget to turn on node interleave in the bios, too.
> Kudos for the pathscale-1.4 compiler with -O3.
ironically, icc -xW generates pretty good-for-opteron code,
though of course, it's 32b. I haven't tried using icc to
generate em64t/amd64 code.
regards, mark hahn.