[Beowulf] AMD64 results...
Robert G. Brown rgb at phy.duke.edu
Thu Dec 16 05:08:21 PST 2004

On Wed, 15 Dec 2004, Bill Broadley wrote:

> Group reply:
>
> On Wed, Dec 15, 2004 at 05:49:09PM -0500, Robert G. Brown wrote:
> > Just for those of you who were asking after AMD64's as viable compute
> > platforms, I just ran stream and the bogomflops benchmark in my renamed
> > "benchmaster" (was cpu_rate) shell on both a 2.4 GHz AMD64 3400+
>
> That is a s754 amd64?

Yes (as per earlier discussion, an Asus K8NE, but I should have restated
it -- the P2 is an MSI mobo but I'm downstairs and don't remember which
one).

> > They are all below.  Executive summary is that the AMD barely beats
> > (real) clock speed scaling compared to the P2 for stream.  I suspect
> > that this is not yet the end of the story, though, as I see little
> > difference between the i386 benchmark results and the x86_64 results
> > when running the program compiled both ways on metatron.
>
> Double registers only help if you need them.  Most codes won't
> automatically utilize native 64 bit ints or pointers to any
> significant advantage.

Well, stream is as much a memory bandwidth test as it is a floating
point test per se anyway.  I always hope for something dramatic when I
use faster/wider memory, but usually reality is fairly sedate.

> > The INTERESTING story is in bogomflops, which includes division.  There
> > metatron was a whopping 2.8x faster than lucifer, while its clock is
> > only 1.33x faster.  It more than doubled its relative clockspeed
> > advantage, so to speak.  One can see how having 64 bits would really
> > speed up 64 bit division compared to doing it in software across
> > multiple 32 bit registers...
>
> Interesting data point.
>
> > Hope this is interesting/useful to somebody.  I put "real stream" at the
> > very end.  "real stream" uses the best time where benchmaster uses the
> > average time so benchmaster results are typically a few percent lower
> > (and likely just that much more realistic as well).
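
The "more than doubled" remark quoted above is plain ratio arithmetic. A minimal sketch, using only the two figures given in the message (2.8x on bogomflops, 1.33x on clock); the variable names are mine:

```python
# Figures quoted in the message; nothing here is newly measured.
measured_speedup = 2.8   # metatron vs. lucifer on bogomflops
clock_ratio = 1.33       # metatron clock / lucifer clock

# Speedup that remains once pure clock scaling is divided out.
per_clock_gain = measured_speedup / clock_ratio
print(f"per-clock gain: {per_clock_gain:.2f}x")
```

Anything above 1.0x here is improvement that clock speed alone does not explain.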
>
> Similar data points for an opteron, dual (stream using 1 cpu) 2.2 GHz,
> with PC3200 memory (915.5MB array).  Not sure why the timer is so lousy,
> I had to make the array large to get a reasonably accurate time:

You should try stream inside my leedle harness that uses the CPU cycle
counter clock.  It autotunes iterations and so forth and generates an SD
as well as mean.  That's the "clock granularity" thing in my test
results.  Note that it is 3 nsec on the AMD 64 and almost 100 nsec on
the P2.  This is also an interesting data point -- it suggests that
integer instructions may be considerably faster on the AMD64.  I'll have
to run a mixed code program to find out, though.

> I suspect the below numbers would be higher if I had a uniprocessor system
> (never have a remote memory access or wait for the memory coherency)
> or with a 2.6 Kernel (which is better about insuring that pages and the
> process acting on the page is on the same cpu).
>
> Kudos for the pathscale-1.4 compiler with -O3.

Now that's something to try.  I still haven't started my thirty day free
trial that I signed up for two months ago (I should know better than to
do that right before the end of classes).  I've got almost a good month
of reduced teaching ahead -- maybe I'll start it now.  From everything
I've heard and seen, I'm going to end up buying a license or two anyway
-- it seems like it is just a really, really good product being
maintained by some very serious people.

Of course, a factor of two in speed (for certain code) for the cost of a
software license is a hell of a lot cheaper than buying a cluster twice
as large.  That helps.
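
To make the harness idea above concrete, here is a rough sketch of the general technique (measure the clock granularity, autotune the iteration count, report mean and SD). It is not the benchmaster code: it assumes Python's `time.perf_counter_ns` in place of the CPU cycle counter, and the function names and the 1000x-granularity target are hypothetical choices.

```python
import statistics
import time


def clock_granularity_ns(probes=100):
    """Smallest nonzero step seen between consecutive clock reads."""
    smallest = None
    for _ in range(probes):
        t0 = time.perf_counter_ns()
        t1 = time.perf_counter_ns()
        while t1 == t0:          # spin until the clock visibly ticks
            t1 = time.perf_counter_ns()
        step = t1 - t0
        if smallest is None or step < smallest:
            smallest = step
    return smallest


def time_batch_ns(func, iters):
    """Elapsed nanoseconds for `iters` back-to-back calls of func."""
    start = time.perf_counter_ns()
    for _ in range(iters):
        func()
    return time.perf_counter_ns() - start


def autotune_iterations(func, min_ns):
    """Double the loop count until one batch lasts at least min_ns."""
    iters = 1
    while time_batch_ns(func, iters) < min_ns:
        iters *= 2
    return iters


def benchmark(func, samples=10, min_ns=None):
    """Return (mean, sd) of the per-call time in nanoseconds."""
    if min_ns is None:
        # Assumed target: batches about 1000x the clock granularity.
        min_ns = 1000 * clock_granularity_ns()
    iters = autotune_iterations(func, min_ns)
    per_call = [time_batch_ns(func, iters) / iters for _ in range(samples)]
    return statistics.mean(per_call), statistics.stdev(per_call)


mean_ns, sd_ns = benchmark(lambda: sum(range(1000)))
print(f"{mean_ns:.1f} +/- {sd_ns:.1f} ns per call")
```

The point of the autotuning step is that a coarse timer stops mattering once each timed batch is much longer than one tick, which is the same reason a larger array helps with a 10 msec timer.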

   rgb

>
> gcc-3.2.3 -O1:
> Function      Rate (MB/s)   Avg time     Min time     Max time
> Copy:          2206.8823      0.3010       0.2900       0.3800
> Scale:         2285.7067      0.2880       0.2800       0.3700
> Add:           2285.7087      0.4140       0.4200       0.5300
> Triad:         2285.7152      0.3910       0.4200       0.4700
>
> -O2
> Function      Rate (MB/s)   Avg time     Min time     Max time
> Copy:          1777.7736      0.3240       0.3600       0.3600
> Scale:         1777.7783      0.3240       0.3600       0.3600
> Add:           1882.3495      0.4590       0.5100       0.5100
> Triad:         1882.3530      0.4590       0.5100       0.5100
>
> -O3
> Function      Rate (MB/s)   Avg time     Min time     Max time
> Copy:          1777.7924      0.3260       0.3600       0.3700
> Scale:         1828.4723      0.3230       0.3500       0.3600
> Add:           1882.3679      0.4640       0.5100       0.5200
> Triad:         1846.1717      0.4720       0.5200       0.5300
>
> gcc-3.4.3 -O1:
> Function      Rate (MB/s)   Avg time     Min time     Max time
> Copy:          1729.6823      0.3330       0.3700       0.3700
> Scale:         1828.5184      0.3230       0.3500       0.3600
> Add:           1846.1048      0.4680       0.5200       0.5200
> Triad:         1846.1040      0.4680       0.5200       0.5200
>
> -O2:
> Function      Rate (MB/s)   Avg time     Min time     Max time
> Copy:          2133.3337      0.2960       0.3000       0.3500
> Scale:         2133.3337      0.2980       0.3000       0.3500
> Add:           2232.5578      0.4270       0.4300       0.5100
> Triad:         2181.8132      0.4310       0.4400       0.5100
>
> -O3:
> Function      Rate (MB/s)   Avg time     Min time     Max time
> Copy:          2285.6561      0.2630       0.2800       0.3600
> Scale:         2285.6581      0.2580       0.2800       0.3100
> Add:           2341.4071      0.3800       0.4100       0.4700
> Triad:         2285.6555      0.3880       0.4200       0.5200
>
> Pathscale-1.4 -O1:
> Function      Rate (MB/s)   Avg time     Min time     Max time
> Copy:          1999.9498      0.2880       0.3200       0.3200
> Scale:         2064.4625      0.2840       0.3100       0.3200
> Add:           2232.5009      0.3950       0.4300       0.4400
> Triad:         2232.4910      0.3930       0.4300       0.4400
>
> -O2
> Function      Rate (MB/s)   Avg time     Min time     Max time
> Copy:          2461.5205      0.2410       0.2600       0.2700
> Scale:         2285.6970      0.2530       0.2800       0.2900
> Add:           2341.4466      0.3730       0.4100       0.4200
> Triad:         2399.9765      0.3670       0.4000       0.4100
>
> -O3
> Function      Rate (MB/s)   Avg time     Min time     Max time
> Copy:          3764.6831      0.1540       0.1700       0.1800
> Scale:         3764.6831      0.1530       0.1700       0.1700
> Add:           4173.8781      0.2080       0.2300       0.2400
> Triad:         4173.8781      0.2110       0.2300       0.2400
>

--
Robert G. Brown                        http://www.phy.duke.edu/~rgb/
Duke University Dept. of Physics, Box 90305
Durham, N.C. 27708-0305
Phone: 1-919-660-2567  Fax: 919-660-2525     email:rgb at phy.duke.edu
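
As a sanity check on the quoted tables, the Rate column can be rebuilt from the Min time column, since stream divides the bytes moved by the best time. This is a sketch under stated assumptions: the element count N = 40,000,000 is not given anywhere in the thread and is inferred from the 915.5MB figure (three arrays of 8-byte doubles), and the per-kernel array counts follow stream's usual convention of 2 for Copy and Scale, 3 for Add and Triad.

```python
N = 40_000_000          # assumed element count, inferred from the 915.5MB figure
BYTES_PER_DOUBLE = 8

# Arrays touched per element by each kernel (reads plus writes).
ARRAYS_TOUCHED = {"Copy": 2, "Scale": 2, "Add": 3, "Triad": 3}


def stream_rate_mb_s(kernel, min_time_s, n=N):
    """Bandwidth in MB/s (1 MB = 1e6 bytes) from the best observed time."""
    bytes_moved = ARRAYS_TOUCHED[kernel] * BYTES_PER_DOUBLE * n
    return 1.0e-6 * bytes_moved / min_time_s


# Total footprint of the three arrays, in units of 2**20 bytes.
footprint_mb = 3 * BYTES_PER_DOUBLE * N / 2**20
print(f"footprint: {footprint_mb:.1f} MB")

# Min times copied from the gcc-3.2.3 -O1 and Pathscale-1.4 -O3 tables.
rows = [
    ("gcc-3.2.3 -O1", "Copy", 0.29),
    ("gcc-3.2.3 -O1", "Triad", 0.42),
    ("Pathscale-1.4 -O3", "Copy", 0.17),
    ("Pathscale-1.4 -O3", "Triad", 0.23),
]
for label, kernel, min_time in rows:
    rate = stream_rate_mb_s(kernel, min_time)
    print(f"{label:18s} {kernel:5s} {rate:7.1f} MB/s")
```

With that N, the recomputed rates agree with the reported ones to within about 0.1 MB/s. The Min times are all multiples of 0.01 s, which is consistent with the lousy timer complaint and shows why each rate is only good to a few percent.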
