[Beowulf] [gorelsky@stanford.edu: CCL:dual-core Opteron 275 performance]
Mikhail Kuzminsky kus at free.net
Mon Jul 4 10:48:28 PDT 2005
In message from Vincent Diepeveen <diep at xs4all.nl> (Mon, 04 Jul 2005 17:59:40 +0200):
> ...
> ...
>Of course we take a large buffer. Around 400MB is the working set size for
>the hashtable which i use for my chess software (which is reading randomly
>a 8-64 bytes from the cache).
>
>Results:
>  single cpu A64 :  91 ns (cl2 memory)
>  single cpu P4  : 220 ns (cl2 memory, bus overclocked)
>  dual opteron   : 120 ns
>  quad opteron   : 133 ns
>  dual xeon      : 280 ns (800Mhz bus)
>  dual xeon      : 400 ns (533Mhz bus)

The latencies should depend on the processor frequencies (although the RAM
part is much larger), so what were the frequencies of the A64/P4/Opteron/Xeon?
And do I understand you correctly that you have 1/2/4 threads which perform
"random" reads of some bytes from main memory?

>
>So obviously things that do not fit in L2 cache, the opteron runs away with
>it. Only if the executable is optimized in question by the intel c++
>compiler it will have done stuff to run it faster at intel processors than
>at opteron,
>then results do not look too bad for P4.

If the results above are for a "bad" (badly optimizing) compiler, then in
some sense it's the compiler's problem :-) Yes, old binary software will run
slowly. But many, many HPC applications can be compiled from source.
BTW, the better results are for icc++ only; do you know anything about the
PGI and PathScale compilers?

> Yet that's a matter of optimizing
>it for opteron better, which most software dudes do NOT do, as intel
>delivers good support and AMD historically didn't deliver *any* kind of
>support (they are improving now, but even then their math libraries are so
>pathetic compared to the ease of the intel libraries that i can imagine at
>least *that* part of
>the problems).

ACML 2.1 gives me a set of good results for Opteron in comparison w/MKL.

Yours
Mikhail
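
For readers following the thread, the kind of "random read from a large buffer" measurement being discussed could be sketched roughly as below. This is not Vincent's actual test, whose source is not shown here; it is a minimal single-threaded pointer-chasing sketch under stated assumptions. The names build_random_cycle, chase_ns_per_load and next_random are hypothetical, the xorshift generator is just a convenient stand-in, and each slot is one size_t (8 bytes on typical 64-bit systems) rather than the 8-64 byte hashtable entries mentioned above. Each load depends on the previous one, so the average time per step approximates the latency of a single random access once the buffer is far larger than the caches.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L   /* for clock_gettime on POSIX systems */
#endif
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

/* Small xorshift64 generator; a hypothetical helper, not from the thread. */
static uint64_t rng_state = 88172645463325252ULL;

static uint64_t next_random(void)
{
    rng_state ^= rng_state << 13;
    rng_state ^= rng_state >> 7;
    rng_state ^= rng_state << 17;
    return rng_state;
}

/* Fill chain[] with a random permutation that forms one single cycle
 * (Sattolo's algorithm), so that following chain[pos] from any slot
 * visits every slot exactly once before returning to the start. */
void build_random_cycle(size_t *chain, size_t n)
{
    assert(n >= 2);
    for (size_t i = 0; i < n; i++)
        chain[i] = i;
    for (size_t i = n - 1; i > 0; i--) {
        size_t j = (size_t)(next_random() % i);   /* 0 <= j < i */
        size_t tmp = chain[i];
        chain[i] = chain[j];
        chain[j] = tmp;
    }
}

/* Follow the chain for `steps` dependent loads starting at slot 0.
 * Returns the average wall-clock time per load in nanoseconds and
 * stores the final position in *end_out so the loop cannot be
 * optimized away. */
double chase_ns_per_load(const size_t *chain, size_t steps, size_t *end_out)
{
    struct timespec t0, t1;
    size_t pos = 0;

    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (size_t s = 0; s < steps; s++)
        pos = chain[pos];
    clock_gettime(CLOCK_MONOTONIC, &t1);

    *end_out = pos;
    double ns = (double)(t1.tv_sec - t0.tv_sec) * 1e9
              + (double)(t1.tv_nsec - t0.tv_nsec);
    return ns / (double)steps;
}
```

To approximate the working set quoted above, a driver would allocate about 400MB worth of slots (400 * 1024 * 1024 / sizeof(size_t)), call build_random_cycle once, and then time a few million steps with chase_ns_per_load. A multi-threaded variant (the 1/2/4 threads asked about) would run one such chase per thread; that part is left out of this sketch.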
