[Beowulf] CCL:Question regarding Mac G5 performance (fwd from mmccallum@pacific.edu)
Joe Landman landman at scalableinformatics.com
Thu May 20 18:01:06 PDT 2004
On Thu, 2004-05-20 at 20:26, Michael Huntingdon wrote:
> At 01:23 PM 5/20/2004, Joe Landman wrote:
> >On Thu, 2004-05-20 at 02:05, Michael Huntingdon wrote:
> > > I've spent some time sifting through the attached numbers. Though not each
> >
> >Any paper that starts out praising spec as a "good" predictive benchmark
> >is suspect. Benchmarking is difficult to do right, in large part
> >because of deceptively simple scoring functions (time, frequency (not of
> >the CPU, but number of iterations per unit time), ...).
> Although SPEC was included among others, I didn't see where Martyn Guest
> "praised" SPEC.

Page 5: "One of the most useful indicators of CPU performance is provided
by the SPEC ... benchmarks"

Many folks would take issue with the exact utility of these benchmarks. At least he points out later in the document (page 6) some of the serious flaws in the benchmark. The problem is that he continues to use it as a valid scoring function. We can argue and debate over this, but the numbers are of highly dubious value at best.

>
> >Further, looking these over, I did not see much of a discussion (though
> >it is implied by the use of certain compilers) of the effects of things
> >like SSE2 in the P4, memory alignment, 32/64 bit
> >compilation/optimization, use of tuned libraries where available...
> >Given the sheer number of machines tested, it is unlikely that they used
> >up to date compilers (latest gcc's are better than earlier gcc's for
> >performance), or recompiled the binary for all the different platforms
> >to run native. The ifc results seem to indicate that they used SSE2 on
> >P4's but probably used plain old 32 bit code on Opterons.
>
> I wouldn't begin to speculate; however, would hope Daresbury Laboratory and
> Martyn Guest were working to advance research, using the best technology
> available for each platform. I didn't see anything in their mission
> statement which leads me to think otherwise.

No one is implying that they would do anything less than advance the state of knowledge. It is important to note that little information (I may have missed it, so please do point it out if you find it) exists on the use of the -m64 gcc compilation for Opterons (gets you a nice performance boost in many cases, and in a number of chemistry applications I have worked with), or the ACML libraries for high performance *GEMM operations on AMD, or the Altivec compilation/math libraries, or the SGI performance libraries, ... etc.

That is, as I implied, it would be quite difficult for the lab to a) test all the machines, b) test all the machines optimally. In fact, they specifically indicate that they could not do so (see page 4) due to time constraints.

While the information in here does appear to be useful (and I did not state otherwise), it does not constitute an exhaustive investigation of machine performance characteristics. It does appear to compare how well some programs ran on limited time loaner machines, donated hardware, etc. Which means the operative issue is to get results quickly and hope you can do some fast optimization. It would be dangerous to draw conclusions beyond the text which the authors specifically caution against.

> > > lends itself to hp Itanium 2, there appears to be a very balanced trend.
> >
> >... in a specific set of operations relevant for specific classes of
> >calculation.
>
> The tables cover a wide range of benchmarks specific to the interests of
> those working in computational chemistry. With respect to this, the rx2600
> (Itanium 2 based) ranked among the top ten (with the exception of table 4
> where it was ranked #11). Averaged out, the tables reflect an overall
> rating of 5.86 among the 400 platforms tested. My initial conclusion may
> have been less than scientific, but I'll stay with it for now.

That's fine. You of course are entitled to your opinion.

You asked a simple question as to why there is not more discussion of this in these and other circles. Well, other people are entitled to their opinions, and it appears the market is indeed deciding between competitors.

Aside from this, "benchmarks" are problematic to do right, in a completely non-biased manner. These benchmarks are interesting, but there was not enough detail given of the systems for others to try to replicate the work. For example, which OS, specific compiler versions, patches were used? For the non-spec codes, which compilation options were used? For the chips with SIMD capability, was it used (P4, Opteron, G5)? How was memory laid out? Was any attention paid to processor affinity and related scheduling?

Remember that using the ifc/efc compilers with the Itanium chips as well as the Pentium chips gives you a significant leg up in performance as compared to using the gcc system on similar architectures. Moreover, there is a performance penalty to be paid for not picking the compiler options carefully under gcc or ifc.

> >Not everyone in HPC does matrix work, eigenvalue extraction, etc. Some
> >of us do things like string/db searching (informatics). There, the
> >numbers look quite different.
>
> My comments referenced numerically intensive research rather than I/O
> intensive environments. I'm surprised 8GB of memory is enough to sustain
> superior performance when searching very large data sets normally
> associated with bio-informatics.

8GB is enough for some, not enough for others. Some projects I have worked on (http://www.sgi.com/newsroom/press_releases/2001/january/msi.html) have used a few processors and a little memory.

Again, as indicated by many others (and myself), the only things that matter are your (the end user's) tests, with real data.
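
The replication questions raised above (which OS, which compiler version, which options) can be recorded next to each timing. The following is a minimal sketch, not anything the Daresbury report or this thread actually used: the function names and record fields are hypothetical, and the compiler name and flags are simply stored as the caller supplies them, not detected from the binary under test.

```python
import json
import platform
import shlex
import statistics
import subprocess
import sys
import time


def tool_version(cmd):
    """Return the first line printed by `cmd --version`, or None if cmd is missing."""
    try:
        out = subprocess.run([cmd, "--version"], capture_output=True,
                             text=True, check=False)
    except OSError:
        return None  # tool is not installed or not on PATH
    lines = (out.stdout or out.stderr).splitlines()
    return lines[0] if lines else None


def time_command(argv, repeats=3):
    """Run argv `repeats` times; return the wall-clock seconds of each run."""
    times = []
    for _ in range(repeats):
        start = time.perf_counter()
        subprocess.run(argv, check=True,
                       stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
        times.append(time.perf_counter() - start)
    return times


def benchmark_record(label, argv, compiler="gcc", flags="", repeats=3):
    """Time a command and bundle the result with the details needed to replicate it.

    `compiler` and `flags` are assumptions stated by the caller: they describe
    how the binary was built and are recorded, not verified.
    """
    times = time_command(argv, repeats)
    return {
        "label": label,
        "command": argv,
        "os": platform.platform(),
        "machine": platform.machine(),
        "compiler": compiler,
        "compiler_version": tool_version(compiler),
        "compiler_flags": shlex.split(flags),
        "times_s": times,
        "min_s": min(times),
        "median_s": statistics.median(times),
    }


if __name__ == "__main__":
    # Placeholder workload: substitute your own binary and real input data.
    workload = [sys.executable, "-c", "sum(range(10**6))"]
    record = benchmark_record("placeholder", workload, flags="-O2 -m64")
    print(json.dumps(record, indent=2))
```

Keeping every run's timings, and not only a single score, lets a reader see the run-to-run spread; the record says nothing about memory layout or processor affinity, which would have to be noted separately.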

I am intrigued by Martyn's chemistry tests, and when I get a free moment, I will send a note about possibly including a more protein oriented set of tests into BBS (http://bioinformatics.org/bbs) v2 due out soon (yeah I know I keep saying that, but it is soon...)

>
> Ciao~
> Michael
>
>
> >[... snip ...]
> >
> >--
> >Joseph Landman, Ph.D
> >Scalable Informatics LLC,
> >email: landman at scalableinformatics.com
> >web : http://scalableinformatics.com
> >phone: +1 734 612 4615

--
Joseph Landman, Ph.D
Scalable Informatics LLC,
email: landman at scalableinformatics.com
web : http://scalableinformatics.com
phone: +1 734 612 4615
