/. US DOE gets a $24.5 Million Linux Supercomputer

Richard Walsh rbw at ahpcrc.org
Thu Apr 18 07:37:07 PDT 2002

Greg Lindahl wrote:

>This bid is for an install in the future, and it involves a
>combination of McKinley and Madison parts. I don't believe that Intel
>has made Madison's specs available, nor has HP made the specs of the 
>chipset they'll be using available. 

 True, but the dual floating-point units in the core are not
 likely to be added to ... so it's a question of what the clock
 is going to be (is my estimate of 1.5 GHz unreasonable?), what
 the impact of the chipset/system bus on memory bandwidth is
 going to be, and what the cache sizes are.

>It's likely that they aren't quoting peak; PNL prefers figures like
>the actual speed of matrix-matrix multiply (DGEMM). Now the Itanium is
>reasonably good at delivering a nice % of peak for DGEMM, but it's not
>the same as peak. It's a lot more fair number to use than peak, and
>gives you a good idea of what the Top500 Linpack score will be.

 True, the source is not official, I guess, but when no qualifying
 information is given the numbers presented are usually peak. If
 they aren't, then my numbers would need to be reworked on a different
 envelope ;-) ...
 ... but there is another issue: if we assume that the 8.3 TFLOPS is DGEMM
 performance at, say, 50% of peak (doing this on a large matrix (G98)
 would require very good bandwidth to memory), then these 1400 processors
 must have a system peak of around 17 TFLOPS. What does this mean for
 the clock rate ... ??

 Spread over 1400 processors, that is about 12 GFLOPS peak each. Assuming
 the same 4 flops per clock on the Madison (two FMA units per core), the
 processors would have to be running at 3 GHz (when are they taking
 delivery?).  This seems a bit high to me, seeing as the Itanium is sitting
 at 800 MHz and does not have a 20-stage pipeline like the Pentium 4 ... but
 if the delivery is far enough into the future, who knows.  The idea that the
 Madison will have more floating-point units seems unlikely (how are you
 going to feed them without real vector memory loads?).
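
 As a back-of-the-envelope check, the chain of estimates above can be
 sketched in a few lines of Python. The 8.3 TFLOPS figure, the 1400
 processors, the 50% DGEMM efficiency, the 1.5 GHz guess and the 4 flops
 per clock are taken from the discussion; the function names are
 hypothetical, and none of the inputs are vendor specifications.

```python
# Rough sketch only: all inputs are the assumptions stated in the post.

def system_peak_tflops(sustained_tflops, fraction_of_peak):
    """Peak implied by a sustained figure quoted at some fraction of peak."""
    return sustained_tflops / fraction_of_peak

def required_clock_ghz(peak_tflops, processors, flops_per_clock):
    """Clock each processor needs to reach the given system peak."""
    per_processor_gflops = peak_tflops * 1000.0 / processors
    return per_processor_gflops / flops_per_clock

def peak_from_clock_tflops(processors, clock_ghz, flops_per_clock):
    """System peak for a given processor count and clock."""
    return processors * clock_ghz * flops_per_clock / 1000.0

# Case 1: 8.3 TFLOPS is DGEMM at 50% of peak (assumed efficiency).
peak = system_peak_tflops(8.3, 0.5)
clock = required_clock_ghz(peak, 1400, 4)
print(f"implied peak: {peak:.1f} TFLOPS, clock needed: {clock:.2f} GHz")

# Case 2: 8.3 TFLOPS is already peak; compare with a 1.5 GHz guess.
print(f"peak at 1.5 GHz: {peak_from_clock_tflops(1400, 1.5, 4):.1f} TFLOPS")
```

 Under these assumptions the first case needs a clock just under 3 GHz,
 while 1400 processors at 1.5 GHz give roughly 8.4 TFLOPS, close to the
 quoted 8.3 TFLOPS if that number is read as peak.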

 My $$ per MFLOPS estimates are ballpark numbers, but they did include
 the cost of interconnect (Myrinet or better), and a large chunk
 of disk and memory. But I won't claim they are perfectly
 apples-to-apples. They were based on estimated purchase price
 only ... they do not include total-cost-of-ownership effects or
 factor in expected utilization over the term of ownership (an
 important consideration).

 When I saw the posting, I was surprised how few IA-64 processors
 (even with the extras) were to be had for ~$25,000,000. The Pentium
 4 looks like a better deal at this level of analysis.
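
 For a rough sense of scale, the purchase-price arithmetic can be sketched
 as below. It uses the $24.5 million from the subject line and the 1400
 processors and 8.3 TFLOPS from the discussion; the function names are
 hypothetical. The price covers the whole system (interconnect, disk,
 memory), so the per-processor figure is system cost per processor, not a
 chip price, and total cost of ownership is ignored.

```python
# Rough sketch only: purchase price divided by processor count and by
# MFLOPS. Nothing here models utilization or cost of ownership.

def price_per_processor(total_dollars, processors):
    """Whole-system purchase price spread over the processor count."""
    return total_dollars / processors

def dollars_per_mflops(total_dollars, tflops):
    """Purchase price per MFLOPS (1 TFLOPS = 1e6 MFLOPS)."""
    return total_dollars / (tflops * 1.0e6)

total = 24.5e6  # dollars, from the subject line

print(f"per processor: ${price_per_processor(total, 1400):,.0f}")
# If 8.3 TFLOPS is peak:
print(f"per peak MFLOPS: ${dollars_per_mflops(total, 8.3):.2f}")
# If 8.3 TFLOPS is DGEMM at an assumed 50% of peak (16.6 TFLOPS peak):
print(f"per peak MFLOPS: ${dollars_per_mflops(total, 16.6):.2f}")
```

 That works out to about $17,500 of system per processor, and roughly $3
 or $1.50 per peak MFLOPS depending on how the 8.3 TFLOPS is read.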



