[Beowulf] DARPA issues 20 MUSD grant to nVidia to go from 1 GFLOPS/Watt to 75 GFLOPS/Watt
Vincent Diepeveen
diep at xs4all.nl
Mon Dec 17 17:30:03 PST 2012
On Dec 17, 2012, at 11:23 PM, Mark Hahn wrote:
>> "todays 1 gflop/watt" ?
>
> press releases always put the new shiny thing in the best light.
> they're probably thinking of a conventional compute node,
> (say, 32 cores, 2.3 GHz, 4 flops/cycle, or 16c and 8 f/c -
> either way totalling 294 Gflops for 300W or less.)
For a fair comparison you have to add the motherboard power losses, as
of course those are all included in the figures for the GPU cards.
As for the Gflops it delivers, let's do a more realistic calculation.
AVX can do a multiply together with an add,
yet I doubt you can issue another multiply-add pair every clock, on
average and in a sustained manner, on Sandy Bridge,
if we compare it with Nehalem.
Note that CPUs tend to have just 1 execution unit that can issue
multiplications, and historically they have always had big problems
issuing another one every clock. That is another reason why the
manycores hammer the CPUs so badly: in the end
it doesn't matter whether you do matrix multiplications or run FFTs
for prime numbers. It's the multiplication speed
the chip can deliver that determines how fast your code
can run on that chip.
http://ark.intel.com/products/64596/Intel-Xeon-Processor-E5-2690-20M-Cache-2_90-GHz-8_00-GTs-Intel-QPI
That's the fastest I could find. It's a 2.9 GHz CPU.
So in terms of Gflops the CPU delivers:
2.9 GHz * 1 multiplication a clock * 4 doubles a vector * 8 cores =
92.8 Gflops
This is for a $2057 tray price at introduction.
http://ark.intel.com/products/64596/Intel-Xeon-Processor-E5-2690-20M-Cache-2_90-GHz-8_00-GTs-Intel-QPI
So I wonder where you got that 294 Gflops from.
Now in terms of Gflops/watt that's 92.8 / 135 watt TDP = 0.69 Gflops/watt
for the $2k Xeon.
One order of magnitude less than the K20.
That's why intel created the Xeon Phi of course.
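
A rough sketch of the two peak estimates in this thread, in Python. The
helper name peak_gflops is made up for illustration; the clock, core and
watt figures are the ones quoted above. It suggests the 294 figure comes
from counting 8 flops per clock (a multiply plus an add on 4 doubles)
across 16 cores, where the 92.8 figure counts only the multiply on the
8 cores of a single CPU.

```python
def peak_gflops(ghz, flops_per_cycle, cores):
    """Theoretical peak in Gflops: clock (GHz) * flops per cycle per core * cores."""
    return ghz * flops_per_cycle * cores

# Quoted compute node: 16 cores at 2.3 GHz, 8 flops per cycle, 300 W.
node = peak_gflops(2.3, 8, 16)
node_per_watt = node / 300.0

# E5-2690 counted as above: 1 multiplication per clock on a 4-double vector,
# 8 cores at 2.9 GHz, 135 W TDP.
xeon = peak_gflops(2.9, 1 * 4, 8)
xeon_per_watt = xeon / 135.0

print(f"node:    {node:.1f} Gflops, {node_per_watt:.2f} Gflops/watt")
print(f"E5-2690: {xeon:.1f} Gflops, {xeon_per_watt:.2f} Gflops/watt")
```

Both are paper peaks; neither says anything about what is sustained on
real code.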
>
>> The K20X delivers 1.4 Tflop nearly.
>> If i google it's 235 watt TDP.
>>
>> 1.4 Tflop / 235 = 6 gflops/watt
>
> debatable whether we can honestly claim that's shipping.
> K10 is .78 Gflops DP/W or 17.2 SP. I wonder if the 75 goal
> is merely a 4.4x improvement....
"The PERFECT program will leverage anticipated industry fabrication
geometry advances to 7 nm."
In theory, 7 nm gives a factor 16 boost over 28 nm, since (28/7)^2 = 16.
So what I derive from the article is that the 75 Gflops/watt goal
refers to double precision.
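
To make that reasoning concrete, here is a small Python sketch. It
assumes ideal scaling, i.e. that the whole (28/7)^2 density gain turns
into Gflops/watt, which is an upper bound rather than a prediction. The
function name density_gain is made up for illustration; the baseline
numbers are the ones quoted in this thread.

```python
def density_gain(old_nm, new_nm):
    """Ideal area scaling: the gain goes with the square of the linear shrink."""
    return (old_nm / new_nm) ** 2

gain = density_gain(28, 7)

# Gflops/watt figures quoted in this thread.
baseline = {
    "K20X DP": 1400.0 / 235.0,  # 1.4 Tflop at 235 watt TDP
    "K10 DP": 0.78,
    "K10 SP": 17.2,
}

# Assumption: the full density gain shows up as Gflops/watt.
projected = {name: value * gain for name, value in baseline.items()}

for name, value in projected.items():
    print(f"{name}: {baseline[name]:.2f} -> {value:.1f} Gflops/watt")
```

Under this assumption the K20X double precision figure lands somewhat
above 75, while the K10 single precision figure lands far above it, so
a 75 Gflops/watt target fits double precision better.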