[Beowulf] cpu's versus gpu's - was Intel buys QLogic InfiniBand business

Wed Jan 25 16:46:57 PST 2012

The supercomputing codes I have seen running on processors were, to
put it politely, losing efficiency everywhere.

NASA too, when porting from the Origin3800 to the 1.5 GHz Itanium2,
publicly reported a speedup of a factor of 2 in the forums.

However, my own chess program, not exactly optimized for the Itanium2,
got a boost of a factor of 4 moving from the 500 MHz R14000 (Origin3800)
to the 1.3 GHz Itanium2. That was just a single compile, and it's an
integer program, whereas the Itanium2 is a floating point processor.

The 1.5 GHz Itanium2 has 6 Gflops on paper, whereas the 500 MHz R14k
has 1 Gflop on paper.
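
To put those paper numbers next to the reported speedups, here is a
minimal Python sketch. The name relative_efficiency is hypothetical, and
the 5.2 Gflops peak for the 1.3 GHz part is an assumption obtained by
scaling the 6 Gflops figure linearly with clock speed. The chess program
is integer code, so comparing it against a floating point peak is only a
rough indication.

```python
def relative_efficiency(observed_speedup, old_peak_gflops, new_peak_gflops):
    """Return the observed speedup as a fraction of the paper peak ratio."""
    peak_ratio = new_peak_gflops / old_peak_gflops
    return observed_speedup / peak_ratio


# Paper peaks quoted in this post.
R14K_PEAK = 1.0           # 500 MHz R14000
ITANIUM2_1500_PEAK = 6.0  # 1.5 GHz Itanium2

# Assumption: peak scales linearly with clock, so 6 * 1.3 / 1.5 = 5.2 Gflops.
ITANIUM2_1300_PEAK = ITANIUM2_1500_PEAK * 1.3 / 1.5

nasa = relative_efficiency(2.0, R14K_PEAK, ITANIUM2_1500_PEAK)
chess = relative_efficiency(4.0, R14K_PEAK, ITANIUM2_1300_PEAK)

print(f"NASA port:     {nasa:.0%} of the paper peak ratio")
print(f"chess program: {chess:.0%} of the paper peak ratio")
```

Under these assumptions the NASA port realizes about a third of the
paper improvement, and the single-compile chess program about three
quarters of it.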

Now a Chinese reporter already posted on THIS mailing list, the Beowulf
mailing list, for GPU hardware of some generations ago,
an IPC of 25% on Nvidia and 50% on AMD.

On the same GPUs back then, most student projects got around 25% on
Nvidia; Volkov then went ahead, understood GPUs better,
and scored 70% efficiency, again on very old GPUs. Since then they
have really improved.
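
For a sense of what those percentages mean in sustained throughput, a
small Python sketch follows. It assumes each percentage is a fraction of
theoretical peak. The 1000 Gflops peak is a hypothetical round number
borrowed from the 1 teraflop figure quoted further down in this thread,
not a measurement of the older GPUs discussed here, and sustained_gflops
is an illustrative helper.

```python
def sustained_gflops(peak_gflops, efficiency):
    """Sustained throughput for a given fraction of theoretical peak."""
    if not 0.0 <= efficiency <= 1.0:
        raise ValueError("efficiency must be between 0 and 1")
    return peak_gflops * efficiency


# Hypothetical peak, for illustration only.
PEAK_GFLOPS = 1000.0

# Efficiency figures as quoted in this post.
cases = [
    ("typical student project on Nvidia", 0.25),
    ("reported figure on AMD", 0.50),
    ("Volkov's tuned code", 0.70),
]

for label, efficiency in cases:
    gflops = sustained_gflops(PEAK_GFLOPS, efficiency)
    print(f"{label}: {gflops:.0f} Gflops sustained")
```

On the same hypothetical hardware, going from 25% to 70% is a factor of
2.8 in delivered performance, from the code alone.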

See: http://www.cs.berkeley.edu/~volkov/

So you want to build a supercomputer that is now 10x more expensive,
and lose more efficiency with each generation of newer hardware,
whereas those who make the effort to write good new code get very high
efficiency?

Just learn how to program and ignore the disinformation: if you have
a box that fast, you really can get a lot of speed out of it.

You shouldn't ask for a 1 billion dollar box that can run your
old-school Fortran codes as well as a 5 million dollar GPU box;
look instead at what you can do to write good codes for that manycore
hardware.
OpenCL works on all of them; CUDA only on Nvidia.

Vincent

On Jan 25, 2012, at 11:01 PM, Prentice Bisbal wrote:

> On 01/24/2012 12:02 AM, Steve Crusan wrote:
>>
>>
>> On Jan 23, 2012, at 8:44 PM, Vincent Diepeveen wrote:
>>
>>
>>> It's 500 euro for a 1 teraflop double precision Radeon HD7970...
>>
>>
>> Great, and nothing runs on it. GPUs are insanely useful for certain
>> tasks, but they aren't going to be able to handle most normal
>> workloads (similar to the BG class of course). Any center that buys BGP
>> (or Q at this point) gear is going to pay for a scientific programmer
>> to adapt their code to take advantage of the BG's strengths:
>> parallelism.
>>
>> But it's nice that supercomputing centers use GPUs to boost their
>> flops numbers. Any word on that Chinese system's efficiency? If you
>> look at the architecture of the new K computer in Japan, it's similar
>> to the BlueGene line.
>
> I attended a presentation at Princeton U. on Monday about the state of
> HPC in China. The talk was given by someone who has been to China and
> spoken with the leaders of their HPC efforts. While the Chinese systems
> get great scores on LINPACK, even the Chinese concede that on their
> "real" applications, they are getting well below the theoretical max
> flops, because their codes aren't getting the most out of their systems.
> In other words, on real programs, they aren't all that efficient (yet).
>
> --
> Prentice