[Beowulf] cpu's versus gpu's - was Intel buys QLogic InfiniBand business

Vincent Diepeveen diep at xs4all.nl
Tue Jan 24 04:51:54 PST 2012

On Jan 24, 2012, at 6:02 AM, Steve Crusan wrote:

> On Jan 23, 2012, at 8:44 PM, Vincent Diepeveen wrote:
>> It's 500 euro for a 1 teraflop double precision Radeon HD7970...
> Great, and nothing runs on it.

You build a system costing millions of euros altogether, NCSA has a
huge budget, and you can't even pay for a few programmers to write
some number-crunching code for GPUs?

> GPUs are insanely useful for certain tasks, but they aren't going  
> to be able to handle most normal workloads(similar to the BG class  
> of course). Any center that buys BGP (or Q at this point) gear is  
> going to pay for a scientific programmer to adapt their code to  
> take advantage of the BG's strengths; parallelism.

BlueGene is IBM's equivalent of an HPC GPU, except that such a box is
a lot more expensive.

> But It's nice that supercomputing centers use GPUs to boost their  
> flops numbers. Any word on that Chinese system's efficiency?

Actually, if you scroll back through this mailing list's history to
2007, some Chinese researchers posted here that their codes were
already reaching 50% IPC on the 512-streamcore ATI cards, and that the
code worked cross-platform on AMD and Nvidia. They got 25% efficiency
on Nvidia.

Now if we realize that most codes on this planet can't use
multiply-add, then 25% on Nvidia and 50% on ATI was really good.
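
To make the multiply-add point concrete: advertised peak flop figures
normally count one multiply-add as two flops, so code that cannot use
multiply-add is capped at half of the advertised peak. Below is a
minimal sketch of that arithmetic. The function name and the
fma_fraction parameter are made up for illustration, and it assumes
the 50% and 25% figures above are read as fractions of peak flops.

```python
def peak_fraction_ceiling(fma_fraction: float) -> float:
    """Upper bound on the fraction of advertised peak flops a kernel can reach.

    Assumption: advertised peak counts one multiply-add (2 flops) per lane
    per cycle, and any other arithmetic instruction delivers 1 flop.
    fma_fraction is the share of issued arithmetic instructions that are
    multiply-adds (hypothetical parameter, for illustration only).
    """
    if not 0.0 <= fma_fraction <= 1.0:
        raise ValueError("fma_fraction must be between 0 and 1")
    # Average flops delivered per issued arithmetic instruction.
    flops_per_instruction = 2.0 * fma_fraction + 1.0 * (1.0 - fma_fraction)
    # Peak assumes 2 flops per instruction, so divide by 2.
    return flops_per_instruction / 2.0


for share in (0.0, 0.5, 1.0):
    ceiling = peak_fraction_ceiling(share)
    print(f"multiply-add share {share:.0%}: ceiling {ceiling:.0%} of peak")
```

Under that assumption, a code with no multiply-add at all that
reaches 50% of peak is already at its ceiling, and one that reaches
25% is at half of what is attainable.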

Look at all sorts of applications and you see that once one good
programmer puts in the effort, it suddenly works great on GPUs.

> If you look at the architecture of the new K computer in Japan,  
> it's similar to the BlueGene line.
> PS: I'm really not an IBMer.

I took a look at the latest BlueGene/Q, and basically what they are
going to build is 4 threads per core, 18 cores, at 1.6 GHz or so.
That's a much improved chip over the old BlueGenes, which are 3 watts
per gflop.

Yet to my surprise, or maybe not, it's still not in the league of
GPUs. The not-yet-built BlueGene/Q supercomputer claims 2 gflops per
watt now.

GPUs are at 4 gflops per watt now, and you can already buy one in a
shop.

And at least one Chinese researcher posted here in 2007 reporting
2 gflops per watt out of one.
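
As a rough sketch of what these efficiency figures mean in power
terms, the snippet below converts them to watts per teraflop. It uses
only the numbers quoted in this thread (claims, not measurements) and
assumes "flops per watt" means gflops per watt. The dictionary keys
and the watts_for helper are illustrative names.

```python
# Figures as quoted in this thread, not measurements.
# Assumption: "flops per watt" in the text means gflops per watt.
gflops_per_watt = {
    "old BlueGene": 1.0 / 3.0,  # quoted as 3 watts per gflop
    "BlueGene/Q (claimed)": 2.0,
    "GPU (claimed)": 4.0,
}


def watts_for(gflops: float, efficiency: float) -> float:
    """Power needed to sustain `gflops` at `efficiency` gflops per watt."""
    return gflops / efficiency


# 1 teraflop double precision, the HD7970 figure quoted earlier in the thread.
target_gflops = 1000.0
for name, eff in gflops_per_watt.items():
    watts = watts_for(target_gflops, eff)
    print(f"{name}: {eff:.2f} gflops/W, {watts:.0f} W per teraflop")
```

On the quoted figures that comes to roughly 3000 W, 500 W and 250 W
per teraflop respectively. These are peak claims, so sustained
application performance per watt will be lower on all three.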

Whatever runs efficiently on such IBM hardware should also be no
problem to port to a GPU.

I see no prices quoted for what BlueGene/Q is going to cost, yet we
can be sure it will cost you more than a GPU in the shops.

So a chip not yet sold by IBM (if I may believe the wiki), designed
specifically for this purpose, can't compete with a GPU that is
already in the shops and was designed for gamers.

Realize that the GPU was designed for single-precision calculations
and delivers 4x more single-precision flops than double, and we are
comparing it on double precision here.

BG/Q uses 45 nm processors and the AMD HD7970 uses 28 nm process
technology, just to show my point.

>> On Jan 24, 2012, at 2:19 AM, Ellis H. Wilson III wrote:
>>> On 01/23/2012 07:40 PM, Vincent Diepeveen wrote:
>>>>>> On Jan 24, 2012, at 12:02 AM, Joshua mora acosta wrote:
>>>>>> Nanosecond latency of QPI using 2 rings versus something that has a
>>>>>> latency up to factor 1000 slower
>>>>>> with the pci-e as the slowest delaying factor.
>>>>>> Doing cache coherency over that forget it.
>>>>> Hear that Shai F?  Stop work on vSMP now, cause Vincent says it
>>>>> can't work!!!
>>>>> More seriously, with this acquisition, I could see serious
>>>>> contention for ScaleMP.  SoC type stuff, using IB between many
>>>>> nodes, in smaller boxen.
>>>> That would be some BlueGene type machine you speak about that intel
>>>> would produce with a low power SoC.
>>>> This is where, at this point, the BlueGene-type machines simply
>>>> can't compete with the tiny processors
>>>> that get produced by the dozens of millions.
>>> For...chess?  ;D
>>>> "The tiny processors have won"
>>>>     Linus Thorvalds
>>> *Torvalds, and if Linux (or any well-supported kernel/OS for that
>>> matter) currently had data structures designed for extremely high
>>> parallelism on a single MoBo (i.e. 100s to 10,000s of cores) then I
>>> would agree with this statement.  As I currently see it, all we can
>>> really say is that someday, probably, perhaps even hopefully:
>>> "The tiny processors will win."
>>> That's after we work out all the nasty nuances involved with designing
>>> new data structures for OSes that can handle that number of cores, and
>>> probably design new applications that can use these new OS features.
>>> And no, GPU support in Linux doesn't count as this already having been
>>> done.  We just farm out very specific code to run on those things.  If
>>> somebody has an example of a full-blown, usable OS running on a GPU
>>> ALONE, I would stand (very interestingly) corrected.
>>>> Intel has themselves a second law of Moore. You can google for it.
>>> Thanks, for a moment there, I almost used AskJeeves.
>>>> A good example of massproduced processors are gpu's.
>>> Was waiting for the hook.  Inevitable really.  I think if we were
>>> discussing the efficacy and quality of resultant bread from various
>>> bread machines versus the numerous methods for making bread by hand
>>> somehow, someway, a GPU would make better bread.  Might be a wholesome
>>> cyber-loaf of artisan wheat, but nonetheless, it would be better in
>>> every way.
>>> Best,
>>> ellis
>>> _______________________________________________
>>> Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin
>>> Computing
>>> To change your subscription (digest mode or unsubscribe) visit
>>> http://www.beowulf.org/mailman/listinfo/beowulf
>  ----------------------
>  Steve Crusan
>  System Administrator
>  Center for Research Computing
>  University of Rochester
>  https://www.crc.rochester.edu/
