NDAs Re: [Beowulf] Nvidia, cuda, tesla and... where's my double floating point?

Bill Broadley bill at cse.ucdavis.edu
Mon Jun 16 18:58:40 PDT 2008

Greg Lindahl wrote:
> On Mon, Jun 16, 2008 at 05:51:21PM -0700, Bill Broadley wrote:
>> Personally finding a port of McCalpin's stream seeing 50GB/sec or so
>> caught my attention.
> Well, given that GPUs don't do so hot on Linpack, one should hope that
> they're close to peak on *something* !

Heh.  Is there a published Linpack for some CUDA-based solution?  Or possibly 
code available?

> For a while I was trying to incite some friends to go do a CPU that
> plugged into VRAM, but it didn't get very far.

Well, hopefully ATI and AMD will figure out a way to give a $300 CPU as much 
memory bandwidth as a $200 video card.  It seems like the continuing 
increase in cores, not to mention the yet again increased width of the SSE 
registers, would make increased bandwidth more usable.
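
To put rough numbers on why bandwidth ends up being the limiter, here is a
back-of-the-envelope sketch in Python.  It assumes the stream triad kernel
(a[i] = b[i] + s * c[i]) with the usual counting of two loads, one store and
two flops per element, and 8-byte doubles by default.  The function names are
made up for illustration, and the 50GB/sec figure is just the one quoted
above, not something I measured.

```python
# Rough arithmetic for a bandwidth-bound triad: a[i] = b[i] + s * c[i].
# Assumptions: 2 loads + 1 store and 2 flops (multiply + add) per element,
# no cache reuse. All names here are hypothetical helpers.

def triad_bytes(n, word_bytes=8):
    """Bytes moved by one triad pass over n elements."""
    return 3 * n * word_bytes

def triad_flops(n):
    """Floating point operations in one triad pass over n elements."""
    return 2 * n

def bandwidth_needed_gbs(gflops, word_bytes=8):
    """Memory bandwidth (GB/s) needed to sustain `gflops` on triad."""
    bytes_per_flop = triad_bytes(1, word_bytes) / triad_flops(1)
    return gflops * bytes_per_flop

def triad_gflops_at(bandwidth_gbs, word_bytes=8):
    """Triad GFLOPS that a given memory bandwidth (GB/s) can feed."""
    bytes_per_flop = triad_bytes(1, word_bytes) / triad_flops(1)
    return bandwidth_gbs / bytes_per_flop

if __name__ == "__main__":
    # Double precision: 12 bytes per flop, so 50 GB/s feeds roughly 4.2 GFLOPS.
    print(triad_gflops_at(50.0))
    # Single precision halves the traffic: 6 bytes per flop.
    print(triad_gflops_at(50.0, word_bytes=4))
```

Under those assumptions every extra core or wider SSE register just adds
flops that sit idle on this kind of loop unless the memory system keeps up.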

I'm pretty sure HyperTransport allows for a significant number of outstanding 
memory transactions, so even a single GPU/CPU hybrid could farm out a 
100GB/sec memory system to numerous sockets... sounds like a good 
justification for HT3 to me.

