[Beowulf] NVIDIA GPUs, CUDA, MD5, and "hobbyists"

Vincent Diepeveen diep at xs4all.nl
Mon Jun 23 14:57:06 PDT 2008


Not really,

The architectures of AMD and Nvidia are quite different. I would
encourage each manufacturer to have its own system.
So to speak, AMD is a low-clocked Core 2 supercomputer at 64 cores,
versus Nvidia, a 240-processor MIPS supercomputer.

I feel the real limitation is that the GPUs' achievements exist only
on marketing paper.

I can also claim my car's engine is capable of driving at 20000 miles
per hour.

Sure it is, in space!

A PC processor is only as good as its caches and memory controller are.

There is too little technical data about GPUs with respect to
bottlenecks, whereas bottlenecks will of course dominate such hardware.
If you KNOW what the bottleneck is, then in a theoretical model you can
work around it.
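
To make the "theoretical model" idea concrete, here is a minimal sketch
of a roofline-style estimate: attainable throughput is the smaller of
peak compute and memory bandwidth times arithmetic intensity. The
function name and all the numbers are made up for illustration; they
are not measured or vendor-published figures for any real card.

```python
def attainable_gflops(peak_gflops, bandwidth_gb_s, flops_per_byte):
    """Roofline-style bound: compute-limited or memory-limited, whichever is lower."""
    memory_bound = bandwidth_gb_s * flops_per_byte
    return min(peak_gflops, memory_bound)


# Hypothetical card (assumed numbers): 500 GFLOPS on paper,
# 100 GB/s between the processors and device memory.
PEAK = 500.0
BANDWIDTH = 100.0

if __name__ == "__main__":
    for intensity in (0.25, 1.0, 4.0, 16.0):
        bound = attainable_gflops(PEAK, BANDWIDTH, intensity)
        limit = "memory" if bound < PEAK else "compute"
        print(f"{intensity:5.2f} flop/byte -> {bound:6.1f} GFLOPS ({limit}-bound)")
```

With numbers like these, a kernel that does only a fraction of a flop
per byte fetched sits far below the paper peak, which is exactly the
kind of thing you want to know before you start writing code.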

Going by the few reports from individuals who work full time with GPUs
and who tried writing number-crunching code for them,
yes, even 32-bit number-crunching code, the practical odds of
success for an individual programmer are currently too small.

It is a fact that to program those things well, you first need to be
one hell of a programmer. Such programmers know very well
that you need full technical information, even if that means bad news
for Nvidia and AMD, as the GPUs then suddenly look a lot weaker.

If there are technical specifications that also show the bottlenecks
very clearly,
then the algorithmically strong among us (trying not to look out of
the window too much)
will find some clever solutions to get a specific thing done.

This is all work on paper.

If there is an algorithmic solution on paper, or even a method for
getting something done, then there will be programmers implementing it,
as they see no risks. They just see a solution that is going to give
them more crunching power.

It is all about risk assessment from the programmer's viewpoint.

Right now the only thing the programmer knows is the big bragging
stories. He of course realizes that if you do something within the
GPU's register files,
it can be fast. Other than that, he knows that it is first and
foremost a GPU, meant for displaying graphics and not designed
specifically
for number crunching.

If adopting a platform means jumping in at the deep end, that is,
without information, you just don't do it.

At the time, if you went for SSE/SSE2 assembler code, you knew its full
specs: every instruction, the latency of every instruction,
and so on.

To take the step to CONSIDER writing something on a GPU means that
the programmer in question is already a total hardcore addict;
you really want to get the ultimate performance out of the hardware
for your number crunching. The same is true for SSE/SSE2.

I would argue that writing SSE2 code is tougher than writing for a GPU,
seen from an implementation viewpoint,
provided that you DO have a parallel model of how to get
things done on a GPU.

The number of people who know how to write a parallel model on paper
that works in theory and gets the maximum out of crunching
hardware that is nontrivial to parallelize is really small. If
within a specific specialism there are more than a dozen, that's a lot
already.
The number of good programmers who can write you that code, in
whatever language, is small compared to the total number of
programmers,
but really huge compared to the number of algorithm designers who are
experts in your field.

It will not take long until such solutions are simply posted on the
net. That might increase the number of people who toy with GPUs.
Seymour Cray's statement, "If you were plowing a field, which would
you rather use? Two strong oxen or 1024 chickens?" is very true from
a practical viewpoint: it is simpler to work with 4 cores than with
64, let alone 240. Objectively, of course, such a majority should be
able to beat 2 strong oxen. That doesn't mean it is simple to do.
So you can count the people who start writing solutions there on a
few hands. Currently most of them are students who tried it on a
card that, even by its marketing numbers, delivers so few
single-precision gflops that it really must be seen as a hobby
project of a student learning advanced programming a tad better,
as their quad-core with existing highly optimized free software is
surely faster.
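
The oxen versus chickens trade-off can be put into numbers with
Amdahl's law. This is only a sketch under stated assumptions: the
"one ox equals 50 chickens" strength ratio and the parallel fractions
are invented for illustration, and the function names are my own.

```python
def amdahl_speedup(parallel_fraction, workers):
    """Amdahl's law: speedup over one worker when only part of the job parallelizes."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / workers)


def relative_throughput(parallel_fraction, workers, strength):
    """Throughput relative to a single worker of strength 1.0."""
    return strength * amdahl_speedup(parallel_fraction, workers)


OX_STRENGTH = 50.0       # assumption: one ox does the work of 50 chickens
CHICKEN_STRENGTH = 1.0

if __name__ == "__main__":
    for p in (0.90, 0.99, 0.999):
        oxen = relative_throughput(p, 2, OX_STRENGTH)
        chickens = relative_throughput(p, 1024, CHICKEN_STRENGTH)
        winner = "chickens" if chickens > oxen else "oxen"
        print(f"parallel fraction {p}: oxen {oxen:.1f}, "
              f"chickens {chickens:.1f} -> {winner}")
```

Under these assumed numbers the chickens only win once nearly all of
the work parallelizes, which is the "doesn't mean it is simple" part.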

That is very weird in itself, as hardly anyone
doubts that in the long run many tiny processors are going to win
at number crunching.

Vincent

On Jun 23, 2008, at 6:44 PM, Bogdan Costescu wrote:

> On Wed, 18 Jun 2008, Prentice Bisbal wrote:
>
>> The biggest hindrance to doing "real" work with GPUs is the lack
>> of double-precision capabilities.
>
> I think that the biggest hindrance is the lack of a unified API or
> language for all these accelerators (taking into account not only
> the GPUs!). Many developers are probably scared that their code
> depends on the whim of the accelerator producer in terms of
> long-term compatibility of the source code with the API or language,
> or of the binary code with the available hardware; sure, you can't
> prevent the hardware being obsoleted or the company from going out
> of business, but if you're only one recompilation away it's manageable.
>
> At last week's ISC'08, after the ATI/AMD and NVidia talks,
> someone asked an NVidia guy about any plans of unification with at
> least ATI/AMD on this front and the answer was "we're not there
> yet"... while the ATI/AMD presentation went on to say "we learnt
> from mistakes with our past implementations and we present you now
> with OpenCL" - yet another way of programming their GPU...
>
> I see this situation as very similar to the SSE vs. 3DNow! one of
> some years ago, or the one before MPI came to replace all the
> proprietary communication libraries. Does anybody else share this view?
>
> -- 
> Bogdan Costescu
>
> IWR, University of Heidelberg, INF 368, D-69120 Heidelberg, Germany
> Phone: +49 6221 54 8869/8240, Fax: +49 6221 54 8868/8850
> E-mail: bogdan.costescu at iwr.uni-heidelberg.de
