[Beowulf] GPU question

Micha Feigin michf at post.tau.ac.il
Tue Sep 1 01:20:01 PDT 2009


On Sun, 30 Aug 2009 04:35:30 +0500
amjad ali <amjad11 at gmail.com> wrote:

> Hello all, specially Gil Brandao
> 
> Actually I want to start CUDA programming for my |C. I have two options:
> 1) Buy a new PC that will have 1 or 2 CPUs and 2 or 4 GPUs.
> 2) Add 1 GPU to each of the four nodes of my PC-cluster.
> 
> Which one is the more "natural" and "practical" way?
> Will a program written for one of these setups work fine on the other,
> or will we have to re-program it for the other?
> 

If you use MPI with several processes, each controlling one GPU, the same code
will work in both scenarios. The first scenario, though, makes communication
easier and lets you avoid MPI altogether: CUDA kernels run asynchronously, so
one process can control all the GPUs and still do work on the PC in parallel.
If the PCs are powerful enough, the communication load is low enough, and you
want the CPUs and GPUs both doing real work, then the second option may work
better. It also makes cooling simpler and lets you use smaller PSUs. Note that
each of these monsters (GT200, Tesla, Quadro) can draw over 200W and will solve
your heating problem for the winter if you're in Alaska.

Also note that there are big differences in communication speed and latency.
With all the cards in one PC, communication between them goes from GPU over
PCIe to host memory and back. Across several PCs the path is

gpu (pc a) -- pci-e --> memory (pc a) -- network --> memory (pc b) -- pci-e --> gpu (pc b)
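In CUDA-C-style pseudocode, the multi-node path above amounts to one extra
network hop on each side. This is a fragment only (no error checking, no
setup); the buffer names d_buf and h_buf are illustrative, but the
cudaMemcpy and MPI_Send/MPI_Recv calls are the standard ones:

```c
/* Sketch: moving a device buffer from the GPU on node A to the GPU on
   node B. In the single-PC case only the two cudaMemcpy hops exist. */

/* On node A (sender): */
cudaMemcpy(h_buf, d_buf, nbytes, cudaMemcpyDeviceToHost);   /* GPU -> host RAM */
MPI_Send(h_buf, nbytes, MPI_BYTE, peer, tag, MPI_COMM_WORLD); /* over the network */

/* On node B (receiver): */
MPI_Recv(h_buf, nbytes, MPI_BYTE, peer, tag, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
cudaMemcpy(d_buf, h_buf, nbytes, cudaMemcpyHostToDevice);   /* host RAM -> GPU */
```

Each hop adds latency, which is why the single-PC layout wins when the GPUs
need to exchange data frequently.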

> Regards.
> 
> On Sat, Aug 29, 2009 at 5:48 PM, <madskaddie at gmail.com> wrote:
> 
> > On Sat, Aug 29, 2009 at 8:42 AM, amjad ali<amjad11 at gmail.com> wrote:
> > > Hello All,
> > >
> > >
> > >
> > > I perceive following computing setups for GP-GPUs,
> > >
> > >
> > >
> > > 1)      ONE PC with ONE CPU and ONE GPU,
> > >
> > > 2)      ONE PC with more than one CPUs and ONE GPU
> > >
> > > 3)      ONE PC with one CPU and more than ONE GPUs
> > >
> > > 4)      ONE PC with TWO CPUs (e.g. Xeon Nehalems) and more than ONE GPUs
> > > (e.g. Nvidia C1060)
> > >
> > > 5)      Cluster of PCs with each node having ONE CPU and ONE GPU
> > >
> > > 6)      Cluster of PCs with each node having more than one CPUs and ONE
> > GPU
> > >
> > > 7)      Cluster of PCs with each node having ONE CPU and more than ONE
> > GPUs
> > >
> > > 8)      Cluster of PCs with each node having more than one CPUs and more
> > > than ONE GPUs.
> > >
> > >
> > >
> > > Which of these are good/realistic/practical; which are not? Which are
> > > quite “natural” to use for CUDA-based programs?
> > >
> >
> > CUDA is a fairly new technology, so I don't think there is a "natural
> > use" yet, though I've read that there are people doing CUDA+MPI and
> > there are papers on CPU+GPU algorithms.
> >
> > >
> > > IMPORTANT QUESTION: Will a CUDA-based program be equally good for
> > > some/all of these setups, or do we need to write different CUDA-based
> > > programs for each setup to get good efficiency?
> > >
> >
> > There is no "one size fits all" answer to your question. If you have
> > never developed with CUDA, buy one GPU and try it. If it fits your
> > problems, scale it up with the approach that makes you most comfortable
> > (but remember that scaling means making bigger problems or having more
> > users). If you want a rule of thumb: your code must be
> > _truly_parallel_. If you are buying for someone else, remember that
> > this is a niche. The whole thing is just starting; I don't think many
> > people need more than 1 or 2 GPUs.
> >
> > >
> > > Comments are welcome also for AMD/ATI FireStream.
> > >
> >
> > Put it on hold until OpenCL takes off (in the real sense, not in the
> > "standards papers" sense); otherwise you will have to learn another
> > technology that even fewer people know.
> >
> >
> > Gil Brandao
> >




More information about the Beowulf mailing list