[Beowulf] NVIDIA GPUs, CUDA, MD5, and "hobbyists"
Kilian CAVALOTTI
kilian at stanford.edu
Thu Jun 19 17:16:41 PDT 2008
On Thursday 19 June 2008 04:32:11 pm Chris Samuel wrote:
> ----- "Kilian CAVALOTTI" <kilian at stanford.edu> wrote:
> > AFAIK, the multi GPU Tesla boxes contain up to 4 Tesla processors,
> > but are hooked to the controlling server with only 1 PCIe link,
> > right? Does this spell like "bottleneck" to anyone?
>
> The nVidia website says:
>
> http://www.nvidia.com/object/tesla_tech_specs.html
>
> # 6 GB of system memory (1.5 GB dedicated memory per GPU)
The latest S1070 has even more than that: 4 GB per GPU, it seems,
according to [1].
But I think this refers to the "global memory", as described in [2]
(slide 12, "Kernel Memory Access"). It's the graphics card's main memory,
the kind that is used to store textures in games, for instance.
Each GPU core also has what they call "shared memory", which is
really only shared between threads on the same core (it's more like an
L2 cache actually).
> So my guess is that you'd be using local RAM not the
> host systems RAM whilst computing.
Right, but at some point, you do need to transfer data from the host
memory to the GPU memory, and back. That's where there's probably a
bottleneck if all 4 GPUs want to read/dump data from/to the host at the
same time.
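As a rough back-of-the-envelope sketch of that concern: the link speed below
is an assumed figure (the theoretical peak of a PCIe 1.x x16 link, about
4 GB/s per direction), not a measurement of the Tesla box, and the function
name is made up for illustration. It only compares 4 GPUs funnelled through
one link against 4 GPUs with a link each.

```python
# Back-of-the-envelope model: several GPUs sharing one host link.
# All numbers are illustrative assumptions, not measurements.

def transfer_seconds(bytes_per_gpu, n_gpus, link_bytes_per_s, shared=True):
    """Time until every GPU has finished its host<->device copy.

    shared=True : all GPUs go through one link, so the copies are
                  effectively serialized (each gets 1/n of the bandwidth).
    shared=False: each GPU has its own link of the same speed.
    """
    total = bytes_per_gpu * n_gpus if shared else bytes_per_gpu
    return total / link_bytes_per_s

GB = 1e9
LINK = 4 * GB        # assumed: ~4 GB/s peak per direction (PCIe 1.x x16)
PER_GPU = 1.5 * GB   # filling the 1.5 GB of dedicated memory on each GPU

one_link = transfer_seconds(PER_GPU, 4, LINK, shared=True)
own_link = transfer_seconds(PER_GPU, 4, LINK, shared=False)
print(f"shared link: {one_link:.3f} s, dedicated links: {own_link:.3f} s")
```

Under those assumptions the shared link takes 4 times as long to feed all
4 GPUs, which only matters if the kernels are short compared to the copies.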
Moreover, I don't think that the different GPUs can work together, i.e.
exchange data and participate in the same parallel computation. Unless
they release something along the lines of a CUDA-MPI, those 4 GPUs
sitting in the box would have to be considered independent
processing units. So as I understand it, the scaling benefits from your
application's parallelization would be limited to one GPU, no matter
how many you have hooked up to your machine.
I don't even know how you choose (or even if you can choose) which
GPU your code gets executed on. It has to be handled by the
driver on the host machine somehow.
> There's a lot of fans there..
They probably get hot. At least the G80s do. They say "Typical Power
Consumption: 700W" for the 4-GPU box. Given that a modern gaming rig
featuring a pair of 8800GTXs in SLI already requires a 1kW PSU, I would
put this on the optimistic side.
[1] http://www.nvidia.com/object/tesla_s1070.html
[2] http://www.mathematik.uni-dortmund.de/~goeddeke/arcs2008/C1_CUDA.pdf
Cheers,
--
Kilian