[Beowulf] GPU Beowulf Clusters

Jon Forrest jlforrest at berkeley.edu
Thu Jan 28 09:38:14 PST 2010


I'm about to spend ~$20K on a new cluster
that will be a proof-of-concept for doing
GPU-based computing in one of the research
groups here.

A GPU cluster is different from a traditional
HPC cluster in several ways:

1) CPU speed and the number of CPU cores are not
that important, because most of the computing will
be done on the GPU.

2) Serious GPU boards are double-width, full-length
cards, so they don't easily fit into standard 1U
pizza boxes. They also draw more power than the
standard power supplies in such boxes can provide.
I'm not familiar with the chassis that should
therefore be used in a GPU cluster.

3) Ideally, I'd like to put more than one GPU
card in each computer node, but then I hit the
issues in #2 even harder.

4) Assuming that a GPU can't be "time shared", I'll
have to set up my batch engine to treat each GPU as
a non-sharable consumable resource (see the scheduler
sketch after this list). This means that I'll only be
able to run as many jobs on a compute node as there
are GPUs in it. It also means that it would be
wasteful to put CPUs in a compute node with more
cores than the node has GPUs. (This assumes that the
jobs don't do anything parallel on the CPUs - only on
the GPUs.) Even if GPUs can be time shared, the
expense of copying between main memory and GPU memory
means that sharing a GPU among several processes will
degrade performance (see the CUDA sketch below).
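
To make #4 concrete, here is a minimal sketch of the
kind of scheduler setup I mean, assuming Sun Grid
Engine as the batch engine (the "gpu" complex name
and the per-host GPU count are made up):

# 1) Define a consumable integer complex (qconf -mc):
#name  shortcut  type  relop  requestable  consumable  default  urgency
gpu    gpu       INT   <=     YES          YES         0        0

# 2) Declare how many GPUs each execution host has
#    (qconf -me <hostname>):
complex_values        gpu=2

# 3) Have every job request one GPU. SGE then runs at
#    most two such jobs on that host, no matter how
#    many CPU cores it has:
qsub -l gpu=1 my_gpu_job.sh

The catch is that SGE only counts GPUs - it doesn't
tell a job which device it was given - so the job
script still has to pick a free device itself (e.g.
via cudaSetDevice()).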

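And here is a minimal CUDA sketch of the usage
pattern behind the copying expense in #4 (the scale()
kernel and the buffer size are hypothetical). Every
process using the GPU pays for the cudaMemcpy() calls
in both directions, so interleaving several processes
on one device mostly adds transfer overhead. It also
illustrates #1: the host core does little besides
stage data and wait.

#include <stdio.h>
#include <stdlib.h>
#include <cuda_runtime.h>

/* Hypothetical kernel: scale a vector in place. */
__global__ void scale(float *x, float a, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        x[i] *= a;
}

int main(void)
{
    const int n = 1 << 24;            /* 16M floats, 64 MB */
    size_t bytes = n * sizeof(float);
    float *h = (float *)malloc(bytes);
    float *d;
    int i;

    for (i = 0; i < n; i++)
        h[i] = 1.0f;
    cudaMalloc((void **)&d, bytes);

    /* The expensive part: 64 MB over PCIe, each way,
       for every process that touches the device. */
    cudaMemcpy(d, h, bytes, cudaMemcpyHostToDevice);
    scale<<<(n + 255) / 256, 256>>>(d, 2.0f, n);
    cudaMemcpy(h, d, bytes, cudaMemcpyDeviceToHost);

    printf("h[0] = %f\n", h[0]);      /* should print 2.0 */
    cudaFree(d);
    free(h);
    return 0;
}
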
Are there any other issues I'm leaving out?

Cordially,
-- 
Jon Forrest
Research Computing Support
College of Chemistry
173 Tan Hall
University of California Berkeley
Berkeley, CA
94720-1460
510-643-1032
jlforrest at berkeley.edu


