[Beowulf] Opinions of Hyper-threading?

Thu Feb 28 13:02:00 PST 2008

> STREAM Benchmark implementation in CUDA
> Array size (single precision)=8000000
> using 128 threads per block, 62500 blocks
> Function      Rate (MB/s)   Avg time     Min time     Max time
> Copy:       16706.3212       0.0039       0.0038       0.0044
> Scale:      16666.2770       0.0046       0.0038       0.0100
> Add:        18408.0866       0.0053       0.0052       0.0056
> Triad:      18738.6603       0.0052       0.0051       0.0055

I got:

  STREAM Benchmark implementation in CUDA
  Array size (single precision)=8000000
  using 128 threads per block, 62500 blocks
  Function      Rate (MB/s)   Avg time     Min time     Max time
  Copy:       50006.6051       0.0013       0.0013       0.0013
  Scale:      50006.6051       0.0013       0.0013       0.0013
  Add:        56409.8044       0.0017       0.0017       0.0017
  Triad:      56409.8044       0.0017       0.0017       0.0017

on an "nVidia Corporation G80 [Quadro FX 4600] (rev a2)".
Wikipedia quotes 67.2 GB/s theoretical memory bandwidth for this card.
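
For anyone who wants to check the numbers above against the theoretical figure, here is a rough sketch of the arithmetic. It assumes the CUDA port counts bytes the way the reference STREAM code does (two arrays touched for Copy/Scale, three for Add/Triad, MB = 10^6 bytes) and that the 67.2 GB/s peak uses GB = 10^9 bytes. The function names are mine, not anything from the benchmark source.

```python
# Sketch only: byte counting follows the reference STREAM convention,
# which is an assumption about how the CUDA port reports its rates.
ARRAYS_TOUCHED = {"Copy": 2, "Scale": 2, "Add": 3, "Triad": 3}


def stream_bytes(kernel, n, elem_size=4):
    """Bytes moved by one pass of a kernel over n elements (4 = single precision)."""
    return ARRAYS_TOUCHED[kernel] * n * elem_size


def implied_time(kernel, rate_mb_s, n, elem_size=4):
    """Seconds per pass implied by a reported rate in MB/s (MB = 1e6 bytes)."""
    return stream_bytes(kernel, n, elem_size) / (rate_mb_s * 1e6)


def fraction_of_peak(rate_mb_s, peak_gb_s):
    """Reported rate as a fraction of a theoretical peak given in GB/s (GB = 1e9 bytes)."""
    return rate_mb_s / (peak_gb_s * 1000.0)


N = 8_000_000      # array size from the benchmark banner
PEAK_GB_S = 67.2   # theoretical figure quoted above

# Rates copied from the Quadro FX 4600 run above.
reported = {
    "Copy": 50006.6051,
    "Scale": 50006.6051,
    "Add": 56409.8044,
    "Triad": 56409.8044,
}

for kernel, rate in reported.items():
    t_ms = implied_time(kernel, rate, N) * 1e3
    pct = 100.0 * fraction_of_peak(rate, PEAK_GB_S)
    print(f"{kernel:6s} {t_ms:.2f} ms per pass, {pct:.1f}% of peak")
```

Under those assumptions the implied times match the printed Min time column after rounding, and the rates work out to roughly 74% (Copy/Scale) and 84% (Add/Triad) of the quoted peak.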

The results were the same whether the machine was in init 3 (no X) or
init 5 (X running), though the X config was just an idle 1280x1024 server.

> Kudos to Nvidia for having a linux friendly toolchain that I could find, 
> download, install, and compile a code with minimal hassle.

Absolutely.  AMD has really dropped the ball on this, even though it looks
like they at least announced availability of DP (double precision) earlier...