[Beowulf] Engineers boost AMD CPU performance by 20% without overclocking
hahn at mcmaster.ca
Tue Feb 28 12:09:28 PST 2012
> The paper is now available online, "CPU-Assisted GPGPU on Fused
> CPU-GPU Architectures":
thanks for the reference.
> (I have not read the whole paper yet) I think the core idea is that
> the CPU acts as a prefetch thread and pulls data into the shared L3
> for the GPU cores (this work is like other prefetch thread research
yes, though it's a bit puzzling, since the whole point of GPU design
is to have lots of runnable threads on hand, so that you simply switch
from stalled to non-stalled threads to hide latency.
so in the context of prefetching, I'd expect a bundle of threads to
make a non-prefetched reference and stall, while other bundles keep
the vector unit busy until the reference is resolved. gotta read the paper I guess!
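
a toy model of the latency-hiding argument above, as a rough sketch only.
it is not taken from the paper and does not model any real GPU: the function
name, the parameters, and the numbers (4 compute cycles per memory reference,
400 cycles of memory latency) are all made up for illustration. each bundle
of threads computes for a few cycles, issues a memory reference, and stalls
until it resolves; the scheduler runs whichever ready bundle has waited longest.

```python
def utilization(num_bundles, compute_cycles, mem_latency, refs_per_bundle=100):
    """Fraction of cycles the vector unit is busy in a toy bundle scheduler.

    All parameters are hypothetical; this is an illustration, not a GPU model.
    """
    ready_at = [0] * num_bundles                 # cycle at which each bundle's reference resolves
    remaining = [refs_per_bundle] * num_bundles  # references each bundle still has to issue
    cycle = 0
    busy = 0
    while any(remaining):
        # bundles that still have work and are not stalled on memory
        runnable = [b for b in range(num_bundles)
                    if remaining[b] and ready_at[b] <= cycle]
        if runnable:
            # run the bundle that has been ready the longest
            b = min(runnable, key=lambda i: ready_at[i])
            cycle += compute_cycles
            busy += compute_cycles
            ready_at[b] = cycle + mem_latency    # bundle stalls on its reference
            remaining[b] -= 1
        else:
            # every bundle is stalled: the vector unit idles until one wakes up
            cycle = min(ready_at[b] for b in range(num_bundles) if remaining[b])
    return busy / cycle


if __name__ == "__main__":
    for n in (1, 32, 128):
        print(n, "bundles:", round(utilization(n, 4, 400), 3))
```

with these invented numbers a single bundle keeps the vector unit almost
entirely idle, while enough bundles to cover the latency keep it fully busy
with no prefetching at all, which is why the prefetch-thread result is puzzling.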