[Beowulf] difference between accelerators and co-processors
Vincent Diepeveen
diep at xs4all.nl
Sun Mar 10 14:08:50 PDT 2013
On Mar 10, 2013, at 9:03 PM, Mark Hahn wrote:
>> Is there any line/point that makes a distinction between
>> accelerators and co-processors (which are used in conjunction with
>> the primary CPU to boost performance)? Or can these terms be used
>> interchangeably?
>
> IMO, a coprocessor executes the same instruction stream as the
> "primary" processor. this was the case with the x87, for instance,
> though the distinction became less significant once the x87 came
> on-chip. (though you certainly notice that the FPU on any of these
> chips is mostly separate - not sharing functional units or register
> files, sometimes even with separate micro-op schedulers.)
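To make the "same instruction stream" point concrete, here is a
minimal C sketch. Assuming a 32-bit x86 build (e.g. gcc -m32
-mfpmath=387), the compiler emits x87 instructions inline with the
ordinary integer code; the CPU front-end decodes both, and the FPU
executes its share:

    /* Sketch: built with gcc -m32 -mfpmath=387, this compiles to x87
     * instructions (fldl / fmull / fstpl) embedded in the normal
     * instruction stream -- no separate program, no explicit offload. */
    double scale(double x, double factor)
    {
        return x * factor;
    }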
>
>> Specifically, the word "accelerator" is used commonly with GPUs.
>> On the other hand, the word "co-processor" is used commonly with
>> Xeon Phi.
>
> I don't think it is a useful distinction: both are basically
> independent computers. obviously, the programming model of Phi is
> dramatically more like a conventional processor's than Nvidia's.
>
Mark, that's the marketing talk about Xeon Phi.
It's surprisingly similar, of course, except for the cache coherency:
both are big vector processors.
> there is a meaningful distinction between offload and coprocessor
> approaches. that is, offload means you use the device to accelerate
> a set of libraries (offload matrix multiply, eig, fft, etc.). to use
> a coprocessor, I think the expectation is that the main code will be
> very much aware of the state of the PCIe-attached hardware.
>
> I suppose one might suggest that "accelerator" to some extent implies
> offload usage: you're accelerating a library.
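To illustrate the two styles in C (the dgemm and CUDA calls below are
standard APIs, but the pairing is an editorial sketch, not anything
from the original message):

    #include <cuda_runtime.h>

    /* Offload style: link against an accelerated BLAS (e.g. a
     * drop-in replacement library); the host code is plain BLAS
     * and never touches device state. */
    extern void dgemm_(const char *transa, const char *transb,
                       const int *m, const int *n, const int *k,
                       const double *alpha, const double *a, const int *lda,
                       const double *b, const int *ldb, const double *beta,
                       double *c, const int *ldc);

    /* Coprocessor style: the main code explicitly tracks the
     * PCIe-attached device's memory and transfers (CUDA runtime
     * API shown as one concrete example). */
    void coprocessor_style(double *host, size_t bytes)
    {
        double *dev;
        cudaMalloc((void **)&dev, bytes);   /* device-resident state */
        cudaMemcpy(dev, host, bytes, cudaMemcpyHostToDevice);
        /* ... launch kernels that the host schedules around ... */
        cudaMemcpy(host, dev, bytes, cudaMemcpyDeviceToHost);
        cudaFree(dev);
    }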
>
> another interesting example is AMD's upcoming HSA concept: since
> nearly all GPUs are now on-chip, AMD wants to integrate the CPU and
> GPU programming models (at least to some extent). as far as I
> understand it, HSA is based on introducing a quite general
> intermediate ISA that can be executed using all available hardware
> resources: CPU and/or GPU. although Nvidia does have its own
> intermediate ISA, they don't seem to be trying to make it general,
> *and* they don't seem interested in making it work on both the CPU
> and the GPU. (well, so far at least - I wouldn't be surprised if
> they _did_ have a PTX JIT for their ARM-based C/GPU chips...)
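On the PTX JIT point: the CUDA driver API already JIT-compiles PTX
text at module-load time for whatever GPU is present. A minimal
sketch in C (the do-nothing kernel is hypothetical; the driver-API
calls are the real ones; link with -lcuda):

    #include <cuda.h>

    /* a trivial kernel, written directly in PTX */
    static const char *ptx =
        ".version 3.0\n"
        ".target sm_20\n"
        ".address_size 64\n"
        ".visible .entry noop() { ret; }\n";

    int main(void)
    {
        CUdevice dev; CUcontext ctx; CUmodule mod; CUfunction fn;
        cuInit(0);
        cuDeviceGet(&dev, 0);
        cuCtxCreate(&ctx, 0, dev);
        cuModuleLoadData(&mod, ptx);   /* the driver JITs the PTX here */
        cuModuleGetFunction(&fn, mod, "noop");
        cuLaunchKernel(fn, 1, 1, 1, 1, 1, 1, 0, NULL, NULL, NULL);
        cuCtxSynchronize();
        return 0;
    }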
>
> I think HSA is potentially interesting for HPC, too. I really expect
> AMD and/or Intel to ship products this year that have a C/GPU chip
> mounted on the same interposer as some high-bandwidth RAM.
How can an integrated GPU outperform a GPGPU card?
It's something like 25 watts versus 250 watts; which will be faster?
I assume you will not build 10 nodes with 10 CPUs with integrated
GPUs in order to rival a single card.
> a fixed amount of very high performance memory sounds very tasty to
> me. a surprising amount of power in current systems is spent getting
> high-speed signals off-socket.
>
> imagine a package dissipating, say, 40W and containing, say, 4 CPU
> cores, 256 GPU ALUs and 2GB of GDDR5. the point would be to tile 32
> of them in a 1U box. (dropping socketed, off-package DRAM would
> probably make it uninteresting for memcached and some
> space-intensive HPC.)
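A quick back-of-envelope on those numbers (simple arithmetic, not in
the original message):

    32 packages x 40 W  = 1280 W per 1U box
    32 packages x 2 GB  =   64 GB of on-package GDDR5 per box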
>
> then again, if you think carefully about the numbers, any code today
> that has a big working set is almost as anachronistic as codes that
> use disk-based algorithms. (same conceptual thing happening:
> capacity is growing much faster than the pipe.)
>
> regards, mark hahn.