[Beowulf] getting a phi, dammit.

Thu Mar 7 09:29:35 PST 2013

On Mar 6, 2013, at 9:42 PM, James Cownie wrote:

>
> On 6 Mar 2013, at 06:00, Mark Hahn wrote:
>
>>> The issue here is that because we offer 8GB of memory on the cards,
>>> some BIOSes are unable to map all of it through the PCI either due to
>>> bugs or failure to support so much memory. This is not the only people
>>> suffering
>>
>> interesting.  but it seems like there are quite a few cards out there
>> with 4-6GB (admittedly, mostly higher-end workstation/gp-gpu cards.)
>> is this issue a bigger deal for Phi than the Nvidia family?
>> is it more critical for using Phi in offload mode?
>
> I think this was answered by Brice in another message. We map all of
> the memory through the PCI, whereas many other people only map a
> smaller buffer, and therefore have to do additional copies.

James, I'm not really following what you mean by 'through the PCI'.

If you access memory over PCI, isn't the bandwidth a factor of 10 or
more worse than when using device RAM?

What matters, of course, is how much RAM you can allocate on the device
for your threads. Anything you ship over PCI is so slow in terms of
bandwidth that you really want to keep it to a minimum.
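
To put rough numbers on that, here is a back-of-envelope sketch in
Python. The two bandwidth figures are assumptions picked for
illustration (roughly what a PCIe 2.0 x16 link and on-card GDDR5 might
sustain), not measurements from any particular card, and the function
name is made up for this example.

```python
def transfer_seconds(num_bytes, bandwidth_gb_per_s):
    """Time to move num_bytes at the given bandwidth (GB/s, 1 GB = 1e9 bytes)."""
    return num_bytes / (bandwidth_gb_per_s * 1e9)

# Assumed, illustrative bandwidths. Substitute figures for your own hardware.
PCIE_GB_PER_S = 8.0      # assumption: about a PCIe 2.0 x16 link, one direction
DEVICE_GB_PER_S = 150.0  # assumption: sustained on-card memory bandwidth

working_set = 4 * 10**9  # 4 GB touched once

t_pci = transfer_seconds(working_set, PCIE_GB_PER_S)
t_dev = transfer_seconds(working_set, DEVICE_GB_PER_S)
slowdown = t_pci / t_dev

print(f"over PCI:   {t_pci:.3f} s")
print(f"device RAM: {t_dev:.3f} s")
print(f"slowdown:   {slowdown:.2f}x")
```

With these assumed figures the ratio comes out well above 10, which is
the order of magnitude I mean.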

If you transfer data from the host (the CPUs) to the GPU, AMD and Nvidia
GPGPU cards can do that without stopping the GPU cores from calculating,
so the transfer happens in the background. For that you of course only
need a limited buffer.
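
The pattern is plain bounded buffering. Below is a small Python sketch
of it: a background thread stands in for the host-to-device copy, and a
queue of fixed depth plays the role of the limited buffer. Nothing here
touches a real GPU or any vendor API; `run_pipeline` and its parameters
are hypothetical names for this illustration only.

```python
import queue
import threading

def run_pipeline(chunks, compute, depth=2):
    """Overlap 'transfer' and 'compute' with at most `depth` staged chunks."""
    staged = queue.Queue(maxsize=depth)  # the limited buffer
    done = object()                      # sentinel marking the end of the data

    def transfer():
        for chunk in chunks:
            # Stand-in for a host-to-device copy. put() blocks while the
            # buffer is full, so the copy never runs far ahead of compute.
            staged.put(list(chunk))
        staged.put(done)

    copier = threading.Thread(target=transfer, daemon=True)
    copier.start()

    results = []
    while True:
        item = staged.get()
        if item is done:
            break
        results.append(compute(item))    # runs while the next copy proceeds
    copier.join()
    return results

host_data = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
totals = run_pipeline(host_data, sum)
print(totals)
```

The buffer depth bounds how much device memory the staging area needs,
no matter how large the data set on the host is.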

A problem some people report with OpenCL is that if they accidentally
allocate more RAM than the GPU has available, the allocation ends up in
host memory, which, as said before, is pretty slow: really more than a
factor of 10.
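
One way to guard against that is to compare the request with the
device's limits before allocating. In OpenCL those limits are reported
by clGetDeviceInfo as CL_DEVICE_GLOBAL_MEM_SIZE and
CL_DEVICE_MAX_MEM_ALLOC_SIZE. The Python sketch below only shows the
check itself: the helper name is hypothetical and the sizes are made-up
example values, not queried from a real device.

```python
GIB = 1024**3
MIB = 1024**2

def fits_on_device(requested_bytes, global_mem_bytes, max_alloc_bytes,
                   already_used_bytes=0):
    """Return True if a single allocation should fit in device RAM."""
    # A single buffer may not exceed the per-allocation limit.
    if requested_bytes > max_alloc_bytes:
        return False
    # The total across all buffers may not exceed device memory.
    return already_used_bytes + requested_bytes <= global_mem_bytes

# Assumed example card: 3 GiB total, per-allocation limit of a quarter of that.
card_total = 3 * GIB
card_max_alloc = card_total // 4

print(fits_on_device(512 * MIB, card_total, card_max_alloc))
print(fits_on_device(1 * GIB, card_total, card_max_alloc))
print(fits_on_device(512 * MIB, card_total, card_max_alloc,
                     already_used_bytes=card_total - 256 * MIB))
```

The first request fits. The second exceeds the per-allocation limit and
the third exceeds what is left on the card, so both are rejected before
anything can end up in host memory.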