[Beowulf] getting a phi, dammit.

James Cownie jcownie at cantab.net
Wed Mar 6 12:42:20 PST 2013

On 6 Mar 2013, at 06:00, Mark Hahn wrote:

>> The issue here is that because we offer 8GB of memory on the cards, some
>> BIOSes are unable to map all of it through the PCI either due to bugs or
>> failure to support so much memory. This is not the only people suffering
> interesting.  but it seems like there are quite a few cards out there
> with 4-6GB (admittedly, mostly higher-end workstation/gp-gpu cards.)
> is this issue a bigger deal for Phi than the Nvidia family? 
> is it more critical for using Phi in offload mode?

I think this was answered by Brice in another message. We map all of the memory
through the PCI, whereas many other people only map a smaller buffer, and therefore 
have to do additional copies.
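
To illustrate what "mapping all of the memory through the PCI" asks of the BIOS, here is a minimal sketch, assuming a Linux host. It reads the text of a device's sysfs `resource` file (`/sys/bus/pci/devices/<address>/resource`, one "start end flags" line per region) and reports the size of each memory region. The sample text is made up to show an 8 GiB region and is not taken from a real card; the function name `parse_bars` is purely illustrative.

```python
# Sketch only: compute PCI memory region sizes from sysfs "resource" text.
IORESOURCE_MEM = 0x00000200  # memory-region flag bit, as in the Linux kernel's ioport.h


def parse_bars(resource_text):
    """Return (line_index, size_in_bytes) for each memory region listed."""
    bars = []
    for index, line in enumerate(resource_text.splitlines()):
        fields = line.split()
        if len(fields) != 3:
            continue  # not a "start end flags" line
        start, end, flags = (int(field, 16) for field in fields)
        if end == 0 or not flags & IORESOURCE_MEM:
            continue  # unused slot, or an I/O port region
        bars.append((index, end - start + 1))
    return bars


# Invented sample: an 8 GiB region, an unused slot, and a 128 KiB region.
SAMPLE = """\
0x0000003800000000 0x00000039ffffffff 0x000000000014220c
0x0000000000000000 0x0000000000000000 0x0000000000000000
0x00000000fbd00000 0x00000000fbd1ffff 0x0000000000140204
"""

for index, size in parse_bars(SAMPLE):
    print(f"region {index}: {size} bytes")
```

A BIOS that cannot place a region of that size in the physical address map is the kind of failure described above; a card that exposes only a small buffer avoids the problem at the cost of extra copies.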

> it would be interesting to know how Intel thinks about the issue of 
> card->host/host->card memory accesses.  
I am not authorised to speak for Intel, so bear in mind that this is my 
opinion, rather than "how Intel thinks".

> my understanding of Phi is that 
> there's a DMA engine that can perform copies across PCIe.  and your 
> comments imply that Phi ram can be mapped directly into the host 
> virtual space (including user-level?).  

I believe that's true (as it is for any PCI device). If you want to be completely
sure, the sources for the host device drivers are all in the source package that
you can download (go to http://software.intel.com/mic-developer, then "Tools and
Downloads"), so you can find out precisely what is going on.

> can code on the Phi also map host memory into its space?
I think the aperture size issue kills this as a way of expanding the memory
on the card.

>> (Indeed other threads here have been complaining that 8GB is too little memory).
> well, it's not much per-core, especially if, as you suggest elsewhere,
> it's important to try to use HT (ie, 8G/120 is only 67M/core...)
Agreed. More memory would be great, but... there are space and power issues
that make that hard at present. (As can be seen by the amount of memory
on other PCI cards).

> I suppose the picture changes if the card can make direct references
> to host memory, though.
Even if it can, I wouldn't recommend it. Memory bandwidth is one of the
critical issues, and doing cache-line sized fetches over the PCI is guaranteed
to be horribly slow. If the PCI is already the limit when you're aggregating the
transfers into large chunks (by expressing them as offload operations), you can
expect the performance to be much worse when you try this random access approach.
(This is similar to one of the reasons that everyone uses MPI, rather than UPC...)
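
A rough way to see why small fetches hurt is a simple latency-plus-bandwidth model: each transfer pays a fixed round-trip cost and then streams at the link's peak rate. The sketch below assumes a 1 microsecond latency and a 6 GB/s peak; both numbers are illustrative guesses, not measurements of any card, and the model ignores any overlap of outstanding requests, so it is pessimistic for hardware that can pipeline them.

```python
def effective_bandwidth(chunk_bytes, latency_s, peak_bytes_per_s):
    """Bytes/s achieved when every chunk costs a fixed latency plus its streaming time."""
    return chunk_bytes / (latency_s + chunk_bytes / peak_bytes_per_s)


LATENCY_S = 1e-6  # assumed per-transfer latency (illustrative, not measured)
PEAK = 6e9        # assumed peak link bandwidth in bytes/s (illustrative)

# Compare a cache-line sized fetch with progressively larger aggregated transfers.
for chunk in (64, 4096, 1 << 20, 64 << 20):
    bw = effective_bandwidth(chunk, LATENCY_S, PEAK)
    print(f"{chunk:>10d} B per transfer: {bw / 1e9:6.3f} GB/s")
```

Under these assumptions a 64-byte fetch reaches only about one percent of the peak rate, while multi-megabyte transfers get close to it, which is the argument for aggregating transfers into large offload operations.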

-- Jim
James Cownie <jcownie at cantab.net>
