[Beowulf] Intro question

Patrick Geoffray patrick at myri.com
Mon Dec 8 07:21:03 PST 2008

Bogdan Costescu wrote:
> about on this list: interconnect hardware being able to DMA directly 
> to/from CPU cache. I don't know how useful such a feature is for a 

You can do something similar today using Direct Cache Access (DCA) on 
(recent) Intel chips with IOAT. It is an indirect cache access: you tag 
a DMA so that its data is automatically prefetched into the L3 of a 
specific socket.

It does nothing for latency, since polling will fetch the cache line 
just as fast, but it works well if there is a delay between the data 
being delivered and the data being used. The best example is 
communication overlapped with computation: the cache prefetch is 
overlapped as well, so the consumer no longer pays the memory latency.
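
To make the polling versus overlap distinction concrete, here is a toy 
timing model. It is only a sketch: the function name, the 100 ns memory 
latency and the 5000 ns compute delay are made-up values for 
illustration, and it assumes the prefetch starts the moment the data is 
delivered and takes one memory latency to complete.

```python
def consumer_stall_ns(mem_latency_ns, delay_ns, prefetched):
    """Time the consumer waits on its first access to newly delivered data.

    delay_ns is the gap between delivery and first use (hypothetical model).
    """
    if not prefetched:
        # Plain DMA to memory: the first access always pays a full miss.
        return mem_latency_ns
    # Tagged DMA: the prefetch has already had delay_ns to complete.
    return max(0, mem_latency_ns - delay_ns)


MEM_LATENCY_NS = 100  # assumed value, not a measurement

for label, delay in [("polling", 0), ("overlapped with compute", 5000)]:
    plain = consumer_stall_ns(MEM_LATENCY_NS, delay, prefetched=False)
    tagged = consumer_stall_ns(MEM_LATENCY_NS, delay, prefetched=True)
    print(f"{label}: plain DMA {plain} ns, tagged DMA {tagged} ns")
```

With no delay (polling) both cases stall for the full memory latency in 
this model; once the delay exceeds the memory latency, the stall for 
the tagged DMA drops to zero.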

