Speed of writes to cache and memory.
Robert G. Brown
rgb at phy.duke.edu
Fri Feb 28 03:56:15 PST 2003
On Thu, 27 Feb 2003, Ed Porter wrote:
> Some newbie questions.
> I am interested in trying to get a rough estimate of the speed at which CPUs
> can perform successive write operations.
> How fast can the average processor write to local main memory, both in
> bursts to successive addresses and to random addresses?
The first is one thing addressed by the stream benchmark, but stream 1
at least doesn't really let you explore arbitrary-length vectors, so
you can't see the effect of e.g. crossing cache boundaries with the
length of the array.
cpu_rate (discussed over the last couple of days) has a variable-length
vector stream embedded, and has some other variable-length memory tests
as well, including one that reads and writes memory in a shuffled
order that defeats most cache prefetch and so forth. It exhibits a
dramatic latency difference, to say the least, when vector sizes cross
the cache boundaries.
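To make the variable-length idea concrete, here is a minimal sketch of a STREAM-style "triad" sweep timed for a caller-chosen vector length. It is not cpu_rate's code; the function names triad and triad_ns_per_element, the scale factor 3.0, and the use of clock() are all assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* STREAM-style "triad" kernel: a[i] = b[i] + q * c[i] over n elements. */
void triad(double *a, const double *b, const double *c, double q, size_t n)
{
    for (size_t i = 0; i < n; i++)
        a[i] = b[i] + q * c[i];
}

/* Average nanoseconds per element over `reps` sweeps of length n.
 * Hypothetical helper: clock() reports CPU time at coarse resolution,
 * so n * reps should be in the millions for a meaningful number, and an
 * optimizing compiler may elide repeated identical sweeps. */
double triad_ns_per_element(double *a, const double *b, const double *c,
                            size_t n, int reps)
{
    assert(n > 0 && reps > 0);
    clock_t t0 = clock();
    for (int r = 0; r < reps; r++)
        triad(a, b, c, 3.0, n);          /* q = 3.0 is an arbitrary choice */
    clock_t t1 = clock();
    return 1e9 * (double)(t1 - t0) / CLOCKS_PER_SEC / ((double)n * reps);
}
```

Calling triad_ns_per_element for n running over powers of two, from a few kilobytes of data up to tens of megabytes, is one way to look for the step in per-element time as the three vectors stop fitting in each cache level.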
I published cpu_rate results for various r/w tests including random
access out of main memory a couple or three days ago on the list -- check
> I know that in many systems reading a cache line from main memory takes
> between 1/20 and 1/10 of a microsecond. This is because the CPU has to
> issue the read, find if the desired memory is in its various levels of
> cache, if not, send an address out on the memory bus, wait for the roughly
> 30 ns RAM read delay, and then burst back 8 or so words into cache, at the
> memory-to-CPU bandwidth.
Whether or not you can see this depends very much on system state and
your code -- that's one pretty safe conclusion from the random vs
streaming tests mentioned above. On my Celeron laptop, for example, I
measure about 60 nsec as an average time for totally random access
(average of read and write, defeating the prefetch). The same test for
streaming sequential r/w access produces about 10 ns.
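One common way to get a "totally random" access pattern that the prefetcher cannot anticipate is to link the array into a single shuffled cycle and chase it, so each load depends on the result of the previous one. The sketch below assumes that approach; it is not how cpu_rate is known to do it, and build_random_cycle and chase are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Fill next[] so that following next[i] from index 0 visits every slot
 * exactly once before returning to 0. Sattolo's algorithm produces a
 * single cycle; rand() is used for brevity, so the cycle is not uniformly
 * random, which does not matter for this illustration. */
void build_random_cycle(size_t *next, size_t n, unsigned seed)
{
    assert(next != NULL);
    srand(seed);
    for (size_t i = 0; i < n; i++)
        next[i] = i;
    if (n < 2)
        return;
    for (size_t i = n - 1; i > 0; i--) {
        size_t j = (size_t)rand() % i;   /* 0 <= j < i, never j == i */
        size_t tmp = next[i];
        next[i] = next[j];
        next[j] = tmp;
    }
}

/* Follow the chain for `steps` hops and return the final index.
 * Returning the index keeps the compiler from discarding the loop. */
size_t chase(const size_t *next, size_t steps)
{
    size_t i = 0;
    while (steps--)
        i = next[i];
    return i;
}
```

Timing chase() over a few million steps and dividing by the step count gives a per-access figure for the shuffled case; a plain loop over the same array in index order gives the streaming figure to compare it with.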
> If a CPU tries to write to local memory presumably it would be much quicker,
> because the CPU does not have to wait for any response. That being the
> case, how fast can a CPU write cache lines to different locations in main
> Another question: how do most CPUs interact with cache on write operations?
> Presumably a write to cache takes about the same number of cycles as a read
> from the same level cache. Is this true?
> My understanding is that in many CPUs one can select either (1) to have all
> writes to cache also written through to main memory, or (2) to have
> writes made to main memory only when a cache line is replaced in cache and
> thus swapped out to main memory. Is this true? If so, what are the
> typical delays associated with each type of write?
> Is it possible to select which of these two types of writes to use on a
> per-instruction basis?
> If one is using method (1) can one cause the main memory image of a cache
> line to be updated under program control?
> Also, are there any systems where one can indicate to the CPU that certain
> information is to be kept in cache and other information should not, or is
> all caching
> controlled only by which of various cache lines have been most recently
> I would appreciate any enlightenment on these subjects.
I'm not really an expert at this level, so I'll leave these questions to
people on the list who probably are. My >recollection< is that most
memory management is >not< under program control unless you are willing
to go to the assembler level and manipulate prefetches. Otherwise you
take what the compiler plus the CPU itself (which has its own prediction
algorithms) give you.
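As a side note on prefetch control: GCC and Clang expose a prefetch hint as a builtin, __builtin_prefetch(addr, rw, locality), so some of this is reachable from C without hand-written assembler. The sketch below assumes one of those compilers; sum_with_prefetch and the look-ahead distance PREFETCH_AHEAD are hypothetical and untuned. It is only a hint, and on a simple sequential sweep like this the hardware prefetcher may already do the job.

```c
#include <stddef.h>

#define PREFETCH_AHEAD 16   /* elements to look ahead; an assumed value */

/* Sum v[0..n-1], asking the cache to start fetching a later element
 * while the current one is being added. */
long sum_with_prefetch(const long *v, size_t n)
{
    long total = 0;
    for (size_t i = 0; i < n; i++) {
#if defined(__GNUC__) || defined(__clang__)
        /* rw = 0: prefetch for reading; locality = 3: keep in all cache
         * levels. The bounds check keeps the address inside the array. */
        if (i + PREFETCH_AHEAD < n)
            __builtin_prefetch(&v[i + PREFETCH_AHEAD], 0, 3);
#endif
        total += v[i];
    }
    return total;
}
```

On other compilers the hint is simply compiled out and the function reduces to a plain sum, so the result is the same either way; only the timing can differ.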
However, I probably need enlightenment here as well.
> Thank you.
> Ed Porter
Robert G. Brown http://www.phy.duke.edu/~rgb/
Duke University Dept. of Physics, Box 90305
Durham, N.C. 27708-0305
Phone: 1-919-660-2567 Fax: 919-660-2525 email:rgb at phy.duke.edu