LAM SMP performance

Josip Loncaric josip at
Fri Dec 8 15:31:47 PST 2000

Patrick Geoffray wrote:
> On another hand, the message can be asynchronous and the
> cache can be trashed on the receiving side before the user application
> uses the payload.

I forgot to say that this is not a "data push" situation.  It is the
receiver's act of picking up the payload that activates the cache-to-cache
transfer, because (thanks to cache snooping) the sender's CPU detects
that the receiver's CPU is trying to access modified data in the
sender's cache.  The sender's CPU signals this to the receiver (via the
HITM# signal line) and performs an implicit write-back of the modified
data.  Intel's
PII manual states that "The implicit write-back is transferred directly
to the initial requesting processor and snooped by the memory controller
to assure that system memory has been updated."  This single step gives
the receiver's CPU the sender's data, while the memory controller
updates RAM.  This situation is very likely when the receiver acts
promptly.

However, if the receiver is busy doing something else for a while and
then decides to act, it could find that the sender's copy has long gone
from cache to RAM.  Then, the receiver would have to reload the data
from RAM.  Since the receiver acted so slowly, this outcome seems fair
to me.
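
To make the two outcomes concrete, here is a toy model of a single cache
line shared by two CPUs.  It is only a sketch: the class and method names
(ToySystem, write, evict, read) are made up for illustration, only the
Modified and Shared states are modeled, and nothing here reflects timing
or the real bus protocol.

```python
# Toy model only: one cache line, two CPUs, one RAM location.
# All names are hypothetical; states "M" (Modified) and "S" (Shared)
# are borrowed from MESI, the other states are not modeled.

class ToySystem:
    def __init__(self):
        self.ram = 0  # value of the line in main memory
        # Per-CPU copy of the line: (state, value), or None if not cached.
        self.cache = {"sender": None, "receiver": None}

    def write(self, cpu, value):
        """CPU writes the line: its copy becomes Modified, others are invalidated."""
        for other in self.cache:
            if other != cpu:
                self.cache[other] = None
        self.cache[cpu] = ("M", value)

    def evict(self, cpu):
        """CPU drops the line; a Modified copy is written back to RAM first."""
        entry = self.cache[cpu]
        if entry is not None and entry[0] == "M":
            self.ram = entry[1]
        self.cache[cpu] = None

    def read(self, cpu):
        """CPU reads the line; returns (value, where the data came from)."""
        entry = self.cache[cpu]
        if entry is not None:
            return entry[1], "own cache"
        for other, held in self.cache.items():
            if other != cpu and held is not None and held[0] == "M":
                # Snoop hit on a Modified line (the HITM# case): the owner
                # supplies the data and RAM is updated in the same step.
                value = held[1]
                self.ram = value
                self.cache[other] = ("S", value)
                self.cache[cpu] = ("S", value)
                return value, "cache-to-cache"
        # No Modified copy anywhere else, so the data comes from RAM.
        self.cache[cpu] = ("S", self.ram)
        return self.ram, "RAM"


# Receiver acts while the sender's copy is still cached.
fast = ToySystem()
fast.write("sender", 42)
print(fast.read("receiver"))   # (42, 'cache-to-cache')

# Receiver acts only after the sender's copy has gone back to RAM.
slow = ToySystem()
slow.write("sender", 42)
slow.evict("sender")
print(slow.read("receiver"))   # (42, 'RAM')
```

In both cases the receiver gets the same data; the only difference is
whether it comes from the sender's cache or from RAM.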

BTW, since spinlocks are so fast, the probability of finding the data
still in the sender's cache is greater.


Dr. Josip Loncaric, Senior Staff Scientist        mailto:josip at
ICASE, Mail Stop 132C           PGP key at
NASA Langley Research Center             mailto:j.loncaric at
Hampton, VA 23681-2199, USA    Tel. +1 757 864-2192  Fax +1 757 864-6134
