[Beowulf] Q: IB message rate & large core counts (per node)?
hahn at mcmaster.ca
Tue Feb 23 13:57:23 PST 2010
> Coalescing produces a meaningless answer from the message rate
> benchmark. Real apps don't get much of a benefit from message
> coalescing, but (if they send smallish messages) they get a big
> benefit from a good non-coalesced message rate.
in the interests of less personal/posturing/pissing, let me ask:
where does the win from coalescing come from? I would have thought
that coalescing is mainly a way to reduce interrupts, a technique
that's familiar from ethernet interrupt mitigation, NAPI, even
basic disk scheduling.
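to make the interrupt-mitigation intuition concrete, here's a toy model: a fixed per-interrupt cost amortized over k coalesced messages. all the microsecond costs are made-up illustrative numbers, not measured NIC parameters.

```python
# Toy model of why coalescing raises message rate: the fixed
# per-interrupt cost is amortized over k messages per interrupt.
# Costs below are hypothetical, purely for illustration.

def msgs_per_sec(k, t_interrupt_us=5.0, t_per_msg_us=0.5):
    """Messages/second when one interrupt covers k coalesced messages."""
    batch_time_us = t_interrupt_us + k * t_per_msg_us
    return k / batch_time_us * 1e6

rate_1 = msgs_per_sec(1)    # one interrupt per message
rate_16 = msgs_per_sec(16)  # 16 messages amortize one interrupt
```

with these (invented) costs, coalescing 16-deep raises the rate by roughly 7x — which is exactly why a coalescing benchmark number says little about an app that must send each small message immediately.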
to me it looks like the key factor would be "propagation of desire" -
when the app sends a message and will do nothing until the reply,
it probably doesn't make sense to coalesce that message. otoh it would be
interesting if user-level code could express non-urgency as well. my guess
is the other big thing is LogP-like parameters (gap -> piggybacking).
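the "propagation of desire" idea above might look like this in a send path: coalesce only messages the caller has marked non-urgent, and flush immediately when the caller is about to block on a reply. the interface is invented for illustration — no real MPI stack exposes exactly this.

```python
# Sketch of urgency-aware coalescing (hypothetical interface):
# urgent sends flush right away, non-urgent ones batch up,
# LogP-gap / piggybacking style.

class CoalescingSender:
    def __init__(self, max_batch=8):
        self.pending = []
        self.max_batch = max_batch
        self.wire = []  # stands in for the NIC: list of batches sent

    def send(self, msg, urgent=True):
        self.pending.append(msg)
        # caller will spin on the reply -> don't hold the message back
        if urgent or len(self.pending) >= self.max_batch:
            self.flush()

    def flush(self):
        if self.pending:
            self.wire.append(list(self.pending))
            self.pending.clear()

s = CoalescingSender(max_batch=4)
for i in range(4):
    s.send(i, urgent=False)  # coalesced into one batch of 4
s.send(99, urgent=True)      # goes out alone, immediately
```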
assuming MPI is the application-level interface, are there interesting
issues related to knowing where to deliver messages? I don't have a
good understanding of where things stand WRT things like QP usage
(still N*N? is N node count or process count?) or unexpected messages.
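a back-of-envelope on the N*N question: with fully connected reliable-connection QPs per *process*, each of the P processes in a job holds a QP to every other process, so per-node QP count scales with core count times remote processes. sharing connections per node pair (roughly what XRC-style schemes do) collapses that. the numbers below are illustrative, not a claim about any particular HCA.

```python
# QP-count arithmetic for the two readings of "N" (nodes vs processes).

def qps_per_node(nodes, cores_per_node, per_process=True):
    procs = nodes * cores_per_node
    if per_process:
        # each local process connects to every remote process
        return cores_per_node * (procs - cores_per_node)
    # one shared connection per remote node instead (XRC-like)
    return nodes - 1

rc = qps_per_node(nodes=100, cores_per_node=12)        # 12 * 1188 = 14256
shared = qps_per_node(100, 12, per_process=False)      # 99
```

the gap between those two numbers is why per-process fully connected RC QPs get painful as core counts per node grow.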
now that I'm inventorying ignorance, I don't really understand why
RDMA always seems to be presented as a big hardware issue. wouldn't
it be pretty easy to define an ethernet- or IP-level protocol to do remote puts,
gets, even test-and-set or reduce primitives, where the interrupt handler
could twiddle registered blobs of user memory on the target side?
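as a thought experiment matching that paragraph, a toy version of the handler side: packets carry an opcode plus a region/offset, and the "interrupt handler" applies puts or test-and-set directly to memory the target registered in advance. this is a sketch of the idea, not a description of any real RDMA-over-IP protocol.

```python
# Toy remote-put / test-and-set target. Wire format (invented):
#   1 byte opcode, 4 bytes region id, 4 bytes offset, then payload.
import struct

class Target:
    def __init__(self):
        self.regions = {}  # region_id -> bytearray ("registered" memory)

    def register(self, region_id, size):
        self.regions[region_id] = bytearray(size)

    def on_packet(self, pkt):
        """What the hypothetical interrupt handler would run per packet."""
        op, region, offset = struct.unpack_from("!BII", pkt)
        payload = pkt[9:]
        mem = self.regions[region]
        if op == 0:  # remote put: copy payload into registered memory
            mem[offset:offset + len(payload)] = payload
        elif op == 1:  # test-and-set on one byte; return the old value
            old = mem[offset]
            mem[offset] = 1
            return old

t = Target()
t.register(7, 64)
t.on_packet(struct.pack("!BII", 0, 7, 8) + b"hello")  # remote put
old = t.on_packet(struct.pack("!BII", 1, 7, 0))       # test-and-set
```

the hard parts a real protocol would face — protection, ordering, and not taking an interrupt per packet in the first place — are exactly what's elided here.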
regards, mark hahn.