[Beowulf] Performance characterising a HPC application
Christian Bell
christian.bell at qlogic.com
Mon Mar 26 10:42:48 PDT 2007
On Mon, 26 Mar 2007, Gilad Shainer wrote:
>
> > Offload, usually implemented by RDMA offload, or the ability
> > for a NIC to autonomously send and/or receive data from/to
> > memory is certainly a nice feature to tout. If one considers
> > RDMA at an interface level (without looking at the
> > registration calls required on some interconnects), it's the
> > purest and most flexible form of interconnect data transfer.
> > Unfortunately, this pure form of data transfer has a few caveats...
>
>
> When Mellanox refers to transport offload, it means full transport
> offload - for all transport semantics. InfiniBand, as you probably
> know, provides RDMA AND Send/Receive semantics, and in both cases
> you can do zero-copy operations.
Zero-copy at the transport level doesn't translate into zero-copy in
the MPI application. It would be disingenuous to lead people into
believing that zero-copy means "no copy at all" through the entire
software stack.
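
As a rough sketch of what I mean (not any particular MPI implementation;
the names eager_slot and deliver_eager are made up for illustration), an
eager-protocol receive often has the NIC deposit the message "zero-copy"
into a pre-registered bounce buffer, and the library then copies the
payload into the user's buffer once the matching receive is found:

    /* Hypothetical eager-path delivery inside an MPI library. */
    #include <string.h>
    #include <stddef.h>

    struct eager_slot {        /* pre-registered bounce buffer the NIC writes into */
        int    tag, src;
        size_t len;
        char   payload[16384];
    };

    struct posted_recv {       /* what the application's MPI_Recv asked for */
        int    tag, src;
        void  *user_buf;
        size_t max_len;
    };

    /* The NIC already delivered the message "zero-copy" into slot->payload;
     * the library still performs a copy into the user's buffer. */
    static void deliver_eager(const struct eager_slot *slot,
                              const struct posted_recv *rreq)
    {
        size_t n = slot->len < rreq->max_len ? slot->len : rreq->max_len;
        memcpy(rreq->user_buf, slot->payload, n);   /* the hidden copy */
    }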
> This full flexibility provides the programmer with the ability to choose
> the best semantics for his use. Some programmers choose
> Send/Receive and some RDMA. It all depends on their application.
Vanilla send/receive and RDMA are arguably not the best semantics for
MPI, since MPI is a receiver-driven model: the receiver's posted tags
and sources, not a remote address supplied by the sender, determine
where a message lands.
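
To make that concrete (a sketch only; rdma_write_example stands in for a
verbs-style RDMA write and is not a real API): an RDMA write requires the
sender to already hold the target's virtual address and key, whereas MPI
lets the receiver decide placement at match time:

    #include <mpi.h>
    #include <stddef.h>
    #include <stdint.h>

    /* Hypothetical stand-in for a verbs-style RDMA write: the SENDER must
     * already know where the data goes on the remote node. */
    void rdma_write_example(const void *local_buf, size_t len,
                            uint64_t remote_addr, uint32_t remote_rkey);

    void receiver_side(void)
    {
        double user_buf[1024];
        MPI_Status status;
        /* With MPI, placement is decided by the RECEIVER when an incoming
         * message matches the (source, tag) of this posted receive. */
        MPI_Recv(user_buf, 1024, MPI_DOUBLE,
                 0 /* source */, 42 /* tag */, MPI_COMM_WORLD, &status);
    }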
Buying the screwdriver set with 288 bits doesn't mean it will include
the 5-point Torx bit you need to solve your problem (that's why my
Seagate hard drive enclosure is still sealed tight!).
. . christian
--
christian.bell at qlogic.com
(QLogic SIG, formerly Pathscale)