[Beowulf] Thoughts on IB EDR and Intel OmniPath
landman at scalableinformatics.com
Fri Apr 29 15:45:44 PDT 2016
On 04/29/2016 06:18 PM, Hutcheson, Mike wrote:
> Hi. We are looking at EDR and OmniPath for a new cluster we are looking to purchase this summer. I am interested in what the Beowulf community has to say regarding the pros and cons for each of the technologies. I am certainly not trying to start a flame war here, just looking for unbiased observations based on knowledge and experience.
I can talk more from the vendor side of this (and we integrate both
vendors' gear into our systems, so take this with whatever amount of
NaCl you think it needs). Both are nice for a fairly widely overlapping
range of needs.
We've been using the EDR and 100GbE side for a while now for our storage
and hyperconverged systems. I like that we can drive full bandwidth
over the IB fabric without crushing the CPU. With apologies to
everyone, a link to a post I did about a month ago:
We plan to try a similar test with OPA at some point, and preliminary
tests by others suggest that similar performance is achievable.
Basically I think the choice comes down to the specifics of the codes
you'll be running, and how you will be implementing the storage side.
Both systems are nice, but you may find that the polled mechanism of OPA
matches your code base differently than the offload mechanism of EDR.
I would recommend staying away from 100GbE if you have the option to do
EDR/OPA. While we really enjoy 100GbE (and have shown some pretty
terrific performance with it), I don't think the technology on
congestion mediation is quite up to snuff handling these data rates, as
compared to EDR/OPA. Even with RoCEv2, some of the testing we did
demonstrated very significant congestion-related slowdowns that we
couldn't easily tune for (with PFC and other bits that RoCE needs).
I've used iWARP in the dim and distant past, and it was much better than
plain old gigabit on the same systems (with Ammasso cards). But I'd
recommend going with EDR/OPA if you have the choice. You can always
have your storage or other nodes handle ethernet gatewaying if needed.
Joseph Landman, Ph.D
Founder and CEO
Scalable Informatics, Inc.
e: landman at scalableinformatics.com
p: +1 734 786 8423 x121
c: +1 734 612 4615