[Beowulf] MPI over RoCE?
Prentice Bisbal
prentice at ucar.edu
Thu Feb 27 17:28:11 UTC 2025
On 2/27/25 3:19 AM, Brice Goglin wrote:
> Hello
>
> While meeting vendors to buy our next cluster, we got different
> recommendations about the network for MPI. The cluster will likely be
> about 100 nodes. Some vendors claim RoCE is enough to get <2us latency
> and good bandwidth for such low numbers of nodes. Some others say RoCE
> is far behind IB for both latency and bandwidth and we likely need to
> get IB if we care about network performance.
>
> If anybody tried MPI over RoCE over such a "small" cluster, what NICs
> and switches did you use?
>
> Also, is the configuration straightforward from both the admin's
> (installation) and the users' (MPI options) points of view?
>
>
I hope this isn't a dumb question: do the Ethernet switches you're
looking at use a crossbar switching fabric internally? I believe a
crossbar fabric is a requirement for IB, but it's only found in "higher
performance" Ethernet switches. IB isn't just about latency: the
crossbar fabric is what provides high bisection bandwidth, non-blocking
communication, etc.
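For what it's worth, on the user-facing side, MPI over RoCE with Open
MPI's UCX transport usually comes down to a couple of options. A rough
sketch; the device name (mlx5_0), port, and GID index here are
assumptions for illustration, not your actual hardware:

```shell
# Sketch: Open MPI over UCX on a RoCE (v2) fabric.
# mlx5_0:1 and GID index 3 are placeholder values -- check yours
# with `ibv_devinfo` and the `show_gids` script on your nodes.
mpirun -np 2 --map-by node \
    --mca pml ucx \
    -x UCX_NET_DEVICES=mlx5_0:1 \
    -x UCX_IB_GID_INDEX=3 \
    ./osu_latency
```

Running something like the OSU latency benchmark this way is also a
quick sanity check of whether the fabric is actually delivering the
<2us the vendors are claiming.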
--
Prentice