[Beowulf] MPI over RoCE?

Brice Goglin brice.goglin at gmail.com
Thu Feb 27 08:19:41 UTC 2025


Hello

While meeting with vendors about our next cluster, we received conflicting
recommendations about the network for MPI. The cluster will likely be
about 100 nodes. Some vendors claim RoCE is enough to get <2us latency
and good bandwidth for such a small number of nodes. Others say RoCE
is far behind IB for both latency and bandwidth, and that we likely need
IB if we care about network performance.

If anybody has tried MPI over RoCE on such a "small" cluster, what NICs
and switches did you use?

Also, is the configuration easy from both the admin (installation) and
user (MPI options) points of view?

Thanks

Brice




