[Beowulf] MPI over RoCE?
Brice Goglin
brice.goglin at gmail.com
Fri Feb 28 09:01:29 UTC 2025
On 27/02/2025 at 18:28, Prentice Bisbal wrote:
> On 2/27/25 3:19 AM, Brice Goglin wrote:
>> Hello
>>
>> While meeting vendors to buy our next cluster, we got different
>> recommendations about the network for MPI. The cluster will likely be
>> about 100 nodes. Some vendors claim RoCE is enough to get <2us
>> latency and good bandwidth for such a low node count. Others say
>> RoCE is far behind IB for both latency and bandwidth, and that we
>> likely need IB if we care about network performance.
>>
>> If anybody tried MPI over RoCE over such a "small" cluster, what NICs
>> and switches did you use?
>>
>> Also, is the configuration easy from the admin's (installation) and
>> the users' (MPI options) points of view?
>>
>>
> I hope this isn't a dumb question: Do the Ethernet switches you're
> looking at have crossbar switch fabrics inside them? I believe a
> crossbar fabric is a requirement for IB, but is found only in
> "higher-performance" Ethernet switches. IB isn't just about latency:
> the crossbar allows for high bisection bandwidth, non-blocking
> communication, etc.
>
>
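To put the bisection-bandwidth point in rough numbers, here is a back-of-envelope sketch for a cluster of this size. All figures and function names are hypothetical, not from the thread; they just show what "non-blocking" means at the bisection:

```python
# Back-of-envelope bisection-bandwidth check (illustrative numbers only).

def bisection_demand_gbps(nodes: int, link_gbps: float) -> float:
    """Worst-case traffic across the bisection: every node in one half
    sends to a partner in the other half at full link rate."""
    return (nodes // 2) * link_gbps

def blocking_factor(bisection_capacity_gbps: float, demand_gbps: float) -> float:
    """>= 1.0 means the fabric is non-blocking; < 1.0 means oversubscribed."""
    return bisection_capacity_gbps / demand_gbps

# Hypothetical 100-node cluster with 100 Gb/s NICs:
demand = bisection_demand_gbps(100, 100.0)
print(demand)                          # 5000.0 Gb/s must cross the bisection
print(blocking_factor(5000.0, demand)) # 1.0  -> full bisection, non-blocking
print(blocking_factor(2500.0, demand)) # 0.5  -> 2:1 oversubscribed fabric
```

At 100 nodes a single large switch, or a small two-level fat tree, can usually provide full bisection; the question for a vendor is what the oversubscription ratio of the proposed topology actually is.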
I don't know, but that's a good question; I'll ask the vendors.

Brice
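On the users' "MPI options" side of the original question: with Open MPI built against UCX, running over RoCE typically needs little more than selecting the UCX PML and pointing UCX at the RoCE device. The sketch below is a hypothetical invocation, not a tested configuration; the device name mlx5_0:1, the transport list, and the application name are placeholders to adapt to your hardware (check `ucx_info -d` on a node):

```shell
# Hypothetical sketch: Open MPI over RoCE via UCX.
# mlx5_0:1 is a placeholder device:port; UCX_TLS lists the transports
# to allow (rc = RDMA reliable connection, sm = shared memory).
mpirun --mca pml ucx \
       -x UCX_NET_DEVICES=mlx5_0:1 \
       -x UCX_TLS=rc,self,sm \
       -np 256 ./my_mpi_app
```

The admin-side work is mostly on the fabric itself: RoCE generally wants a lossless Ethernet configuration (PFC and/or ECN) on the switches, which is where much of the installation effort tends to go.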
More information about the Beowulf mailing list