[Beowulf] MPI over RoCE?

Brice Goglin brice.goglin at gmail.com
Fri Feb 28 09:01:29 UTC 2025


On 27/02/2025 at 18:28, Prentice Bisbal wrote:
> On 2/27/25 3:19 AM, Brice Goglin wrote:
>> Hello
>>
>> While meeting vendors to buy our next cluster, we got different 
>> recommendations about the network for MPI. The cluster will likely be 
>> about 100 nodes. Some vendors claim RoCE is enough to get <2 us 
>> latency and good bandwidth for such a small node count. Others say 
>> RoCE is far behind IB on both latency and bandwidth, and that we 
>> likely need to get IB if we care about network performance.
>>
>> If anybody tried MPI over RoCE over such a "small" cluster, what NICs 
>> and switches did you use?
>>
>> Also, is the configuration easy from the admin's (installation) and 
>> users' (MPI options) points of view?
>>
>>
> I hope this isn't a dumb question: Do the Ethernet switches you're 
> looking at have crossbar switches inside them? I believe crossbar 
> switches are a requirement for IB, but are only found in "higher 
> performance" Ethernet switches. IB isn't just about latency. The 
> crossbar switches allow for high bisection bandwidth, non-blocking 
> communication, etc.
>
>

I don't know but that's a good question, I will ask vendors.
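
For what it's worth, on the user side the configuration often comes down to pointing the MPI transport layer at the right device. A minimal sketch, assuming Open MPI built with UCX and a Mellanox-style NIC; the device name, port, and GID index below are placeholders to verify on the actual hardware, not values from this thread:

```shell
# Select the RoCE-capable device and port for UCX.
# mlx5_0:1 is a placeholder; list devices with `ibv_devices`.
export UCX_NET_DEVICES=mlx5_0:1

# RoCE v2 is selected via the GID index; index 3 is commonly the
# RoCEv2/IPv4 entry, but check with `show_gids` on each node.
export UCX_IB_GID_INDEX=3

# Force the UCX point-to-point layer and launch as usual.
mpirun --mca pml ucx -np 100 ./my_mpi_app
```

On the admin side, the main extra work compared to plain Ethernet is typically enabling lossless behavior on the switches (PFC and/or ECN) so RDMA traffic is not dropped under congestion.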

Brice
