[Beowulf] Lowered latency with multi-rail IB?

Fri Mar 27 11:27:15 PDT 2009

Håkon Bugge

On Mar 27, 2009, at 19:09 , Joshua mora acosta wrote:

> So to quantify whether multirail helps latency-driven workloads, one
> could build a synthetic benchmark that shows the impact of balancing
> these requests among multiple HCAs bound to different network paths or
> core pairs, such as an all-to-[all,gather,scatter] or barrier
> benchmark, and in theory observe half the total latency of that
> overall communication.
> So I think multirail will reduce the latency of synchronizations
> (collective calls) but not of latency-driven point-to-point
> communications.
>

Well, you're wrong. As stated, we do see speedup due to increased
message rate on apps not using collectives. As for a benchmark, an
all-to-all with _many_ processes per node will show you this
($MPI_HOME/examples/bin/mpi_msg_rate in our distro).

I would actually claim the opposite; it will _not_ help on most
collective operations, because they perform SMP optimizations in which
one process sends and receives on behalf of the other processes on the
node. Uncorrelated messages from many processes on a single node will,
on the other hand, take advantage of the higher aggregate message rate
provided by multiple HCAs.
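
To make the contrast concrete, here is a minimal sketch of per-peer
rail selection and the resulting load on each HCA. It is a toy model,
not Platform MPI's actual implementation: the names choose_rail and
rail_load, the modulo mapping, and the 2-node, 8-processes-per-node
layout are all assumptions made for illustration.

```python
from collections import Counter

PROCS_PER_NODE = 8   # assumed layout, for illustration only
NUM_RAILS = 2        # two HCAs (or ports) per node


def choose_rail(peer_rank, num_rails):
    """Hypothetical mapping: one fixed rail per peer, so a process that
    talks to several peers ends up alternating between rails."""
    return peer_rank % num_rails


def rail_load(senders, peers_of, num_rails):
    """Count messages per rail when each sender sends one message to
    each of its peers."""
    load = Counter({rail: 0 for rail in range(num_rails)})
    for sender in senders:
        for peer in peers_of(sender):
            load[choose_rail(peer, num_rails)] += 1
    return dict(load)


node0 = range(0, PROCS_PER_NODE)                    # ranks 0..7
node1 = range(PROCS_PER_NODE, 2 * PROCS_PER_NODE)   # ranks 8..15

# Uncorrelated traffic: every process on node 0 sends to every process
# on node 1, so both rails carry messages.
uncorrelated = rail_load(node0, lambda rank: node1, NUM_RAILS)

# SMP-optimized collective (modeled very roughly): only the node leader
# sends, and only to the remote leader, so a single rail is used.
leader_only = rail_load([0], lambda rank: [node1[0]], NUM_RAILS)

print(uncorrelated)
print(leader_only)
```

In this model the 64 uncorrelated messages split evenly over the two
rails, while the leader-only pattern puts its single message on one
rail and leaves the other idle.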

Håkon

> Joshua
>
> ------ Original Message ------
> Received: 12:37 PM CDT, 03/27/2009
> From: Håkon Bugge <hbugge at platform.com>
> To: Craig Tierney <Craig.Tierney at noaa.gov>
> Cc: Joshua mora acosta <joshua_mora at usa.net>, DPHURST at uncg.edu,
>     beowulf at beowulf.org
> Subject: Re: [Beowulf] Lowered latency with multi-rail IB?
>
>> On Mar 27, 2009, at 18:20 , Craig Tierney wrote:
>>
>>> What about using multi-rail to increase message rate?  That isn't
>>> the same as latency, but if you put messages on both wires you
>>> should get more.
>>
>> Exactly why we saw almost 2x speedup on message rate (latency)
>> sensitive apps using Platform MPI. We call the technique alternating;
>> any given pair of communicating peers will use a single HCA or port,
>> but a single MPI process will alternate between different HCAs (or
>> ports), depending on which peer it communicates with.
>>
>>
>> Håkon
>
>
>