[Beowulf] interconnect and compiler ?

Patrick Geoffray patrick at myri.com
Wed Feb 11 17:30:58 PST 2009

Vincent Diepeveen wrote:
> All such sorts of switch latencies are at least factor 50-100 worse than 
> their one-way pingpong latency.

I think you are a bit confused about switch latencies.

There is the crossbar latency, which is the time it takes for a packet to 
be decoded and routed to the right output port. It is essentially the 
difference between the pingpong latency with and without the crossbar in 
the middle for the smallest packet size. Typical crossbar latencies are 
on the order of 100ns for recent Ethernot, 200ns for Ethernet. To build 
a bigger fabric, you need to connect multiple crossbars into Clos, 
Fat-tree or Torus topologies. The end-to-end switch latency then 
depends on the number of crossbars the packet crosses.

There is the PHY/transceiver latency. That only applies to the edge of 
the switch, where a physical cable plugs into a socket. SFP+ for 
example requires serialization compared to QSFP. With fiber, the 
transceiver has some overhead. Typical overhead is 250ns per port for 
serial fiber PHY, almost nothing for parallel copper.
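
To put rough numbers on the two contributions above, here is a minimal 
sketch that adds them up. The function name, its parameters and the 
assumption of exactly two edge ports (one in, one out) are illustrative 
only; the 100ns and 250ns defaults are the typical figures quoted above, 
and cable propagation and contention are ignored.

```python
CROSSBAR_NS = 100   # typical figure quoted above for a recent non-Ethernet crossbar
FIBER_PHY_NS = 250  # typical figure quoted above, per serial fiber port


def switch_latency_ns(crossbars, crossbar_ns=CROSSBAR_NS,
                      phy_ns=FIBER_PHY_NS, edge_ports=2):
    """Uncontended switch latency: one crossbar delay per hop, plus the
    PHY overhead at each edge port (assumed: one on entry, one on exit)."""
    return crossbars * crossbar_ns + edge_ports * phy_ns


# Single crossbar with parallel copper (PHY overhead taken as zero).
print(switch_latency_ns(1, phy_ns=0))   # 100
# Three crossbars (e.g. up and down a small Clos) with serial fiber edges.
print(switch_latency_ns(3))             # 800
```

Even the fiber case stays under a microsecond in this model.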

Another overhead is head-of-line (HOL) blocking. It happens when a 
packet has to wait for another one to pass in order to be switched. This 
is equivalent to 2 cars turning onto the same road: one will have to wait 
for the other to make the turn. This latency can be high, especially if 
the packets are large (imagine a couple of trains instead of cars).
Is that what you call "ugly switch latency" ? HOL blocking will reduce 
your switch efficiency to ~40% with random traffic. That means your 
latency will be about two times higher on average, assuming all packets 
have the same size. Where is the factor 50-100 ?
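
The mechanism is easy to see in a toy model. The sketch below is only an 
illustration under simplifying assumptions that are not from any real 
switch: equal-size packets, one packet per input and per output per 
cycle, and lower-numbered inputs winning ties. The function names are 
made up. It compares FIFO inputs, where only the head packet may be 
switched, with inputs that may send any queued packet whose output is 
free.

```python
from collections import deque


def drain_fifo(queues):
    """Cycles to drain the inputs when only each queue's head may be sent."""
    queues = [deque(q) for q in queues]
    cycles = 0
    while any(queues):
        busy_outputs = set()
        for q in queues:  # lower-numbered inputs win ties (arbitrary choice)
            if q and q[0] not in busy_outputs:
                busy_outputs.add(q.popleft())
        cycles += 1
    return cycles


def drain_any_packet(queues):
    """Same switch, but an input may send any queued packet whose output
    is free this cycle, so a blocked head does not stall the rest."""
    queues = [list(q) for q in queues]
    cycles = 0
    while any(queues):
        busy_outputs = set()
        for q in queues:
            for i, dest in enumerate(q):
                if dest not in busy_outputs:
                    busy_outputs.add(dest)
                    del q[i]
                    break
        cycles += 1
    return cycles


# Both inputs have a packet for output "A" at the head of their queue.
traffic = [["A", "B"], ["A", "C"]]
print(drain_fifo(traffic), drain_any_packet(traffic))  # 3 2
```

In the FIFO case the packet for "C" sits behind the blocked head even 
though output "C" is idle, which is exactly the car waiting to turn.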

> My assumption is always: "if manufacturer doesn't respond it must be 
> real bad for his network card".

Maybe they don't respond because the question does not make any sense.

> Note that pingpong latency also gets demonstrated in a wrong manner.
> Requirement to determine one way pingpong should be that it eats no cpu 
> time obtaining it.

You mean blocking on an interrupt ? When you go to a restaurant, do you 
place your order and go back home waiting for a phone call or do you 
wait at a table ? I, for one, sit down and busy poll.
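
For anyone who wants to see the two waiting strategies side by side, 
here is a rough sketch using a local socket pair and an echo thread. 
Everything about it is an assumption made for illustration (the function 
names, the 1-byte message, the in-process loopback), and the numbers it 
prints say nothing about a real NIC or interconnect. It only shows what 
"busy poll" versus "block until woken up" means in a pingpong loop.

```python
import socket
import threading
import time


def echo(sock, rounds):
    """Echo side of the pingpong: send back every byte received."""
    for _ in range(rounds):
        sock.sendall(sock.recv(1))


def half_round_trip_us(rounds=1000, busy_poll=True):
    """Average half round-trip time in microseconds over `rounds` pingpongs."""
    a, b = socket.socketpair()
    worker = threading.Thread(target=echo, args=(b, rounds))
    worker.start()
    # Non-blocking socket: recv() returns at once and we spin on it.
    # Blocking socket: recv() sleeps in the kernel until data arrives.
    a.setblocking(not busy_poll)
    start = time.perf_counter()
    for _ in range(rounds):
        a.sendall(b"x")
        while True:
            try:
                a.recv(1)
                break
            except BlockingIOError:
                pass  # nothing yet: poll again (burns CPU while waiting)
    elapsed = time.perf_counter() - start
    worker.join()
    a.close()
    b.close()
    return elapsed / rounds / 2 * 1e6


print("busy poll:", half_round_trip_us(busy_poll=True), "us")
print("blocking: ", half_round_trip_us(busy_poll=False), "us")
```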

