[Beowulf] interconnect and compiler ?
patrick at myri.com
Wed Feb 11 17:30:58 PST 2009
Vincent Diepeveen wrote:
> All such sorts of switch latencies are at least factor 50-100 worse than
> their one-way pingpong latency.
I think you are a bit confused about switch latencies.
There is the crossbar latency, which is the time it takes for a packet
to be decoded and routed to the right output port. It is essentially the
difference between the pingpong latency with and without the crossbar in
the middle for the smallest packet size. Typical crossbar latencies are
on the order of 100ns for recent Ethernot, 200ns for Ethernet. To build
a bigger fabric, you need to connect multiple crossbars into Clos,
Fat-tree or Torus topologies. The end-to-end switch latency then
depends on the number of crossbars the packet crosses.
There is the PHY/transceiver latency. That only applies to the edge of
the switch, where a physical cable plugs into a socket. SFP+, for
example, requires serialization compared to QSFP. With fiber, the
transceivers add some overhead. Typical overhead is 250ns per port for
serial fiber PHY, almost nothing for parallel copper.
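To put the two contributions above together, here is a rough
back-of-the-envelope sketch. The function name and the assumption of
exactly two edge ports (one in, one out) are mine, and it ignores cable
length and any queuing, so treat it as an illustration only:

```python
def switch_latency_ns(crossbar_hops, crossbar_ns=100, phy_ns=0, edge_ports=2):
    """Rough end-to-end switch latency in nanoseconds.

    crossbar_hops: number of crossbars the packet crosses
    crossbar_ns:   per-crossbar latency (~100 Ethernot, ~200 Ethernet)
    phy_ns:        per-port PHY overhead (~250 serial fiber, ~0 parallel copper)
    edge_ports:    ports where a cable plugs in (assumed: ingress + egress)
    """
    return crossbar_hops * crossbar_ns + edge_ports * phy_ns


# Single crossbar, parallel copper: 1 * 100 + 2 * 0
print(switch_latency_ns(1))                                  # 100
# Three crossbars, serial fiber at the edges: 3 * 100 + 2 * 250
print(switch_latency_ns(3, phy_ns=250))                      # 800
# Same fabric with Ethernet crossbars: 3 * 200 + 2 * 250
print(switch_latency_ns(3, crossbar_ns=200, phy_ns=250))     # 1100
```

Even the last case is about a microsecond, using only the typical
figures quoted above.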
Another overhead is head-of-line (HOL) blocking. It happens when a
packet has to wait for another one to pass before it can be switched.
This is equivalent to two cars turning onto the same road: one will have
to wait for the other to make the turn. This latency can be high,
especially if the packets are large (imagine a couple of trains instead
of cars).
Is that what you call "ugly switch latency" ? HOL blocking will reduce
your switch efficiency to ~40% with random traffic. That means your
latency will be about two times higher on average, assuming all packets
have the same size. Where is the factor 50-100 ?
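For anyone who wants to play with HOL blocking, here is a toy
simulation. It is a sketch under my own assumptions, not a model of any
real switch: every input always has a packet waiting (saturated FIFOs),
all packets have the same size, destinations are uniformly random, and
each output accepts one packet per time slot. The exact efficiency you
get depends on these choices, so it need not match the figure quoted
above.

```python
import random


def hol_throughput(ports=16, slots=5000, seed=1):
    """Fraction of port-slots that deliver a packet under HOL blocking."""
    rng = random.Random(seed)
    # Destination of the packet sitting at the head of each input FIFO.
    head = [rng.randrange(ports) for _ in range(ports)]
    delivered = 0
    for _ in range(slots):
        # Group the inputs by the output their head packet wants.
        contenders = {}
        for inp, out in enumerate(head):
            contenders.setdefault(out, []).append(inp)
        # One winner per contended output; the losers stay blocked and
        # keep the same head packet for the next slot.
        for inputs in contenders.values():
            winner = rng.choice(inputs)
            head[winner] = rng.randrange(ports)  # next packet moves up
            delivered += 1
    return delivered / (ports * slots)


print(hol_throughput())
```

The efficiency comes out well below 1.0 even though every input always
has something to send, but the loss is a small constant factor, nowhere
near 50-100.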
> My assumption is always: "if manufacturer doesn't respond it must be
> real bad for his network card".
Maybe they don't respond because the question does not make any sense.
> Note that pingpong latency also gets demonstrated in a wrong manner.
> Requirement to determine one way pingpong should be that it eats no cpu
> time obtaining it.
You mean blocking on an interrupt ? When you go to a restaurant, do you
place your order and go back home waiting for a phone call or do you
wait at a table ? I, for one, sit down and busy poll.
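To make the two ways of waiting concrete, here is a minimal pingpong
sketch. The big assumption is that a local socket pair stands in for the
interconnect; a real measurement would go through the NIC and its own
API, and in Python the interpreter overhead dominates anyway. The
function names are made up for illustration.

```python
import socket
import threading
import time


def echo(sock, rounds):
    # Peer side: block until a byte arrives, then send it straight back.
    for _ in range(rounds):
        sock.sendall(sock.recv(1))


def half_round_trip(rounds=1000, busy_poll=True):
    """Average one-way time in seconds (half the measured round trip)."""
    a, b = socket.socketpair()
    peer = threading.Thread(target=echo, args=(b, rounds))
    peer.start()
    # Non-blocking socket: recv() returns at once, so we spin on it.
    # Blocking socket: recv() sleeps in the kernel until data arrives.
    a.setblocking(not busy_poll)
    start = time.perf_counter()
    for _ in range(rounds):
        a.send(b"x")
        while True:
            try:
                a.recv(1)
                break
            except BlockingIOError:
                continue  # nothing yet: keep polling instead of sleeping
    elapsed = time.perf_counter() - start
    peer.join()
    a.close()
    b.close()
    return elapsed / rounds / 2


print("busy poll:", half_round_trip(busy_poll=True))
print("blocking: ", half_round_trip(busy_poll=False))
```

The busy-poll version burns a core while it waits; the blocking version
pays for a wakeup on every message.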