[Beowulf] Q: IB message rate & large core counts (per node)?

Brian Dobbins bdobbins at gmail.com
Fri Feb 19 12:23:20 PST 2010

Hi Joe,

>>  I'm beginning to look into configurations for a new cluster and with the
>> AMD 12-core and Intel 8-core chips 'here' (or coming soonish), I'm curious
>> if anyone has any data on the effects of the messaging rate of the IB cards.
>>  With a 4-socket node having between 32 and 48 cores, lots of computing can
>> get done fast, possibly stressing the network.
> The big issue will be contention for the resource.  As you scale up the
> number of requesters, if the number of resources doesn't also scale up (even
> virtualized non-blocking HCA/NICs are good here), you could hit a problem at
> some point.

  My knowledge of the latest low-level hardware is sadly out of date - does
a virtualized non-blocking HCA mean that I can have one HCA which
virtualizes into four (one per socket say), and each of those four has its
own memory-mapped buffer so that I don't get cache invalidation / contention
on multi-socket boxes, or am I totally off-base here?  I'm all for scaling
up NICs as I scale up cores, but each additional NIC / HCA port means more
switch ports, which adds up fast.  In fact, if I have a standard 2-socket
node now, with 8 cores in it and a DDR IB port, and then get a 2-socket node
with 24 cores in it and a QDR IB port,... how's the math work?  I've got 3x
the cores, 1x the adapters, but that adapter has 2x the speed.  Blah.
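
  To put rough numbers on that, here is a back-of-the-envelope sketch in
Python.  It assumes 4x links with 8b/10b encoding (so DDR carries 16 Gb/s of
data and QDR 32 Gb/s) and that every core gets an even share of the port; the
helper name is purely illustrative.

```python
# Back-of-the-envelope: link bandwidth available per core.
# Assumption: 4x IB links with 8b/10b encoding, so the usable data rate
# is 80% of the signaling rate (DDR: 20 -> 16 Gb/s, QDR: 40 -> 32 Gb/s).
DATA_RATE_GBPS = {"DDR": 20 * 8 / 10, "QDR": 40 * 8 / 10}

def per_core_gbps(cores, link, ports=1):
    """Even share of the node's IB data bandwidth per core (illustrative helper)."""
    return DATA_RATE_GBPS[link] * ports / cores

old = per_core_gbps(8, "DDR")    # current 2-socket node, 8 cores, one DDR port
new = per_core_gbps(24, "QDR")   # 2-socket node, 24 cores, one QDR port
print(f"old: {old:.2f} Gb/s per core")
print(f"new: {new:.2f} Gb/s per core ({new / old:.0%} of old)")
```

  On those assumptions the new node has roughly two-thirds of the per-core
bandwidth of the old one, and that is before message rate enters into it.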

>>  I know Qlogic has made a big deal about the InfiniPath adapter's extremely
>> good message rate in the past... is this still an important issue?  How do
>> the latest Mellanox adapters compare?  (Qlogic documents a ~30M messages
>> processed per second rate on its QLE7342, but I didn't see a number on the
>> Mellanox ConnectX-2... and more to the point, do people see this affecting
>> them?)
> We see this on the storage side.  Massive oversubscription of resources
> leads to contention issues for links, to IB packet requeue failures among
> other things.
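
  For scale, a rough sketch of what a ~30M messages/s adapter leaves per core
on the node sizes mentioned above.  It assumes every core shares the adapter
evenly, which is a simplification, and the helper name is illustrative.

```python
# Rough per-core message budget for one adapter shared by all cores.
# Assumption: the ~30M msgs/s figure quoted above is split evenly per core.
ADAPTER_MSGS_PER_SEC = 30e6

def msgs_per_core(cores, adapters=1):
    """Even share of the adapter message rate per core (illustrative helper)."""
    return ADAPTER_MSGS_PER_SEC * adapters / cores

for cores in (8, 24, 32, 48):
    print(f"{cores:2d} cores: {msgs_per_core(cores) / 1e6:.2f}M msgs/s per core")
```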

  So (ignoring disk latencies and just focusing on link contention), is
there any difference between using 2x the storage nodes or the same number
of storage nodes, but with 2x the NICs?

>>  On a similar note, does a dual-port card provide an increase in on-card
>> processing, or 'just' another link?  (The increased bandwidth is certainly
>> nice, even in a flat switched network, I'm sure!)
> Depends.  If the card can talk to the PCIe bus at full speed, you might be
> able to saturate the link with a single QDR port.  If your card is throttled
> for some reason (we have seen this) then adding the extra port might or
> might not help.  If you are at the design stage, I'd suggest "go wide" as
> you can ... as many IB HCAs as you can get to keep the number of ports/core
> as high as reasonable.

  Oh dear.  I need to go re-learn a lot of things.  So if I want multiple
full-speed QDR cards in a node, I need that node to have independent PCIe
buses, and each card to be placed on a separate bus.
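
  A quick sanity check on that, as a sketch: assuming PCIe 2.0 at 5 GT/s per
lane with 8b/10b encoding, and ignoring protocol overhead (which only makes
things worse), an x8 slot carries about as much data as a single 4x QDR port.
The x8 slot width and the helper name are my assumptions.

```python
# Rough check: can one PCIe slot feed N QDR ports at full rate?
# Assumptions: PCIe 2.0 (5 GT/s per lane, 8b/10b encoding), 4x QDR IB
# (40 Gb/s signaling, 8b/10b encoding); packet/protocol overhead ignored.
PCIE2_LANE_GBPS = 5.0 * 8 / 10   # 4 Gb/s of data per lane, per direction
QDR_PORT_GBPS = 40.0 * 8 / 10    # 32 Gb/s of data per port, per direction

def slot_can_feed(lanes, qdr_ports):
    """True if the slot's raw data rate covers all ports (illustrative helper)."""
    return lanes * PCIE2_LANE_GBPS >= qdr_ports * QDR_PORT_GBPS

print(slot_can_feed(8, 1))   # single-port card in an x8 slot
print(slot_can_feed(8, 2))   # dual-port card in the same slot
```

  On those assumptions a dual-port QDR card in an x8 slot can't run both
ports flat out, so the second port really is 'just' another link.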

> Of course I'd have to argue the same thing on the storage side :)

  No argument from me there!

  Thanks again, as always, for your input.
  - Brian