[Beowulf] building Infiniband 4x cluster questions - using 1 port out of 2

Vincent Diepeveen diep at xs4all.nl
Mon Nov 7 09:20:24 PST 2011

On Nov 7, 2011, at 5:07 PM, Robert Horton wrote:

> On Mon, 2011-11-07 at 15:45 +0100, Vincent Diepeveen wrote:
>> What's the second one doing? Is this just in case the switch fails,
>> a kind of 'backup' port?
>> In my naivety I had thought that both ports together formed the
>> bidirectional link to the switch.
>> So I thought that 1 port was for 10 gigabit upstream and the other
>> port was for 10 gigabit downstream.
>> Did I misunderstand that?
> It's "normal" to just use single port cards in a compute server. You
> might want to use 2 (or more) to increase the bandwidth to a particular
> machine (might be useful for a fileserver, for instance) or if you are
> linking nodes to each other (rather than via a switch) in a torus-type
> topology.
> Rob

It's still not clear to me what exactly the 2nd link is doing. If I
want to ship the maximum number of short messages, say 128 bytes per
message, is a 2nd cable going to increase the number of messages I can
ship?

In fact the messages I'll be shipping out are requests to read 128
bytes from a remote node in a blocking manner. Say process P at node N0
wants exactly 128 bytes from the gigabytes-big hashtable at some other
node N1.

That's a blocking read.

The number of blocking 128-byte reads per second is the only thing that
matters for the network, nothing else. Note it will also do writes, but
with writes you can always do things in a more tricky manner: you can
queue up a bunch and ship them together. Writes do not need to be
blocking. If they flow at a tad slower speed to the correct node and
get written there, that's also ok. A write is 32 bytes max.
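
To illustrate the "queue up a bunch and ship them" idea, here is a
minimal sketch. The names WriteQueue, send and BATCH_SIZE are
hypothetical, and the batch size is an arbitrary assumption; only the
32-byte write limit comes from the text above. The send callback stands
in for whatever the network layer really offers.

```python
from collections import defaultdict

MAX_WRITE_BYTES = 32   # from the text: a write is 32 bytes max
BATCH_SIZE = 64        # assumption: writes queued per node before shipping


class WriteQueue:
    """Queue small writes per destination node and ship them in batches."""

    def __init__(self, send, batch_size=BATCH_SIZE):
        # send(node, batch) is a hypothetical transport callback;
        # batch is a list of (offset, payload) tuples.
        self.send = send
        self.batch_size = batch_size
        self.pending = defaultdict(list)

    def write(self, node, offset, payload):
        """Queue one write; ship the node's batch once it is full."""
        if len(payload) > MAX_WRITE_BYTES:
            raise ValueError("write larger than 32 bytes")
        self.pending[node].append((offset, payload))
        if len(self.pending[node]) >= self.batch_size:
            self.flush(node)

    def flush(self, node):
        """Ship everything queued for one node, if anything is queued."""
        batch = self.pending.pop(node, [])
        if batch:
            self.send(node, batch)

    def flush_all(self):
        """Ship the remaining writes for every node."""
        for node in list(self.pending):
            self.flush(node)
```

The caller never waits on an individual write; it only has to call
flush_all() at a point where the writes must be visible remotely.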

In fact I don't want to read 128 bytes. Those 128 bytes are 4 entries
of 32 bytes each. As such a cluster network is a layered system, if I
were able to modify the source code doing the read, all I would give
is a location; the host processor could then do the read of 32 bytes
and return that.

As I assume the network to be silly and not able to execute remote
code, I read 128 bytes and figure out locally which of the 4 *possible*
positions holds the correct stored position (odds are a tad more than
5% that a position was already stored before).

So ideally I'd be doing reads of 32 bytes. Yet as the read request is
not capable of selecting the correct position, the 128 bytes have to be
scanned for it locally, so I get the entire 128 bytes.
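
The local scan of the 128 bytes could look roughly like the sketch
below. The entry layout is an assumption made only for illustration (an
8-byte little-endian key followed by 24 bytes of data); the real layout
is not given here. find_entry is a hypothetical name.

```python
import struct

ENTRY_BYTES = 32                                  # one hashtable entry
ENTRIES_PER_BUCKET = 4                            # 4 entries per read
BUCKET_BYTES = ENTRY_BYTES * ENTRIES_PER_BUCKET   # 128 bytes


def find_entry(bucket, key):
    """Return the 32-byte entry whose key matches, or None.

    Assumed layout per entry: 8-byte little-endian key, then 24 bytes
    of payload. This layout is a guess, not the real one.
    """
    if len(bucket) != BUCKET_BYTES:
        raise ValueError("bucket must be exactly 128 bytes")
    for i in range(ENTRIES_PER_BUCKET):
        entry = bucket[i * ENTRY_BYTES:(i + 1) * ENTRY_BYTES]
        (stored_key,) = struct.unpack_from("<Q", entry, 0)
        if stored_key == key:
            return entry
    return None
```

If the remote side could run this loop itself, only the matching 32
bytes would have to cross the network.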

The number of 128-byte reads per second, randomized over the hashtable
that's spread over the nodes, is the speed at which the 'mainsearch'
can search.
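
For illustration, one simple way a hash key might be mapped to a node
and a 128-byte slot on that node is sketched below. The function locate
and the even split of slots over the nodes are assumptions; the actual
distribution scheme is not described here.

```python
BUCKET_BYTES = 128  # size of one remote read


def locate(key, num_nodes, buckets_per_node):
    """Map a hash key to (node, byte offset of its 128-byte slot).

    Assumes every node holds the same number of slots and that slots
    are numbered node by node. Both are illustrative assumptions.
    """
    total_buckets = num_nodes * buckets_per_node
    bucket = key % total_buckets
    node = bucket // buckets_per_node
    offset = (bucket % buckets_per_node) * BUCKET_BYTES
    return node, offset
```

With well-mixed hash keys the reads end up spread roughly evenly over
the nodes, which is the randomized access pattern described above.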

I'm guessing a blocking read eats nearly 10 microseconds with
infiniband 4x, so that would mean I can do about 100k lookups per
second per card.
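
The arithmetic behind that estimate, as a small sketch. The 10
microsecond latency is the guess from above and the 10 gigabit link
speed is the figure mentioned earlier in the thread; neither is a
measurement. It assumes a single process with one blocking read
outstanding at a time, and the function names are hypothetical.

```python
READ_BYTES = 128          # bytes fetched per blocking read
LATENCY_S = 10e-6         # guessed round-trip time of one blocking read
LINK_BITS_PER_S = 10e9    # 10 gigabit, as mentioned in the thread


def blocking_reads_per_second(latency_s):
    """One read in flight at a time: rate is the inverse of the latency."""
    return 1.0 / latency_s


def link_utilization(latency_s, read_bytes, link_bits_per_s):
    """Fraction of the link's bandwidth used by the read payloads alone."""
    payload_bits = blocking_reads_per_second(latency_s) * read_bytes * 8
    return payload_bits / link_bits_per_s


rate = blocking_reads_per_second(LATENCY_S)
used = link_utilization(LATENCY_S, READ_BYTES, LINK_BITS_PER_S)
print(f"about {rate:.0f} reads/s, about {used:.1%} of the link")
```

Under these assumptions the payload is on the order of 1% of the link,
so the lookup rate here is set by the latency of each read rather than
by the bandwidth of the cable.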

The question is whether connecting the 2nd port would speed that up to
more than 100k reads per second.
