[Beowulf] How to configure a cluster network

andrew holway andrew at moonet.co.uk
Thu Jul 24 11:15:17 PDT 2008

Well, the top configuration (and the one that I suggested) is the one
that we have tested and know works. We have implemented it in
hundreds of clusters. It also provides redundancy for the core.

With any network you need to avoid any kind of loop like the plague;
loops can cause weird problems and are pretty much unnecessary. For
instance, why would you put a line between the two core switches? Why
would that line carry any traffic?

When you consider that it takes 2-4μs for an MPI message to get from
one node to another on the same switch, each extra hop will only
introduce another 0.02μs (I think?) to that latency, so it's not really
worth worrying about, especially at the expense of reliability.
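
To put rough numbers on that, here is a minimal Python sketch. It
assumes the figures above (2-4μs on the same switch, roughly 0.02μs per
extra hop, which is a guess rather than a measurement); the function
and constant names are made up for illustration.

```python
# Rough latency estimate from the figures quoted in this message.
# HOP_PENALTY_US is an uncertain guess, not a measured value.
BASE_LATENCY_US = (2.0, 4.0)  # same-switch MPI latency range, microseconds
HOP_PENALTY_US = 0.02         # assumed extra latency per additional switch hop


def estimated_latency_us(switch_hops, base_us):
    """Latency for a path crossing `switch_hops` switches (1 = same switch)."""
    if switch_hops < 1:
        raise ValueError("a path crosses at least one switch")
    return base_us + (switch_hops - 1) * HOP_PENALTY_US


# Compare a same-switch path with the 3- and 4-hop routes discussed below.
for hops in (1, 3, 4):
    low = estimated_latency_us(hops, BASE_LATENCY_US[0])
    high = estimated_latency_us(hops, BASE_LATENCY_US[1])
    print(f"{hops} hop(s): {low:.2f}-{high:.2f} us")
```

If the per-hop guess is anywhere near right, going from 1 hop to 3 or 4
changes the latency by a few percent at most.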

Most applications don't use anything like the full bandwidth of the
interconnect, so the fact that everything is only half bisectional can
generally be safely ignored.

All the spare ports you have on the edge switches can be used for
extra connections to the core switches.
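
As a rough sketch of the port arithmetic for the layout I suggested
(2 core and 4 edge switches, 24 ports each, 4 links from every edge
switch to every core switch), something like the Python below works.
It is a simplification: it assumes traffic spreads evenly over the
uplinks, which static routing will not always give you, and the names
are made up for illustration.

```python
# Port arithmetic for the 2-core / 4-edge layout with 24-port switches.
# Assumes traffic is spread evenly over all uplinks (a simplification).
PORTS_PER_SWITCH = 24
CORE_SWITCHES = 2
EDGE_SWITCHES = 4
UPLINKS_PER_CORE = 4  # links from each edge switch to each core switch

BASE_UPLINKS = CORE_SWITCHES * UPLINKS_PER_CORE  # uplinks per edge switch
NODE_CAPACITY = EDGE_SWITCHES * (PORTS_PER_SWITCH - BASE_UPLINKS)


def spare_ports(nodes_on_edge):
    """Edge-switch ports left over after the nodes and the base uplinks."""
    spare = PORTS_PER_SWITCH - BASE_UPLINKS - nodes_on_edge
    if spare < 0:
        raise ValueError("too many nodes for one edge switch")
    return spare


def uplink_ratio(nodes_on_edge, extra_uplinks=0):
    """Fraction of full bandwidth available when every node on one edge
    switch talks to nodes on another edge switch (capped at 1.0)."""
    if extra_uplinks > spare_ports(nodes_on_edge):
        raise ValueError("not enough spare ports for the extra uplinks")
    return min(1.0, (BASE_UPLINKS + extra_uplinks) / nodes_on_edge)


print("node ports available:", NODE_CAPACITY)
print("fully loaded edge switch:", uplink_ratio(16))
print("12 nodes per edge switch:", uplink_ratio(12))
# 50 nodes over 4 edge switches means at most 13 nodes on any one switch.
print("13 nodes, spare ports as uplinks:", uplink_ratio(13, spare_ports(13)))
```

With 16 nodes per edge switch the ratio is one half, with 12 nodes it
is the 8/12 Daniel worked out, and turning the spare edge ports into
extra uplinks pushes it closer to 1.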



On Thu, Jul 24, 2008 at 6:16 PM, Daniel Pfenniger
<Daniel.Pfenniger at obs.unige.ch> wrote:
> Andrew,
> Attached are some possible topologies I was contemplating, with some
> remarks about them.  Many other topologies are possible.
> The first one is the one you mention.  If 12 nodes linked to one switch
> communicate with 12 nodes on another switch the bandwidth is reduced to
> 8/12 = 2/3.  All the packets need either 1 or 3 hops through a switch.
> The second topology improves the bandwidth between the core and edge
> switches.  The bandwidth for the above case is not reduced except that some
> routes need 4 hops.
> The third topology has the feature that 2 hops node to node communications
> are possible, but global communications are slightly degraded with respect
> to the previous case.
> In the fourth case we have one core switch and 4 edge switches. When
> 10 nodes communicate with 10 other nodes on another edge switch
> 2 or 3 routes need 2 hops and the rest 3 hops, without bandwidth reduction.
>  It seems to me that this topology is better than the previous ones.
> Finally the last topology has no core switch.  All the routes need either 1
> or 2 hops.  This one seems to me even better.
> Since I am not a network expert I would be glad if somebody could explain
> why the first solution is the best one.
>        Dan
> andrew holway wrote:
>> Daniel
>> To give a half bisectional bandwidth the best approach is to set up
>> two as core switches and the other 4 as edge switches.
>> Each edge switch will have four connections to each core switch
>> leaving 16 node connections on each edge switch.
>> Should provide a 64 port network.
>> Make sense?
>> Ta
>> Andy
>> On Thu, Jul 24, 2008 at 11:06 AM, Daniel Pfenniger
>> <Daniel.Pfenniger at obs.unige.ch> wrote:
>>> Hi,
>>> I have the problem of connecting with InfiniBand 50 1-HCA nodes with 6
>>> 24-port switches.  Several configurations may be imagined, but which one
>>> is the best?  What is the general method to solve such a problem?
>>> Thanks,
>>>       Dan
