[Beowulf] if you had 2 switches and a bunch of cable, what would you do?

Reuti reuti at staff.uni-marburg.de
Mon Dec 21 04:38:06 PST 2009


On 21.12.2009 at 10:54, Hearns, John wrote:

> Option 1. Create 2 separate internal networks. In wiring drawings for
> clusters, I often see one administrative network and one for
> computations (mpi, and so forth).

First, a matter of definition: what is an "administrative network"?
Just the option to ssh to a node plus SGE, or access to some
facility of a dedicated service processor like "Lights Out"?

In the former case, I would do it the other way round: use the
primary one for ssh, SGE and MPI, and the second one for NFS. Simply
because then there is no need to alter the generated list of hosts
(ssh to the nodes is in my case only for admin staff anyway), and
SGE's communication to the nodes is not that heavy. (When local spool
directories on the nodes are used, no further communication is
needed to store this information. Otherwise it will go via NFS to the
central spool directory, but that would be on the second network then.)

With some MPI implementations it can be tricky (but possible) to
force them to use the secondary interface, especially for both
directions. In MPICH(1) (old, but sometimes still used) the
environment variable MPI_HOST must also be set to the name of the
secondary interface.
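
To illustrate the MPI_HOST point, here is a minimal Python sketch of how a
job wrapper might prepare the environment before starting an MPICH(1)
program. The function name mpich1_env and the "-eth1" suffix are only
assumptions for illustration; the name you set must be whatever actually
resolves to the address of the secondary interface on your nodes.

```python
import os
import socket


def mpich1_env(suffix="-eth1", hostname=None, base_env=None):
    """Return a copy of the environment with MPI_HOST set.

    MPI_HOST is set to the short hostname plus `suffix`. The suffix is a
    hypothetical naming convention for the secondary interface, not
    something MPICH itself defines.
    """
    if hostname is None:
        hostname = socket.gethostname()
    # Drop any domain part so "node01.cluster.local" becomes "node01".
    short = hostname.split(".", 1)[0]
    env = dict(os.environ if base_env is None else base_env)
    env["MPI_HOST"] = short + suffix
    return env
```

The returned dictionary could then be passed as the env argument of
subprocess.run() when the wrapper starts the MPI program on that node.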

Well, you need a second (or third) network for the headnode then: one
for NFS and maybe one for going to the outside world (this way all
internal traffic can use private addresses and is invisible from the
outside).


If the administrative network is "Lights Out" management, I would
look for a lower-performance switch lying around. If your
servers have it built in, I would use it. If you have enough ports on
the switches, you can also connect it to the first network from above.

-- Reuti

> Paul, definitely recommend option 1.
> Use the second switch for MPI traffic.
> The way you achieve this is to use your batch scheduling system and run
> a script which takes the machines list provided by the batch system and
> translates it into one which is fed to the mpirun utility, i.e. the
> hostnames are turned into something like
> hostname-eth1
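
The translation step described in the quoted text could look roughly like
the following Python sketch. It assumes that each line of the machines
list starts with a hostname, optionally followed by ":count" or by further
whitespace-separated fields, and that the secondary interface is reachable
under a name with a "-eth1" suffix. The function names are made up for
this example, and the exact file format depends on your batch system and
MPI implementation.

```python
def to_secondary(line, suffix="-eth1"):
    """Rewrite the hostname at the start of one machines-list line."""
    stripped = line.strip()
    # Leave blank lines and comments untouched.
    if not stripped or stripped.startswith("#"):
        return line.rstrip("\n")
    fields = stripped.split()
    # Split off an optional ":count" part, e.g. "node01:4".
    host, sep, count = fields[0].partition(":")
    # Insert the suffix after the short name, before any domain part.
    short, dot, domain = host.partition(".")
    fields[0] = short + suffix + dot + domain + sep + count
    return " ".join(fields)


def translate_machines(lines, suffix="-eth1"):
    """Translate every line of a machines list to secondary hostnames."""
    return [to_secondary(line, suffix) for line in lines]
```

A wrapper script would read the machines file provided by the batch
system, write the translated lines to a new file, and hand that file to
mpirun instead of the original one.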