Host/interface naming and network path selection

Chris Greer cgreer1 at midsouth.rr.com
Mon Jan 29 00:03:21 PST 2001


We did something similar with DNS on machines attached to
multiple networks.

Replace domain.com with your domain in the following.

We defined a.domain.com, b.domain.com, etc. as DNS zones, one per
subnet (we are up to 6 subnets on our NFS servers).

The name nfs is defined in each of these zones, so nfs.a.domain.com,
nfs.b.domain.com, etc. are all separate records.

nfs.domain.com is a CNAME to the default-route interface on the
hosts.

The magic on the clients is then done with the search line in
/etc/resolv.conf.

If a machine is on the subnet b.domain.com and references the
short name nfs, the resolver's search list expands it to
nfs.b.domain.com, so the client talks to the server over the
right interface.  This solved our naming problem in a
multi-homed environment.  It does not exactly address your
issue, but it may be a useful starting point.

Before we came up with this, machines would try to talk to
nfs.domain.com (usually through a gateway), and the NFS server
would answer from its interface on the local network that the
request came in on.  Since the replies did not carry the IP
address the client had sent its request to, they were thrown out.
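
To make the search-line behaviour concrete, here is a small Python
sketch that only imitates what the resolver does; it makes no real
DNS queries.  The addresses, the RECORDS table and the function names
are made-up placeholders, and a real resolver also honours options
such as ndots that this toy model ignores.

```python
# Toy model of the resolver search list; no real DNS queries are made.
# All names and addresses below are hypothetical placeholders.

# One "nfs" record per subnet zone, plus an alias in the parent zone
# standing in for the CNAME to the default-route interface.
RECORDS = {
    "nfs.a.domain.com": "10.0.1.10",
    "nfs.b.domain.com": "10.0.2.10",
    "nfs.domain.com": "nfs.a.domain.com",  # plays the role of the CNAME
}


def lookup(fqdn, records):
    """Follow aliases until an address is reached; None if unknown."""
    seen = set()
    while fqdn in records and fqdn not in seen:
        seen.add(fqdn)
        fqdn = records[fqdn]
    # A value that is not itself a key is treated as the final address.
    return fqdn if seen else None


def resolve(name, search, records):
    """Try the short name under each search domain in order, then as given."""
    name = name.lower()
    candidates = [f"{name}.{domain}" for domain in search] + [name]
    for candidate in candidates:
        address = lookup(candidate, records)
        if address is not None:
            return candidate, address
    return None


# A client on subnet b would have "search b.domain.com domain.com".
on_b = resolve("NFS", ["b.domain.com", "domain.com"], RECORDS)
# A client with only the parent domain falls through to the CNAME.
elsewhere = resolve("nfs", ["domain.com"], RECORDS)
print(on_b)
print(elsewhere)
```

The point of the sketch is that the same short name gives a
different answer depending on the client's search list, which is
what keeps the traffic on the local subnet.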




Josip Loncaric wrote:
> 
> Some of our machines now have multiple network interfaces, which leads
> to the following question:
> 
> Say machines A,B,C,... can communicate over multiple networks labeled
> 1,2,...; and say you have a parallel application which launches
> processes on A, B and C.
> 
> How does your parallel application know which communication paths to
> use?  Of course, routing is done based on IP addresses, so the choice of
> the path is actually made when names are resolved to IP addresses.
> Several weird situations can arise.
> 
> (1)  Say that network 2 is faster than network 1 but that there is no A2
> interface.  We could globally identify A=A1, B=B2, C=C2.  Now, paths
> C2<->B2 and B1,C1->A1 work fine, but A1->B2,C2 requires a gateway (very
> bad).  One might change /etc/hosts on A such that B=B1 and C=C1 (on A
> only), but this is not a globally consistent naming scheme.  Some
> software needs globally unique machine name -> IP address mappings (it
> gets confused when A thinks B=B1 but C thinks B=B2).
> 
> (2) Message passing model does not care which interface is used -- it
> just wants to talk to some process on some host.  A sensible expectation
> is that gethostbyname(A) would return a prioritized list of A's
> interface IP addresses.  This is not what happens.  If /etc/hosts is
> used, gethostbyname(A) returns the IP address of the first match; if DNS
> is used and A is associated with multiple IP addresses, gethostbyname(A)
> returns the address list BUT rotates IP addresses on each invocation
> (the aim is to provide load sharing for web sites, I guess).  This fails
> to prioritize paths and can confuse applications which assume globally
> unique name<->address mappings.
> 
> (3) Another problem can arise in naming public/private interfaces.
> The /etc/hosts file can look like this:
> 
> 192.168.1.1     A-1.domain A-1 A        # fast network, private
> 128.2.2.2       A.domain   A-2 A        # slow network, public
> 
> Locally, A resolves to the fast private network while the FQDN form
> A.domain gives the slow public interface, but unfortunately A and
> A.domain resolve to different addresses...
> 
> I'm sure other related examples can be found.  On our system, we were
> forced to do the following:
> 
> (i) All hosts within the cluster use the same primary network 1 so that
> canonical names resolve to A=A1, B=B1, etc.
> (ii) Secondary names like A2,B2,... are used where appropriate
> (iii) Parallel codes use either hostnames A,B,... (network 1) or
> A2,B2,... (network 2) but almost never a mixture of the two
> 
> This situation begs for a better solution.  One approach (not
> universally followed) is to name interfaces A-1,A-2,... and then derive
> the canonical hostname A by truncating each name at the '-' character
> (some software packages use this procedure).  Some kind of consensus on
> whether we are talking about hosts or interfaces is needed, particularly
> since we'd like parallel codes to be portable between clusters.
> Administrator tools to prioritize addresses returned by gethostbyname()
> would also be nice.
> 
> Any suggestions?
> Josip
> 
> P.S.  My personal preference would be to use canonical hostnames like A
> and let the local system figure out what's the best IP address to use.
> This would imply that parallel applications should identify
> participating hosts by canonical hostnames, not by IP addresses (a host
> could have several).  Interface naming could follow the A-1,A-2,...
> style, but unfortunately this style is not a standard.
> 
> --
> Dr. Josip Loncaric, Senior Staff Scientist        mailto:josip at icase.edu
> ICASE, Mail Stop 132C           PGP key at http://www.icase.edu./~josip/
> NASA Langley Research Center             mailto:j.loncaric at larc.nasa.gov
> Hampton, VA 23681-2199, USA    Tel. +1 757 864-2192  Fax +1 757 864-6134
> 
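
The quoted message asks for a way to prioritize the addresses that a
name lookup returns.  The following Python sketch shows one way an
application could do that ordering itself, assuming it already has
the address list (for example from socket.gethostbyname_ex) and an
administrator-supplied list of preferred networks.  The function name
prioritize and the example networks are hypothetical; this is an
illustration, not an existing tool.

```python
import ipaddress


def prioritize(addresses, preferred_networks):
    """Order addresses by the first preferred network that contains them.

    Addresses outside every preferred network sort last.  Ties are broken
    by IP version and numeric value, so the result does not depend on the
    order (for example DNS rotation) in which the addresses arrived.
    """
    networks = [ipaddress.ip_network(net) for net in preferred_networks]

    def sort_key(address):
        ip = ipaddress.ip_address(address)
        rank = len(networks)  # default: not in any preferred network
        for index, network in enumerate(networks):
            if ip in network:
                rank = index
                break
        return (rank, ip.version, int(ip))

    return sorted(addresses, key=sort_key)


# Addresses taken from the /etc/hosts example in the quoted message;
# the /24 preference for the fast private network is an assumption.
ordered = prioritize(["128.2.2.2", "192.168.1.1"], ["192.168.1.0/24"])
print(ordered)
```

Because the ordering is computed from the network list alone, every
host that uses the same list gets the same answer for a given set of
addresses.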