[Beowulf] Experience of using multiple network devices on a node in cluster

Mark Hahn hahn at physics.mcmaster.ca
Mon May 16 07:00:48 PDT 2005


> We have implemented clusters using one interface for parallel traffic
> (Score) and one for general purpose/NFS traffic.

segregating traffic is a common suggestion, but I don't really understand
why it would be sensible.  a node is unlikely to be running some mixture
of MPI and IO jobs, at least the normal kind of node (dual-processor).
control/monitoring really ought to be minimal in bandwidth (per-node), no?

failing to use both ports seems like a shame to me - host ports are 
smarter than switch ports, and let you build extremely high bisection
networks.  for instance:

	- NxM grid of nodes (N across, M down).
	- the N nodes across row m plug into the m-th "row" switch.
	- the M nodes down column n plug into the n-th "column" switch.
	- each switch plugs into its peers: the m-th row switch
	plugs into the M-1 other row switches.
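
a rough sketch of the port budget for this layout, in Python.  it assumes
exactly one link between each pair of peer switches (no trunking) and that
column switches are meshed the same way as row switches; the function names
are made up for illustration.

```python
def ports_per_switch(n, m):
    """Ports needed per switch in a grid n nodes across by m nodes down.

    Assumption: one link per pair of peer switches, no trunking.
    """
    row_switch = n + (m - 1)  # n nodes in the row + links to the m-1 other row switches
    col_switch = m + (n - 1)  # m nodes in the column + links to the n-1 other column switches
    return row_switch, col_switch


def max_nodes(switch_ports):
    """Largest grid (nodes, n, m) whose switches each fit in switch_ports ports."""
    best = (0, 0, 0)
    for n in range(1, switch_ports + 1):
        m = switch_ports + 1 - n  # from n + m - 1 == switch_ports
        best = max(best, (n * m, n, m))
    return best
```

under those assumptions every switch needs N+M-1 ports, so 24-port switches
would top out at a 12x13 (or 13x12) grid, i.e. 156 nodes.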

each route is 1-2 switch hops; nodes only do the initial bit of routing
(which port to use).  this mainly makes sense where you have cheap switches,
but more nodes than FNN can use (and bigger switches are expensive.)
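
the port choice a node has to make can be sketched like this.  nodes are
hypothetical (row, column) pairs, and "row"/"col" name the NIC plugged into
the row or column switch.  when both row and column differ, going out the row
port is an arbitrary choice here; the column path is just as short.

```python
def route(src, dst):
    """Return (port, switches_traversed) for traffic from src to dst.

    src and dst are (row, col) tuples. Illustrative only: a real setup
    would express this choice as static routes on each node.
    """
    (src_row, src_col), (dst_row, dst_col) = src, dst
    if src == dst:
        return None, []  # local traffic, no switch involved
    if src_row == dst_row:
        return "row", [("row", src_row)]  # same row switch: 1 hop
    if src_col == dst_col:
        return "col", [("col", src_col)]  # same column switch: 1 hop
    # Different row and column: leave via the row switch, cross one peer
    # link to the destination's row switch: 2 hops.
    return "row", [("row", src_row), ("row", dst_row)]
```

for example, route((0, 0), (2, 3)) leaves on the row port and crosses row
switches 0 and 2, while route((0, 0), (2, 0)) stays on column switch 0.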



