[Beowulf] using two separate networks for different data streams
gropp at mcs.anl.gov
Sat Jan 28 12:01:03 PST 2006
At 06:28 PM 1/27/2006, Dan Stromberg wrote:
>On Fri, 2006-01-27 at 19:57 +0100, Daniel Pfenniger wrote:
> > Ricardo Reis wrote:
> > >
> > > First, hi all, and thanks for your answers. They were truly useful. Which
> > > brings me to...
> > >
> > > On Fri, 27 Jan 2006, Mark Hahn wrote:
> > >
> > >> I wonder whether anyone has critically evaluated whether this is
> > >> important.
> > >> cluster people I talk to like to say fuzzy things like "separate networks
> > >> make the cluster breathe better".
> > >>
> > >> as much as I admire car analogies, I observe that when apps are
> doing IO,
> > >> they tend not to be doing MPI. if your workload is like that, bonding
> > >> rather than partitioning would actually improve performance. I wonder
> > >> whether the partitioning approach might actually reflect other constraints,
> > >> such as using half-duplex hubs, or low-bisection networks.
> > The network for MPI should in many cases have low latency, so it is
> > expensive (Myrinet, InfiniBand, etc.) compared with Ethernet. The I/O, NFS,
> > and system network does not need low latency, so it can be added at bargain
> > cost, with the additional benefit that it provides a control network for
> > tweaking the nodes remotely when the expensive low-latency network is down.
>That leads to a question for the compute cluster we're currently
>planning to buy here at UCI:
>Is there a way of characterizing in what proportion a given application
>relies on OpenMP, and how much the application depends on MPI (and hence
>MPI network latency) - other than speaking with application developers
>to get their intuitive feel, that is? :)
>We're looking to buy a Gigabit Ethernet network for the MPI on this, but
>if its latency is obscenely high, and the primary application the
>cluster's being purchased for is heavily dependent on MPI, then we might
>want to skip the GigE and go for something else.
If you can get your users to relink (but not recompile) their MPI
applications, there are a number of tools that you can use to understand
the communication needs of those applications. For example, FPMPI2
(www.mcs.anl.gov/fpmpi) collects summary information about each MPI
communication routine and separates the information into messages of
various sizes; this lets you see how much time you are spending on short
(latency-sensitive) messages. There are also more sophisticated tools that
let you estimate the performance of those applications under changes in the
latency and bandwidth of the MPI implementation.