[Beowulf] using two separate networks for different data streams


Mark Hahn hahn at physics.mcmaster.ca
Fri Jan 27 18:36:03 PST 2006


> > The network for MPI should in many cases have low latency, so is expensive
> > (Myrinet, InfiniBand, etc.) compared with Ethernet.

I tend to think of IB as mainly for high-bandwidth, since afaik its 
latency isn't even as good as myri 2g (mx, 3 us), to say nothing of 
myri 10g, quadrics, numalink, sci.

> > The I/O, NFS and
> > system network does not need low latency,

well, sort of.  I can imagine workloads (perhaps bio-database stuff)
that might take real advantage of lower-latency networks for IO.  but it's
also quite easy to see IO workloads that would exceed the bandwidth that 
a single GBE offers (say 80 MB/s).  and there are storage systems that
can actually drive many high-bandwidth links (Lustre, DDN, etc).
I'm jaded, but I do think of IB more as a faster, cheaper SAN than 
as an MPI-oriented low-latency interconnect ;)
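
The bandwidth point above can be made concrete with a little arithmetic. This is only a rough sketch: the 80 MB/s per gigabit link is the figure quoted above, while the 400 MB/s target, the function names, and the assumption that links aggregate perfectly are all illustrative and not from the original post.

```python
import math

# Rough sustained rate of a single gigabit link, as quoted in the post.
GBE_MB_PER_S = 80.0


def links_needed(target_mb_per_s, per_link_mb_per_s=GBE_MB_PER_S):
    """Smallest number of links whose combined rate meets the target.

    Assumes ideal aggregation across links, which real bonding or
    striping setups will not quite reach.
    """
    return math.ceil(target_mb_per_s / per_link_mb_per_s)


def transfer_seconds(size_gb, rate_mb_per_s=GBE_MB_PER_S):
    """Seconds to move size_gb at the given rate (assumes 1 GB = 1000 MB)."""
    return size_gb * 1000.0 / rate_mb_per_s


if __name__ == "__main__":
    # Hypothetical storage target of 400 MB/s and a 100 GB dataset.
    print(links_needed(400.0))
    print(transfer_seconds(100.0))
```

Under these assumptions, a storage system that can sustain a few hundred MB/s already needs several gigabit links, or one faster link, to avoid the network being the bottleneck.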

> > and so for bargain cost can be
> > added, with the additional ground that it provides a control network to
> > tweak the nodes remotely when the expensive low latency network is down.

I guess.  which MPI-oriented nets are commonly down?  I haven't had any
problems with myri 2g or quadrics.

> Is there a way of characterizing in what proportion a given application
> relies on OpenMP, and how much the application depends on MPI (and hence
> MPI network latency) - other than speaking with application developers
> to get their intuitive feel, that is?  :)

we use logs from our job scheduler(s).  queues are separate for serial,
threaded and parallel/mpi.
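
As a minimal sketch of that kind of tally: the record layout below (queue name, CPU count, wallclock hours) is hypothetical, and a real scheduler's accounting log would need its own parsing step first. The queue names are placeholders for whatever serial, threaded, and MPI queues a site actually runs.

```python
from collections import defaultdict


def cpu_hour_shares(records):
    """Return each queue's fraction of total CPU-hours.

    records: iterable of (queue, cpus, wall_hours) tuples, a hypothetical
    format standing in for parsed scheduler accounting entries.
    """
    totals = defaultdict(float)
    for queue, cpus, wall_hours in records:
        totals[queue] += cpus * wall_hours
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {}
    return {queue: hours / grand_total for queue, hours in totals.items()}


if __name__ == "__main__":
    # Made-up sample jobs, for illustration only.
    jobs = [
        ("serial", 1, 10.0),
        ("threaded", 4, 5.0),
        ("mpi", 14, 5.0),
    ]
    for queue, share in sorted(cpu_hour_shares(jobs).items()):
        print(f"{queue}: {share:.0%}")
```

Weighting by CPU-hours rather than by job count keeps a handful of large MPI runs from being drowned out by many short serial jobs.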

> We're looking to buy a Gigabit Ethernet network for the MPI on this, but
> if that's obscenely high latency, and the primary application the
> cluster's being purchased for is heavily dependent on MPI, then we might

well, "heavily dependent" isn't really the same as "latency sensitive".
I find surprisingly many users who are not unhappy with gigabit until they
scale above moderate (say, 16-64) numbers of CPUs.
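
One way to see the difference between "heavily dependent" and "latency sensitive" is the usual first-order cost model, message time = latency + size / bandwidth. The sketch below is illustrative only: the 50 us gigabit latency is an assumed round number, not a measurement from this thread, and the 80 MB/s bandwidth is the figure quoted earlier in the post.

```python
def message_time_us(size_bytes, latency_us, bandwidth_mb_per_s):
    """Estimated one-way message time in microseconds.

    With 1 MB = 10**6 bytes, 1 MB/s is exactly 1 byte per microsecond,
    so size_bytes / bandwidth_mb_per_s is already in microseconds.
    """
    return latency_us + size_bytes / bandwidth_mb_per_s


def latency_fraction(size_bytes, latency_us, bandwidth_mb_per_s):
    """Share of the estimated message time that is pure latency."""
    total = message_time_us(size_bytes, latency_us, bandwidth_mb_per_s)
    return latency_us / total


if __name__ == "__main__":
    GBE_LATENCY_US = 50.0   # assumed value, for illustration only
    GBE_MB_PER_S = 80.0     # figure quoted earlier in the post
    for size in (8, 8_000, 8_000_000):
        frac = latency_fraction(size, GBE_LATENCY_US, GBE_MB_PER_S)
        print(f"{size} bytes: {frac:.1%} of the time is latency")
```

In this model, an application that exchanges a few large messages spends almost all of its communication time on bandwidth, so a low-latency interconnect buys little; one that exchanges many small messages, which tends to happen as the CPU count grows, is dominated by latency.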

regards, mark hahn.



