[Beowulf] Can one Infiniband net support MPI and a parallel file system?

Mark Hahn hahn at mcmaster.ca
Tue Aug 5 16:37:39 PDT 2008


> Is anybody using Infiniband to provide both
> MPI connection and parallel file system services on a Beowulf cluster?

of course!  many people have a strong opinion that sharing one network
between file and mpi traffic is a bad thing, but I haven't seen anyone
actually produce numbers.  obviously contention increases the chance
that a latency-sensitive operation (say, a small synchronous mpi message)
will be hurt by a stream of large file packets - and more so when the
fabric is not full-bisection, even if it's only multiple cores sharing a
node's single interface.

but consider gigabit - a 1500-byte packet consumes about 12 us of wire
time, and most people using gigabit for mpi expect zero-byte latency
quite a lot higher than that (say 50 us).  by contrast, a max-size packet
on old-gen SDR IB takes about 4 us of wire time, roughly the same as its
0-byte latency.
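
to make those numbers concrete, here's a trivial back-of-envelope
calculation (a sketch only - the 1 Gb/s link with a 1500-byte frame and
the ~8 Gb/s SDR data rate with a 4 KB MTU are the usual nominal figures,
not measurements on any particular fabric):

/* back-of-envelope wire (serialization) time for one packet */
#include <stdio.h>

static double wire_time_us(double payload_bytes, double gbit_per_s)
{
    /* bits on the wire divided by link rate, converted to microseconds */
    return payload_bytes * 8.0 / (gbit_per_s * 1e3);
}

int main(void)
{
    /* gigabit ethernet: 1500-byte frame at 1 Gb/s  -> ~12 us */
    printf("GigE 1500B frame : %.1f us\n", wire_time_us(1500.0, 1.0));
    /* SDR IB: 4096-byte MTU at ~8 Gb/s data rate   -> ~4 us  */
    printf("SDR IB 4096B MTU : %.1f us\n", wire_time_us(4096.0, 8.0));
    return 0;
}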

as has been pointed out here recently, the fabric's performance will drop
pretty significantly once links become contended; that would make the
latency-vs-bandwidth conflict more painful.  (it also affects some
fabrics more than others, depending on their ability to adjust routes
dynamically.)

IMO, you have to ponder in your heart whether your expected workload 
will suffer from these issues.  there is really no general rule, since 
workloads vary so widely in latency sensitivity and in bandwidth demands,
all convolved with the fabric properties...

if you have a well-defined workload, why not measure it?  run an mpi
app that reports some kind of performance feedback while applying an
increasingly heavy large-transfer NFS load - something like the sketch
below...
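
for instance, a minimal ping-pong like the following would do for the mpi
side - the 8-byte message, the iteration count, and the idea of driving
the file load with dd to the NFS mount from another shell are just
placeholder assumptions, not a tuned benchmark:

/* minimal MPI ping-pong: report small-message latency while file-system
 * load is applied from elsewhere (e.g. dd to the NFS/parallel-fs mount) */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int rank, iters = 10000;
    char buf[8] = {0};               /* small, latency-sensitive message */

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    MPI_Barrier(MPI_COMM_WORLD);
    double t0 = MPI_Wtime();
    for (int i = 0; i < iters; i++) {
        if (rank == 0) {
            MPI_Send(buf, sizeof buf, MPI_CHAR, 1, 0, MPI_COMM_WORLD);
            MPI_Recv(buf, sizeof buf, MPI_CHAR, 1, 0, MPI_COMM_WORLD,
                     MPI_STATUS_IGNORE);
        } else if (rank == 1) {
            MPI_Recv(buf, sizeof buf, MPI_CHAR, 0, 0, MPI_COMM_WORLD,
                     MPI_STATUS_IGNORE);
            MPI_Send(buf, sizeof buf, MPI_CHAR, 0, 0, MPI_COMM_WORLD);
        }
    }
    double t1 = MPI_Wtime();

    if (rank == 0)
        printf("half round-trip latency: %.2f us\n",
               (t1 - t0) / iters / 2.0 * 1e6);

    MPI_Finalize();
    return 0;
}

launch it with two ranks on two nodes (mpirun -np 2 ...) and watch how
the reported latency changes as you ramp up the file traffic.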

regards, mark hahn.


