[Beowulf] Can one Infiniband net support MPI and a parallel filesystem?
Ashley Pittman
apittman at concurrent-thinking.com
Wed Aug 13 03:29:05 PDT 2008
On Tue, 2008-08-12 at 12:09 -0600, Craig Tierney wrote:
> Chris Samuel wrote:
> > ----- "I Kozin (Igor)" <i.kozin at dl.ac.uk> wrote:
> > But that assumes you're not sharing a node with other
> > jobs that may well be doing I/O.
> >
> I am wondering, who shares nodes in cluster systems with
> MPI codes?
In my experience, almost everyone. In practice, though, most jobs ask for
even numbers of CPUs, so larger jobs rarely get scheduled this way.
> We never have shared nodes for codes that need
> multiple cores since we built our first SMP cluster
> in 2001. The contention for shared resources (like memory
> bandwidth and disk IO) would lead to unpredictable code performance.
Unpredictable maybe, but if the alternative is to not run at all then
it's still a win. What you wouldn't want is a small number of processes
from a big job sharing a node with a resource-hogging job and slowing
down the entire big job; however, I've never seen this happen in the
wild.
> Also, a poorly behaved program can cause the other codes on
> that node to crash (which we don't want).
It goes without saying that this shouldn't be able to happen.
Ashley.