[Beowulf] Can one Infiniband net support MPI and a parallel filesystem?
Craig Tierney
Craig.Tierney at noaa.gov
Wed Aug 13 07:55:19 PDT 2008
Joe Landman wrote:
> Craig Tierney wrote:
>> Chris Samuel wrote:
>>> ----- "I Kozin (Igor)" <i.kozin at dl.ac.uk> wrote:
>>>
>>>>> Generally speaking, MPI programs will not be fetching/writing data
>>>>> from/to storage at the same time they are doing MPI calls, so there
>>>>> tends not to be very much contention to worry about at the node
>>>>> level.
>>>> I tend to agree with this.
>>>
>>> But that assumes you're not sharing a node with other
>>> jobs that may well be doing I/O.
>>>
>>> cheers,
>>> Chris
>>
>> I am wondering, who shares nodes in cluster systems with
>> MPI codes? We have never shared nodes for codes that need
>
> The vast majority of our customers/users do. With limited resources,
> they have to balance performance against cost and opportunity cost.
>
> Sadly, not every user has an infinite budget to invest in contention-free
> hardware (nodes, fabrics, or disks). So they have to maximize the
> utilization of what they have, while (hopefully) not trashing the
> efficiency too badly.
>
>> multiple cores since we built our first SMP cluster
>> in 2001. The contention for shared resources (like memory
>> bandwidth and disk IO) would lead to unpredictable code performance.
>
> Yes it does. As does OS jitter and other issues.
>
>> Also, a poorly behaved program can cause the other codes on
>> that node to crash (which we don't want).
>
> Yes this happens as well, but some users simply have no choice.
>
>>
>> Even at TACC (62000+ cores) with 16 cores per node, nodes
>> are dedicated to jobs.
>
> I think every user would love to run on a TACC-like system. I think
> most users have a budget for something less than 1/100th the size. It's
> easy to forget how much resource (un)availability constrains actions
> when you have very large resources to work with.
>
TACC probably wasn't a good example for the "rest of us". It hasn't been
difficult to dedicate nodes to jobs when the number of cores per node was
2 or 4. We now have some 8-core nodes, and we are wondering whether the
policy of not sharing nodes is going to continue, or at least be modified
to minimize waste.
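
To make the earlier point about I/O concrete: most of our codes follow a
pattern roughly like the sketch below. This is purely illustrative, not
anyone's production code; the filename, step counts, and checkpoint
interval are made up, and I'm assuming an ordinary mpicc build. MPI
traffic happens inside the timestep loop and the filesystem is only
touched during an occasional checkpoint phase, so a single job rarely
loads the fabric with communication and storage traffic at the same time.

/* Sketch of the usual compute / communicate / checkpoint pattern.
 * Build with something like:  mpicc -std=c99 -o phases phases.c
 * The constants and file name below are placeholders for illustration.
 */
#include <mpi.h>
#include <stdio.h>

#define NSTEPS           100
#define CHECKPOINT_EVERY  25   /* storage I/O only every N steps */

int main(int argc, char **argv)
{
    int rank, nprocs;
    double local = 0.0, global = 0.0;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &nprocs);

    for (int step = 0; step < NSTEPS; step++) {
        /* 1. Compute phase: purely local work, no network traffic. */
        local = (double)rank + step;

        /* 2. Communication phase: MPI traffic on the fabric,
         *    but no storage traffic from this job. */
        MPI_Allreduce(&local, &global, 1, MPI_DOUBLE, MPI_SUM,
                      MPI_COMM_WORLD);

        /* 3. I/O phase: only every CHECKPOINT_EVERY steps, and only
         *    after the communication above has completed, so the job
         *    is not hitting MPI and the filesystem at once. */
        if (step % CHECKPOINT_EVERY == 0 && rank == 0) {
            FILE *fp = fopen("checkpoint.dat", "w");
            if (fp) {
                fprintf(fp, "step %d ranks %d sum %f\n",
                        step, nprocs, global);
                fclose(fp);
            }
        }
    }

    MPI_Finalize();
    return 0;
}

The catch, as Chris noted, is that this separation only holds within one
job. Put two jobs on the same node and one job's checkpoint phase can
land right on top of the other's communication phase, which is exactly
the kind of contention that led us to dedicate nodes in the first place.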
Craig
> Joe
>
>
--
Craig Tierney (craig.tierney at noaa.gov)