[Beowulf] scalability

Fri Dec 11 11:28:40 PST 2009

Howdy!

Gus Correa wrote:
> Hi Chris
> 
> Chris Samuel wrote:
>> ----- "Gus Correa" <gus at ldeo.columbia.edu> wrote:
>>
>>> This is about the same number I've got for an atmospheric model
>>> in a dual-socket dual-core Xeon computer.
>>> Somehow the memory path/bus on these systems is not very efficient,
>>> and saturates when more than two processes do
>>> intensive work concurrently.
>>> A similar computer configuration with dual-socket dual-core
>>> AMD Opterons performed significantly better on the same atmospheric
>>> code (efficiency close to 90%).
>>
>> The issue is that there are Xeons and then there
>> are Xeons.
>>
>> The older Woodcrest/Clovertown type CPUs had the
>> standard Intel bottleneck of a single memory
>> controller for both sockets.
>>
> 
> Yes, that is a fact, but didn't the
> Harpertown generation still have a similar problem?

Yes.

> Amjad's Xeon small cluster machines are dual socket dual core,
> perhaps a bit older than the type I had used here
> (Intel Xeon 5160 3.00GHz) in standalone workstations
> and tested an atmosphere model with the efficiency
> numbers I mentioned above.
> According to Amjad:
> 
> "I have, with my group, a small cluster of about 16 nodes
> (each one with single socket Xeon 3085 or 3110;
> And I face problem of poor scalability. "

What's the application?  WRF may fail to scale even at modest numbers of 
cores if the domain size is sufficiently large.
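
For anyone comparing numbers in this thread, here is a minimal sketch of 
how speedup and parallel efficiency are usually derived from wall-clock 
timings. The function name and the timings below are made up for 
illustration; they are not measurements from Amjad's cluster or anyone 
else's.

```python
def parallel_efficiency(t_serial, t_parallel, n_procs):
    """Return (speedup, efficiency) from wall-clock times in seconds.

    speedup    = t_serial / t_parallel
    efficiency = speedup / n_procs   (1.0 means perfect scaling)
    """
    if t_serial <= 0 or t_parallel <= 0 or n_procs < 1:
        raise ValueError("times must be positive and n_procs >= 1")
    speedup = t_serial / t_parallel
    return speedup, speedup / n_procs


# Hypothetical timings (seconds) for the same problem size on one node.
# Illustrative values only, chosen to mimic a saturated memory bus.
timings = {1: 1000.0, 2: 540.0, 4: 420.0}

for n, t in sorted(timings.items()):
    s, e = parallel_efficiency(timings[1], t, n)
    print(f"{n} procs: speedup {s:.2f}, efficiency {e:.0%}")
```

If efficiency drops sharply as you add processes within a single node 
but the problem size per process is still large, memory bandwidth is a 
more likely suspect than the interconnect.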

Is this an NWP code? (Sorry for coming in late on the discussion... I've 
been hacking on WRFv3.1.1 with Open MPI and PGI and seeing some 
interesting problems.)

gerry

> I lost track of the Intel number/naming convention.
> Are Amjad's and mine Woodcrest?
> Clovertown?
> Harpertown?
> 
>> The newer Nehalem Xeons have HyperTransport^W QPI
>> which involves each socket having its own memory
>> controller with connections to local RAM.
> 
> That has been widely reported, at least in SPEC2000 type of tests.
> Unfortunately I don't have any Nehalem to play with our codes.
> However, please take a look at the ongoing discussion on the OpenMPI
> list about memory issues with Nehalem
> (perhaps combined with later versions of GCC) on MPI programs:
> 
> http://www.open-mpi.org/community/lists/users/2009/12/11462.php
> http://www.open-mpi.org/community/lists/users/2009/12/11499.php
> http://www.open-mpi.org/community/lists/users/2009/12/11500.php
> http://www.open-mpi.org/community/lists/users/2009/12/11516.php
> http://www.open-mpi.org/community/lists/users/2009/12/11515.php
> 
>>
>> This is essentially what AMD have been doing with
>> Opteron for years and why they've traditionally
>> done better than Intel with memory intensive codes.
>>
> 
> Yes, and we're happy with their performance, memory bandwidth
> and scalability on the codes we run (mostly ocean/atmosphere/climate).
> Steady workhorses.
> 
> Not advocating any manufacturer's cause,
> just telling our experience.
> 
>> cheers,
>> Chris
> 
> Cheers,
> Gus Correa
> ---------------------------------------------------------------------
> Gustavo Correa
> Lamont-Doherty Earth Observatory - Columbia University
> Palisades, NY, 10964-8000 - USA
> ---------------------------------------------------------------------