[Beowulf] Re: Opteron 275 performance

Wed Jul 27 11:09:32 PDT 2005

Steve,
It depends on what your software is sensitive to.

If that's CPU speed, it will run great on all of these systems,
and you can skip the rest of what I wrote below.

If the most important issue is memory latency then read below:

Memory latency, when all 4 processors are each busy with their own
TLB-thrashing job on their own memory, is about 200 ns on the Opteron
(1 GB RAM).

The Origin3000 series, when testing just 1 CPU against its own memory, has
a memory latency of 280 ns.

So your jobs will run just as well on the Opteron, if memory latency is
what matters to you.

Please note that on a 16-CPU Altix3000-3800 the latency to RAM, when using
250 MB of RAM, grows to around 700 ns.

By comparison, a quad Opteron with 1.8 GHz dual-core CPUs has 234 ns
latency (250 MB of RAM per CPU).
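
The post does not say how these latencies were measured. One common way
to get numbers of this kind is a dependent pointer chase through a buffer
much larger than the caches, visited in random order so that nearly every
step misses both cache and TLB. The sketch below only shows that structure;
the names build_chain, chase and time_per_step_ns are made up for this
example. In Python the time per step is dominated by interpreter overhead,
so its output is not comparable to the ns figures above. A real test would
be written in C and run as one process per CPU.

```python
import random
import time


def build_chain(n, seed=0):
    """Return a list where chain[i] is the next index to visit.

    Sattolo's algorithm yields a single cycle covering all n slots, so the
    walk cannot fall into a short loop that would stay in cache.
    """
    rng = random.Random(seed)
    chain = list(range(n))
    for i in range(n - 1, 0, -1):
        j = rng.randrange(i)  # 0 <= j < i, never i itself
        chain[i], chain[j] = chain[j], chain[i]
    return chain


def chase(chain, steps, start=0):
    """Follow the chain for `steps` dependent loads; return the final index."""
    idx = start
    for _ in range(steps):
        idx = chain[idx]  # the next index depends on the previous load
    return idx


def time_per_step_ns(chain, steps):
    """Average wall-clock ns per step, interpreter overhead included."""
    t0 = time.perf_counter()
    chase(chain, steps)
    return (time.perf_counter() - t0) * 1e9 / steps


if __name__ == "__main__":
    chain = build_chain(1 << 20)  # about one million slots (assumed size)
    print(f"{time_per_step_ns(chain, 1 << 20):.1f} ns per step")
```

Because every load depends on the result of the previous one, the CPU
cannot overlap the misses, which is why this pattern exposes latency
rather than bandwidth.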

We are running a plain Ubuntu distribution and only upgraded its
kernel to the latest default SMP kernel that Ubuntu builds for AMD64,
which is, by the way:

diep@ubuntu:/egtb$ uname -a
Linux ubuntu 2.6.10-5-amd64-k8-smp #1 SMP Fri Jun 24 17:23:48 UTC 2005
x86_64 GNU/Linux

In reality it's a NUMA kernel, so putting SMP in the name is confusing.

I have to add that single-CPU memory latency on the SINGLE-core Opterons
is a LOT better. I measure 111 ns for a single CPU on a dual Opteron
2.2 GHz single-core machine with the same Ubuntu and kernel installed.

So if your only worry is TLB-thrashing access to main memory, then 2
dual Opteron machines will outperform anything, thanks to their
memory latency.

On the other hand, if only the speed of the CPU matters, then what are you
worried about? A quad dual-core Opteron 2.2 GHz will simply outperform an
8-processor Itanium2 by a silly margin for the average application.

Scaling of the quad dual-core Opteron for an 8-CPU job will be a tad worse
than on the Itanium2: something like 7.80 versus 8.0 for the Altix3200.

Yet Diep reaches around 800k nps (nodes per second) on the Itanium2
1.5 GHz, versus 1+ million on a quad dual-core Opteron 1.8 GHz.
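
To put the scaling and nps figures above side by side, here is a small
sketch of the arithmetic. It assumes the 7.80 and 8.0 figures are speedups
of an 8-CPU job over a single CPU, and it takes "1+ million" as exactly
1,000,000 nps, which is only a lower bound. The function names are made up
for this example.

```python
def parallel_efficiency(speedup, cpus):
    """Fraction of ideal linear scaling actually achieved."""
    return speedup / cpus


def throughput_ratio(nps_a, nps_b):
    """How many times faster system A searches than system B."""
    return nps_a / nps_b


CPUS = 8
OPTERON_SPEEDUP = 7.80    # quad dual-core Opteron, from the post
ITANIUM2_SPEEDUP = 8.0    # Itanium2 (Altix), from the post
OPTERON_NPS = 1_000_000   # "1+ million": lower bound, an assumption
ITANIUM2_NPS = 800_000    # "around 800k", from the post

if __name__ == "__main__":
    opt_eff = parallel_efficiency(OPTERON_SPEEDUP, CPUS)
    ita_eff = parallel_efficiency(ITANIUM2_SPEEDUP, CPUS)
    print(f"Opteron efficiency:  {opt_eff:.1%}")
    print(f"Itanium2 efficiency: {ita_eff:.1%}")
    ratio = throughput_ratio(OPTERON_NPS, ITANIUM2_NPS)
    print(f"Opteron/Itanium2 nps ratio: at least {ratio:.2f}x")
```

Under those assumptions the Opteron gives up 2.5% of ideal scaling but
still comes out at least 1.25 times faster in absolute nps.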

At 12:43 PM 7/27/2005 -0400, Steve Cousins wrote:
>
>On Thu, 14 Jul 2005 11:25:12 +0100 Igor Kozin wrote:
>
>>  But now for 4cores/2CPUs per Opteron node to force the using of
>> > only 2 cores (from 4), by 1 for each chip, we'll need to have
>> > cpu affinity support in Linux.
>> 
>> Mikhail,
>> you can use "taskset" for that purpose. 
>> For example, (perhaps not in the most elegant form)
>>         mpiexec  -n 1 taskset -c 0 $code : -n 1 taskset -c 2 $code
>> But I doubt you want to let the idle cores to do something else 
>> in the mean time. However small you will generally see an increase 
>> in performance if you use all the cores.
>
>We are considering getting a Dual Dual-Core Opteron system vs. two Dual
>Opteron systems.  We like the ability to use all four cores on one model
>but a lot of what we'll do is have two models running at the same time,
>each using two cores.  
>
>We are worried that running two models on one system with four cores (each
>model using two cores) will not work as well as using two systems, each
>with two cores/cpu's.  Is this what you were refering to (Igor) when you
>wrote:
>
>> But I doubt you want to let the idle cores to do something else
>> in the mean time. 
>
>We have an 8 CPU SGI Origin 3200 that has no problem doing this sort of
>thing.  I'm just curious what the implications are of doing this with the
>Dual Core Opteron cpu's.  
>
>Thanks,
>
>Steve 
>______________________________________________________________________
> Steve Cousins, Ocean Modeling Group    Email: cousins at umit.maine.edu
> Marine Sciences, 208 Libby Hall        http://rocky.umeoce.maine.edu
> Univ. of Maine, Orono, ME 04469        Phone: (207) 581-4302