[Beowulf] Re: Opteron 275 performance

Robert G. Brown rgb at phy.duke.edu
Wed Jul 27 20:25:03 PDT 2005

Steve Cousins writes:

> On Wed, 27 Jul 2005, Joe Landman wrote:
>> Hi Steve:
>>    Not knowing the details of your calculations might be an issue, but 
>> you can read about our experiences with a number of chemistry and 
>> informatics codes on dual core Opteron systems.  See 
>> http://enterprise2.amd.com/downloadables/Dual_Core_Performance.pdf for 
>> more details.
>> Joe
> Hi Joe,
> Thanks a lot.  I just took a look and it seems to make a good case for
> getting the Dual Dual Core machine.  
> I'm fairly certain that the memory latency issue that Vincent was warning
> about won't be an issue, although I'm a bit clueless about how to know for
> sure.  How would I go about finding out if our model is TLB thrashing main
> memory?  I feel like I just bit the hook... I don't want to start a huge
> discussion on this but if there are some quick tell-tale signs of it I'd
> be interested to find out.

Why bother with tell-tale signs?  Like I said, your previous post was
dead on the money.  Get a loaner (which can physically be far far away
and should be "free"), install YOUR application and run the only
benchmark or test that matters.

On paper, the memory access schemes used by the Opterons should largely
ameliorate the kind of difficulty encountered with the dual PIII's --
they ought to do better than just divide single processor bandwidth
between two processors at any rate.  You can visit the hypertransport
site and look at white papers, e.g. --


or look at multicore hype (with some useful info mixed in) here:


especially its generic description of DCA (Direct Connect Architecture).
The design was driven by the desire to reduce latency in shared access
situations; HT does this by a fairly complicated interleaving on a
request queue (as best I can tell).  With a dual core dual CPU design in
particular you have to worry about connecting four distinct cores to
each other and to memory and to peripherals.  This ultimately makes it
very difficult to predict whether any given application will scale the
way it "should" in an ideal universe.
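To make the placement question concrete for the two-models-on-four-cores
case, here is a minimal sketch of the two layouts worth comparing.  It
only builds taskset command lines.  The core numbering (cores 0,1 on the
first chip, cores 2,3 on the second) is an ASSUMPTION on my part, as are
the names pinned_command, PLACEMENTS and commands_for; check how your
kernel actually numbers the cores before trusting it.

```python
# Sketch: build taskset command lines for two 2-core models on a 4-core node.
# ASSUMPTION: cores 0,1 sit on chip 0 and cores 2,3 on chip 1.  Verify the
# real numbering on your own machine; it is not guaranteed.

def pinned_command(cores, cmd):
    """Return an argv list that runs cmd restricted to the given cores."""
    cpu_list = ",".join(str(c) for c in cores)
    return ["taskset", "-c", cpu_list] + list(cmd)

# Hypothetical layouts: each entry lists the core set for model A and model B.
PLACEMENTS = {
    "one model per chip": [(0, 1), (2, 3)],
    "each model split across chips": [(0, 2), (1, 3)],
}

def commands_for(placement, cmd):
    """Return one pinned argv list per model for the named placement."""
    return [pinned_command(cores, cmd) for cores in PLACEMENTS[placement]]

if __name__ == "__main__":
    # "./model" is a placeholder for whatever launches your application.
    for name in PLACEMENTS:
        print(name)
        for argv in commands_for(name, ["./model"]):
            print("   ", " ".join(argv))
```

Timing both layouts with your real model would show whether placement
matters for it at all.  If the model is started through an MPI launcher,
whether the ranks inherit the mask depends on the launcher, so treat this
as a starting point only.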

I honestly think that the only way to be SURE your particular
application fits in the probably very broad category of applications
that can scale from one to four cores in nearly constant time is to try
it.  Preferably at several mixes of program scales, to "force" the
processors to interleave memory in all the ways they might ever need to
in your application.  Eventually enough may be learned about the
architecture that somebody can say "yeah, run the X benchmark, and if it
does well so will your application" but I think we aren't quite there yet.
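A rough sketch of what "try it" could look like on the loaner: run one,
two, three and four simultaneous copies of your own job, one per core,
and compare wall-clock times against the single-copy run.  The function
names and the command you pass in are placeholders of mine, and it
assumes taskset is installed and the cores are numbered 0 to 3.  A ratio
near 1.0 means the extra copies are not fighting over memory.

```python
import subprocess
import sys
import time

def time_concurrent(cmd, cores):
    """Start one copy of cmd per core, each pinned with taskset, then
    return the wall-clock seconds until all copies have finished."""
    start = time.perf_counter()
    procs = [subprocess.Popen(["taskset", "-c", str(c)] + list(cmd))
             for c in cores]
    for p in procs:
        p.wait()
    return time.perf_counter() - start

def slowdown(times_by_copies):
    """Map {copies: wall_time} to {copies: wall_time / single_copy_time}.
    1.0 is ideal (constant time); larger means contention."""
    base = times_by_copies[1]
    return {n: t / base for n, t in sorted(times_by_copies.items())}

if __name__ == "__main__" and len(sys.argv) > 1:
    cmd = sys.argv[1:]  # your own application and its arguments
    # ASSUMPTION: a four-core node with cores numbered 0..3.
    times = {n: time_concurrent(cmd, range(n)) for n in (1, 2, 3, 4)}
    for n, ratio in slowdown(times).items():
        print(f"{n} copies: {times[n]:.1f} s, {ratio:.2f}x single-copy time")
```

Repeat it at a few different problem sizes, since a job that fits in
cache will tell you nothing about one that doesn't.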


> Thanks,
> Steve
>> Steve Cousins wrote:
>> >> On Thu, 14 Jul 2005 11:25:12 +0100 Igor Kozin wrote:
>> >> 
>> >> 
>> >>>> But now for 4cores/2CPUs per Opteron node to force the using of
>> >>>> only 2 cores (from 4), by 1 for each chip, we'll need to have
>> >>>> cpu affinity support in Linux.
>> >>>
>> >>>Mikhail,
>> >>>you can use "taskset" for that purpose. 
>> >>>For example, (perhaps not in the most elegant form)
>> >>>        mpiexec  -n 1 taskset -c 0 $code : -n 1 taskset -c 2 $code
>> >>>But I doubt you want to let the idle cores to do something else 
>> >>>in the mean time. However small you will generally see an increase 
>> >>>in performance if you use all the cores.
>> >> 
>> >> 
>> >> We are considering getting a Dual Dual-Core Opteron system vs. two Dual
>> >> Opteron systems.  We like the ability to use all four cores on one model
>> >> but a lot of what we'll do is have two models running at the same time,
>> >> each using two cores.  
>> >> 
>> >> We are worried that running two models on one system with four cores (each
>> >> model using two cores) will not work as well as using two systems, each
>> >> with two cores/cpu's.  Is this what you were referring to (Igor) when you
>> >> wrote:
>> >> 
>> >> 
>> >>>But I doubt you want to let the idle cores to do something else
>> >>>in the mean time. 
>> >> 
>> >> 
>> >> We have an 8 CPU SGI Origin 3200 that has no problem doing this sort of
>> >> thing.  I'm just curious what the implications are of doing this with the
>> >> Dual Core Opteron cpu's.  
>> >> 
>> >> Thanks,
>> >> 
>> >> Steve 
>> >> ______________________________________________________________________
>> >>  Steve Cousins, Ocean Modeling Group    Email: cousins at umit.maine.edu
>> >>  Marine Sciences, 208 Libby Hall        http://rocky.umeoce.maine.edu
>> >>  Univ. of Maine, Orono, ME 04469        Phone: (207) 581-4302
>> >> 
>> >> 
>> >> 
>> >> 
>> >> _______________________________________________
>> >> Beowulf mailing list, Beowulf at beowulf.org
>> >> To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
>> >
>> -- 
>> Joseph Landman, Ph.D
>> Founder and CEO
>> Scalable Informatics LLC,
>> email: landman at scalableinformatics.com
>> web  : http://www.scalableinformatics.com
>> phone: +1 734 786 8423
>> fax  : +1 734 786 8452
>> cell : +1 734 612 4615