[Beowulf] bring back 2012?
pbisbal at pppl.gov
Wed Aug 17 07:37:21 PDT 2016
On 08/16/2016 09:18 PM, Stu Midgley wrote:
> oh, indeed the top bin Xeon systems are fast and damn expensive. Even
> when we purchased these AMD systems they were a LOT cheaper than any
> Intel system that could come close in specfp rate performance - let
> alone after 4 years of heavy use.
> One issue is that the AMD systems are far more numa, thus requiring
> tighter programming.
I'm going to have to disagree with the above statement. Before the Intel
Nehalem, the AMD Opterons were true NUMA processors, but the Intel
processors were SMP designs (the *other* SMP - Symmetric
Multiprocessing). Since the Nehalem, Xeons have been NUMA processors,
too, and I don't think it's accurate to say one design is any more NUMA
than the other.
For symmetric multiprocessing, every read/write to main memory took the
same amount of time, but that also meant the memory controller was a
bottleneck, as it could only service one processor at a time. It's a
resource contention issue. To improve performance, you'd have to do
whatever you could to organize your code so that the various cores
weren't all trying to access memory at the same time.
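
To put rough numbers on the contention argument, here's a toy model in
Python. It assumes every core issues one memory request at the same
instant and that a single shared controller serves them strictly one at
a time. The function name and the 100 ns service time are made up for
illustration; real controllers pipeline and reorder requests, so treat
this as a sketch of the trend, not a measurement.

```python
def average_wait_ns(cores: int, service_ns: float) -> float:
    """Mean completion time when `cores` requests arrive at once and one
    shared memory controller serves them one at a time (toy model)."""
    # The k-th request in line completes after k service times (k = 1..cores).
    completion_times = [k * service_ns for k in range(1, cores + 1)]
    return sum(completion_times) / cores


if __name__ == "__main__":
    # Hypothetical 100 ns service time per request.
    for n in (1, 2, 4, 8):
        print(n, "cores:", average_wait_ns(n, 100.0), "ns average")
```

In this model the average wait grows linearly with the number of cores
hitting the controller at once, which is why staggering memory accesses
helps on a shared-controller design.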
With NUMA, on the other hand, each processor has its own memory
controller and can access its own portion of memory very quickly. If it
needs to access memory elsewhere, it takes longer because it has to ask
another processor, or that other processor's memory controller, to perform
the memory operation (I forget the exact low-level implementation
details). Accessing a remote processor's memory takes longer than
accessing your own memory, hence the name Non-Uniform Memory Access. The
advantage of this is that most of the time a processor is accessing its
local memory, so it can use its own memory controller without resource
contention. Sure, you still want to organize your program
to keep as many memory accesses local as you can, but I don't think
that's much different than trying to keep as much data in the local
caches as possible to prevent reads of main memory, which you should be
doing regardless of whether your system is SMP or NUMA.
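
As a back-of-the-envelope illustration of the local-versus-remote
tradeoff, the sketch below computes the average access time as a
weighted mix of local and remote latency. The function name and the
80 ns / 130 ns latencies are assumptions picked for the example, not
figures for any particular Opteron or Xeon.

```python
def mean_access_ns(local_fraction: float, local_ns: float, remote_ns: float) -> float:
    """Average memory access time when `local_fraction` of accesses hit
    the processor's own memory and the rest go to a remote node."""
    if not 0.0 <= local_fraction <= 1.0:
        raise ValueError("local_fraction must be between 0 and 1")
    return local_fraction * local_ns + (1.0 - local_fraction) * remote_ns


if __name__ == "__main__":
    # Hypothetical latencies: 80 ns local, 130 ns remote.
    for frac in (1.0, 0.9, 0.5, 0.0):
        print(f"{frac:.0%} local:", mean_access_ns(frac, 80.0, 130.0), "ns")
```

The closer the local fraction gets to 1, the closer the average gets to
the local latency, which is the same reasoning you'd apply to cache hit
rates.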
I would think SMP needs tighter programming, since you want to reduce
contention for the memory controller as much as possible.
PS - Yes, I know today's systems are actually a mixture of SMP and NUMA,
depending on what level of the architecture you're looking at, so put the
torches and pitchforks away!