[Beowulf] multi-threading vs. MPI
Tom Elken
tom.elken at qlogic.com
Sat Dec 8 09:51:48 PST 2007
Don,
You asked:
"... AMD X2 5400+ in the Master node (dual core) and three AMD X2 4000+
dual core processors enclosed in inexpensive boxes. .... I believe, some
hyperthreading between the dual cores so what is the story about how the
dual cores can be addressed individually but still have hyperthreading
between the dual cores?"
There is no hyperthreading (hardware threading) in AMD CPUs. Each core
appears as a separate CPU to the operating system and is treated as
such by MPI libraries. So you can happily run your Microwulf with
two MPI processes per node and get good performance. The thrust of the
discussion is that the average user can ignore software
threading between the cores of a node and just use MPI to obtain good
parallel speed-ups.
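To make the "each core is a separate CPU" point concrete, here is a minimal sketch in Python. It asks the OS how many CPUs it sees and lays out a hostfile that gives each node two slots, so that an MPI launcher would start two processes per node. The host names are hypothetical, and the `slots=` hostfile syntax assumes Open MPI; other MPI implementations use different formats.

```python
import os


def hostfile_lines(hosts, slots_per_host):
    """Return Open MPI style hostfile lines, one per node."""
    return [f"{host} slots={slots_per_host}" for host in hosts]


# Hypothetical host names for a 4-node Microwulf-style cluster.
HOSTS = ["master", "node1", "node2", "node3"]
CORES_PER_NODE = 2  # each AMD X2 exposes two cores to the OS

lines = hostfile_lines(HOSTS, CORES_PER_NODE)
total_ranks = len(HOSTS) * CORES_PER_NODE

if __name__ == "__main__":
    # Logical CPUs the OS reports on this machine (None if undetermined).
    print("CPUs visible to the OS here:", os.cpu_count())
    print("\n".join(lines))
    print("total MPI processes:", total_ranks)
```

With four dual-core nodes this comes to 8 MPI processes, one per core, with no thread programming involved.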
-Tom
________________________________
From: beowulf-bounces at beowulf.org
[mailto:beowulf-bounces at beowulf.org] On Behalf Of Donald Shillady
Sent: Friday, December 07, 2007 4:23 PM
To: richard.walsh at comcast.net; Toon Knapen; BeowulfMailing List
Subject: RE: [Beowulf] multi-threading vs. MPI
This is a very interesting discussion to me. I have started to
purchase components for an 8 core microWulf based on the Calvin College
microWulf constructed by Prof. Joel Adams and his student except I will
use slightly faster cores with an AMD X2 5400+ in the Master node (dual
core) and three AMD X2 4000+ dual core processors enclosed in
inexpensive boxes. The Master node has an MSI K9N SLI Platinum
motherboard which has two Gigabit ports so perhaps the initial
configuration with three satellite dual-core CPUs can be extended to a
second set of boxes later. All these AM2-socket CPUs are dual core and
apparently Prof. Adams was able to address them in the microWulf as
individual cores but there is, I believe, some hyperthreading between
the dual cores so what is the story about how the dual cores can be
addressed individually but still have hyperthreading between the dual
cores? I am an experienced programmer for Von Neumann architectures and a
total novice on parallel systems but as I build the microWulf I wonder
if MPI will decouple the hyperthreading or is it not there? From what
little I have learned so far the microWulf switch depends on the
relatively slow Gigabit Ethernet so there is probably time within each
dual core CPU for hyperthreading to occur, if indeed the AMD X2 dual
cores provide for hyperthreading. Sorry to ask such
a dumb question but I am trying to learn.
Don Shillady
Emeritus Professor of Chemistry, VCU
Ashland Va (working at home)
________________________________
From: richard.walsh at comcast.net
To: toon.knapen at gmail.com; beowulf at beowulf.org
Subject: Re: [Beowulf] multi-threading vs. MPI
Date: Fri, 7 Dec 2007 22:15:25 +0000
CC:
-------------- Original message --------------
From: "Toon Knapen" <toon.knapen at gmail.com>
How come there is almost unanimous agreement in
the beowulf-community while the rest are almost unanimously convinced of
the opposite? Are we just patting ourselves on the back or is MP not
sufficiently disseminated or ... ?
Mmm ... I think the answer to this is that the
rest of the world (the non-HPC world) is in a time
warp. HPC went through its SMP-threads phase in
the early-mid 1990s with OpenMP, and then we needed a more scalable
approach (MPI). Now that multi-core and multi-socket have brought
parallelism to the rest of the Universe, SMP-based parallelism has had a
resurgence ... this has also naturally caused some in HPC to revisit the
question as nodes have fattened.
The allure of a programming model that is
intuitive, expressive, symbolically light-weight,
and provides a way to manage the latency
variance across memory partitions is irresistible.
I kind of like the CAF extension to Fortran and
the concept of co-arrays. A co-array is
an array of identical normal arrays, one
per active image/process. They are declared like this:
real, dimension (N) [*] :: X, Y
If the program is run on 8
cores/processors/images the * becomes 8, and 8 1D arrays of size
N are created for each co-array, one per processor. In any
reference to the local component of the co-array
(the image on the processor referencing it), you
can drop the []s ... all other (remote) references
must include them. This is symbolically light,
but reminds the programmer of every costly
non-local reference with the presence of the []s in
the assignment or operation. There is much
more to it than that of course, but as the
performance gap between carefully constructed
MPI applications and CAF compiled code shrinks I
can see the latter gaining some traction
purely for reasons of programming elegance.
If you accept the notion that most MPI
programs are written at a B- level in terms of
efficiency then the idea of the gap closing may not
be so far-fetched. CAF is supposed to be
included in the Fortran 2008 standard.
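As a rough illustration of the local versus remote distinction in the co-array description above, here is a toy model in Python. It is not CAF and involves no real parallelism; the class name `CoArray` and its methods are hypothetical, made up for this sketch. It keeps one ordinary array per image and counts every reference that names another image, which is the cost the []s are meant to make visible.

```python
class CoArray:
    """Toy model of a co-array: one ordinary array of length n per image."""

    def __init__(self, n, num_images):
        self.num_images = num_images
        self._data = [[0.0] * n for _ in range(num_images)]
        self.remote_refs = 0  # references that named a different image

    def local(self, me):
        """Array owned by image `me` (like writing X with no [] in CAF)."""
        return self._data[me - 1]  # images are numbered from 1

    def at(self, me, image):
        """Array on `image` as referenced from image `me` (like X(:)[image])."""
        if image != me:
            self.remote_refs += 1
        return self._data[image - 1]


# Eight images, each holding its own length-4 copy of x.
x = CoArray(n=4, num_images=8)
for me in range(1, 9):
    x.local(me)[:] = [float(me)] * 4  # each image fills its own copy

# Image 1 reads the first element held by image 2.
value = x.at(1, 2)[0]
```

Only the last line touches another image's data, and it is also the only one that has to name an image explicitly.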
rbw
--
"Making predictions is hard, especially about
the future."
Niels Bohr
--
Richard Walsh
Thrashing River Consulting--
5605 Alameda St.
Shoreview, MN 55126
--Forwarded Message Attachment--
From: toon.knapen at gmail.com
To: beowulf at beowulf.org
Subject: [Beowulf] multi-threading vs. MPI
Date: Fri, 7 Dec 2007 20:07:32 +0000