[Beowulf] Haswell as supercomputer microprocessors
landman at scalableinformatics.com
Mon Aug 3 06:37:19 PDT 2015
On 08/03/2015 05:06 AM, Mikhail Kuzminsky wrote:
> New special supercomputer microprocessors (like IBM Power BQC and
> Fujitsu SPARC64 XIfx) have 2**N +2 cores (N=4 for 1st, N=5 for 2nd),
> where 2 last cores are redundant, not for computations, but only for
> other work w/Linux or even for replacing of failed computational core.
> Current Intel Haswell E5 v3 may also have 18 = 2**4 +2 cores. Is there
> some sense to try POWER BQC or SPARC64 XIfx ideas (not exactly), and use
> only 16 Haswell cores for parallel computations ? If the answer is
> "yes", then how to use this way under Linux ?
It's possible to do this with some taskset incantation with cpuset
filesystem bits (burnt offerings generally not needed). I don't think
there are "redundant" cores in the Intel product.
It's left as an exercise to the reader to implement though ...
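For what it's worth, here is a minimal sketch of the taskset idea done
from inside a process, using Python's os.sched_setaffinity (Linux only).
The helper names compute_cpus and pin_to_compute_cpus are made up for
illustration, and the choice to reserve the two highest-numbered logical
CPUs is an assumption. With hyperthreading on, logical CPU numbers do
not map one-to-one onto physical cores, so check the topology first.
Note also that this only keeps the job off the reserved CPUs; it does
nothing to keep the OS off the compute CPUs.

```python
import os

def compute_cpus(available, reserved=2):
    """Return the CPUs to compute on, leaving the `reserved`
    highest-numbered CPUs free for the OS (an arbitrary choice)."""
    ordered = sorted(available)
    if reserved <= 0:
        return set(ordered)
    if len(ordered) <= reserved:
        raise ValueError("not enough CPUs to reserve %d" % reserved)
    return set(ordered[:-reserved])

def pin_to_compute_cpus(reserved=2, pid=0):
    """Restrict `pid` (0 means the calling process) to the compute CPUs.
    Child processes started afterwards inherit the mask."""
    allowed = os.sched_getaffinity(pid)    # CPUs we may currently run on
    target = compute_cpus(allowed, reserved)
    os.sched_setaffinity(pid, target)      # same effect as taskset -p
    return target
```

On an 18-CPU mask numbered 0-17, compute_cpus(range(18)) leaves CPUs 16
and 17 alone and returns 0-15.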
More seriously, you can do some of this also with cgroups
https://en.wikipedia.org/wiki/Cgroups which is actually what Docker et
al. do (in part).
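A rough sketch of the cgroups route, assuming the unified (v2) hierarchy
is mounted at /sys/fs/cgroup, the cpuset controller is enabled in the
parent's cgroup.subtree_control, and you are root. The group name
"compute" and both function names are hypothetical. On a cgroup v1
system the file names and layout differ.

```python
from pathlib import Path

def format_cpulist(cpus):
    """Format CPU ids as a kernel-style list, e.g. {0,1,2,5} -> '0-2,5'."""
    ordered = sorted(set(cpus))
    parts = []
    i = 0
    while i < len(ordered):
        j = i
        # Extend the run while the next id is consecutive.
        while j + 1 < len(ordered) and ordered[j + 1] == ordered[j] + 1:
            j += 1
        if i == j:
            parts.append(str(ordered[i]))
        else:
            parts.append("%d-%d" % (ordered[i], ordered[j]))
        i = j + 1
    return ",".join(parts)

def confine_to_cgroup(pid, cpus, group=Path("/sys/fs/cgroup/compute")):
    """Create the (hypothetical) 'compute' group, limit it to `cpus`,
    and move `pid` into it. Needs root and the assumptions above."""
    group.mkdir(exist_ok=True)
    (group / "cpuset.cpus").write_text(format_cpulist(cpus))
    (group / "cgroup.procs").write_text(str(pid))
```

Something like confine_to_cgroup(os.getpid(), range(16)) would then
write "0-15" into cpuset.cpus before moving the process.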
There are many ways to attack this problem.
If you are trying to isolate the OS from the computation, say to reduce
OS jitter impacts upon processes, you might also like to work on setting
interrupt affinity, as well as start working with memory placement
directly (to minimize QPI usage). The issue you will encounter is that
most of the HPC systems with a single HCA/NIC will require IO to/from a
remote (in a NUMA sense) node, which means going over QPI, unless you
have the Intel Infinipath (or Omnipath ... I am not as up on the new
naming as I should be) or a multi-rail config set up specifically to put
one NIC/HCA on each socket.
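To see which socket a NIC hangs off, and to steer one of its interrupts
to CPUs on that socket, something along these lines should work on a
typical Linux box. It is a sketch under assumptions: the function names
are hypothetical, the device is assumed to show up under /sys/class/net
(an IB HCA would be looked up under /sys/class/infiniband instead), the
IRQ number has to be found by hand in /proc/interrupts, and writing the
affinity needs root (irqbalance may also rewrite it behind your back).

```python
from pathlib import Path

def parse_cpulist(text):
    """Parse a kernel CPU list such as '0-8,18-26' into a set of ints."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue
        lo, _, hi = part.partition("-")
        cpus.update(range(int(lo), int(hi or lo) + 1))
    return cpus

def cpus_near_device(netdev):
    """Return the CPUs on the NUMA node that `netdev` is attached to,
    or None if the kernel reports no node (numa_node == -1)."""
    node_file = Path("/sys/class/net/%s/device/numa_node" % netdev)
    node = int(node_file.read_text())
    if node < 0:
        return None
    cpulist = Path("/sys/devices/system/node/node%d/cpulist" % node)
    return parse_cpulist(cpulist.read_text())

def set_irq_affinity(irq, cpus):
    """Allow interrupt `irq` to be serviced only on `cpus` (needs root)."""
    path = Path("/proc/irq/%d/smp_affinity_list" % irq)
    path.write_text(",".join(str(c) for c in sorted(cpus)))
```

Ranks pinned to CPUs outside the set returned by cpus_near_device() are
the ones whose IO crosses QPI.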
The point I am trying (subtly) to make here is that you can spend a lot
of time and effort on optimization here. The question (for this and for
all of the above) is the relative value of doing so. For some codes, OS
jitter is very important, and you should seek to eliminate it. For
others ... not so much.
Joseph Landman, Ph.D
Founder and CEO
Scalable Informatics, Inc.
e: landman at scalableinformatics.com
p: +1 734 786 8423 x121
c: +1 734 612 4615