[Beowulf] Haswell as supercomputer microprocessors
Prentice Bisbal
prentice.bisbal at rutgers.edu
Mon Aug 3 08:10:43 PDT 2015
The processor in the IBM BG/Q is actually a POWER A2.[1] I never
understood why Top500 listed them as BQC. The POWER A2 processor
actually has 18 cores: 16 for computations, 1 for the OS itself, and 1
'spare'. I believe the spare is not a hot spare, but is there to
increase the yield in chip manufacturing. If there are 18 usable cores
on the chip, one is disabled. If one core is not usable, well, they
still have the 17 they were hoping for. (This is what I heard, but I
don't remember who the source was or how credible it was. If this is
wrong, someone please correct me!).
I wouldn't call the core reserved for the OS redundant. It actually
improves the performance of the total system, as documented by the
well-known 'ASCI Q' paper [2].
Now to answer your question, the answer is yes. I highly recommend you
read [2] for a good explanation of why (the authors did a better job
explaining it than I can in a quick e-mail). However, the improvement in
performance increases with the size of the cluster, so it probably won't
be noticeable on small clusters.
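
To make the size dependence concrete, here is a toy model (an illustration only, not taken from [2]). It assumes a bulk-synchronous job in which every node must finish a step before any can proceed, and that each node is independently interrupted by OS activity during a step with some small probability. The function name and the value 0.001 are made up for the example.

```python
def prob_step_delayed(p_node: float, n_nodes: int) -> float:
    """Probability that at least one of n_nodes is interrupted during a step.

    Assumes interruptions are independent across nodes, each with
    probability p_node per step (a simplifying assumption).
    """
    return 1.0 - (1.0 - p_node) ** n_nodes


# p_node = 0.001 is an arbitrary illustrative value, not a measurement.
for n in (16, 1024, 65536):
    print(f"{n:6d} nodes: P(step delayed) = {prob_step_delayed(0.001, n):.3f}")
```

Under these assumptions the chance that some node holds up the whole step grows quickly with node count, which is why keeping OS activity off the compute cores matters more on large machines.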
In addition to dedicating a single core for the OS, you also want to
reduce OS 'noise' (also called 'jitter') as much as possible by
reducing services on the compute nodes. You can do this by turning off or
uninstalling unnecessary services and building a custom kernel that has
only the services and hardware support needed by your cluster. This is
the idea behind the very minimal compute-node kernel (CNK) of the
Blue Gene nodes. This is an active area of research, with many different
groups working on it:
https://en.wikipedia.org/wiki/Lightweight_Kernel_Operating_System
https://en.wikipedia.org/wiki/Compute_Node_Linux
http://www.mcs.anl.gov/research/projects/zeptoos/
http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=323279
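
As for how to keep computations off the reserved cores on a stock Linux node, one option is CPU affinity. The sketch below is a minimal example, not a recommendation from any of the references above. It assumes the highest-numbered logical CPUs are the ones left to the OS, and the helper names split_cores and pin_to_compute_cores are hypothetical. os.sched_getaffinity and os.sched_setaffinity are standard Python calls, available on Linux only.

```python
import os


def split_cores(available, n_reserved=2):
    """Split CPU ids into (compute, reserved) sets.

    The n_reserved highest-numbered CPUs are left for the OS; this
    choice is an assumption made for the example.
    """
    cpus = sorted(available)
    if n_reserved < 0 or n_reserved >= len(cpus):
        raise ValueError("n_reserved must be in [0, number of CPUs)")
    cut = len(cpus) - n_reserved
    return set(cpus[:cut]), set(cpus[cut:])


def pin_to_compute_cores(n_reserved=2):
    """Restrict the calling process (and its future children) to the compute CPUs."""
    compute, _ = split_cores(os.sched_getaffinity(0), n_reserved)
    os.sched_setaffinity(0, compute)  # pid 0 means the calling process
    return compute
```

On an 18-core node, split_cores(range(18)) gives CPUs 0-15 for computation and 16-17 for everything else. Note that affinity only constrains the processes you pin; kernel threads and daemons can still run on the compute CPUs unless they are moved away by other means, such as cpusets or the isolcpus boot parameter.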
[1] http://www-01.ibm.com/common/ssi/cgi-bin/ssialias?subtype=SP&infotype=PM&appname=STGE_DC_DC_USEN&htmlfid=DCD12345USEN&attachment=DCD12345USEN.PDF
[2] http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=1592958
Prentice Bisbal
Systems Programmer/Administrator
Office of Instructional and Research Technology
Rutgers University
http://oirt.rutgers.edu
On 08/03/2015 05:06 AM, Mikhail Kuzminsky wrote:
> New special supercomputer microprocessors (like IBM Power BQC and
> Fujitsu SPARC64 XIfx) have 2**N +2 cores (N=4 for 1st, N=5 for 2nd),
> where 2 last cores are redundant, not for computations, but only for
> other work w/Linux or even for replacing of failed computational core.
>
> Current Intel Haswell E5 v3 may also have 18 = 2**4 +2 cores. Is
> there some sense to try POWER BQC or SPARC64 XIfx ideas (not exactly),
> and use only 16 Haswell cores for parallel computations ? If the
> answer is "yes", then how to use this way under Linux ?
>
> Mikhail Kuzminsky,
> Zelinsky Institute of Organic Chemistry RAS,
> Moscow