<div dir="ltr"><div>We aren't after average HPC programmers...</div><div><br></div><div>Even good compilers (Intel) are very limited in their optimisations. We got speedups of 2x and 3x by hand-writing SSSE3 instructions on standard Xeons rather than letting the compiler do its thing... Compiler limitations aren't particular to Phi.</div><div><br><div class="gmail_quote"><div dir="ltr">On Wed, Jun 20, 2018 at 2:47 AM Prentice Bisbal &lt;<a href="mailto:pbisbal@pppl.gov">pbisbal@pppl.gov</a>&gt; wrote:<br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div text="#000000" bgcolor="#FFFFFF"><br>
If you organize your code correctly, and call the compiler with the
right optimization flags, shouldn't the compiler automatically
handle a good portion of this 'low-level' stuff? I understand that
hand-coding this stuff usually still gives you the best performance
(see GotoBLAS/OpenBLAS, for example), but does your average HPC
programmer trying to get decent performance need to hand-code that
stuff, too? <br>
</div>
</blockquote></div><br clear="all"><div><br></div>-- <br><div dir="ltr" class="gmail_signature" data-smartmail="gmail_signature"><div dir="ltr">Dr Stuart Midgley<br><a href="mailto:sdm900@gmail.com" target="_blank">sdm900@gmail.com</a></div></div></div></div>