<html>
  <head>
    <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
  </head>
  <body>
    <p>Thanks for the explanation. I've always found the documentation
      on HPCG to be lacking, and what I remember reading described it as
      a more holistic approach to benchmarking, which I assumed meant it
      stressed the whole system, not just one subsystem. <br>
    </p>
    <p>I'll search for presentations from the BOFs. If you can send
      me the PDF you referenced below, I'd be grateful. <br>
    </p>
    <p>Prentice<br>
    </p>
    <div class="moz-cite-prefix">On 3/21/22 8:42 PM, Massimiliano Fatica
      wrote:<br>
    </div>
    <blockquote type="cite"
cite="mid:CABuTdwHaL2m=6GEnhHmNRXEPJPtQyteizkpJ5osdA8QqqMfeow@mail.gmail.com">
      <meta http-equiv="content-type" content="text/html; charset=UTF-8">
      <div dir="ltr">No, HPCG  is all memory bandwidth. 
        <div>You can see this old presentation where GPUs with basically
          no double precision, perform on par with others with 10x
          performance.</div>
        <div><br>
        </div>
        <div><a
            href="http://www.hpcg-benchmark.org/downloads/sc14/HPCG_BOF.pdf"
            moz-do-not-send="true" class="moz-txt-link-freetext">http://www.hpcg-benchmark.org/downloads/sc14/HPCG_BOF.pdf</a><br>
        </div>
        <div><br>
        </div>
        <div>
          <div class="gmail-page" title="Page 7">
            <div class="gmail-section" style="color:rgb(0,0,0)">There
              were more examples during recent HPCG BOFs ( but I can't
              find the pdf online, if you want I can send them to you).</div>
            <div class="gmail-section" style="color:rgb(0,0,0)">For
              example, if you look at the specs of a K80 ( 2xGK210 , <span
                style="color:rgb(34,34,34)">1.4TF DP and 384 bit memory
                bus  at 5GHz</span> ) and M40 (GM200, 0.2TF DP and <span
                style="color:rgb(34,34,34)">384 bit memory bus  at
                6GHz), you may think that the K80 will much faster.</span> Exactly
              the opposite, and the results scale perfectly with memory
              bandwidth.</div>
            <div class="gmail-section" style="color:rgb(0,0,0)"><br>
            </div>
            <b>1 x K80 (2 GK210 GPUs), ECC enabled, clk=875</b><br>
            2x1x1 process grid<br>
            256x256x256 local domain<br>
            SpMV  = 49.1 GF ( 309.1 GB/s Effective) 24.5 GF_per ( 154.6 GB/s Effective)<br>
            SymGS = 62.2 GF ( 480.2 GB/s Effective) 31.1 GF_per ( 240.1 GB/s Effective)<br>
            total = 58.7 GF ( 445.3 GB/s Effective) 29.4 GF_per ( 222.7 GB/s Effective)<br>
            final = 55.1 GF ( 417.5 GB/s Effective) 27.5 GF_per ( 208.8 GB/s Effective)<br>
            <br>
            <b>2 x M40 (2 GM200 GPUs), ECC enabled, clk=1114</b><br>
            2x1x1 process grid<br>
            256x256x256 local domain<br>
            SpMV  = 69.4 GF ( 437.2 GB/s Effective) 34.7 GF_per ( 218.6 GB/s Effective)<br>
            SymGS = 83.7 GF ( 645.7 GB/s Effective) 41.8 GF_per ( 322.8 GB/s Effective)<br>
            total = 79.6 GF ( 603.7 GB/s Effective) 39.8 GF_per ( 301.9 GB/s Effective)<br>
            final = 74.2 GF ( 562.7 GB/s Effective) 37.1 GF_per ( 281.4 GB/s Effective)</div>
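          <div class="gmail-section" style="color:rgb(0,0,0)">A quick
            sanity check on those numbers (my own arithmetic, not from
            the BOF slides): the per-GPU "final" results move with the
            ratio of theoretical memory bandwidths, nothing like what
            the FP64 peaks would predict.</div>
          <pre>
# Back-of-the-envelope check in Python (my own sketch, using the spec
# numbers quoted above; a 384-bit bus moves 48 bytes per effective clock).
k80_bw, m40_bw = 48 * 5.0, 48 * 6.0   # 240 vs 288 GB/s theoretical per GPU
k80_gf, m40_gf = 27.5, 37.1           # per-GPU "final" HPCG results above

print(m40_gf / k80_gf)   # ~1.35: the M40 GPU wins...
print(m40_bw / k80_bw)   # ~1.20: ...about as the bandwidth ratio predicts
print(0.2 / (1.4 / 2))   # ~0.29: per-GPU FP64 peaks would predict K80 3.5x faster
</pre>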
          <div class="gmail-page" title="Page 7"><br>
          </div>
          <div class="gmail-page" title="Page 7">
            <div class="gmail-section" style="color:rgb(0,0,0)">Regarding
              Linpack, on CPU systems  the trailing matrix update is
              slow, you can hide all the network traffic with the
              look-ahead if you have a decent network (most CPU-only
              systems on the list are not real  HPC systems, just some
              OEMs stuffing the list with cloud systems with very poor
              network).</div>
            <div class="gmail-section" style="color:rgb(0,0,0)">On
              accelerated systems ( for example GPU), network becomes
              really critical.</div>
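            <div class="gmail-section" style="color:rgb(0,0,0)">A toy
              per-iteration cost model (my own illustration with
              made-up numbers, not the real HPL source) shows why: the
              broadcast stays hidden only while the trailing update is
              the longer of the two steps.</div>
            <pre>
# Hypothetical look-ahead timing model (illustrative sketch only): with
# look-ahead, the panel broadcast overlaps the trailing-matrix update;
# without it, the two steps serialize.
def iter_time(t_update, t_bcast, lookahead=True):
    return max(t_update, t_bcast) if lookahead else t_update + t_bcast

print(iter_time(1.00, 0.30))  # CPU: the slow DGEMM fully hides a mediocre network
print(iter_time(0.05, 0.30))  # GPU: update is 20x faster, the network now dominates
</pre>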
            <div class="gmail-section" style="color:rgb(0,0,0)"><br>
            </div>
            <div class="gmail-section" style="color:rgb(0,0,0)">Now,
              memory bw is the real limitation in most HPC workloads, so
              if I had to select a system, I would care more about
              memory bw than HPL.</div>
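            <div class="gmail-section" style="color:rgb(0,0,0)">If you
              just want a first-order read on memory bandwidth, a
              triad-style loop is enough (a rough NumPy sketch; a real
              procurement would run STREAM or HPCG itself):</div>
            <pre>
# Minimal STREAM-triad-style bandwidth probe (illustrative only; NumPy's
# temporary array adds traffic, so this underestimates the hardware a bit).
import time
import numpy as np

n = 100_000_000            # ~0.8 GB per array; size it to fit in RAM
a = np.zeros(n)
b = np.random.rand(n)
c = np.random.rand(n)

t0 = time.perf_counter()
a[:] = b + 2.5 * c         # triad: two reads + one write counted per element
dt = time.perf_counter() - t0
print(f"effective bandwidth: {3 * n * 8 / dt / 1e9:.1f} GB/s")
</pre>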
            <div class="gmail-section" style="color:rgb(0,0,0)"><br>
            </div>
            <div class="gmail-section" style="color:rgb(0,0,0)">M</div>
          </div>
        </div>
        <div><br>
        </div>
      </div>
      <br>
      <div class="gmail_quote">
        <div dir="ltr" class="gmail_attr">On Mon, Mar 21, 2022 at 11:51
          AM Prentice Bisbal via Beowulf <<a
            href="mailto:beowulf@beowulf.org" moz-do-not-send="true"
            class="moz-txt-link-freetext">beowulf@beowulf.org</a>>
          wrote:<br>
        </div>
        <blockquote class="gmail_quote" style="margin:0px 0px 0px
0.8ex;border-left-width:1px;border-left-style:solid;border-left-color:rgb(204,204,204);padding-left:1ex">
          <div>
            <p>M, <br>
            </p>
            <p>Isn't it more accurate to say that HPCG measures the
              whole system more realistically, and that memory bandwidth
              happens to be the "rate-limiting step" on just about all
              architectures? Even with LINPACK, which should be
              CPU-bound, the Top500 list shows that HPL results are
              affected by the network. For example, there's this
              article, which is a bit old but I think still applies
              (doing the same analysis on the current Top500 list is on
              my to-do list, actually): <br>
            </p>
            <p><a
href="https://www.nextplatform.com/2015/07/20/ethernet-will-have-to-work-harder-to-win-hpc/"
                target="_blank" moz-do-not-send="true"
                class="moz-txt-link-freetext">https://www.nextplatform.com/2015/07/20/ethernet-will-have-to-work-harder-to-win-hpc/</a><br>
            </p>
            <div>On 3/18/22 8:34 PM, Massimiliano Fatica wrote:<br>
            </div>
            <blockquote type="cite">
              <div dir="ltr">HPCG measures memory bandwidth, the FLOPS
                capability of the chip is completely irrelevant.
                <div>Pretty much all the vendor implementations reach
                  very similar efficiency if you compare them to the
                  available memory bandwidth.</div>
                <div>There is some effect of the network at scale, but
                  you need to have a really large  system to see it in
                  play.</div>
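                <div>(Taking the per-GPU "final" numbers quoted earlier
                  in this thread at face value, that efficiency is easy
                  to compute; my own arithmetic, assuming 240 and 288
                  GB/s theoretical for the GK210 and GM200. Differences
                  will partly reflect ECC overhead and the benchmark's
                  traffic model.)</div>
                <pre>
# HPCG effective bandwidth vs. theoretical bus bandwidth (my sketch).
def bw_efficiency(effective_gbs, peak_gbs):
    return effective_gbs / peak_gbs

print(bw_efficiency(208.8, 240.0))  # K80 GK210: ~0.87
print(bw_efficiency(281.4, 288.0))  # M40 GM200: ~0.98
</pre>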
                <div><br>
                </div>
                <div>M</div>
              </div>
              <br>
              <div class="gmail_quote">
                <div dir="ltr" class="gmail_attr">On Fri, Mar 18, 2022
                  at 5:20 PM Brian Dobbins <<a
                    href="mailto:bdobbins@gmail.com" target="_blank"
                    moz-do-not-send="true" class="moz-txt-link-freetext">bdobbins@gmail.com</a>>
                  wrote:<br>
                </div>
                <blockquote class="gmail_quote" style="margin:0px 0px
                  0px
0.8ex;border-left-width:1px;border-left-style:solid;border-left-color:rgb(204,204,204);padding-left:1ex">
                  <div dir="ltr">
                    <div><br>
                    </div>
                    <div>Hi Jorg,<br>
                    </div>
                    <div><br>
                    </div>
                    <div>  We (NCAR - weather/climate applications) tend
                      to find that HPCG tracks the performance we see
                      from hardware more closely than Linpack does, so
                      it is definitely of interest and watched, but our
                      procurements tend to use actual code that vendors
                      run as part of the process, so we don't 'just' use
                      published HPCG numbers. Still, I'd say it's very
                      much a useful number.</div>
                    <div><br>
                    </div>
                    <div>  As one example, while I haven't seen HPCG
                      numbers for the MI250X accelerators, Prof.
                      Matsuoka of RIKEN tweeted back in November that he
                      anticipated it would score around 0.4% of peak on
                      HPCG, vs 2% on the NVIDIA A100 (while the A64FX
                      they use hits an impressive 3%):</div>
                    <div><a
                        href="https://twitter.com/ProfMatsuoka/status/1458159517590384640"
                        target="_blank" moz-do-not-send="true"
                        class="moz-txt-link-freetext">https://twitter.com/ProfMatsuoka/status/1458159517590384640</a></div>
                    <div><br>
                    </div>
                    <div>  Why is that relevant?  Well, <i>on paper</i>,
                      the MI250X has ~96 TF FP64 w/ Matrix operations,
                      vs 19.5 TF on the A100, so roughly 5x in theory.
                      But Prof. Matsuoka anticipated a ~5x efficiency
                      differential in HPCG in the other direction,
                      <i>erasing</i> that advantage. Now, surely
                      <i>someone</i> has HPCG numbers on the MI250X, but
                      I've not yet seen any. Would love to know what
                      they are. But absent that information, I tend to
                      bet Matsuoka isn't far off the mark.</div>
                    <div><br>
                    </div>
                    <div>  Ultimately, it may help to know more about
                      what kind of applications you run: for
                      memory-bound CFD-like codes, HPCG tends to be
                      pretty representative.  <br>
                    </div>
                    <div><br>
                    </div>
                    <div>  Maybe it's time to update the saying that
                      'numbers never lie' to something more accurate:
                      'numbers never lie, but they also rarely tell the
                      whole story'.</div>
                    <div><br>
                    </div>
                    <div>  Cheers,</div>
                    <div>  - Brian</div>
                    <div><br>
                    </div>
                  </div>
                  <br>
                  <div class="gmail_quote">
                    <div dir="ltr" class="gmail_attr">On Fri, Mar 18,
                      2022 at 5:08 PM Jörg Saßmannshausen <<a
                        href="mailto:sassy-work@sassy.formativ.net"
                        target="_blank" moz-do-not-send="true"
                        class="moz-txt-link-freetext">sassy-work@sassy.formativ.net</a>>
                      wrote:<br>
                    </div>
                    <blockquote class="gmail_quote" style="margin:0px
                      0px 0px
0.8ex;border-left-width:1px;border-left-style:solid;border-left-color:rgb(204,204,204);padding-left:1ex">Dear
                      all,<br>
                      <br>
                      further to the emails back in 2020 around the HPCG
                      benchmark test: as we are in the <br>
                      process of getting a new cluster, I was wondering
                      whether somebody else has in the <br>
                      meantime used that test to benchmark the
                      performance of a particular <br>
                      cluster. <br>
                      From what I can see, the latest HPCG version is
                      3.1 from August 2019. I have also <br>
                      noticed that their website has a link to download
                      a version which includes <br>
                      support for the latest A100 GPUs from NVIDIA. <br>
                      <a
                        href="https://www.hpcg-benchmark.org/software/view.html?id=280"
                        rel="noreferrer" target="_blank"
                        moz-do-not-send="true"
                        class="moz-txt-link-freetext">https://www.hpcg-benchmark.org/software/view.html?id=280</a><br>
                      <br>
                      What I was wondering is: has anybody else apart
                      from Prentice tried that test <br>
                      and is it somehow useful, or does it just give you
                      another set of numbers?<br>
                      <br>
                      Our new cluster will not be in the same league as
                      the supercomputers, but we <br>
                      would like to have at least some kind of handle so
                      we can compare the various <br>
                      offers from vendors. My hunch is the benchmark
                      will somehow (strongly?) depend <br>
                      on how it is tuned. As my former colleague used to
                      say: I am looking for some <br>
                      war stories (not very apt to say these days!).<br>
                      <br>
                      Either way, I hope you are all well given the
                      strange new world we are living <br>
                      in right now.<br>
                      <br>
                      All the best from a spring-like but dark London<br>
                      <br>
                      Jörg<br>
                      <br>
                      <br>
                      <br>
                    </blockquote>
                  </div>
                </blockquote>
              </div>
              <br>
            </blockquote>
          </div>
          _______________________________________________<br>
          Beowulf mailing list, <a href="mailto:Beowulf@beowulf.org"
            target="_blank" moz-do-not-send="true"
            class="moz-txt-link-freetext">Beowulf@beowulf.org</a>
          sponsored by Penguin Computing<br>
          To change your subscription (digest mode or unsubscribe) visit
          <a href="https://beowulf.org/cgi-bin/mailman/listinfo/beowulf"
            rel="noreferrer" target="_blank" moz-do-not-send="true"
            class="moz-txt-link-freetext">https://beowulf.org/cgi-bin/mailman/listinfo/beowulf</a><br>
        </blockquote>
      </div>
    </blockquote>
  </body>
</html>