<html>

  <head>

    <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">

  </head>

  <body>

    <p><br>

    </p>

    <div class="moz-cite-prefix">On 1/22/24 11:38 AM, Scott Atchley

      wrote:<br>

    </div>

    <blockquote type="cite"

cite="mid:CAL8g0jLUj30WgE_gzoyQggbb7khaVwOs9-L7u_hHqVWp9dn9Cw@mail.gmail.com">

      <meta http-equiv="content-type" content="text/html; charset=UTF-8">

      <div dir="ltr">

        <div dir="ltr">On Mon, Jan 22, 2024 at 11:16 AM Prentice Bisbal

          &lt;<a href="mailto:pbisbal@pppl.gov" moz-do-not-send="true"

            class="moz-txt-link-freetext">pbisbal@pppl.gov</a>&gt;

          wrote:<br>

        </div>

        <div class="gmail_quote">

          <blockquote class="gmail_quote" style="margin:0px 0px 0px

0.8ex;border-left-width:1px;border-left-style:solid;border-left-color:rgb(204,204,204);padding-left:1ex">

            <div>

              <blockquote type="cite">

                <div dir="ltr">

                  <div class="gmail_quote">

                    <div>&lt;snip&gt; </div>

                    <blockquote class="gmail_quote" style="margin:0px

                      0px 0px

0.8ex;border-left-width:1px;border-left-style:solid;border-left-color:rgb(204,204,204);padding-left:1ex">>

                      Another interesting topic is that nodes are

                      becoming many-core - any <br>

                      > thoughts? <br>

                      <br>

                      Core counts are getting too high to be of use in

                      HPC. High core-count <br>

                      processors sound great until you realize that all

                      those cores are now <br>

                      competing for the same memory bandwidth and network

                      bandwidth, neither of <br>

                      which increases with core count.<br>

                      <br>

                      Last April we were evaluating test systems from

                      different vendors for a <br>

                      cluster purchase. One of our test users does a lot

                      of CFD simulations <br>

                      that are very sensitive to memory bandwidth. While he

                      was getting a 50% <br>

                      speedup on AMD compared to Intel (which makes

                      sense since AMDs require <br>

                      12 DIMM slots to be filled instead of Intel's 8),

                      he asked us to consider <br>

                      servers with FEWER cores. Even with the AMDs, he

                      was saturating the <br>

                      memory bandwidth before scaling to all the cores,

                      causing his <br>

                      performance to plateau. Buying cheaper

                      processors with lower <br>

                      core counts was better for him, since the savings

                      would allow us to buy <br>

                      additional nodes, which would be more beneficial

                      to him.<br>

                    </blockquote>

                    <div><br>

                    </div>

                    <div>We see this as well in DOE especially when GPUs

                      are doing a significant amount of the work.</div>

                  </div>

                </div>

              </blockquote>

              <p>Yeah, I noticed that Frontier and Aurora will actually

                be single-socket systems w/ "only" 64 cores.</p>

            </div>

          </blockquote>

          <div> Yes, Frontier has a <b>single</b> <b>CPU</b> socket and

            <b>four GPUs</b> (actually eight GPUs from the user's

            perspective). It works out to eight cores per Graphics

            Compute Die (GCD). The FLOPS ratio is roughly 1:100 between

            the CPU and GPUs.</div>

          <div><br>

          </div>

          <div>Note, Aurora has two CPUs and six GPUs. I am not sure if

            the user sees six or more GPUs. The Aurora node is similar

            to our Summit node but with more connectivity between the

            GPUs.</div>

        </div>

      </div>

    </blockquote>

    <p>Thanks for clarifying! I thought Aurora was a single-CPU system like

      Frontier. Not only is the FLOPS ratio much higher on GPUs, so is

      the FLOPS/W ratio. Even though CPUs have gotten much more

      efficient lately, the efficiency of CPU-only clusters is practically

      stagnant compared to GPU-based

      clusters, based on my analysis of the Top500 and Green500 trends.

      <br>

    </p>
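    <p>A rough sketch of the memory-bandwidth plateau described earlier in

      the thread, assuming each core can pull a fixed amount of bandwidth

      until the socket's peak is reached. The function names and the

      numbers (10 GB/s per core, 400 GB/s per socket) are made up for

      illustration; they are not measurements from any of the test

      systems.</p>

<pre>
```python
import math


def achieved_bandwidth_gbs(cores, per_core_gbs, socket_peak_gbs):
    """Bandwidth delivered when `cores` cores stream memory at once.

    Simplified model: bandwidth grows linearly with core count
    until it hits the socket's peak, then stays flat.
    """
    return min(cores * per_core_gbs, socket_peak_gbs)


def saturation_core_count(per_core_gbs, socket_peak_gbs):
    """Smallest core count at which the socket's bandwidth is saturated."""
    return math.ceil(socket_peak_gbs / per_core_gbs)


if __name__ == "__main__":
    # Hypothetical values, chosen only to show the shape of the curve.
    PER_CORE_GBS = 10.0
    SOCKET_PEAK_GBS = 400.0

    for cores in (16, 32, 48, 64):
        bw = achieved_bandwidth_gbs(cores, PER_CORE_GBS, SOCKET_PEAK_GBS)
        print(cores, "cores:", bw, "GB/s")

    print("saturates at",
          saturation_core_count(PER_CORE_GBS, SOCKET_PEAK_GBS), "cores")
```
</pre>

    <p>Under these assumed numbers, a bandwidth-bound code stops scaling

      at 40 cores, so the remaining 24 cores of a 64-core part add cost

      but no throughput for that workload.</p>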

    <p>Prentice<br>

    </p>

  </body>

</html>