<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
</head>
<body>
<p>M, <br>
</p>
<p>Isn't it more accurate to say that HPCG measures the whole system
more realistically, and memory bandwidth happens to be the "rate
limiting step" in just about all architectures? Even with LINPACK,
which should be CPU-bound, the Top500 list shows that HPL results
are affected by the network. For example, there's this article
which is a bit old, but I think still applies (doing the same
analysis on the current top500 list is on my to-do list,
actually): <br>
</p>
<p><a class="moz-txt-link-freetext" href="https://www.nextplatform.com/2015/07/20/ethernet-will-have-to-work-harder-to-win-hpc/">https://www.nextplatform.com/2015/07/20/ethernet-will-have-to-work-harder-to-win-hpc/</a><br>
</p>
<div class="moz-cite-prefix">On 3/18/22 8:34 PM, Massimiliano Fatica
wrote:<br>
</div>
<blockquote type="cite"
cite="mid:CABuTdwF4m7P74TdZud_bAPW3NcBcQ5SR6bgz=XhXsn8+NOaGFg@mail.gmail.com">
<meta http-equiv="content-type" content="text/html; charset=UTF-8">
<div dir="ltr">HPCG measures memory bandwidth; the FLOPS
capability of the chip is completely irrelevant.
<div>Pretty much all the vendor implementations reach very
similar efficiency if you compare them to the available memory
bandwidth.</div>
<div>There is some effect of the network at scale, but you need
to have a really large system to see it in play.</div>
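<div>The memory-bandwidth argument above can be illustrated with a rough
roofline-style estimate. This is only a sketch under stated assumptions: the
arithmetic intensity of about 2 flops per 12 bytes is a common back-of-envelope
figure for a double-precision sparse matrix-vector product (one 8-byte value
plus one 4-byte index per nonzero), not a measured HPCG property, and the
device numbers are hypothetical. Real HPCG also includes multigrid smoothing,
dot products and communication, so treat the result as an order-of-magnitude
guide.</div>

```python
def bandwidth_bound_tflops(mem_bw_gbs: float,
                           flops_per_byte: float,
                           peak_tflops: float) -> float:
    """Roofline-style estimate: performance is capped by whichever is lower,
    the compute peak or memory bandwidth times arithmetic intensity."""
    # GB/s * flop/byte = GFLOP/s; divide by 1000 to get TFLOP/s.
    bw_limit_tflops = mem_bw_gbs * flops_per_byte / 1000.0
    return min(peak_tflops, bw_limit_tflops)


# ASSUMPTION: ~2 flops per 12 bytes moved for a sparse matrix-vector product.
ASSUMED_FLOPS_PER_BYTE = 2.0 / 12.0

# HYPOTHETICAL device: 1000 GB/s memory bandwidth, 10 TFLOP/s FP64 peak.
estimate = bandwidth_bound_tflops(1000.0, ASSUMED_FLOPS_PER_BYTE, 10.0)
fraction_of_peak = estimate / 10.0

# Doubling the compute peak leaves the estimate unchanged, because the
# bandwidth limit is the binding constraint in this model.
estimate_double_peak = bandwidth_bound_tflops(1000.0, ASSUMED_FLOPS_PER_BYTE, 20.0)

print(f"estimate: {estimate:.3f} TFLOP/s ({fraction_of_peak:.1%} of peak)")
print(f"with 2x peak: {estimate_double_peak:.3f} TFLOP/s")
```

<div>In this simple model the result scales with memory bandwidth and ignores
the FLOPS peak entirely, which is consistent with the low single-digit
percentages of peak discussed elsewhere in this thread.</div>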
<div><br>
</div>
<div>M</div>
</div>
<br>
<div class="gmail_quote">
<div dir="ltr" class="gmail_attr">On Fri, Mar 18, 2022 at 5:20
PM Brian Dobbins &lt;<a href="mailto:bdobbins@gmail.com"
moz-do-not-send="true" class="moz-txt-link-freetext">bdobbins@gmail.com</a>&gt;
wrote:<br>
</div>
<blockquote class="gmail_quote" style="margin:0px 0px 0px
0.8ex;border-left-width:1px;border-left-style:solid;border-left-color:rgb(204,204,204);padding-left:1ex">
<div dir="ltr">
<div><br>
</div>
<div>Hi Jorg,<br>
</div>
<div><br>
</div>
<div> We (NCAR - weather/climate applications) tend to find
that HPCG more closely tracks the performance we see from
hardware than Linpack, so it definitely is of interest and
watched, but our procurements tend to use actual code that
vendors run as part of the process, so we don't 'just' use
published HPCG numbers. Still, I'd say it's very
much a useful number.</div>
<div><br>
</div>
<div> As one example, while I haven't seen HPCG numbers for
the MI250X accelerators, Prof. Matsuoka of RIKEN tweeted
back in November that he anticipated it would score around
0.4% of peak on HPCG, vs 2% on the NVIDIA A100 (while the
A64FX they use hits an impressive 3%):</div>
<div><a
href="https://twitter.com/ProfMatsuoka/status/1458159517590384640"
target="_blank" moz-do-not-send="true"
class="moz-txt-link-freetext">https://twitter.com/ProfMatsuoka/status/1458159517590384640</a></div>
<div><br>
</div>
<div> Why is that relevant? Well, <i>on paper</i>, the
MI250X has ~96 TF FP64 w/ matrix operations, vs 19.5 TF on
the A100. So, ~5x in theory, but Prof. Matsuoka anticipated
a ~5x differential in HPCG efficiency in the other direction,
<i>erasing</i> that advantage. Now, surely <i>someone</i> has HPCG
numbers on the MI250X, but I've not yet seen any. Would
love to know what they are. But absent that information, I
tend to bet Matsuoka isn't far off the mark.</div>
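<div>As a quick check of the arithmetic above, here is a minimal Python
sketch. The peak figures and HPCG fractions are the ones quoted in this
message (the 0.4% figure is Prof. Matsuoka's anticipated value, not a
measurement); the function and variable names are purely illustrative.</div>

```python
def hpcg_estimate_tflops(peak_tflops: float, fraction_of_peak: float) -> float:
    """Estimated HPCG throughput, given a peak and an HPCG fraction of peak."""
    return peak_tflops * fraction_of_peak


# Figures quoted in the message: ~96 TF FP64 (matrix) at ~0.4% for the MI250X,
# 19.5 TF at ~2% for the A100. The MI250X fraction is an anticipated value.
mi250x_tflops = hpcg_estimate_tflops(96.0, 0.004)
a100_tflops = hpcg_estimate_tflops(19.5, 0.02)

peak_ratio = 96.0 / 19.5                   # on-paper advantage
hpcg_ratio = mi250x_tflops / a100_tflops   # estimated HPCG advantage

print(f"MI250X estimate: {mi250x_tflops:.3f} TF")
print(f"A100 estimate:   {a100_tflops:.3f} TF")
print(f"peak ratio: {peak_ratio:.2f}x, estimated HPCG ratio: {hpcg_ratio:.2f}x")
```

<div>Under these quoted figures both parts land at roughly 0.4 TF on HPCG,
so the ~5x on-paper gap shrinks to about 1x.</div>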
<div><br>
</div>
<div> Ultimately, it may help to know more about what kind
of applications you run - for memory-bound CFD-like codes,
HPCG tends to be pretty representative. <br>
</div>
<div><br>
</div>
<div> Maybe it's time to update the saying that 'numbers
never lie' to something more accurate - 'numbers never
lie, but they also rarely tell the whole story'.</div>
<div><br>
</div>
<div> Cheers,</div>
<div> - Brian</div>
<div><br>
</div>
</div>
<br>
<div class="gmail_quote">
<div dir="ltr" class="gmail_attr">On Fri, Mar 18, 2022 at
5:08 PM Jörg Saßmannshausen &lt;<a
href="mailto:sassy-work@sassy.formativ.net"
target="_blank" moz-do-not-send="true"
class="moz-txt-link-freetext">sassy-work@sassy.formativ.net</a>&gt;
wrote:<br>
</div>
<blockquote class="gmail_quote" style="margin:0px 0px 0px
0.8ex;border-left-width:1px;border-left-style:solid;border-left-color:rgb(204,204,204);padding-left:1ex">Dear
all,<br>
<br>
further to the emails back in 2020 around the HPCG benchmark
test, as we are in <br>
the process of getting a new cluster I was wondering if
somebody else in the <br>
meantime has used that test to benchmark the particular
performance of the <br>
cluster. <br>
From what I can see, the latest HPCG version is 3.1 from
August 2019. I also <br>
have noticed that their website has a link to download a
version which <br>
includes the latest A100 GPUs from NVIDIA. <br>
<a
href="https://www.hpcg-benchmark.org/software/view.html?id=280"
rel="noreferrer" target="_blank" moz-do-not-send="true"
class="moz-txt-link-freetext">https://www.hpcg-benchmark.org/software/view.html?id=280</a><br>
<br>
What I was wondering is: has anybody else apart from
Prentice tried that test <br>
and is it somehow useful, or does it just give you another
set of numbers?<br>
<br>
Our new cluster will not be in the same league as the
supercomputers, but we <br>
would like to have at least some kind of handle so we can
compare the various <br>
offers from vendors. My hunch is the benchmark will
somehow (strongly?) depend <br>
on how it is tuned. As my former colleague used to say: I
am looking for some <br>
war stories (not very apt to say these days!).<br>
<br>
Either way, I hope you are all well given the strange new
world we are living <br>
in right now.<br>
<br>
All the best from a spring-like, dark London<br>
<br>
Jörg<br>
<br>
<br>
<br>
_______________________________________________<br>
Beowulf mailing list, <a
href="mailto:Beowulf@beowulf.org" target="_blank"
moz-do-not-send="true" class="moz-txt-link-freetext">Beowulf@beowulf.org</a>
sponsored by Penguin Computing<br>
To change your subscription (digest mode or unsubscribe)
visit <a
href="https://beowulf.org/cgi-bin/mailman/listinfo/beowulf"
rel="noreferrer" target="_blank" moz-do-not-send="true"
class="moz-txt-link-freetext">https://beowulf.org/cgi-bin/mailman/listinfo/beowulf</a><br>
</blockquote>
</div>
</blockquote>
</div>
<br>
</blockquote>
</body>
</html>