[Beowulf] Best case performance of HPL on EPYC 7742 processor ...

Richard Walsh rbwcnslt at gmail.com
Wed Aug 19 05:37:27 PDT 2020


Kilian/All,

Thanks for the responses.  Regarding "peak" ... if I did not include it,
I should have said "nominal" peak, which is about the only meaning
peak has these days.

Seeing as I have not had a lot of quick "this is how you get 90% efficiency"
answers, but rather references and comments that corroborate the performance
I am observing, I will conclude for the moment that the 80% figure for these
64-core parts is a reasonable number.

Also, it is a reminder that even HPL has an on-node bandwidth performance
dependency, although I guess we cannot be sure here what part of the 20%
difference when compared to the 32-core parts is strictly due to bandwidth
to memory and not to increased competition for the on-chip caches when we
double the number of cores.

Thanks,

Richard

On Tue, Aug 18, 2020 at 6:22 PM Kilian Cavalotti <
kilian.cavalotti.work at gmail.com> wrote:

> Hi Richard,
>
> On Fri, Aug 14, 2020 at 2:30 PM Richard Walsh <rbwcnslt at gmail.com> wrote:
> > What have people achieved on this SKU on a single-node using the stock
> > HPL 2.3 source... ??
>
> I got similar findings as yours, about 75-80% of peak, albeit using a
> different SKU (7702), but consistent over multiple platforms (thus
> hopefully averaging manufacturer idiosyncrasies).
>
> I think this page summarizes the most relevant BIOS settings pretty
> well:
> https://hpcadvisorycouncil.atlassian.net/wiki/spaces/HPCWORKS/pages/1280442391/AMD+2nd+Gen+EPYC+CPU+Tuning+Guide+for+InfiniBand+HPC#Configurable-Thermal-Design-Power-(cTDP)
>
> > I have seen a variety of performance claims even as high as 90% of its
> > nominal per node peak of 4.608 TFLOPs.
>
> Interestingly, the theoretical performance of a dual-7742 machine is
> 4.608 TFLOPs, at *base* clock (2.25 GHz).
> In practice, you probably had Turbo on, meaning that the clocks were
> probably running closer to the 3.0 GHz range, which means that the
> theoretical performance should be in the 6 TF range, hence bringing
> the observed efficiency even lower.
>
> An interesting test would be to disable Turbo to fix the core clocks
> at 2.25 GHz, and see the HPL numbers you get.
>
> Cheers,
> --
> Kilian
>
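
The arithmetic behind the 4.608 TFLOPs and "6 TF range" figures quoted
above can be sketched as follows. This is only an illustration: it assumes
16 double-precision FLOPs per cycle per core (two 256-bit FMA units on
Zen 2), takes the 3.0 GHz clock as Kilian's rough estimate rather than a
measured all-core frequency, and uses the 80% figure from this thread as
the observed efficiency. The function name is made up for the example.

```python
def nominal_peak_tflops(sockets, cores_per_socket, clock_ghz, flops_per_cycle=16):
    """Nominal double-precision peak in TFLOPs.

    flops_per_cycle=16 is an assumption: 2 FMA units x 4 doubles x 2 ops.
    """
    gflops = sockets * cores_per_socket * clock_ghz * flops_per_cycle
    return gflops / 1000.0


# Dual EPYC 7742 node: 2 sockets x 64 cores.
base_peak = nominal_peak_tflops(2, 64, 2.25)   # at the 2.25 GHz base clock
turbo_peak = nominal_peak_tflops(2, 64, 3.0)   # at an assumed 3.0 GHz

# HPL result implied by "80% of nominal peak" (from the thread, not measured here).
observed = 0.80 * base_peak

print(f"peak at base clock : {base_peak:.3f} TFLOPs")
print(f"peak at 3.0 GHz    : {turbo_peak:.3f} TFLOPs")
print(f"efficiency vs base : {observed / base_peak:.0%}")
print(f"efficiency vs 3.0  : {observed / turbo_peak:.0%}")
```

Under these assumptions, a result that is 80% of the base-clock peak is
only about 60% of the peak computed at 3.0 GHz, which is the point Kilian
makes about Turbo lowering the effective efficiency.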