[Beowulf] HPCG benchmark, again
Benson Muite
benson_muite at emailplus.org
Sat Mar 19 07:18:10 UTC 2022
For memory bandwidth, single-node tests such as LIKWID are helpful:
https://github.com/RRZE-HPC/likwid
MPI communication benchmarks are a good complement to this.
Full applications do more than the above, but these are easier starting
points that require less domain-specific application knowledge for
general performance measurement.
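
As a rough illustration of the single-node idea only (not a substitute
for likwid-bench or STREAM), here is a minimal Python sketch. It assumes
a single thread, no pinning, and a plain buffer copy as the workload.
The function name copy_bandwidth_gbs and the default buffer size are
hypothetical choices made for this example.

```python
import time


def copy_bandwidth_gbs(n_bytes=256 * 1024 * 1024, repeats=5):
    """Return the best observed copy bandwidth in GB/s (1e9 bytes/s)."""
    src = bytearray(n_bytes)  # zero-filled source buffer
    dst = bytearray(n_bytes)  # destination buffer of the same size
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        dst[:] = src  # bulk copy performed in C, no Python-level loop
        elapsed = time.perf_counter() - start
        # Keep the fastest run; the first pass also pays for page faults.
        best = min(best, elapsed)
    # A copy reads n_bytes and writes n_bytes: count 2 * n_bytes of traffic.
    return 2 * n_bytes / best / 1e9


if __name__ == "__main__":
    print(f"copy bandwidth: {copy_bandwidth_gbs():.1f} GB/s (single thread)")
```

The buffer should be much larger than the last-level cache, otherwise
the figure reflects cache rather than main-memory bandwidth. Being
single-threaded and unpinned, this is a sanity check at best; a pinned,
multi-threaded tool is needed for numbers worth comparing across vendors.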
On 3/19/22 3:58 AM, Richard Walsh wrote:
>
> J,
>
> Trying to add a bit to the preceding useful answers …
>
> In my experience running these codes on very large systems for
> acceptances, to get optimal (HPCG or HPL) performance on GPUs (MI200 or
> A100) you need to obtain the optimized versions from the vendors, which
> include scripts with ENV variable tunings specific to their versions
> and optimal affinity settings to manage the non-simple relationship
> between the NICs, the GPUs, and CPUs … you have to iterate through the
> settings to find the optimal ones for your system.
>
> If you set out to do this on your own, the chances of getting values
> similar to those posted on the TOP500 website are vanishingly small …
>
> As already noted, buyers of large HPC systems almost always require
> large scale runs of both HPCG (to demonstrate peak bandwidth) and HPL
> (to demonstrate peak processor) performance.
>
> Cheers!
>
> rbw
>
>
>> On Mar 18, 2022, at 7:35 PM, Massimiliano Fatica <mfatica at gmail.com>
>> wrote:
>>
>>
>> HPCG measures memory bandwidth; the FLOPS capability of the chip is
>> completely irrelevant.
>> Pretty much all the vendor implementations reach very similar
>> efficiency if you compare them to the available memory bandwidth.
>> There is some effect of the network at scale, but you need to have a
>> really large system to see it in play.
>>
>> M
>>
>> On Fri, Mar 18, 2022 at 5:20 PM Brian Dobbins <bdobbins at gmail.com>
>> wrote:
>>
>>
>> Hi Jorg,
>>
>> We (NCAR - weather/climate applications) tend to find that HPCG
>> more closely tracks the performance we see from hardware than
>> Linpack, so it definitely is of interest and watched, but our
>> procurements tend to use actual code that vendors run as part of
>> the process, so we don't 'just' use published HPCG numbers.
>> Still, I'd say it's very much a useful number.
>>
>> As one example, while I haven't seen HPCG numbers for the MI250X
>> accelerators, Prof. Matsuoka of RIKEN tweeted back in November that
>> he anticipated that to score around 0.4% of peak on HPCG, vs 2% on
>> the NVIDIA A100 (while the A64FX they use hits an impressive 3%):
>> https://twitter.com/ProfMatsuoka/status/1458159517590384640
>>
>> Why is that relevant? Well, /on paper/, the MI250X has ~96 TF
>> FP64 w/ Matrix operations, vs 19.5 TF on the A100. So, 5x in
>> theory, but Prof Matsuoka anticipated a ~5x differential in HPCG
>> efficiency the other way (0.4% of ~96 TF is ~0.38 TF; 2% of
>> 19.5 TF is ~0.39 TF), /erasing/ that advantage. Now, surely
>> /someone/ has HPCG numbers on the MI250X, but I've not yet seen
>> any. Would love to know what they are. But absent that
>> information I tend to bet Matsuoka isn't far off the mark.
>>
>> Ultimately, it may help knowing more about what kind of
>> applications you run - for memory bound CFD-like codes, HPCG tends
>> to be pretty representative.
>>
>> Maybe it's time to update the saying that 'numbers never lie' to
>> something more accurate - 'numbers never lie, but they also rarely
>> tell the whole story'.
>>
>> Cheers,
>> - Brian
>>
>>
>> On Fri, Mar 18, 2022 at 5:08 PM Jörg Saßmannshausen
>> <sassy-work at sassy.formativ.net> wrote:
>>
>> Dear all,
>>
>> further to the emails back in 2020 around the HPCG benchmark test,
>> as we are in the process of getting a new cluster I was wondering
>> if somebody else in the meantime has used that test to benchmark
>> the particular performance of the cluster.
>> From what I can see, the latest HPCG version is 3.1 from August
>> 2019. I also have noticed that their website has a link to
>> download a version which supports the latest A100 GPUs from NVIDIA.
>> https://www.hpcg-benchmark.org/software/view.html?id=280
>>
>> What I was wondering is: has anybody else apart from Prentice
>> tried that test, and is it somehow useful, or does it just give
>> you another set of numbers?
>>
>> Our new cluster will not be in the same league as the
>> supercomputers, but we would like to have at least some kind of
>> handle so we can compare the various offers from vendors. My hunch
>> is the benchmark will somehow (strongly?) depend on how it is
>> tuned. As my former colleague used to say: I am looking for some
>> war stories (not very apt to say these days!).
>>
>> Either way, I hope you are all well given the strange new world we
>> are living in right now.
>>
>> All the best from a spring-like dark London
>>
>> Jörg
>>
>>
>>
>> _______________________________________________
>> Beowulf mailing list, Beowulf at beowulf.org
>> <mailto:Beowulf at beowulf.org> sponsored by Penguin Computing
>> To change your subscription (digest mode or unsubscribe) visit
>> https://beowulf.org/cgi-bin/mailman/listinfo/beowulf
>> <https://beowulf.org/cgi-bin/mailman/listinfo/beowulf>
>>
>