[Beowulf] Theoretical vs. Actual Performance
Benson Muite
benson.muite at ut.ee
Thu Feb 22 07:42:51 PST 2018
There is a very nice and simple max-FLOPS code that requires much less
tuning than Linpack. It is described on p. 57 of:
Rahman "Intel® Xeon Phi™ Coprocessor Architecture and Tools"
https://link.springer.com/book/10.1007%2F978-1-4302-5927-5
An example Fortran code is here:
https://github.com/bkmgit/intel-xeon-phi-coprocessor-architecture-tools/tree/master/ch05
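
As a side note on the peak figure discussed in the quoted thread: theoretical peak is usually estimated as sockets x cores x clock x FLOPs per cycle. The sketch below is only illustrative. The function names are made up, and the hardware parameters (two sockets, 16 cores per socket, 2.2 GHz, 4 double-precision FLOPs per cycle per core) are assumptions about an Opteron 6274 node, not numbers taken from the thread. With those assumptions the product lands close to the 282 GFLOPS quoted, which would suggest that figure refers to a two-socket node.

```python
# Rough theoretical-peak and efficiency arithmetic.
# All hardware parameters below are ASSUMPTIONS for illustration only;
# check them against the vendor data sheet for your own nodes.

def peak_gflops(sockets, cores_per_socket, clock_ghz, flops_per_cycle_per_core):
    """Theoretical peak in GFLOPS: sockets * cores * GHz * FLOPs/cycle/core."""
    return sockets * cores_per_socket * clock_ghz * flops_per_cycle_per_core


def efficiency(measured_gflops, peak):
    """Fraction of theoretical peak that a benchmark actually achieved."""
    return measured_gflops / peak


if __name__ == "__main__":
    # Assumed node: 2 sockets, 16 cores each, 2.2 GHz, 4 DP FLOPs/cycle/core.
    peak = peak_gflops(sockets=2, cores_per_socket=16,
                       clock_ghz=2.2, flops_per_cycle_per_core=4)
    print(f"assumed theoretical peak: {peak:.1f} GFLOPS")

    # What the percentages mentioned in the thread mean in absolute terms,
    # relative to the 282 GFLOPS figure quoted there.
    quoted_peak = 282.0
    for fraction in (0.33, 0.85):
        print(f"{fraction:.0%} of {quoted_peak:.0f} GFLOPS = "
              f"{fraction * quoted_peak:.1f} GFLOPS")
```

To use it for a real measurement, pass the GFLOPS number reported by the benchmark to efficiency() together with the peak computed for the node that was actually tested.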
On 02/22/2018 05:16 PM, John Hearns via Beowulf wrote:
> Prentice, I echo what Joe says.
> When doing benchmarking with HPL or SPEC benchmarks, I would optimise
> the BIOS settings to the highest degree I could.
> Switch off processor C-states.
> As Joe says, you need to look at what the OS is running in the
> background. I would disable the Bright cluster manager daemon, for instance.
>
>
> 85% of theoretical peak on an HPL run sounds reasonable to me, and I
> would get figures in that ballpark.
>
> For your AMDs I would start by choosing one system, no interconnect to
> cloud the waters. See what you can get out of that.
>
> On 22 February 2018 at 15:45, Joe Landman <joe.landman at gmail.com> wrote:
>
>
>
> On 02/22/2018 09:37 AM, Prentice Bisbal wrote:
>
> Beowulfers,
>
> In your experience, how closely does the actual performance of your
> processors match their theoretical performance? I'm
> investigating a performance issue on some of my nodes. These
> are older systems using AMD Opteron 6274 processors. I found
> literature from AMD stating the theoretical performance of these
> processors is 282 GFLOPS, and my LINPACK performance isn't
> coming close to that (I get approximately 33% of that). The
> number I often hear mentioned is that actual performance should be
> ~85% of theoretical performance. Is that a realistic number in your
> experience?
>
>
> 85% makes the assumption that you have the systems configured in an
> optimal manner, that the compiler doesn't do anything wonky, and
> that, to some degree, you isolate the OS portion of the workload off
> of most of the cores to reduce jitter. Among other things.
>
> At Scalable, I'd regularly hit 60-90% of theoretical max computing
> performance, with progressively more heroic tuning. Storage, I'd
> typically hit 90-95% of theoretical max (good architectures almost
> always beat bad ones). Networking, fairly similar, though tuning
> per use case mattered significantly.
>
>
> I don't want this to be a discussion of what could be wrong at
> this point; we will get to that in future posts, I assure you!
>
>
> --
> Joe Landman
> t: @hpcjoe
> w: https://scalability.org
>
>
> _______________________________________________
> Beowulf mailing list, Beowulf at beowulf.org
> <mailto:Beowulf at beowulf.org> sponsored by Penguin Computing
> To change your subscription (digest mode or unsubscribe) visit
> http://www.beowulf.org/mailman/listinfo/beowulf
> <http://www.beowulf.org/mailman/listinfo/beowulf>
>