[Beowulf] Theoretical vs. Actual Performance

Prentice Bisbal pbisbal at pppl.gov
Thu Feb 22 14:48:36 PST 2018

Just rebuilt OpenBLAS 0.2.20 locally on the test system with GCC 6.1.0, 
and I'm only getting 91 GFLOPS. I'm pretty sure OpenBLAS performance 
should be close to ACML performance, if not better. For now, I'm going 
to continue my testing using the ACML-based build and revisit the 
OpenBLAS performance later.
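
For reference, here is a rough sketch of how the figures in this thread 
compare against the 282 GFLOPS theoretical number. The names below 
(THEORETICAL_PEAK, RESULTS, efficiency) are made up for illustration, 
and the 282 GFLOPS peak is simply taken as quoted from AMD's literature 
rather than derived independently:

```python
# Figures quoted in this thread, all in GFLOPS.
THEORETICAL_PEAK = 282.0  # peak as stated in AMD's literature (not re-derived here)

RESULTS = {
    "HPL + OpenBLAS 0.2.20 (GCC 6.1.0)": 91.0,
    "HPL + ACML 6.1.0 (GCC 6.1.0)": 197.0,
}


def efficiency(measured, peak=THEORETICAL_PEAK):
    """Return measured performance as a percentage of theoretical peak."""
    return 100.0 * measured / peak


for build, gflops in RESULTS.items():
    print(f"{build}: {gflops:.0f} GFLOPS = {efficiency(gflops):.1f}% of peak")

# The ~85% rule of thumb mentioned in the thread, expressed in GFLOPS.
target = 0.85 * THEORETICAL_PEAK
print(f"85% of peak would be {target:.1f} GFLOPS")
```

By this arithmetic the OpenBLAS build sits at roughly a third of peak 
and the ACML build at roughly 70%, still short of the ~85% figure.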


On 02/22/2018 05:27 PM, Prentice Bisbal wrote:
> So I just rebuilt HPL using the ACML 6.1.0 libraries with GCC 6.1.0, 
> and I'm now getting 197 GFLOPS, so clearly there's a problem with my 
> OpenBLAS build. I'm going to try building OpenBLAS without the dynamic 
> arch support on the machine where I plan on running my tests, and see 
> if that version of the library is any better.
> Prentice
> On 02/22/2018 09:37 AM, Prentice Bisbal wrote:
>> Beowulfers,
>> In your experience, how closely does the actual performance of your 
>> processors match their theoretical performance? I'm investigating a 
>> performance issue on some of my nodes. These are older systems using 
>> AMD Opteron 6274 processors. I found literature from AMD stating that 
>> the theoretical performance of these processors is 282 GFLOPS, and my 
>> LINPACK performance isn't coming close to that (I get approximately 
>> 33% of that). The number I often hear mentioned is that actual 
>> performance should be ~85% of theoretical performance. Is that a 
>> realistic number in your experience?
>> I don't want this to be a discussion of what could be wrong at this 
>> point. We will get to that in future posts, I assure you!
