[Beowulf] AMD and AVX512

Sat Jun 19 15:49:06 UTC 2021

On Wed, 16 Jun 2021 13:15:40 -0400, you wrote:

>The answer given, and I'm 
>not making this up, is that AMD listens to their users and gives the 
>users what they want, and right now they're not hearing any demand for 
>AVX512.
>
>Personally, I call BS on that one. I can't imagine anyone in the HPC 
>community saying "we'd like processors that offer only 1/2 the floating 
>point performance of Intel processors".

I suspect that is marketing speak. It roughly translates not to "no
one has asked for it" but to "the requests haven't reached the
threshold where we consider them significant".

> Sure, AMD can offer more cores, 
>but with only AVX2, you'd need twice as many cores as Intel processors, 
>all other things being equal.

But of course all other things aren't equal.

AVX512 is a mess.

Look at the Wikipedia page(*) and note that AVX512 means different
things depending on the processor implementing it.
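
As a rough illustration of the fragmentation, here is a minimal Python
sketch that pulls the AVX512 feature flags out of a Linux
/proc/cpuinfo-style "flags" string and intersects two CPUs. The two
flag strings are hypothetical and trimmed for illustration, not copied
from real machines; the flag names follow the Linux naming (avx512f,
avx512cd, and so on).

```python
def avx512_subsets(flags_line):
    """Return the sorted AVX512 feature flags in a cpuinfo-style flags string."""
    return sorted(f for f in flags_line.split() if f.startswith("avx512"))


# Hypothetical, trimmed flag strings for two different AVX512-capable CPUs.
cpu_a = "fpu sse2 avx avx2 avx512f avx512dq avx512cd avx512bw avx512vl"
cpu_b = "fpu sse2 avx avx2 avx512f avx512cd avx512er avx512pf"

# Only the subsets present on both CPUs are safe to target on both.
common = sorted(set(avx512_subsets(cpu_a)) & set(avx512_subsets(cpu_b)))

print("CPU A:", avx512_subsets(cpu_a))
print("CPU B:", avx512_subsets(cpu_b))
print("Safe on both:", common)
```

On a real Linux box the input would be the "flags" line from
/proc/cpuinfo rather than a hard-coded string.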

So what does the poor software developer target?

There is also the fact that AVX512 can, for heat reasons, cause CPU
frequency reductions, meaning real-world performance may not match
the theoretical numbers. It is thus easier to just go with GPUs.
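
To put a number on the frequency effect, a back-of-the-envelope sketch
in Python: the theoretical 2x from doubling the vector width gets
scaled by the ratio of the AVX512 clock to the AVX2 clock. The clock
values below are made-up placeholders, not measurements or spec-sheet
figures, and the model ignores memory bandwidth and everything else
that matters in a real code.

```python
def effective_speedup(width_ratio, avx2_ghz, avx512_ghz):
    """Vector-width gain scaled by the clock penalty of the wider unit."""
    return width_ratio * (avx512_ghz / avx2_ghz)


# Hypothetical clocks: the core sustains 3.0 GHz on AVX2 code
# but drops to 2.4 GHz under heavy AVX512 load.
speedup = effective_speedup(width_ratio=2.0, avx2_ghz=3.0, avx512_ghz=2.4)
print(f"Effective AVX512 speedup over AVX2: {speedup:.2f}x")
```

With those placeholder clocks the "2x" shrinks to about 1.6x before
any other bottleneck is considered.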

The result is that most of the world is quite happily (at least for
now) ignoring AVX512 and going with GPUs as necessary - particularly
given the convenient libraries that Nvidia offers.

>I compared a server with dual AMD EPYC 7H12 processors (128 cores)
>[...] quad Intel Xeon 8268 processors (96 cores).

> From what I've heard, the AMD processors run much hotter than the Intel 
>processors, too, so I imagine a FLOPS/Watt comparison would be even less 
>favorable to AMD.

Spec sheets would indicate AMD runs hotter, but then again you
benchmarked twice as many Intel processors.

So, per spec sheets for your processors above:

AMD - 280W - 2 processors means system 560W
Intel - 205W - 4 processors means system 820W

(and then you also need to factor in purchase price).
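
The same arithmetic as a small Python sketch, extended to watts per
core using the core counts quoted above (128 and 96). Note that these
are CPU TDP figures only; memory, disks, fans and power supply losses
are not included, so this is not a measured wall-socket number.

```python
# TDP per socket and socket counts as given above; core totals from the
# quoted benchmark description.
systems = {
    "AMD EPYC 7H12": {"tdp_w": 280, "sockets": 2, "cores": 128},
    "Intel Xeon 8268": {"tdp_w": 205, "sockets": 4, "cores": 96},
}


def system_tdp(spec):
    """Total CPU TDP in watts for all sockets in the server."""
    return spec["tdp_w"] * spec["sockets"]


def watts_per_core(spec):
    """CPU TDP in watts divided by the total core count."""
    return system_tdp(spec) / spec["cores"]


for name, spec in systems.items():
    print(f"{name}: {system_tdp(spec)} W total, "
          f"{watts_per_core(spec):.3f} W per core")
```

So on paper the AMD box is lower both in total CPU TDP and in TDP per
core, even though each AMD socket is rated higher.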

>An argument can be made that calculations that lend themselves to 
>vectorization should be done on GPUs instead of the main processors, but 
>the last time I checked, GPU jobs are still memory limited, and 
>moving data in and out of GPU memory can still take time, so I can see 
>situations where for large amounts of data using CPUs would be preferred 
>over GPUs.

AMD's latest chips support PCIe 4.0 while Intel is still stuck on
PCIe 3.0, which may or may not make a difference.
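
For a sense of scale, a rough Python sketch of the theoretical
one-direction bandwidth of an x16 link and the time to push a data set
across it. The per-lane rates (8 GT/s and 16 GT/s) and the 128b/130b
encoding are the published PCIe figures; the x16 width and the 32 GB
data set are assumptions for illustration, and real transfers will be
slower because of protocol and driver overhead.

```python
def link_gb_per_s(gt_per_s, lanes=16):
    """Theoretical one-direction bandwidth in GB/s.

    128b/130b encoding carries 128 payload bits per 130 bits on the
    wire; dividing by 8 converts gigabits to gigabytes.
    """
    return gt_per_s * lanes * (128 / 130) / 8


def transfer_seconds(size_gb, gt_per_s, lanes=16):
    """Best-case time to move size_gb gigabytes across the link."""
    return size_gb / link_gb_per_s(gt_per_s, lanes)


DATASET_GB = 32  # hypothetical data set size

for name, rate in (("PCIe 3.0", 8.0), ("PCIe 4.0", 16.0)):
    print(f"{name} x16: {link_gb_per_s(rate):.1f} GB/s, "
          f"{transfer_seconds(DATASET_GB, rate):.2f} s for {DATASET_GB} GB")
```

Whether halving the best-case transfer time matters depends on how
often the job has to move data relative to how long it computes on it.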

But despite all of the above and the other replies, it is AMD who
has been winning the HPC contracts of late, not Intel.

* - https://en.wikipedia.org/wiki/Advanced_Vector_Extensions