[Beowulf] Clusters just got more important - AMD's roadmap

Wed Feb 8 06:01:30 PST 2012

On Feb 8, 2012, at 2:34 PM, Eugen Leitl wrote:

> On Wed, Feb 08, 2012 at 02:13:49PM +0100, Peter Kjellström wrote:
>
>>  * Memory bandwidth to all those FPUs
>
> Memory stacking via TSV is coming. APUs with their very apparent
> memory bottlenecks will accelerate it.
>
>>  * Power (CPUs in servers today max out around 120W, with GPUs at
>>    over 250W)
>
> I don't see why you can't integrate APU+memory+heatsink in a
> watercooled module that is plugged into the backplane which
> contains the switched signalling fabric.
>

Because the built-in GPU of the upcoming new Xbox gets the same power
envelope as the built-in GPU of the 'high-end' CPUs.

And that is before we even talk about laptop CPUs, where it will be less.

They have at most 18 watts for the built-in GPU.

So the first thing they do is kill all of its double precision.
Even if they didn't: the 6990, the high-end Nvidia cards and the 7990
are all 375 watt TDP on paper (in reality 450+ watt).

So whatever your 'opinion' is on how they design stuff, if performance
scales with the power budget, the built-in GPU will always be a factor
375 / 18 = 20.8 slower than a GPU card.
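
To make that arithmetic explicit, here is a minimal sketch in Python. It
assumes, as the estimate does, that throughput scales linearly with the
power budget, which is a crude simplification. The function name
slowdown_factor is made up for illustration.

```python
def slowdown_factor(card_watts: float, builtin_watts: float) -> float:
    """Ratio of the two power budgets.

    Assumption: throughput scales linearly with watts.
    """
    if builtin_watts <= 0:
        raise ValueError("builtin_watts must be positive")
    return card_watts / builtin_watts


# 375 W on-paper TDP of a high-end card versus 18 W for the built-in GPU.
paper = slowdown_factor(375, 18)
# Same comparison with the 450 W such cards draw in reality.
real = slowdown_factor(450, 18)

print(f"on paper: {paper:.1f}x, in reality: {real:.1f}x")
```

With the 450 watt real-world figure the gap only gets bigger.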

And the GPU cards can easily go over their TDP, as the PCI-e power
connectors will happily pump in more watts. The built-in GPUs can't, as
their power doesn't come from PCI-e but is bound by stricter specs.

But now let's look at design.

They cannot 'turn off' the AVX in the CPU, as then it doesn't support
the latest games. CPUs nowadays are only about making sure you do
better at the latest game and nothing else matters, whatever fairy tale
they'll tell you.

CPUs are an exorbitantly expensive part of the computer. Those x64 CPUs
are so expensive because of the 'blessing' of patents. Only 2 companies
are able to release x64 CPUs right now, and probably soon only 1, as
we'll have to see whether AMD survives this.

One of those companies is not even in a hurry to release their 8 core
Xeons in 32 nm; maybe they want to make more profit with a higher yield
CPU at 22 nm.

If we already know the GPU is crap in double precision because it has
just 18 watts, and we also know that the CPU has AVX, it's pretty
useless to let the GPU do the double precision calculations.

So the obvious optimization is to kick all double precision logic out
of the GPU. Contrary to what some will tell you, that doesn't save
transistors: it is usually all the same chip, they just turn those
transistors off. That gives them higher yields, so a cheaper production
price.

That's what they'll do if they want to make a profit, and I bet their
owners will be very unhappy if they do not make a profit.

So yes, in a nerd world it would be possible to just include a 2 core
chippie, plain 32 bit x86 at 10 watts or so, and give the majority of
the power envelope to a double precision optimized GPU, maybe even 50
watts, which makes it 'only' a factor 375 / 50 = 7.5 slower, in theory,
than a GPU card.
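
Turning the 50 watt case around: a rough sketch of how many watts the
built-in GPU would need to stay within a given factor of a 375 watt
card. It again assumes throughput scales linearly with power, and the
name watts_needed is made up for illustration.

```python
CARD_TDP_WATTS = 375.0  # on-paper TDP of the high-end cards mentioned above


def watts_needed(max_slowdown: float, card_watts: float = CARD_TDP_WATTS) -> float:
    """Built-in GPU budget needed to be at most max_slowdown times slower.

    Assumption: throughput scales linearly with watts.
    """
    if max_slowdown <= 0:
        raise ValueError("max_slowdown must be positive")
    return card_watts / max_slowdown


for factor in (20, 8, 4, 2):
    print(f"within {factor}x of the card: {watts_needed(factor):.1f} W")
```

Even a factor 4 already asks for more than 90 watts for the built-in
GPU alone, which is most of what a whole server CPU gets today.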

Yet that's not very likely to happen.

>> Either way we're in for an interesting future (as usual) :-)
>
> I don't see how x86 should make it to exascale. It's too
> bad MRAM/FeRAM/whatever isn't ready for SoC yet. Also, Moore
> should end by around 2020 or earlier, and architecture only
> pushes you one or two generations further at most. Don't see
> how 3D integration should be ready by then, and 2.5 D only
> buys you another one or two doublings at best. (TSV stacking
> is obviously off-Moore).