[Beowulf] PCPro: AMD: what went wrong?

Tue Feb 21 01:32:33 PST 2012

On Feb 21, 2012, at 5:14 AM, Lux, Jim (337C) wrote:

>
>
> On 2/20/12 12:48 PM, "Vincent Diepeveen" <diep at xs4all.nl> wrote:
>
>>
>> On Feb 20, 2012, at 9:29 PM, Lux, Jim (337C) wrote:
>>
>>> Comments below about automated vs manual design..
>>>
>>> Granted modern place and route is very sophisticated, but
>>> ultimately, it's
>>> a heuristic process (Xilinx had simulated annealing back in the
>>> 80s, for
>>> instance) which is trying to capture routine guidelines and rules  
>>> (as
>>> opposed to trying guided random strategies like GA, etc.)
>>
>> Actually for hand optimization of yields at modern CPU's stuff like
>> simulated annealing is less popular.
>
> I don't know that anyone still uses simulated annealing..it was an  
> example
> of what kinds of strategies were used in the early days.  Back in  
> the late
> 70s, early 80s, I was looking into automated layout of PCBs.  It was
> pretty grim.. 80% routing, then it would die.   The computational
> challenge is substantial.
>
>
>
>>
>> You can actually also use lineair solvers for that, in order to
>> recalculate entire design and under the right
>> constraints it gives an optimal solution, which is not garantueed for
>> the non-lineair solving methods as
>> those also easily can pick a local maximum.
>
>
> I don't think a linear solver would work.

That's always the tool getting used.

A chip has too many parameters to approximate in the first place.

non-lineair approximation by randomness is popular more in what my  
expertise is -
parameter tuning for example for chessprograms, despite it's the  
chessprogram with
worlds largest evaluation function (and as programmed by 1 program -  
sure some parts
total outdated years 90 code), it has just some 20k tunable  
parameters or so, depending upon
how you count.

In CPU's that's different of course, so given constraints lineair  
solvers get used, at least for
the nanometers that the cpu's have right now. I'm not familiar with  
22 nm there.

>>
>> Stuff like simulated annealing is more popular at the non-lineair
>> problems such as in artificial intelligence.
>
> The place and route problem is highly nonlinear with a lot of weird
> interactions.
>
> I'll be the first to confess that I am pretty bad at PCB or ASIC  
> layout,

well realize i'm a software engineer, yet the optimization is 100%  
software.

Usually there is small companies around the big shots that deliver to  
AMD and Intel machines,
which is relative tiny companies more or less belonging to them,  
which try to improve yields.

See it as service software belonging to the machines delivered by the  
ASML's.

> but there's a lot of tricky constraints that aren't a linear  
> function of
> position in some form.   Imagine having a data bus with 32 lines  
> that you
> need to have minimal skew between, so it can be latched.
>

It's not artificial intelligence, it's in all cases something that  
eats 2 dimensional space, where the
relative primitive model of modelling the lines are in fact not  
straight lines but have roundings everywhere.

You can do this incredible complex with 0% odds you're gonna solve it  
optimal,
or you can make a simpler model and solve it perfectly.

Something that definitely would be tougher if within the component  
there would be moveable parts.

So they all choose to model it using lineair programming. That gives  
a perfect solution and within
just at most a few days of calculation.

Using approximation even a moderately CPU would need worlds largest  
supercomputer longer than we live.

>
> I suppose this is a kind of game playing application so move tree  
> search
> strategies might work.  Certainly it has EP aspects (or nearly EP),  
> so a

Tree search is far more complex than the above.

The above has a clear goal: yield optimization and everything is  
always possible to solve by taking
  a tad more space.

The nonlineair aspects at doing this, which the lineair model doesn't  
take into consideration is the
reason why throwing a couple of hundreds of engineers at hand  
optimizing the CPU is so effective.

> big parallel machine might help.  For all we know, Intel and AMD  
> have big
> clusters helping the designers out, running 1000 copies of timing
> simulators.  Does Cadence, Synopsis, etc. have parallel versions?

I don't speak for anyone here except myself.

For AMD it's easier to guess than for intel, as intel is moving to 22  
nm. I'm not familiar with 22 nm.
Each new generation machine has new problems, let me assure you that.

Realize how rather inefficient simulated annealing and similar  
methods are. Trying to progress using
randomness.

This is a problem already with 20k parameters.

Realize the big supercomputers thrown at improving parameters of  
smaller engines than Diep with just
at most a 2000 parameters or so. Initially some guys tried to get  
away by defining functions, reducing that
amount of parameters to 50-200. They still do.

Then N*SA throws a big supercomputer at the problem and that's how  
they do it.

At a 150 parameters you're already looking at an oracle size of 2+  
million instances and a multiple
of that in tuningssteps, this still tunes into a local maximum by the  
way, no garantuee for the optimum solution.

This local maximum already takes long.

For something with 2+ billion transistors, you're any close to  
realize the problem of doing that in a lossy manner,
risking to get in a local maximum? It means even a tiny modification  
takes forever to solve, just to keep the
same yields.

So in case of CPU's just redefine the entire problem to lineair,
OR try to create a lookuptable which you can insert in the lineair  
solver.

You can insert a non-lineair subproblem with just a handful of  
parameters then into a lineair solver,
if you have a table that lists all possibilities.

Any scientists who claims on paper that using approximation he has a  
non-lineair solver that will always
find the garantueed best optimum, the first question i ask myself :  
"what's the big O worst case to get *any*
reasonable solution at all?"

Most usually first go from total randomness and need to run through  
all steps and only when nearing the end
of the algorithm they start having *a solution* which still is not  
optimal.

You don't have time to wait for that *a solution* in fact.

>
>
>
>>>
>
>