OT: computationally hard or not (algorithm question)

Sat Mar 24 10:47:01 PST 2001

On Fri, Mar 23, 2001 at 02:36:23PM -0500, Robert G. Brown wrote:
[ . . . ]
> You indicate that you intend to pick a point and then look for its ten
> nearest neighbors (according to some metric you haven't mentioned, so
> I'll assume a flat Euclidean metric).  If I understand this correctly
> (and there aren't any complications that kick the problem into the
> "complex" category) a solution might look like:
> 
> typedef struct {
>    double coord[12];
> } element;
> 
> element elements[10000];
> double distance[10000]
> 
> then of course it is simple to pick an element (or a point), evaluate
> the distance of each element in the elements from the element or point
> (storing the result in e.g. distance[i]) and sort the indices of
> distance (not the list itself).  This scales no worse than N plus the
> scaling of the sort algorithm used, and the scaling has nothing to do
> with the number of points in the space.  Indeed, each element can
> contain real number coordinates (hence double coord[12]) instead of one
> of 12 possible integer values with some integer metric and it won't make
> a lot of difference in the time required for the computation and no
> difference at all in the scaling and a beowulf is likely not needed
> unless you plan to do this a LOT.

Hm, true. It's actually not as complicated as I imagined! I'll have to
play with this a bit more to figure out whether it would be sufficiently
efficient but I think it might actually be that simple.

> 
> Alternatively, the problem may be something where you have to store the
> elements in an array that maps to the spatial coordinates -- you have a
> ************Space array addressable as
> Space[a][b][c][d][e][f][g][h][i][j][k][l] where each array element
> contains either 0 or one of 12 values.  You then have to pick a point (a
> nontrivial operation in and of itself, as if a-l can be 1-12 each and
> there are only 10^4 slots that aren't 0, you'll have to pick random
> coordinates a lot of times to find one that doesn't contain 0.  Since
> this array is sparse, the sane thing would be to invert it as described
> above; however, one COULD always e.g. search concentric 12-cubes around
> the point until one finds ten entries.  This would scale terribly --
> remember the space is only 0.0000001% occupied (or something like that
> -- I'm not doing the arithmetic carefully) so you'd have to search
> rather large cubes around the point before you were at all "likely" to
> find 10 hits if the elements aren't spatially bunched in their
> distribution.

Someone else suggested this approach and I initially thought it would be
the solution. But you're obviously right in that the sparseness of the
array would make it very unlikely that you'd find anything nearby if the
distribution is random (which I'd expect it to be, more or less). This
_would_ scale terribly!

> 
> There are a number of physics and statistics problems that lie between
> these two extremes -- the dimensionality of the space is high (perhaps
> much higher than 12) but one cannot simply write down a way of
> manipulating the elements as a simple vector.  In that event methods of
> searching for optima in high dimensional spaces come into play --
> conjugate gradient methods or genetic optimization methods or simulated
> annealing methods.  A classic example is the travelling salesman
> problem, where there are e.g. 12 cities that must be visited in some

That was on mind when I posted the question. I wrote a little program in
Pascal a while back that implemented the subset-sum algorithm described
in Cormen, Leiserson and Rivest's Introduction To Algorithms, which is
in the same class of NP-hard problems as the travelling salesman problem.
This did scale terribly with the number of elements . . . 

Many thanks,

Frank