Nearest point to another point in the hypersphere

I have n (about 10 ^ 5) points on a hypersphere of dimension m (between 10 ^ 4 to 10 ^ 6).

I am going to make a bunch of queries of the form "given the point p, find the nearest of n points to p". I will make about n of these requests.

(Not sure if the fact of hypersphere helps at all.)

A simple naive algorithm to solve this is to compare p with all other n points for each request. Doing this n times ends with a runtime of O (n ^ 2 m), which is too large for me to calculate.

Is there a more efficient algorithm that I can use? If I could get it before O (nm) with some log factors that would be great.

+5
source share
2 answers

Probably not. Having many dimensions makes effective indexing extremely difficult. This is why people are looking for ways to reduce the number of dimensions to something controlled.

See https://en.wikipedia.org/wiki/Curse_of_dimensionality and https://en.wikipedia.org/wiki/Dimensionality_reduction for more.

+3
source

Divide your space into hypercubes - call these cells - with the selected edge size so that on average you have one point per cube. You will need a map from hypersexes to the set of points that they contain.

Then, given the point, check its hypersecond for the other points. If it is empty, look at the neighboring hyperelements (I would recommend a literal hypercube of hyperelements for simplicity, rather than some approximation to the hypersphere built from hyperelements). Check it out for other points. Keep repeating until you get the point. Assuming that your points are randomly distributed, the likelihood that you will find the second item within 1-2 extensions is high.

Once you find the point, check all hyperlinks that may contain a closer point. This is possible because the point you find may be in the corner, but there is some closer point outside the hypercube that contains all the hyperelements that you have checked so far.

0
source

Source: https://habr.com/ru/post/1264468/


All Articles