Kd trees: nearest neighbor search algorithm

This is my understanding of this: 1. Fold the tree by taking the left or right subtree according to whether the ELEMENT will lie in the right or right subtree, if it exists. 2. Set CURRENT_BEST as the first leaf of the node that you will reach. 3. When you regenerate the backup, check to see if ELEMENT is closer to the split hyperplane than to CURRENT_BEST. If so, set CURRENT_BEST as the current node.

This is the part that I got from Wikipedia and my class, and the part that I don’t understand: 4. Check if any node in another subtree of the split point selected in 3. is closer to ELEMENT than the split point.

I don’t understand why we need to do 4., since any point that can be in one subtree of the node splitting must necessarily be closer to splitting the node than any point of another subtree.

This is obviously my understanding of the algorithm, which is erroneous, so help would be greatly appreciated.

+4
source share
3 answers

Step 4 is the “else” in step 3, what do you do if the plane is closer than the point. Just because the point you find will be in the same rectangle as the point you find for the neighbor does not mean that it is closest.

Imagine the following scenario: you have two points in your kD-tree, A and B. A is in the middle of its rectangle, and B is just above the edge, in the area with the section next to A. If you are now looking for the nearest neighbor to the point C, which is next to B, but turns out to be the other side of the edge, and in the area of ​​section A, your first point you choose will be A because of the initial depth of the first search, which selects everything that will be in the same section as your search point. However, B is really closer, therefore, even if you chose A, you need to check if B is closer, otherwise your kD-tree will not actually give you the correct results.

A good way to visualize this is to cross out it:

a ------------- C-- | -IN

A is the first point we found in DFS, C is our point we want to find the nearest neighbor, B is the nearest nearest neighbor, | is our splitting plane.

Another way to think about it is to draw a circle with a radius dist (A, C) around point C. If any other rectangles have any part of themselves, fall into this circle, then there is a chance that they hold a point that may be closer to C than A, so they need to be checked. If you now find B, you can reduce the radius of your circle (because B is closer) so that fewer rectangles are able to intersect, and as soon as you check all the rectangles that intersect with your circle (decreasing the radius of the circle as your find closer neighbors) , you can finally say that there are no nearby points.

+7
source

I wrote a basic C ++ implementation on github . It has an iterative and recursive version.

+2
source
function kdtree (list of points pointList, int depth) { // Select axis based on depth so that axis cycles through all valid values var int axis := depth mod k; // Sort point list and choose median as pivot element select median by axis from pointList; // Create node and construct subtrees var tree_node node; node.location := median; node.leftChild := kdtree(points in pointList before median, depth+1); node.rightChild := kdtree(points in pointList after median, depth+1); return node; } 
-4
source

Source: https://habr.com/ru/post/1447653/


All Articles