What spatial data structure (algorithm) is best suited for (searching) a set of areas (spatial data)?

I have a set of areas (geo objects) that are polygons. This data set is fixed; therefore, there is no need to insert or delete data. What data structure can be used to search for areas where the query point is (longitude, latitude)?

Note. I implemented KD-Tree (actually a 2D tree) for a set of points. But this does not work for this problem. Then I implemented R-Tree; and it solves the problem, but it is slow (or my implementation sucks).

thanks

Note. I was working on an R-Tree implementation, and now it works great.

+4
source share
2 answers

A R-Tree data structure can be used for this problem.

+1
source

Since you are not inserting / deleting and apparently have enough time to preprocess the data, you can use some additional memory to speed up the calculations. The basic idea of ​​preprocessing:

  • Take all the points of the polygon and define the smallest axis-oriented bounding box that contains them all; basically it is min and max X and Y.
  • Select the separation factor dX and dY that you will use to create the search grid. Choosing two degrees for partioning coefficients can do for faster calculation later.
  • Translate the polygon data so that its minimum boundary rectangle matches (0,0) and expands the rectangle so that it is an integer multiple of the partition coefficient in each dimension.
  • Consider each square of the grid and make a list of polygons that intersect the square. Save this list for each square of the grid. Depending on the nature of the data (how many polygons you can ever expect to cross a square), you can optimize this for both storage and speed.

Now when you want to find areas containing a point:

  • Translate the point using the initial definition that we defined earlier, and determine the grid square containing the point (if you used force two, this is a shift operation, otherwise it is division.)
  • Look at the list of squared grids. If it is empty, then there is no polygon. If not, you should look at each of the polygons in the list and find the intersection.

This works well for common and mostly disjoint polygons, especially if you can choose a mesh size good enough to only have a few polygons per square. It will be slow when you click squares with a lot of intersecting polygons. Another optimization is to have a flag for each listed polygon on a square to indicate that the square is completely contained within the polygon; this allows to avoid a slow containment test in many cases due to one bit per polygon record. This is especially valuable if the distance between the grids is large compared to the size of the polygon, since most squares will not be at intersections or edges.

If you need even more speed, you can start storing edge information on each square with a polygon link. You only need to test against the edges of the polygon that actually intersect the square. This can reduce the burden of only a few edge tests per polygon.

+2
source

Source: https://habr.com/ru/post/1380164/


All Articles