I would use a multi-scale grid approach (equivalent to quadtrees in one form or another).
I assume that you are using integer coordinates (i.e. pixels) and have enough space to store all the pixels.
Have an array of rectangle lists, one for each pixel. Then bin the pixels 2x2 and do it again. And again, and again, until you have one cell that covers everything.
Now the key is that you insert each rectangle at a level suited to its size. That level will be something like (cell size) ~= min(height, width) / 2. Then each rectangle requires only a few insertions into lists (you can bound the number above by a constant; for example, pick a level at which the rectangle covers about 4 to 16 cells).
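A minimal sketch of the level choice and insertion, under my own assumptions (the names `make_grid`, `level_for`, and `insert` are mine, not from the answer, and I assume a square world of 2^8 = 256 pixels on a side with sparse per-level dicts standing in for the per-pixel arrays):

```python
SIZE_LOG2 = 8           # assumed: world is 256x256 pixels
LEVELS = SIZE_LOG2 + 1  # levels 0..8; cells at level L are 2**L pixels wide

def make_grid():
    # One dict of cell -> list of rectangles per level
    # (a sparse stand-in for the "array of rectangle lists").
    return [dict() for _ in range(LEVELS)]

def level_for(rect):
    # rect = (x0, y0, x1, y1); pick the level whose cell size is
    # roughly min(height, width) / 2.
    x0, y0, x1, y1 = rect
    target = max(1, min(x1 - x0, y1 - y0) // 2)
    return min(target.bit_length() - 1, LEVELS - 1)  # floor(log2(target))

def insert(grid, rect):
    x0, y0, x1, y1 = rect
    lv = level_for(rect)
    # Append the rectangle to every cell of its level that it overlaps;
    # by the choice of level this is only a handful of cells (~4 to 16).
    for cy in range(y0 >> lv, ((y1 - 1) >> lv) + 1):
        for cx in range(x0 >> lv, ((x1 - 1) >> lv) + 1):
            grid[lv].setdefault((cx, cy), []).append(rect)
    return lv
```

For instance, a 10x10 rectangle lands at the level with 4-pixel cells, so it touches at most 4x4 = 16 of them.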
If you want to find all the rectangles at (x, y), you look in the list for the finest pixel containing it, then in the list for the 2x2 cell that contains it, then 4x4, and so on; that is log2(number of pixels) lookups in total. (For the coarser cells you have to check whether (x, y) is really inside each rectangle; you expect about half of those checks to succeed near a rectangle's border and all of them to succeed in its interior, so you expect no worse than about twice the work of looking only at exact hits.)
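The lookup above can be sketched as follows (hypothetical names again; this assumes the grid is a list of dicts, `grid[level][(cx, cy)]` holding a list of rectangles, as produced by the insertion step):

```python
LEVELS = 9  # assumed: world is 2**(LEVELS - 1) = 256 pixels on a side

def contains(rect, x, y):
    x0, y0, x1, y1 = rect
    return x0 <= x < x1 and y0 <= y < y1

def query(grid, x, y):
    hits = []
    # Walk from the finest pixel up through the 2x2, 4x4, ... cells that
    # contain (x, y): log2(number of pixels) list lookups in total.
    for lv in range(LEVELS):
        for rect in grid[lv].get((x >> lv, y >> lv), []):
            # A coarse cell may hold rectangles that merely overlap the
            # cell, so confirm the point is really inside the rectangle.
            if contains(rect, x, y):
                hits.append(rect)
    return hits
```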
Now, what about insertion? It is very cheap: O(1) to prepend to each list.
How about removal? That is more expensive: you have to walk and splice each list for every cell the rectangle was inserted into. This is roughly O(n) in the number of rectangles that overlap that region of space at a similar scale. If you have a really large number of such rectangles, you should use a different data structure for each cell's contents (a hash set, red-black tree, etc.).
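A sketch of the removal step, under the same assumptions as above (names are mine; `level` is the level the rectangle was inserted at, and each cell holds a plain Python list, whose `remove` is the linear-time splice described here):

```python
def remove(grid, rect, level):
    # Revisit every cell at `level` that the rectangle was inserted into
    # and splice it out of that cell's list: O(n) in the rectangles that
    # overlap those cells at a similar scale.
    x0, y0, x1, y1 = rect
    for cy in range(y0 >> level, ((y1 - 1) >> level) + 1):
        for cx in range(x0 >> level, ((x1 - 1) >> level) + 1):
            cell = grid[level].get((cx, cy))
            if cell is not None:
                cell.remove(rect)
                if not cell:                  # drop empty cells to keep
                    del grid[level][(cx, cy)]  # the sparse dicts small
```

Swapping the per-cell lists for hash sets would make each splice O(1) at the cost of losing insertion order.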
(Note: if your smallest rectangle is bigger than a pixel, you do not actually need to build the multiscale structure all the way down to the pixel level; just go down until the cells are about half the size of your smallest rectangle.)