How to check if Cartesian coordinates of a rectangle are effective?

The situation is as follows:

  • There are N arrays.
  • Each array (0..N-1) stores (x, y) tuples (Cartesian coordinates)
  • The length of each array may be different.

I want to extract a subset of the coordinate combinations that make up a complete retouch of size N. In other words; all Cartesian coordinates are adjacent to each other.

Example:

findRectangles({ {*(1,1), (3,5), (6,9)}, {(9,4), *(2,2), (5,5)}, {(5,1)}, {*(1,2), (3,6)}, {*(2,1), (3,3)} }) 

gives the following:

 [(1,1),(1,2),(2,1),(2,2)], ..., ...(other solutions)... 

There are no two points from the same set.

At first, I simply calculated the Cartesian product, but it quickly becomes impracticable (my use case currently has 18 arrays of points, each of which contains about 10 different coordinates).

+6
source share
2 answers

Let XY be your collection of arrays. We construct two new sets X and Y, where X is XY with all arrays sorted by x-coordinate, and Y is XY with all arrays sorted by y-coordinate.

  • For each point (x0, y0) in any of the arrays in X: find each point (x0, y1) with the same x-coordinate and a different y-coordinate in the remaining arrays from X
  • For each such pair of points (if it exists): search Y for points (x1, y0) and (x1, y1)

Let C be the size of the largest array. Then sorting all the sets takes O (N * C * log (C)) time. In step 1, it takes O (N * log (C)) time to find a single match point, since all arrays in X are sorted. Find all such points in O (C * N), since in the general case there are at most C * N points. Step 2 takes O (N * log (C)) time, since Y is sorted.

Therefore, the general asymptotic runtime is in O (C * N ^ 2 * log (C) ^ 2).

For C == 10 and N == 18, you get approximately 10,000 operations. Multiply this by 2 since I dropped this coefficient due to Big-O-notation.

The solution has another advantage - to be extremely easy to implement. All you need is arrays, sorting, and binary search, the first two of which are most likely already built into the language, and binary search is extremely simple.

Also note that this is the worst-case execution time, when all the rectangles start at the same x coordinate. In the middle case, you are likely to be much better than that.

0
source

You can use hashing :

 hash each point (keeping track of which list it is in) for each pair of points (a,b) and (c,d): if (a,d) exists in another list, and (c,b) exists in yet another list: yield rectangle(...) 

When I say exists , I mean something like:

 hashesToPoints = {} for p in points: hashesToPoints.setdefault(hash(p),set()).add(p) for p1 in points: for p2 in points: p3,p4 = mixCoordinates(p1,p2) if p3 in hashesToPoints[hash(p3)] and {{p3 doesn't share a bin with p1,p2}}: if p4 in hashesToPoints[hash(p4)] and {{p4 doesn't share a bin with p1,p2,p3}}: yield Rectangle(p1,p2) 

This is O(#bins^2 * items_per_bin^2) ~ 30000, which is pretty fast in your case of 18 arrays and 10 items_per_bin - much better than the external approach to the product, which ... is much worse with O(items_per_bin^#bins) ~ 3trillion. =)


minor side effect:

You can reduce both the base and the exponent in your calculations by making a few passes of the β€œtrim”. eg.

 remove each point that is not corectilinear with another point in the X or Y direction then maybe remove each point that is not corectilinear with 2 other points, in both X and Y direction 

You can do this by sorting by the X coordinate, repeat for the Y coordinate, in O(P log(P)) time in terms of the number of points. You can do this at the same time as hashing. If a bad guy organizes your input, he can make this optimization inoperative altogether. But depending on your spread, you can see significant acceleration.

+5
source

Source: https://habr.com/ru/post/888410/


All Articles