Implement brute force algorithm for detecting a self-intersecting polygon

Question

Implement brute force algorithm for detecting a self-intersecting polygon

I originally implemented the Hoi Chamos algorithm, however it is too complicated for future maintainability (I am not talking about this) and it did not report correctly, therefore the optimized brute force algorithm is what I am going to use.

My question is: how can I optimize this code for use?

In a way, my code contains a nested for loop repeating the same list twice.

EDIT: rotated lines in a HashSet and used two foreach loops ... shaved off about 45 seconds from a scan of 10,000. This is still not enough.

foreach (Line2D g in lines) { foreach (Line2D h in lines) { if (g.intersectsLine(h)) { return false; } } } // end 'lines' for each loop

If I force my intersectsLine () method to return false (for testing purposes), it still takes 8 seconds to scan 10,000 records (I have 700,000 records). This is too long, so I need to optimize this piece of code.

I tried to remove the lines from the list after it was compared with all the other lines, however the problem with accuracy (I don’t know why) and the speed increase are hardly noticeable.

Here is my intersectsLine method. I found an alternative approach here , but it looks like it will be slower with all method calls and much more. The calculation of the slope does not seem to me as if it took up too much computing power (correct me if I am wrong?)

 public bool intersectsLine(Line2D comparedLine) { //tweakLine(comparedLine); if (this.Equals(comparedLine) || P2.Equals(comparedLine.P1) || P1.Equals(comparedLine.P2)) { return false; } double firstLineSlopeX, firstLineSlopeY, secondLineSlopeX, secondLineSlopeY; firstLineSlopeX = X2 - X1; firstLineSlopeY = Y2 - Y1; secondLineSlopeX = comparedLine.X2 - comparedLine.X1; secondLineSlopeY = comparedLine.Y2 - comparedLine.Y1; double s, t; s = (-firstLineSlopeY * (X1 - comparedLine.X1) + firstLineSlopeX * (Y1 - comparedLine.Y1)) / (-secondLineSlopeX * firstLineSlopeY + firstLineSlopeX * secondLineSlopeY); t = (secondLineSlopeX * (Y1 - comparedLine.Y1) - secondLineSlopeY * (X1 - comparedLine.X1)) / (-secondLineSlopeX * firstLineSlopeY + firstLineSlopeX * secondLineSlopeY); if (s >= 0 && s <= 1 && t >= 0 && t <= 1) { //console.WriteLine("s = {0}, t = {1}", s, t); //console.WriteLine("this: " + this); //console.WriteLine("other: " + comparedLine); return true; } return false; // No collision */ }

EDIT: The main bottleneck is large polygons! I excluded the use of polygons with more than 100 lines, and at 5:10 he used all 700,000+ polygons. What is in the acceptable range! Surely there is a way to check whether the polygon should be calculated before running this code? Do I have access to the XMin, Xmax, YMin, and YMax properties, if that helps?

Output another test that limits polygons to 1000 lines each. It took a little over an hour.

I removed all polygon restrictions and it worked for 36 hours ... still no results.

A few ideas that I have:

-When I create a hashset of my string, you have another hashset / List that has candidates for intersection. I would add only lines to this list if there is a chance of intersection. But how can I eliminate / add features? If there were only three lines that could overlap with the others, I would compare 4000 lines versus 3 instead of 4000. That alone could make a BIG difference.

-If the same point appears twice, except for the first and last point, skip the execution of the nested loop.

Edit:

Information about landfills: 700,000 total

There are more than four thousand polygons with 1000 or more points.

There are 2 polygons with 70,000 or more points.

I think this can be done up to fifteen minutes with a little creativity.

+3

c # algorithm arcgis shapefile polygons

Evan parsons Aug 22 '13 at 16:29

source share

3 answers

There are two things to check:

closed polygon consists of a cyclic sequence of points
- if the same point in this sequence is more than once than a self-intersecting polygon.
- beware that the first and last points may be the same (different from your polygonal representation), in which case this point should be there more than a tan twice.
check all non-adjacent lines of your polygon for intersection
- non adjacent lines do not split point
- if there is an intersection, then the polygon is self-intersecting

+1

Spektre Sep 16 '13 at 14:29

source share

Simple optimization, which should be half the time for non-overlapping polygons:

 int count = lines.Count(); for (int l1idx = 0; l1idx < count-1; l1idx++) for (int l2idx = l1idx+1; l2idx < count; l2idx++) { Line2D g = lines[l1idx]; Line2D h = lines[l2idx]; if (g.intersectsLine(h)) { return false; } }

+1

weston Oct 4 '13 at 11:32

source share

RBarryYoung · Accepted Answer · 2013-10-04T03:09:27+0000

Your current Brute-Force algorithm is O (n ^ 2). For your two 70,000 linear polygons that have nearly 10 billion operations, not to mention 700,000 other polygons. Obviously, no simple code optimization will be enough, so you will need some algorithmic optimization that can reduce it O (n ^ 2) without undue complication.

O (n ^ 2) comes from nested loops in the brute force algorithm, each of which is bounded by n , making it O (n * n). The easiest way to improve this is to find a way to reduce the inner loop so that it is not connected or not dependent on n . Thus, we need to find a way to arrange or reorder the inner list of strings in order to check the outer row so that only part of the complete list is checked.

The approach that I take uses the fact that if two segments of a line intersect, then the range of their X values should overlap. Keep in mind, this does not mean that they intersect, but if their X ranges do not overlap, then they cannot intersect, so they do not need to check them against each other. (This also applies to ranges of Y values, but you can only use one dimension at a time).

An advantage of this approach is that these X ranges can be used to arrange the endpoints of lines, which, in turn, can be used as starting and ending points for which lines should be checked for intersection.

So what exactly do we do is define a custom class ( endpointEntry ) that represents the High or Low X values for the two rows. These endpoints are all placed in one list structure and then sorted based on their X values.

Then we implement an external loop in which we look at the entire list in the same way as in the brute force algorithm. However, our inner cycle is significantly different. Instead of rescanning the entire list of lines to check for intersection, we would rather start scanning the sorted list of endpoints from the endpoint with the high X value of the outer loop line and finish it when we go below the same lower endpoint X of the endpoint. Thus, we only check lines whose value range X overlaps the outer contour line.

OK, here is a C # class demonstrating my approach:

 class CheckPolygon2 { // internal supporting classes class endpointEntry { public double XValue; public bool isHi; public Line2D line; public double hi; public double lo; public endpointEntry fLink; public endpointEntry bLink; } class endpointSorter : IComparer<endpointEntry> { public int Compare(endpointEntry c1, endpointEntry c2) { // sort values on XValue, descending if (c1.XValue > c2.XValue) { return -1; } else if (c1.XValue < c2.XValue) { return 1; } else // must be equal, make sure hi sort before lo's if (c1.isHi && !c2.isHi) { return -1; } else if (!c1.isHi && c2.isHi) { return 1; } else { return 0; } } } public bool CheckForCrossing(List<Line2D> lines) { List<endpointEntry> pts = new List<endpointEntry>(2 * lines.Count); // Make endpoint objects from the lines so that we can sort all of the // lines endpoints. foreach (Line2D lin in lines) { // make the endpoint objects for this line endpointEntry hi, lo; if (lin.P1.X < lin.P2.X) { hi = new endpointEntry() { XValue = lin.P2.X, isHi = true, line = lin, hi = lin.P2.X, lo = lin.P1.X }; lo = new endpointEntry() { XValue = lin.P1.X, isHi = false, line = lin, hi = lin.P1.X, lo = lin.P2.X }; } else { hi = new endpointEntry() { XValue = lin.P1.X, isHi = true, line = lin, hi = lin.P1.X, lo = lin.P2.X }; lo = new endpointEntry() { XValue = lin.P2.X, isHi = false, line = lin, hi = lin.P2.X, lo = lin.P1.X }; } // add them to the sort-list pts.Add(hi); pts.Add(lo); } // sort the list pts.Sort(new endpointSorter()); // sort the endpoint forward and backward links endpointEntry prev = null; foreach (endpointEntry pt in pts) { if (prev != null) { pt.bLink = prev; prev.fLink = pt; } prev = pt; } // NOW, we are ready to look for intersecting lines foreach (endpointEntry pt in pts) { // for every Hi endpoint ... if (pt.isHi) { // check every other line whose X-range is either wholly // contained within our own, or that overlaps the high // part of ours. The other two cases of overlap (overlaps // our low end, or wholly contains us) is covered by hi // points above that scan down to check us. // scan down for each lo-endpoint below us checking each // line for intersection until we pass our own lo-X value for (endpointEntry pLo = pt.fLink; (pLo != null) && (pLo.XValue >= pt.lo); pLo = pLo.fLink) { // is this a lo-endpoint? if (!pLo.isHi) { // check its line for intersection if (pt.line.intersectsLine(pLo.line)) return true; } } } } return false; } }

I'm not sure the true complexity of this algorithm, but I suspect that for most non-pathological polygons it will be close to O (n * SQRT (n)), which should be fast enough.

Explanation of the logic of the inner loop:

The inner loop simply scans the list of endpoints in the same ordered order as the outer loop. But it will start scanning from where the outer loop is where the outer loop is currently located (which is the hiX point of some string) and will only be checked until the endPoint values fall below the corresponding loX of the same string .

What a difficulty here is that this cannot be done with the Enumerator ( foreach(..in pts) outer loop), because it is not possible to list a list sublist or start an enumeration based on the current current position of the enumerations. So instead, I used the Forward and Backward Links properties (fLink and bLink) to create a double-linked structure that retains the sorted list order, but I can incrementally scan without listing the list:

 for (endpointEntry pLo = pt.fLink; (pLo != null) && (pLo.XValue >= pt.lo); pLo = pLo.fLink)

In this case, the old-style for loop specifier consists of three parts: initialization, condition, and increment-decrement. endpointEntry pLo = pt.fLink; initialization expression endpointEntry pLo = pt.fLink; initializes pLo direct link to the current point in the list. That is, the next point in the list, in descending sort order.

Then the body of the inner cycle is executed. Then the decrement increment pLo = pLo.fLink , which simply sets the current point of the inner contour ( pLo ) to the next lower point using its direct link ( pLo.fLink ), thereby promoting the loop.

Finally, it exits after testing the condition of the loop (pLo != null) && (pLo.XValue >= pt.lo) , which loops until the new point is zero (which will mean that we were at the end of the list ) and until the new XValue is still greater than or equal to the lower X value of the current point of the outer contour. This second condition ensures that the inner loop only considers lines that overlap the endpoint line of the outer contour.

Now it’s become clearer to me that I could probably get around this fLink-bLink inconvenience by treating the endPoints List as an array:

Fill the list with endpoints
Sort list by XValue
Outer Loop through the list in descending order using the index iterator ( i ), instead of the foreach enumerator
(A) The inner loop through the list, using the second iterator ( j ), starting with i and ending when it passed below pt.Lo

What I think would be a lot easier. I can publish a simplified version if I want.

Implement brute force algorithm for detecting a self-intersecting polygon

Explanation of the logic of the inner loop:

More articles: