Your current Brute-Force algorithm is O (n ^ 2). For your two 70,000 linear polygons that have nearly 10 billion operations, not to mention 700,000 other polygons. Obviously, no simple code optimization will be enough, so you will need some algorithmic optimization that can reduce it O (n ^ 2) without undue complication.
O (n ^ 2) comes from nested loops in the brute force algorithm, each of which is bounded by n , making it O (n * n). The easiest way to improve this is to find a way to reduce the inner loop so that it is not connected or not dependent on n . Thus, we need to find a way to arrange or reorder the inner list of strings in order to check the outer row so that only part of the complete list is checked.
The approach that I take uses the fact that if two segments of a line intersect, then the range of their X values โโshould overlap. Keep in mind, this does not mean that they intersect, but if their X ranges do not overlap, then they cannot intersect, so they do not need to check them against each other. (This also applies to ranges of Y values, but you can only use one dimension at a time).
An advantage of this approach is that these X ranges can be used to arrange the endpoints of lines, which, in turn, can be used as starting and ending points for which lines should be checked for intersection.
So what exactly do we do is define a custom class ( endpointEntry ) that represents the High or Low X values โโfor the two rows. These endpoints are all placed in one list structure and then sorted based on their X values.
Then we implement an external loop in which we look at the entire list in the same way as in the brute force algorithm. However, our inner cycle is significantly different. Instead of rescanning the entire list of lines to check for intersection, we would rather start scanning the sorted list of endpoints from the endpoint with the high X value of the outer loop line and finish it when we go below the same lower endpoint X of the endpoint. Thus, we only check lines whose value range X overlaps the outer contour line.
OK, here is a C # class demonstrating my approach:
class CheckPolygon2 { // internal supporting classes class endpointEntry { public double XValue; public bool isHi; public Line2D line; public double hi; public double lo; public endpointEntry fLink; public endpointEntry bLink; } class endpointSorter : IComparer<endpointEntry> { public int Compare(endpointEntry c1, endpointEntry c2) { // sort values on XValue, descending if (c1.XValue > c2.XValue) { return -1; } else if (c1.XValue < c2.XValue) { return 1; } else // must be equal, make sure hi sort before lo's if (c1.isHi && !c2.isHi) { return -1; } else if (!c1.isHi && c2.isHi) { return 1; } else { return 0; } } } public bool CheckForCrossing(List<Line2D> lines) { List<endpointEntry> pts = new List<endpointEntry>(2 * lines.Count); // Make endpoint objects from the lines so that we can sort all of the // lines endpoints. foreach (Line2D lin in lines) { // make the endpoint objects for this line endpointEntry hi, lo; if (lin.P1.X < lin.P2.X) { hi = new endpointEntry() { XValue = lin.P2.X, isHi = true, line = lin, hi = lin.P2.X, lo = lin.P1.X }; lo = new endpointEntry() { XValue = lin.P1.X, isHi = false, line = lin, hi = lin.P1.X, lo = lin.P2.X }; } else { hi = new endpointEntry() { XValue = lin.P1.X, isHi = true, line = lin, hi = lin.P1.X, lo = lin.P2.X }; lo = new endpointEntry() { XValue = lin.P2.X, isHi = false, line = lin, hi = lin.P2.X, lo = lin.P1.X }; } // add them to the sort-list pts.Add(hi); pts.Add(lo); } // sort the list pts.Sort(new endpointSorter()); // sort the endpoint forward and backward links endpointEntry prev = null; foreach (endpointEntry pt in pts) { if (prev != null) { pt.bLink = prev; prev.fLink = pt; } prev = pt; } // NOW, we are ready to look for intersecting lines foreach (endpointEntry pt in pts) { // for every Hi endpoint ... if (pt.isHi) { // check every other line whose X-range is either wholly // contained within our own, or that overlaps the high // part of ours. The other two cases of overlap (overlaps // our low end, or wholly contains us) is covered by hi // points above that scan down to check us. // scan down for each lo-endpoint below us checking each // line for intersection until we pass our own lo-X value for (endpointEntry pLo = pt.fLink; (pLo != null) && (pLo.XValue >= pt.lo); pLo = pLo.fLink) { // is this a lo-endpoint? if (!pLo.isHi) { // check its line for intersection if (pt.line.intersectsLine(pLo.line)) return true; } } } } return false; } }
I'm not sure the true complexity of this algorithm, but I suspect that for most non-pathological polygons it will be close to O (n * SQRT (n)), which should be fast enough.
Explanation of the logic of the inner loop:
The inner loop simply scans the list of endpoints in the same ordered order as the outer loop. But it will start scanning from where the outer loop is where the outer loop is currently located (which is the hiX point of some string) and will only be checked until the endPoint values โโfall below the corresponding loX of the same string .
What a difficulty here is that this cannot be done with the Enumerator ( foreach(..in pts) outer loop), because it is not possible to list a list sublist or start an enumeration based on the current current position of the enumerations. So instead, I used the Forward and Backward Links properties (fLink and bLink) to create a double-linked structure that retains the sorted list order, but I can incrementally scan without listing the list:
for (endpointEntry pLo = pt.fLink; (pLo != null) && (pLo.XValue >= pt.lo); pLo = pLo.fLink)
In this case, the old-style for loop specifier consists of three parts: initialization, condition, and increment-decrement. endpointEntry pLo = pt.fLink; initialization expression endpointEntry pLo = pt.fLink; initializes pLo direct link to the current point in the list. That is, the next point in the list, in descending sort order.
Then the body of the inner cycle is executed. Then the decrement increment pLo = pLo.fLink , which simply sets the current point of the inner contour ( pLo ) to the next lower point using its direct link ( pLo.fLink ), thereby promoting the loop.
Finally, it exits after testing the condition of the loop (pLo != null) && (pLo.XValue >= pt.lo) , which loops until the new point is zero (which will mean that we were at the end of the list ) and until the new XValue is still greater than or equal to the lower X value of the current point of the outer contour. This second condition ensures that the inner loop only considers lines that overlap the endpoint line of the outer contour.
Now itโs become clearer to me that I could probably get around this fLink-bLink inconvenience by treating the endPoints List as an array:
- Fill the list with endpoints
- Sort list by XValue
- Outer Loop through the list in descending order using the index iterator (
i ), instead of the foreach enumerator - (A) The inner loop through the list, using the second iterator (
j ), starting with i and ending when it passed below pt.Lo
What I think would be a lot easier. I can publish a simplified version if I want.