Python: optimizing pairwise overlap between intervals

I have a lot of intervals (from 5 to 10 thousand). Each interval has a start and an end position, e.g. (203, 405). The intervals are stored in a list.

I want to determine the coordinates and lengths of overlapping parts between each pair of intervals. This can be done as follows:

# a small list for clarity; the real list normally has around 5000 entries
cList = ((20, 54), (25, 48), (67, 133), (90, 152), (140, 211), (190, 230))
for i, c1 in enumerate(cList[:-1]):  # visit every pair exactly once
    for c2 in cList[i + 1:]:
        left = max(c1[0], c2[0])
        right = min(c1[1], c2[1])
        overlap = right - left
        if overlap > 0:
            print("left: %s, right: %s, length: %s" % (left, right, overlap))

Result:

left: 25, right: 48, length: 23
left: 90, right: 133, length: 43
left: 140, right: 152, length: 12
left: 190, right: 211, length: 21

This works, but it can take quite a while (about 20 seconds), so my question is: how can I optimize this? I tried breaking out of the inner loop as soon as the second interval starts after the first one ends:

if c1[1] < c2[0]:
    break



The overlap computation can be moved into a function:

def overlap(r1, r2):
    # Return (left, right, length) of the shared part, or None if there is none.
    left = max(r1[0], r2[0])
    right = min(r1[1], r2[1])
    over = right - left
    return (left, right, over) if over > 0 else None

The loop then becomes:

for i, c1 in enumerate(cList[:-1]):
    for c2 in cList[i + 1:]:
        o = overlap(c1, c2)
        if o is not None:
            print("left: %s, right: %s, length: %s" % o)

If the list is sorted, the inner loop can stop at the first pair that does not overlap:

l = sorted(cList)  # sort by start position
for i, c1 in enumerate(l[:-1]):
    for c2 in l[i + 1:]:
        o = overlap(c1, c2)
        if o is None:
            break  # later intervals start even further to the right
        print("left: %s, right: %s, length: %s" % o)

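The sorted early-exit idea can also be packaged as a function that collects the overlaps instead of printing them. This is only a minimal sketch: the name `find_overlaps` is made up for illustration, and it assumes every interval is a `(start, end)` tuple with `start <= end`. It indexes into the sorted list rather than slicing it, so no copy is made per outer iteration, and it tests the break condition directly on the start position so a zero-length interval does not end the scan too early.

```python
def find_overlaps(intervals):
    """Return (left, right, length) for each overlapping pair of intervals."""
    ordered = sorted(intervals)  # sort by start, then by end
    found = []
    for i, (start1, end1) in enumerate(ordered):
        for j in range(i + 1, len(ordered)):
            start2, end2 = ordered[j]
            if start2 >= end1:
                # Every later interval starts even further right: stop early.
                break
            # Because of the sort, start2 >= start1, so the overlap starts at start2.
            right = min(end1, end2)
            if right > start2:  # skips zero-length intervals
                found.append((start2, right, right - start2))
    return found


cList = ((20, 54), (25, 48), (67, 133), (90, 152), (140, 211), (190, 230))
for left, right, length in find_overlaps(cList):
    print("left: %s, right: %s, length: %s" % (left, right, length))
```

For the sample data this yields the same four overlaps as listed under Result in the question.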

The double for loop can also be replaced with itertools.combinations:

from itertools import combinations
for c1, c2 in combinations(cList, 2):
    o = overlap(c1, c2)
    if o is not None:
        print("left: %s, right: %s, length: %s" % o)

Also consider an Interval Tree. There are Python implementations on PyPI.
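An interval tree is mainly useful when the same collection is queried many times for "which stored intervals overlap this one?". The sketch below is not an interval tree and does not use any PyPI package; it is a standard-library stand-in built on `bisect` that shows the kind of query involved. The names `build_index` and `query` are hypothetical, and intervals are assumed to be `(start, end)` tuples with `start <= end`.

```python
from bisect import bisect_left


def build_index(intervals):
    """Sort the intervals once and keep their start positions for bisecting."""
    ordered = sorted(intervals)
    starts = [start for start, _ in ordered]
    return ordered, starts


def query(index, q_start, q_end):
    """Return the stored intervals that share more than a point with (q_start, q_end)."""
    ordered, starts = index
    # Intervals from position `hi` onward start at or after q_end: no overlap possible.
    hi = bisect_left(starts, q_end)
    # Of the remaining candidates, keep those that end after the query starts.
    return [(start, end) for start, end in ordered[:hi] if end > q_start]


cList = ((20, 54), (25, 48), (67, 133), (90, 152), (140, 211), (190, 230))
index = build_index(cList)
print(query(index, 100, 145))
```

The bisect step only bounds the candidates on the right, so a query can still scan many intervals that end too early; a real interval tree avoids that extra scan, which is where it pays off for large, frequently queried collections.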


Source: https://habr.com/ru/post/1537495/

