Effectively detect overlapping intervals from the interval list

Question

Effectively detect overlapping intervals from the interval list

This is due to the search for overlapping intervals. I know how to do this, given the list of intervals (interval trees). I have a list of interval lists. For instance,

[2,6], [7,11] [1,3], [5,10], [11,13] [2,5], [6,8]

The result for this should be

[2,3], [7,8]

What I need to do is find a list of intervals that are common in all lists.

I see this problem similar to merging lists n . The problem is that I cannot apply pair merging of lists. Using this method may result in loss of overlap intervals. Therefore, I need to combine all the lists together, considering all of them at a time (instead of pairwise).

I can use interval trees. Insert the first intervals from each list into the interval tree and search for overlap. Removing the weakest interval from the tree and inserting the next interval from one of the lists. I have not yet fully understood how I can use this method, but it seems that it will become too expensive.

Is there an efficient algorithm for finding overlapping intervals from a list of interval lists.?

Additional Information: The intervals in the list are sorted. They do not overlap and do not form a sequence.

+4

list algorithm data-structures intervals interval-tree

akaHuman Aug 22 '13 at 6:49

source share

3 answers

My understanding of what you want to do is apply the intersection operation on the list of intervals. And you can do this in pairs, since the intersection is associative.

What I would do is something like

 Let S be the set of sets, R = s1, s1 in S for each set s2 in S / {s1} for each element e1 in R for each element e2 in s2 st e1.sup < e2.inf e1 <- intersection (e1, e2)

And the intersection operation between two intervals

  intersection (e1,e2): return new Interval(max(e1.inf, e2.inf), min (e1.sup, e2.sup));

+2

hpid91 Aug 22 '13 at 7:56

source share

You said that each individual list of intervals is sorted and not overlapping. Thus,

 Keep track of where you are in each list, starting at the beginning of each. While none of the lists has run out: If the current intervals (one from each list) all overlap: Output the intersection of the current intervals Find which of the current intervals has the earliest end point Advance one position within that list.

If the list of K intervals and N intervals in general, this should take O (NK) time if it is implemented in the simplest way, but you should be able to reduce this to O (N log K) by tracking information about the current intervals in the heap or in which is another priority queue.

+1

Chris okasaki Aug 22 '13 at 15:31

source share

btilly · Accepted Answer · 2013-08-22T07:38:46+0000

Create a single, sorted array of transitions. Each transition has a position and a cumulative number based on how many intervals you join or leave. When you go through the list, keep track of how many intervals you are on. When you are in as many intervals as a series, this is when you are in the total interval.

For your example, the transitions will be as follows:

 [2, 1], [6, -1], [7, 1], [11, -1], [1, 1], [3, -1], [5, 1], [10, -1], [11, 1], [13, -1] [2, 1], [5, -1], [6, 1], [8, -1]

which, after sorting by position and merge, collapses to:

 [1, 1], [2, 2], [3, -1], [5, 0], [6, 0], [7, 1], [8, -1], [10, -1], [11, 0], [13, -1]

which gives you transitions for doing the totals:

 [1, 1], [2, 3], [3, 2], [7, 3], [8, 2], [10, 2], [13, 1]

And then we can read the intervals where we are at level 3, starting at 2 and going to 3 , and the other starting at 7 and going to 8 . What answer.

The idea of creating one long list and sorting is, admittedly, extra work. Instead, you can create these lists and combine them on the fly. Savings are a factor in the log of the number of episodes, not the log of the number of events.

Effectively detect overlapping intervals from the interval list

More articles: