What is a data structure for quickly finding nonempty intersections of a list of sets?

Question

I have a set of N items, each of which is a set of integers; let's assume it is ordered and call it I[1..N]. Given a candidate set, I need to find the subset of I whose members have non-empty intersections with candidate.

So, for example, if:

I = [{1,2}, {2,3}, {4,5}]

I want to define valid_items(items, candidate) such that:

valid_items(I, {1}) == {1}
valid_items(I, {2}) == {1, 2}
valid_items(I, {3,4}) == {2, 3}
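
For reference, the specification above can be written as a brute-force check. This is only a sketch of the intended behavior, not the optimized version asked about; the name valid_items_naive is made up for illustration, and items are numbered from 1 as in the examples.

```python
def valid_items_naive(items, candidate):
    """Return the 1-based indices of the sets in `items` that intersect `candidate`."""
    # `s & candidate` is the set intersection; an empty set is falsy.
    return {i for i, s in enumerate(items, start=1) if s & candidate}


I = [{1, 2}, {2, 3}, {4, 5}]
print(valid_items_naive(I, {3, 4}))  # {2, 3}
```

Each query scans every item, so the cost grows with N regardless of how small the candidate is.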

I am trying to optimize for one fixed I and varying candidate sets. I currently do this by caching items_containing[n] = {the sets which contain n}. In the above example, this would be:

items_containing = [{}, {1}, {1,2}, {2}, {3}, {3}]

That is, 0 is contained in no items, 1 is contained in item 1, 2 is contained in items 1 and 2, 3 is contained in item 2, and 4 and 5 are contained in item 3.

That way, I can define valid_items(I, candidate) = union(items_containing[n] for n in candidate).
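
A minimal sketch of this caching approach, assuming a dict keyed by element rather than the dense list shown above (that choice, and the helper name build_items_containing, are assumptions made for illustration):

```python
from collections import defaultdict


def build_items_containing(items):
    """Map each integer n to the set of 1-based item indices whose set contains n."""
    index = defaultdict(set)
    for i, s in enumerate(items, start=1):
        for n in s:
            index[n].add(i)
    return index


def valid_items(index, candidate):
    """Union the cached index sets for every element of the candidate."""
    result = set()
    for n in candidate:
        # .get avoids creating entries for elements that appear in no item
        result |= index.get(n, set())
    return result


I = [{1, 2}, {2, 3}, {4, 5}]
items_containing = build_items_containing(I)
print(valid_items(items_containing, {3, 4}))  # {2, 3}
```

The index is built once per I; each query then costs time proportional to the total size of the cached sets it touches.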

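Is there a more efficient data structure (of reasonable size) for caching the result of this union? Space on the order of 2^N is not acceptable, but N or N*log(N) would be.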

optimization set data-structures set-theory mathematical-optimization

Andrey Fedorov 06 . '10 22:36

, , V n, V (i) 0, V . , , , , , .

Tom Smith 06 . '10 23:07

Lie Ryan · Accepted Answer · 2010-04-06T23:11:03+0000

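I think your current solution is optimal in Big-O terms, though micro-optimizations could improve its actual performance, such as using bitwise operations when merging the chosen sets from items_containing.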

That is, you store items_containing as bitmasks, like this:

items_containing = [0b000, 0b001, 0b011, 0b010, 0b100, 0b100]

Then valid_items can merge them with a bit-wise OR:

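int valid_items(List<Set<Integer>> I, Set<Integer> candidate) {
    // I is unused here: items_containing (an int[] of bitmasks) was precomputed from it.
    // If you need more than 32 items, use int[] for valid
    // and int[][] for items_containing.
    int valid = 0;
    for (int item : candidate) {
        // bit-wise OR merges in the items that contain this element
        valid |= items_containing[item];
    }
    return valid;
}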
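
The same idea can be sketched in Python, where integers have arbitrary precision and so the 32-item limit does not apply. The helper names build_masks, valid_items_mask and mask_to_indices are hypothetical, and bit i-1 standing for item i is an assumed convention.

```python
def build_masks(items):
    """Map each integer n to a bitmask whose bit i-1 is set when item i contains n."""
    masks = {}
    for i, s in enumerate(items):  # i is 0-based here, so bit i stands for item i+1
        for n in s:
            masks[n] = masks.get(n, 0) | (1 << i)
    return masks


def valid_items_mask(masks, candidate):
    """OR together the masks of every element of the candidate."""
    valid = 0
    for n in candidate:
        valid |= masks.get(n, 0)
    return valid


def mask_to_indices(mask):
    """Decode a bitmask back into a set of 1-based item indices."""
    return {i + 1 for i in range(mask.bit_length()) if (mask >> i) & 1}


I = [{1, 2}, {2, 3}, {4, 5}]
masks = build_masks(I)
print(mask_to_indices(valid_items_mask(masks, {3, 4})))  # {2, 3}
```

Each OR still touches a mask of N bits, so this trims the constant factor rather than the asymptotic cost.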

These are not really Big-O changes, though.
