Efficient algorithm for removing any map that is contained in another map from a set of maps

I have a set of unique maps (Java HashMaps at present) and I want to remove from it any maps that are completely contained in some other map in the set (i.e. remove m from s if m.entrySet() is a subset of n.entrySet() for some other n in s).

I have an O(n^2) algorithm, but it is too slow. Is there a more efficient way to do this?

Edit: the set of possible keys is small, if that helps.

Here is an inefficient reference implementation:

public <K, V> void removeSubmaps(Set<Map<K, V>> s) {
    Set<Map<K, V>> toRemove = new HashSet<>();
    for (Map<K, V> a : s) {
        for (Map<K, V> b : s) {
            // skip a == b: every map trivially contains itself
            if (a != b && a.entrySet().containsAll(b.entrySet()))
                toRemove.add(b);
        }
    }
    s.removeAll(toRemove);
}
+3
5 answers

Here is the approach I ended up with. Mark Ransom's answer pointed in this direction.

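In prose: index the maps by key/value pair, so that each key/value pair is associated with the list of maps that contain it. Then, for each map: find the smallest list associated with any of its key/value pairs. Every map in that list is a potential "supermap"; no map outside it can be a "supermap", since it would not contain that key/value pair. Search that list for a supermap. Finally, remove every map for which a supermap was found.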

// requires: import java.util.*;
private <K, V> void removeSubmaps(Set<Map<K, V>> maps) {
    // index the maps by key/value: key -> value -> indices of the maps containing that entry
    List<Map<K, V>> mapList = new ArrayList<>(maps);
    Map<K, Map<V, List<Integer>>> values = new HashMap<>();
    for (int i = 0; i < mapList.size(); i++) {
        for (Map.Entry<K, V> entry : mapList.get(i).entrySet())
            values.computeIfAbsent(entry.getKey(), k -> new HashMap<>())
                  .computeIfAbsent(entry.getValue(), v -> new ArrayList<>())
                  .add(i);
    }

    // find submaps
    Set<Map<K, V>> toRemove = new HashSet<>();
    for (Map<K, V> submap : mapList) {
        // find the smallest set of maps with a matching key/value
        List<Integer> smallestList = null;
        for (Map.Entry<K, V> entry : submap.entrySet()) {
            List<Integer> list = values.get(entry.getKey()).get(entry.getValue());
            if (smallestList == null || list.size() < smallestList.size())
                smallestList = list;
        }

        // an empty map has no entries to look up; it is a submap of any other map
        if (smallestList == null) {
            if (mapList.size() > 1)
                toRemove.add(submap);
            continue;
        }

        // compare with each of the maps in that set; one supermap is enough
        for (int i : smallestList) {
            if (isSubmap(submap, mapList.get(i))) {
                toRemove.add(submap);
                break;
            }
        }
    }

    maps.removeAll(toRemove);
}

private <K,V> boolean isSubmap(Map<K, V> submap, Map<K,V> map){
    if (submap.size() >= map.size())
        return false;
    for (Map.Entry<K,V> entry : submap.entrySet()) {
        V other = map.get(entry.getKey());
        if (other == null)
            return false;
        if (!other.equals(entry.getValue()))
            return false;
    }
    return true;
}
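
For trying the indexing approach out end to end, a self-contained variant may help. The following is only a sketch: the class name `SubmapDemo`, the `map(...)` helper and the sample data are illustrative assumptions, and it uses only `java.util`.

```java
import java.util.*;

// Hypothetical demo class; the name and sample data are illustrative only.
class SubmapDemo {

    // Removes every map whose entries are all contained in some other, larger map.
    static <K, V> void removeSubmaps(Set<Map<K, V>> maps) {
        List<Map<K, V>> mapList = new ArrayList<>(maps);

        // Index: key -> value -> indices of the maps containing that key/value pair.
        Map<K, Map<V, List<Integer>>> index = new HashMap<>();
        for (int i = 0; i < mapList.size(); i++) {
            for (Map.Entry<K, V> e : mapList.get(i).entrySet()) {
                index.computeIfAbsent(e.getKey(), k -> new HashMap<>())
                     .computeIfAbsent(e.getValue(), v -> new ArrayList<>())
                     .add(i);
            }
        }

        Set<Map<K, V>> toRemove = new HashSet<>();
        for (Map<K, V> candidate : mapList) {
            // Pick the entry of this map that is shared by the fewest maps.
            List<Integer> smallest = null;
            for (Map.Entry<K, V> e : candidate.entrySet()) {
                List<Integer> list = index.get(e.getKey()).get(e.getValue());
                if (smallest == null || list.size() < smallest.size()) smallest = list;
            }
            if (smallest == null) {
                // Empty map: contained in any other map of the set.
                if (mapList.size() > 1) toRemove.add(candidate);
                continue;
            }
            // Only maps sharing that entry can be supermaps.
            for (int i : smallest) {
                Map<K, V> other = mapList.get(i);
                if (other.size() > candidate.size()
                        && other.entrySet().containsAll(candidate.entrySet())) {
                    toRemove.add(candidate);
                    break;
                }
            }
        }
        maps.removeAll(toRemove);
    }

    // Small helper to build a map from alternating key/value arguments.
    static Map<String, Integer> map(Object... kv) {
        Map<String, Integer> m = new HashMap<>();
        for (int i = 0; i < kv.length; i += 2) m.put((String) kv[i], (Integer) kv[i + 1]);
        return m;
    }

    public static void main(String[] args) {
        Set<Map<String, Integer>> maps = new HashSet<>();
        maps.add(map("a", 1));          // contained in the next two maps
        maps.add(map("a", 1, "b", 2));
        maps.add(map("a", 1, "b", 3));
        maps.add(map("c", 4));
        removeSubmaps(maps);
        System.out.println(maps.size()); // 3: only {a=1} is removed
    }
}
```

The gain depends on the data: if some key/value pair in each map is shared by only a few maps, each candidate is checked against a short list; if every pair is shared by nearly all maps, the cost falls back to roughly quadratic.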
0

, - , n ^ 2, , . . - .

+2

.

, , . . , / - . , . - , .

+1

: , , .

- HashMap, - . - -, -, set.containsAll() , .

. divisor.

- O (n ^ 2), , , O (n) ( ), , set.containsAll(), O (n ^ 2), n - -.

- , , /.

0

(O (N ** 2/log N)) : .

But if you know your data distribution, you can do much better in the average case.
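
One possible illustration of exploiting the data, assuming map sizes vary a lot (this is an assumption, not something stated in the answer): a map can only be contained in a strictly larger map, so processing maps from largest to smallest lets each one be compared only against larger maps that were kept. The class and method names below are hypothetical, and the worst case is still quadratic.

```java
import java.util.*;

// Hypothetical sketch: size-ordered pruning, not the answerer's code.
class SizeOrderedFilter {

    // Returns a new set without the maps contained in some larger map.
    static <K, V> Set<Map<K, V>> withoutSubmaps(Set<Map<K, V>> maps) {
        List<Map<K, V>> bySize = new ArrayList<>(maps);
        // Largest maps first, so potential supermaps are seen before their submaps.
        bySize.sort((x, y) -> Integer.compare(y.size(), x.size()));

        List<Map<K, V>> kept = new ArrayList<>();
        for (Map<K, V> candidate : bySize) {
            boolean contained = false;
            // Checking kept maps is enough: containment is transitive, so a map
            // contained in a removed map is also contained in a kept one.
            for (Map<K, V> bigger : kept) {
                if (bigger.size() > candidate.size()
                        && bigger.entrySet().containsAll(candidate.entrySet())) {
                    contained = true;
                    break;
                }
            }
            if (!contained) kept.add(candidate);
        }
        return new HashSet<>(kept);
    }
}
```

This helps most when many small maps are absorbed by a few large ones, because the list of kept maps stays short.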

0

Source: https://habr.com/ru/post/1723692/

