Effectively iterate over all MATCHING keys in hashmap?

I have a HashMap with millions of entries.

It is necessary to get all the records whose keys correspond to a specific set of criteria (in this case, each key is an object with two integer properties, I need to get all the keys where each of these integers falls into the specified range).

What is the fastest and most efficient way to iterate over all such keys?

UPDATE: In this particular case, although I did not specify it in front, the first integer in the key has a natural priority over the second integer.

+4
source share
7 answers

Here is the solution using TreeMap :

 public static void main(String[] args) { Comparator<Foo> fooComparator = new Comparator<Foo>() { @Override public int compare(Foo o1, Foo o2) { return o1.compareTo(o2); } }; TreeMap<Foo, String> map = new TreeMap<Foo, String>(fooComparator); map.put(new Foo(1, 4), ""); map.put(new Foo(1, 3), ""); map.put(new Foo(2, 4), ""); map.put(new Foo(3, 4), ""); map.put(new Foo(8, 10), ""); map.put(new Foo(8, 17), ""); map.put(new Foo(10, 10), ""); int a = 2; int b = 5; for (Foo f : getKeysInRange(map, a, b)) { System.out.println(f); } } public static List<Foo> getKeysInRange(TreeMap<Foo, String> map, int low, int high) { Foo key1 = new Foo(low, low); Foo key2 = new Foo(high, high); Foo fromKey = map.ceilingKey(key1); Foo toKey = map.floorKey(key2); if (fromKey != null && toKey != null && fromKey.compareTo(toKey) < 0) return new ArrayList<Foo>(map.subMap(fromKey, true, toKey, true).keySet()); return new ArrayList<Foo>(); } public static class Foo implements Comparable<Foo> { private int i; private int j; private Foo(int i, int j) { super(); this.i = i; this.j = j; } public int min() { if (i < j) return i; else return j; } public int max() { if (i > j) return i; else return j; } @Override public String toString() { return "I=" + i + "J=" + j; } @Override public int compareTo(Foo o) { if (this.min() > o.min()) { return 1; } else if (this.min() < o.min()) return -1; else { if (this.max() > o.max()) return 1; else if (this.max() < o.max()) return -1; else return 0; } } } 
+3
source

A HashMap is not an effective data structure for finding keys that lie in a certain range. Generally, the only keys that you can find efficiently on the hash map are keys with the same hash as yours (i.e. Equal keys).

To find keys within a certain range, you better use a SortedMap , such as TreeMap, which can then be viewed using the SortedMap.subMap (low, high) method.

As for finding a key based on two keys, this is even more complicated. It is probably best to iterate over a subset of the range of the first integer, and then check each of them if the second integer falls into the specified range. This at least restricts scanning to keys that have one of the integers within the range. Try sorting a map based on an integer that has a more natural distribution of values ​​over the possible ranges that you might have to look for.

+7
source

You cannot do this without repeating the entire set of keys.

You can use TreeMap with a sorting criterion that will sort by any combination of two integer properties, if you are sure that you will not have other records that have the same value for these integer properties, and then you can find the first match directly, and then just go from there to the first mismatch. But you are unlikely to be able to achieve these conditions.

Since collections have fairly low overheads (everything is stored by reference), I would consider creating two sorted collections, possibly TreeSets, one of which was sorted by the first property and the other sorted by the second, and then select all the values ​​that match criteria from both collections and combine them together.

+1
source

The solution provided by bruno conde is a good start. However, the way I read the original question is that the key object contains two integers and that the question is about the fastest way to get all key / value pairs that correspond to one range for the first integer and correspond to the second range for the second integer . Bruno's solution assumes the keys are in a natural order, where the first integer always takes precedence over the second integer. It also assumes only one range.

For this more general case, I would: insert the key / values ​​into a TreeMap using a comparator that supports an integer value1 insert the same key / values ​​into a second TreeMap using a comparator that supports an integer2

You can then use subMap () on each TreeMap, using a range to get an ordered view of the underlying TreeMap. You can then create a new TreeSet result based on the intersection (keepAll ()) keySet () of these subframes.

+1
source

There probably won't be a faster solution than something like:

 for (final KeyObj key : map.keySet()) { // do work } 
0
source

If the TreeSet will not work for any reason, the standard way to iterate is with a set of records.

 for (Map.Entry<MyKeyType, MyValueType> entry : myMap.entrySet()) { MyKeyType key = entry.getKey(); if (isValid(key)) { // do whatever validList.add(entry.getValue()); } } 

Thus, you do not need to make an additional call to myMap.get(key) for valid keys.

0
source

You might want to consider some kind of SQL code, perhaps in memory, such as Derby or H2 . Much depends on how important it is and how important it is to be quick. Then you can do it in SQL and let the engine do all the optimization work.

0
source

Source: https://habr.com/ru/post/906114/


All Articles