Map Search Performance

I want to do something with the map value for a given key, but only if the map contains that key. Naively, I would write:

    Map<String, String> myMap = ...;
    if (myMap.containsKey(key)) {
        String value = myMap.get(key);
        // Do things with value
    }

The above code looks easy to understand, but in terms of performance, is the following code better?

    Map<String, String> myMap = ...;
    String value = myMap.get(key);
    if (value != null) {
        // Do things with value
    }

In the second snippet, I don't like the fact that value is declared with a wider scope.

How does the performance of these two cases vary with the Map implementation?

Note: assume null values are not allowed in the map.

+6
3 answers

Map is an interface, so implementing classes have considerable freedom in how they implement each operation. It is entirely possible to write a class that caches the last entry accessed, which would give constant-time get when the key matches the last one looked up, making the two versions practically equivalent apart from the extra comparison that would presumably be required.

For TreeMap and HashMap, for example, containsKey is essentially just a get operation (more specifically getEntry) with a null check.

Thus, for these two containers, the first version should take roughly twice as long as the second (assuming you use the same type of Map in both cases).

Note that HashMap.get is O(1) (with a hash function well suited to the data), and TreeMap.get is O(log n). So if you do any significant amount of work in the loop, and the Map does not contain on the order of millions of elements, the performance difference is likely to be negligible.
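To make the O(log n) figure for TreeMap concrete, here is a minimal sketch that counts key comparisons during a single get. The class name, method name, and counting comparator are made up for this illustration; exact counts depend on the tree's shape, so none are quoted here.

```java
import java.util.Comparator;
import java.util.TreeMap;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical demo class, not part of the original discussion.
class TreeMapComparisonCount {

    // Builds a TreeMap with `size` Integer keys and returns how many key
    // comparisons a single get() for an existing key performs.
    static long comparisonsForOneGet(int size) {
        AtomicLong comparisons = new AtomicLong();
        Comparator<Integer> counting = (a, b) -> {
            comparisons.incrementAndGet();
            return Integer.compare(a, b);
        };
        TreeMap<Integer, String> map = new TreeMap<>(counting);
        for (int i = 0; i < size; i++) {
            map.put(i, "v");
        }
        comparisons.set(0); // ignore the comparisons made while filling the map
        map.get(size / 2);
        return comparisons.get();
    }

    public static void main(String[] args) {
        for (int size : new int[] {1_000, 1_000_000}) {
            System.out.println(size + " entries: "
                    + comparisonsForOneGet(size) + " comparisons for one get()");
        }
    }
}
```

Growing the map a thousandfold should add only a handful of comparisons per lookup, which is why doubling the number of lookups rarely matters next to real work in the loop.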

However, note the disclaimer for Map.get:

If this map permits null values, then a return value of null does not necessarily indicate that the map contains no mapping for the key; it's also possible that the map explicitly maps the key to null. The containsKey operation may be used to distinguish these two cases.

+6

Obviously, the second version is more efficient: you look up the key in the map only once, whereas in the first version you look it up twice, and therefore compute the key's hash twice and search the hash buckets twice, assuming you use a hash map, of course.
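The doubled hash computation can be observed directly. The sketch below uses a hypothetical key type, CountingKey, that records each call to hashCode(); the class and method names are assumptions made for illustration only.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical demo class: counts hashCode() calls made by each lookup pattern.
class HashCallCount {

    // A key type that records how often its hash code is requested.
    static final class CountingKey {
        static int hashCalls = 0;
        private final String name;

        CountingKey(String name) { this.name = name; }

        @Override public int hashCode() { hashCalls++; return name.hashCode(); }

        @Override public boolean equals(Object o) {
            return o instanceof CountingKey && ((CountingKey) o).name.equals(name);
        }
    }

    // First pattern: containsKey followed by get.
    static int hashCallsForContainsThenGet(Map<CountingKey, String> map, CountingKey key) {
        CountingKey.hashCalls = 0;
        if (map.containsKey(key)) {
            String value = map.get(key);
            value.length(); // stand-in for "do things with value"
        }
        return CountingKey.hashCalls;
    }

    // Second pattern: a single get followed by a null check.
    static int hashCallsForGetThenNullCheck(Map<CountingKey, String> map, CountingKey key) {
        CountingKey.hashCalls = 0;
        String value = map.get(key);
        if (value != null) {
            value.length(); // stand-in for "do things with value"
        }
        return CountingKey.hashCalls;
    }

    public static void main(String[] args) {
        Map<CountingKey, String> map = new HashMap<>();
        CountingKey key = new CountingKey("k");
        map.put(key, "v");
        System.out.println("containsKey + get: " + hashCallsForContainsThenGet(map, key));
        System.out.println("get + null check:  " + hashCallsForGetThenNullCheck(map, key));
    }
}
```

With java.util.HashMap this should report two hashCode() calls for the first pattern and one for the second. For String keys the cost is smaller still, because String caches its hash code after the first computation.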

You could have a completely different implementation of the Map interface that handles this kind of code better by remembering the map entry associated with the key in the last containsKey call. If the subsequent get uses the same key (compared with the == operator), it can immediately return the associated value from the remembered entry.

However, the second approach carries a danger: what if I put this into the map:
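A rough sketch of that idea follows. LastLookupCache is a hypothetical read-only wrapper, not a real library class and not a full Map implementation; it assumes the wrapped map holds no null values and is not modified after wrapping, and it is not thread-safe.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical wrapper written only to illustrate the caching idea.
class LastLookupCache<K, V> {
    private final Map<K, V> delegate;
    private K lastKey;        // key of the most recent successful containsKey call
    private V lastValue;      // value found for lastKey (null means nothing cached)
    int delegateLookups = 0;  // number of lookups forwarded to the wrapped map

    LastLookupCache(Map<K, V> delegate) { this.delegate = delegate; }

    // Performs the real lookup once and remembers the result on a hit.
    boolean containsKey(K key) {
        delegateLookups++;
        V value = delegate.get(key);
        if (value != null) {
            lastKey = key;
            lastValue = value;
        }
        return value != null;
    }

    // Returns the remembered value when called with the very same key object.
    V get(K key) {
        if (lastValue != null && key == lastKey) {
            return lastValue;
        }
        delegateLookups++;
        return delegate.get(key);
    }

    public static void main(String[] args) {
        Map<String, String> data = new HashMap<>();
        data.put("a", "1");
        LastLookupCache<String, String> cache = new LastLookupCache<>(data);
        String key = "a";
        if (cache.containsKey(key)) {
            System.out.println(cache.get(key)); // served from the remembered entry
        }
        System.out.println("lookups forwarded: " + cache.delegateLookups);
    }
}
```

In this sketch the containsKey-then-get pattern costs one real lookup plus a reference comparison, at the price of extra state that a general-purpose map would have to invalidate on every modification.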

 map.put("null", null); 

Then map.get("null") will return null, and you will treat "null" as if it were not mapped, even though map.containsKey("null") returns true!
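The discrepancy is easy to reproduce with a HashMap, which permits null values. The class and helper method names below are made up for this sketch.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical demo class showing how the two checks disagree for a null value.
class NullValueDemo {

    // Presence test used by the second snippet: a null result counts as "absent".
    static boolean presentAccordingToGet(Map<String, String> map, String key) {
        return map.get(key) != null;
    }

    // Presence test used by the first snippet.
    static boolean presentAccordingToContainsKey(Map<String, String> map, String key) {
        return map.containsKey(key);
    }

    public static void main(String[] args) {
        Map<String, String> map = new HashMap<>();
        map.put("null", null); // the key exists, but its value is null

        System.out.println(presentAccordingToGet(map, "null"));         // prints false
        System.out.println(presentAccordingToContainsKey(map, "null")); // prints true
    }
}
```

If null values really are ruled out, as the question assumes, the two checks agree and the single get is safe.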

+1

To answer your question,
"How does the effectiveness of these cases change regarding the implementation of the map?"
The difference in performance is negligible.

Regarding your remark,
"In the second snippet, I don't like the fact that the value is declared with a wider scope."
Well, you shouldn't. You see, there are two ways to get null returned from a Map:

  • The key does not exist OR
  • The key exists, but its value is null (if the Map implementation allows null values, as HashMap does).

Thus, the two scenarios can produce different results if the key exists with a null value!

EDIT

I wrote the following code to test the performance of two scenarios:

    import java.util.Date;
    import java.util.HashMap;
    import java.util.Map;

    public class TestMapPerformance {

        static Map<String, String> myMap = new HashMap<String, String>();
        static int iterations = 7000000;

        // populate a map with seven million strings for keys
        static {
            for (int i = 0; i <= iterations; i++) {
                String tryIt = Integer.toString(i);
                myMap.put(tryIt, "hi");
            }
        }

        // run each scenario twice and print out the results (elapsed milliseconds).
        public static void main(String[] args) {
            System.out.println("Key Exists: " + testMap_CheckIfKeyExists(iterations));
            System.out.println("Value Null: " + testMap_CheckIfValueIsNull(iterations));
            System.out.println("Key Exists: " + testMap_CheckIfKeyExists(iterations));
            System.out.println("Value Null: " + testMap_CheckIfValueIsNull(iterations));
        }

        // Check if the key exists, then get its value
        public static long testMap_CheckIfKeyExists(int iterations) {
            Date date = new Date();
            for (int i = 0; i <= iterations; i++) {
                String key = Integer.toString(i);
                if (myMap.containsKey(key)) {
                    String value = myMap.get(key);
                    String newString = new String(value);
                }
            }
            return new Date().getTime() - date.getTime();
        }

        // Get the key value, then check if that value is null
        public static long testMap_CheckIfValueIsNull(int iterations) {
            Date date = new Date();
            for (int i = 0; i <= iterations; i++) {
                String key = Integer.toString(i);
                String value = myMap.get(key);
                if (value != null) {
                    String newString = new String(value);
                }
            }
            return new Date().getTime() - date.getTime();
        }
    }

I ran it and this was the result:

    Key Exists: 9901
    Value Null: 11472
    Key Exists: 11578
    Value Null: 9387

So, in conclusion, the performance difference is negligible.

+1

Source: https://habr.com/ru/post/952840/

