Isn't this repeated through entrySet (), creating too many Map.Entry instances?

Question

Isn't this repeated through entrySet (), creating too many Map.Entry instances?

I'm not sure if HashMap or TreeMap stores Map.Entry on its own. That is, most likely, it will return an instance of Map.Entry created on the fly when entrySet().iterator().next() called.

Personally, I think that in this form it might be better:

 class Entry { Object key; Object value; } interface InplaceIterator { boolean next(); } Entry entryBuf = new Entry(); InplaceIterator it = map.entrySet().inplaceIterator(entryBuf); while (it.next()) { // do with entryBuf... }

Thus, the creation of a record is prevented.

I don’t know how the Java Compiler works, can the Java Compiler optimize the creation of Map.Entry by analyzing the data flow and finding out that Map.Entry can be safely reused?

Or, did someone there already write another collection structure to enable inplace iteration?

+2

java performance

Xiè Jìléi Mar 28 '11 at 7:25

source share

3 answers

Usually small, short-lived objects are almost free. Consider f1 and f2

 static Entry f1(int i){ return new Entry(i); } static Entry entry = new Entry(0); static Entry f2(int i){ entry.i=i; return entry; } static class Entry { Entry(int i){ this.i=i; } int i; int get(){ return i; } }

This is a realistic test example of the problem you described - reusing the same object in an iteration, as well as creating a new object in an iteration. In both cases, some data is stored in the object and transferred to the call site to be read.

Allows you to profile it, extract a billion records and read the data stored in each, in three different ways.

  int r = 0; for(int i=0; i<1000000000; i++) { test0: r += i; test1: r += f1(i).get(); test2: r += f2(i).get(); } print(r);

The number I got, test2 is as fast as test0 ; test1 slower than test2 , only one processor cycle per iteration . (I think the difference is in several machine instructions, and the CPU pipelines them in one cycle)

If you still do not believe it, fully implement the proposed "effective" solution, compare it with the supposedly "wasteful" implementation and look at the difference for yourself. You will be amazed.

+1

irreputable Mar 28 '11 at 11:21

source share

The Google Collection ArrayListMultimap is quite efficient and not resource intensive, http://google-collections.googlecode.com/svn/trunk/javadoc/com/google/common/collect/ArrayListMultimap.html

Creating a multimap

 private Multimap<Integer, String> store = ArrayListMultimap.create();

Iteration multimap

 for (Map.Entry<Integer, String> entry: store.entries()) {}

And if you prefer to avoid Map.Entry, then extract the key set and from there:

 List<Integer> keys = new ArrayList<Integer>(store.keySet()); for(Long key : keys){ ArrayList<String> stored_strings = new ArrayList<String>(store.get(key)); }

0

Black box operations Mar 28 '11 at 7:36

source share

Paŭlo Ebermann · Accepted Answer · 2011-03-28T07:33:50+0000

What you described (having a local MaperatorEterry iterator and reusing it for all next() return values) is one possible implementation of a map, and I think some special maps use this.

For example, the implementation of EnumMap.entrySet().iterator() (here the version from OpenJDK, 1.6.0_20) simply uses the iterator object as the Entry object returned by the next() method:

 /** * Since we don't use Entry objects, we use the Iterator itself as entry. */ private class EntryIterator extends EnumMapIterator<Map.Entry<K,V>> implements Map.Entry<K,V> { public Map.Entry<K,V> next() { if (!hasNext()) throw new NoSuchElementException(); lastReturnedIndex = index++; return this; } public K getKey() { checkLastReturnedIndexForEntryUse(); return keyUniverse[lastReturnedIndex]; } public V getValue() { checkLastReturnedIndexForEntryUse(); return unmaskNull(vals[lastReturnedIndex]); } public V setValue(V value) { checkLastReturnedIndexForEntryUse(); V oldValue = unmaskNull(vals[lastReturnedIndex]); vals[lastReturnedIndex] = maskNull(value); return oldValue; } // equals, hashCode, toString private void checkLastReturnedIndexForEntryUse() { if (lastReturnedIndex < 0) throw new IllegalStateException("Entry was removed"); } }

This is possible since the Map.Entry specification states (highlighted by me):

Card recording (key-value pair). The Map.entrySet method returns a view of the map collection whose elements are in this class. The only way to get a link to a map entry is through an iterator of this type of collection. These Map.Entry objects Map.Entry valid only for the duration of the iteration ; more formally, the behavior of a map entry is undefined if the support map has been changed after the entry was returned by an iterator, except for the setValue operation at the entrance to the map.

If you need all the records at once, you will need to use map.entrySet().toArray() , which can create immutable copies of the records.

Here are some more notes about default maps (all in OpenJDK 1.6.0_20 found in the Ubuntu openjdk6-source package):

The general-purpose maps HashMap and TreeMap (as well as the Hashtable legacy) already use some kind of Entry objects as part of their internal structure (table or tree), so they just let these objects implement Map.Entry and return them. They are not created on the fly by the Iterator.
The same is true for WeakHashMap (where having an Entry object in a strong link does not escape its key in order to get garbage collection, if I understand correctly - but until you call next() on the iterator, the iterator contains the key in the current record )
IdentityHashMap internally uses a simple Object[] , with a variable key and value, so there are no input objects and, therefore, reusing the iterator as a record.
ConcurrentSkipListMap uses Node objects that do not implement anything, so its iterators return new AbstractMap.SimpleImmutableEntry<K,V>(n.key, v); . This means that you cannot use their setValue() method as described in the class documentation:
All Map.Entry pairs returned by the methods of this class, and its representations are snapshots of the mappings at the time they were made. They do not support the Entry.setValue method. (Note, however, that it is possible to change the display on the associated map using put , putIfAbsent or replace , depending on exactly which effect you need.)
ConcurrentHashMap internally uses the HashEntry class similar to HashMap, but it does not implement anything. In addition, there is an inner class WriteThroughEntry (extending AbstractMap.SimpleEntry ), the setValue() method delegates the put method of the map. The iterator returns new objects of this WriteThroughEntry class.

Isn't this repeated through entrySet (), creating too many Map.Entry instances?

More articles: