How is the iterator fail-safe in ConcurrentHashMap?

I understand why the iterator of CopyOnWriteArrayList is thread-safe: it holds a reference to a snapshot of the backing array taken when the iterator is created, and all mutative operations (add, set, and so on) are implemented by making a fresh copy of the underlying array, so they never affect the copy that the snapshot reference points to. The same applies to CopyOnWriteArraySet.

But I am struggling with ConcurrentHashMap, so please share your ideas on how the iterator is fail-safe in the case of ConcurrentHashMap.

+5
source share
6 answers

Your question is a bit ambiguous - you say fail-safe in the title but thread-safe in the body. I assume you mean thread safety.

From the source on GrepCode:

... Retrieval operations (including get) generally do not block, so they may overlap with update operations (including put and remove). Retrievals reflect the results of the most recently completed update operations holding upon their onset. For aggregate operations such as putAll and clear, concurrent retrievals may reflect insertion or removal of only some entries. Similarly, Iterators and Enumerations return elements reflecting the state of the hash table at some point at or since the creation of the iterator/enumeration. They do not throw java.util.ConcurrentModificationException.

So the iteration is thread-safe, but the contract is only that Iterators and Enumerations return elements reflecting the state of the hash table at some point at or since the creation of the iterator/enumeration.
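A small demonstration of that contract (the class name is mine, but the behaviour is the documented one): the map is modified after the iterator is created, and iteration still completes without an exception.

```java
import java.util.Iterator;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Modify a ConcurrentHashMap after creating an iterator over it;
// iteration still completes without ConcurrentModificationException.
public class WeaklyConsistentDemo {
    public static void main(String[] args) {
        Map<String, Integer> map = new ConcurrentHashMap<>();
        map.put("a", 1);
        map.put("b", 2);

        Iterator<String> it = map.keySet().iterator();
        map.put("c", 3);   // added after the iterator was created
        map.remove("a");   // removed after the iterator was created

        while (it.hasNext()) {
            // "c" may or may not show up - exactly the "at some point at or
            // since the creation of the iterator" contract quoted above.
            System.out.println(it.next());
        }
    }
}
```

Whether "c" is seen depends on how far the iterator had advanced when the update landed; the only guarantee is that no exception is thrown.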

+2
source

From the docs: "Similarly, Iterators and Enumerations return elements reflecting the state of the hash table at some point at or since the creation of the iterator/enumeration."

Iterators created by a ConcurrentHashMap point to the hash table array as of the moment the iterator was created, so they do not take into account updates that cause the hash table to be resized, which their specification allows. Iterating within a hash bucket is safe because the bucket's node references are volatile.

0
source

Well, the entire hash map (at least conceptually; the actual implementation is a bit more complicated) is an array of linked lists. A fail-safe (not thread-safe!) iterator can be implemented quite trivially by walking the linked lists one by one, from the first node to the last. Something like this would work:

    public boolean hasNext() {
        if (next != null)
            return true;
        currentNode = currentNode == null ? null : currentNode.next; // volatile read
        while (currentNode == null && ++currentIndex < buckets.length) // assume a fixed length for simplicity
            currentNode = buckets[currentIndex].head; // also volatile
        if (currentNode != null)
            next = currentNode.data;
        return currentNode != null;
    }

    public Object next() {
        if (!hasNext())
            throw new NoSuchElementException();
        Object result = next;
        next = null;
        return result;
    }

This is not really specific to ConcurrentHashMap; they could have implemented regular maps this way too, but they decided not to. Why? Well, because a "regular" map is not thread-safe, so modifying it concurrently is not a good idea. If that happens, it is most likely a bug rather than something intentional, so it is better to fail fast as soon as the situation is detected than to silently ignore it and continue, risking subtle and hard-to-diagnose inconsistencies later.
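Here is the fail-fast behaviour described above in action (class name is mine): a plain HashMap iterator detects a structural modification via its internal modCount check and throws immediately, even in a single thread.

```java
import java.util.ConcurrentModificationException;
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

// A plain HashMap iterator fails fast on structural modification,
// even when only one thread is involved.
public class FailFastDemo {
    public static void main(String[] args) {
        Map<String, Integer> map = new HashMap<>();
        map.put("a", 1);
        map.put("b", 2);

        Iterator<String> it = map.keySet().iterator();
        map.put("c", 3); // structural modification after the iterator was created

        try {
            it.next(); // the modCount mismatch is detected here
        } catch (ConcurrentModificationException e) {
            System.out.println("fail-fast: " + e);
        }
    }
}
```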

If you ask me whether I agree with that last argument, the answer is "no" :) But apparently enough people among the Java designers backed it when this decision was made.

0
source

From the comments in the source of ConcurrentHashMap.java:

The basic strategy is to subdivide the table among Segments, each of which itself is a concurrently readable hash table.

0
source

Regarding the Java 8 implementation of ConcurrentHashMap:

Despite the javadoc wording, the ConcurrentHashMap iterator does NOT work on a snapshot: it works on the live concurrent hash table and handles all concurrency cases optimistically, just like lock-free data structures do.

Basically, it holds a reference to the current hash table, the bin index in that table, and the last Node returned. Here are some of the tricks it uses to work on a concurrent hash table without locking:

  • If the node it currently holds has been deleted, move on to the next bin (by incrementing the index)
  • An optimistic "lock" (i.e. just retry) for cases where the hash table may be in an inconsistent state
  • A special node type (ForwardingNode) for detecting a rehash in progress

See the ConcurrentHashMap.Traverser.advance() method.
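To make the ForwardingNode trick concrete, here is a hypothetical, heavily simplified sketch. The class and field names (Node, ForwardingNode, nextTable) only mirror the JDK's; this is not the real implementation. The idea: once a resize has transferred a bin to the new table, the old bin's head is replaced by a ForwardingNode, and a traverser follows it into the new table instead of failing.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified sketch (not JDK code) of how a traverser can survive a resize
// by following ForwardingNode markers into the new table.
public class TraverserSketch {
    static class Node {
        final String key;
        volatile Node next;
        Node(String key, Node next) { this.key = key; this.next = next; }
    }

    // Marker installed at the head of an old bin after it has been transferred.
    static class ForwardingNode extends Node {
        final Node[] nextTable;
        ForwardingNode(Node[] nextTable) { super(null, null); this.nextTable = nextTable; }
    }

    // Collect every key, following forwarding nodes into the new table.
    static List<String> traverse(Node[] table) {
        List<String> keys = new ArrayList<>();
        for (int i = 0; i < table.length; i++) {
            Node n = table[i];
            if (n instanceof ForwardingNode) {
                Node[] newTable = ((ForwardingNode) n).nextTable;
                // After doubling, old bin i splits into bins i and i + oldLength.
                collectBin(newTable[i], keys);
                collectBin(newTable[i + table.length], keys);
            } else {
                collectBin(n, keys);
            }
        }
        return keys;
    }

    static void collectBin(Node n, List<String> keys) {
        for (; n != null; n = n.next) keys.add(n.key);
    }

    static List<String> demo() {
        Node[] newTable = new Node[4];
        newTable[1] = new Node("a", null);
        newTable[3] = new Node("b", null);
        Node[] oldTable = new Node[2];
        oldTable[0] = new Node("c", null);          // bin 0 not yet transferred
        oldTable[1] = new ForwardingNode(newTable); // bin 1 already transferred
        return traverse(oldTable);
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints [c, a, b]
    }
}
```

The real Traverser additionally remembers where it was in the old table so it can resume there after finishing the forwarded bins, but the no-exception principle is the same.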

0
source

The ConcurrentHashMap iterator is fail-safe, which means it does not throw a ConcurrentModificationException even if the underlying ConcurrentHashMap is modified after iteration begins. Looking at the Java source shows that its iterator extends HashIterator (line 1251), which has no synchronization but allows multiple iterators to traverse the map at the same time; the map can be modified in the meantime without any exception being thrown, though there is no guarantee that a newly added entry will be returned. That is why you can say the ConcurrentHashMap iterator is fail-safe!

0
source

Source: https://habr.com/ru/post/1210591/

