Why do C# iterators track the creating thread instead of using an interlocked operation?

This is something that has puzzled me ever since I read about iterator implementation on Jon Skeet's website.

There is a simple performance optimization that Microsoft implemented for its compiler-generated iterators: the returned IEnumerable can be reused as the IEnumerator itself, saving an object allocation. Since the IEnumerator has to track state, this is only valid on the first call to GetEnumerator.
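This reuse is easy to observe. The following is a minimal demonstration (the method name `Numbers` is just illustrative): on the creating thread, the first `GetEnumerator` call returns the enumerable object itself, and only subsequent calls allocate.

```csharp
using System;
using System.Collections.Generic;

class Demo
{
    public static IEnumerable<int> Numbers()
    {
        yield return 1;
        yield return 2;
    }

    static void Main()
    {
        IEnumerable<int> seq = Numbers();

        // On the creating thread, the first GetEnumerator call returns the
        // same object: the enumerable doubles as its own enumerator.
        Console.WriteLine(ReferenceEquals(seq, seq.GetEnumerator())); // True

        // A second call must allocate a fresh enumerator.
        Console.WriteLine(ReferenceEquals(seq, seq.GetEnumerator())); // False
    }
}
```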

What I cannot understand is why the development team took the approach they did to ensure thread safety.

Normally, when I am in a similar position, I would use what I consider the simple approach: Interlocked.CompareExchange, to ensure that only one thread can transition the state from "available" to "in progress".

Conceptually it is very simple: one atomic operation, no additional fields required, and so on.
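A sketch of what that alternative could look like. Everything here is hypothetical (the class name, the state encoding, the one-element sequence in MoveNext); it is not what the compiler emits, only an illustration of the compare-exchange idea:

```csharp
using System;
using System.Collections;
using System.Collections.Generic;
using System.Threading;

// Hypothetical alternative: claim the object with one atomic compare-exchange.
class CasSequence : IEnumerable<int>, IEnumerator<int>
{
    const int Available = -2, Running = 0; // illustrative state encoding
    int _state = Available;
    int _current;

    public IEnumerator<int> GetEnumerator()
    {
        // Exactly one thread wins the transition from "available" to
        // "running" and reuses this instance; everyone else allocates.
        if (Interlocked.CompareExchange(ref _state, Running, Available) == Available)
            return this;
        return new CasSequence { _state = Running };
    }

    IEnumerator IEnumerable.GetEnumerator() => GetEnumerator();

    // A trivial one-element sequence, just so the class is complete.
    public bool MoveNext()
    {
        if (_state != Running) return false;
        _state = 1;      // finished
        _current = 42;
        return true;
    }

    public int Current => _current;
    object IEnumerator.Current => Current;
    public void Reset() => throw new NotSupportedException();
    public void Dispose() { }
}
```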

But that is not what the design team chose. Instead, each IEnumerable stores the managed thread ID of the creating thread in a field, and every call to GetEnumerator checks the current thread's ID against that field; only if it is the same thread, and it is the first call, does the IEnumerable return itself as the IEnumerator. That seems harder to reason about, IMO.
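A simplified sketch of that pattern, modeled on decompiled iterator classes. The names here are illustrative (the real emitted class uses unspeakable names like `<>1__state` and `<>l__initialThreadId`), and the one-element MoveNext body is filler just to make the class complete:

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

// Simplified sketch of the compiler-generated pattern (names illustrative).
class ThreadIdSequence : IEnumerable<int>, IEnumerator<int>
{
    int _state;              // -2 = freshly created, not yet enumerated
    int _initialThreadId;
    int _current;

    public ThreadIdSequence()
    {
        _state = -2;
        _initialThreadId = Environment.CurrentManagedThreadId;
    }

    public IEnumerator<int> GetEnumerator()
    {
        // Two plain, non-interlocked reads: reuse ourselves only on the
        // creating thread, and only on the first call.
        if (_state == -2 && _initialThreadId == Environment.CurrentManagedThreadId)
        {
            _state = 0;
            return this;
        }
        return new ThreadIdSequence { _state = 0 };
    }

    IEnumerator IEnumerable.GetEnumerator() => GetEnumerator();

    // A trivial one-element sequence, just so the class is complete.
    public bool MoveNext()
    {
        if (_state != 0) return false;
        _state = 1;      // finished
        _current = 42;
        return true;
    }

    public int Current => _current;
    object IEnumerator.Current => Current;
    public void Reset() => throw new NotSupportedException();
    public void Dispose() { }
}
```

The reuse path is only reachable from the creating thread, so no other thread can race it to the `_state = 0` write; a thread that reads a stale `_state` simply takes the allocation path.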

I'm just wondering why this approach was chosen. Are interlocked operations so much slower than two calls to System.Threading.Thread.CurrentThread.ManagedThreadId that the extra field is justified?

Or is there another reason, perhaps related to weak memory models or ARM devices or something I'm not seeing? Maybe the specification places special requirements on IEnumerable implementations? Genuinely puzzled.

1 answer

I can’t answer definitively, but as for your question:

Are interlocked operations so much slower than two calls to System.Threading.Thread.CurrentThread.ManagedThreadId that the extra field is justified?

Yes, interlocked operations are much slower than two calls to read ManagedThreadId. Interlocked operations are not cheap, because they require multiprocessor systems to synchronize their caches.

From Understanding the Impact of Low-Lock Techniques in Multithreaded Apps:

Interlocked instructions need to make certain that caches are synchronized so that reads and writes don't seem to move past the instruction. Depending on the details of the memory system and how much memory was recently modified on various processors, this can be pretty expensive (hundreds of instruction cycles).

Threading in C# lists the overhead as about 10 ns, while retrieving ManagedThreadId should just be an ordinary, non-interlocked read of static data.

Now, this is just my supposition, but if you think about the normal use case, it would be a call to a function that returns an IEnumerable, which is then immediately iterated over once. So in the standard use case, the object is:

  • Used once
  • Used on the same thread it was created on
  • Short-lived

Thus, this design imposes no synchronization overhead and sacrifices 4 bytes, which will likely be in use only for a very short period of time.

Of course, to prove this, you would need to do a performance analysis to determine the relative costs, and a code analysis to prove what the common case actually is.
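A rough micro-benchmark sketch of the performance side, under the usual caveats (results vary by hardware, and the JIT may partially optimize the loops; the `sink` accumulator is there to keep the thread-ID reads alive). It only compares the two primitives in isolation, not the full iterator paths:

```csharp
using System;
using System.Diagnostics;
using System.Threading;

class Program
{
    static int _state;

    public static void Main()
    {
        const int N = 10_000_000;

        // Cost of one interlocked compare-exchange per iteration.
        var sw = Stopwatch.StartNew();
        for (int i = 0; i < N; i++)
            Interlocked.CompareExchange(ref _state, 1, 0);
        sw.Stop();
        Console.WriteLine(
            $"CompareExchange:  {sw.Elapsed.TotalMilliseconds * 1_000_000 / N:F1} ns/op");

        // Cost of two plain ManagedThreadId reads per iteration,
        // mirroring the two calls mentioned in the question.
        int sink = 0;
        sw.Restart();
        for (int i = 0; i < N; i++)
            sink += Thread.CurrentThread.ManagedThreadId
                  + Thread.CurrentThread.ManagedThreadId;
        sw.Stop();
        Console.WriteLine(
            $"2x ManagedThreadId: {sw.Elapsed.TotalMilliseconds * 1_000_000 / N:F1} ns/op (sink={sink != 0})");
    }
}
```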


Source: https://habr.com/ru/post/1383371/
