Why is explicit threading bad?

In a previous question I made a bit of a faux pas. You see, I had read about threads and came away with the impression that they were the greatest thing since kiwi jello.

Imagine my confusion when I read things like this:

[T]hreads are a very bad thing. Or at least explicit threading is bad.

and

Updating the user interface across threads is usually a sign that you are abusing threads.

Since I kill a puppy every time something confuses me, consider this your chance to get your karma back in the black ...

So: how should I use threads?

+43
multithreading c#
Jun 24 '10
12 answers

Enthusiasm for learning about threading is great; don't get me wrong. Enthusiasm for using lots of threads, by contrast, is a symptom of what I call "thread happiness disease".

Developers who have just learned about the power of threads start asking questions like "how many threads can I create in one program?" This is rather like an English major asking "how many words can I use in a sentence?" Typical advice for writers is to keep your sentences short and to the point, rather than trying to cram as many words and ideas into one sentence as possible. Threads are the same; the right question is not "how many can I get away with creating?" but rather "how can I write this program so that the number of threads is the minimum necessary to get the job done?"

Threads solve many problems, it's true, but they also introduce enormous problems of their own:

  • Performance analysis of multi-threaded programs is often extremely difficult and deeply counterintuitive. I've seen real-world examples in heavily multi-threaded programs where making a function faster, without slowing down any other function or using more memory, decreased total system throughput. Why? Because threads are often like streets downtown. Imagine taking every street and magically making it shorter without retiming the traffic lights. Would traffic jams get better or worse? Making individual functions faster in multi-threaded programs drives the processors into contention faster.

You want threads to be like interstate highways: no traffic lights, highly parallel, intersecting at a small number of very clearly defined, carefully engineered points. That is very hard to do. Most heavily multi-threaded programs are more like dense urban cores with gridlock everywhere.

  • Writing your own custom thread management is insanely difficult. The reason is that when you write a normal single-threaded program in a well-designed way, the amount of "global state" you have to reason about is typically small. Ideally you write objects that have well-defined boundaries and that don't care about the control flow that invokes their members. You want to invoke an object from a loop, or a switch, or whatever, go right ahead.

Multi-threaded programs with custom thread management require a global understanding of everything that every thread is going to do that might affect data visible from another thread. You pretty much have to hold the entire program in your head, and understand all the possible ways two threads could interact, in order to get it right and prevent deadlocks or data corruption. That is a large cost to take on, and it is highly error-prone.

  • Essentially, threads make your methods lie. Let me give you an example. Suppose you have:

    if (!queue.IsEmpty) queue.RemoveWorkItem().Execute();

Is this code correct? If it's single-threaded, probably. If it's multi-threaded, what stops another thread from removing the last remaining item right after the call to IsEmpty? Nothing, that's what. This code, which looks just fine, is a bomb waiting to go off in a multi-threaded program. Effectively, this code is actually:

if (queue.WasNotEmptyAtSomePointInThePast) ... 

which is obviously pretty useless.

So suppose you decide to fix the problem by locking the queue. Is this correct?

    lock (queue)
    {
        if (!queue.IsEmpty)
            queue.RemoveWorkItem().Execute();
    }

This is also not correct. Suppose Execute causes code to run that waits on a resource currently locked by another thread, while that other thread is waiting on the lock for the queue. What happens? Both threads wait forever. Putting a lock around a chunk of code requires you to know everything that code could possibly do with any shared resource, so that you can work out whether any deadlocks are possible. Again, that is an extremely heavy burden to place on someone writing what ought to be very simple code. (The right thing to do here is probably to extract the work item inside the lock and then execute it outside the lock. But ... what if the items are in a queue because they have to be executed in a particular order? Then that code is also wrong, because other threads can then execute later jobs first.)
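That extract-inside-the-lock, execute-outside pattern can be sketched as follows. This is a minimal illustration, not code from the answer: the WorkItem class, DrainOne method and field names are all invented, and note that (as the caveat above says) it deliberately gives up strict ordering.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

class WorkItem
{
    public int Id;
    public void Execute() => Console.WriteLine($"executed {Id}");
}

class Program
{
    static readonly Queue<WorkItem> queue = new Queue<WorkItem>();
    static readonly object queueLock = new object();

    // Take the item while holding the lock, but run it with no locks held,
    // so Execute() can never deadlock against a thread waiting on queueLock.
    static void DrainOne()
    {
        WorkItem item = null;
        lock (queueLock)
        {
            if (queue.Count > 0)
                item = queue.Dequeue();
        }
        if (item != null)
            item.Execute();   // runs outside the lock
    }

    static void Main()
    {
        for (int i = 0; i < 3; i++)
            lock (queueLock) queue.Enqueue(new WorkItem { Id = i });

        var threads = new Thread[2];
        for (int t = 0; t < threads.Length; t++)
        {
            threads[t] = new Thread(() => { DrainOne(); DrainOne(); });
            threads[t].Start();
        }
        foreach (var t in threads) t.Join();

        lock (queueLock) Console.WriteLine($"remaining: {queue.Count}");
    }
}
```

The check-and-dequeue are now atomic, so the IsEmpty race from the snippet above cannot happen; the price is that two dequeued items may finish executing in either order.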

  • It gets worse. The C# language specification guarantees that a single-threaded program will have observable behaviour exactly as the program is specified. That is, if you have something like `if (M(ref x)) b = 10;` then you know that the generated code will behave as though x is accessed by M before b is written. Now, the compiler, the jitter and the CPU are all free to optimize that. If one of them can determine that M is going to be true, and if we know that on this thread the value of b is never read after the call to M, then b can be assigned before x is accessed. All that is guaranteed is that the single-threaded program appears to work the way it was written.

Multi-threaded programs do not make that guarantee. If you examine b and x on another thread while this one is running, you can see b change before x is accessed, if that optimization is made. Reads and writes can logically move forwards and backwards in time with respect to each other in single-threaded programs, and those movements can be observed in multi-threaded programs.

This means that in order to write multi-threaded programs whose logic depends on things being observed to happen in the same order as the code is actually written, you must have a detailed understanding of the "memory model" of the language and the runtime. You need to know exactly what guarantees are made about how accesses can move around in time, and you cannot just test on your x86 box and hope for the best; x86 chips have pretty conservative optimizations compared with some other chips out there.
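The publish-a-value pattern this memory-model discussion revolves around can be sketched with .NET's Volatile class (the field names here are invented for illustration). Without the Volatile calls, the reorderings described above could, on a weakly ordered machine, let the consumer observe ready flip to 1 before the write to data becomes visible:

```csharp
using System;
using System.Threading;

class Program
{
    static int data;
    static int ready;   // 0 = not published, 1 = published

    static void Main()
    {
        var consumer = new Thread(() =>
        {
            // Volatile.Read has acquire semantics: once we see ready == 1,
            // we are guaranteed to also see the write to data that
            // happened before the corresponding Volatile.Write.
            while (Volatile.Read(ref ready) == 0)
                Thread.SpinWait(1);
            Console.WriteLine($"data = {data}");
        });
        consumer.Start();

        data = 42;                      // ordinary write ...
        Volatile.Write(ref ready, 1);   // ... published with release semantics

        consumer.Join();
    }
}
```

This is exactly the kind of ordering dependency that "works on my x86 box" can silently hide, because x86 happens to give you most of these guarantees for free.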

That's just a brief overview of a few of the problems you run into when writing your own multi-threaded logic. There are plenty more. So, some advice:

  • Learn about threading.
  • Don't write your own thread management in production code.
  • Use the highest-level libraries written by experts to solve your threading problems. If you have a bunch of work that needs doing in the background and you want to farm it out to worker threads, use the thread pool rather than writing your own thread-creation logic. If you have a problem that is amenable to multiple processors at once, use the Task Parallel Library. If you want to lazily initialize a resource, use the lazy initialization class rather than trying to write the locking code yourself.
  • Avoid sharing state.
  • If you can't avoid sharing state, share immutable state.
  • If you have to share mutable state, prefer using locks to lock-free techniques.
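To make that advice concrete, here is a sketch using the expert-written primitives named above: the thread pool, the Task Parallel Library, and Lazy&lt;T&gt; for lazy initialization. The APIs are real .NET; the little program around them is invented for illustration.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

class Program
{
    // Lazy<T> gives thread-safe lazy initialization with no hand-written locks.
    static readonly Lazy<int[]> table =
        new Lazy<int[]>(() => new int[] { 1, 2, 3, 4, 5 });

    static void Main()
    {
        // Task Parallel Library: let the library schedule and partition the work.
        long total = 0;
        Parallel.For(0, table.Value.Length,
            () => 0L,                                 // per-thread local sum
            (i, state, local) => local + table.Value[i],
            local => Interlocked.Add(ref total, local));
        Console.WriteLine($"total = {total}");

        // Thread pool: queue small background jobs instead of new Thread(...).
        using (var done = new ManualResetEventSlim())
        {
            ThreadPool.QueueUserWorkItem(_ => done.Set());
            done.Wait();
        }
        Console.WriteLine("background job finished");
    }
}
```

Notice that nothing here creates a Thread directly, and the only shared mutable state (total) is touched through Interlocked.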
+105
Jun 24 '10 at 14:51

Explicit threading is not inherently bad, but it is fraught with danger and shouldn't be done unless absolutely necessary.

Saying threads are good is like saying propellers are good: propellers work great on airplanes (when jet engines aren't a better alternative), but they wouldn't be a good idea on a car.

+11
Jun 24 '10

You can't appreciate what trouble threading can get you into until you have debugged a three-way deadlock. Or spent a month chasing a race condition that occurs only once a day. So go ahead and jump in with both feet, and make all the mistakes you need to make in order to learn to fear the beast and to know what to do to stay out of trouble.

+8
Jun 24 '10

I couldn't offer a better answer than those already here. But I can offer a concrete example of some multi-threaded code we had at my workplace that was disastrous.

One of my colleagues was, like you, very enthusiastic about threads when he first found out about them. So code like the following started appearing all over the program:

    Thread t = new Thread(LongRunningMethod);
    t.Start(GetThreadParameters());

Basically, he was creating threads all over the place.

Eventually another colleague discovered this and told the responsible developer: don't do that! Creating threads is expensive, you should use the thread pool, etc., etc. So many places in the code that originally looked like the snippet above were rewritten as:

 ThreadPool.QueueUserWorkItem(LongRunningMethod, GetThreadParameters()); 

A big improvement, right? Everything's fine now?

Well, except that there was a particular call inside LongRunningMethod that could block for a long time. Suddenly, every now and then, we started seeing that something our software was supposed to respond to immediately ... just wasn't happening. In fact, it might not respond for several seconds (to clarify: I work for a trading firm, so this was a complete disaster).

What had actually happened was that the thread pool was filling up with long-blocking calls, which caused other work that was supposed to be queued and run very quickly not to run until much later than it should have.
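One standard mitigation for exactly this failure mode, sketched below with invented method names, is to tell the scheduler that a piece of work will block for a long time via TaskCreationOptions.LongRunning, so it gets a dedicated thread instead of tying up a pool worker while short pool jobs stay responsive:

```csharp
using System;
using System.Diagnostics;
using System.Threading;
using System.Threading.Tasks;

class Program
{
    static void Main()
    {
        // LongRunning hints that this task will block for a while, so the
        // default scheduler gives it its own thread rather than a pool worker.
        Task blocking = Task.Factory.StartNew(
            () => Thread.Sleep(500),            // stand-in for the blocking call
            TaskCreationOptions.LongRunning);

        // Meanwhile, short pool work should still be scheduled promptly.
        var sw = Stopwatch.StartNew();
        Task quick = Task.Run(
            () => Console.WriteLine($"quick job ran after {sw.ElapsedMilliseconds} ms"));

        Task.WaitAll(blocking, quick);
        Console.WriteLine("all done");
    }
}
```

This is only a sketch of the scheduling hint; as the answer goes on to say, the deeper fixes in the real incident were design fixes, not a flag.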

The moral of this story is, of course, not that the first approach of creating your own threads was the right one (it isn't). It is that using threads is hard and error-prone, and that, as others have said, you must be very careful when you use them.

In our specific situation, many mistakes were made:

  • Creating new threads in the first place was wrong, because it was far more expensive than the developer realized.
  • Queuing all background work to the thread pool was wrong, because it treated all background tasks indiscriminately and didn't account for the possibility of asynchronous calls actually blocking.
  • Having a method that blocked for so long was itself the result of some careless and very lazy use of the lock keyword.
  • Insufficient attention was paid to ensuring that the code running on background threads was actually thread-safe (it wasn't).
  • Insufficient thought was given to whether much of the affected code should have been made multi-threaded in the first place. In many cases the answer was no: the multithreading just introduced complexity and bugs, made the code less understandable, and (here's the kicker) hurt performance.

I'm happy to say that today we're still alive and our code is in a much healthier state than it once was. And we do use multithreading in the many places where we've decided it is appropriate and have measured performance gains (for example, reduced latency between receiving a market data tick and having an outgoing quote acknowledged by the exchange). But we learned some pretty important lessons the hard way. Chances are, if you ever work on a large, heavily multi-threaded system, you will too.

+6
Oct 03 '10

Unless you could write a complete kernel scheduler yourself, you will never get explicit thread control entirely right.

Threads can be the greatest thing since hot chocolate, but parallel programming is incredibly complex. On the other hand, if you design your threads to be independent, you can't shoot yourself in the foot.

As a rule of thumb, if a problem is decomposed into threads, they should be as independent as possible, with as few, but clearly defined, shared resources as possible, and with a minimalistic communication concept.

+5
Jun 24 '10 at 13:40

I think the first statement is best explained like this: with the many modern APIs now available, manually writing your own threading code is almost never necessary. The new APIs are much easier to use and much harder to mess up! Whereas with the old-style APIs you had to be quite good to avoid screwing up. The old-style APIs ( Thread et al.) are still available, but the newer APIs ( the Task Parallel Library , Parallel LINQ , and Reactive Extensions ) are the way of the future.
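As a taste of how much harder the new APIs are to mess up, here is a sketch of the same query written sequentially and then parallelized with Parallel LINQ; a single AsParallel() call replaces all hand-written thread management (the numbers are arbitrary):

```csharp
using System;
using System.Linq;

class Program
{
    static void Main()
    {
        // Sequential LINQ ...
        int sequential = Enumerable.Range(1, 1000).Sum();

        // ... and the same query with PLINQ. No threads, no locks: the
        // library partitions the range across cores and merges the results.
        int parallel = Enumerable.Range(1, 1000).AsParallel().Sum();

        Console.WriteLine($"{sequential} {parallel}");   // 500500 500500
    }
}
```

There is simply no opportunity here for the deadlocks and races that a hand-rolled worker-thread version of this sum would invite.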

The second statement is more of a design point, IMO. In a design with a clean separation of concerns, a background task really shouldn't be reaching directly into the user interface to report updates. There should be some separation, using a pattern like MVVM or MVC.

+4
Jun 24 '10 at 13:31

I would start by questioning this perception:

I read about threads and came away with the impression that they were the greatest thing since kiwi jello.

Don't get me wrong: threads are a very versatile tool, but this degree of enthusiasm seems odd. In particular, it suggests that you might be using threads in a lot of situations where they simply don't make sense (but then again, I may just be misreading your enthusiasm).

As others have pointed out, thread handling is additional complexity and hard to get right. Wrappers for threads exist, and only in rare cases do threads have to be handled explicitly. For most applications, threads can stay implicit.

For example, if you just want to push a computation into the background while keeping the GUI responsive, the better solutions are either to use a callback (which makes it look as if the computation runs in the background while actually executing at well-defined points on the same thread) or to use a convenience wrapper such as BackgroundWorker , which takes over and hides all the explicit thread handling.
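The BackgroundWorker pattern just mentioned can be sketched in a console program like this. Note the caveat: the ManualResetEventSlim exists only so this demo can wait for completion; in a real GUI app you would simply return, and RunWorkerCompleted would be marshalled back to the UI thread by the synchronization context.

```csharp
using System;
using System.ComponentModel;
using System.Threading;

class Program
{
    static void Main()
    {
        var done = new ManualResetEventSlim();
        var worker = new BackgroundWorker();

        // DoWork runs on a thread-pool thread.
        worker.DoWork += (s, e) => e.Result = 6 * 7;

        // In a GUI app this fires on the UI thread; in this console
        // sketch (no synchronization context) it fires on a pool thread.
        worker.RunWorkerCompleted += (s, e) =>
        {
            Console.WriteLine($"result = {e.Result}");
            done.Set();
        };

        worker.RunWorkerAsync();
        done.Wait();   // demo only; a GUI app would just return here
    }
}
```

The calling code never touches a Thread object, which is exactly the point: the wrapper owns the explicit thread handling.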

Incidentally, creating a thread is actually quite expensive. Using the thread pool mitigates this cost, because there the runtime creates a number of threads that are subsequently reused. When people say that explicit thread control is bad, this may be all they mean.

+3
Jun 24 '10

Many advanced GUI applications consist of two threads: one for the user interface, and one (or sometimes more) for processing data (copying files, doing heavy computations, loading data from a database, etc.).

The processing threads shouldn't update the user interface directly; the UI should be a black box to them (see Wikipedia on encapsulation).
They just say "I'm done processing" or "I've completed task 7 of 9" by raising an event or calling some other callback method. The UI subscribes to the event, checks what has changed, and updates the interface accordingly.

If you update the user interface from a worker thread, you won't be able to reuse your code, and you'll run into big trouble if you later want to change parts of it.
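The "I've completed task 7 of 9" idea above can be sketched as follows. The Worker class raises events and knows nothing about whoever subscribes; all names here are invented for illustration, and a real UI handler would additionally marshal to the UI thread.

```csharp
using System;
using System.Threading;

// The worker knows nothing about the UI; it only raises events.
class Worker
{
    public event Action<int> ProgressChanged;
    public event Action Finished;

    public void Run()
    {
        for (int task = 1; task <= 3; task++)
        {
            // ... heavy processing would go here ...
            ProgressChanged?.Invoke(task);
        }
        Finished?.Invoke();
    }
}

class Program
{
    static void Main()
    {
        var worker = new Worker();

        // The "UI" subscribes and decides for itself how to render updates.
        worker.ProgressChanged += n => Console.WriteLine($"Completed task {n} of 3");
        worker.Finished += () => Console.WriteLine("I'm done processing");

        var thread = new Thread(worker.Run);
        thread.Start();
        thread.Join();
    }
}
```

Because the worker depends only on its own events, it can be reused under a different UI, or none at all, without changing a line of it.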

+2
Jun 24 '10 at 13:44

I think you should experiment with threads as much as possible and learn about the benefits and pitfalls of using them. Only through experimentation and use will your understanding of them grow. Read as much as you can on the subject.

When it comes to C# and the user interface (which is single-threaded: you can only change UI elements from code running on the UI thread), I use the following utility to stay sane and sleep soundly at night:

    public static class UIThreadSafe
    {
        public static void Perform(Control c, MethodInvoker inv)
        {
            if (c == null)
                return;
            if (c.InvokeRequired)
                c.Invoke(inv, null);
            else
                inv();
        }
    }

You can use this from any thread that needs to change a UI element, for example:

    UIThreadSafe.Perform(myForm, delegate() {
        myForm.Title = "I Love Threads!";
    });
+2
Jun 24 '10 at 14:09

Threads are a very good thing, I think, but working with them is hard and requires a lot of knowledge and practice. The main problem arises when two threads access a shared resource, which can cause unwanted effects.

Consider a classic example: you have two threads that take items from a shared list and, after doing something, remove the item from the list.

A thread method that is invoked periodically might look like this:

    void Thread()
    {
        if (list.Count > 0)
        {
            // Do stuff
            list.RemoveAt(0);
        }
    }

Remember that threads can, in theory, be switched at any line of your code that is not synchronized. So if the list contains only one item, one thread might pass the list.Count condition, then, before it reaches list.RemoveAt , be switched out while the other thread also passes list.Count (the list still contains one item). Now the first thread continues to list.RemoveAt , and afterwards the second thread continues to list.RemoveAt , but the last item has already been removed by the first thread, so the second one fails. That is why the access must be synchronized with a lock statement, so there can never be a situation where two threads are inside the if at the same time.
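The corrected version of the snippet above, with the check and the removal made atomic under a lock, can be sketched like this (the driver program, list contents and counters are invented for the demo):

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

class Program
{
    static readonly List<int> list = new List<int> { 10, 20, 30, 40, 50 };
    static readonly object sync = new object();
    static int removed;

    // The check and the removal now happen atomically: no thread can be
    // switched in between list.Count and list.RemoveAt.
    static void ThreadProc()
    {
        for (int i = 0; i < 5; i++)
        {
            lock (sync)
            {
                if (list.Count > 0)
                {
                    // Do stuff
                    list.RemoveAt(0);
                    removed++;
                }
            }
        }
    }

    static void Main()
    {
        var a = new Thread(ThreadProc);
        var b = new Thread(ThreadProc);
        a.Start(); b.Start();
        a.Join(); b.Join();
        Console.WriteLine($"removed {removed}, left {list.Count}");
    }
}
```

Ten attempts race over five items, but each item is removed exactly once and no thread ever sees a count that has gone stale between the test and the removal.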

And that is exactly why the user interface, which is not synchronized, must always run on a single thread, and why no other thread should meddle with the UI directly.

In earlier versions of .NET, if you wanted to update the UI from another thread, you had to synchronize using Invoke methods; since that was fairly hard to implement, newer versions of .NET added the BackgroundWorker class, which simplifies things by wrapping it all up, letting you do the asynchronous work in the DoWork event and update the interface from the ProgressChanged event.

+1
Jun 24 '10

The big reason to keep the UI thread and the processing threads as independent as possible is that if the UI thread hangs, the user will notice and be unhappy. It is critical that the UI thread stay blazingly fast. If you start moving UI work off the UI thread, or moving processing work onto it, you run the risk of your application locking up.

In addition, a lot of the framework code is intentionally written with the expectation that you will separate UI and processing; programs will work better when you keep the two separate, and run into bugs and problems when you don't. I don't remember any specific problems I hit because of this, though I have vague memories of trying to set certain properties of UI-related objects from outside the UI thread and having the code refuse to work; I don't remember whether it failed to compile or threw an exception.

+1
Jun 24 '10 at 14:45

When updating the user interface from a thread other than the UI thread, there are a few things to keep in mind:

  1. If you use "Invoke" frequently, the performance of your non-UI thread can be severely affected if other things make the UI thread run sluggishly. I prefer to avoid "Invoke" unless the non-UI thread genuinely needs to wait for the UI action to complete before continuing.
  2. If you use BeginInvoke recklessly for things like control updates, an excessive number of invocation delegates can pile up in the queue, some of which may be quite worthless by the time they actually run.

My preferred style in many cases is to have each control's state encapsulated in an immutable class, along with a flag indicating whether an update is pending, needed but not pending, or not needed (the last situation can arise if a request to update a control arrives before the control has been fully created). The control's update routine should, if an update is needed, start by clearing the update flag, then capture the state and redraw the control; if the update flag has been set again by then, it should loop. To request a redraw from another thread, a routine should use Interlocked.Exchange to set the update flag to pending and, if it wasn't already pending, try to BeginInvoke the update routine; if the BeginInvoke fails, set the update flag to "needed but not pending".
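A rough console sketch of that coalescing flag follows, with ThreadPool.QueueUserWorkItem standing in for BeginInvoke and only two flag states instead of the three described above (all names are invented). Redundant requests collapse, so typically far fewer redraws run than requests are made:

```csharp
using System;
using System.Threading;

class Program
{
    const int NotNeeded = 0, Pending = 1;

    static int updateFlag = NotNeeded;
    static int redraws;
    static int latestState;

    // Any thread may call this; redundant requests coalesce into one redraw.
    static void RequestUpdate(int newState)
    {
        Volatile.Write(ref latestState, newState);
        if (Interlocked.Exchange(ref updateFlag, Pending) == NotNeeded)
        {
            // Stand-in for BeginInvoke; a real UI queue would also
            // serialize the redraws onto the single UI thread.
            ThreadPool.QueueUserWorkItem(_ => Redraw());
        }
    }

    static void Redraw()
    {
        // Clear the flag first, then capture the state, as described above,
        // so a request arriving mid-redraw triggers one more redraw.
        Interlocked.Exchange(ref updateFlag, NotNeeded);
        int state = Volatile.Read(ref latestState);
        Interlocked.Increment(ref redraws);
        Console.WriteLine($"redrawn with state {state}");
    }

    static void Main()
    {
        for (int i = 1; i <= 100; i++)
            RequestUpdate(i);

        Thread.Sleep(300);   // let queued redraws drain (demo only)
        Console.WriteLine($"{Volatile.Read(ref redraws)} redraw(s) for 100 requests");
    }
}
```

Exactly how many redraws run depends on timing, but each one paints the latest captured state, which is the whole point of the scheme.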

With this approach, redundant update requests are coalesced: the control is always redrawn with the most recent state, and there is never more than one update delegate outstanding via BeginInvoke.

0
Jun 24 '10 at 16:18


