Explanation / solution for a deadlock in Delphi

In the server application, we have the following: a class called JobManager, which is a singleton, and another class, the Scheduler, which keeps checking whether it is time to add any jobs to the JobManager.

When the time comes, the Scheduler will do something like:

TJobManager.Singleton.NewJobItem(parameterlist goes here...); 

At the same time, in the client application, the user does something that causes a call to the server. Internally, the server sends a message to itself, and one of the classes listening for this message is the JobManager. JobManager processes the message and determines that it is time to add a new job to the list, which it does by calling its own method:

 NewJobItem(parameter list...); 

In the NewJobItem method, I have something like this:

  CS.Acquire;
  try
    DoSomething;
    CallAMethodWithAnotherCriticalSessionInternally;
  finally
    CS.Release;
  end;

It is at this point (CS.Acquire) that the system deadlocks. The client and server applications communicate through Indy 10. I think the RPC call, which runs the server application method that sends the message to JobManager, runs in the context of an Indy thread.

The Scheduler has its own thread, which calls the JobManager method directly. Is this situation prone to deadlocks? Can someone help me understand why there is a deadlock here?

We knew that a certain client action sometimes caused the system to block. I finally tracked it down to this point, where the critical section in the same class is reached twice, from two different places (the Scheduler and the JobManager message handler method).

Additional Information

I want to add that (it might be stupid, but anyway ...) inside DoSomething there is still

  CS.Acquire;
  try
    Do other stuff...
  finally
    CS.Release;
  end;

Does this inner CS.Release do anything to the outer CS.Acquire? If so, this could be the point at which the Scheduler enters the critical section, and all the locking and unlocking turns into a mess.

1 answer

There is not enough information about your system to tell you definitively whether your JobManager and Scheduler cause a deadlock, but if they both call the same NewJobItem method, then this should not be a problem, since they will both acquire the locks in the same order.

As for whether the CS.Acquire in NewJobItem and the CS.Acquire in DoSomething interact with each other: it depends. If the two methods use different lock objects, the two calls are independent of each other. If they use the same object, it depends on the type of lock. If the lock is reentrant (that is, it allows the same thread to acquire it several times and counts how many times it has been acquired and released), this should not be a problem. If, on the other hand, you have simple lock objects that do not support reentry, then the CS.Release in DoSomething can release the lock for this thread, and CallAMethodWithAnotherCriticalSessionInternally will then run without the protection of the CS lock that was acquired in NewJobItem.
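To make the reentrant case concrete, here is a minimal sketch in Python rather than Delphi, using the standard `threading.RLock` as a stand-in for a reentrant critical section. The helper `try_from_other_thread` and the variable names are made up for illustration; the comments map the calls onto NewJobItem and DoSomething from the question.

```python
import threading

def try_from_other_thread(lock):
    """Return True if a different thread can take the lock right now."""
    result = []

    def probe():
        got = lock.acquire(blocking=False)  # never waits
        if got:
            lock.release()
        result.append(got)

    t = threading.Thread(target=probe)
    t.start()
    t.join()
    return result[0]

cs = threading.RLock()        # reentrant: counts acquisitions per owning thread

cs.acquire()                  # outer acquire (as in NewJobItem)
cs.acquire()                  # inner acquire (as in DoSomething)
cs.release()                  # inner release: count goes 2 -> 1, lock still held
still_protected = not try_from_other_thread(cs)
cs.release()                  # outer release: count goes 1 -> 0, lock is free
free_afterwards = try_from_other_thread(cs)

plain = threading.Lock()      # non-reentrant
plain.acquire()
# A second blocking acquire() on the same thread would hang forever,
# so the attempt is made without blocking.
second_acquire_ok = plain.acquire(blocking=False)
plain.release()

print(still_protected, free_afterwards, second_acquire_ok)
```

With a reentrant lock, the inner release only decrements the count, so other threads stay locked out until the outer release. Note that Python's plain `Lock` differs from the non-reentrant scenario described above: the nested acquire blocks on itself before the inner release is ever reached.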

A deadlock occurs when two or more threads are running and each one waits for another to release something it holds before it can continue.

Example:

 Thread 1 executes:
   lock_a.acquire()
   lock_b.acquire()
   lock_b.release()
   lock_a.release()

 Thread 2 executes:
   lock_b.acquire()
   lock_a.acquire()
   lock_a.release()
   lock_b.release()

Note that thread 2 acquires the locks in the opposite order from thread 1. Suppose thread 1 acquires lock_a and is then preempted, and thread 2 starts, acquires lock_b, and then waits for lock_a to become available before it can continue. Thread 1 then resumes, and the next thing it does is try to acquire lock_b, but that is already held by thread 2, so it waits. We end up in a situation where thread 1 waits for thread 2 to release lock_b, and thread 2 waits for thread 1 to release lock_a.

This is a deadlock.
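The interleaving described for the two threads can be reproduced on purpose. The following is a sketch in Python, not code from the application in the question: a `threading.Barrier` forces both threads to hold their first lock before either tries the second, and a timeout replaces the indefinite wait so the script terminates instead of hanging. The `worker` function and the `outcome` dictionary are illustrative names.

```python
import threading

lock_a = threading.Lock()
lock_b = threading.Lock()
sync = threading.Barrier(2)   # used to force the unlucky interleaving
outcome = {}

def worker(name, first, second):
    with first:
        sync.wait()                        # both threads now hold their first lock
        got = second.acquire(timeout=0.5)  # a plain acquire() would wait forever
        outcome[name] = got
        sync.wait()                        # keep holding `first` until both attempts are over
        if got:
            second.release()

# Thread 1 takes lock_a then lock_b; thread 2 takes them in the opposite order.
t1 = threading.Thread(target=worker, args=("thread 1", lock_a, lock_b))
t2 = threading.Thread(target=worker, args=("thread 2", lock_b, lock_a))
t1.start()
t2.start()
t1.join()
t2.join()

print(outcome)
```

Both attempts on the second lock time out, because each thread holds the lock the other one needs. Without the timeout, neither thread would ever proceed.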

There are several common solutions:

  • Use only one common global lock in all your code. That way, two threads can never be waiting on two different locks. The downside is that your code may wait a long time for the lock to become available.
  • Only ever let your code hold one lock at a time. This is usually too difficult to enforce, since you may not know or control the behavior of the methods you call.
  • Only allow your code to acquire several locks all at once and release them all at once, and do not allow it to acquire new locks while it already holds some.
  • Make sure all locks are acquired in the same global order. This is the more common method.

With solution 4, you need to program carefully and always make sure that you acquire locks / critical sections in the same order. To help with debugging, you can impose a global order on all locks in your system (for example, a unique integer for each lock) and then throw an error if a thread tries to acquire a lock with a lower rank than one it already holds (for example, if new_lock.id < lock_already_acquired.id then throw exception).

If you cannot add such a global debugging aid to find which locks are acquired out of order, I would suggest that you find all the places in your code where a lock is acquired and print a debug message with the current time, the method doing the acquire, the thread id, and the id of the lock being acquired. Do the same for all release calls. Then run your system until it deadlocks and look in your log file to see which locks were acquired by which threads and in what order. Then decide which thread is acquiring its locks in the wrong order and change it.
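The logging approach could look roughly like the sketch below, again in Python. `LoggedLock`, the in-memory `log` list, and the label "JobManager.CS" are all assumptions made for illustration; a real system would more likely write to a file. The entry is recorded before the blocking call as well as after it, so that in a deadlock the last "acquiring" line without a matching "acquired" shows which lock each thread is stuck on.

```python
import inspect
import threading
import time

log = []   # (timestamp, event, calling function, thread id, lock id)

class LoggedLock:
    def __init__(self, lock_id):
        self.lock_id = lock_id
        self._lock = threading.RLock()

    def _record(self, event):
        # Frame 0 is _record, frame 1 is acquire/release, frame 2 is the caller.
        caller = inspect.stack()[2].function
        log.append((time.time(), event, caller,
                    threading.get_ident(), self.lock_id))

    def acquire(self):
        self._record("acquiring")   # logged before we may block
        self._lock.acquire()
        self._record("acquired")

    def release(self):
        self._record("releasing")
        self._lock.release()

cs = LoggedLock("JobManager.CS")    # hypothetical label for the lock

def new_job_item():
    cs.acquire()
    try:
        pass                        # the protected work would go here
    finally:
        cs.release()

new_job_item()
for entry in log:
    print(entry)
```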


Source: https://habr.com/ru/post/887783/

