This turned out to be a bug in which LeaveCriticalSection was called without a matching EnterCriticalSection. The unmatched call decremented both LockCount and RecursionCount, leaving the critical section in the following state (for an unlocked critical section, LockCount is normally -1 and RecursionCount is 0):
    0:016> dt _RTL_CRITICAL_SECTION 1092318
    _RTL_CRITICAL_SECTION
       +0x000 DebugInfo        : 0x....... _RTL_CRITICAL_SECTION_DEBUG
       +0x004 LockCount        : -2
       +0x008 RecursionCount   : -1
       +0x00c OwningThread     : (null)
       +0x010 LockSemaphore    : 0x.......
       +0x014 SpinCount        : 0
When the next EnterCriticalSection executed, it hung because RecursionCount was nonzero: a thread can only take ownership of the critical section if RecursionCount is 0. The call did increment LockCount, though (bringing it back to -1, the value seen in my original question), just to confuse matters.
In general, if you see a thread blocked on a critical section whose LockCount and RecursionCount are both -1, it means the critical section has been left more times than it was entered.
As for the code calling it:
    if (SysStringLen(bstrState) > 0)
        CHECKHR_CS( m_pStateManager->SetState(bstrState), &m_csStateManagerLock );
And the definition of the error checking macro:
    #define CHECKHR_CS(x, cs) \
        EnterCriticalSection(cs); \
        if( FAILED(hr = (x)) ) { \
            LeaveCriticalSection(cs); \
            return hr; \
        } \
        LeaveCriticalSection(cs);
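The macro lacks curly braces around its contents, so the calling if statement guards only the first statement of the expansion, EnterCriticalSection. When the condition is false, EnterCriticalSection is skipped, but the rest of the macro, including the SetState call and the final LeaveCriticalSection, still runs. Obviously a problem.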
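One common way to make a multi-statement macro safe after an unbraced if is to wrap its body in do { ... } while (0). The sketch below is an illustration, not the original project's code. It assumes hypothetical stand-ins (FakeCs, FakeEnter, FakeLeave, HRESULT_T, FAILED_T, set_state) in place of the Win32 types and calls so that it compiles anywhere. The stand-ins only count how many more times the lock was entered than left.

```c
#include <stdio.h>

/* Hypothetical stand-ins for CRITICAL_SECTION, EnterCriticalSection and
   LeaveCriticalSection. balance = number of enters minus number of leaves. */
typedef struct { int balance; } FakeCs;
static void FakeEnter(FakeCs *cs) { cs->balance++; }
static void FakeLeave(FakeCs *cs) { cs->balance--; }

/* Hypothetical stand-ins for HRESULT and FAILED. */
typedef long HRESULT_T;
#define FAILED_T(hr) ((hr) < 0)

/* Same shape as the original macro: expands to several statements. */
#define CHECKHR_CS_BUGGY(x, cs) \
    FakeEnter(cs); \
    if (FAILED_T(hr = (x))) { \
        FakeLeave(cs); \
        return hr; \
    } \
    FakeLeave(cs);

/* Wrapped in do { ... } while (0): expands to exactly one statement. */
#define CHECKHR_CS_FIXED(x, cs) \
    do { \
        FakeEnter(cs); \
        if (FAILED_T(hr = (x))) { \
            FakeLeave(cs); \
            return hr; \
        } \
        FakeLeave(cs); \
    } while (0)

/* Placeholder for m_pStateManager->SetState(...); always succeeds. */
static HRESULT_T set_state(void) { return 0; }

/* Mirrors the calling code: an unbraced if followed by the macro. */
static HRESULT_T call_buggy(FakeCs *cs, int have_state)
{
    HRESULT_T hr = 0;
    if (have_state)
        CHECKHR_CS_BUGGY(set_state(), cs);
    return hr;
}

static HRESULT_T call_fixed(FakeCs *cs, int have_state)
{
    HRESULT_T hr = 0;
    if (have_state)
        CHECKHR_CS_FIXED(set_state(), cs);
    return hr;
}

/* Prints the enter/leave balance after each variant runs with a false condition. */
static void print_balances(void)
{
    FakeCs buggy = {0}, fixed = {0};
    call_buggy(&buggy, 0);
    call_fixed(&fixed, 0);
    printf("buggy balance: %d, fixed balance: %d\n", buggy.balance, fixed.balance);
}
```

With a false condition, the buggy variant leaves the lock once without entering it, while the fixed variant does nothing at all. Putting braces around the body of the calling if would also avoid the problem, but fixing the macro protects every call site.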