Let's say ten people have to share a single pen (maybe they work at a company that's really strapped for cash). Since they have to write long documents with that pen, but most of the work of writing a document is just thinking about what to say, they agree that each person may use the pen to write one sentence of the document and must then make it available to the rest of the group.
Now we have a problem: what if two people finish thinking about their next sentence and both want to use the pen right away? We could just say that both people can grab the pen, but it's a fragile old pen, so if two people grab it, it will break. Instead, we draw a chalk line around the pen. First you put your hand over the chalk line, then you take the pen. If one person's hand is inside the chalk line, nobody else may put a hand inside the chalk line. If two people try to put their hands over the chalk line at the same time, by these rules only one of them will get inside the chalk line first, so the other has to pull back and keep their hand outside the chalk line until the pen is available again.
Now let's map this to mutexes. A mutex is a way to protect a shared resource (the pen) for a short period of time called a critical section (the time it takes to write one sentence of the document). Whenever you want to use the resource, you agree to call mutex_lock first (put your hand over the chalk line). When you are done with the resource, you agree to call mutex_unlock (take your hand back out of the chalk area).
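To make the convention concrete, here is a minimal sketch of that lock/unlock pattern in C, using the POSIX pthread mutex API in place of the generic mutex_lock/mutex_unlock names above; the ten writer threads and the sentence counter are just illustrative choices, not from the original example.

    #include <pthread.h>
    #include <stdio.h>

    static pthread_mutex_t pen = PTHREAD_MUTEX_INITIALIZER;  /* the shared pen */
    static int sentences_written = 0;                         /* the shared document */

    static void *writer(void *arg)
    {
        (void)arg;
        pthread_mutex_lock(&pen);      /* put your hand over the chalk line */
        sentences_written++;           /* critical section: write one sentence */
        pthread_mutex_unlock(&pen);    /* take your hand back out */
        return NULL;
    }

    int main(void)
    {
        pthread_t threads[10];
        for (int i = 0; i < 10; i++)
            pthread_create(&threads[i], NULL, writer, NULL);
        for (int i = 0; i < 10; i++)
            pthread_join(threads[i], NULL);
        printf("sentences written: %d\n", sentences_written);  /* always 10 */
        return 0;
    }

Without the mutex, two threads could increment sentences_written at the same time (grab the pen at once) and lose an update; with it, only one thread is ever inside the critical section.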
Now, how are mutexes implemented? A mutex is usually implemented with shared memory. There is a shared opaque data object called a mutex, and mutex_lock and mutex_unlock both take a pointer to one of these. The mutex_lock function checks and modifies the data inside the mutex using an atomic test-and-set or load-linked/store-conditional sequence of instructions (on x86, xchg is often used), and either "acquires the mutex" - setting the contents of the mutex object so that other threads can see the critical section is busy - or has to wait. Eventually the thread acquires the mutex, does the work inside the critical section, and calls mutex_unlock. That function sets the data inside the mutex to mark it as available and possibly wakes up any sleeping threads that were trying to acquire the mutex (this depends on the mutex implementation - some mutex_lock implementations simply spin in a tight loop on xchg until the mutex is available, so there is no need for mutex_unlock to notify anyone).
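As an illustration of the spinning variant mentioned above, here is a minimal sketch of a test-and-set spinlock built on C11 atomics; atomic_exchange typically compiles down to a lock xchg on x86. This is only a sketch of the idea, not a production mutex - a real implementation would add sleeping (e.g. via futexes), fairness, and so on.

    #include <stdatomic.h>
    #include <stdbool.h>

    typedef struct {
        atomic_bool locked;   /* true while some thread holds the lock */
    } spin_mutex;

    static void spin_mutex_init(spin_mutex *m)
    {
        atomic_init(&m->locked, false);
    }

    static void spin_mutex_lock(spin_mutex *m)
    {
        /* atomic_exchange returns the previous value: if it was false we have
           just acquired the lock; if it was true, someone else holds it, so spin. */
        while (atomic_exchange_explicit(&m->locked, true, memory_order_acquire))
            ;  /* busy-wait until the holder releases the lock */
    }

    static void spin_mutex_unlock(spin_mutex *m)
    {
        /* mark the mutex as available; nobody needs to be woken up,
           because waiters are spinning rather than sleeping */
        atomic_store_explicit(&m->locked, false, memory_order_release);
    }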
Why is locking a mutex faster than going to memory? In short: caching. The processor has a cache it can access very quickly, so an xchg operation does not need to go all the way to main memory as long as the processor can be sure no other processor is accessing that data. But x86 has a notion of "owning" a cache line - if processor 0 owns a cache line, any other processor that wants to use data in that cache line has to go through processor 0. So the xchg never needs to touch any data beyond the cache, and cache accesses tend to be very fast, which is why acquiring an uncontended mutex is faster than a memory access.
There is one caveat to that last paragraph: the speed advantage only holds for an uncontended mutex lock. If two threads try to lock the same mutex at the same time, the processors running those threads have to communicate and negotiate ownership of the relevant cache line, which slows down the acquisition of the mutex considerably. Also, one of the two threads has to wait for the other to run the code in the critical section and then release the mutex, which slows down the acquisition even further for one of the threads.
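If you want to see the contended vs. uncontended difference for yourself, a rough micro-benchmark along these lines usually shows it clearly; the iteration count, timing method, and two-thread setup are arbitrary choices for illustration, and the absolute numbers will vary by machine.

    #include <pthread.h>
    #include <stdio.h>
    #include <time.h>

    #define ITERS 1000000L

    static pthread_mutex_t m = PTHREAD_MUTEX_INITIALIZER;

    /* lock and unlock the mutex ITERS times */
    static void *hammer(void *arg)
    {
        (void)arg;
        for (long i = 0; i < ITERS; i++) {
            pthread_mutex_lock(&m);
            pthread_mutex_unlock(&m);
        }
        return NULL;
    }

    static double elapsed_sec(struct timespec a, struct timespec b)
    {
        return (b.tv_sec - a.tv_sec) + (b.tv_nsec - a.tv_nsec) / 1e9;
    }

    int main(void)
    {
        struct timespec t0, t1;

        /* uncontended: one thread takes and releases the lock repeatedly */
        clock_gettime(CLOCK_MONOTONIC, &t0);
        hammer(NULL);
        clock_gettime(CLOCK_MONOTONIC, &t1);
        printf("uncontended: %.3f s\n", elapsed_sec(t0, t1));

        /* contended: two threads fight over the same lock (and cache line) */
        pthread_t a, b;
        clock_gettime(CLOCK_MONOTONIC, &t0);
        pthread_create(&a, NULL, hammer, NULL);
        pthread_create(&b, NULL, hammer, NULL);
        pthread_join(a, NULL);
        pthread_join(b, NULL);
        clock_gettime(CLOCK_MONOTONIC, &t1);
        printf("contended:   %.3f s\n", elapsed_sec(t0, t1));

        return 0;
    }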