Why does sleep(), after acquiring pthread_mutex_lock, block the entire program?

Question

In my test program, I run two threads, each of which performs the following logic:

1) pthread_mutex_lock()
2) sleep(1)
3) pthread_mutex_unlock()

However, I found that after some time one of the two threads blocks on pthread_mutex_lock() forever, while the other thread keeps working fine. This is very strange behavior, and I think it could be a serious problem. The Linux manual pages for pthread_mutex_t do not prohibit calling sleep() while holding a mutex. So my question is: is this a real problem, or is there an error in my code?

The following is a test program. In the code, the output of the first thread goes to stdout and the output of the second thread goes to stderr, so we can check the two outputs separately to see whether a thread is blocked.

I tested it on Linux kernels 2.6.31 and 2.6.9. The results are the same on both.

    //======================= Test Program ===========================
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>   /* strcmp() */
    #include <unistd.h>   /* sleep() */
    #include <errno.h>
    #include <pthread.h>

    #define THREAD_NUM 2

    static int data[THREAD_NUM];
    static int sleepFlag = 1;
    static pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;

    static void * threadFunc(void *arg)
    {
        int* idx = (int*) arg;
        FILE* fd = NULL;

        /* Thread 0 logs to stdout, thread 1 logs to stderr. */
        if (*idx == 0)
            fd = stdout;
        else
            fd = stderr;

        while(1) {
            fprintf(fd, "\n[%d]Before pthread_mutex_lock is called\n", *idx);
            if (pthread_mutex_lock(&mutex) != 0) {
                exit(1);
            }
            fprintf(fd, "[%d]pthread_mutex_lock is finisheded. Sleep some time\n", *idx);

            /* The sleep happens while the mutex is held. */
            if (sleepFlag == 1)
                sleep(1);
            fprintf(fd, "[%d]sleep done\n\n", *idx);

            fprintf(fd, "[%d]Before pthread_mutex_unlock is called\n", *idx);
            if (pthread_mutex_unlock(&mutex) != 0) {
                exit(1);
            }
            fprintf(fd, "[%d]pthread_mutex_unlock is finisheded.\n", *idx);
        }
    }

    // 1. compile
    //    gcc -o pthread pthread.c -lpthread
    // 2. run
    //    1) ./pthread sleep 2> /tmp/error.log   # Each thread will sleep 1 second after it acquires pthread_mutex_lock
    //       ==> We can find that /tmp/error.log will not increase.
    //    or
    //    2) ./pthread nosleep 2> /tmp/error.log # No sleep is done when each thread acquires pthread_mutex_lock
    //       ==> We can find that both stdout and /tmp/error.log increase.
    int main(int argc, char *argv[])
    {
        if ((argc == 2) && (strcmp(argv[1], "nosleep") == 0)) {
            sleepFlag = 0;
        }

        pthread_t t[THREAD_NUM];
        int i;
        for (i = 0; i < THREAD_NUM; i++) {
            data[i] = i;
            int ret = pthread_create(&t[i], NULL, threadFunc, &data[i]);
            if (ret != 0) {
                perror("pthread_create error\n");
                exit(-1);
            }
        }

        for (i = 0; i < THREAD_NUM; i++) {
            int ret = pthread_join(t[i], (void*)0);
            if (ret != 0) {
                perror("pthread_join error\n");
                exit(-1);
            }
        }

        exit(0);
    }

This is the output:

On the terminal where the program is running:

    root@skyscribe:~# ./pthread sleep 2> /tmp/error.log
    [0]Before pthread_mutex_lock is called
    [0]pthread_mutex_lock is finisheded. Sleep some time
    [0]sleep done
    [0]Before pthread_mutex_unlock is called
    [0]pthread_mutex_unlock is finisheded.
    ...

On another terminal, watching the /tmp/error.log file:

    root@skyscribe:~# tail -f /tmp/error.log
    [1]Before pthread_mutex_lock is called

After that, no new lines appear in /tmp/error.log.

linux unix pthreads

user1040933 Nov 11 '11 at 7:55

2 answers

jilles · Answer 1 · 2011-11-11T12:49:56+0000

This is the wrong way to use mutexes. A thread should not hold a mutex for more time than it spends not holding it, and in particular it should not sleep while holding the mutex. There is no FIFO guarantee for locking a mutex (for efficiency reasons).
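As a rough illustration of that advice, the sketch below holds the mutex only for the shared update and does the slow part (a short nanosleep standing in for sleep(1)) after unlocking. The names counter_lock, shared_counter, worker and run_short_lock_demo are hypothetical and not part of the question's program.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <time.h>

#define WORKERS 2
#define ROUNDS  3

static pthread_mutex_t counter_lock = PTHREAD_MUTEX_INITIALIZER;
static int shared_counter = 0;          /* protected by counter_lock */
static int per_thread_rounds[WORKERS];  /* each slot is written by one thread only */

static void *worker(void *arg)
{
    int id = *(int *)arg;
    struct timespec pause = { 0, 10 * 1000 * 1000 };  /* 10 ms */

    for (int i = 0; i < ROUNDS; i++) {
        /* Hold the mutex only for the shared update. */
        int rc = pthread_mutex_lock(&counter_lock);
        assert(rc == 0);
        shared_counter++;
        rc = pthread_mutex_unlock(&counter_lock);
        assert(rc == 0);

        per_thread_rounds[id]++;

        /* The slow part runs with the mutex released, so the other
           thread is free to take the lock in the meantime. */
        nanosleep(&pause, NULL);
    }
    return NULL;
}

/* Runs both workers to completion and returns the final counter value. */
static int run_short_lock_demo(void)
{
    pthread_t t[WORKERS];
    int ids[WORKERS];

    shared_counter = 0;
    for (int i = 0; i < WORKERS; i++) {
        ids[i] = i;
        per_thread_rounds[i] = 0;
        int rc = pthread_create(&t[i], NULL, worker, &ids[i]);
        assert(rc == 0);
    }
    for (int i = 0; i < WORKERS; i++) {
        int rc = pthread_join(t[i], NULL);
        assert(rc == 0);
    }
    return shared_counter;
}
```

Calling run_short_lock_demo() from main should let both workers finish all their rounds, because neither of them sleeps while owning the lock.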

More specifically, if thread 1 unlocks the mutex while thread 2 is waiting for it, this makes thread 2 runnable, but it does not force the scheduler to preempt thread 1 or to run thread 2 immediately. Most likely it will not, because thread 1 has recently slept. When thread 1 subsequently reaches the pthread_mutex_lock() call, it will generally be allowed to lock the mutex immediately, even though there is a thread waiting (and the implementation can know that). When thread 2 wakes up after that, it will find the mutex already locked and go back to sleep.

The best solution is not to hold a mutex for that long. If that is not possible, consider moving the operations that need the lock into a single thread (removing the need for the lock) or waking up the correct thread using condition variables.
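One possible reading of "waking up the correct thread using condition variables" is explicit turn-taking, sketched below under that assumption. A shared current_turn variable says which thread may proceed, and each thread hands the turn over before unlocking, so a thread cannot re-acquire the mutex ahead of the waiter. All names here (turn_lock, turn_changed, current_turn, turn_log, run_turn_demo) are hypothetical.

```c
#include <assert.h>
#include <pthread.h>

#define TURN_THREADS 2
#define TURN_ROUNDS  4

static pthread_mutex_t turn_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  turn_changed = PTHREAD_COND_INITIALIZER;
static int current_turn = 0;                      /* id of the thread allowed to run */
static int turn_log[TURN_THREADS * TURN_ROUNDS];  /* order in which threads ran */
static int turn_log_len = 0;

static void *turn_worker(void *arg)
{
    int id = *(int *)arg;

    for (int i = 0; i < TURN_ROUNDS; i++) {
        int rc = pthread_mutex_lock(&turn_lock);
        assert(rc == 0);

        /* Wait until it is this thread's turn. pthread_cond_wait releases
           the mutex while waiting; the loop guards against spurious wakeups. */
        while (current_turn != id)
            pthread_cond_wait(&turn_changed, &turn_lock);

        turn_log[turn_log_len++] = id;   /* the "work" done during this turn */

        /* Hand the turn to the next thread and wake the waiters. */
        current_turn = (id + 1) % TURN_THREADS;
        pthread_cond_broadcast(&turn_changed);

        rc = pthread_mutex_unlock(&turn_lock);
        assert(rc == 0);
    }
    return NULL;
}

/* Runs all workers to completion and returns how many turns were logged. */
static int run_turn_demo(void)
{
    pthread_t t[TURN_THREADS];
    int ids[TURN_THREADS];

    current_turn = 0;
    turn_log_len = 0;
    for (int i = 0; i < TURN_THREADS; i++) {
        ids[i] = i;
        int rc = pthread_create(&t[i], NULL, turn_worker, &ids[i]);
        assert(rc == 0);
    }
    for (int i = 0; i < TURN_THREADS; i++) {
        int rc = pthread_join(t[i], NULL);
        assert(rc == 0);
    }
    return turn_log_len;
}
```

Strict alternation is only one policy. It trades throughput for fairness, so it is worth using only when every thread really must make progress in order.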

chill · Answer 2 · 2011-11-11T19:34:16+0000

There is neither a problem nor a bug in your code; what you see is a combination of buffering and scheduling effects. Add an fflush here:

    fprintf (fd, "[%d]pthread_mutex_unlock is finisheded.\n", *idx);
    fflush (fd);

and run

    ./a.out > 1.log 2> 2.log &

and you will see fairly equal progress made by the two threads.

EDIT: as @jilles said above, a mutex is supposed to be a short-wait lock, as opposed to long waits such as waiting on a condition variable, waiting for I/O, or sleeping. That is also why locking a mutex is not a cancellation point.
