How to get notified when a thread was interrupted for some error

Question

I am working on a program with a fixed number of threads in C using posix threads.

How can I get a notification when a thread has been interrupted due to some error?

Is there a signal to detect it?

If so, can the signal handler create a new thread to keep the number of threads the same?

+6

c multithreading posix signals

Codered May 04 '12 at 9:23

6 answers

Ed Heal · Answer 1 · 2012-05-04T09:28:11+0000

Detach the threads.
Get them to handle errors gracefully, i.e. release mutexes, close files, etc.

Then you will not have any problems.

Perhaps fire a USR1 signal at the main thread to tell it that things have gone pear-shaped (I was going to say something ruder!)

Mahmoud Al-Qudsi · Answer 2 · 2012-05-04T09:41:17+0000

Create your threads by passing the function pointers to an intermediate function. Start that intermediate function asynchronously and have it call the passed function synchronously. When the function returns or throws an exception, you can handle the result any way you like.
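
A minimal sketch of such an intermediate function in C, under a few assumptions: the names struct task, trampoline, spawn_wrapped, wait_for_finished and sample_worker are hypothetical, and the "handling" is just a counter guarded by a mutex and condition variable. C has no exceptions, so the wrapper only sees a normal return.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical bookkeeping: finished threads not yet seen by the manager. */
static pthread_mutex_t done_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t done_cond = PTHREAD_COND_INITIALIZER;
static int finished_count = 0;

/* The function the caller really wants to run, bundled for the wrapper. */
struct task {
    void *(*fn)(void *);
    void *arg;
};

/* Intermediate function: runs in the new thread, calls the real function
 * synchronously, then reports that it has returned. */
void *trampoline(void *p)
{
    assert(p != NULL);
    struct task t = *(struct task *)p;
    free(p);

    void *result = t.fn(t.arg);

    pthread_mutex_lock(&done_lock);
    finished_count++;
    pthread_cond_signal(&done_cond);
    pthread_mutex_unlock(&done_lock);
    return result;
}

/* Start fn(arg) through the wrapper. Returns 0 or an error number. */
int spawn_wrapped(pthread_t *tid, void *(*fn)(void *), void *arg)
{
    struct task *t = malloc(sizeof *t);
    if (t == NULL)
        return ENOMEM;
    t->fn = fn;
    t->arg = arg;

    int rc = pthread_create(tid, NULL, trampoline, t);
    if (rc != 0)
        free(t);
    return rc;
}

/* Block until a wrapped thread has finished, then consume that notice. */
void wait_for_finished(void)
{
    pthread_mutex_lock(&done_lock);
    while (finished_count == 0)
        pthread_cond_wait(&done_cond, &done_lock);
    finished_count--;
    pthread_mutex_unlock(&done_lock);
}

/* Hypothetical worker used only to exercise the wrapper. */
void *sample_worker(void *arg)
{
    return arg;
}
```

A manager thread could loop on wait_for_finished() and call spawn_wrapped() again each time it wakes up, which keeps the thread count constant. Note that the report is skipped if the wrapped function calls pthread_exit() or the thread is cancelled, because trampoline() never gets control back in those cases.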

Pavan Manjunath · Answer 3 · 2012-05-04T10:53:54+0000

With the latest information you have provided, I suggest you do something like this to get the number of threads that a particular process has started:

    #include <stdio.h>

    #define THRESHOLD 50

    int main(void)
    {
        unsigned count = 0;
        FILE *a;

        /* List every thread of the process named a.out and count the lines. */
        a = popen("ps H `ps -A | grep a.out | awk '{print $1}'` | wc -l", "r");
        if (a == NULL) {
            printf("Error in executing command\n");
            return 1;
        }
        if (fscanf(a, "%u", &count) != 1) {
            printf("Error in reading thread count\n");
            pclose(a);
            return 1;
        }
        pclose(a);

        if (count < THRESHOLD) {
            /* count - 1 in order to eliminate the header. */
            /* count - 2 if you don't want to include the main thread. */
            printf("Number of threads = %u\n", count - 1);
            /* Take action. Maybe start a new thread etc. */
        }
        return 0;
    }

Notes :

ps H displays all threads.
$1 prints the first column, which is where the PID is displayed on my Ubuntu system. The column number may vary by system.
Replace a.out with your process name.
The backticks evaluate the expression inside them and give you the PID of your process. We rely on the fact that all POSIX threads of a process share the same PID.

Shahbaz · Answer 4 · 2012-05-04T11:40:19+0000

I doubt that Linux will signal you when a thread dies or exits for some reason. You can do it manually though.

First, consider two ways a thread can end:

It terminates itself
It dies

In the first method, the thread itself can tell someone (say, the thread manager) that it is ending. Then the thread manager will create another thread.

In the second method, the watchdog thread can monitor if the threads are alive. This is done like this:

    Thread:
        while (do stuff)
            this_thread->is_alive = true
            work

    Watchdog:
        for all threads t
            t->timeout = 0
        while (true)
            for all threads t
                if t->is_alive
                    t->timeout = 0
                    t->is_alive = false
                else
                    ++t->timeout
                    if t->timeout > THRESHOLD
                        Thread has died! Tell the thread manager to respawn it
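
The pseudocode above could be sketched in C roughly as follows. This is only an illustration under stated assumptions: struct monitored, heartbeat and watchdog_pass are hypothetical names, the flag is a C11 atomic so that worker and watchdog can touch it without a mutex, and one call to watchdog_pass() stands for one iteration of the watchdog's outer loop.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Missed checks tolerated before a thread is declared dead. */
#define THRESHOLD 3

struct monitored {
    atomic_bool is_alive;   /* set by the worker, cleared by the watchdog */
    unsigned timeout;       /* consecutive checks without a heartbeat */
};

/* Worker side: call once per iteration of the work loop. */
void heartbeat(struct monitored *t)
{
    assert(t != NULL);
    atomic_store(&t->is_alive, true);
}

/* Watchdog side: one pass over all threads. Returns how many of them
 * have gone more than THRESHOLD passes without a heartbeat. */
size_t watchdog_pass(struct monitored *threads, size_t n)
{
    size_t dead = 0;

    for (size_t i = 0; i < n; i++) {
        /* Read and clear the flag in a single atomic step. */
        if (atomic_exchange(&threads[i].is_alive, false))
            threads[i].timeout = 0;
        else if (++threads[i].timeout > THRESHOLD)
            dead++;     /* here the thread manager would respawn it */
    }
    return dead;
}
```

The watchdog thread would call watchdog_pass() in a loop with a sleep between passes; the sleep interval multiplied by THRESHOLD has to be longer than the slowest legitimate work iteration, otherwise a busy thread is mistaken for a dead one. A thread stays counted as dead on every later pass until the caller resets its timeout field, for example when it is respawned.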

alk · Answer 5 · 2012-05-04T15:10:11+0000

If for some reason you cannot go with Ed Heal's "just work correctly" approach (which is my favorite answer to the OP's question, by the way), the lazy fox might take a look at the pthread_cleanup_push() and pthread_cleanup_pop() macros and think about putting the whole body of the thread function between those two macros.
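
A minimal sketch of that idea, with hypothetical names (exited_threads, note_exit, guarded_worker) and a plain counter standing in for whatever notification the manager needs:

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>

/* Hypothetical counter of threads that have ended, for whatever reason. */
atomic_int exited_threads;

/* Cleanup handler: runs when the thread is cancelled, calls
 * pthread_exit(), or reaches pthread_cleanup_pop() with a non-zero
 * argument. */
void note_exit(void *arg)
{
    atomic_int *counter = arg;
    assert(counter != NULL);
    atomic_fetch_add(counter, 1);
}

/* Hypothetical worker whose whole body sits between push and pop.
 * A non-NULL argument simulates an early exit caused by an error. */
void *guarded_worker(void *arg)
{
    pthread_cleanup_push(note_exit, &exited_threads);

    if (arg != NULL)
        pthread_exit(NULL);     /* the handler still runs */

    /* ... normal work would go here ... */

    pthread_cleanup_pop(1);     /* 1 = run the handler on normal exit too */
    return NULL;
}
```

The two macros must appear as a pair in the same block of the same function, and the code between them must not leave with a plain return. The handler covers cancellation and pthread_exit(), but not a crash such as a segmentation fault, which normally takes down the whole process and not just the one thread.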

Alexis Wilke · Answer 6 · 2016-05-10T00:20:03+0000

The clean way to find out whether a thread is done is to call pthread_join() against that thread.

    // int pthread_join(pthread_t thread, void **retval);
    void *retval = NULL;
    int r = pthread_join(that_thread_id, &retval);
    ... here you know that_thread_id returned ...

The problem with pthread_join() is that if the thread never returns (continues to work as expected), then you are blocked. Therefore, this is not very useful in your case.

However, you can check if you can join (tryjoin) as follows:

    // int pthread_tryjoin_np(pthread_t thread, void **retval);
    void *retval = NULL;
    int r = pthread_tryjoin_np(that_thread_id, &retval);

    // here 'r' tells you whether the thread returned (joined) or not.
    if (r == 0)
    {
        // that_thread_id is done, create new thread here
        ...
    }
    else if (r != EBUSY)
    {
        // react to "weird" errors... (pthread functions return the error
        // number instead of setting errno, so report strerror(r) at least)
    }
    // else -- thread is still running

There is also a timed join that will wait for a given amount of time, such as a few seconds. Depending on the number of threads to check, and if your main process is otherwise just sitting around, this may be a solution. Block on thread 1 for 5 seconds, then on thread 2 for 5 seconds, etc., which would be 5,000 seconds per loop for 1,000 threads (about 85 minutes to go around all the threads, counting the time it takes to manage things...)

The manual page has sample code that shows how to use the pthread_timedjoin_np() function. All you have to do is put it in a for loop to check each of your threads.

    struct timespec ts;
    int s;

    ...

    if (clock_gettime(CLOCK_REALTIME, &ts) == -1) {
        /* Handle error */
    }

    ts.tv_sec += 5;

    s = pthread_timedjoin_np(thread, NULL, &ts);
    if (s != 0) {
        /* Handle error */
    }

If your main process has other things to do, I would suggest that you don't use the timed version and just go through all the threads as fast as you can.
