Try a tool that traces system calls, such as strace on Linux or tusc on HP-UX. When a deadlock occurs, you should see the process hanging in a blocking call. This is not conclusive evidence, since it may be a regular block. You then need to determine whether the process can legitimately block there for a while. That requires knowing which resource the process is waiting for.
Example
There is ... a feature in RHEL4 ... that can lead to a deadlock in ctime. The example program below demonstrates this behavior:
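#include <sys/time.h>
#include <time.h>
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <string.h>

volatile char *r;

/* Deliberately unsafe: ctime() is not async-signal-safe, so calling it
   from a signal handler is what provokes the hang shown here. */
void handler(int sig)
{
    time_t t;

    (void)sig;
    time(&t);
    r = ctime(&t);
}

int main(void)
{
    struct itimerval it;
    struct sigaction sa;
    time_t t;
    int counter = 0;

    /* Install the SIGALRM handler. */
    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = handler;
    sigaction(SIGALRM, &sa, NULL);

    /* Deliver SIGALRM every 1000 microseconds. */
    it.it_value.tv_sec = 0;
    it.it_value.tv_usec = 1000;
    it.it_interval.tv_sec = 0;
    it.it_interval.tv_usec = 1000;
    setitimer(ITIMER_REAL, &it, NULL);

    /* Call ctime() in a tight loop so the signal eventually
       interrupts one of these calls. */
    while (1) {
        counter++;
        time(&t);
        r = ctime(&t);
        printf("Loop %d\n", counter);
    }
    return 0;
}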
This usually hangs after a couple of thousand iterations. Now attach strace like so:
strace -s4096 -p<PID>
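Here PID is the process ID of the hung program. In the output, the process is blocked in a futex call with FUTEX_WAIT. (This was observed on RHEL4.)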
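For comparison, here is a minimal sketch of how the same timer could be handled without calling ctime from the signal handler. It assumes the hang comes from ctime being interrupted by a handler that calls ctime again, which is the usual explanation for non-async-signal-safe functions but is not confirmed above. The names on_alarm, tick_pending and run_ticks are hypothetical and only serve the illustration. The handler just sets a flag, and the time formatting happens in the normal program flow using ctime_r with a caller-supplied buffer.

#define _XOPEN_SOURCE 700
#include <assert.h>
#include <signal.h>
#include <string.h>
#include <sys/time.h>
#include <time.h>

/* Flag written by the handler and read by the main loop. */
static volatile sig_atomic_t tick_pending = 0;

/* The handler only sets the flag; it calls no library functions. */
static void on_alarm(int sig)
{
    (void)sig;
    tick_pending = 1;
}

/* Hypothetical helper: handles `wanted` timer ticks outside the
   handler and returns how many were handled, or -1 on setup failure. */
int run_ticks(int wanted)
{
    struct sigaction sa;
    struct itimerval it;
    char buf[32];          /* ctime_r needs at least 26 bytes */
    int handled = 0;

    assert(wanted >= 0);

    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = on_alarm;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGALRM, &sa, NULL) != 0)
        return -1;

    /* Same 1000 microsecond interval as the example program. */
    memset(&it, 0, sizeof(it));
    it.it_value.tv_usec = 1000;
    it.it_interval.tv_usec = 1000;
    if (setitimer(ITIMER_REAL, &it, NULL) != 0)
        return -1;

    /* Busy loop, mirroring the original program's tight loop. */
    while (handled < wanted) {
        if (tick_pending) {
            time_t t = time(NULL);

            tick_pending = 0;
            ctime_r(&t, buf);   /* runs in normal context, not in the handler */
            handled++;
        }
    }

    /* Disarm the timer before returning. */
    memset(&it, 0, sizeof(it));
    setitimer(ITIMER_REAL, &it, NULL);
    return handled;
}

Calling run_ticks(1000) from main should return 1000 instead of hanging, because the handler no longer calls ctime while the main flow may be inside it.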