Trying to build a plausible answer from the comments, so this is Community Wiki from the start. (If Oli provides an answer, up-vote that instead of this one!)
Oli Charlesworth gave what is probably the core of the problem:
- I suspect that you have created a race condition in the opposite direction to what you expected. The child sent SIGUSR1 to the parent before the parent reached pause().
ouah clearly stated:
- An object shared between a signal handler and non-handler code (your boolean objects) must be of type volatile sig_atomic_t, otherwise the behavior is undefined.
That said, POSIX is somewhat more lenient than standard C about what can be done inside a signal handler. It is also worth noting that C99 provides <stdbool.h>, which defines the bool type.
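As an illustration of ouah's point, here is a minimal sketch of a handler that touches nothing but a volatile sig_atomic_t flag. The flag name follows the siguser1setted variable from the question; the helper names install_sigusr1_handler() and sigusr1_received() are hypothetical and were chosen only for this example.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stdbool.h>
#include <stddef.h>

/* Flag written by the handler and read by ordinary code. */
static volatile sig_atomic_t siguser1setted = 0;

/* Handler: does nothing except store to the volatile sig_atomic_t flag. */
static void on_sigusr1(int signo)
{
    (void)signo;
    siguser1setted = 1;
}

/* Hypothetical helper: install the handler; 0 on success, -1 on error. */
int install_sigusr1_handler(void)
{
    struct sigaction sa;
    sa.sa_handler = on_sigusr1;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    return sigaction(SIGUSR1, &sa, NULL);
}

/* Hypothetical helper: expose the flag as a C99 bool to the caller. */
bool sigusr1_received(void)
{
    return siguser1setted != 0;
}
```

The flag stays a sig_atomic_t internally; bool from <stdbool.h> is used only for the value handed back to non-handler code.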
The original poster commented:
I don’t see how I can make sure that the parent enters the pause() call first without using sleep() in the child (which does not guarantee anything). Any ideas?
Suggestion: use usleep() (μ-sleep or sleep in microseconds) or nanosleep() (sleep in nanoseconds)?
Or use another synchronization mechanism, for example:
- the parent process creates a FIFO;
- fork();
- the child opens the FIFO for writing (blocks until a reader appears);
- the parent opens the FIFO for reading (blocks until a writer appears);
- when unblocked because the open() calls have returned, both processes simply close the FIFO;
- the parent deletes the FIFO.
Note that there is no data transfer through FIFO between the two processes; the code simply relies on the kernel to block processes until a reader and writer appear, so both processes are ready to go.
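The FIFO steps can be sketched roughly as follows. This is only an illustration under assumptions: the function name fifo_rendezvous() and the idea of passing the FIFO path in from the caller are made up for the example, and error handling is kept to a minimum.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <signal.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: parent and child meet at a FIFO; no data is sent.
 * Returns 0 if both got past the rendezvous, -1 otherwise. */
int fifo_rendezvous(const char *path)
{
    if (mkfifo(path, 0600) != 0)          /* parent creates the FIFO */
        return -1;

    pid_t pid = fork();
    if (pid < 0) {
        unlink(path);
        return -1;
    }

    if (pid == 0) {
        int wfd = open(path, O_WRONLY);   /* blocks until a reader appears */
        if (wfd < 0)
            _exit(1);
        close(wfd);
        /* The child would carry on with its real work here. */
        _exit(0);
    }

    int rc = 0;
    int rfd = open(path, O_RDONLY);       /* blocks until a writer appears */
    if (rfd >= 0) {
        close(rfd);
    } else {
        kill(pid, SIGKILL);               /* do not leave the child blocked */
        rc = -1;
    }
    unlink(path);                         /* parent deletes the FIFO */

    int status;
    if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status) ||
        WEXITSTATUS(status) != 0)
        rc = -1;
    return rc;
}
```

A call such as fifo_rendezvous("/tmp/sync.fifo") (the path is just an example) returns once both opens have completed and the child has exited.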
Another possibility is that the parent process could try if (siguser1setted == FALSE) pause(); to reduce the window for the race condition. However, it only reduces the window; it does not guarantee that the race condition cannot occur. That is, Murphy's Law applies, and the signal can arrive between the time the test completes and the time pause() is called.
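To make the remaining window concrete, here is a small sketch of the test-then-pause idea. The flag name siguser1setted comes from the question; setup_sigusr1() and wait_for_sigusr1() are hypothetical names, and the loop (rather than a single if) is an assumption added so that unrelated signals do not end the wait early.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stddef.h>
#include <unistd.h>

static volatile sig_atomic_t siguser1setted = 0;

static void mark_sigusr1(int signo)
{
    (void)signo;
    siguser1setted = 1;
}

/* Hypothetical helper: install the handler; 0 on success, -1 on error. */
int setup_sigusr1(void)
{
    struct sigaction sa;
    sa.sa_handler = mark_sigusr1;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    return sigaction(SIGUSR1, &sa, NULL);
}

/* Hypothetical helper: wait until the flag is set.
 * Returns how many times pause() had to be called. */
int wait_for_sigusr1(void)
{
    int pauses = 0;
    while (!siguser1setted) {
        /* Race window: a SIGUSR1 delivered right here sets the flag, yet
         * the pause() below still sleeps until some later signal arrives. */
        pause();
        pauses++;
    }
    return pauses;
}
```

If the signal has already arrived before wait_for_sigusr1() is called, the function returns immediately without pausing, which is the case the test is meant to catch. The comment inside the loop marks the case it cannot catch.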
All this suggests that signals are not a very good IPC mechanism. They can be used for IPC, but they should rarely be used for synchronization.
By the way, there is no need to check the return value of any of the exec*() family of functions. If the call returns, it failed.
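A short sketch of what that looks like in practice. The helper name run_and_wait() is hypothetical, and the exit status 127 is simply the usual shell convention for "command could not be run", used here as an assumption.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run argv[0] in a child process.
 * Returns the child's exit status, or -1 on fork/wait problems. */
int run_and_wait(char *const argv[])
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        execvp(argv[0], argv);
        /* Only reached if execvp() failed; its return value is not tested. */
        perror(argv[0]);
        _exit(127);
    }

    int status;
    if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

The code after execvp() is the error path by construction, so it goes straight to reporting the failure and exiting the child.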
And the questioner asked again:
Isn't it better to use POSIX semaphores shared between processes?
Semaphores would certainly be another valid mechanism for synchronizing the two processes. Since I would certainly have to look at the manual pages for semaphores, whereas I can remember how to use FIFOs without looking, I'm not sure I would actually use them. Creating and removing a FIFO has its own set of issues, though, so it is not clear that either is “better” or “worse”; they are just different. It is mkfifo(), open(), close(), unlink() for FIFOs versus sem_open() (or sem_init()), sem_post(), sem_wait(), sem_close() and possibly sem_unlink() (or sem_destroy()) for semaphores. You might consider registering a FIFO-removal or semaphore-cleanup function with atexit() to make sure that the FIFO or semaphore is destroyed in as many circumstances as possible. However, that is probably over the top (OTT) for a test program.
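For comparison, a rough sketch of the named-semaphore variant. Here the child's notification is a sem_post() and the parent's wait is a sem_wait(), so a post that happens before the wait is remembered in the semaphore count rather than lost. The function name wait_for_child_post() and the caller-supplied semaphore name are assumptions made for this example; the atexit() cleanup is left out for brevity.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <fcntl.h>
#include <semaphore.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: the child posts, the parent waits.
 * Returns 0 on success, -1 on any failure. */
int wait_for_child_post(const char *name)
{
    /* Initial value 0: the parent blocks until the child has posted. */
    sem_t *sem = sem_open(name, O_CREAT | O_EXCL, 0600, 0);
    if (sem == SEM_FAILED)
        return -1;

    pid_t pid = fork();
    if (pid < 0) {
        sem_close(sem);
        sem_unlink(name);
        return -1;
    }

    if (pid == 0) {
        /* The open semaphore is inherited across fork(). */
        int ok = (sem_post(sem) == 0);
        sem_close(sem);
        _exit(ok ? 0 : 1);
    }

    int rc = 0;
    while (sem_wait(sem) != 0) {          /* retry if interrupted */
        if (errno != EINTR) {
            rc = -1;
            break;
        }
    }
    sem_close(sem);
    sem_unlink(name);

    int status;
    if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status) ||
        WEXITSTATUS(status) != 0)
        rc = -1;
    return rc;
}
```

A call such as wait_for_child_post("/sync_demo") (the name is just an example) returns once the child has posted and exited. On some systems the program may need to be linked with -pthread.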