Deadlock with flock, fork, and a terminated parent process

I have a rather complicated Python program. Internally, it has a logging system that uses an exclusive (LOCK_EX) fcntl.flock to manage a global lock. Effectively, whenever a log message is written, the global file lock is acquired, the message is written to a file (separate from the lock file), and the global file lock is released.
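Roughly, the scheme looks like the sketch below. This is only an illustration of the acquire, write, release pattern: the helper name `write_log` and both paths are hypothetical, and the real program may keep the lock file open for longer than a single call.

```python
import fcntl


def write_log(lock_path, log_path, message):
    """Append one message to log_path while holding an exclusive flock().

    Hypothetical helper: the names and the per-call open() are assumptions.
    """
    with open(lock_path, "a") as lock_file:
        # Blocks until no other open file description holds the lock.
        fcntl.flock(lock_file, fcntl.LOCK_EX)
        try:
            with open(log_path, "a") as log_file:
                log_file.write(message + "\n")
        finally:
            # Release explicitly; closing lock_file would also release it,
            # unless another process shares the same descriptor.
            fcntl.flock(lock_file, fcntl.LOCK_UN)
```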

The program also forks itself several times (after the logging setup). As a rule, everything works.

If the parent process is killed (and the children survive), I sometimes end up in a deadlock. All the programs block on fcntl.flock() forever. Trying to acquire the lock from outside also blocks forever. I have to kill the child programs to resolve the problem.

What is puzzling is that lsof lock_file does not show any process as holding the lock! So I cannot understand why the kernel considers the file locked when no process is reported as holding it.

Does flock have problems with forking? Does the dead parent somehow still hold the lock, even though it is no longer in the process table? How do I solve this problem?

+4
1 answer

lsof almost certainly just does not show flock() locks, so not seeing one there says nothing about whether the lock exists.

flock() locks are inherited through descriptor sharing (the dup() system call, or a fork, with or without exec, that leaves the file open). Anyone with the shared descriptor can unlock the lock, but while the lock is held, an attempt to lock the file through a separately opened descriptor will block. So yes, it is likely that the parent locked the descriptor and then died, leaving the descriptor, which the children still hold open, locked. A child process then tries to take the lock too and blocks because it is already held. (The same thing would happen if a child process locked the file and then died.)
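A minimal sketch of this effect, assuming Linux and Python 3. The helper names `try_lock` and `demo` are made up for the illustration, and closing the descriptor in the parent stands in for the parent dying (the kernel closes all of a process's descriptors on exit).

```python
import fcntl
import os
import signal
import tempfile
import time


def try_lock(path):
    """Return True if an exclusive flock() via a fresh open() succeeds."""
    fd = os.open(path, os.O_RDWR)
    try:
        fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
        return True
    except BlockingIOError:
        return False
    finally:
        os.close(fd)  # also drops the lock if we just acquired it


def demo():
    tmp_fd, path = tempfile.mkstemp()
    os.close(tmp_fd)

    fd = os.open(path, os.O_RDWR)
    fcntl.flock(fd, fcntl.LOCK_EX)      # the "parent" takes the lock

    pid = os.fork()
    if pid == 0:
        # Child: inherits fd and never touches the lock itself.
        try:
            time.sleep(60)
        finally:
            os._exit(0)

    os.close(fd)                        # stands in for the parent dying
    held_while_child_alive = not try_lock(path)

    os.kill(pid, signal.SIGKILL)        # last holder of the descriptor goes
    os.waitpid(pid, 0)
    held_after_child_gone = not try_lock(path)

    os.unlink(path)
    return held_while_child_alive, held_after_child_gone
```

On Linux, `demo()` should return `(True, False)`: the lock outlives the process that took it for as long as any child keeps the inherited descriptor open, and disappears once the child is killed, which matches the behaviour described in the question.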

fcntl() locks, by contrast, are per-process: a dying process releases all of its locks, so the other processes can proceed, which seems to be what you want here.
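The difference can be sketched with `fcntl.lockf`, Python's wrapper around fcntl() record locks. This is an illustration under the same assumptions (Linux, Python 3); `can_lock` and `demo` are hypothetical names. The holder process takes the lock and forks a child that keeps the inherited descriptor open; killing the holder alone is enough to free the lock.

```python
import fcntl
import os
import signal
import tempfile
import time


def can_lock(path):
    """Return True if a non-blocking exclusive fcntl() lock can be taken."""
    fd = os.open(path, os.O_RDWR)
    try:
        fcntl.lockf(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
        return True
    except OSError:  # EACCES or EAGAIN, depending on the platform
        return False
    finally:
        os.close(fd)


def demo():
    tmp_fd, path = tempfile.mkstemp()
    os.close(tmp_fd)
    r, w = os.pipe()

    holder = os.fork()
    if holder == 0:
        # Plays the parent from the question: lock, fork, then linger.
        try:
            os.close(r)
            fd = os.open(path, os.O_RDWR)
            fcntl.lockf(fd, fcntl.LOCK_EX)
            child = os.fork()
            if child == 0:
                # Inherits fd, but fcntl() locks are not inherited.
                os.close(w)
                time.sleep(60)
                os._exit(0)
            os.write(w, str(child).encode())  # report the child's pid
            os.close(w)
            time.sleep(60)
        finally:
            os._exit(0)

    os.close(w)
    child_pid = int(os.read(r, 32).decode())
    os.close(r)

    locked_while_holder_alive = not can_lock(path)
    os.kill(holder, signal.SIGKILL)
    os.waitpid(holder, 0)
    # The surviving child still has the descriptor open at this point.
    locked_after_holder_died = not can_lock(path)

    os.kill(child_pid, signal.SIGKILL)  # clean up the orphaned child
    os.unlink(path)
    return locked_while_holder_alive, locked_after_holder_died
```

Here `demo()` should return `(True, False)`. One caveat with fcntl() locks: a process drops all of its locks on a file as soon as it closes any descriptor for that file, so the lock file should be opened in exactly one place.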

+1

Source: https://habr.com/ru/post/1394330/

