I have a fairly complicated Python program. It contains a logging system that uses an exclusive (LOCK_EX) fcntl.flock as a global lock. Effectively, whenever a log message is emitted, the global file lock is acquired, the message is written to a file (separate from the lock file), and the global file lock is released.
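A minimal sketch of that logging path, for reference. The function name `log_message` and the default paths are placeholders, not the real program's names, and opening the lock file on every call is an assumption made for the sketch; the real code may keep a single descriptor open instead.

```python
import fcntl
import os

# Hypothetical paths, chosen only for this sketch.
LOCK_PATH = "/tmp/example_app.lock"
LOG_PATH = "/tmp/example_app.log"


def log_message(message, lock_path=LOCK_PATH, log_path=LOG_PATH):
    """Append one line to the log file while holding the global flock."""
    # Assumption: the lock file is opened on each call.
    with open(lock_path, "a") as lock_file:
        # Blocks until no other open file description holds the lock.
        fcntl.flock(lock_file, fcntl.LOCK_EX)
        try:
            with open(log_path, "a") as log_file:
                log_file.write(f"{os.getpid()}: {message}\n")
        finally:
            # Release explicitly; closing the file would also drop the lock.
            fcntl.flock(lock_file, fcntl.LOCK_UN)
```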
The program also forks itself several times (after setting up log management). As a rule, everything works.
If the parent process is killed (and the children survive), I sometimes end up in a deadlock. All the processes block on fcntl.flock() forever. Trying to acquire the lock from outside also blocks forever. I have to kill the child processes to fix the problem.
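To check the lock from outside without hanging, a non-blocking attempt can be used. This is only a diagnostic sketch; the helper name `lock_is_free` is made up for illustration.

```python
import fcntl


def lock_is_free(lock_path):
    """Return True if an exclusive flock can be taken right now."""
    with open(lock_path, "a") as lock_file:
        try:
            # LOCK_NB makes flock fail immediately instead of blocking.
            fcntl.flock(lock_file, fcntl.LOCK_EX | fcntl.LOCK_NB)
        except BlockingIOError:
            # Some other open file description currently holds the lock.
            return False
        fcntl.flock(lock_file, fcntl.LOCK_UN)
        return True
```

In the stuck state, a probe like this would be expected to return False rather than hang.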
What puzzles me is that lsof lock_file does not show any process as holding the lock! So I cannot understand how the file is locked by the kernel while no process is reported as holding it.
Does flock have problems with forking? Does the dead parent somehow still hold the lock, even though it is no longer in the process table? How do I solve this problem?