The libraries at http://semanchuk.com/philip/ (posix_ipc and sysv_ipc) implement POSIX and System V semaphores for Python. You can use either of them. Beware, though: if the process holding a semaphore dies without releasing it, everyone else is stuck. If you are afraid of this, you can use System V semaphores with SEM_UNDO, but they are a bit slower. Also, if you use System V primitives such as shared memory, remember that they live in the kernel and outlive the process that created them, so you have to remove them from the system explicitly.
If you are not afraid of dying processes and system-wide deadlocks, and your processes are related (a parent and its children), you can use the semaphores from Python's multiprocessing module (on Unix they are POSIX semaphores underneath).
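A minimal sketch of the related-processes case, using only the standard library. It assumes a Unix system where the "fork" start method is available, so the children inherit the semaphore from the parent. The names worker and count_with_semaphore are made up for illustration.

```python
import multiprocessing as mp

# Assumption: Unix with the "fork" start method, so children inherit `sem`.
ctx = mp.get_context("fork")


def worker(sem, counter, n_iter):
    # Hypothetical worker: bumps a shared counter, one process at a time.
    for _ in range(n_iter):
        with sem:                 # acquire on entry, release on exit
            counter.value += 1


def count_with_semaphore(n_procs=4, n_iter=1000):
    sem = ctx.Semaphore(1)        # binary semaphore, used as a mutex
    # lock=False: the counter has no lock of its own, only `sem` protects it.
    counter = ctx.Value("i", 0, lock=False)
    procs = [ctx.Process(target=worker, args=(sem, counter, n_iter))
             for _ in range(n_procs)]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
    return counter.value
```

Note that nothing here protects you from a worker that is killed inside the with block: the semaphore then stays acquired and the remaining workers wait forever.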
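The page you linked as a related question (about fcntl) does not say that fcntl is unsuitable for inter-thread locking. It says that fcntl cares about fds. So you can use fcntl for both inter-process and inter-thread locking, as long as you open the lock file and get a new fd for each lock instance. One caveat: this works for flock-style locks (fcntl.flock), which belong to the open file. POSIX record locks (fcntl.lockf) belong to the process, so they never block another thread of the same process.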
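The "new fd for each lock instance" idea could look roughly like this. It is a sketch, not a hardened implementation: FileLock is a hypothetical helper, and it assumes a Unix system where fcntl.flock maps to flock(2), so that two separate opens of the same file conflict even inside one process.

```python
import fcntl
import os


class FileLock:
    """Hypothetical helper: every acquire() opens the lock file anew."""

    def __init__(self, path):
        self.path = path
        self.fd = None

    def acquire(self, blocking=True):
        # A fresh open() gives this instance its own open file description.
        fd = os.open(self.path, os.O_RDWR | os.O_CREAT, 0o600)
        flags = fcntl.LOCK_EX if blocking else fcntl.LOCK_EX | fcntl.LOCK_NB
        try:
            fcntl.flock(fd, flags)
        except BlockingIOError:
            # Somebody else (another process or another instance) holds it.
            os.close(fd)
            return False
        except BaseException:
            os.close(fd)
            raise
        self.fd = fd
        return True

    def release(self):
        fd, self.fd = self.fd, None
        fcntl.flock(fd, fcntl.LOCK_UN)
        os.close(fd)
```

Each thread (or process) creates its own FileLock for the same path. If the holder dies, the kernel closes its fd and the lock goes away with it, which is the main advantage over a semaphore.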
You can also combine the two: fcntl for inter-process locking and a Python threading lock (or semaphore) for inter-thread locking.
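A rough sketch of that combination, assuming Unix. CombinedLock and demo are hypothetical names. The file lock here is a POSIX record lock taken with fcntl.lockf. It is owned by the process as a whole, which is exactly why the threading.Lock is needed on top of it.

```python
import fcntl
import os
import threading
from contextlib import contextmanager


class CombinedLock:
    """Hypothetical helper: threading.Lock for threads, lockf for processes."""

    def __init__(self, path):
        self._path = path
        self._thread_lock = threading.Lock()

    @contextmanager
    def held(self):
        with self._thread_lock:                  # serializes threads in this process
            fd = os.open(self._path, os.O_RDWR | os.O_CREAT, 0o600)
            try:
                fcntl.lockf(fd, fcntl.LOCK_EX)   # serializes against other processes
                yield
            finally:
                os.close(fd)                     # closing the fd drops the record lock


def demo(path, n_threads=4, n_iter=200):
    # Threads share ONE CombinedLock so they share its threading.Lock.
    lock = CombinedLock(path)
    total = [0]

    def work():
        for _ in range(n_iter):
            with lock.held():
                total[0] += 1

    threads = [threading.Thread(target=work) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return total[0]
```

The order matters: take the thread lock first, then the file lock, and release in reverse, so that only one thread per process ever touches the lock file at a time.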
And finally: rethink your architecture. Locking is usually bad. Hand the resource over to a single process that owns it and serves everyone else, so nobody has to block on a lock. It will be much easier to maintain. Believe me.
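One way to read that advice, sketched with the standard library: a single owner process is the only one that touches the resource (here a file), and everybody else just sends it messages through a queue. All names (owner, client, run_owner_demo) are made up for the example, and the "fork" start method is assumed.

```python
import multiprocessing as mp

# Assumption: Unix with the "fork" start method.
ctx = mp.get_context("fork")


def owner(requests, path):
    # The only process that ever opens the file, so no lock is needed.
    with open(path, "a") as f:
        for line in iter(requests.get, None):   # None is the stop sentinel
            f.write(line + "\n")


def client(requests, name, n_msgs):
    # Clients never touch the file; put() returns without waiting for the owner.
    for i in range(n_msgs):
        requests.put(f"{name}:{i}")


def run_owner_demo(path, n_clients=3, n_msgs=5):
    requests = ctx.Queue()
    own = ctx.Process(target=owner, args=(requests, path))
    own.start()
    clients = [ctx.Process(target=client, args=(requests, f"client{c}", n_msgs))
               for c in range(n_clients)]
    for p in clients:
        p.start()
    for p in clients:
        p.join()              # all client messages are flushed by now
    requests.put(None)        # tell the owner to finish
    own.join()
```

If a client dies, nothing stays locked. If the owner dies, you restart one process instead of untangling a deadlock.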