Why is this lazy initialization thread-safe in Python?

I just read this blog post with a recipe for lazily initializing an object property. I am a recovering Java programmer, and if this code were translated into Java, it would be considered a race condition (double-checked locking). Why does this work in Python? I know Python has a threading module. Are locks secretly added by the interpreter to make this thread-safe?

What does canonical thread-safe initialization look like in Python?

+6
3 answers
  • No, locks are not added automatically.
  • So this code is not thread-safe.
  • If it appears to work in a multi-threaded program without problems, that is probably because of the Global Interpreter Lock, which makes the race less likely to occur.
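To make the race concrete, here is a minimal sketch. The blog post's recipe is not quoted in the question, so `UnsafeLazyProperty` is a hypothetical stand-in for an unlocked lazy property, and `Config`, `calls`, and `run_race` are names invented for the demonstration. A `threading.Barrier` holds each thread inside the initializer until the other arrives, so both threads pass the "not computed yet" check.

```python
import threading


class UnsafeLazyProperty:
    """Hypothetical stand-in for an unlocked lazy-property recipe."""

    def __init__(self, func):
        self._func = func
        self.__name__ = func.__name__

    def __get__(self, obj, klass=None):
        if obj is None:
            return self
        # Check-then-act without a lock: two threads can both pass this test.
        if self.__name__ not in obj.__dict__:
            obj.__dict__[self.__name__] = self._func(obj)
        return obj.__dict__[self.__name__]


calls = []                       # one entry per run of the initializer
barrier = threading.Barrier(2)   # keeps both threads inside the initializer


class Config:
    @UnsafeLazyProperty
    def value(self):
        calls.append(threading.current_thread().name)
        barrier.wait(timeout=5)  # wait here until the other thread arrives
        return object()


def run_race():
    """Read the property from two threads; return how often it was computed."""
    calls.clear()
    cfg = Config()
    threads = [threading.Thread(target=lambda: cfg.value) for _ in range(2)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return len(calls)


race_count = run_race()
print("initializer ran", race_count, "times")
```

In real code the window is much narrower than this forced one, which is why the problem can stay hidden for a long time.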
+5

This code is not thread safe.

Determining Thread Safety

You can check the thread safety by going through the bytecode, for example:

    from dis import dis

    dis('a = [] \n'
        'a.append(5)')
    # Here you can see that it is thread-safe
    # (the append itself is the single CALL_FUNCTION instruction)
    ##  1           0 BUILD_LIST               0
    ##              3 STORE_NAME               0 (a)
    ##
    ##  2           6 LOAD_NAME                0 (a)
    ##              9 LOAD_ATTR                1 (append)
    ##             12 LOAD_CONST               0 (5)
    ##             15 CALL_FUNCTION            1 (1 positional, 0 keyword pair)
    ##             18 POP_TOP
    ##             19 LOAD_CONST               1 (None)
    ##             22 RETURN_VALUE

    dis('a = [] \n'
        'a = a + [5]')
    # And this one isn't (possible gap between 15 and 16)
    ##  1           0 BUILD_LIST               0
    ##              3 STORE_NAME               0 (a)
    ##
    ##  2           6 LOAD_NAME                0 (a)
    ##              9 LOAD_CONST               0 (5)
    ##             12 BUILD_LIST               1
    ##             15 BINARY_ADD
    ##             16 STORE_NAME               0 (a)
    ##             19 LOAD_CONST               1 (None)
    ##             22 RETURN_VALUE

However, I have to warn you that bytecode may change over time, and thread safety may depend on the Python implementation you use (CPython, Jython, IronPython, etc.).
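Because the exact opcodes differ between interpreter versions, it can help to inspect what your own interpreter emits. The sketch below assumes CPython; `opnames` is a helper name made up for this example, and it only lists opcode names without claiming any particular output.

```python
import dis


def opnames(source):
    """Return the opcode names the running interpreter emits for `source`."""
    code = compile(source, "<example>", "exec")
    return [ins.opname for ins in dis.get_instructions(code)]


append_ops = opnames("a = []\na.append(5)")
rebind_ops = opnames("a = []\na = a + [5]")

print(append_ops)
print(rebind_ops)
```

In the second listing, the addition and the final store back into `a` show up as separate instructions, and that is where another thread could be scheduled in between.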

So the general recommendation is: if you ever need thread safety, use synchronization mechanisms such as locks, queues, semaphores, etc.

Thread-safe version of LazyProperty

The descriptor you mentioned can be made thread-safe as follows:

    from threading import Lock

    class LazyProperty(object):

        def __init__(self, func):
            self._func = func
            self.__name__ = func.__name__
            self.__doc__ = func.__doc__
            self._lock = Lock()

        def __get__(self, obj, klass=None):
            if obj is None: return None
            # __get__ may be called concurrently
            with self._lock:
                # another thread may have computed property value
                # while this thread was in __get__
                # line below added, thx @qarma for correction
                if self.__name__ not in obj.__dict__:
                    # none computed `_func` yet, do so (under lock) and set attribute
                    obj.__dict__[self.__name__] = self._func(obj)
            # by now, attribute is guaranteed to be set,
            # either by this thread or another
            return obj.__dict__[self.__name__]
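A small usage sketch for a descriptor of this shape, repeated here in condensed form so the example runs on its own. `Report`, `data`, `computations`, and `read_concurrently` are hypothetical names for the demonstration. Eight threads read the property at once, and the lock ensures the slow initializer runs only once.

```python
import threading
import time


class LazyProperty:
    """Condensed copy of the locked descriptor, for a standalone demo."""

    def __init__(self, func):
        self._func = func
        self.__name__ = func.__name__
        self._lock = threading.Lock()

    def __get__(self, obj, klass=None):
        if obj is None:
            return None
        with self._lock:
            if self.__name__ not in obj.__dict__:
                obj.__dict__[self.__name__] = self._func(obj)
        return obj.__dict__[self.__name__]


class Report:
    def __init__(self):
        self.computations = 0      # counts runs of the initializer

    @LazyProperty
    def data(self):
        self.computations += 1     # only ever executed under the lock
        time.sleep(0.05)           # simulate slow initialization
        return [1, 2, 3]


def read_concurrently(n_threads=8):
    report = Report()
    results = []

    def worker():
        results.append(report.data)

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return report, results


report, results = read_concurrently()
print(report.computations, len(results))
```

Note that the lock lives on the descriptor, so it is shared by every instance of the owning class: initializing the property on one object briefly blocks initialization on all others.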

Canonical Thread-Safe Initialization

For canonical thread-safe initialization, you need to code a metaclass that acquires a lock when instantiation starts and releases it once the instance has been created:

    from threading import Lock

    class ThreadSafeInitMeta(type):
        def __new__(metacls, name, bases, namespace, **kwds):
            # here we add lock to !!class!! (not instance of it)
            # class could refer to its lock as: self.__safe_init_lock
            # see namespace mangling for details
            namespace['_{}__safe_init_lock'.format(name)] = Lock()
            return super().__new__(metacls, name, bases, namespace, **kwds)

        def __call__(cls, *args, **kwargs):
            lock = getattr(cls, '_{}__safe_init_lock'.format(cls.__name__))
            with lock:
                retval = super().__call__(*args, **kwargs)
            return retval

    class ThreadSafeInit(metaclass=ThreadSafeInitMeta):
        pass

    ######### Use as follows #########
    # class MyCls(..., ThreadSafeInit):
    #     def __init__(self, ...):
    #         ...
    ##################################

    '''
    class Tst(ThreadSafeInit):
        def __init__(self, val):
            print(val, self.__safe_init_lock)
    '''
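One way to check that such a metaclass really serializes instantiation is to count how many `__init__` calls are in flight at the same time. This is a sketch only: the metaclass is repeated in condensed form so the example is standalone, and `Tracked`, `active`, `max_active`, and `build_many` are hypothetical names.

```python
import threading
import time


class ThreadSafeInitMeta(type):
    """Condensed copy of the metaclass: one lock per class."""

    def __new__(metacls, name, bases, namespace, **kwds):
        namespace['_{}__safe_init_lock'.format(name)] = threading.Lock()
        return super().__new__(metacls, name, bases, namespace, **kwds)

    def __call__(cls, *args, **kwargs):
        lock = getattr(cls, '_{}__safe_init_lock'.format(cls.__name__))
        with lock:
            return super().__call__(*args, **kwargs)


class Tracked(metaclass=ThreadSafeInitMeta):
    active = 0       # __init__ calls currently running
    max_active = 0   # highest value `active` ever reached

    def __init__(self):
        cls = type(self)
        cls.active += 1
        cls.max_active = max(cls.max_active, cls.active)
        time.sleep(0.01)   # widen the window for any overlap
        cls.active -= 1


def build_many(n_threads=5):
    instances = []
    threads = [
        threading.Thread(target=lambda: instances.append(Tracked()))
        for _ in range(n_threads)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return instances


instances = build_many()
print(len(instances), Tracked.max_active)
```

Because the lock is held for the whole `__call__`, both `__new__` and `__init__` of the class run inside it.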

Something completely different from the metaclass solution

And finally, if you need a simpler solution, just create a shared init lock and instantiate while holding it:

    from threading import Lock

    MyCls._inst_lock = Lock()  # monkey patching | or subclass if hate it
    ...
    with MyCls._inst_lock:
        myinst = MyCls()

However, it is easy to forget to take the lock, which can lead to very interesting debugging sessions. You can also code a class decorator, but, in my opinion, it would not be better than the metaclass solution.
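For comparison, the class decorator mentioned above might look roughly like this. It is a sketch under assumptions, not the answer author's code: `thread_safe_init`, `_init_lock`, and `Connection` are invented names, and an `RLock` is used so that an `__init__` which creates another instance of the same class does not deadlock.

```python
import functools
import threading
import time


def thread_safe_init(cls):
    """Class decorator (hypothetical): run cls.__init__ under a per-class lock."""
    lock = threading.RLock()
    original_init = cls.__init__

    @functools.wraps(original_init)
    def locked_init(self, *args, **kwargs):
        with lock:
            original_init(self, *args, **kwargs)

    cls.__init__ = locked_init
    cls._init_lock = lock      # exposed only for inspection
    return cls


@thread_safe_init
class Connection:
    active = 0       # __init__ calls currently running
    max_active = 0   # highest value `active` ever reached

    def __init__(self, name):
        cls = type(self)
        cls.active += 1
        cls.max_active = max(cls.max_active, cls.active)
        time.sleep(0.01)
        cls.active -= 1
        self.name = name


def connect_many(n_threads=5):
    made = []
    threads = [
        threading.Thread(target=lambda i=i: made.append(Connection(str(i))))
        for i in range(n_threads)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return made


connections = connect_many()
print(len(connections), Connection.max_active)
```

Unlike the metaclass, this variant guards only `__init__`, so anything done in `__new__` still runs outside the lock.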

+2

To expand on @methodnev's answer, here's how to protect the lazy property:

    import threading

    class LazyProperty(object):

        def __init__(self, func):
            self._func = func
            self.__name__ = func.__name__
            self.__doc__ = func.__doc__
            self.lock = threading.Lock()

        def __get__(self, obj, klass=None):
            if obj is None: return None
            # __get__ may be called concurrently
            with self.lock:
                # another thread may have computed property value
                # while this thread was in __get__
                if self.__name__ not in obj.__dict__:
                    # none computed `_func` yet, do so (under lock) and set attribute
                    obj.__dict__[self.__name__] = self._func(obj)
            # by now, attribute is guaranteed to be set,
            # either by this thread or another
            return obj.__dict__[self.__name__]
+1

Source: https://habr.com/ru/post/909422/

