Python: exec statement and garbage collection unexpected behavior

I ran into a problem with exec (this happened on a system that must be extensible with user-written scripts). I was able to reduce the problem to this code:

    def fn():
        context = {}
        exec '''
    class test:
        def __init__(self):
            self.buf = '1'*1024*1024*200
    x = test()''' in context

    fn()

I expected the memory to be freed by the garbage collector after calling the fn function. However, the Python process still consumes an additional 200 MB of memory, and I absolutely do not understand what is happening here and how to free the allocated memory manually.

I suspect that defining a class inside exec is not a very bright idea, but first of all, I want to understand what happens in the above example.

It seems that instantiating the class inside another function solves the problem, but what is the difference?

    def fn():
        context = {}
        exec '''
    class test:
        def __init__(self):
            self.buf = '1'*1024*1024*200
    def f1():
        x = test()
    f1()
    ''' in context

    fn()

This is my version of the Python interpreter:

    $ python
    Python 2.7 (r27:82500, Sep 16 2010, 18:02:00)
    [GCC 4.5.1 20100907 (Red Hat 4.5.1-3)] on linux2
2 answers

The reason the 200 MB stays allocated longer than you expect is that you have a reference cycle: context is a dict that refers to both x and test. x refers to an instance of test, which in turn refers to test. test has a dict of attributes, test.__dict__, which contains the __init__ function of the class. The __init__ function, in turn, refers to the globals it was defined with, which is the dict you passed to exec: context.

Python will break such reference cycles for you (since nothing involved has a __del__ method), but this requires the cyclic garbage collector to run. A collection is triggered automatically once allocations exceed deallocations by a threshold (configured with gc.set_threshold()), so the "leak" will go away at some point. If you want it to go away immediately, you can call gc.collect() yourself, or break the reference cycle before leaving the function. The latter is easy to do by calling context.clear(), but be aware that this affects all instances of the class that were created in that context.
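The cycle and both remedies can be observed with a small experiment. This is only a sketch, assuming CPython: the names CODE, Test and run are made up for illustration, the buffer is shrunk to 1 KB, and a weak reference is used to check whether the instance is still alive. It uses the exec(code, context) call form, which runs on Python 2.7 as well as Python 3.

```python
import gc
import weakref

# Hypothetical stand-in for the user script; the buffer is kept small on purpose.
CODE = '''
class Test(object):
    def __init__(self):
        self.buf = '1' * 1024
x = Test()
'''

def run(cleanup):
    context = {}
    exec(CODE, context)
    # A weak reference lets us watch the instance without keeping it alive.
    ref = weakref.ref(context['x'])
    if cleanup:
        context.clear()  # breaks the context -> Test -> __init__ -> context cycle
    return ref

gc.disable()  # keep automatic collections from interfering with the experiment
try:
    leaked = run(cleanup=False)
    still_alive = leaked() is not None   # the cycle keeps the instance alive

    gc.collect()                         # explicit collection breaks the cycle
    freed_by_collect = leaked() is None

    cleared = run(cleanup=True)
    freed_by_clear = cleared() is None   # freed immediately, no collection needed
finally:
    gc.enable()

print(still_alive, freed_by_collect, freed_by_clear)
```

On CPython all three flags should come out true: the instance survives the return from run until gc.collect() is called, whereas context.clear() releases it right away.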


I do not think the problem is related to exec; the garbage collector simply is not being triggered. If you move the exec'd code into the main application, both variants behave the same way as they do with exec:

    class test:
        def __init__(self):
            self.buf = '1'*1024*1024*200

    x = test()  # Consumes 200MB

    class test:
        def __init__(self):
            self.buf = '1'*1024*1024*200

    def f1():
        x = test()

    f1()  # Memory gets collected correctly

The difference between the two methods is that in the second case the local scope changes when f1() is called, and I think the garbage collector kicks in when x goes out of scope as the function returns control to the main script. If the scope does not change, the garbage collector waits until the difference between the number of allocations and the number of deallocations exceeds its threshold (on my machine, Python 2.7 starts with a default threshold of 700).
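One way to check what actually releases the instance in the f1() variant is to switch the cyclic collector off and see whether the object still goes away. The sketch below assumes CPython, where an object is released by reference counting as soon as its last reference disappears. The names CODE, Test, refs and make_ref are hypothetical helpers, and the buffer is shrunk to 1 KB.

```python
import gc
import weakref

# Hypothetical stand-in for the second variant from the question.
CODE = '''
class Test(object):
    def __init__(self):
        self.buf = '1' * 1024  # small stand-in for the 200 MB string

def f1():
    x = Test()
    refs.append(make_ref(x))  # weak reference: does not keep x alive

f1()
'''

# Current collection thresholds; the first number is the generation-0 threshold.
print(gc.get_threshold())

gc.disable()  # rule out the cyclic collector for this experiment
try:
    context = {'refs': [], 'make_ref': weakref.ref}
    exec(CODE, context)
    # x was local to f1, so nothing in context refers to the instance any more.
    local_freed_without_gc = context['refs'][0]() is None
finally:
    gc.enable()

print(local_freed_without_gc)
```

If the second print shows True, the instance was released when f1() returned even though no collection could run. That would suggest x is freed because it is a local that is not part of any cycle, rather than because a collection was triggered.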

We can get some insight into what is happening:

    import sys
    import gc

    class test:
        def __init__(self):
            self.buf = '1'*1024*1024*200

    x = test()
    print gc.get_count()  # Prints (168, 8, 0)

So we see that the garbage collector fires many times, but for some reason does not collect x. If you check the other version:

    import sys
    import gc

    class test:
        def __init__(self):
            self.buf = '1'*1024*1024*200

    def f1():
        x = test()

    f1()
    print gc.get_count()  # Prints (172, 8, 0)

In this case, we know that it managed to collect x. So it seems that when x is declared in the global scope, it retains some circular reference to itself, which prevents it from being collected. We can always use del x to force the collection manually, but of course this is not ideal. With gc.get_referrers(x) we can see which objects still refer to x, and perhaps that will make it clear how to stop this.

I know that I have not really solved the problem, but I hope this points you in the right direction. I will keep this problem in mind in case I come across something later.
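A minimal sketch of that inspection, assuming CPython. The dict context stands in for the global scope, CODE and Test are made-up names, and the buffer is shrunk to 1 KB. It asks gc.get_referrers() who holds the instance, then deletes the name and uses a weak reference to confirm the instance is gone.

```python
import gc
import weakref

# Hypothetical stand-in for the script that defines x at global scope.
CODE = '''
class Test(object):
    def __init__(self):
        self.buf = '1' * 1024
x = Test()
'''

context = {}
exec(CODE, context)

# Which tracked objects refer directly to the instance?
referrers = gc.get_referrers(context['x'])
held_by_context = any(r is context for r in referrers)
print(held_by_context)

# Dropping the name removes the last strong reference to the instance.
ref = weakref.ref(context['x'])
del context['x']
freed_after_del = ref() is None
print(freed_after_del)
```

Here the globals dict itself shows up among the referrers, which matches the observation that del x is enough to release the buffer.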


Source: https://habr.com/ru/post/890172/