There is a class in my application whose instances are never garbage collected once they have been created. How could this be?
So far I have ruled out:
- Object references (for example, caches; see the test below, which takes a snapshot of all live objects, then tries to find a backref chain from the leaking object to any object in that snapshot)
- Uncollectable cycles (gc.garbage is empty and nothing in the graph should have __del__; see below)
- Forcing a collection (see the test below; gc.collect() is called until it returns 0)
- Quirks of Django or of the Django model (I tested other models, and they do not leak like this one)
- C extensions or other low-level tricks (as far as I know, NumPy and Pandas are the only extensions that touch this object graph)
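One way to double-check the "nothing in the graph should have __del__" item is to walk the objects reachable from an instance and list every type that defines a finalizer. The following is only a sketch: `types_with_del`, `Plain` and `Finalized` are hypothetical names, and the walk deliberately does not descend into classes, modules or functions so that it stays within instance data.

```python
import gc
import types

# Types the walk does not descend into, so it stays within instance data.
_SKIP = (type, types.ModuleType, types.FunctionType)


class Plain(object):
    """Hypothetical node without a finalizer."""

    def __init__(self):
        self.children = []


class Finalized(object):
    """Hypothetical node that defines __del__."""

    def __del__(self):
        pass


def types_with_del(root, max_objects=100000):
    """Return the set of types reachable from ``root`` that define __del__."""
    seen = {id(root)}
    stack = [root]
    found = set()
    while stack and len(seen) <= max_objects:
        obj = stack.pop()
        if hasattr(type(obj), "__del__"):
            found.add(type(obj))
        # gc.get_referents() lists the objects that ``obj`` points to directly.
        for child in gc.get_referents(obj):
            if isinstance(child, _SKIP) or id(child) in seen:
                continue
            seen.add(id(child))
            stack.append(child)
    return found
```

Calling `types_with_del(instance)` on a leaking instance after `recalculate()` would then show whether any finalizer has slipped into the graph.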
The class in question is one of the core business logic models in the application, and its instances contain a large number of reference cycles (the recalculate() method creates a cyclic graph with hundreds of nodes), but the garbage collector should be able to handle that.
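For reference, this is the behaviour expected from the collector on a plain cyclic graph. It is a minimal sketch with a hypothetical `Node` class standing in for the nodes that recalculate() builds; it is not the real model.

```python
import gc
import weakref


class Node(object):
    """Hypothetical stand-in for one node of the graph."""

    def __init__(self):
        self.peers = []


def build_cycle(size):
    """Link ``size`` nodes into a ring and return a weak reference to one."""
    nodes = [Node() for _ in range(size)]
    for i, node in enumerate(nodes):
        node.peers.append(nodes[(i + 1) % size])
    return weakref.ref(nodes[0])


def cycle_is_collected(size=100):
    """Return True if the ring is alive before and gone after gc.collect()."""
    was_enabled = gc.isenabled()
    gc.disable()  # keep an automatic collection from freeing the ring early
    try:
        probe = build_cycle(size)
        alive_before = probe() is not None
        gc.collect()
        return alive_before and probe() is None
    finally:
        if was_enabled:
            gc.enable()
```

A ring like this survives plain reference counting, because every node is still referenced by its neighbour, and it is reclaimed only by the cycle collector.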
So! What am I missing? How could unreferenced instances of this class be sticking around?
Update: the refcount from sys.getrefcount(…) = 16 is greater than len(gc.get_referrers(…)) = 15, so there must be a pointer from outside of gc land (thanks for the suggestion, Thomas Wouters).
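The comparison in the update can be sketched as follows. This is an illustration under assumptions, not the code that produced the numbers above: `Node`, `refcount_gap` and `simulate_hidden_reference` are hypothetical names, and the hidden reference is imitated on CPython by calling `Py_IncRef` through `ctypes`, which is roughly what a C extension that forgets to release a reference would do.

```python
import ctypes
import gc
import sys

# CPython's C API, reached through ctypes, used here only to imitate a C
# extension that takes a reference and never releases it.
_incref = ctypes.pythonapi.Py_IncRef
_incref.argtypes = [ctypes.py_object]
_incref.restype = None
_decref = ctypes.pythonapi.Py_DecRef
_decref.argtypes = [ctypes.py_object]
_decref.restype = None


class Node(object):
    """Hypothetical stand-in for the leaking class."""


def refcount_gap(obj):
    """Return refcount minus the number of referrers the gc can see.

    The absolute value includes temporary references made by the call
    itself, so only the change between two measurements is meaningful.
    """
    return sys.getrefcount(obj) - len(gc.get_referrers(obj))


def simulate_hidden_reference():
    """Return the gap before and after an untracked reference is taken."""
    node = Node()
    before = refcount_gap(node)
    _incref(node)  # a reference with no owning Python object
    after = refcount_gap(node)
    _decref(node)  # release it again so the demo itself does not leak
    return before, after
```

The gap grows by one for each reference that no gc-tracked object accounts for, which is the same kind of mismatch as 16 versus 15.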
Test case that shows a leak:
import gc
import weakref

import objgraph

from myapp import BusinessClass


def find_live_objects(cls):
    """ Returns weak references to all live objects of type ``cls``. """
    return [
        weakref.ref(o)
        for o in gc.get_objects()
        if type(o) is cls
    ]


def test():
    # Warm up once so that caches and lazy imports are already populated.
    a = BusinessClass.objects.get(id=1)
    a.recalculate()
    del a

    # Snapshot every object that is alive before the leaking instance exists.
    live_now = list(gc.get_objects())
    live_set = set(id(x) for x in live_now)
    print "Live objects:", len(live_set)

    a = BusinessClass.objects.get(id=2)
    a.recalculate()
    del a

    # Force collections until nothing more is found.
    print "gc:", gc.collect()
    print "gc:", gc.collect()
    print "gc:", gc.collect()
    print "Garbage:", gc.garbage

    live_list = find_live_objects(BusinessClass)
    print "Found:", [x() for x in live_list]

    # Look for a chain of references from the snapshot to the second instance.
    live = live_list[1]
    print "Searching for:", live()
    chain = objgraph.find_backref_chain(
        live(),
        (lambda x: id(x) in live_set),
        max_depth=999999,
    )
    print "Chain:", chain


test()
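The three consecutive gc.collect() calls in the test could also be written as a loop that stops as soon as a pass finds nothing. This is a small sketch; `collect_until_stable` and its `max_passes` limit are hypothetical and not part of the test above.

```python
import gc


def collect_until_stable(max_passes=10):
    """Call gc.collect() until a pass finds nothing; return the total found."""
    total = 0
    for _ in range(max_passes):
        found = gc.collect()  # number of unreachable objects found this pass
        if found == 0:
            break
        total += found
    return total
```

The pass limit only guards against a pathological case where every collection keeps producing new garbage.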
And when it is run:
$ python find-leaks.py
Live objects: 132062
gc: 21
gc: 0
gc: 0
Garbage: []
Found: [BusinessClass(id=1), BusinessClass(id=2)]
Searching for: BusinessClass(id=2)
Chain: [BusinessClass(2)]
The environment:
$ python
Python 2.7.10 (default, Jul 14 2015, 19:46:27)
[GCC 4.2.1 Compatible Apple LLVM 6.0 (clang-600.0.39)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import django
>>> django.VERSION
(1, 6, 11, 'final', 0)