Python object is not garbage collected, but nothing references it and it is not uncollectable

There is a class in my application whose instances are never garbage collected once they have been created. How could this be?

So far I have ruled out:

  • Object references (for example, caches; see the test below, which takes a snapshot of all live objects, then tries to find a back-reference chain from the leaked object to any object in that snapshot)
  • Uncollectable object cycles (gc.garbage is empty, and nothing in the graph should define __del__; see below)
  • Forcing a collection (see the test below; gc.collect() is called until it returns 0)
  • Quirks of Django / Django models (I tested other models, and they do not leak like this one)
  • C extensions or other low-level tricks (as far as I know, NumPy and Pandas are the only extensions that touch this object graph)
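
The "forced collection" and "uncollectable cycles" checks from the list above can be sketched with the standard library alone. This is only an illustration: `collect_until_stable` and `Cyclic` are hypothetical names, not part of the application, and the snippet is written to run on both Python 2.7 and Python 3.

```python
import gc

def collect_until_stable(max_rounds=10):
    """Call gc.collect() until it reports 0 unreachable objects.

    Returns the total number of unreachable objects found.
    """
    total = 0
    for _ in range(max_rounds):
        found = gc.collect()
        total += found
        if found == 0:
            break
    return total

class Cyclic(object):
    """Hypothetical stand-in for an object that refers to itself."""
    def __init__(self):
        self.me = self

Cyclic()  # create a self-referencing object and drop it immediately
freed = collect_until_stable()
print("unreachable objects found: %d" % freed)
# On Python 2, cycles whose objects define __del__ end up in gc.garbage.
print("uncollectable: %r" % gc.garbage)
```

If `gc.garbage` stays empty and a further `gc.collect()` returns 0, the collector has nothing left that it considers unreachable, which is the situation described in this question.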

The class in question is one of the main business logic models in the application, and its instances contain a large number of reference cycles (the recalculate() method builds a cyclic graph with hundreds of nodes), but the garbage collector should be able to handle that.
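
As a sanity check on that expectation, here is a minimal sketch showing that a purely cyclic graph of plain Python objects is reclaimed by the collector. `Node` and `build_ring` are hypothetical stand-ins for whatever recalculate() builds; the real graph is not shown in this post.

```python
import gc
import weakref

class Node(object):
    """Hypothetical graph node with outgoing references."""
    def __init__(self):
        self.neighbours = []

def build_ring(size):
    """Link `size` nodes into one big reference cycle and return the first."""
    nodes = [Node() for _ in range(size)]
    for i, node in enumerate(nodes):
        node.neighbours.append(nodes[(i + 1) % size])
    return nodes[0]

head = build_ring(200)
probe = weakref.ref(head)  # a weak reference does not keep the ring alive
del head                   # the ring is now reachable only through itself
gc.collect()

ring_was_freed = probe() is None
print("ring freed: %s" % ring_was_freed)
```

The weak reference goes dead after the collection, so cycles alone should not explain a surviving instance.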

So! What am I missing? How can unreferenced instances of this class stay alive?

Update: the reference count from sys.getrefcount(…) = 16 is greater than len(gc.get_referrers(…)) = 15, so there must be a reference from outside the gc's world (thanks for the suggestion, Thomas Wouters).
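
A minimal sketch of this comparison, with hypothetical names (`Thing`, `refcount_and_referrers`). Two caveats apply when reading the numbers: sys.getrefcount() also counts the temporary reference created by passing the object as an argument, and gc.get_referrers() lists a container once even if it holds the object several times. A small gap is therefore a hint, not proof, of a referrer the gc cannot see.

```python
import gc
import sys

class Thing(object):
    """Hypothetical stand-in for the leaking instance."""

def refcount_and_referrers(obj):
    """Return (sys.getrefcount(obj), number of gc-visible referrers)."""
    return sys.getrefcount(obj), len(gc.get_referrers(obj))

thing = Thing()
before = refcount_and_referrers(thing)

holder = [thing, thing, thing]  # one container, three references
after = refcount_and_referrers(thing)

refcount_delta = after[0] - before[0]   # counts every reference
referrer_delta = after[1] - before[1]   # counts each referring container once
print("refcount grew by %d, referrers grew by %d"
      % (refcount_delta, referrer_delta))
```

Comparing deltas rather than absolute values sidesteps the extra references that the measurement itself introduces.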

Test case that shows a leak:

import gc
import weakref
import objgraph
from myapp import BusinessClass

def find_live_objects(cls):
    """ Returns all live objects of type ``cls``. """
    return [
        weakref.ref(o)
        for o in gc.get_objects()
        if type(o) == cls
    ]

def test():
    # Load and delete a similar object in case there are
    # any class-specific caches hiding somewhere.
    # Note: the results are the same without this.
    a = BusinessClass.objects.get(id=1)
    a.recalculate()
    del a

    # Snapshot all live objects
    live_now = list(gc.get_objects())
    live_set = set(id(x) for x in live_now)
    print "Live objects:", len(live_set)

    # Create and delete the object we're interested in
    a = BusinessClass.objects.get(id=2)
    a.recalculate()
    del a

    print "gc:", gc.collect()
    print "gc:", gc.collect()
    print "gc:", gc.collect()
    print "Garbage:", gc.garbage

    live_list = find_live_objects(BusinessClass)
    print "Found:", [x() for x in live_list]

    live = live_list[1]
    print "Searching for:", live()

    chain = objgraph.find_backref_chain(
        live(),
        (lambda x: id(x) in live_set),
        max_depth=999999,
    )
    print "Chain:", chain

test()

And when run:

$ python find-leaks.py
Live objects: 132062
gc: 21
gc: 0
gc: 0
Garbage: []
Found: [BusinessClass(id=1), BusinessClass(id=2)]
Searching for: BusinessClass(id=2)
Chain: [BusinessClass(2)]
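
For contrast, here is a minimal sketch of what the same kind of search reports when an ordinary, gc-visible reference is the cause. `Leaky` and `_hidden_cache` are hypothetical names that simulate a forgotten cache; nothing like them is known to exist in the application.

```python
import gc

class Leaky(object):
    """Hypothetical stand-in for BusinessClass."""

_hidden_cache = []  # simulates a forgotten module-level cache

def load_and_drop():
    obj = Leaky()
    _hidden_cache.append(obj)  # the "leak": a reference that outlives the caller
    del obj

load_and_drop()
gc.collect()

# Same idea as find_live_objects(): scan everything the gc tracks.
survivors = [o for o in gc.get_objects() if type(o) is Leaky]
referrers = gc.get_referrers(survivors[0])
cache_is_referrer = any(r is _hidden_cache for r in referrers)
print("survivors: %d, cache found among referrers: %s"
      % (len(survivors), cache_is_referrer))
```

In this simulated case the holder shows up among the referrers, whereas the output above ends in a chain containing only the leaked instance.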


Versions:

$ python
Python 2.7.10 (default, Jul 14 2015, 19:46:27) 
[GCC 4.2.1 Compatible Apple LLVM 6.0 (clang-600.0.39)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import django
>>> django.VERSION
(1, 6, 11, 'final', 0)

Source: https://habr.com/ru/post/1613600/

