Memory leak issue using pandas

I have a memory leak problem with a pandas DataFrame. Apparently this is a known issue: Memory leak using pandas dataframe

The tricks used in the answer there (calling gc.collect() to force garbage collection and free the memory) work, but they are rather slow.

My problem is that I need to run this loop at a frequency of 500 Hz, i.e. within 2 ms per iteration:

  • without garbage collection: memory leak, but 0.3-0.4 ms per loop
  • with gc.collect() in the loop: 11 ms per loop!

(measured over 1000 cycles with time.time(); this may be inaccurate, but it gives a good idea of the problem)
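For reference, the kind of per-loop measurement described above can be sketched in isolation as follows. This is only a rough illustration, not the actual test: the helper name time_loop and the list used as a stand-in for the received DataFrame are made up for the example, and it uses time.perf_counter() (Python 3.3+) instead of time.time(), so the absolute numbers will not match the ones quoted above.

```python
import gc
import time


def time_loop(n_cycles=1000, collect=False):
    """Return the mean time in seconds per iteration of a dummy loop body.

    Hypothetical helper: the list below only stands in for the DataFrame
    that the real loop receives over the pipe.
    """
    start = time.perf_counter()
    for _ in range(n_cycles):
        payload = [[1.0, 2.0, 3.0]]  # stand-in for Data = b.recv()
        cmd = payload[0][0]          # stand-in for Data['a'].values[0]
        if collect:
            gc.collect()             # full collection on every iteration
    return (time.perf_counter() - start) / n_cycles


if __name__ == "__main__":
    print("without gc.collect(): %.6f s/loop" % time_loop(collect=False))
    print("with gc.collect():    %.6f s/loop" % time_loop(collect=True))
```

Comparing the two printed values against the 2 ms budget of a 500 Hz loop shows how much of the budget a full collection per iteration takes on a given machine.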

My question is: what alternatives are there to gc.collect()? It works just fine, but it is too slow. I cannot call it only once every 1000 cycles either, because that particular cycle would be extremely slow, and I need a reliable frequency.

The code I use for testing is as follows:

import pandas as pd
import os
import gc
from multiprocessing import Process,Pipe
import time

a,b=Pipe()

def sender(a): # this one does not leak
    print "sender :", os.getpid()
    while True:
        Data=pd.DataFrame([[1.,2.,3.]],columns=['a','b','c'])
        a.send(Data)


def main(b):  # this one causes a memory "leak", but only when the pipe is in use
    try:
        print "receiver :", os.getpid()
        i=0
        #t=time.time() # for timing purpose
        while True:
            Data=b.recv()
            cmd=Data['a'].values[0]
            i+=1
            #gc.collect() # removes the memory leak, but is very slow
            #if i%1000==0: # loop for timing purpose
                #t1=time.time()
                #print i
                #print (t1-t)/1000
                #t=t1
    except (Exception,KeyboardInterrupt) as e:
        print "Exception : ", e
        raise

try:
    p=Process(target=main,args=(b,))
    q=Process(target=sender,args=(a,))

    p.start()
    q.start()

except (Exception,KeyboardInterrupt) as e:
    print "Exception in main : ", e
    p.terminate()
    q.terminate()

Source: https://habr.com/ru/post/1613528/

