Python os.walk memory issue

I have programmed a scanner that searches for specific files on all the hard drives of the system that is being scanned. Some of these systems are quite old, running Windows 2000 with 256 or 512 MB of RAM, but the file system structure is complex, as some of them serve as file servers.

I use os.walk() in my script to traverse all directories and files.

Unfortunately, we noticed that after scanning for a while, the scanner consumes a lot of RAM. We found that the os.walk function alone uses about 50 MB of RAM after 2 hours of walking through the file system, and this usage keeps growing: after 4 hours of scanning it was at about 90 MB.

Is there any way to avoid this behavior? We also tried betterwalk.walk() and scandir.walk(), with the same result. Do we need to write our own walk function that removes already-scanned directory and file entries from memory so that the garbage collector can reclaim them from time to time?

[Figure: resource usage over time; the second row is memory]

thanks

3 answers

Have you tried the glob module?

    import glob
    import os

    def globit(search_dir):
        # Print every entry in search_dir, then recurse into subdirectories.
        pattern = os.path.join(search_dir, "*")
        for path in glob.glob(pattern):
            print(path)
            if os.path.isdir(path):
                globit(path)

    if __name__ == '__main__':
        globit(r'C:\working')
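Note that glob.glob() builds the complete result list for a directory before returning it. As a rough sketch of a lazier variant, glob.iglob() yields matches one at a time, and on Python 3.5+ it accepts recursive=True with a "**" pattern. The function name iter_files is made up for illustration, and whether this actually lowers memory use on the systems in the question is an assumption that would need measuring.

```python
import glob
import os

def iter_files(top):
    """Lazily yield every regular file below top (hypothetical helper)."""
    # glob.escape protects special characters such as [ or * in the path itself.
    pattern = os.path.join(glob.escape(top), "**", "*")
    # "**" with recursive=True matches zero or more directory levels.
    # By default, entries whose names start with a dot are not matched.
    for path in glob.iglob(pattern, recursive=True):
        if os.path.isfile(path):
            yield path

# Example use (the path is a placeholder):
# for path in iter_files(r"C:\working"):
#     print(path)
```

Because iter_files is a generator, the caller decides how many results to keep around; nothing is accumulated inside the function.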

If you do your work inside the os.walk loop, del everything that you no longer need, and try running gc.collect() at the end of each os.walk iteration.
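A minimal sketch of that suggestion follows. The names scan, wanted_names and collect_every are invented for illustration, and collecting only every N directories (rather than on every iteration) is an assumption made to keep the overhead down. CPython already frees most objects through reference counting, and gc.collect() only reclaims reference cycles, so it is not certain this changes the memory curve at all.

```python
import gc
import os

def scan(top, wanted_names, collect_every=1000):
    """Return paths below top whose file name is in wanted_names."""
    matches = []
    for count, (root, dirs, files) in enumerate(os.walk(top), start=1):
        for name in files:
            if name in wanted_names:
                matches.append(os.path.join(root, name))
        # Drop our references to this directory's listings right away.
        del dirs, files
        # Force a full collection periodically (hypothetical interval).
        if count % collect_every == 0:
            gc.collect()
    return matches

# Example use (path and file names are placeholders):
# hits = scan("C:\\", {"target.dat", "other.cfg"})
```

Only the matching paths are kept in the result list; everything else goes out of scope at the end of each iteration.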


Generators are the best solution, since they evaluate lazily. Here is one example implementation.

    import os
    import fnmatch

    # this may or may not be implemented
    def list_dir(path):
        for name in os.listdir(path):
            yield os.path.join(path, name)

    # modify this to take some pattern as input
    def os_walker(top):
        for root, dlist, flist in os.walk(top):
            for name in fnmatch.filter(flist, '*.py'):
                yield os.path.join(root, name)

    all_dirs = list_dir("D:\\tuts\\pycharm")
    for l in all_dirs:
        for name in os_walker(l):
            print(name)

Thanks, David Beazley.
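The comment in the code above suggests taking the pattern as input. One possible way to do that is sketched below; the name find_files and its default pattern are assumptions for illustration, not part of the original answer.

```python
import fnmatch
import os

def find_files(top, pattern="*.py"):
    """Lazily yield paths below top whose file name matches pattern."""
    for root, dirs, files in os.walk(top):
        # fnmatch.filter keeps only the names that match the shell-style pattern.
        for name in fnmatch.filter(files, pattern):
            yield os.path.join(root, name)

# Example use (the path is a placeholder):
# for path in find_files("D:\\tuts\\pycharm", "*.txt"):
#     print(path)
```

Unlike the version above, this walks top directly, so files that sit in the top-level directory itself are included as well.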


Source: https://habr.com/ru/post/971482/

