I have programmed a scanner that searches for specific files on all the hard drives of the system that is being scanned. Some of these systems are quite old, running Windows 2000 with 256 or 512 MB of RAM, but the file system structure is complex, as some of them serve as file servers.
I use os.walk() in my script to traverse all directories and files.
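For context, the scan loop looks roughly like this (the target file names and the find_files name are placeholders, not my actual code):

```python
import os

def find_files(root, targets):
    """Yield the full paths of files whose names appear in `targets`."""
    targets = set(targets)
    for dirpath, dirnames, filenames in os.walk(root):
        for name in filenames:
            if name in targets:
                yield os.path.join(dirpath, name)
```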
Unfortunately, we noticed that the scanner's RAM consumption grows the longer it runs: the os.walk call alone was using about 50 MB of RAM after 2 hours of walking the file system, and after 4 hours of scanning it had reached about 90 MB.
Is there any way to avoid this behavior? We also tried betterwalk.walk() and scandir.walk(), with the same result. Do we need to write our own walk function that releases already-scanned directory and file objects so that the garbage collector can reclaim them from time to time?
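In case it helps clarify the question: this is the kind of hand-rolled walk I have in mind, a sketch using an explicit stack and os.listdir so that each directory listing is dropped as soon as its entries are processed. Peak memory would then be bounded by the pending-directory stack rather than by everything seen so far (iter_walk is a hypothetical name):

```python
import os
import stat

def iter_walk(root):
    """Depth-first walk that keeps only a stack of pending directories.

    Each os.listdir result is discarded right after its entries are
    pushed or yielded, instead of being retained for the whole scan.
    """
    stack = [root]
    while stack:
        path = stack.pop()
        try:
            entries = os.listdir(path)
        except OSError:
            continue  # skip directories we cannot read
        for name in entries:
            full = os.path.join(path, name)
            try:
                mode = os.lstat(full).st_mode  # lstat: don't follow symlinks
            except OSError:
                continue
            if stat.S_ISDIR(mode):
                stack.append(full)
            else:
                yield full
```

Would something along these lines actually keep the memory flat, or does the growth come from somewhere else (e.g. the interpreter's allocator not returning freed memory to the OS)?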

thanks