Modified time-based caching of a downloaded file using dogpile

I am writing a program that downloads a large file (~150 MB) and parses the data into a more convenient text file. The download, and especially the parsing, is slow (~20 minutes), so I would like to cache the result.

The download produces a bunch of files and the parsing produces a single file, so I could manually check whether these files exist and, if so, compare their modification times. However, since I am already using dogpile with a Redis backend elsewhere in the code to cache web-service calls, I was wondering whether dogpile can be used for this as well?

So my question is: can dogpile be used to cache a file based on its modified time?

1 answer

Why not split the program into several parts:

  • Loader

  • Parser and saver

  • Worker that consumes the results

You can keep a cache variable holding the value you need and refresh it whenever the file is updated.

    import os
    import time
    import json
    import threading

    _lock_services = threading.Lock()
    tmp_file = "/tmp/txt.json"
    update_time_sec = 3300

    with _lock_services:
        # if the file was created more than 55 minutes (3300 s) ago, reset it;
        # this is also where you would refresh your cache variable
        if os.path.getctime(tmp_file) < (time.time() - update_time_sec):
            os.system("echo '{}' > %s" % tmp_file)
        with open(tmp_file, "r") as json_data:
            cache_variable = json.load(json_data)
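If you skip dogpile entirely, the manual check the question describes takes only a few lines. This sketch (the path and maximum age are arbitrary example values) reports whether a cached result file exists and is still fresh according to its modification time:

```python
import os
import time

def is_fresh(path, max_age_sec):
    """True if `path` exists and was modified within the last `max_age_sec` seconds."""
    if not os.path.exists(path):
        return False
    return os.path.getmtime(path) >= time.time() - max_age_sec
```

Using `getmtime` (content modification time) rather than `getctime` is usually what you want here: on Linux, `getctime` is the inode-change time, which also ticks on metadata changes such as `chmod`.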

Source: https://habr.com/ru/post/977704/
