I am running a Flask server that loads data into a MongoDB database. Since there is a lot of data and loading takes a long time, I want to do this in a background job.
I use Redis as the message broker and python-rq to implement the job queues. All code runs on Heroku.
As I understand it, python-rq uses pickle to serialize the function being executed, including its arguments, and stores the result along with other values in a Redis hash.
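For context, here is a minimal sketch of my current setup (module and function names are illustrative, and `records` stands in for the real payload):

```python
import os

import redis
from rq import Queue

from myapp.tasks import load_records  # hypothetical task that writes the records to MongoDB

redis_conn = redis.from_url(os.environ["REDIS_URL"])
q = Queue(connection=redis_conn)

# `records` stands in for the parsed payload, a large list of dicts (~50 MB).
records = [{"field": "value"}]

# enqueue() pickles load_records together with this argument and stores
# the whole blob in the Redis hash for the job.
job = q.enqueue(load_records, records)
```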
Since the arguments contain the data that will be stored in the database, they are quite large (~50 MB), and serializing them into Redis not only takes a noticeable amount of time but also consumes a large amount of memory. Heroku Redis costs $30/month for only 100 MB. In fact, I often get OOM errors, for example:
OOM command not allowed when used memory > 'maxmemory'.
I have two questions:
- Is python-rq well suited for this task, or would Celery with JSON serialization be more appropriate?
- Is there a way to pass a reference to the data rather than serializing the data itself? (See the sketch below for what I mean.)
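Something like this is what I have in mind: stage the records in MongoDB first and enqueue only a small identifier, so that Redis never holds the payload. All names here are illustrative, `insert_into_final_collections` stands for my actual loading logic, and the records are staged as individual documents to stay under MongoDB's 16 MB document limit:

```python
import os
import uuid

import redis
from pymongo import MongoClient
from rq import Queue

mongo = MongoClient(os.environ["MONGODB_URI"])
staging = mongo.get_default_database()["staging"]  # temporary collection for raw payloads

def load_batch(batch_id):
    # Worker side: read the payload back from MongoDB instead of unpickling it from Redis.
    for doc in staging.find({"batch_id": batch_id}):
        insert_into_final_collections(doc)  # hypothetical loader
    staging.delete_many({"batch_id": batch_id})  # drop the temporary copies

# Web side: write the records to MongoDB tagged with a batch id,
# then enqueue only that id (a few bytes instead of ~50 MB).
records = [{"field": "value"}]  # stands in for the real payload
batch_id = str(uuid.uuid4())
staging.insert_many([{**r, "batch_id": batch_id} for r in records])

q = Queue(connection=redis.from_url(os.environ["REDIS_URL"]))
job = q.enqueue(load_batch, batch_id)
```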
Your thoughts on the best solution are greatly appreciated!