I am using the Python GAE SDK.
I have some processing that needs to be done on 6000+ instances of MyKind. It is too slow to do in one request, so I am using the task queue. If I make a single task process just one entity, it only takes a few seconds.
The documentation says that only 100 tasks can be added in a “batch”. (What do they mean by this? In one request? In one task?)
So, assuming “batch” means “request”, I'm trying to figure out the best way to create a task for each entity in the datastore. What do you think?
It is easier if I can assume that the order of MyKind will never change. (The processing never changes the MyKind instances; it only creates new instances of other kinds.) I could just create a number of scheduling tasks, giving each one an offset at which to start, with the offsets spaced less than 100 apart. Each scheduling task could then create the individual tasks that perform the actual processing.
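To make the two-level idea concrete, here is a minimal sketch of the offset arithmetic only. It uses no SDK calls; the names `scheduler_offsets`, `worker_offsets`, and `BATCH_LIMIT` are hypothetical, and it assumes the limit of 100 applies to the tasks added by each request.

```python
BATCH_LIMIT = 100  # assumed: at most 100 tasks may be added per request


def scheduler_offsets(total, step=BATCH_LIMIT):
    """Start offsets for the scheduling tasks, spaced `step` apart."""
    return list(range(0, total, step))


def worker_offsets(start, total, step=BATCH_LIMIT):
    """Offsets that one scheduling task would hand to its worker tasks."""
    return list(range(start, min(start + step, total)))


total = 6000
schedulers = scheduler_offsets(total)

# Every entity offset should be covered exactly once, in order.
covered = [i for s in schedulers for i in worker_offsets(s, total)]
```

With 6000 entities this yields 60 scheduling tasks, each adding at most 100 worker tasks, so the first request stays under the limit. It only holds up if offsets keep pointing at the same entities between requests.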
But what if there are so many entities that the original request cannot add all the necessary scheduling tasks? This makes me think that I need a recursive solution: each task looks at the range it is given. If there is only one element in the range, the task processes it. Otherwise, it subdivides the range further into subsequent tasks.
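Roughly, the recursive version I have in mind looks like the sketch below. A `deque` stands in for the task queue so the logic can be run locally; in the real application each queued range would be a task added through the SDK. The names `split_range`, `run_task`, `simulate`, and `MAX_CHILDREN` are hypothetical.

```python
from collections import deque

MAX_CHILDREN = 100  # assumed: a task may add at most 100 child tasks


def split_range(start, end, parts=MAX_CHILDREN):
    """Split [start, end) into at most `parts` contiguous, non-empty sub-ranges."""
    size = end - start
    parts = min(parts, size)
    base, extra = divmod(size, parts)
    ranges = []
    lo = start
    for i in range(parts):
        # The first `extra` sub-ranges take one additional element.
        hi = lo + base + (1 if i < extra else 0)
        ranges.append((lo, hi))
        lo = hi
    return ranges


def run_task(start, end, queue, processed):
    """One task: process a single element, or fan out into child tasks."""
    if end <= start:
        return  # empty range, nothing to do
    if end - start == 1:
        processed.append(start)  # stand-in for the real per-entity work
    else:
        queue.extend(split_range(start, end))  # stand-in for adding tasks


def simulate(total):
    """Drain the stand-in queue, starting from one task covering everything."""
    queue = deque([(0, total)])
    processed = []
    while queue:
        start, end = queue.popleft()
        run_task(start, end, queue, processed)
    return processed
```

For 6000 elements the root task adds 100 children covering 60 elements each, and each of those adds 60 single-element tasks, so no task ever adds more than 100. The depth grows only logarithmically with the number of entities.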