App Engine backup never completes, only a hint of an error from the mapreduce worker_callback

Over the past few weeks we have repeatedly failed to complete a full backup of our datastore using the Datastore Admin tool. At first we thought the failures were caused by the quota errors we were hitting, so we switched our application from free to paid, but we still have the problem.

Every time we try to back up to Blobstore, the process simply never finishes. We see a backup in our list of pending backups, but it never completes. We only have 43 MB of data, so we don't think it is a data-transfer problem. Looking at our default task queue, we see two tasks waiting: one calling /_ah/mapreduce/controller_callback and the other calling /_ah/mapreduce/worker_callback.

The worker_callback task just keeps incrementing its retry count, and the only hint of an error is on the Previous Run tab, where the last HTTP response code is 500. There is no error message and nothing shows up in our error logs; it just retries over and over.

We were able to narrow the backup problem down to a specific entity kind in a particular namespace, but we cannot figure out why this kind fails while others do not. The main difference is that this kind has a large number of embedded objects, but if App Engine can get/put these entities, we don't understand why it has trouble backing them up. The namespace where the failure occurs also holds the most data for this kind compared to our other namespaces.

We figure that if we could see what error is occurring in worker_callback, we could understand why the backup fails, or what is wrong with our data that prevents it. Is there anything we need to configure or enable through settings/configuration files to get more detailed information about the backup? Or is there some other approach we should look into to investigate and fix this problem?

I should mention that we use the Java SDK, with Objectify v3 to work with the datastore. We are backing up to Blobstore.

Thanks.

2 answers

Working with the App Engine team, we found out what the problem was and worked around it. I want to share the details in case anyone else runs into this.

In issue 8363, the App Engine team indicated that they could see from their logs that the mapreduce was failing because of the large number of properties on our entity kind. The particular kind that caused the failure had a large number of variable properties, which generated errors when the mapreduce tried to write out the schema. They indicated that the fix on their end would be to skip entities like this during the backup so the backup could complete successfully.

What we did to work around the problem and get backups working was to change how we told Objectify to store our data. The large number of properties came from using the @Embedded annotation on a HashMap member field of the class. Because @Embedded breaks classes out into separate datastore properties, it generated a large number of properties. We changed the member field to @Serialized and then ran a conversion process to populate the new serialized property. This made backup/restore work again.
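For concreteness, here is a minimal sketch of the kind of change described above. The class and field names are made up (they are not our actual entity), and the exact annotation packages can differ between Objectify versions, so treat this as illustrative only:

```java
import java.io.Serializable;
import java.util.HashMap;

import javax.persistence.Id;

import com.googlecode.objectify.annotation.Serialized;

// Illustrative entity; kind, fields, and value type are assumptions.
public class Report {

    @Id Long id;

    // Old representation (shown for contrast): @Embedded flattens every map
    // entry into its own datastore property, so entities with many keys end
    // up with a very large property count, which is what broke the backup's
    // schema generation.
    // @Embedded private HashMap<String, Metric> metrics;

    // New representation: @Serialized stores the whole map as a single
    // binary property, keeping the property count small. The field contents
    // must be Serializable.
    @Serialized private HashMap<String, Metric> metrics = new HashMap<String, Metric>();

    public HashMap<String, Metric> getMetrics() {
        return metrics;
    }

    // Simple value object held in the map; it must implement Serializable
    // for @Serialized to work.
    public static class Metric implements Serializable {
        private static final long serialVersionUID = 1L;
        public double value;
        public long recordedAt;
    }
}
```

The trade-off is that a serialized field is an opaque blob to the datastore, so you can no longer filter queries on its contents, but for map-like data that is only read back by the application it keeps the schema small.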

You can read more about the differences between embedded and serialized on the Objectify website.
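The conversion step mentioned above can be as simple as loading every entity of the affected kind and re-saving it once the new field is populated. A rough sketch, assuming Objectify v3's ObjectifyService API; how the old flattened data is copied into the new field is application-specific and depends on whether you keep the old field around during the migration:

```java
import com.googlecode.objectify.Objectify;
import com.googlecode.objectify.ObjectifyService;

// One-off conversion job; Report is the illustrative entity sketched above.
public class ReportConversion {

    static {
        ObjectifyService.register(Report.class);
    }

    public static void convertAll() {
        // If the affected entities live in a specific namespace (as in the
        // question), set it with NamespaceManager before querying.
        Objectify ofy = ObjectifyService.begin();
        for (Report report : ofy.query(Report.class)) {
            // Copying data out of the old embedded representation into the
            // new @Serialized field is application-specific and omitted here;
            // once the new field holds the data, re-saving writes the compact
            // single-property form.
            ofy.put(report);
        }
    }
}
```

At the data size mentioned in the question (~43 MB) a simple loop like this can probably run in a single request; for anything larger, a task-queue job iterating with query cursors would be safer.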


snielson, could you open an issue in our public issue tracker here? Be sure to add your application ID so that we can debug this specific scenario further.

Thanks!

