Is There a Maximum Number of Retries for Kubernetes Jobs?

I have batch jobs that I want to run on Kubernetes. My understanding of Jobs is as follows:

If I choose restartPolicy: Never, then when the job fails the Pod is destroyed and a replacement is created, possibly on another node. If I choose restartPolicy: OnFailure, the container is restarted in the existing Pod. I expect that some failures can never be fixed by retrying. Is there a way to stop a Job from being rescheduled or restarted after a certain period of time, and to clean up the unrecoverable Jobs?
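For reference, the restart policy discussed above is set in the Job's Pod template. The fragment below is only a minimal sketch: the Job name, container name, image, and command are hypothetical placeholders, not taken from a real workload.

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: example-batch-job          # hypothetical name
spec:
  template:
    spec:
      # Never: a failed Pod is replaced by a new Pod, possibly on another node.
      # OnFailure: the failed container is restarted inside the same Pod.
      restartPolicy: Never
      containers:
      - name: worker               # hypothetical container
        image: busybox             # placeholder image
        command: ["sh", "-c", "exit 1"]   # always fails, to show the retry behavior
```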

My current idea for a workaround is a watchdog process that looks at retryTimes and cleans up Jobs after a specified number of retries.

1 answer

Summary of the discussion:

No, there is no retry limit. However, as of version 1.2 you can set a deadline on a Job with activeDeadlineSeconds. The system should back off restarts and then terminate the Job when it reaches the deadline.
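A minimal sketch of a Job with such a deadline is shown below. The field activeDeadlineSeconds belongs to the Job spec; the Job name, container, image, command, and the 600-second value are assumptions chosen only for illustration.

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: example-deadline-job       # hypothetical name
spec:
  # Assumed value: the Job is terminated once it has been active for
  # 10 minutes, no matter how many times its Pods have been retried.
  activeDeadlineSeconds: 600
  template:
    spec:
      restartPolicy: OnFailure
      containers:
      - name: worker               # hypothetical container
        image: busybox             # placeholder image
        command: ["sh", "-c", "exit 1"]   # always fails, so only the deadline stops it
```

Once the deadline passes, the Job's running Pods are terminated and the Job is marked as failed with the reason DeadlineExceeded, which a cleanup process can check for.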


Source: https://habr.com/ru/post/1242051/

