What is the difference between the all_done and all_success Airflow trigger rules?

One of the requirements in the workflow I am working on is to wait for an event to happen within a given time. If it does not happen, the task should be marked as failed, but the downstream task should still run.

I am wondering whether "all_done" means that all upstream tasks have finished, regardless of whether they succeeded.

3 answers

https://airflow.incubator.apache.org/concepts.html#trigger-rules

all_done means that all upstream tasks have finished executing, whether they succeeded or not.

all_success means that all upstream tasks completed successfully.

So your guess is correct


SUMMARY
The upstream tasks are "all done" if the number of them in state SUCCESS, FAILED, UPSTREAM_FAILED or SKIPPED is greater than or equal to the total number of upstream tasks.

Not sure why it would ever be greater than the total. Perhaps subdags do something odd with the counts.

The upstream tasks are "all success" if the number of upstream tasks and the number of successful upstream tasks are the same.

DETAILS
The code for evaluating trigger rules is here https://github.com/apache/incubator-airflow/blob/master/airflow/ti_deps/deps/trigger_rule_dep.py#L72

  1. ALL_DONE

The following code runs qry and stores the first row (the query is an aggregate, so it only ever returns one row) in the following variables:

 successes, skipped, failed, upstream_failed, done = qry.first() 

The "done" column in the query corresponds to func.count(TI.task_id), in other words a count of all task instances that match the filter. The filter restricts the count to upstream tasks from the current DAG and the current execution date, and adds this condition:

  TI.state.in_([ State.SUCCESS, State.FAILED, State.UPSTREAM_FAILED, State.SKIPPED]) 

So done is a count of the upstream tasks that are in one of these 4 states.

Later there is this code:

 upstream = len(task.upstream_task_ids)
 ...
 upstream_done = done >= upstream

And the actual trigger rule fails only on this condition:

 if not upstream_done

  2. ALL_SUCCESS

The code is pretty simple, and the concept is intuitive.

 num_failures = upstream - successes
 if num_failures > 0:
     ...  # it fails
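
To make the two checks concrete, here is a minimal standalone sketch of the logic described above. It does not import Airflow: the `evaluate` helper and the plain-string state names are assumptions made for illustration, standing in for the `State` constants and the aggregate query in trigger_rule_dep.py.

```python
from collections import Counter

# Plain strings standing in for State.SUCCESS, State.FAILED, etc. (assumption).
DONE_STATES = {"success", "failed", "upstream_failed", "skipped"}


def evaluate(upstream_states):
    """Return (all_done_met, all_success_met) for a list of upstream task states.

    Hypothetical helper: it mirrors the variable names quoted above,
    but it is not Airflow's actual implementation.
    """
    counts = Counter(upstream_states)
    upstream = len(upstream_states)      # number of upstream tasks
    successes = counts["success"]        # upstream tasks that succeeded
    done = sum(counts[state] for state in DONE_STATES)

    upstream_done = done >= upstream     # the all_done condition
    num_failures = upstream - successes  # all_success fails when this is > 0

    return upstream_done, num_failures == 0


# The scenario from the question: one upstream task timed out and failed.
print(evaluate(["success", "failed"]))   # all_done is met, all_success is not
print(evaluate(["success", "running"]))  # neither is met while a task is running
print(evaluate(["success", "success"]))  # both are met
```

In the question's scenario, the downstream task would need trigger_rule set to all_done in order to run after the timed-out task is marked as failed.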

Try using ShortCircuitOperator for the purpose you specify.


Source: https://habr.com/ru/post/1262858/

