I am just starting out with Airbnb's Airflow, and I still don't understand how or when backfilling is done.
In particular, there are two use cases that confuse me:
If I run the airflow scheduler for a few minutes, stop it for a minute, then restart it, my DAG seems to run extra tasks for the first 30 seconds or so, and then continues as normal (runs every 10 seconds). Are these extra tasks "backfill" tasks that were not able to complete in the earlier run? If so, how would I tell Airflow not to backfill those tasks?
If I run the airflow scheduler for a few minutes, then run airflow clear MY_tutorial, and then restart the airflow scheduler, it seems to run a TON of extra tasks. Are these tasks also somehow "backfill" tasks? Or am I missing something?
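To make my suspicion concrete, here is a small plain-Python sketch (it does not use Airflow at all). It counts the 10-second schedule slots that elapse while the scheduler is stopped, under my assumption that the scheduler tries to catch up on every slot it missed. The function name and the timestamps are made up for illustration.

```python
from datetime import datetime, timedelta


def missed_intervals(stopped_at, restarted_at, interval):
    """Return the schedule slots that fall inside the scheduler's downtime.

    Assumption (not confirmed Airflow behavior): slots are aligned to
    `stopped_at`, and every slot missed during the downtime gets run
    once the scheduler comes back up.
    """
    slots = []
    slot = stopped_at + interval
    while slot <= restarted_at:
        slots.append(slot)
        slot += interval
    return slots


stopped = datetime(2016, 10, 4, 12, 0, 0)    # hypothetical stop time
restarted = stopped + timedelta(minutes=1)   # scheduler down for one minute
missed = missed_intervals(stopped, restarted, timedelta(seconds=10))
print(len(missed))  # 6
```

If something like this is what happens, a one-minute pause would leave six runs to catch up on, which roughly matches the burst I see after a restart. Counting from my start_date instead would give a far larger number, which might explain the second case.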
I currently have a very simple dag:
from datetime import datetime, timedelta

default_args = {
    'owner': 'me',
    'depends_on_past': False,
    'start_date': datetime(2016, 10, 4),
    'email': ['airflow@airflow.com'],
    'email_on_failure': False,
    'email_on_retry': False,
    'retries': 1,
    'retry_delay': timedelta(minutes=5),
}
The only two things I have changed in my Airflow configuration are:
- I switched from a sqlite db to a postgres db
- I am using CeleryExecutor instead of SequentialExecutor
Thank you for your help!