Airflow: PythonOperator: why include 'ds' arg?

When defining a function that will later be used as python_callable, why is "ds" included as the first argument to the function?

For instance:

def python_func(ds, **kwargs): pass 

I looked through the Airflow documentation but did not find any explanation.

1 answer

This is due to the provide_context=True parameter. According to the Airflow documentation,

if set to true, Airflow will pass a set of keyword arguments that can be used in your function. This set of kwargs exactly matches what you can use in your jinja templates. For this to work, you need to define **kwargs in the function header.

ds is one of these keyword arguments and represents the execution date in the format "YYYY-MM-DD". For parameters that are marked as (templated) in the documentation, you can use the default variable '{{ ds }}' to pass the execution date. You can learn more about the default variables here: https://pythonhosted.org/airflow/code.html?highlight=pythonoperator#default-variables
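How a named parameter such as ds gets picked out of those keyword arguments can be sketched in plain Python, without Airflow. This is only an illustration: sample_context is a hand-made stand-in with a few of the keys, and call_with_context is a hypothetical helper (not an Airflow function) that assumes the operator ends up calling your function with the context unpacked as keyword arguments.

```python
def python_func(ds, **kwargs):
    # 'ds' is bound by name; every other context key lands in kwargs.
    return ds, sorted(kwargs)


def call_with_context(func, context):
    # Hypothetical stand-in for what the operator does when
    # provide_context=True: unpack the context as keyword arguments.
    return func(**context)


# Hand-made sample context; real Airflow passes many more keys.
sample_context = {"ds": "2017-01-01", "ds_nodash": "20170101", "params": {}}

ds_value, other_keys = call_with_context(python_func, sample_context)
print(ds_value)    # the value stored under the 'ds' key
print(other_keys)  # names of the keys that ended up in **kwargs
```

So ds does not have to be the first argument for any positional reason; it only has to be named ds so that the matching keyword is bound to it.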

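The python_callable parameter of PythonOperator is not templated, so doing something like python_callable=print_execution_date('{{ ds }}') will fail: the function is called immediately with the literal string '{{ ds }}', and its return value, not the function itself, is passed to the operator. To print the execution date inside the function called by your PythonOperator, define it as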

def print_execution_date(ds, **kwargs): print(ds)

or

def print_execution_date(**kwargs): print(kwargs.get('ds'))
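To see why the python_callable=print_execution_date('{{ ds }}') form mentioned above cannot work, here is a minimal plain-Python sketch with no Airflow involved. It only demonstrates ordinary Python call semantics; the variable name result is chosen for illustration.

```python
def print_execution_date(ds, **kwargs):
    print(ds)


# Writing print_execution_date("{{ ds }}") calls the function right away,
# at the moment the line is evaluated. Nothing renders the template, so
# the function just receives the literal text.
result = print_execution_date("{{ ds }}")

# The function has no return statement, so the call evaluates to None,
# and None is what would be handed over as python_callable.
print(result is None)
```

Passing the function object itself (python_callable=print_execution_date, with no parentheses) lets the operator call it later and supply ds at run time.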

Hope this helps.


Source: https://habr.com/ru/post/1259550/

