Mike Metzger
Data Engineer
Operators
Represent a single task in a workflow.
DummyOperator(task_id='example', dag=dag)
bash_task = BashOperator(task_id='clean_addresses',
bash_command='cat addresses.txt | awk "NF==10" > cleaned.txt',
dag=dag)
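To make the filter in clean_addresses concrete, here is a rough pure-Python equivalent of awk "NF==10", which keeps only lines with exactly ten whitespace-separated fields. The function name keep_ten_field_lines and the sample lines are hypothetical, used for illustration only.

```python
def keep_ten_field_lines(lines, field_count=10):
    """Return only the lines with exactly `field_count` whitespace-separated fields."""
    return [line for line in lines if len(line.split()) == field_count]


# Hypothetical sample input: one complete record and one short record.
sample = [
    "a b c d e f g h i j",   # 10 fields, kept
    "a b c",                 # 3 fields, dropped
]
cleaned = keep_ten_field_lines(sample)
```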
Tasks
Tasks are:
Instances of operators
example_task = BashOperator(task_id='bash_example',
bash_command='echo "Example!"',
dag=dag)
Task dependencies are not required for a given workflow, but are usually present in most
In Airflow 1.8 and later, they are defined using the bitshift operators
>>, or the upstream operator
Downstream means after
task2 = BashOperator(task_id='second_task',
bash_command='echo 2',
dag=example_dag)
Mixed dependencies:
or:
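As an illustration of how bitshift-style dependencies combine, the sketch below uses a toy class in place of real operators. ToyTask and its attributes are hypothetical and are not Airflow's implementation; the sketch only mimics the idea that `a >> b` makes a run before b, and `a << b` makes b run before a.

```python
class ToyTask:
    """Hypothetical stand-in for an operator instance; not Airflow's implementation."""

    def __init__(self, task_id):
        self.task_id = task_id
        self.upstream = set()    # task_ids that must finish before this task
        self.downstream = set()  # task_ids that run after this task

    def __rshift__(self, other):
        # self >> other: self runs before other.
        self.downstream.add(other.task_id)
        other.upstream.add(self.task_id)
        return other  # returning the right-hand task allows chaining

    def __lshift__(self, other):
        # self << other: other runs before self.
        other >> self
        return other


task1, task2, task3 = ToyTask('task1'), ToyTask('task2'), ToyTask('task3')

# Mixed form in one statement: task1 and task3 both run before task2.
task1 >> task2 << task3
```

The same relationships could also be written as two separate statements, `task1 >> task2` and `task3 >> task2`.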
PythonOperator
Executes a Python function / callable
Supports keyword arguments to the callable through the op_kwargs dictionary
sleep_task = PythonOperator(
task_id='sleep',
python_callable=sleep,
op_kwargs={'length_of_time': 5},
dag=example_dag
)
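A minimal sketch of what op_kwargs means for the callable: the dictionary's keys are passed as keyword arguments, roughly as if the function were called with `sleep(**op_kwargs)`. The body of sleep below is an assumption, since the slides do not show it, and the delay is shortened so the snippet runs quickly.

```python
import time


def sleep(length_of_time):
    """Pause for `length_of_time` seconds, then return the value (assumed body)."""
    time.sleep(length_of_time)
    return length_of_time


# Each key in op_kwargs must match a parameter name of the callable.
op_kwargs = {'length_of_time': 0.01}

# Unpacking the dictionary is equivalent to sleep(length_of_time=0.01).
result = sleep(**op_kwargs)
```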
EmailOperator
Sends an email
Can include attachments
Requires the Airflow system to be configured with email server details
email_task = EmailOperator(
task_id='email_sales_report',
to='sales_manager@example.com',
subject='Automated Sales Report',
html_content='Attached is the latest sales report',
files=['latest_sales.xlsx'],
dag=example_dag
)
DAG Runs
A specific instance of a workflow at a point in time
failed
success
1 https://airflow.apache.org/docs/stable/scheduler.html
end_date - Optional attribute for when to stop running new DAG instances
An asterisk * represents running for every interval (i.e., every minute, every day, etc.)
@hourly 0 * * * *
@daily 0 0 * * *
@weekly 0 0 * * 0
@monthly 0 0 1 * *
@yearly 0 0 1 1 *
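The preset-to-cron mapping above can be sketched as a small lookup. The field order used here (minute, hour, day of month, month, day of week) is the standard cron layout; the names PRESETS, FIELDS, and describe_schedule are hypothetical helpers for illustration and are not part of Airflow.

```python
# Preset names and their cron equivalents, as listed above.
PRESETS = {
    '@hourly': '0 * * * *',
    '@daily': '0 0 * * *',
    '@weekly': '0 0 * * 0',
    '@monthly': '0 0 1 * *',
    '@yearly': '0 0 1 1 *',
}

# Standard cron field order.
FIELDS = ('minute', 'hour', 'day_of_month', 'month', 'day_of_week')


def describe_schedule(schedule):
    """Expand a preset if needed and label each of the five cron fields."""
    expression = PRESETS.get(schedule, schedule)
    parts = expression.split()
    if len(parts) != len(FIELDS):
        raise ValueError(f'expected 5 cron fields, got {len(parts)}')
    return dict(zip(FIELDS, parts))


weekly = describe_schedule('@weekly')
```

Here `weekly` labels minute and hour as 0 and day_of_week as 0, with an asterisk in the remaining fields.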
This means the earliest starting time to run the DAG is on February 26th, 2020
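Airflow triggers the first run only after one full schedule interval has passed since start_date. The arithmetic can be sketched as below, assuming a start_date of February 25th, 2020 and a daily interval; both values are assumptions chosen to match the date stated above, not settings shown in this section.

```python
from datetime import datetime, timedelta

# Assumed values, chosen for illustration only.
start_date = datetime(2020, 2, 25)
schedule_interval = timedelta(days=1)  # daily schedule

# The first run is scheduled once one full interval has passed after start_date.
earliest_run = start_date + schedule_interval
```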