You are on page 1of 12

How Airflow

works?

How the components work together

www.marclamberti.com
Architecture Overview (Single Node)

Queueing Is part of executors


System
Node

Executor Web
Scheduler
(1 - n) Server

Metastore
DB

www.marclamberti.com
Architecture Overview (Multi Nodes)
Becomes an external queue
Queueing component ( RabbitMQ )
System
Worker Node 1 Worker Node 2 Worker Node N Master Node

Executor Executor Executor Web


... Scheduler
(1 - N) (1 - N) (N) Server

Metastore
Worker Node ≠ Worker (process) DB

www.marclamberti.com
How Your Work Gets Done?

Dag?
Queueing
Dag System
Folder

Executor Web
Scheduler
(1 - n) Server

Metastore
DB
n - worker (process)

www.marclamberti.com
How Your Work Gets Done?

Queueing
Dag System
Folder

Executor Web
Scheduler
(1 - n) Server
gRun in DB
Create a Da

Metastore DagRun
DB
Running

www.marclamberti.com
How Your Work Gets Done?

Queueing
Dag System
Folder

Executor Web
Scheduler
(1 - n) Server

Schedule
es to run
TaskInstanc
Metastore DagRun TaskInstance
DB
Running Scheduled

www.marclamberti.com
How Your Work Gets Done?
TaskInstance
e Queueing
t anc
Dag a s kIns System Queued
dT
Folder Sen

Executor Web
Scheduler
(1 - n) Server

Metastore DagRun TaskInstance


stance
Send TaskIn DB
Running Queued

www.marclamberti.com
How Your Work Gets Done?

l l s out Queueing
pu
Dag c u tor tance System
Folder Exe skIns
Ta

the Executor Web


t i ng Scheduler
c u (1 - n) Server
er exe tance
rk s
Wo TaskIn

TaskInstance es
u p dat e Metastore DagRun TaskInstance
r nc
x e cuto kInsta DB
Running E Tas
the Running Running

www.marclamberti.com
How Your Work Gets Done?
Queueing
Dag System
Folder

Executor Web
Scheduler
(1 - n) Server

TaskInstance
he
a t es t Metastore DagRun TaskInstance
Success d
t o r up ance DB
c u st
Exe TaskIn
Running Success

www.marclamberti.com
How Your Work Gets Done?

Queueing
Dag System
Folder

Executor Web
Scheduler
(1 - n) Server

one?
rk d Metastore DagRun TaskInstance
Wo
DB
Success Success

www.marclamberti.com
How Your Work Gets Done?
Queueing
Dag System
Folder

Executor Web
Scheduler
(1 - n) Server

te UI Metastore DagRun TaskInstance


Upda
DB
Success Success

www.marclamberti.com
Sum up

1. The Scheduler reads the DAG folder.


2. Your DAG is parsed by a process to create a DagRun based on the scheduling parameters of your DAG.
3. A TaskInstance is instantiated for each Task that needs to be executed and flagged to “Scheduled” in the
metadata database.
4. The Scheduler gets all TaskInstances flagged “Scheduled” from the metadata database, changes the state to
“Queued” and sends them to the executors to be executed.
5. Executors pull out Tasks from the queue ( depending on your execution setup ), change the state from
“Queued” to “Running” and Workers start executing the TaskInstances.
6. When a Task is finished, the Executor changes the state of that task to its final state (success, failed, etc) in
the database and the DAGRun is updated by the Scheduler with the state “Success” or “Failed”. Of course,
the web server periodically fetch data from the metadatabase to update the UI.

www.marclamberti.com

You might also like