Professional Documents
Culture Documents
Introduction To Dataflow
Introduction To Dataflow
In this lesson you learn about Dataflow. Dataflow is a fully managed service for
transforming and enriching data in stream (real time) or batch (historical) modes. This
means you don't have to do complex workarounds or compromise your pipeline design.
Dataflow uses a serverless approach to resource provisioning and management. You can
access virtually limitless capacity to solve your biggest data processing challenges, while
paying only for what you use.
Data
Warehouse
Datastore
BigQuery
Dataflow
Predictive
Apache Avro Process Analytics
Vertex AI
Dataflow pipelines are either batch (processing bounded input like a file or database table) or
streaming (processing unbounded input from a source like Pub/Sub).
The lab you do in this lesson is a batch pipeline. In a later module you create a streaming
pipeline.