MODERN DATA ENGINEERING

(Job Oriented Elective)

Course Code: 20CS11Q3
L T P C
3 0 0 3
Course Outcomes: At the end of the course, students will be able to:
CO1: Understand data lake architectures and data engineering tools and services. (L2)
CO2: Explain architectures and pipelines to create data lakes. (L2)
CO3: Experiment with Delta Lake tables. (L3)
CO4: Build the data pipeline for the data curation stage. (L2)
CO5: Develop gold layer for data aggregation to meet customer expectations. (L3)

UNIT-I: (10 Lectures)


Discovering Storage and Compute Data Lakes: Introducing data lakes, Discovering data lake
architectures, Data Warehouse vs. Data Lakes.
Data Engineering on Microsoft Azure: Introducing data engineering in Azure, Performing data
engineering in Microsoft Azure: Self-managed data engineering services (IaaS), Azure-managed
data engineering services (PaaS), Data processing services in Microsoft Azure, Data engineering
as a service (SaaS), Data cataloging and sharing services in Microsoft Azure; Opening a free
account with Microsoft Azure. (Chapters 2, 3)

Learning Outcomes: At the end of the module, students will be able to:

1. Explain the concept of data lakes. (L2)
2. Describe data engineering in Azure. (L2)
3. Summarize data engineering services. (L2)

UNIT-II: (10 Lectures)


Understanding Data Pipelines: Exploring data pipelines, Process of creating a data pipeline,
Running a data pipeline, Sample lakehouse project.
Data Collection Stage – The Bronze Layer: Architecting the Electroniz data lake,
Understanding the bronze layer, Configuring data sources, Configuring data destinations,
Building the ingestion pipelines.

Learning Outcomes: At the end of the module, students will be able to:

1. Understand the bronze layer. (L2)
2. Describe the process of configuring data sources. (L2)
3. Explain how to build ingestion pipelines. (L2)

UNIT-III: (10 Lectures)

Understanding Delta Lake: Understanding how Delta Lake enables the lakehouse,
Understanding Delta Lake, Creating a Delta Lake table, Changing data in an existing Delta Lake
table, Performing time travel, Performing upserts of data, Understanding isolation levels,
Understanding concurrency control, Cleaning up Azure resources.

Learning Outcomes: At the end of the module, students will be able to:

1. Summarize the process of cleaning the raw data. (L2)
2. Create a Delta Lake table. (L3)
3. Illustrate isolation levels and concurrency control. (L2)

UNIT-IV: (10 Lectures)


Data Curation Stage – The Silver Layer: The need for curating raw data, The process of
curating raw data, Developing a data curation pipeline, Running the pipeline for the silver layer,
Verifying curated data in the silver layer, Cleaning up Azure resources. (Chapter 7)

Learning Outcomes: At the end of the module, students will be able to:

1. Explain the need for curating the data. (L2)
2. Outline the process of curating the data. (L2)
3. Develop the data curation pipeline. (L2)

UNIT-V: (10 Lectures)


Data Aggregation Stage – The Gold Layer: The need to aggregate data, The process of
aggregating data, Developing a data aggregation pipeline, Running the aggregation pipeline,
Understanding data consumption, Verifying aggregated data in the gold layer, Meeting customer
expectations. (Chapter 8)

Learning Outcomes: At the end of the module, students will be able to:

1. Explain the need to aggregate data. (L2)
2. Build a data aggregation pipeline. (L3)
3. Interpret verification of aggregated data in the gold layer. (L2)

Text Books:
1. Manoj Kukreja, Data Engineering with Apache Spark, Delta Lake, and Lakehouse, Packt
Publishing, 2021.

Reference Books:
1. Scott Haines, Modern Data Engineering with Apache Spark: A Hands-On Guide for
Building Mission-Critical Streaming Applications, Apress, 2022.

Web References:
1. https://www.coursera.org/learn/introduction-to-data-engineering
2. https://www.coursera.org/professional-certificates/microsoft-azure-dp-203-data-engineering
3. https://aws.amazon.com/compare/the-difference-between-a-data-warehouse-data-lake-and-data-mart/
