AZURE DATA FACTORY
Contents
AZURE DATA FACTORY
Objective
Tools & Environment
For Creating a Data Factory & Exploring the UI
For Creating Linked Services
Azure Data Factory (ADF) – Conceptual Overview
Key Components
ADF UI Overview
Azure Blob Storage
Azure SQL Database
Data Flow in This Project
Task
Value Creator
Azure Data Factory (ADF)
Tasks
a) Create a data factory and explore the UI (Author, Monitor, Manage tabs)
ADF UI Overview – Three Main Tabs
b) Create Linked Services: Azure Blob Storage & Azure SQL Server
Step 1
Connector configuration details
c) Create datasets for CSV files (Blob) and SQL tables
d) Build a simple pipeline that copies data from Blob Storage to SQL Server
e) Trigger a pipeline manually and monitor the run
Objective:
This task supports my Performance Improvement Plan (PIP) by
enhancing my skills in cloud-based data integration using Azure Data
Factory and Azure SQL Database. The goal is to design a simple ETL
pipeline that moves data from Azure Blob Storage to a relational database
and apply T-SQL queries for data manipulation and analysis.
Tools & Environment:
For Creating a Data Factory & Exploring the UI
Azure Portal – Web-based interface to create and manage Azure
services
Azure Data Factory (ADF) – Cloud-based data integration service
For Creating Linked Services
Azure Data Factory Studio – UI to create linked services, datasets,
and pipelines
Azure Blob Storage – Used as a source or destination for files
Azure SQL Database – Cloud-based relational database
Authentication Methods – Account Key for Blob Storage, SQL
Authentication for Azure SQL
Azure Data Factory (ADF) – Conceptual Overview:
Azure Data Factory is a cloud-based data integration service that allows
you to create, schedule, and manage data pipelines. It provides a code-free
and code-friendly environment for building complex data workflows.
Key Components:
1. Pipelines: Logical containers that define the data flow process.
2. Activities: Tasks performed in a pipeline, such as copying or
transforming data.
3. Datasets: Represent data structures (e.g., files, tables) used by
activities.
4. Linked Services: Define connections to external data sources or
sinks.
5. Integration Runtime (IR): The compute environment for executing
activities.
ADF UI Overview
1. Author Tab
o Used to create and manage pipelines, datasets, and data flows.
o Includes a visual designer and code view.
o Supports drag-and-drop activities for building workflows.
2. Monitor Tab
o Used to track pipeline executions and monitor performance.
o Displays success/failure status, execution time, and error details.
3. Manage Tab
o Used to configure:
Linked services (e.g., Azure Blob, SQL DB)
Integration Runtimes
Triggers (schedules)
Global parameters
Azure Blob Storage
Azure Blob Storage is a scalable, cloud-based object storage system. In
this project, it serves as the source or landing zone for raw or semi-
structured data files.
Uses in this project:
Stores source files (e.g., CSV or JSON) that will be loaded into Azure
SQL.
Configured in ADF via a Linked Service using Storage Account keys or
SAS tokens.
Accessed using binary, delimited, or JSON dataset formats in ADF.
Azure SQL Database
Azure SQL Database is a fully managed relational database in the cloud.
It is used in this project as the destination (sink) for structured data.
Uses in this project:
1. Acts as a central storage point for cleaned and structured data.
2. Receives data loaded via ADF’s Copy Data activity.
3. Can be queried using T-SQL in tools like Azure Data Studio or SSMS.
4. Supports features like indexes, views, joins, constraints, and stored
procedures.
Data Flow in This Project
1. Source: File in Azure Blob Storage (e.g., Employee data in CSV).
2. ADF Pipeline:
o Connects to Blob using a Linked Service
o Defines a dataset pointing to the file
o Uses a Copy Activity to move data
o Connects to Azure SQL using another Linked Service
o Defines a target SQL dataset
3. Destination: Azure SQL Database table (e.g., Employee)
4. Post-load: Data is verified and queried using T-SQL
Task:
Value Creator
Azure Data Factory (ADF)
Tasks
a) Create a data factory and explore the UI (Author, Monitor,
Manage tabs).
Installations:
Azure Portal – Web-based interface to create and manage Azure
services
Azure Data Factory (ADF) – Cloud-based data integration service
Step 1: Sign in to Azure Portal
Open your browser and go to: https://portal.azure.com
Sign in with your Azure credentials.
Step 2: Create a Data Factory
1. In the search bar at the top, type “Data Factory” and select it.
2. Click “+ Create”.
3. Fill in the required fields:
o Subscription: Choose your subscription.
o Resource Group: Select existing or create a new one.
o Region: Select a region (e.g., East US).
o Name: Provide a unique name for the Data Factory.
4. Select Version V2.
5. Click Review + Create, then Create.
6. Wait for deployment to complete, then click Go to resource.
Step 3: Exploring the Azure Data Factory UI
Once the deployment is complete:
1. Go to the Data Factory instance you created.
2. Click on "Launch Studio" – this opens the ADF UI.
ADF UI Overview – Three Main Tabs
1. Author Tab
This is where you design and build your data pipelines.
Sections:
o Pipelines: Create ETL/ELT workflows.
o Datasets: Define data structures (input/output).
o Linked services: Connections to data sources like Azure Blob,
SQL, etc.
o Data flows: For visually transforming data (mapping data flows).
o Triggers: Schedule or event-based pipeline executions.
Actions:
o Click + (Add resource) to create a pipeline, dataset, data flow,
etc.
o Use the drag-and-drop canvas to build pipeline workflows.
2. Monitor Tab
Used to track pipeline execution and debug issues.
Sections:
o Pipeline runs: View history of pipeline executions (status,
duration, etc.).
o Trigger runs: Monitor trigger-based executions.
o Integration runtimes: View status of your compute
environment.
Actions:
o Click on a failed pipeline to view activity details and
troubleshoot.
o Filter logs by status, date, or name.
3. Manage Tab
Used for configuration and administration.
Sections:
o Linked services: Add or manage connections to external
systems.
o Integration runtimes: Manage self-hosted or Azure-hosted
compute.
o Triggers: Create/edit triggers.
o Git configuration: Integrate with Git for source control.
Actions:
o Set up self-hosted IR for on-premises connectivity.
o Configure Git repository to track changes in your pipelines.
What is a Linked Service?
A Linked Service in ADF is like a connection string. It defines the connection
information needed for ADF to connect to external resources (e.g.,
databases, storage).
b) Create Linked Services: Azure Blob Storage & Azure SQL Server
Step 1: Launch Azure Data Factory Studio
1. Go to https://portal.azure.com
2. Open your Data Factory resource.
3. Click "Launch Studio".
Part 1: Create Linked Service for Azure Blob Storage
Steps:
1. Go to the Manage tab (gear icon on the left).
2. Under Connections, click Linked services.
3. Click + New.
4. In the New linked service pane:
o Search and select Azure Blob Storage.
5. Click Continue.
Configuration options:
Name: e.g., AzureBlobStorage1
Authentication method: Choose from:
o Account key (simplest for testing)
o Managed Identity (recommended for production)
o SAS token or Service Principal
Storage account name: Select your Blob Storage account.
Connector configuration details
The following sections provide details about properties that are used to
define Data Factory and Synapse pipeline entities specific to Blob storage.
Linked service properties
This Blob storage connector supports the following authentication types. See
the corresponding sections for details.
1. Anonymous authentication
2. Account key authentication
3. Shared access signature authentication
4. Service principal authentication
5. System-assigned managed identity authentication
6. User-assigned managed identity authentication
1. Anonymous authentication
The following properties are supported for anonymous authentication in
Azure Data Factory or Synapse pipelines:
type – The type property must be set to AzureBlobStorage (suggested) or
AzureStorage (see the following notes). (Required: Yes)
containerUri – Specify the URI of the Azure Blob container that has
anonymous read access enabled, in the format
https://<AccountName>.blob.core.windows.net/<ContainerName>, and configure
anonymous public read access for containers and blobs. (Required: Yes)
connectVia – The integration runtime to be used to connect to the data
store. You can use the Azure integration runtime or the self-hosted
integration runtime (if your data store is in a private network). If this
property isn't specified, the service uses the default Azure integration
runtime. (Required: No)
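A representative linked service definition for anonymous access might look
like the following; this is only a sketch based on the properties above,
and the linked service name, account name, and container name are
placeholders:
{
    "name": "AzureBlobStorageAnonymous",
    "properties": {
        "type": "AzureBlobStorage",
        "typeProperties": {
            "containerUri": "https://<AccountName>.blob.core.windows.net/<ContainerName>"
        },
        "connectVia": {
            "referenceName": "<name of Integration Runtime>",
            "type": "IntegrationRuntimeReference"
        }
    }
}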
2. Account key authentication
The following properties are supported for storage account key
authentication in Azure Data Factory or Synapse pipelines:
type – The type property must be set to AzureBlobStorage (suggested) or
AzureStorage (see the following notes). (Required: Yes)
connectionString – Specify the information needed to connect to Storage
for the connectionString property. You can also put the account key in
Azure Key Vault and pull the accountKey configuration out of the
connection string. For more information, see the following samples and the
Store credentials in Azure Key Vault article. (Required: Yes)
connectVia – The integration runtime to be used to connect to the data
store. You can use the Azure integration runtime or the self-hosted
integration runtime (if your data store is in a private network). If this
property isn't specified, the service uses the default Azure integration
runtime. (Required: No)
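JSON CODE:
A representative linked service definition using account key
authentication, based on the property table above; the linked service name
and the connection string values are illustrative placeholders:
{
    "name": "AzureBlobStorageLinkedService",
    "properties": {
        "type": "AzureBlobStorage",
        "typeProperties": {
            "connectionString": "DefaultEndpointsProtocol=https;AccountName=<accountname>;AccountKey=<accountkey>"
        },
        "connectVia": {
            "referenceName": "<name of Integration Runtime>",
            "type": "IntegrationRuntimeReference"
        }
    }
}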
Using Account key authentication:
Click “Test connection” (to verify access).
Click Create.
Part 2: Create Linked Service for Azure SQL Server
Steps:
1. In the Linked services page, click + New again.
2. Search and select Azure SQL Database or SQL Server (based on
your setup).
3. Click Continue.
Configuration options:
Name: e.g., AzureSQLDBLS
Server name: e.g., yourserver.database.windows.net
Database name: Enter your DB name.
Authentication type: Choose from:
o SQL Authentication (username/password)
o Managed Identity
Username: SQL admin username.
Password: SQL password (stored securely).
Encrypted connection: Usually enabled.
4. Click “Test connection”, then create once it succeeds.
SQL authentication
To use SQL authentication, in addition to the generic properties that are
described in the preceding section, specify the following properties:
userName – The user name used to connect to the server. (Required: Yes)
password – The password for the user name. Mark this field as SecureString
to store it securely. (Required: Yes)
JSON CODE
"name": "AzureSqlDbLinkedService",
"properties": {
"type": "AzureSqlDatabase",
"typeProperties": {
"server": "<name or network address of the SQL server instance>",
"database": "<database name>",
"encrypt": "<encrypt>",
"trustServerCertificate": false,
"authenticationType": "SQL",
"userName": "<user name>",
"password": {
"type": "SecureString",
"value": "<password>"
},
"connectVia": {
"referenceName": "<name of Integration Runtime>",
"type": "IntegrationRuntimeReference"
}
c) Create datasets for CSV files (Blob) and SQL tables.
First, the linked services for Blob Storage and SQL Server were created
(see section b). After that:
Step 1: Create Dataset for CSV File (Blob Storage)
1. Go to the Author tab (pencil icon).
2. Expand Datasets > Click + > Add dataset.
3. Select Azure Blob Storage > DelimitedText > Click Continue.
4. Provide:
o Name: DS_CSVInput
o Linked Service: LS_BlobStorage
o File path: e.g., container-name/folder/filename.csv
5. Set:
o Column delimiter: Comma (,), or other if needed.
o First row as header: Check this box if applicable.
o Import schema or leave as none (you can define schema
manually).
6. Click OK or Create.
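Behind the UI, the dataset is stored as JSON. A minimal sketch of the CSV
dataset described above; the container, folder, and file names are
placeholders, and the delimiter and header settings mirror the options
chosen in step 5:
{
    "name": "DS_CSVInput",
    "properties": {
        "type": "DelimitedText",
        "linkedServiceName": {
            "referenceName": "LS_BlobStorage",
            "type": "LinkedServiceReference"
        },
        "typeProperties": {
            "location": {
                "type": "AzureBlobStorageLocation",
                "container": "<container-name>",
                "folderPath": "<folder>",
                "fileName": "<filename>.csv"
            },
            "columnDelimiter": ",",
            "firstRowAsHeader": true
        },
        "schema": []
    }
}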
Create Dataset for SQL Table
1. Again, click + > Add dataset.
2. Select Azure SQL Database (or SQL Server) > Click Continue.
3. Provide:
o Name: DS_SQLTarget
o Linked Service: LS_SQLServer
o Table name: Browse or type (e.g., dbo.Employees)
4. Import schema or define manually.
5. Click OK.
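Similarly, a minimal sketch of the SQL table dataset as JSON; the schema
and table names assume the dbo.Employees example above:
{
    "name": "DS_SQLTarget",
    "properties": {
        "type": "AzureSqlTable",
        "linkedServiceName": {
            "referenceName": "LS_SQLServer",
            "type": "LinkedServiceReference"
        },
        "typeProperties": {
            "schema": "dbo",
            "table": "Employees"
        }
    }
}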
d) Build a simple pipeline that copies data from Blob Storage to SQL
Server.
Create the Pipeline
Go to Author > Pipelines
Click New Pipeline
Drag a Copy Data activity onto the canvas
Configure:
o Source: Select Blob dataset
o Sink: Select SQL Server dataset
o Optionally: Configure mappings, pre/post SQL, etc.
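The published pipeline is also stored as JSON. A minimal sketch of a
pipeline with a single Copy activity wired to the two datasets created
earlier; the pipeline and activity names are illustrative, and the
source/sink settings shown are typical defaults rather than a definitive
configuration:
{
    "name": "CopyBlobToSqlPipeline",
    "properties": {
        "activities": [
            {
                "name": "CopyEmployeeCsvToSql",
                "type": "Copy",
                "inputs": [
                    { "referenceName": "DS_CSVInput", "type": "DatasetReference" }
                ],
                "outputs": [
                    { "referenceName": "DS_SQLTarget", "type": "DatasetReference" }
                ],
                "typeProperties": {
                    "source": {
                        "type": "DelimitedTextSource",
                        "storeSettings": { "type": "AzureBlobStorageReadSettings" },
                        "formatSettings": { "type": "DelimitedTextReadSettings" }
                    },
                    "sink": { "type": "AzureSqlSink" }
                }
            }
        ]
    }
}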
Run & Monitor the Pipeline
Publish the pipeline
Trigger the pipeline manually or on a schedule
Go to Monitor to check execution status and logs
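If the pipeline should run on a schedule rather than manually, a trigger
can be defined from the Manage tab. A minimal sketch of a daily schedule
trigger; the trigger name, start time, and pipeline reference are
illustrative placeholders:
{
    "name": "DailyTrigger",
    "properties": {
        "type": "ScheduleTrigger",
        "typeProperties": {
            "recurrence": {
                "frequency": "Day",
                "interval": 1,
                "startTime": "2025-01-01T00:00:00Z",
                "timeZone": "UTC"
            }
        },
        "pipelines": [
            {
                "pipelineReference": {
                    "referenceName": "CopyBlobToSqlPipeline",
                    "type": "PipelineReference"
                }
            }
        ]
    }
}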
Open Your Pipeline
Go to the Author tab (pencil icon on the left)
Find your pipeline under Pipelines
Click to open the pipeline
Trigger the Pipeline Manually
With the pipeline open, click "Add Trigger" (top menu)
Select "Trigger Now"
If your pipeline has parameters, a dialog will pop up—fill in required
values
Click "OK" to trigger the pipeline run
e) Trigger a pipeline manually and monitor the run.
Monitor the Pipeline Run
Switch to the Monitor tab (clock icon on the left)
You’ll see a list of pipeline runs
Find your pipeline by name and click on the latest Run ID
This shows details such as:
o Status: Succeeded, Failed, In Progress
o Start/End Time
o Activities: Individual steps with status
o Output & Logs
View Activity-Level Details
Click on the Copy Data activity in the pipeline run
You'll see:
o Input/output datasets
o Number of rows read/written
o Any error messages if the run failed
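For illustration, the Copy Data activity output shown in the Monitor tab is
JSON. An abbreviated, hypothetical example of the kind of fields it
reports; the exact fields and values vary by source, sink, and run:
{
    "dataRead": 10240,
    "dataWritten": 10240,
    "rowsRead": 100,
    "rowsCopied": 100,
    "copyDuration": 12,
    "errors": []
}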