------------------------About Project Architecture:------------------------
1. Get Metadata Activity
2. Copy Activity
3. Lookup Activity
4. ForEach Activity
5. Stored Procedure Activity
6. Web Activity
7. Dataflows
------------------------Incremental loading of data in pipeline:------------------------
We maintain one control table in Azure SQL Database.
This control table contains every pipeline name and the last run time of each pipeline.
On each run, the pipeline reads its last run time and picks up only the records in the
Oracle source tables that were created or modified after that time,
then writes them to Azure Data Lake Storage Gen2 as CSV files.
PolyBase external tables are then defined over those files.
We compare the target tables with these source external tables to identify
which records are new and which are modified,
then insert the new records and update the existing records in the Synapse data
warehouse.
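The watermark logic above can be sketched in a few lines of Python. This is only an illustration, not the project's actual code: the pipeline name `pl_orders`, the table `ORDERS`, the column `LAST_MODIFIED`, and the key `id` are all hypothetical, and the control table is stood in for by a plain dictionary.

```python
from datetime import datetime

# Hypothetical stand-in for the control table: pipeline name -> last run time.
control_table = {"pl_orders": datetime(2024, 1, 1, 0, 0, 0)}


def build_incremental_query(table, modified_column, last_run):
    """Build an Oracle query that selects only rows changed after last_run."""
    ts = last_run.strftime("%Y-%m-%d %H:%M:%S")
    return (
        f"SELECT * FROM {table} "
        f"WHERE {modified_column} > TO_TIMESTAMP('{ts}', 'YYYY-MM-DD HH24:MI:SS')"
    )


def classify_changes(source_rows, target_rows, key):
    """Split source rows into new rows (insert) and modified rows (update)."""
    target_by_key = {row[key]: row for row in target_rows}
    new_rows, modified_rows = [], []
    for row in source_rows:
        existing = target_by_key.get(row[key])
        if existing is None:
            new_rows.append(row)        # key not in target: insert
        elif existing != row:
            modified_rows.append(row)   # key exists but values differ: update
    return new_rows, modified_rows


query = build_incremental_query("ORDERS", "LAST_MODIFIED", control_table["pl_orders"])
```

In the real pipeline the comparison happens in Synapse between the target table and the external table; the `classify_changes` function only mirrors that insert-or-update decision in memory.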
------------------------Integration runtimes in ADF:------------------------
Integration Runtime (IR) is the compute infrastructure used by Azure Data Factory
to provide data integration capabilities across different network environments.
There are three kinds of integration runtime: Azure, self-hosted, and Azure-SSIS.
-----------------------------------------------Triggers:--------------------------
To run Azure Data Factory pipelines, we can use any one of the trigger types.
There are three kinds of trigger in ADF: schedule, tumbling window, and event-based.
------------------------Limitations:------------------------
You cannot place a ForEach activity inside another ForEach activity.
You cannot put a ForEach activity or Switch activity inside an If activity.
You cannot use a Set Variable activity inside a ForEach activity that runs in parallel.
You cannot nest If activities.
You cannot nest Switch activities.
You cannot put a ForEach or If activity inside a Switch activity.
You cannot use an expression to populate the pipeline in an Execute Pipeline activity.
The Lookup activity returns a maximum of 5,000 rows and a maximum size of 4 MB.
A pipeline can contain a maximum of 40 activities.
There is a limit of 100 queued runs per pipeline and 1,000 concurrent pipeline activity
runs per subscription per Azure Integration Runtime region.
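As a rough aid, the nesting rules listed above can be checked mechanically. The sketch below assumes a simplified, hypothetical activity tree in which each activity is a dictionary with `name`, `type`, and an optional `children` list; a real pipeline definition stores child activities under different properties. It only encodes the rules as stated in this section and checks direct parent and child pairs.

```python
# Container type -> child types this section says cannot be placed directly inside it.
# "IfCondition" is the type name of the If activity.
FORBIDDEN_CHILDREN = {
    "ForEach": {"ForEach"},
    "IfCondition": {"ForEach", "Switch", "IfCondition"},
    "Switch": {"ForEach", "IfCondition", "Switch"},
}


def find_nesting_violations(activities, parent=None):
    """Return a list of messages for every forbidden parent/child pair."""
    violations = []
    for activity in activities:
        if parent is not None:
            forbidden = FORBIDDEN_CHILDREN.get(parent["type"], set())
            if activity["type"] in forbidden:
                violations.append(
                    f'{activity["name"]} ({activity["type"]}) inside '
                    f'{parent["name"]} ({parent["type"]})'
                )
        # "children" is a hypothetical key used only in this sketch.
        violations.extend(find_nesting_violations(activity.get("children", []), activity))
    return violations


example_pipeline = [
    {
        "name": "OuterLoop",
        "type": "ForEach",
        "children": [
            {"name": "InnerLoop", "type": "ForEach"},
            {"name": "CopyOne", "type": "Copy"},
        ],
    }
]
problems = find_nesting_violations(example_pipeline)
```

A common workaround for a forbidden nesting is to move the inner container into a child pipeline and call it with an Execute Pipeline activity.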
------------------------Performance Tuning:------------------------
For Copy activities on the Azure integration runtime, increase the Data Integration
Units (DIUs).
For a self-hosted integration runtime, increase the number of nodes (machines).
Set the parallel copy option on the Copy activity.
Use Azure Blob Storage or Azure Data Lake Storage Gen2 as a staging store when
loading from on-premises to the cloud.
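These tuning options map to a handful of Copy activity properties. The sketch below builds that fragment as a Python dictionary so it can be serialized to JSON; the linked service name `StagingBlobStorage`, the path `staging/tmp`, and the numeric values are placeholders chosen for illustration, not recommendations.

```python
import json


def copy_activity_tuning(dius, parallel_copies, staging_linked_service, staging_path):
    """Return the performance-related part of a Copy activity's typeProperties."""
    return {
        "dataIntegrationUnits": dius,        # applies to the Azure integration runtime
        "parallelCopies": parallel_copies,   # the parallel copy option
        "enableStaging": True,               # stage through Blob Storage or ADLS Gen2
        "stagingSettings": {
            "linkedServiceName": {
                "referenceName": staging_linked_service,
                "type": "LinkedServiceReference",
            },
            "path": staging_path,
        },
    }


tuning = copy_activity_tuning(
    dius=16,
    parallel_copies=8,
    staging_linked_service="StagingBlobStorage",  # hypothetical linked service
    staging_path="staging/tmp",                   # hypothetical container/folder
)
tuning_json = json.dumps(tuning, indent=2)
```

The number of self-hosted integration runtime nodes is not part of the activity definition; it is configured on the integration runtime itself.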
------------------------Copying Data dynamically:------------------------
Search for Webhook in the pipeline Activities pane, and drag a Webhook activity to
the pipeline canvas.
Select the new Webhook activity on the canvas if it is not already selected, and its
Settings tab, to edit its details.
Specify a URL for the webhook, which can be a literal URL string, or any
combination of dynamic expressions, functions, system variables, or outputs from
other activities. Provide other details to be submitted with the request.
Use the output from the activity as the input to any other activity, and reference
the output anywhere dynamic content is supported in the destination activity.
Web activity
Web Activity can be used to call a custom REST endpoint from an Azure Data Factory
or Synapse pipeline.
You can pass datasets and linked services to be consumed and accessed by the
activity.
Search for Web in the pipeline Activities pane, and drag a Web activity to the
pipeline canvas.
Select the new Web activity on the canvas if it is not already selected, and its
Settings tab, to edit its details.
Specify a URL, which can be a literal URL string, or any combination of dynamic
expressions, functions, system variables, or outputs from other activities. Provide
other details to be submitted with the request.
Use the output from the activity as the input to any other activity, and reference
the output anywhere dynamic content is supported in the destination activity.
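For reference, a Web activity configured this way corresponds to a small JSON definition. The sketch below assembles one as a Python dictionary; the activity name `CallStatusApi` and the URL are made-up placeholders, and only the basic `url`, `method`, `headers`, and `body` properties are shown.

```python
def web_activity(name, url, method="GET", headers=None, body=None):
    """Build a minimal Web activity definition as a dictionary."""
    type_properties = {"url": url, "method": method}
    if headers:
        type_properties["headers"] = headers
    if body is not None:
        type_properties["body"] = body
    return {"name": name, "type": "WebActivity", "typeProperties": type_properties}


def output_expression(activity_name):
    """Expression another activity can use to read this activity's output."""
    return f"@activity('{activity_name}').output"


# A literal URL; a dynamic one would instead be an expression object such as
# {"value": "@pipeline().parameters.apiUrl", "type": "Expression"}.
activity = web_activity(
    name="CallStatusApi",                      # hypothetical activity name
    url="https://example.com/api/status",      # placeholder endpoint
    method="POST",
    headers={"Content-Type": "application/json"},
    body={"runId": "@pipeline().RunId"},
)
reference = output_expression("CallStatusApi")
```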
Search for Lookup in the pipeline Activities pane, and drag a Lookup activity to
the pipeline canvas.
Select the new Lookup activity on the canvas if it is not already selected, and its
Settings tab, to edit its details.
Choose an existing source dataset or select the New button to create a new one.
The options for identifying rows to include from the source dataset vary with the
dataset type. For example, a delimited text dataset, an Azure SQL table dataset, and
an OData dataset each show different configuration options.
Note the following:
The Lookup activity can return up to 5,000 rows; if the result set contains more
records, only the first 5,000 rows are returned.
The Lookup activity output supports up to 4 MB in size; the activity fails if the
size exceeds the limit.
The longest duration for a Lookup activity before timeout is 24 hours.
When you use a query or stored procedure to look up data, make sure it returns
exactly one result set. Otherwise, the Lookup activity fails.
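The row and size limits are easier to reason about with a small simulation. The sketch below is not the service's implementation: it assumes the output size can be approximated by the length of the JSON-serialized output, and it mirrors the two output shapes, `firstRow` when only the first row is requested and `count` plus `value` otherwise.

```python
import json

MAX_ROWS = 5000
MAX_BYTES = 4 * 1024 * 1024  # 4 MB


def simulate_lookup(rows, first_row_only=True):
    """Mimic the shape and limits of a Lookup activity's output."""
    if first_row_only:
        output = {"firstRow": rows[0]} if rows else {}
    else:
        limited = rows[:MAX_ROWS]  # rows beyond the first 5,000 are dropped
        output = {"count": len(limited), "value": limited}
    # Assumption: serialized JSON length stands in for the real output size.
    if len(json.dumps(output).encode("utf-8")) > MAX_BYTES:
        raise ValueError("Lookup output exceeds the 4 MB limit")
    return output


rows = [{"id": i} for i in range(6000)]
single = simulate_lookup(rows)                       # first row only
many = simulate_lookup(rows, first_row_only=False)   # truncated to 5,000 rows
```

Downstream activities would read these values with expressions such as `@activity('LookupName').output.firstRow` or `@activity('LookupName').output.value`, where `LookupName` is whatever the Lookup activity is called.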
Search for Get Metadata in the pipeline Activities pane, and drag a Get Metadata
activity to the pipeline canvas.
Select the new Get Metadata activity on the canvas if it is not already selected,
and its Dataset tab, to edit its details.
Choose a dataset, or create a new one with the New button. Then you can specify
filter options and add columns from the available metadata for the dataset.
Use the output of the activity as an input to another activity, such as a Switch
activity. You can reference the output of the Get Metadata activity anywhere
dynamic content is supported in the other activity.
In the dynamic content editor, select the Get Metadata activity output to reference
it in the other activity.
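To make the shape of that output concrete, the sketch below works on a hand-written sample of what a Get Metadata activity returns when the childItems field is requested for a folder: a list of entries with a `name` and a `type` of `File` or `Folder`. The file names are invented, and the filtering function is just one example of what a downstream activity might do with the output.

```python
# Hand-written sample output; in a pipeline this would come from
# @activity('Get Metadata1').output, where the activity name is an assumption.
sample_output = {
    "childItems": [
        {"name": "orders_2024.csv", "type": "File"},
        {"name": "archive", "type": "Folder"},
        {"name": "readme.txt", "type": "File"},
        {"name": "CUSTOMERS.CSV", "type": "File"},
    ]
}


def csv_file_names(get_metadata_output):
    """Return the names of CSV files listed in a Get Metadata output."""
    return [
        item["name"]
        for item in get_metadata_output.get("childItems", [])
        if item["type"] == "File" and item["name"].lower().endswith(".csv")
    ]


csv_files = csv_file_names(sample_output)
```

In a pipeline, a list like this is typically passed to a Filter or ForEach activity rather than processed in code.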
5. What is the Until activity?
It executes a set of activities in a loop until the condition associated with the
activity evaluates to true.
Search for Until in the pipeline Activities pane, and drag an Until activity
to the pipeline canvas.
Select the Until activity on the canvas if it is not already selected, and its
Settings tab, to edit its details.
Enter an expression that will be evaluated after all child activities defined in
the Until activity are executed.
If the expression evaluates to false, the Until activity will execute all its child
activities again.
When it evaluates to true, the Until activity will complete.
The expression can be a literal string expression, or any combination of dynamic
expressions, functions, system variables, or outputs from other activities. For
example, it can check whether a previously defined pipeline array variable
called TestVariable equals ['done'].
Define the activities that the Until activity will execute by selecting the Edit
Activities button on the Until activity directly, or by selecting the Activities
tab. A new activities editor pane is displayed where you can add any activities
for the Until activity to execute. In this example, a Set Variable activity sets
the TestVariable variable referenced in the expression to ['done'], so the
expression is true the first time it is evaluated and the Until activity stops.
In real-world use, you can check any condition you need; the Until activity keeps
executing its child activities, re-evaluating the expression after each pass,
until the condition is met.
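The Until activity behaves like a do-while loop: the child activities run first, and the expression is checked afterwards. The sketch below simulates that order in plain Python. The `max_iterations` guard is an assumption added to keep the sketch from looping forever (the real activity relies on a timeout), and the child activity is modelled as an ordinary function that mutates a dictionary of variables.

```python
def run_until(child_activities, condition, variables, max_iterations=100):
    """Run child activities, then test the condition; repeat until it is true."""
    iterations = 0
    while True:
        for activity in child_activities:
            activity(variables)            # execute every child activity
        iterations += 1
        if condition(variables):           # evaluated after the children run
            return iterations
        if iterations >= max_iterations:   # sketch-only safety guard
            raise TimeoutError("condition never became true")


def set_test_variable(variables):
    """Stand-in for the Set Variable activity in the example."""
    variables["TestVariable"] = ["done"]


pipeline_variables = {"TestVariable": []}
passes = run_until(
    child_activities=[set_test_variable],
    condition=lambda v: v["TestVariable"] == ["done"],
    variables=pipeline_variables,
)
```

Because the variable is set on the first pass, the loop body runs exactly once here, which matches the behaviour described for the example.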
You can use any array type variable or outputs from other activities as the input
for your ForEach activity. To create an array variable, select the background of
the pipeline canvas and then select the Variables tab to add an array type variable.
Search for ForEach in the pipeline Activities pane, and drag a ForEach activity to
the pipeline canvas.
Select the new ForEach activity on the canvas if it is not already selected, and
its Settings tab, to edit its details.
Select the Items field and then select the Add dynamic content link to open the
dynamic content editor pane.
Select the input array to iterate over in the dynamic content editor. In this
example, we select the variable created in the first step.
Select the Activities editor on the ForEach activity to add one or more activities
to be executed for each item in the input Items array.
In any activities you create within the ForEach activity, you can reference the
current item the ForEach activity is iterating through from the Items list. You can
reference the current item anywhere you can use a dynamic expression to specify a
property value. In the dynamic content editor, select the ForEach iterator to
return the current item.
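The iteration itself can be pictured as follows. This sketch treats the inner activity as a Python function that receives the current item, which is what the `@item()` expression returns inside a ForEach activity. The `is_sequential` and `batch_count` parameters are modelled on the activity's sequential and batch count settings; the thread pool is only an approximation of how parallel iterations are scheduled.

```python
from concurrent.futures import ThreadPoolExecutor


def run_for_each(items, inner_activity, is_sequential=True, batch_count=20):
    """Call inner_activity once per item, sequentially or in parallel."""
    if is_sequential:
        return [inner_activity(item) for item in items]
    # Parallel mode: at most batch_count iterations run at the same time.
    with ThreadPoolExecutor(max_workers=batch_count) as pool:
        return list(pool.map(inner_activity, items))  # results keep input order


def copy_table(item):
    """Stand-in for an inner activity that uses the current item."""
    return f"copied {item}"


table_names = ["customers", "orders", "products"]  # hypothetical array variable
sequential_results = run_for_each(table_names, copy_table)
parallel_results = run_for_each(table_names, copy_table, is_sequential=False, batch_count=2)
```

The inner function here only returns a value and shares no state, which is why running it in parallel is safe; updating a shared pipeline variable from parallel iterations is the case the Limitations section warns about.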