
Experiment no. 3
(Part A)
Aim:- To perform ETL operations on an Excel file using the Talend Open Source tool for Data
Integration
Student Name:- Khushi Jain
Roll No.:- A218
Performance Date:- 04/08/2020
Submission Date:- 04/08/2020

PART B
Objective:

• To use Talend's file input component (e.g. tFileInputExcel), tLogRow, and a file output component (e.g. tFileOutputExcel).
• To understand the use of these components in ETL.
• To implement an ETL operation on an Excel file.

Experiment Outcome:

• Successfully implemented the ETL operation on the given data file.

Input: (Copy and paste input here)


Output: (Copy and paste output here)
Conclusion:-

• ETL processes can be performed with Talend.
• An output file was generated from the input Excel file, with the help of the tLogRow
component, which prints the rows flowing through the Job to the console.

Questions:-
1. Why use Talend over other ETL tools available in the market?
Talend is considered a next-generation leader in cloud and Big Data integration
software. It helps companies make real-time decisions and become more data-driven.
With it, data becomes more accessible, its quality improves, and it can be
moved quickly to the target systems.

2. Describe the ETL Process.


ETL is a process in Data Warehousing and it stands for Extract, Transform and Load. It is a
process in which an ETL tool extracts the data from various data source systems, transforms
it in the staging area and then finally, loads it into the Data Warehouse system.
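
The three stages described above can be sketched in a few lines of Python. This is only an illustration of the idea, not what Talend generates: the sample data, the `results` table, and the function names `extract`, `transform` and `load` are all assumptions made for the example, and an in-memory SQLite database stands in for the Data Warehouse.

```python
import csv
import io
import sqlite3

# Hypothetical source data standing in for the input spreadsheet.
RAW = """name,marks
 asha ,78
ravi,91
meena,
"""

def extract(text):
    """Extract: read rows from the source into dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))

def transform(rows):
    """Transform: clean names, cast marks to int, drop incomplete rows."""
    cleaned = []
    for row in rows:
        marks = row["marks"].strip()
        if not marks:
            continue  # skip rows with a missing value
        cleaned.append((row["name"].strip().title(), int(marks)))
    return cleaned

def load(rows, conn):
    """Load: write the cleaned rows into the target table."""
    conn.execute("CREATE TABLE IF NOT EXISTS results (name TEXT, marks INTEGER)")
    conn.executemany("INSERT INTO results VALUES (?, ?)", rows)
    conn.commit()

conn = sqlite3.connect(":memory:")  # in-memory stand-in for a warehouse
load(transform(extract(RAW)), conn)
for row in conn.execute("SELECT name, marks FROM results ORDER BY name"):
    print(row)
```

Here the list returned by `transform` plays the role of the staging area: the data is cleaned there before anything is written to the target.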

3. What are the advantages of using Talend?


• A graphical integrated development environment with an intuitive Eclipse-based
interface
• Drag-and-drop job design
• A unified repository for storing and reusing metadata
• The broadest data connectivity support of any data integration platform, with more
than 900 components and built-in connectors that let you quickly bridge between
databases, mainframes, file systems, web services, packaged enterprise applications, data
warehouses, OLAP applications, Software-as-a-Service and Cloud-based applications, and
more
• Advanced ETL, Big Data, MDM & ESB functionality including string manipulations,
automatic lookup handling, and management of slowly changing dimensions
• Support for heavy data loads through Spark and MapReduce integration, with
conversion of existing Jobs in a single click.

4. State the differences between Built-In and Repository in Talend.
• Built-in: all information is stored locally in the Job. You can enter and edit all
information manually.
• Repository: all information is stored in the repository.
• Use Built-In for information that you only use once or very rarely.
• Use Repository for information that you want to use repeatedly in multiple
components or Jobs, such as a database connection.
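
As a loose analogy only (this is plain Python, not Talend's API, and every name in it is hypothetical), the difference resembles settings typed inline in one job versus settings looked up from a shared store:

```python
# Hypothetical shared metadata store, playing the role of the Repository.
REPOSITORY = {"sales_db": {"host": "db.example.com", "port": 5432}}

def job_with_builtin():
    # Built-in: settings are typed directly into this one job.
    return {"host": "localhost", "port": 5432}

def job_with_repository(name):
    # Repository: the job looks up shared settings by name.
    return REPOSITORY[name]

REPOSITORY["sales_db"]["port"] = 6543  # one edit to the shared entry
print(job_with_repository("sales_db"))  # sees the updated port
print(job_with_builtin())               # unaffected by the shared edit
```

A single change to the shared entry reaches every job that refers to it, which is why Repository suits information that is reused across Jobs.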

5. What is the difference between a Business Model and a Job Design in
Talend?
Talend's Business Models allow data integration project stakeholders to graphically
represent their needs regardless of the technical implementation requirements. Business
Models help the IT operation staff understand these expressed needs and translate them
into technical processes (Jobs).
A Job Design is the runnable layer of a Business Model. It is a graphical design of one or
more components connected together that allows you to set up and run dataflow
management processes. When you design a Job in Talend Studio, you can put in place data
integration actions using a library of technical components.
