You are on page 1of 5

ETL Testing Interview Questions and

Answers
What is ETL?

In data warehousing architecture, ETL is an important component, which manages


the data for any business process. ETL stands for Extract, Transform and Load.
Extract does the process of reading data from a database. Transform does the
converting of data into a format that could be appropriate for reporting and analysis.
While, load does the process of writing the data into the target database.

Explain what are the ETL testing operations includes?

ETL testing includes

 Verify whether the data is transforming correctly according to business


requirements
 Verify that the projected data is loaded into the data warehouse without any
truncation and data loss
 Make sure that ETL application reports invalid data and replaces with default
values
 Make sure that data loads at expected time frame to improve scalability and
performance
Mention what are the types of data warehouse applications and what is
the difference between data mining and data warehousing?

The types of data warehouse applications are

 Info Processing
 Analytical Processing
 Data Mining
Data mining can be define as the process of extracting hidden predictive
information from large databases and interpret the data while data
warehousing may make use of a data mine for analytical processing of the
data in a faster way. Data warehousing is the process of aggregating data
from multiple sources into one common repository.

What are the various tools used in ETL?


 Cognos Decision Stream
 Oracle Warehouse Builder
 Business Objects XI
 SAS business warehouse
 SAS Enterprise ETL server
What is fact? What are the types of facts?

It is a central component of a multi-dimensional model which contains the


measures to be analysed. Facts are related to dimensions.

Types of facts are

 Additive Facts
 Semi-additive Facts
 Non-additive Facts. For more details ETL testing Training
Explain what are Cubes and OLAP Cubes?

Cubes are data processing units comprised of fact tables and dimensions
from the data warehouse. It provides multi-dimensional analysis.

OLAP stands for Online Analytics Processing, and OLAP cube stores large
data in muti-dimensional form for reporting purposes. It consists of facts called
as measures categorized by dimensions.

What are the names of the layers in ETL?

The first layer in ETL is the source layer, and it is the layer where data lands.
The second layer is the integration layer where the data is stored after
transformation. The third layer is the dimension layer where the actual
presentation layer stands.

What is meant by snapshots?

Snapshots are the copies of the read-only data that is stored in the master
table.
What are views?

Views are built using the attributes of one or more tables. View with a single
table can be updated, but those with multiple tables cannot be updated. More
info ETL Training

What is meant by a materialized view log?

A materialized view log is the pre-computed table with aggregated or joined


data from the fact tables, as well as the dimension tables.

What is a materialized view?

A materialized view is an aggregate table.

What is the difference between PowerCenter and PowerMart?

PowerCenter processes large volumes of data, whereas Power Mart


processes small volumes of data.

What is the exact purpose of an ETL according to you?

It is actually very beneficial for the extracting of data and from the systems
that are based on legacy.

Can you define measures in a simple statement?

Well, they can be called as the number data which is generally based on the
columns and is generally present in a fact table by default. Learn more from
ETL Testing Certification

When you will make use of the Lookup Transformation?

It is one of the finest and in fact one of the very useful approaches in the ETL.
It simply makes sure that the users can get a related value from a table and
with the help of a column value that seems useless. In addition to this, it
simply makes sure of boosting the performance of a dimensions table which is
changing at a very slow rate. Also, there are situations when the records
already remain present in the table. Dealing with such issues can be made
possible with the help of Lookup transformation.

What do you understand by Data Purging?


There are needs and situations when the data needs to be deleted from the
data warehouse. It is a very daunting task to delete the data in bulk. The
Purging is an approach that can delete multiple files at the same time and
enable users to maintain speed as well as efficiency. A lot of extra space can
be created simply with this.

Can you tell something about Bus Schema?

Dimension identification is something that is very important in the ETL and the
same is largely handled by the Bus Schema.

Are your familiar with the Dynamic and the Static Cache?

When it comes to updating the master table, the dynamic cache can be opted.
Also, the users are free to use it for changing the dimensions. On the other
side, the users can simply manage the flat files through the Static Cache. It is
possible to deploy both the Dynamic and the Static Cache at the same time
depending on the task and the overall complexity of the final outcome.

What do you mean by staging area?

It is an area which is used when it comes to holding the information or the


data temporary on the server that controls the data warehouse. There are
certain steps that are included and the prime one among them is Surrogate
assignments.

What are the types of Partitioning you are familiar with?

There are two types of Partitioning that is common in ETL and they are:

1. Hash

2. Round-robin

Can you tell a few benefits of using the Data Reader Destination
Adapter?

There are ADO record sets which generally consists of columns and records.
When it comes to populating them in a simple manner, the Data Reader
Destination Adapter is very useful. It simply exposes the data flow and let the
users impose various restrictions on the data which is required in many cases.
What are joiner and lookup?

The joiner is used to join two or more tables to retrieve the data from tables.

Lookup

is used to check and compare the source table and the target table.

You might also like