
UNIT-2 ETL AND OLAP TECHNOLOGY

What is ETL – ETL Vs ELT – Types of Data warehouses - Data warehouse Design and Modeling -
Delivery Process - Online Analytical Processing (OLAP) - Characteristics of OLAP - Online
Transaction Processing (OLTP) Vs OLAP - OLAP operations- Types of OLAP- ROLAP Vs
MOLAP Vs HOLAP

No Questions RBT COs


1 Outline the three-layer architecture of an ETL cycle. U 2
• Extraction - The extraction layer is responsible for retrieving
data from various sources and extracting it into a format
suitable for further processing.
• Transformation - The transformation layer processes the
extracted data to meet the desired target format and quality
standards.
• Load - The loading layer focuses on storing the transformed
data into the target destination, typically a data warehouse,
database, or data mart.
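The three layers can be sketched as three small functions chained together. This is only a minimal illustration, not a production pipeline: the CSV text, the `orders` table, and the validation rule (drop rows with a non-numeric amount) are all assumptions made for the example, and an in-memory SQLite database stands in for the warehouse.

```python
import csv
import io
import sqlite3

# Hypothetical source data standing in for an operational export.
RAW_CSV = """order_id,customer,amount
1, alice ,10.50
2,BOB,20.00
3,carol,not_a_number
"""

def extract(text):
    """Extraction layer: read raw rows from the source into dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))

def transform(rows):
    """Transformation layer: clean values and drop rows that fail validation."""
    clean = []
    for row in rows:
        try:
            amount = float(row["amount"])
        except ValueError:
            continue  # reject rows whose amount is not numeric
        clean.append((int(row["order_id"]), row["customer"].strip().title(), amount))
    return clean

def load(rows, conn):
    """Load layer: store the transformed rows in the target table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS orders (order_id INTEGER, customer TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", rows)
    conn.commit()

conn = sqlite3.connect(":memory:")  # in-memory database stands in for the warehouse
load(transform(extract(RAW_CSV)), conn)
loaded = conn.execute(
    "SELECT order_id, customer, amount FROM orders ORDER BY order_id"
).fetchall()
print(loaded)
```

Keeping each layer in its own function mirrors the layered description: the source or the target can be swapped without touching the cleaning rules.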
2 Why multidimensional views of data and data-cubes are used? U 2
A multidimensional model views data in the form of a data-cube. A
data cube enables data to be modeled and viewed in multiple
dimensions. It is defined by dimensions and facts.
The dimensions are the perspectives or entities concerning which an
organization keeps records.
3 Outline the OLTP process with examples. U 2
An On-Line Transaction Processing (OLTP) system is a system
that manages transaction-oriented applications. These systems are
designed to support on-line transactions and process queries quickly on
the Internet. For example, the POS (point of sale) system of any
supermarket is an OLTP system.
4 What are the steps followed in ETL testing process? U 2
Here are 8 steps of an effective ETL testing process:
1. Identify your requirements.
2. Assess your data sources.
3. Create test cases.
4. Extract that data.
5. Transform that data.
6. Load that data.
7. Document your findings.
8. Conclude testing and proceed with ETL.
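5 What are the differences between roll-up and pivot? U 2
Roll-up aggregates data by climbing up a concept hierarchy or by
dropping a dimension, so the data is viewed with decreasing detail.
Pivot (also called rotate), on the other hand, changes no values: it
rotates the axes of the cube to present the same data in a different
orientation.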
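A short sketch may make the contrast concrete. In OLAP terms, roll-up reduces detail by aggregating, while pivot only rotates the axes of the view. The fact rows below (cities, quarters, units sold) are hypothetical and chosen just for illustration.

```python
from collections import defaultdict

# Hypothetical fact rows: (city, quarter, units_sold)
facts = [
    ("Chennai", "Q1", 10),
    ("Chennai", "Q2", 15),
    ("Mumbai", "Q1", 7),
    ("Mumbai", "Q2", 12),
]

# Roll-up: aggregate away the quarter dimension, leaving one total per city.
rollup = defaultdict(int)
for city, quarter, units in facts:
    rollup[city] += units
rollup = dict(rollup)

# Original orientation: rows are cities, columns are quarters.
by_city = defaultdict(dict)
for city, quarter, units in facts:
    by_city[city][quarter] = units

# Pivot: rotate the axes so rows are quarters and columns are cities.
# No value changes; only the presentation does.
by_quarter = defaultdict(dict)
for city, cols in by_city.items():
    for quarter, units in cols.items():
        by_quarter[quarter][city] = units

print(rollup)
print(dict(by_quarter))
```

After the roll-up the quarter detail is gone, whereas the pivoted view still contains every original cell.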
6 List the types of OLAP and its operations. U 2
Here are the main types of OLAP systems:
• Multidimensional (MOLAP)
• Relational (ROLAP)
• Hybrid (HOLAP)
Operations:
• Roll-up
• Drill-down
• Slice
• Dice
• Pivot
7 Show what a data cube of three dimensions looks like. U 2

8 Outline OLTP and OLAP database systems? U 2


Category | OLAP (Online Analytical Processing) | OLTP (Online Transaction Processing)
Definition | It is well known as an online database query management system. | It is well known as an online database modifying system.
Data source | Consists of historical data from various databases. | Consists of only operational current data.
Method used | It makes use of a data warehouse. | It makes use of a standard database management system (DBMS).
Application | It is subject-oriented. Used for data mining, analytics, decision making, etc. | It is application-oriented. Used for business tasks.

9 Define data cube in your own words U 2


A data cube (also called a business intelligence cube or OLAP cube) is
a data structure optimized for fast and efficient analysis. It enables
consolidating or aggregating relevant data into the cube and then
drilling down, slicing and dicing, or pivoting data to view it from
different angles.
Essentially, a cube is a section of data built from tables in a database
that contains calculations. OLAP cubes are typically grouped
according to business function, containing data relevant to each
function.
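As a rough illustration of the definition, a cube can be modeled as cells keyed by one member from each dimension. The sketch below assumes three hypothetical dimensions (product, city, quarter) and a single sales measure; the helper names are made up for this example.

```python
# Hypothetical cube: cells keyed by (product, city, quarter) holding a sales measure.
cube = {
    ("Pen", "Chennai", "Q1"): 100,
    ("Pen", "Chennai", "Q2"): 120,
    ("Pen", "Mumbai", "Q1"): 80,
    ("Book", "Chennai", "Q1"): 50,
    ("Book", "Mumbai", "Q2"): 70,
}

def slice_cube(cube, quarter):
    """Slice: fix the quarter dimension to one value, yielding a 2-D sub-cube."""
    return {(p, c): v for (p, c, q), v in cube.items() if q == quarter}

def dice_cube(cube, products, cities):
    """Dice: keep only cells whose product and city fall in the given sets."""
    return {k: v for k, v in cube.items() if k[0] in products and k[1] in cities}

def total_by_product(cube):
    """Aggregate the measure over the city and quarter dimensions."""
    totals = {}
    for (p, _c, _q), v in cube.items():
        totals[p] = totals.get(p, 0) + v
    return totals

q1 = slice_cube(cube, "Q1")
pens_in_chennai = dice_cube(cube, {"Pen"}, {"Chennai"})
totals = total_by_product(cube)
print(q1, pens_in_chennai, totals, sep="\n")
```

Real OLAP engines precompute and index such aggregates; the dictionary here only shows the shape of the data and of the operations.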
10 List the characteristics of OLAP systems U 2
• Multidimensional conceptual view
• Multi-user support
• Accessibility
• Storing OLAP results
• Uniform reporting performance
• Fast
• Share
• Analysis
• Information
11 List the types of data warehouses U 2
• Enterprise Data Warehouse (EDW)
• Operational Data Store (ODS)
• Data Mart
12 Define Operational Data Store (ODS) U 2
An operational data store (ODS) is a central database used for
operational reporting and as a data source for the enterprise data
warehouse (EDW). An ODS is a complementary element to an EDW and
is used for operational reporting, controls, and decision making.
An ODS is refreshed in real time, making it preferable for routine
activities such as storing employee records. An EDW, on the other
hand, is used for tactical and strategic decision support.
13 Define Data Mart U 2
A data mart is considered a subset of a data warehouse and is usually
oriented to a specific team or business line, such as finance or sales. It
is subject-oriented, making specific data available to a defined group of
users more quickly, providing them with critical insights. The
availability of specific data ensures that they do not need to waste time
searching through an entire data warehouse.
14 Define Enterprise Data Warehouse (EDW) U 2
An enterprise data warehouse (EDW) is a centralized warehouse that
provides decision support services across the enterprise. EDWs are
usually a collection of databases that offer a unified approach for
organizing data and classifying data according to subject.
15 What are the Types of OLAP U 2
ROLAP stands for Relational OLAP, an application based on
relational DBMSs.
MOLAP stands for Multidimensional OLAP, an application based on
multidimensional DBMSs.
HOLAP stands for Hybrid OLAP, an application using both relational
and multidimensional techniques.
16 Define Multidimensional OLAP (MOLAP) Server U 2
A MOLAP system is based on a native logical model that directly
supports multidimensional data and operations. Data are stored
physically in multidimensional arrays, and positional techniques are
used to access them.
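Positional access can be sketched with a flat array in which each cell's location is computed purely from the positions of its dimension members. The dimensions, member names, and row-major layout below are assumptions for illustration; actual MOLAP servers use their own storage formats.

```python
# Hypothetical dimension members; each gets a fixed position along its axis.
products = ["Pen", "Book"]
cities = ["Chennai", "Mumbai", "Delhi"]
quarters = ["Q1", "Q2"]

pos = {
    "product": {m: i for i, m in enumerate(products)},
    "city": {m: i for i, m in enumerate(cities)},
    "quarter": {m: i for i, m in enumerate(quarters)},
}

# One flat array holds every cell of the cube, including empty ones.
cells = [0] * (len(products) * len(cities) * len(quarters))

def offset(product, city, quarter):
    """Row-major offset of a cell, computed from member positions only."""
    p = pos["product"][product]
    c = pos["city"][city]
    q = pos["quarter"][quarter]
    return (p * len(cities) + c) * len(quarters) + q

def put(product, city, quarter, value):
    cells[offset(product, city, quarter)] = value

def get(product, city, quarter):
    return cells[offset(product, city, quarter)]

put("Book", "Delhi", "Q2", 42)
put("Pen", "Chennai", "Q1", 7)
print(offset("Book", "Delhi", "Q2"), get("Book", "Delhi", "Q2"))
```

No key is searched for at lookup time, which is why this layout is fast; the cost is that empty cells still occupy space, so sparse cubes need extra handling.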
17 Define Hybrid OLAP (HOLAP) Server U 2
HOLAP incorporates the best features of MOLAP and ROLAP into a
single architecture. HOLAP systems store the larger quantities
of detailed data in relational tables, while the aggregations are
stored in pre-calculated cubes. HOLAP can also drill through from
the cube down to the relational tables for detailed data.
Microsoft SQL Server 2000 provides a hybrid OLAP server.
18 Define Relational OLAP (ROLAP) Server U 2
Relational OLAP (ROLAP) is the latest and fastest-growing OLAP
technology segment in the market. This method allows multiple
multidimensional views to be created over two-dimensional relational
tables, avoiding the need to structure records around the desired view.
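The idea of building multidimensional views on top of ordinary relational tables can be sketched with SQL aggregation. The example assumes a tiny hypothetical star schema (`fact_sales` and `dim_city`) in an in-memory SQLite database; it illustrates the principle rather than any particular ROLAP product.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_city (city_id INTEGER PRIMARY KEY, city TEXT, region TEXT);
CREATE TABLE fact_sales (city_id INTEGER, quarter TEXT, units INTEGER);
INSERT INTO dim_city VALUES (1, 'Chennai', 'South'), (2, 'Mumbai', 'West');
INSERT INTO fact_sales VALUES (1, 'Q1', 10), (1, 'Q2', 15), (2, 'Q1', 7), (2, 'Q2', 12);
""")

# View 1: units by region and quarter, derived by joining fact and dimension tables.
by_region_quarter = conn.execute("""
    SELECT d.region, f.quarter, SUM(f.units)
    FROM fact_sales f JOIN dim_city d ON d.city_id = f.city_id
    GROUP BY d.region, f.quarter
    ORDER BY d.region, f.quarter
""").fetchall()

# View 2: the same fact table rolled up to quarter only.
by_quarter = conn.execute("""
    SELECT quarter, SUM(units) FROM fact_sales GROUP BY quarter ORDER BY quarter
""").fetchall()

print(by_region_quarter)
print(by_quarter)
```

Both views are computed at query time from the same two-dimensional tables, so no records had to be restructured for either one.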
19 Differences Between ETL and ELT U 2
Category | ETL | ELT
Stands for | Extract, transform, and load | Extract, load, and transform
Process | Takes raw data, transforms it into a predetermined format, then loads it into the target data warehouse. | Takes raw data, loads it into the target data warehouse, then transforms it just before analytics.
Transformation and load locations | Transformation occurs in a secondary processing server. | Transformation takes place in the target data warehouse.
Data compatibility | Best with structured data. | Can handle structured, unstructured, and semi-structured data.
Speed | ETL is slower than ELT. | ELT is faster than ETL as it can use the internal resources of the data warehouse.
Costs | Can be time-consuming and costly to set up depending on ETL tools used. | More cost-efficient depending on the ELT infrastructure used.
Security | May require building custom applications to meet data protection requirements. | You can use built-in features of the target database to manage data protection.
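The difference in where the transformation happens can be sketched side by side. In this hypothetical example the same raw records are processed twice: once cleaned in application code before loading (ETL), and once loaded unchanged and cleaned with SQL inside the target (ELT). The table names and cleaning rules are assumptions, and in-memory SQLite databases stand in for the warehouse.

```python
import sqlite3

# Hypothetical raw records: (customer, amount as text)
raw = [(" alice ", "10.50"), ("BOB", "20.00")]

# ETL: transform in application code, then load only the cleaned rows.
etl = sqlite3.connect(":memory:")
etl.execute("CREATE TABLE sales (customer TEXT, amount REAL)")
cleaned = [(name.strip().lower(), float(amount)) for name, amount in raw]
etl.executemany("INSERT INTO sales VALUES (?, ?)", cleaned)

# ELT: load raw data unchanged, then transform inside the target using SQL.
elt = sqlite3.connect(":memory:")
elt.execute("CREATE TABLE raw_sales (customer TEXT, amount TEXT)")
elt.executemany("INSERT INTO raw_sales VALUES (?, ?)", raw)
elt.execute("""
    CREATE TABLE sales AS
    SELECT lower(trim(customer)) AS customer, CAST(amount AS REAL) AS amount
    FROM raw_sales
""")

query = "SELECT customer, amount FROM sales ORDER BY customer"
etl_rows = etl.execute(query).fetchall()
elt_rows = elt.execute(query).fetchall()
print(etl_rows)
print(elt_rows)
```

Both routes end with the same cleaned table; the ELT route additionally keeps the untouched `raw_sales` table in the target, so the data can be re-transformed later without going back to the source.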
20 Define Data Warehouse Modeling U 2
Data warehouse modeling is the process of designing the schemas of
the detailed and summarized information of the data warehouse. The
goal of data warehouse modeling is to develop a schema describing the
reality, or at least a part of it, which the data warehouse is needed
to support.
21 Define the two approaches to data warehouse design U 2
There are two approaches:
1. "top-down" approach (Inmon)
2. "bottom-up" approach (Kimball)

PART B – 16 Marks

1 OLAP Vs OLTP
Category | OLAP (Online Analytical Processing) | OLTP (Online Transaction Processing)
Definition | It is well known as an online database query management system. | It is well known as an online database modifying system.
Data source | Consists of historical data from various databases. | Consists of only operational current data.
Method used | It makes use of a data warehouse. | It makes use of a standard database management system (DBMS).
Application | It is subject-oriented. Used for data mining, analytics, decision making, etc. | It is application-oriented. Used for business tasks.
Normalized | In an OLAP database, tables are not normalized. | In an OLTP database, tables are normalized (3NF).
Usage of data | The data is used in planning, problem-solving, and decision-making. | The data is used to perform day-to-day fundamental operations.
Task | It provides a multidimensional view of different business tasks. | It reveals a snapshot of present business tasks.
Purpose | It serves to extract information for analysis and decision-making. | It serves to insert, update, and delete information in the database.
Volume of data | A large amount of data is stored, typically in TB or PB. | The size of the data is relatively small, typically in MB or GB, as the historical data is archived.
Queries | Relatively slow as the amount of data involved is large. Queries may take hours. | Very fast as the queries operate on 5% of the data.
Update | The OLAP database is not often updated. As a result, data integrity is unaffected. | The data integrity constraint must be maintained in an OLTP database.
Backup and recovery | It only needs backup from time to time as compared to OLTP. | The backup and recovery process is maintained rigorously.
Processing time | The processing of complex queries can take a lengthy time. | It is comparatively fast in processing because of simple and straightforward queries.
Types of users | This data is generally managed by the CEO, MD, and GM. | This data is managed by clerks and managers.
Operations | Only read and rarely write operations. | Both read and write operations.
Updates | With lengthy, scheduled batch operations, data is refreshed on a regular basis. | The user initiates data updates, which are brief and quick.
Nature of audience | The process is focused on the market. | The process is focused on the customer.
Database design | Design with a focus on the subject. | Design that is focused on the application.
2 ETL VS ELT

3 Which OLAP type is a combination of multidimensional and relational OLAP? Explain it in
detail and list the differences between all three types.

4 Apply the OLAP operations to a two-dimensional data cube and explain them in detail.
Refer Book technical publications
5 Analyze the type of approach that creates small data warehouses and then merges them all
to create a large data warehouse.
List the differences between the Inmon and Kimball approaches in detail.

6 Analyze the OLAP type that is used in BI to provide comprehensive insights through dashboards
and reports, enabling data-driven decision-making, and list the differences between
multidimensional and relational OLAP.

7 Apply Codd’s rules and his 12 guidelines to the financial sector and explain their process in
detail.
OLAP was introduced by Dr. E. F. Codd in 1993, and he presented 12 rules regarding OLAP:
1. Multidimensional Conceptual View:
A multidimensional data model is provided that is intuitively analytical and easy to
use. A multidimensional data model decides how the users perceive business
problems.
2. Transparency:
It makes the technology, underlying data repository, computing architecture, and
the diverse nature of source data totally transparent to users.
3. Accessibility:
Access should be provided only to the data that is actually needed to perform the
specific analysis, presenting a single, coherent, and consistent view to the users.
4. Consistent Reporting Performance:
Users should not experience any significant degradation in reporting performance
as the number of dimensions or the size of the database increases. Users should
also perceive consistent run time, response time, and machine utilization
every time a given query is run.
5. Client/Server Architecture:
The system should conform to the principles of client/server architecture for optimum
performance, flexibility, adaptability, and interoperability.

6. Generic Dimensionality:
Every data dimension should be equivalent in both structure and
operational capabilities, with one logical structure for all dimensions.
7. Dynamic Sparse Matrix Handling:
The physical schema should adapt to the specific analytical model being
created and loaded, so that sparse matrix handling is optimized.
8. Multi-user Support:
Support should be provided for end users to work concurrently with either the
same analytical model or to create different models from the same data.
9. Unrestricted Cross-dimensional Operations:
The system should be able to recognize dimensional hierarchies and automatically perform
roll-up and drill-down operations within a dimension or across dimensions.
10. Intuitive Data Manipulation:
Consolidation path reorientation, drill-down, roll-up, and other manipulations
should be accomplished intuitively and directly via point-and-click
actions.
11. Flexible Reporting:
The business user is given the ability to arrange columns, rows, and cells in
a manner that facilitates easy manipulation, analysis, and synthesis of
information.
12. Unlimited Dimensions and Aggregation Levels:
There should be at least fifteen or twenty data dimensions within a common
analytical model.

From the above axes, explain the OLAP operations in a detailed manner.
Refer Book technical publications
9 The store has to process everyday transactions, such as purchases, inventory, and so on to
collect data for analysis. This data is put through the systems to analyze previous purchases,
fast-moving products, price ranges, and other custom attributes to understand which products
might work best in a discount sale.
Explain the methodology used here and list out the differences between OLAP and OLTP

10 Apply the OLAP operations to a three-dimensional data cube and explain them in detail.
Refer Book technical publications
11

Analyze the process and explain the segments in detail

12 If two customers simultaneously place an order for the last item in stock, the customer whose
transaction processes first will receive the product. The system will update the inventory and
cash transactions regularly to ensure orders are placed only for products in stock.
Analyze the data processing method used here and explain the procedure and list out the
differences between analytical and transactional processing methods.
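The scenario describes transactional (OLTP) processing, where each order is a short, atomic transaction. A minimal sketch, assuming hypothetical `inventory` and `orders` tables in an in-memory SQLite database, is shown here; the two orders are issued one after the other to imitate the serialized outcome of two simultaneous requests.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE inventory (item TEXT PRIMARY KEY, stock INTEGER)")
conn.execute("CREATE TABLE orders (customer TEXT, item TEXT)")
conn.execute("INSERT INTO inventory VALUES ('widget', 1)")  # only one item left
conn.commit()

def place_order(conn, customer, item):
    """Atomically decrement stock and record the order; return True on success."""
    with conn:  # commits on success, rolls back if an exception is raised
        cur = conn.execute(
            "UPDATE inventory SET stock = stock - 1 WHERE item = ? AND stock > 0",
            (item,),
        )
        if cur.rowcount == 0:
            return False  # no stock left, so no order is recorded
        conn.execute("INSERT INTO orders VALUES (?, ?)", (customer, item))
        return True

first = place_order(conn, "customer_a", "widget")
second = place_order(conn, "customer_b", "widget")
stock = conn.execute("SELECT stock FROM inventory WHERE item = 'widget'").fetchone()[0]
order_count = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
print(first, second, stock, order_count)
```

The stock check and the decrement happen in a single `UPDATE` statement, so whichever transaction is processed first takes the last item and the other is refused.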

13 A team or project manager makes decisions, which then filter down through a hierarchical
structure. Managers gather knowledge, analyze it, and draw actionable conclusions. They then
develop processes that are communicated to and implemented by the rest of the team. You may
hear this style of management referred to as “command and control” or “autocratic leadership.”
Explain the Inmon approach for the above scenario in detail
14 Which OLAP type is a combination of multidimensional and relational OLAP? Explain it in
detail and list the differences between all three types.
15

From the above data, explain the OLAP operations in a detailed manner
Refer Book technical publications
16 You are a Senior Analyst in the IT department of a company manufacturing automobile parts.
The marketing VP is complaining about the poor response by IT in providing strategic
information. Draft a proposal to him explaining the reasons for the problems and why a data
warehouse would be the only viable solution.
