
simpo313@gmail.com 02/07/2023
What is Business Intelligence?

• The term Business Intelligence (BI) refers collectively to the tools and technologies used for the collection, integration, analysis, and visualization of data. The raw data collected from different data sources is transformed into comprehensible, meaningful information using BI technologies.

• To simplify the concept, we collect raw data from various sources and, with the help of Business Intelligence tools, transform it into meaningful information. We can store such data in data warehouses or data lakes in specific data structures.

• From the data warehouses, we can retrieve stored data in the form of reports, queries, and dashboards to conduct data analysis. The data reaches the warehouse through the process known as ETL (Extract, Transform, Load).

So, what is Data Warehousing?

• Data warehousing is the process of storing data in data warehouses, which are databases that typically follow the relational database model. Data is selected from different data sources, then aggregated, organized, and managed to provide meaningful insights for analysis and queries.

• A data warehouse is also referred to by several other terms, such as Decision Support System (DSS), Executive Information System, Management Information System, Business Intelligence Solution, and Analytic Application.

• We call it a Decision Support System because the insights and patterns revealed by analyzing its data make important business decisions easier and less risky.

How does Data Warehousing work?

• In data warehousing, data is de-normalized, i.e. it is converted from 3NF to 2NF. De-normalization increases data redundancy, and so data size increases. This trade-off is accepted because the purpose of a data warehouse is to retrieve processed data quickly.

• Data warehouses also provide aggregate data such as totals, averages, and general trends, which enterprises analyze to make decisions that benefit their business and their standing in the industry.
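
As a rough illustration of de-normalization, the sketch below copies customer attributes into every sales row so that an aggregate can be computed from a single table. The table contents and field names are hypothetical and are not taken from the slides.

```python
# Hypothetical normalized tables: each customer is stored once,
# and sales rows refer to customers by id.
customers = {
    1: {"name": "Asha", "city": "New Delhi"},
    2: {"name": "Ravi", "city": "Mumbai"},
}
sales = [
    {"sale_id": 10, "customer_id": 1, "amount": 250.0},
    {"sale_id": 11, "customer_id": 1, "amount": 100.0},
    {"sale_id": 12, "customer_id": 2, "amount": 75.0},
]

def denormalize(sales, customers):
    """Copy customer attributes into every sales row (redundant, but join-free to read)."""
    wide = []
    for row in sales:
        customer = customers[row["customer_id"]]
        wide.append({
            **row,
            "customer_name": customer["name"],
            "customer_city": customer["city"],
        })
    return wide

wide_sales = denormalize(sales, customers)

# An aggregate can now be computed without any lookup into the customer table.
total_by_city = {}
for row in wide_sales:
    city = row["customer_city"]
    total_by_city[city] = total_by_city.get(city, 0.0) + row["amount"]
```

Note that the customer's name and city now appear once per sale, which is the redundancy (and size increase) described above.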

Components of a Data Warehouse

• Operational Systems: These are the different operational domains in an enterprise, each of which serves a distinct purpose and contributes in its own way to the proper functioning of the enterprise.

• Operational systems can include marketing, sales, Enterprise Resource Planning (ERP), etc. Each of these systems has its own normalized database.

Integration Layer:

• The normalized data present in the operational systems must not be manipulated. Instead, we take a copy of that data into an integration layer (staging area), where we manipulate and transform it in specific ways.

• One basic operation is bringing the copied data into a single standardized format because, in the operational systems, data is not always present in the same format. For instance, the same data field can hold amounts in pounds in one table and in dollars in another.
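
A minimal sketch of this kind of standardization is shown below. The exchange rate, record layout, and function name are assumptions made for illustration only; a real pipeline would obtain the rate for the transaction date from a trusted source.

```python
# Hypothetical fixed exchange rate, used only for this illustration.
GBP_TO_USD = 1.25

def standardize_amount(record):
    """Return a copy of the record with the amount expressed in US dollars."""
    amount, currency = record["amount"], record["currency"]
    if currency == "GBP":
        amount = round(amount * GBP_TO_USD, 2)
    elif currency != "USD":
        raise ValueError(f"unsupported currency: {currency}")
    return {**record, "amount": amount, "currency": "USD"}

# Copies of rows taken from two operational systems into the staging area.
staging = [
    {"source": "sales_uk", "amount": 100.0, "currency": "GBP"},
    {"source": "sales_us", "amount": 80.0, "currency": "USD"},
]

# The transformation builds new records, so the staged copies stay unchanged.
standardized = [standardize_amount(r) for r in staging]
```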

Data Warehouse:

• The transformed and standardized data flows into the next element, the data warehouse, which is a very large database. Data from all over the enterprise is stored in this repository in second normal form, with a uniform format and structure.

Data Marts:

• These are purpose-specific sub-databases of the data warehouse, each containing only part of the warehouse's data. Each data mart holds only the data that is useful for a particular purpose; for example, there will be different data marts for analysis related to marketing, finance, administration, etc.

• These databases do not overlap or share their data with each other, and operations performed in one do not influence the others. Because each data mart is small and independent, fetching data from it is much faster than fetching it from the much larger data warehouse.

Business Intelligence and Data Warehousing

• Data warehousing and Business Intelligence often go hand in hand, because the data made available in data warehouses is central to the use of Business Intelligence tools.

• BI tools such as Tableau, Sisense, Chartio, and Looker use data from data warehouses for purposes like querying, reporting, analytics, and data mining.

In any enterprise, Business Intelligence plays a central role in smooth and cost-effective functioning

• BI supports operational efficiency, which includes ERP reporting, KPI tracking, risk management, product profitability, costing, logistics, etc.

• BI supports customer interaction, which includes sales analysis, sales forecasting, segmentation, campaign planning, customer profitability, etc.

• From our prior discussions, we know that data warehouses store processed and aggregated data. Business Intelligence tools require such data from the data warehouses.

• The data is made available to these tools through Online Analytical Processing (OLAP). Data warehousing and OLAP have proved to be a much-needed step up from older decision-making applications, which relied on Online Transaction Processing (OLTP).

Architecture and Process of Data Warehousing and BI

• In this section, we will see how raw data is extracted, transformed, and loaded into data warehouses. We also discuss how BI tools use it for analytical purposes. Refer to the image given below to understand the process better.

• Step 1: Raw data is extracted from data sources such as traditional databases, workbooks, Excel files, etc.

• Step 2: The raw data collected from the different data sources is consolidated and integrated, then stored in a special database called a data warehouse. The process by which we bring data from the sources into the data warehouse is ETL (Extract, Transform, Load): it extracts raw data from the original sources, transforms or manipulates it in different ways, and loads it into the data warehouse.

• Step 3: If you wish to use data from the data warehouse for specific purposes such as marketing analysis or financial analysis, subsets of the data warehouse, known as data marts and data cubes, are created. Data moving from the data warehouse to the data marts also goes through ETL.

• Step 4: From both the data warehouse and the data marts, data is directed to data cubes (OLAP cubes), which are multi-dimensional data sets whose data is ready to be used by front-end BI tools or clients.

• At the front end are BI tools for querying, reporting, analysis, and data mining. These tools query data from the OLAP cubes and use it for analysis.

Summary

• Thus, Business Intelligence and Data Warehousing are two important pillars in the survival of an enterprise. They help keep a check on critical elements such as CRM, ERP, supply chain, products, and customers.

• Business Intelligence and Data Warehousing technologies give accurate, comprehensive, integrated, and up-to-date information on the current situation of an enterprise, which supports taking the required steps and making important decisions for the company's growth.


Understanding the Data Warehouse: its features

• Definition

• A data warehouse is a subject-oriented, integrated, time-variant, and non-volatile collection of data that supports management's decision-making process.

Food for thought:

• “What is the difference between data warehouses and operational databases?”

Data Warehouse—Subject-Oriented
• Organized around major subjects, such as customer, product, sales, and employees.

• This subject-specific design helps reduce query response time, because only a small number of records need to be searched to answer the user's question.

Data Warehouse—Integrated
• Constructed by integrating multiple, heterogeneous data sources, such as relational databases, flat files, and on-line transaction records.

• Data cleaning and data integration techniques are applied.

• These techniques ensure consistency in naming conventions, encoding structures, attribute measures, etc. among the different data sources.

• E.g., when shortlisting your top 20 customers, you must know that “HAL” and “Hindustan Aeronautics Limited” are one and the same. Much of the transformation and loading work that goes into the data warehouse is centered on integrating data and standardizing it.
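
The customer-name example can be sketched as a small alias lookup applied before totals are computed. The alias table, the order rows, and the function name below are hypothetical; real integration work usually relies on a maintained reference table or a matching tool rather than a hard-coded dictionary.

```python
# Hypothetical alias table mapping known name variants to one canonical name.
ALIASES = {
    "hal": "Hindustan Aeronautics Limited",
    "hindustan aeronautics limited": "Hindustan Aeronautics Limited",
}

def canonical_name(raw):
    """Normalize whitespace and case, then map known aliases to the canonical name."""
    key = " ".join(raw.split()).lower()
    return ALIASES.get(key, raw.strip())

# Made-up order rows arriving from different source systems: (customer, amount).
orders = [("HAL", 500), ("Hindustan Aeronautics Limited", 300), ("Acme", 200)]

# Totals are accumulated per canonical customer, so both spellings count as one.
totals = {}
for name, amount in orders:
    customer = canonical_name(name)
    totals[customer] = totals.get(customer, 0) + amount
```

Without the lookup, the two spellings would be ranked as two separate, smaller customers.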

Data Warehouse—Time Variant/time
referenced data
• The time horizon for the data warehouse is significantly longer than that of operational systems.
• Operational database: current-value data.
• Data warehouse: provides information from a historical perspective (e.g., the past 5-10 years).
• For example, the user may ask, “What were the total sales of product A on New Year's Day for the past three years across region Y?”
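
A question like this could be expressed as a query over historical rows. The sketch below uses Python's built-in sqlite3 module with an in-memory table; the table name, column names, and figures are all made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales (sale_date TEXT, product TEXT, region TEXT, amount REAL)"
)
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?, ?)",
    [
        ("2021-01-01", "A", "Y", 100.0),
        ("2022-01-01", "A", "Y", 120.0),
        ("2022-06-15", "A", "Y", 999.0),  # not New Year's Day
        ("2023-01-01", "A", "Y", 150.0),
        ("2023-01-01", "B", "Y", 40.0),   # different product
        ("2023-01-01", "A", "Z", 60.0),   # different region
    ],
)

# Total sales of product A in region Y on 1 January of each of three years.
rows = conn.execute(
    """
    SELECT strftime('%Y', sale_date) AS year, SUM(amount)
    FROM sales
    WHERE product = 'A'
      AND region = 'Y'
      AND strftime('%m-%d', sale_date) = '01-01'
      AND sale_date BETWEEN '2021-01-01' AND '2023-12-31'
    GROUP BY year
    ORDER BY year
    """
).fetchall()
```

An operational database that keeps only current values could not answer this, because the earlier years' rows would no longer be there.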

And...

• Time-referenced data, when analyzed, can also help in spotting hidden trends between associated data elements, which may not be obvious to the naked eye. This exploration activity is termed “data mining”.

Data Warehouse—Non-Volatile

• Once data is in the data warehouse, it does not change: historical data in the DW should never be altered. This enables management to get a consistent picture of the business.

Metadata

• Metadata is simply defined as data about data. For example, the index of a book serves as metadata for the contents of the book. In other words, metadata is summarized data that leads us to the detailed data.

• In terms of the data warehouse, we can understand metadata as follows:

• Metadata is a road map to the data warehouse.

• Metadata acts as a directory. This directory helps the decision support system locate the contents of the data warehouse.

Metadata Repository

• The metadata repository is an integral part of a data warehouse system. It contains the following metadata:

• Business metadata: This has the data ownership information, business definitions, and changing policies.

• Operational metadata: This includes currency of data and data lineage. Currency of data indicates whether the data is active or archived. Lineage of data means the history of the data's migration and the transformations applied to it.

• Data for mapping from the operational environment to the data warehouse: This metadata includes source databases and their contents, data extraction, data partitioning, cleaning, transformation rules, and data refresh and purging rules.

• Algorithms for summarization: This includes dimension algorithms and data on granularity, aggregation, summarizing, etc.
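
To make the four kinds of metadata more concrete, the sketch below shows what one repository entry might look like as a plain Python dictionary. Every table name, key, and value here is hypothetical; actual metadata repositories have their own schemas.

```python
# Hypothetical repository entry for one warehouse table; all values are illustrative.
metadata_repository = {
    "fact_sales": {
        "business": {
            "owner": "Sales department",
            "definition": "One row per completed sale",
        },
        "operational": {
            "currency": "active",
            "lineage": ["extracted from orders system", "amounts converted to USD"],
        },
        "mapping": {
            "source": "orders_db.orders",
            "refresh": "daily",
            "purge_after_years": 10,
        },
        "summarization": {
            "granularity": "day",
            "aggregations": ["sum(sale_amount)"],
        },
    }
}

def describe(table, section):
    """Look up one kind of metadata for a table, the way a directory would be consulted."""
    return metadata_repository[table][section]

lineage = describe("fact_sales", "operational")["lineage"]
```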

Data cube

• A data cube helps us represent data in multiple dimensions. The data cube is defined by dimensions and facts.

Illustration of a data cube

• Suppose a company wants to keep track of sales records, with the help of a sales data warehouse, with respect to time, item, branch, and location. These dimensions allow it to keep track of monthly sales and of the branch at which the items were sold. There is a dimension table associated with each dimension, which further describes that dimension. For example, the "item" dimension table may have attributes such as item_name, item_type, and item_brand.
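
The relationship between facts and a dimension table can be sketched as follows. The attribute names item_name, item_type, and item_brand come from the text; the rows, the units_sold measure, and the function names are assumptions made for illustration.

```python
# Dimension table for "item", with the attributes named in the text; rows are made up.
item_dim = {
    1: {"item_name": "Phone X", "item_type": "mobile", "item_brand": "BrandA"},
    2: {"item_name": "Router Y", "item_type": "modem", "item_brand": "BrandB"},
}

# Fact rows hold the measure (units_sold) plus one key or value per dimension.
sales_facts = [
    {"time": "2023-01", "item_id": 1, "branch": "B1", "location": "New Delhi", "units_sold": 5},
    {"time": "2023-01", "item_id": 2, "branch": "B1", "location": "New Delhi", "units_sold": 2},
    {"time": "2023-02", "item_id": 1, "branch": "B2", "location": "Mumbai", "units_sold": 7},
]

def units_by_item_type(facts, items):
    """Total units sold per item_type, resolved through the item dimension table."""
    totals = {}
    for fact in facts:
        item_type = items[fact["item_id"]]["item_type"]
        totals[item_type] = totals.get(item_type, 0) + fact["units_sold"]
    return totals

def units_by_month_and_branch(facts):
    """Monthly units sold at each branch, using the time and branch dimensions."""
    totals = {}
    for fact in facts:
        key = (fact["time"], fact["branch"])
        totals[key] = totals.get(key, 0) + fact["units_sold"]
    return totals

by_type = units_by_item_type(sales_facts, item_dim)
by_month_branch = units_by_month_and_branch(sales_facts)
```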
• The following table represents a 2-D view of sales data for a company with respect to the time, item, and location dimensions:

In this 2-D table, however, we have records with respect to time and item only. The sales for New Delhi are shown with respect to the time and item dimensions, according to the type of item sold.

Suppose we want to view the sales data with one more dimension, say the location dimension. The 3-D view of the sales data with respect to time, item, and location is shown in the table below:

The 3-D table above can be represented as a 3-D data cube, as shown in the following figure:
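
In code, the step from a 3-D view back to a 2-D view can be sketched as fixing one dimension of the cube. The quarters, item types, and unit counts below are hypothetical and are not the values from the slide's tables.

```python
from collections import defaultdict

# Hypothetical sales facts: (location, quarter, item type, units sold).
facts = [
    ("New Delhi", "Q1", "mobile", 605),
    ("New Delhi", "Q1", "modem", 825),
    ("New Delhi", "Q2", "mobile", 680),
    ("Mumbai", "Q1", "mobile", 818),
    ("Mumbai", "Q2", "modem", 512),
]

# 3-D view: one cell per (location, time, item) combination.
cube = defaultdict(int)
for location, quarter, item, units in facts:
    cube[(location, quarter, item)] += units

def slice_location(cube, location):
    """Fix the location dimension to obtain a 2-D (time x item) table."""
    return {
        (quarter, item): units
        for (loc, quarter, item), units in cube.items()
        if loc == location
    }

new_delhi = slice_location(cube, "New Delhi")
```

Here new_delhi plays the role of the 2-D table for a single city, while cube holds the same measure across all three dimensions.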

Data mart

• A data mart contains a subset of organisation-wide data. This subset is valuable to a specific group within the organisation. In other words, a data mart contains only the data that is specific to a particular group.

• For example, the marketing data mart may contain only data related to items, customers, and sales. Data marts are confined to subjects.
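
The marketing example can be sketched as copying a few subjects out of a larger warehouse. The warehouse contents, the extra subjects (payroll, suppliers), and the function name are hypothetical.

```python
# Hypothetical warehouse, organized by subject; rows are made up.
warehouse = {
    "item": [{"item_id": 1, "item_name": "Phone X"}],
    "customers": [{"customer_id": 7, "name": "Asha"}],
    "sales": [{"item_id": 1, "customer_id": 7, "amount": 250.0}],
    "payroll": [{"employee_id": 3, "salary": 1000.0}],
    "suppliers": [{"supplier_id": 9, "name": "Parts Co"}],
}

# Subjects the marketing group needs, following the example in the text.
MARKETING_SUBJECTS = ("item", "customers", "sales")

def build_data_mart(warehouse, subjects):
    """Copy only the listed subjects so the mart can be queried on its own."""
    return {name: [dict(row) for row in warehouse[name]] for name in subjects}

marketing_mart = build_data_mart(warehouse, MARKETING_SUBJECTS)
```

Because the rows are copied, changes made inside the mart do not affect the warehouse or any other mart.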

Points to remember about data marts:

• Data marts are small in size.

• Data marts are customized by department.

• The source of a data mart is a departmentally structured data warehouse.

• Data marts are flexible.

Graphical Representation of data mart.

Process Flow in Data Warehouse: The ETL Process

• Everyone understands the three letters:

• You get the data out of its original source location (E), you do something to it (T), and then you load it (L) into a final set of tables for the business users to query.

The ETL Process

• Extract, transform, and load (ETL) is a process in data warehousing that involves:

• extracting data from source systems (these are the On-Line Transaction Processing, or OLTP, systems);

• transforming the extracted data to match business needs;

• loading the transformed data into the data warehouse.
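
The three steps can be sketched end to end with the Python standard library. This is a minimal illustration under stated assumptions: the CSV text stands in for an OLTP export, and the table name fact_sales and the column names are hypothetical.

```python
import csv
import io
import sqlite3

# Stand-in for an export from a source (OLTP) system.
SOURCE_CSV = "order_id,qty,unit_price\n1,2,9.50\n2,1,20.00\n"

def extract(text):
    """Extract: read raw rows from the source as dictionaries of strings."""
    return list(csv.DictReader(io.StringIO(text)))

def transform(rows):
    """Transform: convert types and derive sale_amount = qty * unit_price."""
    return [
        (int(r["order_id"]), int(r["qty"]) * float(r["unit_price"]))
        for r in rows
    ]

def load(conn, rows):
    """Load: write the transformed rows into a warehouse table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS fact_sales "
        "(order_id INTEGER PRIMARY KEY, sale_amount REAL)"
    )
    conn.executemany("INSERT INTO fact_sales VALUES (?, ?)", rows)
    conn.commit()

conn = sqlite3.connect(":memory:")
load(conn, transform(extract(SOURCE_CSV)))

# Business users would then query the loaded table.
total = conn.execute("SELECT SUM(sale_amount) FROM fact_sales").fetchone()[0]
```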

Extract

• The first part of an ETL process is to extract data from the source systems. Data warehousing projects consolidate data from different source systems. Each separate system may also use a different data format.

Transform

• This phase applies a series of rules or functions to the extracted data to derive the data in the format required for loading into the data warehouse.

• Some data sources will require very little manipulation of data. In other cases, one or more of the following transformation types may be required.

POSSIBLE DATA TRANSFORMATIONS

1. Selecting only certain columns to load (or selecting null columns not to load)

2. Translating coded values (e.g., if the source system stores M for male and F for female, but the warehouse stores 1 for male and 2 for female)

3. Deriving a new calculated value (e.g., sale_amount = qty * unit_price)

4. Summarizing multiple rows of data (e.g., total sales for each region)

5. Joining together data from multiple sources (e.g., lookup, merge, etc.)

6. Splitting a column into multiple columns (e.g., turning a comma-separated list stored as a string in one column into individual values in different columns)

7. Generating surrogate key values
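
Several of the transformations listed above can be sketched in one small function. The gender codes and the sale_amount formula come from the list; the source rows, the column names (cust, segment_channel, fax), and the helper names are hypothetical.

```python
GENDER_CODES = {"M": 1, "F": 2}  # coded-value translation from the list above

# Made-up source rows; "fax" is a null column that will not be loaded.
source_rows = [
    {"cust": "C7", "gender": "M", "qty": 2, "unit_price": 5.0,
     "region": "North", "segment_channel": "new,online", "fax": None},
    {"cust": "C9", "gender": "F", "qty": 1, "unit_price": 8.0,
     "region": "North", "segment_channel": "repeat,store", "fax": None},
    {"cust": "C7", "gender": "M", "qty": 3, "unit_price": 4.0,
     "region": "South", "segment_channel": "repeat,online", "fax": None},
]

_surrogate_keys = {}

def surrogate_key(natural_key):
    """Assign warehouse-owned integer keys, independent of the source keys."""
    if natural_key not in _surrogate_keys:
        _surrogate_keys[natural_key] = len(_surrogate_keys) + 1
    return _surrogate_keys[natural_key]

def transform(row):
    # Split one comma-separated column into two columns.
    segment, channel = row["segment_channel"].split(",")
    return {
        "customer_key": surrogate_key(row["cust"]),    # surrogate key
        "gender": GENDER_CODES[row["gender"]],         # translated code
        "sale_amount": row["qty"] * row["unit_price"], # derived value
        "region": row["region"],
        "segment": segment,
        "channel": channel,
    }  # "fax" is deliberately not selected

clean = [transform(r) for r in source_rows]

# Summarize multiple rows: total sales for each region.
sales_by_region = {}
for row in clean:
    region = row["region"]
    sales_by_region[region] = sales_by_region.get(region, 0.0) + row["sale_amount"]
```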

Transformation types

• Data must be merged from different systems, e.g. when one source stores the same information with a different structure.

• Data must be scrubbed for inconsistencies, e.g. spelling errors or variations. It is a good idea to use surrogate keys: keys maintained at the data warehouse that are independent of keys from the data sources.

• Data must be pre-aggregated for faster analysis.

Load

• The load phase loads the data into the data warehouse. The loaded data can then be used to support BI, e.g. for reporting purposes.
