
Unit 3: BI Definitions & Concepts

2-Marks

1. List components of Business layer.

Ans: There are 4 components in Business layer

a) Business requirements

b) Business value

c) Program Management

d) Development

2. List components of Administration & operation layers.

Ans: There are 4 components in this layer

a) BI architecture

b) BI and DW operations

c) Data resource administration

d) Business applications

3. List the components of BI framework.

Ans: 1) Business layer

2) Administration and Operation layer

3) Implementation layer

4. Define Data Warehouse.

Ans: A data warehouse is a type of data management system that is designed to enable and
support business intelligence (BI) activities, especially analytics.

5. List & Define BI users.

Ans: There are two types of BI users:

1) casual users

2) power users

1) Casual users:

They are the consumers of information who use the pre-existing reports created by power
users and make decisions / take actions.
2) Power users:

They are the producers of information. They use powerful analytical and authoring tools to
access data from data warehouses / data marts and other sources from inside and outside the
organization.

6. List Technology solutions in Business Intelligence Applications.

Ans: a) DSS (decision support systems)

b) EIS (executive information systems)

c) OLAP (online analytical processing)

d) Managed query and reporting

e) Data mining

7. List Business solutions in Business Intelligence Applications.

Ans: a) Customer analytics

b) Marketplace analysis

c) Performance analysis

d) Behavior analysis

e) Supply chain analytics

f) Productivity analysis

g) Sales channel analysis

8. Define Data Mart.

Ans: A data mart holds data and aggregations about a single subject area / domain, which
can be used for analysis, reporting or decision support.

9. Define ODS.

Ans: An ODS (operational data store) processes the operational data that is fed into the data
warehouse and provides a homogeneous, unified view which can be used for analysis and
reporting.

10. Define ETL.

Ans: ETL, which stands for extract, transform and load, is a data integration process that
combines data from multiple data sources into a single, consistent data store that is loaded
into a data warehouse or other target system.

11. Define Data Integrity.

Ans: It is the degree to which the attributes of data associated with a certain entity
accurately describe the occurrence of that entity.

12. List advantages of Data Integration.

Ans:

• It helps decision makers quickly access and query information based on key
variables to gain meaningful insights.

• It helps reduce costs, overlaps and redundancies, and minimizes risks.

• It enables better monitoring of key variables like trending patterns and customer behavior
across geographies, which leads to reduced R&D costs.

13. Define Data sources.

Ans: Data sources (transactional or operational, external) are the sources from which we
extract data.

14. List Advantages of Data Integration.

Ans:

• It helps decision makers quickly access and query information based on key
variables to gain meaningful insights.

• It helps reduce costs, overlaps and redundancies, and minimizes risks.

• It enables better monitoring of key variables like trending patterns and customer behavior
across geographies, which leads to reduced R&D costs.

15. Mention the steps to draw ER Model.

Ans: Steps to draw an ER model:

• Identify entities

• Identify relationships between entities

• Identify key attributes

• Identify other relevant attributes for entities


• Draw ER diagram

• Review the ER diagram with business users and get their sign off.

16. Define Data Quality.

Ans: Data quality is measured with reference to appropriateness of purpose as defined by
business users and conformance to enterprise data quality standards as defined by system
architects and admins.

17. Define Data Profiling.


Ans: Data profiling is the process of statistically examining and analyzing the content in a
data source, and hence collecting information about that data.

5 or 10-Marks Questions
1. Explain briefly Business requirements.
Ans: Business requirements are the result of a 3-step process: business drivers, business
goals and business strategies.

• Business drivers are factors that initiate the need to act. Ex: changing labour
laws, changing economy, workforce, technology etc.
• Business goals are the targets to be achieved in response to business drivers. Ex:
increased productivity, improved market share, profits, customer satisfaction, cost
reduction etc.
• Business strategies are the plans of action to achieve the set goals. Ex:
outsourcing, global delivery model, customer and employee retention programs
etc.

2. Explain Business value.


Ans: Business value

• Value is a desirable result for a stakeholder in a context.

• Business value is measured in terms of ROI, ROA, TCO and TVO.
• Return on Investment (ROI): It is a performance measure used to evaluate the
efficiency (benefits) of an investment. Ex: an electronics goods company invests 10% of
its daily revenue in social media to get new clients and increase its prospects. The
benefit gained from that spend is the ROI from social media.
• Return on Assets (ROA): It is the earning generated from invested capital or assets. Ex:
If a company’s net income is 1 million and its assets are 5 million, then ROA = (1/5)
* 100 = 20%.
• Total cost of ownership (TCO): It is the purchase price of an asset plus the costs of
operation.
• Ex: The TCO of a car is not only the purchase price but also the expenses
incurred through its use, such as repairs, insurance and fuel.
• Total value of ownership (TVO): It denotes the total financial value of a service or
product plus some subcategories like stock, undistributed dividends etc.
o TVO = Total assets – total liabilities.
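
The measures above are simple arithmetic, so a short Python sketch may help fix them in memory. The function names are hypothetical, the ROI formula used is the common "net gain over cost" form (the notes do not give one), and the car costs are invented for illustration; only the ROA figures come from the example in the answer.

```python
def roi_percent(gain: float, cost: float) -> float:
    """Return on investment: net benefit relative to the cost, in percent."""
    return (gain - cost) * 100 / cost


def roa_percent(net_income: float, total_assets: float) -> float:
    """Return on assets: net income relative to total assets, in percent."""
    return net_income * 100 / total_assets


def tco(purchase_price: float, operating_costs: list) -> float:
    """Total cost of ownership: purchase price plus all costs of operation."""
    return purchase_price + sum(operating_costs)


# ROA example from the notes: net income 1 million, assets 5 million.
print(roa_percent(1_000_000, 5_000_000))  # 20.0

# Hypothetical car: price 20000, then repairs, insurance and fuel.
print(tco(20_000, [1_500, 2_000, 3_000]))  # 26500
```
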
3. List and Explain components of BI architecture.

Ans:
4. Explain with neat diagram Implementation layer of BI component framework.

• Ans: A sound architecture in an organization / enterprise is necessary to support its
technical, functional and data needs.

Implementation layer:

• It consists of technical components required to capture, transform, clean and convert data
into meaningful information and deliver it to meet business goals and bring value to
business.

• It includes a) data warehousing and

b) information services

a) Data warehousing:

• It is the process that prepares the basic repository / data store from which data is extracted.

• It is built on a dimensional model schema optimized specially for data retrieval.

• The roles involved are intake, integration, distribution, delivery and access.

Refer to fig 5.6 for an example of a data warehouse for the “AllGoods” store.

b) Information services:

• It ensures that the information produced is according to business requirements and produces
value for the company.

• Information is delivered in the form of KPIs, reports, charts, dashboards, scoreboards,
analytics etc.

• Data mining is used to increase the body of knowledge.

• Applied analytics is used to drive action and produce outcomes.

5. Differentiate types of BI Users.

• Ans: There are two types of BI users:

1) casual users

2) power users

1) Casual users:

• They are the consumers of information who use the pre-existing reports created by power
users and make decisions / take actions.

• They do not create reports.

• Ex: executives, managers, field/operations workers, customers, suppliers.

2) Power users:
• They are the producers of information

• They use powerful analytical and authoring tools to access data from data warehouses/
data marts and other sources from inside and outside the organization.

• ex: developers, administrators, business analysts, IT professionals, analytical modelers.

• They take decisions on issues like:

– What information should be placed on the report?

– What is the best way to present the information?

– Who should see what information?

– How should the information be distributed?

6. Explain technology solutions in Business Intelligence Applications.

Ans: Technology solutions:

a) DSS (decision support systems): help in decision making at operational and tactical
levels.

b) EIS (executive information systems): support decision making at senior management
level by providing both internal and external information. An EIS has an easy-to-use GUI
with strong reporting tools.

c) OLAP (online analytical processing):

• OLAP systems are built upon multidimensional data.

• OLAP tools allow slicing and dicing of data from various perspectives.

Refer to fig 5.8 for the OLAP architecture.

The OLAP architecture basically has 3 tiers:

i) data warehouse tier :

– it consists of enterprise wide DW.

– The source of data is multiple heterogeneous internal and external data sources

– Uses back end tools and utilities for ETL

– Data is refreshed periodically in response to the updates

– has a metadata repository for DW metadata

ii) OLAP server :

– It includes a ROLAP or MOLAP server

iii) Front end tools: they support the user/client by providing reporting, analysis and
data mining tools.
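
To make "slicing and dicing" of multidimensional data concrete, here is a minimal Python sketch. The sales records, dimension names and helper functions are all hypothetical; a real OLAP server (ROLAP or MOLAP) would run these operations against the data warehouse rather than an in-memory list.

```python
from collections import defaultdict

# Hypothetical sales facts with three dimensions: region, product, quarter.
SALES = [
    {"region": "North", "product": "TV", "quarter": "Q1", "amount": 100},
    {"region": "North", "product": "Radio", "quarter": "Q1", "amount": 40},
    {"region": "South", "product": "TV", "quarter": "Q1", "amount": 70},
    {"region": "South", "product": "TV", "quarter": "Q2", "amount": 90},
    {"region": "North", "product": "TV", "quarter": "Q2", "amount": 120},
]


def slice_cube(rows, dimension, value):
    """Slice: fix one dimension to a single value."""
    return [r for r in rows if r[dimension] == value]


def dice_cube(rows, **allowed):
    """Dice: keep rows whose dimensions fall inside the allowed value sets."""
    return [r for r in rows if all(r[d] in vals for d, vals in allowed.items())]


def roll_up(rows, dimension):
    """Aggregate the amount measure along one dimension."""
    totals = defaultdict(int)
    for r in rows:
        totals[r[dimension]] += r["amount"]
    return dict(totals)


q1 = slice_cube(SALES, "quarter", "Q1")
print(roll_up(q1, "region"))  # {'North': 140, 'South': 70}

tv_north = dice_cube(SALES, region={"North"}, product={"TV"})
print(roll_up(tv_north, "quarter"))  # {'Q1': 100, 'Q2': 120}
```
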

7. Explain Business solutions in Business Intelligence Applications.

Ans: Business solutions:


a) Customer analytics:

• It captures data about customer behavior and predicts customer buying pattern.

• Helps make decisions like direct marketing , CRM, customer loyalty & satisfaction.

b) Marketplace analysis:

• helps understand customers, competitors, products and changing market dynamics.

• Helps answer questions like: should we launch product X in area A? Should we
discontinue product Y?

c) Performance analysis:

• It optimizes the utilization of employees, finance, resources etc.

• Ex: employee performance analysis helps in retaining and rewarding employees.

• It provides insights on areas of concern that need immediate attention in business.

d) Behavior analysis:

• Helps to predict purchasing patterns and online buying patterns.

e) Supply chain analytics:

• Helps optimize the process of planning → manufacturing → sales

• Helps to identify problems in functions like sourcing, inventory, manufacturing, sales,
logistics etc.

f) Productivity analysis:

• Productivity is the ratio of output produced per unit of input.

• This analysis aims to increase the enterprise profits

• It includes collecting data, performing aggregations and comparing the actual result
against the estimated /planned.

• It helps in evaluating the performance of an enterprise

g) Sales channel analysis:

• It helps to decide the best channel for delivering products / services to
customers / consumers.

• It follows the 4 P’s of marketing:

– Product: decides what product to produce

– Price: decides when to increase / decrease the price

– Place: decides how to reach the customers (physical stores or online store)

– Promotion: decides on personal selling and advertising, and also on which sales channel
should be discontinued

8. Explain briefly BI roles & responsibilities.

• Ans: BI roles are classified as

1) Program team roles:

• The program team prepares the strategy on how the BI project will execute.

• They are responsible for integration and coordination.

2) Project team roles:

• this team executes the program team’s strategy

1) BI Program team roles:

a) BI program manager:

• He is responsible for several projects

• Aligns the BI project with the organization’s strategic objectives

• Defines metrics to measure and monitor progress on each objective/goal

• Plans and budgets the projects and follows up the progress of each project

• Distributes tasks , allocates/ de-allocates resources

• Identifies and measures success/ ROI

b) BI data architect:

• He is accountable for enterprise data

• Optimizes current data usage and takes care of future data needs( design and content)

• Ensures proper definition, storage, distribution, archiving and management of data.

c) BI ETL architect:

• He determines the best way to obtain data from different operational sources/platforms.

• Trains the ETL specialists on data acquisition, transformation and loading.

d) BI technical architect:

• Interfaces with operations staff, technical staff, DBA staff

• Selects and evaluates BI tools( ETL, reporting etc..)

• Assesses current technical architecture and system capacity for long term processing
needs.

• Defines BI strategy or process for technical architecture, H/W , DBMS, middleware,


network, server and client configurations.

• Defines strategy for data back up and recovery and disaster recovery.

e) Metadata manager:

• Keeps track of the structure of technical metadata

• Tracks the levels of detail of data

• Tracks when the ETL audit was performed

• Tracks when the data warehouse was updated

• Tracks who accessed the application metadata, when, and how frequently

f) BI Administrator:

• He is the designer and architect of the entire BI environment

• Architect of the metadata layer

• Monitors the progress and security of the entire BI environment

• Monitors ETL jobs and reports for business users

• Tunes the performance of the entire BI environment

• Maintains version control of all objects in the BI environment

2) BI project team roles:

a) Business Manager:

• Monitors the project from user group perspective

• Monitors activities of project team


• Addresses the business issues identified by project managers

b) BI business specialist:

• He has a good understanding of the business area of focus.

• He is the lead in data stewardship and quality

• Ensures that the information is identified correctly at all levels and accessed at all modes.

c) BI project Manager:

• He leads the project and ensures delivery of all project needs and assesses risks.

• Translates business needs into technical terms

• Ensures all business standards and BI processes are followed

• Analyzes DSS and EIS to understand their functionality

• Predicts what users may/ will want

• Motivates and evaluates and communicates with team members

• Understands technical and information architecture

• Coordinates with program managers and other project managers

• Implements warehousing specific standards

d) Business Requirement Analyst:

• He communicates between end-users and the BI project team and performs requirement
gathering.

• Questions the end users to determine requirements

• Transforms requirements into technical specifications, working along with technical
architects

• Documents requirements

• Helps identify potential data sources

• Validates that BI meets requirements and service level agreements

• Coordinates prototype reviews and gathers feedback.

e) Decision Support Analyst:


• He is an expert on issues related to business objectives and problems and provides
required data to address these issues.

• Designs training infrastructure & material and trains BI users and educates users on
warehousing capabilities

• Maps and validates business requirements and production data

• Classifies business users by type

• Develops necessary reports, DSS and EIS

• Develops security rules and standards

• Plans and executes acceptance tests and helps users find right information

f) BI Designer:

• He interprets the requirements and designs the data structure for optimal access,
performance and integration

• Creates the subject area model

• Creates a logical, structural and physical staging area model

• Creates a logical, structural and physical distribution model

• Creates a logical, structural and physical relational model

• Creates a logical, structural and physical dimensional model

• Validates models with production data

g) ETL specialist:

• Determines and implements the best technique for data extraction

• Understands and maps source and target BI systems

• Identifies and assesses data sources

• Applies business rules as transformations

• Implements navigation methods / applications

• Designs and specifies the data detection & extraction process

• Designs and develops transformations of code/logic/programs for the environment

• Designs and develops the data transport and population process for the environment

h) Database administrator:

• He is the guardian of data and data warehouse

• Keeps check of the physical data appended to BI environment in current project cycle

• Designs, implements and tunes database schemas

• Manages storage space and memory

• Creates, optimizes and administers physical tables, triggers and partitions

• Implements all models and indexing strategies

• Logs technical action reports

• Documents configuration and integration with applications and network resources

• Maintains backup and recovery documentation.

9. Explain the need for Data Warehouse with ETL process diagram.

• Ans: Data from several heterogeneous sources (like spreadsheets, Access databases,
.CSV files, etc.) can be extracted and archived in a data warehouse (refer fig 6.1).

• One data warehouse can support the information needs of several branches of an
organization.

• Data anomalies can be corrected through an ETL package

• Missing and incomplete records can be detected and corrected

• Uniformity can be maintained over each attribute of a table

• Easy data retrieval for analysis and generating reports

• Supports fact based decision making and predicting trends

• Supports ad-hoc queries
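
The points above (correcting anomalies, handling missing values, keeping attributes uniform) can be sketched as a tiny extract, transform, load pipeline in Python. Everything here is an assumption made for illustration: the CSV content, the `sales` table, the cleaning rules, and the use of an in-memory SQLite database as a stand-in for the warehouse.

```python
import csv
import io
import sqlite3

# Hypothetical source data; a real pipeline would read files or databases.
BRANCH_CSV = """cust_id,name,city,amount
1, alice ,Pune,120.50
2,Bob,,80
3,carol,Mumbai,not_available
"""


def extract(text):
    """Extract: read raw rows from a CSV source."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: normalise names, fill missing cities, reject bad amounts."""
    clean = []
    for row in rows:
        try:
            amount = float(row["amount"])
        except ValueError:
            continue  # anomaly: the amount is not numeric, so skip the record
        clean.append((
            int(row["cust_id"]),
            row["name"].strip().title(),       # uniform name format
            row["city"].strip() or "UNKNOWN",  # missing value gets a marker
            amount,
        ))
    return clean


def load(rows, conn):
    """Load: write the cleaned rows into the warehouse table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sales "
        "(cust_id INTEGER, name TEXT, city TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?, ?)", rows)
    conn.commit()


conn = sqlite3.connect(":memory:")
load(transform(extract(BRANCH_CSV)), conn)
print(conn.execute("SELECT * FROM sales ORDER BY cust_id").fetchall())
```

Rejecting the record with a non-numeric amount is only one possible policy; a real ETL package might instead route such rows to an error table for correction.
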


10. Explain briefly Ralph Kimball’s vs Bill Inmon’s approach.

Ans: Ralph Kimball’s approach to building a data warehouse

• According to Ralph Kimball, “A data warehouse is made up of all the data marts in an
enterprise”.

• It is a bottom up approach.

• Small organizations benefit from using this approach

• It is faster, cheaper and less complex

• The single version of truth might be compromised since several independent data marts
are likely to have multiple versions of same entity/data.

• The decisions made might lead to confusion

• Information is stored in dimensional model ( like star schema)


Bill Inmon’s approach to building a data warehouse

• According to Bill Inmon: “A data warehouse is a subject oriented, integrated, time
variant and non-volatile collection of data in support of management’s decision making
process”.

• It is a top down approach

• Large organizations benefit from using this approach.

• It is expensive, slower and more complicated

• It achieves single version of truth for large organizations

• The decisions made are effective.

• In the data warehouse information is stored in 3NF.

11. Explain Goals of Data warehouse.

Ans: Goals of a Data warehouse

1) Information accessibility:

• Data in data warehouse must be easy to understand by users and developers

• It must be properly labeled for easy access

• Users must be allowed to slice and dice the data

2) Information credibility:

• The data must be credible, complete, consistent and of desired quality.


3) Flexible to change:

• Data warehouse must adapt to changes like business situations, user requirements,
technology, access tools etc.

• Adding of new data must not invalidate the existing data

4) Support for fact based decision making:

• Data should be relevant for more precise decision making and easily accessible to
business users

5) Support for data security:

• There must be security mechanisms enabled so that the confidential data must be
accessible only to valid authorized users.

6) Information consistency:

• The information provided to the users should maintain single/consistent version of truth

12. Explain Two Main Approaches of Data Integration.

Ans: Two main approaches to data integration

The two main approaches to data integration are:

1) Schema integration 2) Instance integration

1. Schema integration :

• It is the development of a unified representation of semantically similar data which is
structured and stored differently in individual databases.

• Multiple data sources provide data on same entity type. Hence schema integration allows
applications a transparent view and ability to query the data as if it is from one uniform
data source.

• Consider a retail outlet that has two branches namely Branch A that stores transaction
data with following schema
• Branch B stores transaction data with following schema

• The schema from both branches is integrated by mapping the respective columns by
looking up the metadata information of schemas like column names, type, length,
constraints, domain of values, NULL, zero and blank values etc.

• The integrated schema is
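
The branch schema tables are not reproduced in these notes. As a minimal Python sketch of the column-mapping step only, assuming purely hypothetical column names for both branches (none of them come from the original tables), the integration could look like this:

```python
# Hypothetical column mappings: source column name -> integrated column name.
BRANCH_A_TO_UNIFIED = {
    "TxnID": "transaction_id",
    "CustName": "customer_name",
    "Amt": "amount",
}
BRANCH_B_TO_UNIFIED = {
    "trans_no": "transaction_id",
    "customer": "customer_name",
    "total": "amount",
}


def to_unified(row, mapping, branch):
    """Rename source columns to the integrated schema and tag the origin."""
    unified = {mapping[col]: value for col, value in row.items()}
    unified["branch"] = branch
    return unified


# Invented sample rows, one per branch.
branch_a_rows = [{"TxnID": 1, "CustName": "Asha", "Amt": 250.0}]
branch_b_rows = [{"trans_no": 7, "customer": "Ravi", "total": 99.0}]

integrated = (
    [to_unified(r, BRANCH_A_TO_UNIFIED, "A") for r in branch_a_rows]
    + [to_unified(r, BRANCH_B_TO_UNIFIED, "B") for r in branch_b_rows]
)
for row in integrated:
    print(row)
```

After the mapping, an application can query `integrated` as if it came from one uniform source. In practice the mapping is decided by comparing schema metadata (types, lengths, constraints), which this sketch skips.
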

2) Instance integration:

• Here the information is directly derived from the data to get accurate semantic
information on data content.

• It identifies and integrates all instances of the data items that represent the real world
entity.

• Ex: Consider a corporate house with 10,000 employees. They do not have an ERP. Instead
they have various applications like “projectAllocate”, “employeeLeave”,
“employeeAttendance” and “employeePayroll”.

Now the company wants to consolidate all the details of every employee.

There is an employee named “Fred Aleck”. All the applications have stored his name in
a different way.

ProjectAllocate

EmployeeLeave

EmployeeAttendance

EmployeePayroll

• One solution to consolidate the data in the above tables is to look up all the records using
the primary key (ex: employeeNo or SSN) and then replace the differing data values of the
same attribute with a consistent value (ex: Fred Aleck).

ProjectAllocate

EmployeeLeave

EmployeeAttendance

EmployeePayroll
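
The consolidation step described above (look up by primary key, then overwrite with one consistent value) can be sketched in Python. The tables from the notes are not reproduced here, so the employee number and the differing spellings below are invented for illustration.

```python
# Hypothetical records: spellings and the employee number are invented.
project_allocate = [{"employeeNo": 101, "name": "Fred A."}]
employee_leave = [{"employeeNo": 101, "name": "Aleck, Fred"}]
employee_attendance = [{"employeeNo": 101, "name": "F. Aleck"}]
employee_payroll = [{"employeeNo": 101, "name": "fred aleck"}]

# Agreed consistent value per primary key.
CANONICAL_NAME = {101: "Fred Aleck"}


def consolidate(tables, canonical):
    """Replace each name with the consistent value found via the primary key."""
    for table in tables:
        for record in table:
            key = record["employeeNo"]
            if key in canonical:
                record["name"] = canonical[key]


tables = [project_allocate, employee_leave, employee_attendance, employee_payroll]
consolidate(tables, CANONICAL_NAME)
print({record["name"] for table in tables for record in table})
```
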

13. Explain Data Integration Technologies.

Ans: 1) Data interchange

2) Object brokering

3) Modeling techniques

a) ER modeling

b) Dimensional modeling

1) Data interchange:

• It is the structured transmission of organizational data / electronic documents between
two or more organizations using electronic means. Ex: EDI

• It provides standards to exchange data electronically

2) ORB ( object request broker) :

• It is a middleware which allows program calls to be made from one computer to another
via a computer network, providing location transparency through remote procedure calls.
• It handles transformation of in-process data structure to and from the byte sequence.
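
The conversion of an in-process data structure to and from a byte sequence (marshalling) can be illustrated with a short Python sketch. Real ORBs define their own wire formats; the `struct` module and the record layout below are only a stand-in chosen for illustration, and the function names are hypothetical.

```python
import struct

# Hypothetical wire format in network byte order: 4-byte int id,
# 8-byte float balance, 2-byte name length, then the UTF-8 name bytes.
HEADER = "!idH"


def marshal(emp_id: int, balance: float, name: str) -> bytes:
    """Turn an in-process record into a byte sequence."""
    encoded = name.encode("utf-8")
    return struct.pack(HEADER, emp_id, balance, len(encoded)) + encoded


def unmarshal(payload: bytes):
    """Rebuild the record from the byte sequence on the receiving side."""
    header_size = struct.calcsize(HEADER)
    emp_id, balance, name_len = struct.unpack(HEADER, payload[:header_size])
    name = payload[header_size:header_size + name_len].decode("utf-8")
    return emp_id, balance, name


wire = marshal(7, 1250.75, "Fred")
print(len(wire), unmarshal(wire))
```
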

3) Modeling techniques:

a) ER modeling:

• It is a logical design technique whose main goal is to reduce data redundancy and
hence solve the problems in insert, delete and update.

• It is used for transaction capture and helps in initial stages of data warehouse construction

Steps to draw an ER model:

• identify entities

• identify relationships between entities

• identify key attributes

• identify other relevant attributes for entities

• draw ER diagram

• review the ER diagram with business users and get their sign off.

b) Dimensional modeling:

• It is a logical design technique whose goal is to present data in a standard format to
the end users.

• It is used for data warehouses with a star or snowflake schema

• It consists of one large table called the fact table and a number of relatively smaller tables
called dimension tables.

• Each fact table has a multi-part primary key and each dimension table has a single-part
primary key.
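
A small star schema makes the key structure easier to see. The sketch below uses Python's built-in `sqlite3` module; the table names, columns and rows are hypothetical and chosen only to show one fact table with a multi-part key joined to dimension tables with single-part keys.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Dimension tables: each has a single-part primary key.
CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, product_name TEXT);
CREATE TABLE dim_store   (store_id   INTEGER PRIMARY KEY, city TEXT);
CREATE TABLE dim_date    (date_id    INTEGER PRIMARY KEY, quarter TEXT);

-- Fact table: multi-part primary key made of the dimension keys.
CREATE TABLE fact_sales (
    product_id INTEGER REFERENCES dim_product(product_id),
    store_id   INTEGER REFERENCES dim_store(store_id),
    date_id    INTEGER REFERENCES dim_date(date_id),
    units_sold INTEGER,
    PRIMARY KEY (product_id, store_id, date_id)
);

INSERT INTO dim_product VALUES (1, 'TV'), (2, 'Radio');
INSERT INTO dim_store   VALUES (10, 'Pune'), (20, 'Mumbai');
INSERT INTO dim_date    VALUES (100, 'Q1'), (200, 'Q2');
INSERT INTO fact_sales  VALUES (1, 10, 100, 5), (1, 20, 100, 3), (2, 10, 200, 7);
""")

# A typical star join: total units per product and quarter.
query = """
SELECT p.product_name, d.quarter, SUM(f.units_sold)
FROM fact_sales f
JOIN dim_product p ON p.product_id = f.product_id
JOIN dim_date d    ON d.date_id = f.date_id
GROUP BY p.product_name, d.quarter
ORDER BY p.product_name, d.quarter
"""
print(conn.execute(query).fetchall())  # [('Radio', 'Q2', 7), ('TV', 'Q1', 8)]
```
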

14. Explain several dimensions of Data Quality

Ans: Several dimensions of data quality:

1) Correctness / accuracy:

• It is the degree to which the captured data correctly reflects / describes the real-world
entity/object/event.

examples:

• The address of a customer maintained in customer database is real address


• The temperature recorded in thermometer is real temperature

• The age of patient in hospital database is his real age

• The bank balance in customer’s account is the real value customer deserves from the
bank

2) Consistency :

• It is about single version of truth

• The data throughout the enterprise must be in sync with each other

• Ex of consistent data: an employee has left a company and so his company email id is
made inactive

• Ex of inconsistent data: a customer has cancelled and surrendered his credit card, but
his billing status still reads due

3) Completeness :

• It is the extent to which the expected attributes of data are provided

• Examples of data completeness:

– data of all students of a university is available

– data of all patients of a hospital is available

– data of all clients of an IT company is available

• Example of complete yet inaccurate data:

– a customer provides his address details at a restaurant, but those details may be
incorrect.

4) Timeliness :

• It is important to provide right data at right time to the right people in business

• Delayed supply of data becomes inconsequential and useless

• Ex of timely data:

Airlines must provide the most recent data to passengers.

The quarterly results must be published at the end of each quarter.

• Ex of non-timely data: the population census results are published two years after the
census survey is completed

5) Metadata :

• It is data about the data.

• It helps in determining data usage

• Ex 1: for a relational database, the schema is its metadata

• Ex 2: the logical and conceptual models are also considered as metadata
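
Two of these dimensions, completeness and consistency, lend themselves to simple automated checks. The Python sketch below is illustrative only: the field names, the sample records and the rule "a cancelled card must not be billed as due" (taken from the inconsistency example above) are assumptions about how such data might be stored.

```python
def completeness(record, expected_fields):
    """Fraction of the expected attributes that are actually provided."""
    provided = [f for f in expected_fields if record.get(f) not in (None, "")]
    return len(provided) / len(expected_fields)


def is_consistent(customer):
    """A cancelled card must not still carry a billing status of 'due'."""
    return not (
        customer["card_status"] == "cancelled"
        and customer["billing_status"] == "due"
    )


# Hypothetical records for illustration.
student = {"name": "Asha", "roll_no": "17", "address": "", "phone": "9876543210"}
print(completeness(student, ["name", "roll_no", "address", "phone"]))  # 0.75

customer = {"card_status": "cancelled", "billing_status": "due"}
print(is_consistent(customer))  # False
```

Note that a record can score 1.0 on completeness and still be inaccurate, as in the restaurant address example.
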

15. Compare Entity Relationship & dimensional modeling.

Ans:

16. Draw ER diagram between Book and Issue_Return entities.


17. How to conduct Data Profiling?

Ans: Data profiling involves statistical analysis of source data and metadata.

Some examples of data analysis are:

1) Data quality : analyze the quality of data at data source

Ex: a column containing phone numbers must be numeric. Hence remove any non-numeric
characters in the field.

2) NULL values: look for the number of null values in an attribute

3) Candidate keys: to select a candidate key, analysis of the extent to which certain
columns are distinct is done.

4) Primary key selection: check that the candidate key does not violate the NOT NULL and
UNIQUE constraints

5) Empty string values: check a column for null values or empty strings, since they create
problems during cube creation.

6) String length: analyzing the largest, average and shortest string length helps decide what
data type is appropriate for that column

7) Numeric length and type: assessing the max and min possible values for a numeric
column helps decide what datatype is suitable for that column.

8) Identification of cardinality:

• The cardinality relationships are important for inner and outer joins with respect to
several BI tools.

• It is also important for design of fact-dimension relationship

9) Data format:

• Changing the data formats to make them more user friendly

• Ex: marital status from “M” & “S” to “married” and “single”
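
Several of the checks listed above (null counts, empty strings, distinctness, string length, candidate keys, data format) can be sketched in a few lines of Python. The sample columns and function names are hypothetical; a real profiling run would read the columns from the data source.

```python
def profile_column(values):
    """Collect simple profiling statistics for one column of str/None values."""
    non_null = [v for v in values if v is not None]
    non_empty = [v for v in non_null if v != ""]
    lengths = [len(v) for v in non_empty]
    return {
        "rows": len(values),
        "nulls": len(values) - len(non_null),
        "empty_strings": len(non_null) - len(non_empty),
        "distinct": len(set(non_empty)),
        "min_length": min(lengths) if lengths else 0,
        "max_length": max(lengths) if lengths else 0,
        "avg_length": sum(lengths) / len(lengths) if lengths else 0.0,
    }


def is_candidate_key(values):
    """A column qualifies only if it is NOT NULL and UNIQUE."""
    return None not in values and len(set(values)) == len(values)


# Hypothetical sample columns.
phone = ["9876543210", None, "", "98765-43210", "9876543210"]
emp_no = [101, 102, 103, 104, 105]
marital = ["M", "S", "M"]

print(profile_column(phone))
print(is_candidate_key(emp_no), is_candidate_key(phone))

# Data format: expand codes into user-friendly labels.
LABELS = {"M": "married", "S": "single"}
print([LABELS[code] for code in marital])
```
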
