College of Commerce
EMBA Master Program Cohort 5
IT Strategy Course
Business Intelligence
(BI)
Presented by:
Mohamed Salah Eladawy
Mostafa Samy Shaawat
Hasan Ibrahim Hasan
Supervised by:
Dr. Maged Farouk Elsayed
Outline
IT Strategy
What is Business Intelligence?
(1-4)
Enables the business to make intelligent
and fact-based decisions.
Aggregate Data
Present Data: Reporting Tools, Dashboards, Static Reports, Mobile Reporting, OLAP Cubes
Enrich Data
Inform a Decision
Questions across the past, present, and future:
What happened?
What is happening?
Why did it happen?
What will happen?
What do I want to happen?
[Figure: Data from ERP, CRM, SCM, and 3Pty systems; Business Intelligence (BI)]
Outline
Data Marts:
Subset of data warehouse.
Summarized or highly focused portion of the firm's data.
Typically focuses on single subject or line of business.
ETL
(2-4)
Outline
What is OLAP?
(1-15)
at one time.
OLAP CUBE
(2-15)
OLTP vs OLAP
(3-15)
OLAP
(4-15)
(5-15)
Sales Fact Table: time_key, item_key, branch_key, location_key
Measures: units_sold, dollars_sold, avg_sales
item: item_key, item_name, brand, type, supplier_key
supplier: supplier_key, supplier_type
branch: branch_key, branch_name, branch_type
location: location_key, street, city_key
city: city_key, city, province_or_street, country
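A fact table is typically queried by joining it to a dimension table and aggregating a measure. The sketch below illustrates this with Python's built-in sqlite3 module. It uses only a simplified subset of the columns above, and every row of data is made up for illustration.

```python
import sqlite3

# In-memory database; table contents are hypothetical illustration data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE item   (item_key INTEGER PRIMARY KEY, item_name TEXT, brand TEXT);
CREATE TABLE branch (branch_key INTEGER PRIMARY KEY, branch_name TEXT);
CREATE TABLE sales  (item_key INTEGER, branch_key INTEGER,
                     units_sold INTEGER, dollars_sold REAL);
INSERT INTO item   VALUES (1, 'TV', 'BrandA'), (2, 'Radio', 'BrandB');
INSERT INTO branch VALUES (10, 'North'), (20, 'South');
INSERT INTO sales  VALUES (1, 10, 2, 800.0), (1, 20, 1, 400.0), (2, 10, 5, 250.0);
""")


def dollars_by_brand(connection):
    """Join the fact table to the item dimension and sum one measure per brand."""
    rows = connection.execute("""
        SELECT item.brand, SUM(sales.dollars_sold)
        FROM sales JOIN item ON sales.item_key = item.item_key
        GROUP BY item.brand
        ORDER BY item.brand
    """).fetchall()
    return dict(rows)


brand_totals = dollars_by_brand(conn)
print(brand_totals)
```

The keys (item_key, branch_key) link each fact row to its dimensions; the measures (units_sold, dollars_sold) are what gets aggregated.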
OLAP Operations
(6-15)
Roll Up
Drill Down
Single Cell
Multiple Cells
Slice
Dice
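Slice and dice can be illustrated on a tiny cube held in a Python dictionary. This is a minimal sketch, not a real OLAP engine: the cube contents and the helper names slice_cube and dice_cube are assumptions made for the example.

```python
# Hypothetical cube: (product, region, sales channel) -> units sold.
cube = {
    ("Audio", "Europe", "Retail"): 10,
    ("Audio", "India", "Direct"): 4,
    ("Video", "Europe", "Retail"): 7,
    ("Video", "Far East", "Special"): 3,
}


def slice_cube(cells, axis, value):
    """Slice: fix one dimension (by position) to a single value."""
    return {key: v for key, v in cells.items() if key[axis] == value}


def dice_cube(cells, allowed):
    """Dice: keep cells whose coordinate is in the allowed set on every listed axis."""
    return {
        key: v
        for key, v in cells.items()
        if all(key[axis] in values for axis, values in allowed.items())
    }


europe_slice = slice_cube(cube, 1, "Europe")
sub_cube = dice_cube(cube, {0: {"Audio", "Video"}, 1: {"Europe", "India"}})
print(europe_slice)
print(sub_cube)
```

A slice fixes a single value on one dimension, while a dice selects a range of values on two or more dimensions and returns a smaller cube.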
OLAP 3-Dimensions
(7-15)
[Figure: data cube with three dimensions: Product; City, Office; Year, Month, Week, Day]
OLAP Pivot
(8-15)
Pivot (Rotate)
(9-15)
[Figure: sales cube with three dimensions]
Product: Household, Telecomm, Video, Audio
Region: Europe, Far East, India
Sales Channel: Retail, Direct, Special
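(10-15)
Dimension hierarchy, from the higher level of aggregation to low-level details:
Sales Channel
Region
Country
State
Location Address
Sales Representative
Roll Up moves up the hierarchy, toward a higher level of aggregation.
Drill-Down moves down the hierarchy, toward low-level details.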
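Roll up and drill down amount to aggregating the same detail rows at different levels of a hierarchy. The sketch below assumes a shortened three-level hierarchy (region, country, state) and made-up sales figures; the function name aggregate is hypothetical.

```python
from collections import defaultdict

# Hypothetical detail rows: (region, country, state, sales).
rows = [
    ("Americas", "USA", "California", 120),
    ("Americas", "USA", "Texas", 80),
    ("Americas", "Canada", "Ontario", 50),
    ("Europe", "France", "Ile-de-France", 70),
]
LEVELS = ("region", "country", "state")


def aggregate(detail_rows, level):
    """Sum sales at the given level; a higher level is a roll up, a lower one a drill down."""
    depth = LEVELS.index(level) + 1
    totals = defaultdict(int)
    for row in detail_rows:
        totals[row[:depth]] += row[3]
    return dict(totals)


by_state = aggregate(rows, "state")      # low-level details
by_country = aggregate(rows, "country")  # rolled up one level
by_region = aggregate(rows, "region")    # highest level of aggregation
print(by_region)
```

Moving from by_state to by_region is a roll up; moving the other way is a drill down.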
Roll Up
(11-15)
Drill Down
(12-15)
(13-15)
Further drilled down to just stores in California
(14-15)
(15-15)
... being "average daily sales for Los Angeles Store for Feb"
... being "average daily sales for Los Angeles Store for year"
Outline
Data Mining
Finds hidden patterns and relationships in large
databases and infers rules to predict future
behavior.
E.g., finding patterns in customer data for one-to-one
marketing campaigns or to identify profitable customers.
Types of information obtainable from data mining:
Associations
Sequences
Classification
Clustering
Forecasting
[Figure: knowledge discovery process, from source data to knowledge]
Databases, Flat Files
Cleaning and Integration
Data Warehouse
Selection and Transformation
Data Mining
Knowledge
Data Mining is
Data Mining
Data mining is also called knowledge
discovery in databases (KDD).
Data mining is extraction of useful
patterns from data.
Association Rules
(1-8)
Association rules:
80% of customers who buy cheese and
milk also buy bread, and 5% of
customers buy all of them together
Cheese, Milk → Bread [support = 5%, confidence = 80%]
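Support and confidence can be computed directly from a list of baskets. The sketch below assumes a small made-up set of five baskets, so its numbers differ from the 5% and 80% in the rule above; the function names are hypothetical.

```python
# Hypothetical market baskets (illustration data only).
baskets = [
    {"cheese", "milk", "bread"},
    {"cheese", "milk"},
    {"milk", "bread"},
    {"bread"},
    {"cheese", "milk", "bread", "eggs"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    return sum(itemset <= basket for basket in transactions) / len(transactions)


def confidence(lhs, rhs, transactions):
    """Among transactions containing lhs, the fraction that also contain rhs."""
    return support(lhs | rhs, transactions) / support(lhs, transactions)


rule_support = support({"cheese", "milk", "bread"}, baskets)
rule_confidence = confidence({"cheese", "milk"}, {"bread"}, baskets)
print(rule_support, rule_confidence)
```

Support measures how often all the items appear together; confidence measures how often the right-hand side appears given the left-hand side.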
Association Rules
(2-8)
Some concepts
Market-basket model
Association Rules
(3-8)
Deviation detection:
discovering the most significant changes
in data
Association Rule
(8-8)
An implication expression of the form
X → Y, where X and Y are itemsets
Example:
{Milk, Diaper} → {Beer}
Rule evaluation metrics:
Support (s)
Confidence (c)
Classification
(1-4)
Introduction
Classification is the process of learning a model
that describes different classes of data. The
classes are predetermined.
The model that is produced is usually in the
form of a decision tree or a set of rules.
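[Figure: sample decision tree]
married = yes: test salary
    salary <20k: Poor risk
    salary >=20k and <50k: Fair risk
    salary >=50k: Good risk
married = no: test acct balance
    acct balance <5k: Poor risk
    acct balance >=5k: test age
        age <25: Fair risk
        age >=25: Good risk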
RID  Married  Salary    Acct balance  Age   Loanworthy
1    No       >=50k     <5k           >=25  Yes
2    Yes      >=50k     >=5k          >=25  Yes
3    Yes      20k..50k  <5k           <25   No
4    No       <20k      >=5k          <25   No
5    No       <20k      <5k           >=25  No
6    Yes      20k..50k  >=5k          >=25  Yes
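Expected information
$$I(S_1, S_2, \ldots, S_n) = -\sum_{i=1}^{n} p_i \log_2 p_i$$
I(3,3)=1
Entropy
$$E(A) = \sum_{j=1}^{m} \frac{S_{j1} + \cdots + S_{jn}}{S} \cdot I(S_{j1}, \ldots, S_{jn})$$
where attribute A splits the S samples into m subsets, and S_{ji} is the number of samples of class i in subset j.
Information gain
Gain(A) = I - E(A)
E(Married)=0.92
Gain(Married)=0.08
E(Salary)=0.33
Gain(Salary)=0.67
E(A.balance)=0.92
Gain(A.balance)=0.08
E(Age)=0.54
Gain(Age)=0.46
Class attribute: Loanworthy. Salary has the greatest gain, so it becomes the first split:
Salary <20k: class is no {4,5}
Salary 20k..50k: split further on age (<25, >=25)
Salary >=50k: class is yes {1,2}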
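The information-gain calculation can be checked with a few lines of Python. This is a sketch under the assumption that the six training records (RID 1 to 6) are encoded as tuples in the order Married, Salary, Acct balance, Age, Loanworthy; the function and variable names are hypothetical.

```python
from collections import Counter
from math import log2

# The six training records, encoded as (married, salary, acct_balance, age, loanworthy).
ROWS = [
    ("no", ">=50k", "<5k", ">=25", "yes"),
    ("yes", ">=50k", ">=5k", ">=25", "yes"),
    ("yes", "20k..50k", "<5k", "<25", "no"),
    ("no", "<20k", ">=5k", "<25", "no"),
    ("no", "<20k", "<5k", ">=25", "no"),
    ("yes", "20k..50k", ">=5k", ">=25", "yes"),
]
ATTRIBUTES = {"married": 0, "salary": 1, "acct_balance": 2, "age": 3}


def info(labels):
    """Expected information I: minus the sum of p_i * log2(p_i) over the classes."""
    total = len(labels)
    return -sum((c / total) * log2(c / total) for c in Counter(labels).values())


def entropy_after_split(rows, column):
    """E(A): weighted average of I over the subsets produced by attribute A."""
    groups = {}
    for row in rows:
        groups.setdefault(row[column], []).append(row[-1])
    return sum(len(g) / len(rows) * info(g) for g in groups.values())


def gain(rows, column):
    """Gain(A) = I - E(A)."""
    return info([row[-1] for row in rows]) - entropy_after_split(rows, column)


gains = {name: gain(ROWS, col) for name, col in ATTRIBUTES.items()}
best_attribute = max(gains, key=gains.get)
print(gains, best_attribute)
```

On these records the largest gain belongs to salary, which is why it is chosen as the first attribute to split on.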
Classification
(2-4)
Direct Marketing
Goal: Reduce cost of mailing by targeting a set of
consumers likely to buy a new cell-phone
product.
Approach:
Use the data for a similar product introduced before.
We know which customers decided to buy and which
decided otherwise. This {buy, don't buy} decision
forms the class attribute.
Collect various demographic, lifestyle, and company-interaction
related information about all such customers.
Type of business, where they stay, how much they
earn, etc.
Classification
(3-4)
Fraud Detection
Goal: Predict fraudulent cases in credit card
transactions.
Approach:
Use credit card transactions and the information on
its account-holder as attributes.
When a customer buys, what they buy, how
often they pay on time, etc.
Neural network
(4-4)
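Basic NN unit: inputs x1, x2, x3 with weights w1, w2, w3
$$o = \sigma\left(\sum_{i=1}^{n} w_i x_i\right)$$
$$\sigma(y) = \frac{1}{1 + e^{-y}}$$
[Figure: network in which inputs x1, x2, x3 feed hidden nodes, which feed output nodes]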
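The basic unit and a one-hidden-layer network can be sketched in plain Python. This is a minimal illustration with no bias terms and no training step; the function names and the weights in the usage lines are assumptions, not values from the slides.

```python
from math import exp


def sigmoid(y):
    """Logistic activation: 1 / (1 + e^(-y))."""
    return 1.0 / (1.0 + exp(-y))


def neuron_output(inputs, weights):
    """Basic unit: sigmoid of the weighted sum of the inputs."""
    return sigmoid(sum(w * x for w, x in zip(weights, inputs)))


def forward(inputs, hidden_weights, output_weights):
    """Feed the inputs through one layer of hidden nodes, then the output nodes."""
    hidden = [neuron_output(inputs, w) for w in hidden_weights]
    return [neuron_output(hidden, w) for w in output_weights]


# Hypothetical weights: two hidden nodes over three inputs, one output node.
single = neuron_output([1.0, 2.0, 3.0], [0.5, -0.25, 0.0])
network = forward([1.0, 2.0, 3.0], [[0.5, -0.25, 0.0], [0.1, 0.1, 0.1]], [[1.0, -1.0]])
print(single, network)
```

With weights 0.5, -0.25, and 0.0 the weighted sum for the single unit is zero, so its output is sigmoid(0) = 0.5.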
Clustering
(1-4)
Clustering
(2-4)
Introduction
The previous data mining task of classification
deals with partitioning data based on a preclassified training sample.
Clustering is an automated process to group
related records together.
Related records are grouped together on the
basis of having similar values for attributes.
The groups are usually disjoint.
Clustering
(3-4)
Some concepts
An important facet of clustering is the
similarity function that is used.
When the data is numeric, a similarity
function based on distance is typically
used.
Examples: Euclidean metric (Euclidean distance),
Manhattan metric.
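The two distance functions named above are short enough to write out. The sketch below assumes records are equal-length tuples of numbers; the function names are hypothetical.

```python
from math import sqrt


def euclidean(a, b):
    """Straight-line distance: square root of the summed squared differences."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def manhattan(a, b):
    """City-block distance: sum of the absolute differences."""
    return sum(abs(x - y) for x, y in zip(a, b))


d_euclid = euclidean((0, 0), (3, 4))
d_manhattan = manhattan((0, 0), (3, 4))
print(d_euclid, d_manhattan)
```

The smaller the distance between two records, the more similar they are treated as being.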
K-Means Clustering
(4-4)
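The slide gives only the title, so the following is a generic sketch of the standard k-means loop rather than the presenters' own material. It assumes 2-D points, caller-supplied starting centroids, and a fixed number of iterations; the data is made up.

```python
def kmeans(points, centroids, iterations=10):
    """Plain k-means on 2-D points, starting from the given centroids."""
    clusters = []
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid
        # (squared Euclidean distance is enough for comparison).
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(
                range(len(centroids)),
                key=lambda i: (p[0] - centroids[i][0]) ** 2 + (p[1] - centroids[i][1]) ** 2,
            )
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster;
        # an empty cluster keeps its previous centroid.
        centroids = [
            (sum(q[0] for q in c) / len(c), sum(q[1] for q in c) / len(c)) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
    return centroids, clusters


# Hypothetical data: two visually separate groups of points.
points = [(1, 1), (1, 2), (2, 1), (8, 8), (9, 8), (8, 9)]
final_centroids, final_clusters = kmeans(points, [(0, 0), (10, 10)])
print(final_centroids)
```

Each iteration alternates between assigning records to the nearest centroid and recomputing the centroids, which yields the disjoint groups described earlier.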
Discussion
THANK
YOU!