(Lecture #8)
Introduction to Data Mining, Pearson Education 2006 by Tan P. N., Steinbach M & Kumar V.
Predictive Analytics and Data Mining: Concepts and Practice with RapidMiner by Vijay Kotu and Bala Deshpande, Morgan Kaufmann Publishers © 2015
Source Courtesy: Some of the contents of this presentation are sourced from materials provided by publishers of prescribed books
3/5/2023 BA ZG522 - Business Data Mining 2
Name          Blood Type  Give Birth  Can Fly  Live in Water  Class
hawk          warm        no          yes      no             ?
grizzly bear  warm        yes         no       no             ?
BITS Pilani, Deemed to be University under Section 3 of UGC Act, 1956
How does Rule-based Classifier Work?
R1: (Give Birth = no) ∧ (Can Fly = yes) → Birds
R2: (Give Birth = no) ∧ (Live in Water = yes) → Fishes
R3: (Give Birth = yes) ∧ (Blood Type = warm) → Mammals
R4: (Give Birth = no) ∧ (Can Fly = no) → Reptiles
R5: (Live in Water = sometimes) → Amphibians
Name           Blood Type  Give Birth  Can Fly  Live in Water  Class
lemur          warm        yes         no       no             ?
turtle         cold        no          no       sometimes      ?
dogfish shark  cold        yes         no       yes            ?
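A minimal Python sketch of how rules R1 to R5 can be matched against the three records above. The dictionary encoding and the helper name `covering_rules` are illustrative assumptions, not part of the lecture material.

```python
# Each rule is (name, conditions, predicted class). The conditions are
# attribute = value tests that must all hold (logical AND), as in R1-R5.
RULES = [
    ("R1", {"Give Birth": "no", "Can Fly": "yes"}, "Birds"),
    ("R2", {"Give Birth": "no", "Live in Water": "yes"}, "Fishes"),
    ("R3", {"Give Birth": "yes", "Blood Type": "warm"}, "Mammals"),
    ("R4", {"Give Birth": "no", "Can Fly": "no"}, "Reptiles"),
    ("R5", {"Live in Water": "sometimes"}, "Amphibians"),
]


def covering_rules(record):
    """Return the names of all rules whose conditions the record satisfies."""
    return [
        name
        for name, conditions, _ in RULES
        if all(record.get(attr) == value for attr, value in conditions.items())
    ]


lemur = {"Blood Type": "warm", "Give Birth": "yes",
         "Can Fly": "no", "Live in Water": "no"}
turtle = {"Blood Type": "cold", "Give Birth": "no",
          "Can Fly": "no", "Live in Water": "sometimes"}
dogfish_shark = {"Blood Type": "cold", "Give Birth": "yes",
                 "Can Fly": "no", "Live in Water": "yes"}

print(covering_rules(lemur))          # ['R3']
print(covering_rules(turtle))         # ['R4', 'R5']
print(covering_rules(dogfish_shark))  # []
```

The lemur triggers a single rule, the turtle triggers two rules that predict different classes, and the dogfish shark triggers none.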
• Exhaustive rules
• Classifier has exhaustive coverage if it accounts for every possible
combination of attribute values
• Each record is covered by at least one rule
Name    Blood Type  Give Birth  Can Fly  Live in Water  Class
turtle  cold        no          no       sometimes      ?
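Whether a rule set has exhaustive coverage can be checked by enumerating every combination of attribute values. The sketch below does this for rules R1 to R5. The attribute domains in `DOMAINS` are an assumption inferred from the values that appear in the tables, and all names are illustrative.

```python
from itertools import product

# Rules R1-R5 encoded as condition dictionaries (all conditions must hold).
RULES = {
    "R1": {"Give Birth": "no", "Can Fly": "yes"},
    "R2": {"Give Birth": "no", "Live in Water": "yes"},
    "R3": {"Give Birth": "yes", "Blood Type": "warm"},
    "R4": {"Give Birth": "no", "Can Fly": "no"},
    "R5": {"Live in Water": "sometimes"},
}

# Assumed attribute domains (only the values seen in the example tables).
DOMAINS = {
    "Blood Type": ["warm", "cold"],
    "Give Birth": ["yes", "no"],
    "Can Fly": ["yes", "no"],
    "Live in Water": ["yes", "no", "sometimes"],
}


def count_matches(record):
    """Number of rules whose conditions are all satisfied by the record."""
    return sum(
        all(record[attr] == value for attr, value in conditions.items())
        for conditions in RULES.values()
    )


# Every possible combination of attribute values.
records = [dict(zip(DOMAINS, values)) for values in product(*DOMAINS.values())]

uncovered = [r for r in records if count_matches(r) == 0]
multiply_covered = [r for r in records if count_matches(r) > 1]

print("exhaustive coverage:", not uncovered)
print("records covered by more than one rule:", len(multiply_covered))
```

Under these assumed domains the rule set is not exhaustive, and some records (the turtle, for example) are covered by more than one rule.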
Rule Ordering Schemes
• Rule-based ordering
• Individual rules are ranked based on their quality
• Class-based ordering
• Rules that belong to the same class appear together
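A minimal sketch of an ordered rule list, where the first rule that covers a record decides its class. The rule rankings used here, the `default` class name, and the function name `classify` are assumptions made for illustration; the slides do not specify a ranking.

```python
# Rules R1-R5 as (conditions, predicted class).
RULES = {
    "R1": ({"Give Birth": "no", "Can Fly": "yes"}, "Birds"),
    "R2": ({"Give Birth": "no", "Live in Water": "yes"}, "Fishes"),
    "R3": ({"Give Birth": "yes", "Blood Type": "warm"}, "Mammals"),
    "R4": ({"Give Birth": "no", "Can Fly": "no"}, "Reptiles"),
    "R5": ({"Live in Water": "sometimes"}, "Amphibians"),
}


def classify(record, rule_order, default="Unknown"):
    """Fire the first rule in rule_order that covers the record."""
    for name in rule_order:
        conditions, label = RULES[name]
        if all(record.get(attr) == value for attr, value in conditions.items()):
            return label
    return default  # no rule covers the record


turtle = {"Blood Type": "cold", "Give Birth": "no",
          "Can Fly": "no", "Live in Water": "sometimes"}
dogfish_shark = {"Blood Type": "cold", "Give Birth": "yes",
                 "Can Fly": "no", "Live in Water": "yes"}

# Two hypothetical rankings: the result for the turtle depends on the order.
print(classify(turtle, ["R1", "R2", "R3", "R4", "R5"]))         # Reptiles
print(classify(turtle, ["R5", "R1", "R2", "R3", "R4"]))         # Amphibians
print(classify(dogfish_shark, ["R1", "R2", "R3", "R4", "R5"]))  # Unknown
```

The same function works for class-based ordering: the rules of each class are simply placed next to each other in `rule_order`.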
• Direct Method:
• Extract rules directly from data
• e.g.: RIPPER, CN2, Holte’s 1R
• Indirect Method:
• Extract rules from other classification models (e.g., decision trees, neural networks)
• e.g.: C4.5rules
[Figure: decision tree with internal nodes P, Q and R (branches labelled No / Yes) and the rule set derived from it]
$$P(C \mid A) = \frac{P(A \mid C)\,P(C)}{P(A)}$$
$$P(M \mid S) = \frac{P(S \mid M)\,P(M)}{P(S)} = \frac{0.5 \times 1/50000}{1/20} = 0.0002$$
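The arithmetic of this Bayes theorem example can be reproduced with a short sketch. The function name `posterior` is an illustrative assumption; the three input probabilities are the ones given above.

```python
def posterior(likelihood, prior, evidence):
    """Bayes theorem: P(C | A) = P(A | C) * P(C) / P(A)."""
    return likelihood * prior / evidence


# P(S | M) = 0.5, P(M) = 1/50000, P(S) = 1/20
p_m_given_s = posterior(0.5, 1 / 50000, 1 / 20)
print(round(p_m_given_s, 6))  # 0.0002
```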
• Approach:
• compute the posterior probability P(C | A1, A2, …, An) for all
values of C using the Bayes theorem
$$P(C \mid A_1 A_2 \ldots A_n) = \frac{P(A_1 A_2 \ldots A_n \mid C)\,P(C)}{P(A_1 A_2 \ldots A_n)}$$
$$P(\text{Income} = 120 \mid \text{No}) = \frac{1}{\sqrt{2\pi}\,(56.38)}\, e^{-\frac{(120 - 110)^2}{2(3179)}} = 0.0069$$
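A minimal sketch of the normal density used for a continuous attribute, evaluated with the mean (110) and variance (3179) from the example above. The function name `gaussian_pdf` is an illustrative assumption.

```python
import math


def gaussian_pdf(x, mean, variance):
    """Density of a normal distribution with the given mean and variance."""
    exponent = -((x - mean) ** 2) / (2 * variance)
    return math.exp(exponent) / math.sqrt(2 * math.pi * variance)


# Sample mean 110 and sample variance 3179 (standard deviation about 56.38).
p_income = gaussian_pdf(120, mean=110, variance=3179)
print(p_income)  # close to 0.007, in line with the value on the slide
```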
$$\text{Original: } P(A_i \mid C) = \frac{N_{ic}}{N_c}$$

$$\text{Laplace: } P(A_i \mid C) = \frac{N_{ic} + 1}{N_c + c}$$

c: number of classes
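The two estimates can be compared with a short sketch. The function names and the counts in the usage example (`n_ic = 0`, `n_c = 3`, `c = 2`) are hypothetical and chosen only to show the effect on a zero count.

```python
def original_estimate(n_ic, n_c):
    """Original estimate: P(Ai | C) = N_ic / N_c."""
    return n_ic / n_c


def laplace_estimate(n_ic, n_c, c):
    """Laplace estimate: P(Ai | C) = (N_ic + 1) / (N_c + c)."""
    return (n_ic + 1) / (n_c + c)


# Hypothetical counts: the attribute value never occurs with class C.
print(original_estimate(0, 3))       # 0.0
print(laplace_estimate(0, 3, c=2))   # 0.2
```

With the original estimate a single zero count makes the whole product of conditional probabilities zero; the Laplace estimate avoids this.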
Example of Naïve Bayes Classifier
Name           Give Birth  Can Fly  Live in Water  Have Legs  Class
human          yes         no       no             yes        mammals
python         no          no       no             no         non-mammals
salmon         no          no       yes            no         non-mammals
whale          yes         no       yes            no         mammals
frog           no          no       sometimes      yes        non-mammals
komodo         no          no       no             yes        non-mammals
bat            yes         yes      no             yes        mammals
pigeon         no          yes      no             yes        non-mammals
cat            yes         no       no             yes        mammals
leopard shark  yes         no       yes            no         non-mammals
turtle         no          no       sometimes      yes        non-mammals
penguin        no          no       sometimes      yes        non-mammals
porcupine      yes         no       no             yes        mammals
eel            no          no       yes            no         non-mammals
salamander     no          no       sometimes      yes        non-mammals
gila monster   no          no       no             yes        non-mammals
platypus       no          no       no             yes        mammals
owl            no          yes      no             yes        non-mammals
dolphin        yes         no       yes            no         mammals
eagle          no          yes      no             yes        non-mammals

A: attributes
M: mammals
N: non-mammals

$$P(A \mid M) = \frac{6}{7} \times \frac{6}{7} \times \frac{2}{7} \times \frac{2}{7} = 0.06$$

$$P(A \mid N) = \frac{1}{13} \times \frac{10}{13} \times \frac{3}{13} \times \frac{4}{13} = 0.0042$$

$$P(A \mid M)\,P(M) = 0.06 \times \frac{7}{20} = 0.021$$

$$P(A \mid N)\,P(N) = 0.0042 \times \frac{13}{20} = 0.0027$$

Since P(A | M) P(M) > P(A | N) P(N), the record A is classified as mammals.
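The calculation above can be reproduced directly from the 20 training records. In the sketch below, the query record (Give Birth = yes, Can Fly = no, Live in Water = yes, Have Legs = no) is an assumption inferred from the fractions in the calculation, and the function and variable names are illustrative.

```python
from collections import Counter

# (name, give birth, can fly, live in water, have legs, class)
DATA = [
    ("human", "yes", "no", "no", "yes", "mammals"),
    ("python", "no", "no", "no", "no", "non-mammals"),
    ("salmon", "no", "no", "yes", "no", "non-mammals"),
    ("whale", "yes", "no", "yes", "no", "mammals"),
    ("frog", "no", "no", "sometimes", "yes", "non-mammals"),
    ("komodo", "no", "no", "no", "yes", "non-mammals"),
    ("bat", "yes", "yes", "no", "yes", "mammals"),
    ("pigeon", "no", "yes", "no", "yes", "non-mammals"),
    ("cat", "yes", "no", "no", "yes", "mammals"),
    ("leopard shark", "yes", "no", "yes", "no", "non-mammals"),
    ("turtle", "no", "no", "sometimes", "yes", "non-mammals"),
    ("penguin", "no", "no", "sometimes", "yes", "non-mammals"),
    ("porcupine", "yes", "no", "no", "yes", "mammals"),
    ("eel", "no", "no", "yes", "no", "non-mammals"),
    ("salamander", "no", "no", "sometimes", "yes", "non-mammals"),
    ("gila monster", "no", "no", "no", "yes", "non-mammals"),
    ("platypus", "no", "no", "no", "yes", "mammals"),
    ("owl", "no", "yes", "no", "yes", "non-mammals"),
    ("dolphin", "yes", "no", "yes", "no", "mammals"),
    ("eagle", "no", "yes", "no", "yes", "non-mammals"),
]


def naive_bayes_scores(query):
    """Return P(A | C) * P(C) for each class, using relative frequencies."""
    class_counts = Counter(row[-1] for row in DATA)
    scores = {}
    for cls, n_c in class_counts.items():
        score = n_c / len(DATA)  # prior P(C)
        for i, value in enumerate(query, start=1):
            # N_ic: records of class cls with this value for attribute i
            n_ic = sum(1 for row in DATA if row[-1] == cls and row[i] == value)
            score *= n_ic / n_c  # conditional P(Ai | C)
        scores[cls] = score
    return scores


# Assumed query: give birth = yes, can fly = no, live in water = yes, legs = no
scores = naive_bayes_scores(("yes", "no", "yes", "no"))
prediction = max(scores, key=scores.get)
print(scores)
print(prediction)  # mammals
```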
Step 1: Create multiple data sets: D1, D2, ..., Dt-1, Dt
Step 2: Build multiple classifiers: C1, C2, ..., Ct-1, Ct
Step 3: Combine classifiers: C*
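One possible instance of these three steps is bagging: each data set is a bootstrap sample, and the classifiers are combined by majority vote. The sketch below is written under those assumptions. The one-dimensional threshold stump, the toy data set, and all function names are hypothetical and serve only to keep the example self-contained.

```python
import random
from collections import Counter


def bootstrap(data, rng):
    """Step 1: draw a sample of the same size, with replacement."""
    return [rng.choice(data) for _ in data]


def train_stump(sample):
    """Step 2: fit a threshold stump that minimises training errors."""
    best = None
    for t in sorted({x for x, _ in sample}):
        for left, right in ((0, 1), (1, 0)):
            errors = sum((left if x <= t else right) != y for x, y in sample)
            if best is None or errors < best[0]:
                best = (errors, t, left, right)
    _, t, left, right = best
    return lambda x: left if x <= t else right


def bagging(data, n_classifiers, seed=0):
    """Step 3: combine the base classifiers by majority vote."""
    rng = random.Random(seed)
    models = [train_stump(bootstrap(data, rng)) for _ in range(n_classifiers)]

    def predict(x):
        votes = Counter(model(x) for model in models)
        return votes.most_common(1)[0][0]

    return predict


# Toy data: (attribute value, class label)
data = [(0.1, 0), (0.2, 0), (0.3, 0), (0.7, 1), (0.8, 1), (0.9, 1)]
predict = bagging(data, n_classifiers=11)
print(predict(0.05), predict(0.95))
```

An odd number of base classifiers is used so that a two-class vote cannot tie.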
Why does it work?
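• Probability that a majority (13 or more) of 25 independent base classifiers, each with error rate ε = 0.35, make a wrong prediction:

$$\sum_{i=13}^{25} \binom{25}{i}\, \varepsilon^{i} (1 - \varepsilon)^{25 - i} = 0.06$$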
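The ensemble error can be evaluated numerically as a binomial tail sum over 25 base classifiers. The sketch assumes independent base classifiers that all share one error rate; the value ε = 0.35 is the one used in the textbook version of this example, and the function name `ensemble_error` is illustrative.

```python
from math import comb


def ensemble_error(n_classifiers, eps):
    """Probability that a majority of independent base classifiers are wrong."""
    majority = n_classifiers // 2 + 1  # 13 when n_classifiers = 25
    return sum(
        comb(n_classifiers, i) * eps ** i * (1 - eps) ** (n_classifiers - i)
        for i in range(majority, n_classifiers + 1)
    )


p_wrong = ensemble_error(25, 0.35)
print(round(p_wrong, 2))
```

When ε reaches 0.5 the same sum gives 0.5, so the ensemble only helps if each base classifier does better than random guessing.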
How does it work?
• f is binary or nominal:
$d_{ij}^{(f)} = 0$ if $x_{if} = x_{jf}$, or $d_{ij}^{(f)} = 1$ otherwise
• f is numeric: use the normalized distance
• f is ordinal
• Compute ranks $r_{if}$ and $z_{if} = \frac{r_{if} - 1}{M_f - 1}$
• Treat $z_{if}$ as interval-scaled
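The per-attribute distances listed above can be sketched as follows. Normalizing a numeric attribute by its range (max minus min) is an assumption about what "normalized distance" means here, and the function names and example values are hypothetical.

```python
def nominal_distance(x_if, x_jf):
    """Binary or nominal attribute: 0 if the values match, 1 otherwise."""
    return 0.0 if x_if == x_jf else 1.0


def numeric_distance(x_if, x_jf, min_f, max_f):
    """Numeric attribute: absolute difference divided by the attribute range
    (range normalization is an assumption, see the note above)."""
    return abs(x_if - x_jf) / (max_f - min_f)


def ordinal_to_interval(r_if, m_f):
    """Ordinal attribute: map rank r_if in 1..M_f to z_if in [0, 1]."""
    return (r_if - 1) / (m_f - 1)


# Hypothetical ordinal attribute with M_f = 3 ordered levels.
ranks = {"low": 1, "medium": 2, "high": 3}
z = {level: ordinal_to_interval(rank, 3) for level, rank in ranks.items()}
print(z)  # {'low': 0.0, 'medium': 0.5, 'high': 1.0}

print(nominal_distance("warm", "cold"))              # 1.0
print(numeric_distance(120, 110, min_f=60, max_f=220))  # 0.0625
# Once mapped to z, an ordinal attribute is handled like a numeric one.
print(numeric_distance(z["low"], z["high"], 0.0, 1.0))  # 1.0
```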
• Retailer can take advantage of this association for bundle pricing, product
placement, and even shelf space optimization within the store layout.