
BITS Pilani, Hyderabad Campus
Dr. Aruna Malapati, Asst Professor, Department of CSIS

Extended Association Rule Mining


Today's Learning Objective

• Generate quantitative association rules when the items are categorical or continuous



Continuous and Categorical Attributes

How to apply the association analysis formulation to non-asymmetric binary variables?

Session Id  Country    Session Length (sec)  Web Pages Viewed  Gender  Browser Type  Buy
1           USA        982                   8                 Male    IE            No
2           China      811                   10                Female  Netscape      No
3           USA        2125                  45                Female  Mozilla       Yes
4           Germany    596                   4                 Male    IE            Yes
5           Australia  123                   9                 Male    Mozilla       No
…           …          …                     …                 …       …             …

Example of Association Rule:

{Number of Pages ∈ [5,10) ∧ Browser=Mozilla} → {Buy = No}



Handling Categorical Attributes

• Transform each categorical attribute into asymmetric binary variables (see the sketch below)
• Introduce a new "item" for each distinct attribute-value pair
  – Example: replace the Browser Type attribute with
    • Browser Type = Internet Explorer
    • Browser Type = Mozilla
    • Browser Type = Netscape
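
A minimal sketch of this binarization, assuming pandas is available (the toy column mirrors the session table above):

    # Binarize a categorical attribute into one asymmetric binary
    # "item" per distinct attribute-value pair.
    import pandas as pd

    sessions = pd.DataFrame(
        {"Browser Type": ["IE", "Netscape", "Mozilla", "IE", "Mozilla"]}
    )

    items = pd.get_dummies(sessions["Browser Type"], prefix="Browser Type", dtype=int)
    print(items.columns.tolist())
    # ['Browser Type_IE', 'Browser Type_Mozilla', 'Browser Type_Netscape']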



Handling Categorical Attributes

• Potential issues
  – What if an attribute has many possible values?
    • Example: the Country attribute has more than 200 possible values
    • Many of the attribute values may have very low support
    • Potential solution: aggregate the low-support attribute values (sketched below)
  – What if the distribution of attribute values is highly skewed?
    • Example: 95% of the visitors have Buy = No
    • Most of the items will be associated with the (Buy=No) item
    • Potential solution: drop the highly frequent items
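
A hedged sketch of the aggregation step, assuming pandas; the country values and the support threshold of 2 are illustrative:

    import pandas as pd

    countries = pd.Series(["USA", "China", "USA", "Germany", "Australia", "Fiji"])

    # Aggregate low-support attribute values into a single "Other" item.
    counts = countries.value_counts()
    rare = counts[counts < 2].index
    print(countries.where(~countries.isin(rare), "Other").tolist())
    # ['USA', 'Other', 'USA', 'Other', 'Other', 'Other']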



Handling Continuous Attributes

• Mining continuous attributes may reveal interesting rules, e.g., users whose annual income is more than $120K belong to the 45–60 age group
• Association rules that contain continuous attributes are known as quantitative association rules
• Different methods:
  – Discretization-based
  – Statistics-based
  – Non-discretization-based
    • min-Apriori



Discretization-based Methods

Discretization groups adjacent values of a continuous attribute into a finite number of intervals.



Discretization-based Methods

• A key parameter in attribute discretization is the number of intervals used to partition each attribute.
• It is usually provided by users in one of the following forms:
  – Bin width
  – Bin frequency
  – Number of clusters
• The choice matters. Suppose rules must meet support = 5% and confidence = 65%: if the bin width is large (e.g., a width of 24 for Age), coarse intervals can make overly broad rules emerge as interesting, while very narrow bins can split a genuine rule across intervals so that it loses support (a binning sketch follows).
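
A minimal illustration of the two most common parameterizations, assuming pandas; the age values are invented:

    import pandas as pd

    ages = pd.Series([19, 23, 25, 31, 38, 44, 52, 60, 67])

    # Equal-width binning: every interval spans the same range of values.
    by_width = pd.cut(ages, bins=4)

    # Equal-frequency binning: every interval holds roughly the same count.
    by_freq = pd.qcut(ages, q=4)

    print(by_width.value_counts().sort_index())
    print(by_freq.value_counts().sort_index())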



Statistics-based Methods

• Quantitative association rules can be used to infer the statistical properties of a population.
• The rule consequent consists of a continuous variable, characterized by its statistics:
  – mean, median, standard deviation, etc.



Statistics-based Methods

Approach:
– Withhold the target variable from the rest of the data
– Apply existing frequent itemset generation to the rest of the data
– For each frequent itemset, compute the descriptive statistics of the corresponding target variable
  • A frequent itemset becomes a rule by introducing the target variable as the rule consequent
– Apply a statistical test to determine the interestingness of the rule (a sketch of the first steps follows)
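
A minimal sketch of the first steps, assuming pandas; the item columns and Age values are a toy example, and frequent itemset generation itself is taken as given:

    import pandas as pd

    # Hypothetical transactions: binary items plus a withheld continuous target.
    df = pd.DataFrame({
        "Browser=Mozilla": [1, 0, 1, 1, 0],
        "Buy=Yes":         [1, 1, 1, 0, 0],
        "Age":             [21, 25, 24, 40, 35],
    })

    # For one already-found frequent itemset, compute descriptive statistics
    # of the target over the segment of transactions it covers.
    itemset = ["Browser=Mozilla", "Buy=Yes"]
    covered = df[(df[itemset] == 1).all(axis=1)]["Age"]
    print(covered.mean(), covered.std())  # candidate rule: itemset -> Age: mean=...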



Statistics-based Methods

• How to determine whether an association rule is interesting?
  – Compare the statistics for the segment of the population covered by the rule versus the segment not covered by it:
    • A → B: μ versus A → ¬B: μ′
  – Statistical hypothesis testing:
    • Null hypothesis H0: μ′ = μ + Δ
    • Alternative hypothesis H1: μ′ > μ + Δ
    • Test statistic: Z = (μ′ − μ − Δ) / √(s₁²/n₁ + s₂²/n₂)
    • Z has zero mean and unit variance under the null hypothesis



Statistics-based Methods

Example:
r: {Browser=Mozilla ∧ Buy=Yes} → Age: μ = 23
– The rule is interesting if the difference between μ and μ′ is greater than 5 years (i.e., Δ = 5)
– For r, suppose n₁ = 50, s₁ = 3.5
– For r′ (complement): n₂ = 250, μ′ = 30, s₂ = 6.5

Z = (μ′ − μ − Δ) / √(s₁²/n₁ + s₂²/n₂) = (30 − 23 − 5) / √(3.5²/50 + 6.5²/250) = 3.11

– For a one-sided test at the 95% confidence level, the critical Z-value for rejecting the null hypothesis is 1.64.
– Since Z > 1.64, r is an interesting rule.
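
The arithmetic can be verified with a few lines of Python (values taken from the example above):

    import math

    mu, mu_prime, delta = 23.0, 30.0, 5.0  # means for r and its complement, margin
    n1, s1 = 50, 3.5                        # segment covered by the rule
    n2, s2 = 250, 6.5                       # segment not covered by the rule

    z = (mu_prime - mu - delta) / math.sqrt(s1**2 / n1 + s2**2 / n2)
    print(round(z, 2))  # 3.11 > 1.64, so reject H0: the rule is interesting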



Mining Multiple-Level Association Rules

• Items often form hierarchies
  – For example: Heritage 2% milk → Britannia white wheat bread
• Flexible support settings
  – Items at lower levels of the hierarchy are expected to have lower support
• Exploration of shared multi-level mining (Agrawal & Srikant @ VLDB'95, Han & Fu @ VLDB'95)

Uniform support vs. reduced support:

Level    Item [support]                 Uniform min_sup   Reduced min_sup
1        Milk [10%]                     5%                5%
2        2% Milk [6%], Skim Milk [4%]   5%                3%

Under uniform support, Skim Milk (4%) fails the 5% threshold; under reduced support, the lower Level-2 threshold (3%) keeps it.


Multi-level Association: Flexible Support and Redundancy Filtering

• Flexible min-support thresholds: some items are more valuable but less frequent
  – Use non-uniform, group-based min-support
  – E.g., {diamond, watch, camera}: 0.05%; {bread, milk}: 5%; …
• Redundancy filtering: some rules may be redundant due to "ancestor" relationships between items
  – milk → wheat bread [support = 8%, confidence = 70%]
  – 2% milk → wheat bread [support = 2%, confidence = 72%]
  – The first rule is an ancestor of the second
• A rule is redundant if its support is close to the "expected" value derived from its ancestor (a worked check follows)
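
To make the "expected" value concrete, here is a hedged arithmetic check; the 25% share of 2% milk among milk transactions is an assumed figure for illustration, not from the slides:

    # Hypothetical redundancy check: the expected support of a descendant rule
    # is the ancestor's support scaled by the descendant item's share.
    ancestor_support = 0.08   # milk -> wheat bread
    descendant_share = 0.25   # assumed: 2% milk is 25% of milk transactions

    expected_support = ancestor_support * descendant_share  # ≈ 0.02
    observed_support = 0.02                                 # 2% milk -> wheat bread
    # Observed support matches the expected value, so the specialized rule adds
    # no information beyond its ancestor and can be filtered as redundant.
    print(round(expected_support, 2), observed_support)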



Non-Discretization Methods

• In some applications it is interesting to find associations among the continuous attributes themselves rather than among discrete intervals.
  – E.g., word associations in documents.

TID  W1  W2  W3  W4  W5
D1   2   2   0   0   1
D2   0   0   1   2   2
D3   2   3   0   0   0
D4   0   0   1   0   1
D5   1   1   1   0   2

• The data contains only continuous attributes of the same "type"
  – e.g., frequency of words in a document



Non-Discretization Methods

• How to determine the support of a word?
  – If we simply sum up its frequency, the support count will be greater than the total number of documents!
• Normalize the word vectors, e.g., using the L1 norm, so that each word has support equal to 1.0 (see the sketch below)

TID  W1  W2  W3  W4  W5
D1   2   2   0   0   1
D2   0   0   1   2   2
D3   2   3   0   0   0
D4   0   0   1   0   1
D5   1   1   1   0   2

Normalized (each word's column divided by its column sum):

TID  W1    W2    W3    W4    W5
D1   0.40  0.33  0.00  0.00  0.17
D2   0.00  0.00  0.33  1.00  0.33
D3   0.40  0.50  0.00  0.00  0.00
D4   0.00  0.00  0.33  0.00  0.17
D5   0.20  0.17  0.33  0.00  0.33
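
A minimal sketch of the normalization step, assuming pandas; the matrix is copied from the slide:

    import pandas as pd

    # Document-term counts from the slide.
    counts = pd.DataFrame(
        [[2, 2, 0, 0, 1],
         [0, 0, 1, 2, 2],
         [2, 3, 0, 0, 0],
         [0, 0, 1, 0, 1],
         [1, 1, 1, 0, 2]],
        index=["D1", "D2", "D3", "D4", "D5"],
        columns=["W1", "W2", "W3", "W4", "W5"],
    )

    # L1-normalize each word (column) so its frequencies sum to 1.
    normalized = counts / counts.sum(axis=0)
    print(normalized.round(2))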
Compute Word Associations

TID  W1    W2    W3    W4    W5
D1   0.40  0.33  0.00  0.00  0.17
D2   0.00  0.00  0.33  1.00  0.33
D3   0.40  0.50  0.00  0.00  0.00
D4   0.00  0.00  0.33  0.00  0.17
D5   0.20  0.17  0.33  0.00  0.33

Averaging the normalized frequencies per document gives:
Support(W1, W2) = (0.40+0.33)/2 + 0 + (0.40+0.50)/2 + 0 + (0.20+0.17)/2 = 1

• Since every word's frequencies are normalized to sum to 1, this average-based support equals 1 for any itemset.
• Hence every itemset would be frequent, so the support measure must be modified.



Min-Apriori

New definition of support:

sup(C) = Σ_{i ∈ T} min_{j ∈ C} D(i, j)

where T is the set of transactions (documents) and D(i, j) is the normalized frequency of word j in document i.

TID  W1    W2    W3    W4    W5
D1   0.40  0.33  0.00  0.00  0.17
D2   0.00  0.00  0.33  1.00  0.33
D3   0.40  0.50  0.00  0.00  0.00
D4   0.00  0.00  0.33  0.00  0.17
D5   0.20  0.17  0.33  0.00  0.33

Example (a code sketch follows):
Sup(W1, W2, W3) = 0 + 0 + 0 + 0 + 0.17 = 0.17
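
A minimal implementation sketch of this support measure, assuming pandas; it reproduces the example above and also previews the anti-monotone property shown on the next slide:

    import pandas as pd

    normalized = pd.DataFrame(
        [[0.40, 0.33, 0.00, 0.00, 0.17],
         [0.00, 0.00, 0.33, 1.00, 0.33],
         [0.40, 0.50, 0.00, 0.00, 0.00],
         [0.00, 0.00, 0.33, 0.00, 0.17],
         [0.20, 0.17, 0.33, 0.00, 0.33]],
        index=["D1", "D2", "D3", "D4", "D5"],
        columns=["W1", "W2", "W3", "W4", "W5"],
    )

    def min_support(itemset):
        """Min-Apriori support: sum over documents of the per-row minimum."""
        return normalized[list(itemset)].min(axis=1).sum()

    print(round(min_support(["W1"]), 2))              # 1.0
    print(round(min_support(["W1", "W2"]), 2))        # 0.9
    print(round(min_support(["W1", "W2", "W3"]), 2))  # 0.17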



Anti-monotone Property of Support

TID  W1    W2    W3    W4    W5
D1   0.40  0.33  0.00  0.00  0.17
D2   0.00  0.00  0.33  1.00  0.33
D3   0.40  0.50  0.00  0.00  0.00
D4   0.00  0.00  0.33  0.00  0.17
D5   0.20  0.17  0.33  0.00  0.33

Example:
Sup(W1) = 0.4 + 0 + 0.4 + 0 + 0.2 = 1
Sup(W1, W2) = 0.33 + 0 + 0.4 + 0 + 0.17 = 0.9
Sup(W1, W2, W3) = 0 + 0 + 0 + 0 + 0.17 = 0.17

Support never increases as items are added (1 ≥ 0.9 ≥ 0.17), so the min-based support is anti-monotone and Apriori-style pruning still applies.
Mining Multi-Dimensional Association

• Single-dimensional rules:
  – buys(X, "milk") ⇒ buys(X, "bread")
• Multi-dimensional rules: ≥ 2 dimensions or predicates
  – Inter-dimension assoc. rules (no repeated predicates)
    • age(X, "19-25") ∧ occupation(X, "student") ⇒ buys(X, "coke")
  – Hybrid-dimension assoc. rules (repeated predicates)
    • age(X, "19-25") ∧ buys(X, "popcorn") ⇒ buys(X, "coke")
• Categorical attributes: finite number of possible values, no ordering among values (data cube approach)
• Quantitative attributes: numeric, implicit ordering among values (discretization, clustering, and gradient approaches)



Take Home Message

• Categorical attributes are handled by creating an item for each attribute value and binarizing the data.
• Continuous attributes can be handled by discretizing them into intervals.
• Statistics-based methods help infer the properties of a population; the interestingness measure is modified to a hypothesis test on the consequent's statistics.
• Non-discretization methods handle continuous attributes of the same type by modifying the support measure.

