
6. Association Analysis
Rules

{Item A} -> {Item B}

{News, Finance} -> {Sports}


Steps
Data format
Support and confidence of the rule

The support of an item is simply the relative frequency of occurrence of that item in the transaction set.

The confidence of a rule measures the likelihood of occurrence of the consequent of the rule out of all the transactions that contain the antecedent of the rule.
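
The two definitions can be sketched in a few lines of Python. This is only an illustration: the six transactions are hypothetical (the text does not list them) and were chosen so that the results agree with the support values quoted later in these slides; `support` and `confidence` are illustrative names.

```python
# Hypothetical toy data: an assumption, not taken from the text.
transactions = [
    {"News", "Finance"},
    {"News", "Finance"},
    {"News", "Finance", "Sports"},
    {"Entertainment"},
    {"News", "Finance", "Sports"},
    {"News"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)


def confidence(antecedent, consequent, transactions):
    """Support of antecedent and consequent together, divided by support of the antecedent."""
    both = set(antecedent) | set(consequent)
    return support(both, transactions) / support(antecedent, transactions)


print(round(support({"News"}, transactions), 2))
print(round(confidence({"News", "Finance"}, {"Sports"}, transactions), 2))
```

With this toy data, News appears in 5 of 6 transactions, and Sports appears in 2 of the 4 transactions that contain both News and Finance.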
Item set tree
null
{News}   {Finance}   {Sports}   {Entertainment}
{News, Finance}   {News, Sports}   {News, Entertainment}   {Finance, Sports}   {Finance, Entertainment}   {Sports, Entertainment}
{News, Finance, Sports}   {News, Finance, Entertainment}   {News, Sports, Entertainment}   {Finance, Sports, Entertainment}
{News, Finance, Sports, Entertainment}
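
The levels of the item set tree can be generated mechanically. The sketch below uses `itertools.combinations` from the Python standard library; the function name `itemset_levels` is an illustrative choice, not something from the slides.

```python
from itertools import combinations

items = ["News", "Finance", "Sports", "Entertainment"]


def itemset_levels(items):
    """Return one list per tree level; level k holds all item sets of size k."""
    return [list(combinations(items, k)) for k in range(len(items) + 1)]


levels = itemset_levels(items)
for k, level in enumerate(levels):
    print(k, len(level), level)
```

For four items the levels hold 1, 4, 6, 4 and 1 item sets, 16 in total, which is why checking every item set by brute force quickly becomes expensive as the number of items grows.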
Apriori algo
If an item set is frequent, then all of its subsets will also be frequent.

Example: if {News, Finance, Sports} is frequent, then all of its subsets ({News, Finance}, {News, Sports}, {Finance, Sports}, {News}, {Finance}, {Sports}) will be frequent.
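
As a small illustration of this principle, the following sketch lists the non-empty proper subsets of {News, Finance, Sports}, that is, the item sets that must be frequent whenever the three-item set is. The helper name `proper_subsets` is hypothetical.

```python
from itertools import combinations


def proper_subsets(itemset):
    """All non-empty subsets of `itemset` that are smaller than the set itself."""
    items = sorted(itemset)
    return [
        frozenset(combo)
        for k in range(1, len(items))
        for combo in combinations(items, k)
    ]


subsets = proper_subsets({"News", "Finance", "Sports"})
for s in subsets:
    print(sorted(s))
```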
Apriori algo
Conversely, if an item set is infrequent, then all of its supersets will be infrequent.

Example: if {Entertainment} is infrequent, then all of its supersets will be infrequent: {News, Entertainment}, {Finance, Entertainment}, {Sports, Entertainment}, {News, Finance, Entertainment}, {News, Sports, Entertainment}, {Finance, Sports, Entertainment} and {News, Finance, Sports, Entertainment}.
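
The pruning effect of this rule can be sketched as follows. The code enumerates every non-empty item set over the four items and separates the proper supersets of {Entertainment} (which need no support count) from the item sets that survive. The names `all_itemsets`, `pruned` and `survivors` are illustrative.

```python
from itertools import combinations

items = ["News", "Finance", "Sports", "Entertainment"]
infrequent = frozenset({"Entertainment"})


def all_itemsets(items):
    """Every non-empty item set that can be formed from `items`."""
    return [
        frozenset(combo)
        for k in range(1, len(items) + 1)
        for combo in combinations(items, k)
    ]


candidates = all_itemsets(items)
# Proper supersets of the infrequent item set are skipped without counting.
pruned = [s for s in candidates if infrequent < s]
# Item sets that do not contain the infrequent item set remain candidates.
survivors = [s for s in candidates if not infrequent <= s]

print(len(candidates), len(pruned), len(survivors))
```

Of the 15 non-empty item sets, one infrequent single item removes 7 further candidates, leaving 7 item sets to evaluate.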
Support Calculation
Elimination
If Entertainment is infrequent, all of its supersets will be infrequent and are eliminated. Support values of the remaining item sets:

{News} 0.83   {Finance} 0.67   {Sports} 0.33
{News, Finance} 0.67   {News, Sports} 0.33   {Finance, Sports} 0.33
{News, Finance, Sports} 0.33
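
The support calculation and elimination steps can be combined into a level-wise search. The following is a minimal sketch of the Apriori idea, not a tuned implementation. Two things are assumptions: the six transactions are hypothetical data chosen to reproduce the support values in this section, and the minimum support of 0.25 is an assumed threshold, since the text does not state one.

```python
from itertools import combinations

# Hypothetical toy data: an assumption, not taken from the text.
transactions = [
    {"News", "Finance"},
    {"News", "Finance"},
    {"News", "Finance", "Sports"},
    {"Entertainment"},
    {"News", "Finance", "Sports"},
    {"News"},
]
MIN_SUPPORT = 0.25  # assumed threshold


def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    return sum(1 for t in transactions if itemset <= t) / len(transactions)


def apriori(transactions, min_support):
    """Return a dict mapping each frequent item set to its support."""
    items = sorted({item for t in transactions for item in t})
    current = [frozenset([item]) for item in items]
    frequent = {}
    k = 1
    while current:
        # Support calculation and elimination for the current level.
        level = {}
        for candidate in current:
            s = support(candidate, transactions)
            if s >= min_support:
                level[candidate] = s
        frequent.update(level)
        # Join step: build candidates of size k + 1 from frequent k-item sets.
        k += 1
        joined = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: every (k - 1)-item subset must itself be frequent.
        current = [
            c for c in joined
            if all(frozenset(sub) in level for sub in combinations(c, k - 1))
        ]
    return frequent


frequent = apriori(transactions, MIN_SUPPORT)
for itemset, s in frequent.items():
    print(sorted(itemset), round(s, 2))
```

Because Entertainment is dropped at the first level, no item set containing it is ever generated, which is the elimination shown in the tree.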
Final rules

{News, Sports}->{Finance} – 0.33 / 0.33 = 1.0


{News, Finance}->{Sports} – 0.33 / 0.67 = 0.5
{Sports, Finance}->{News} – 0.33 / 0.33 = 1.0
{News}->{Sports, Finance} – 0.33 / 0.83 = 0.4
{Sports}->{News, Finance} – 0.33 / 0.33 = 1.0
{Finance}->{News, Sports} – 0.33 / 0.67 = 0.5
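
The confidence values above follow directly from the support values. The sketch below recomputes them: the support dictionary uses the rounded values quoted in this section, and `rules_from` is an illustrative helper name.

```python
from itertools import combinations

# Support values as quoted in this section (already rounded to two decimals).
support = {
    frozenset({"News"}): 0.83,
    frozenset({"Finance"}): 0.67,
    frozenset({"Sports"}): 0.33,
    frozenset({"News", "Finance"}): 0.67,
    frozenset({"News", "Sports"}): 0.33,
    frozenset({"Finance", "Sports"}): 0.33,
    frozenset({"News", "Finance", "Sports"}): 0.33,
}


def rules_from(itemset, support):
    """Split `itemset` into every antecedent/consequent pair and score each rule."""
    itemset = frozenset(itemset)
    rules = []
    for k in range(1, len(itemset)):
        for combo in combinations(sorted(itemset), k):
            antecedent = frozenset(combo)
            consequent = itemset - antecedent
            conf = support[itemset] / support[antecedent]
            rules.append((antecedent, consequent, conf))
    return rules


rules = rules_from({"News", "Finance", "Sports"}, support)
for antecedent, consequent, conf in rules:
    print(sorted(antecedent), "->", sorted(consequent), round(conf, 2))
```

A three-item set yields six candidate rules; in practice only those above a chosen confidence threshold would be kept.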
FP Growth

The Frequent Pattern (FP)-Growth algorithm provides an alternative way of finding frequent item sets by compressing the transaction records into a special graph data structure called an FP-Tree.
Transaction 1

Null
  News (1)
    Finance (1)
Transaction 1,2,3

Null
  News (3)
    Finance (3)
      Sports (1)
Transaction 1-6

Null
  News (5)
    Finance (4)
      Sports (2)
    Entertainment (1)
  Sports (1)
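
The way the tree grows as transactions are added can be sketched in Python. The six transactions below are an assumption: the text does not list them, so they were reconstructed to reproduce the node counts shown above (in particular, pairing Entertainment with News is a guess). `Node`, `build_fp_tree` and `show` are illustrative names.

```python
from collections import Counter

# Assumed transactions, reconstructed from the node counts in the tree.
transactions = [
    {"News", "Finance"},
    {"News", "Finance"},
    {"News", "Finance", "Sports"},
    {"Sports"},
    {"News", "Finance", "Sports"},
    {"News", "Entertainment"},
]


class Node:
    def __init__(self, item, parent=None):
        self.item = item
        self.count = 0
        self.parent = parent
        self.children = {}


def build_fp_tree(transactions):
    """Insert each transaction as a path, most frequent items first."""
    freq = Counter(item for t in transactions for item in t)
    root = Node(None)
    for t in transactions:
        # Order by descending frequency; ties are broken alphabetically.
        ordered = sorted(t, key=lambda item: (-freq[item], item))
        node = root
        for item in ordered:
            if item not in node.children:
                node.children[item] = Node(item, node)
            node = node.children[item]
            node.count += 1  # shared prefixes accumulate counts
    return root


def show(node, depth=0):
    """Return the tree as indented text lines."""
    label = "Null" if node.item is None else f"{node.item} ({node.count})"
    lines = ["  " * depth + label]
    for child in node.children.values():
        lines.extend(show(child, depth + 1))
    return lines


root = build_fp_tree(transactions)
print("\n".join(show(root)))
```

Transactions that share a prefix such as News, Finance reuse the same path, which is where the compression comes from.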
Trimmed Tree

Null
  News (5)
    Finance (4)
      Sports (2)
  Sports (1)
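
Trimming amounts to dropping items whose count falls below a minimum before the paths are inserted. In the sketch below, the transactions are the same assumed reconstruction of the data behind the tree, and the minimum count of 2 is an assumed threshold that the text does not state. `trim` is an illustrative name.

```python
from collections import Counter

# Assumed transactions, reconstructed from the node counts in the tree.
transactions = [
    {"News", "Finance"},
    {"News", "Finance"},
    {"News", "Finance", "Sports"},
    {"Sports"},
    {"News", "Finance", "Sports"},
    {"News", "Entertainment"},
]
MIN_COUNT = 2  # assumed threshold


def trim(transactions, min_count):
    """Drop infrequent items and order the rest by descending frequency."""
    freq = Counter(item for t in transactions for item in t)
    keep = {item for item, count in freq.items() if count >= min_count}
    return [
        sorted(t & keep, key=lambda item: (-freq[item], item))
        for t in transactions
    ]


trimmed = trim(transactions, MIN_COUNT)
for path in trimmed:
    print(path)
```

Entertainment occurs only once, so it disappears from every path, and the last transaction reduces to News alone.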
Conditional FP Tree

Null
  News (2)
    Finance (2)
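
A conditional FP tree is built from the prefix paths that lead to one item. The sketch below assumes that the tree shown above is the one conditioned on Sports, which the slide does not state explicitly, and it works from an assumed list of frequency-ordered transactions that is consistent with the trimmed tree. `conditional_pattern_base` and `conditional_counts` are illustrative names.

```python
from collections import Counter

# Assumed frequency-ordered transactions, consistent with the trimmed tree.
ordered_transactions = [
    ["News", "Finance"],
    ["News", "Finance"],
    ["News", "Finance", "Sports"],
    ["Sports"],
    ["News", "Finance", "Sports"],
    ["News"],
]


def conditional_pattern_base(ordered_transactions, item):
    """Count each non-empty prefix path that precedes `item`."""
    base = Counter()
    for t in ordered_transactions:
        if item in t:
            prefix = tuple(t[: t.index(item)])
            if prefix:
                base[prefix] += 1
    return base


def conditional_counts(base):
    """Item counts inside the conditional tree built from the prefix paths."""
    counts = Counter()
    for prefix, n in base.items():
        for item in prefix:
            counts[item] += n
    return counts


base = conditional_pattern_base(ordered_transactions, "Sports")
counts = conditional_counts(base)
print(dict(base))
print(dict(counts))
```

Under these assumptions the only prefix path for Sports is News, Finance, seen twice, which gives the counts News (2) and Finance (2). The transaction that contains Sports alone contributes no prefix.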
Kotu, V., & Deshpande, B. (2014). Predictive Analytics and Data Mining: Concepts and Practice with RapidMiner. Morgan Kaufmann.
