Data Mining

Homework 1 Solution

Dr. Jason T.L. Wang, Professor


Department of Computer Science
New Jersey Institute of Technology
http://web.njit.edu/~wangj/
http://cs.njit.edu/cs634/
Homework 1 Solution
(Association Rule Mining)
Given the following database, find all association rules
with minimum support = 50% and
minimum confidence = 70%.

TID    Items
1001   ink, pen, cheese, bag
1002   milk, pen, juice, cheese
1003   milk, juice
1004   juice, milk, cheese
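
The two thresholds in the problem statement can be made concrete with a small sketch. It assumes the usual definitions: the support of an itemset is the fraction of transactions that contain it, and the confidence of a rule X => Y is the support count of X and Y together divided by the support count of X. The names `DATABASE`, `support_count`, `support`, and `confidence` are illustrative and do not come from the slides.

```python
# Transactions from the table above, keyed by TID.
DATABASE = {
    1001: {"ink", "pen", "cheese", "bag"},
    1002: {"milk", "pen", "juice", "cheese"},
    1003: {"milk", "juice"},
    1004: {"juice", "milk", "cheese"},
}

MIN_SUPPORT = 0.5      # minimum support = 50%
MIN_CONFIDENCE = 0.7   # minimum confidence = 70%


def support_count(itemset, database=DATABASE):
    """Number of transactions that contain every item of itemset."""
    items = set(itemset)
    return sum(1 for basket in database.values() if items <= basket)


def support(itemset, database=DATABASE):
    """Fraction of transactions that contain itemset."""
    return support_count(itemset, database) / len(database)


def confidence(antecedent, consequent, database=DATABASE):
    """Support count of (antecedent and consequent) over that of antecedent."""
    both = set(antecedent) | set(consequent)
    return support_count(both, database) / support_count(antecedent, database)


print(support({"milk", "juice"}))        # 0.75
print(confidence({"pen"}, {"cheese"}))   # 1.0
```

With four transactions, a 50% minimum support means an itemset must appear in at least two of them.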

Data Mining (c) Jason Wang 2


Min. support 50%
Min. confidence 70%

Database D

TID    Items
1001   ink, pen, cheese, bag
1002   milk, pen, juice, cheese
1003   milk, juice
1004   juice, milk, cheese

Scan D

C1
itemset    sup.
{bag}      1
{ink}      1
{pen}      2
{milk}     3
{cheese}   3
{juice}    3

L1
itemset    sup.
{pen}      2
{milk}     3
{cheese}   3
{juice}    3

C2
itemset
{pen milk}
{pen juice}
{pen cheese}
{milk cheese}
{cheese juice}
{milk juice}

Scan D

C2
itemset          sup
{pen milk}       1
{pen juice}      1
{pen cheese}     2
{milk cheese}    2
{cheese juice}   2
{milk juice}     3

L2
itemset          sup
{pen cheese}     2
{milk cheese}    2
{cheese juice}   2
{milk juice}     3
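
The first two levels of this slide can be reproduced with a short sketch. It is one possible way to do the counting, not necessarily how the slides were produced; the helper names `scan` and `keep_frequent` are illustrative, and the threshold is written as a count (2 of 4 transactions) rather than as a percentage.

```python
from itertools import combinations

# Database D from the slide; the variable names here are illustrative.
TRANSACTIONS = [
    {"ink", "pen", "cheese", "bag"},      # TID 1001
    {"milk", "pen", "juice", "cheese"},   # TID 1002
    {"milk", "juice"},                    # TID 1003
    {"juice", "milk", "cheese"},          # TID 1004
]
MIN_COUNT = 2  # 50% of 4 transactions


def scan(candidates, transactions):
    """One pass over the database: count transactions containing each candidate."""
    return {c: sum(1 for t in transactions if c <= t) for c in candidates}


def keep_frequent(counts, min_count):
    """Drop candidates whose support count is below the threshold."""
    return {c: n for c, n in counts.items() if n >= min_count}


# Level 1: every distinct item is a candidate.
all_items = sorted(set().union(*TRANSACTIONS))
c1 = scan([frozenset([item]) for item in all_items], TRANSACTIONS)
l1 = keep_frequent(c1, MIN_COUNT)

# Level 2: pair up the frequent items, then scan again.
frequent_items = sorted(item for itemset in l1 for item in itemset)
c2 = scan([frozenset(pair) for pair in combinations(frequent_items, 2)], TRANSACTIONS)
l2 = keep_frequent(c2, MIN_COUNT)

for itemset, n in sorted(l2.items(), key=lambda kv: sorted(kv[0])):
    print(sorted(itemset), n)
```

Itemsets are stored as `frozenset` so they can serve as dictionary keys and so that `{pen milk}` and `{milk pen}` count as the same candidate.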


C3
itemset
{milk cheese juice}

Scan D

L3
itemset               sup
{milk cheese juice}   2
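
The slides do not show how C3 was derived from L2, so the following is only a sketch of one common approach: join pairs of frequent 2-itemsets and keep a 3-itemset only if all of its 2-item subsets are frequent. The function name `generate_candidates` is hypothetical, and the join is a plain union of pairs rather than the prefix-based join found in some Apriori descriptions.

```python
from itertools import combinations

TRANSACTIONS = [
    {"ink", "pen", "cheese", "bag"},      # TID 1001
    {"milk", "pen", "juice", "cheese"},   # TID 1002
    {"milk", "juice"},                    # TID 1003
    {"juice", "milk", "cheese"},          # TID 1004
]

# L2 as listed on the previous slide.
L2 = {
    frozenset({"pen", "cheese"}),
    frozenset({"milk", "cheese"}),
    frozenset({"cheese", "juice"}),
    frozenset({"milk", "juice"}),
}


def generate_candidates(prev_frequent, k):
    """Join (k-1)-itemsets into k-itemsets and prune by the subset rule."""
    candidates = set()
    for a, b in combinations(prev_frequent, 2):
        union = a | b
        if len(union) != k:
            continue
        # Keep the candidate only if every (k-1)-subset is itself frequent.
        if all(frozenset(s) in prev_frequent for s in combinations(union, k - 1)):
            candidates.add(union)
    return candidates


c3 = generate_candidates(L2, 3)
c3_counts = {c: sum(1 for t in TRANSACTIONS if c <= t) for c in c3}
l3 = {c: n for c, n in c3_counts.items() if n >= 2}

for itemset, n in l3.items():
    print(sorted(itemset), n)   # ['cheese', 'juice', 'milk'] 2
```

Under this rule, {pen milk cheese} is never counted, because {pen milk} is not in L2; the same holds for {pen cheese juice} and {pen juice}.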


Only the following association rules have confidence >= 70%:


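{pen} => {cheese}            (support 2/4 = 50%, confidence 2/2 = 100%)
{milk} => {juice}            (support 3/4 = 75%, confidence 3/3 = 100%)
{juice} => {milk}            (support 3/4 = 75%, confidence 3/3 = 100%)
{cheese juice} => {milk}     (support 2/4 = 50%, confidence 2/2 = 100%)
{milk cheese} => {juice}     (support 2/4 = 50%, confidence 2/2 = 100%)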
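
The rule list can be checked with a brief sketch that tries every non-empty proper subset of each frequent itemset as the left-hand side and keeps the rules that meet the confidence threshold. The function names `count` and `generate_rules` are illustrative; the frequent itemsets are copied from L2 and L3.

```python
from itertools import combinations

TRANSACTIONS = [
    {"ink", "pen", "cheese", "bag"},      # TID 1001
    {"milk", "pen", "juice", "cheese"},   # TID 1002
    {"milk", "juice"},                    # TID 1003
    {"juice", "milk", "cheese"},          # TID 1004
]

# Frequent itemsets with two or more items (L2 and L3).
FREQUENT = [
    frozenset({"pen", "cheese"}),
    frozenset({"milk", "cheese"}),
    frozenset({"cheese", "juice"}),
    frozenset({"milk", "juice"}),
    frozenset({"milk", "cheese", "juice"}),
]
MIN_CONFIDENCE = 0.7


def count(itemset):
    """Support count of itemset in TRANSACTIONS."""
    return sum(1 for t in TRANSACTIONS if itemset <= t)


def generate_rules(frequent_itemsets, min_conf):
    """Return (lhs, rhs, confidence) for every rule meeting min_conf."""
    rules = []
    for itemset in frequent_itemsets:
        for size in range(1, len(itemset)):
            for lhs_items in combinations(sorted(itemset), size):
                lhs = frozenset(lhs_items)
                conf = count(itemset) / count(lhs)
                if conf >= min_conf:
                    rules.append((lhs, itemset - lhs, conf))
    return rules


for lhs, rhs, conf in generate_rules(FREQUENT, MIN_CONFIDENCE):
    print(sorted(lhs), "=>", sorted(rhs), f"{conf:.0%}")
```

Every other candidate rule, such as {cheese} => {pen} or {milk juice} => {cheese}, has confidence 2/3, which is about 67% and falls below the 70% threshold.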
