
Homework 3

Association Rule Mining

Data Mining 2/2553. CE, KMITL

1. Trace the results of using the Apriori algorithm on the grocery shop with support threshold
33.34% and confidence threshold 60%. Show the candidate and frequent itemsets for each
database scan. Enumerate all the final frequent itemsets. Also indicate the association rules that
are generated and highlight the strong ones; sort them by confidence.

Transaction ID   Items
T1               Hot Dogs, Buns, Ketchup
T2               Hot Dogs, Buns
T3               Hot Dogs, Coke, Chips
T4               Chips, Coke
T5               Chips, Ketchup
T6               Hot Dogs, Coke, Chips

2. Trace the results of using the Apriori algorithm on the computer shop with support threshold
70% and confidence threshold 80%. Show the candidate and frequent itemsets for each
database scan. Enumerate all the final frequent itemsets. Also indicate the association rules that
are generated and highlight the strong ones; sort them by confidence.

Transaction ID   Items
T1               Tripod, Lens, Bag
T2               Camera, Lens, Bag
T3               Camera, Tripod, Lens, Memory card
T4               Camera, Tripod, Lens, Bag
T5               Lens, Memory card, Bag

3. Describe the importance of the support and confidence thresholds in finding association rules.
What would be the most appropriate values for them?
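
For checking a hand trace of questions 1 and 2, the level-wise procedure can be sketched in a few lines of Python. This is only a minimal sketch, not WEKA's implementation: the function names apriori and association_rules are made up for illustration, the thresholds are compared against count / N, and the transactions are typed in from the grocery table of question 1.

```python
from itertools import combinations

def apriori(transactions, min_support):
    """Return {itemset: count} for all itemsets with support >= min_support."""
    rows = [frozenset(t) for t in transactions]
    n = len(rows)
    # Scan 1 candidates: every distinct single item.
    candidates = {frozenset([item]) for row in rows for item in row}
    frequent, k = {}, 1
    while candidates:
        # One database scan: count the rows that contain each candidate.
        counts = {c: sum(1 for row in rows if c <= row) for c in candidates}
        level = {c: cnt for c, cnt in counts.items() if cnt / n >= min_support}
        print(f"scan {k}: {len(candidates)} candidates, {len(level)} frequent")
        frequent.update(level)
        k += 1
        # Join frequent (k-1)-itemsets into k-itemsets, then prune any
        # candidate that has an infrequent (k-1)-subset.
        joined = {a | b for a in level for b in level if len(a | b) == k}
        candidates = {c for c in joined
                      if all(frozenset(s) in level for s in combinations(c, k - 1))}
    return frequent

def association_rules(frequent, min_confidence):
    """Return (lhs, rhs, confidence, is_strong) tuples, highest confidence first."""
    found = []
    for itemset, count in frequent.items():
        for r in range(1, len(itemset)):
            for lhs in combinations(sorted(itemset), r):
                lhs = frozenset(lhs)
                # Every subset of a frequent itemset is frequent, so the lookup is safe.
                conf = count / frequent[lhs]
                found.append((sorted(lhs), sorted(itemset - lhs), conf,
                              conf >= min_confidence))
    return sorted(found, key=lambda rule: -rule[2])

# Grocery shop transactions from question 1.
grocery = [
    ["Hot Dogs", "Buns", "Ketchup"],
    ["Hot Dogs", "Buns"],
    ["Hot Dogs", "Coke", "Chips"],
    ["Chips", "Coke"],
    ["Chips", "Ketchup"],
    ["Hot Dogs", "Coke", "Chips"],
]
frequent = apriori(grocery, min_support=0.3334)
for lhs, rhs, conf, strong in association_rules(frequent, min_confidence=0.60):
    print(lhs, "==>", rhs, f"conf={conf:.2f}", "(strong)" if strong else "")
```

Swapping in the computer shop transactions with min_support=0.70 and min_confidence=0.80 gives the corresponding check for question 2. The script prints only counts per scan, so the candidate and frequent itemsets themselves still have to be written out by hand.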

Association Rule Mining with WEKA
Apriori works with categorical values only. Therefore, if a dataset contains numeric attributes,
they need to be converted into nominal ones before applying the Apriori algorithm. Hence, data
preprocessing must be performed. Repeat Homework 2 (Data Preprocessing) if you don't know how to
deal with numeric-to-nominal conversion.
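
As a reminder of what numeric-to-nominal conversion does, here is a minimal sketch of equal-width binning in plain Python. It is one common discretization scheme, not necessarily the one used in Homework 2; the function name equal_width_bins, the bin labels, and the ages list are all hypothetical and chosen only for illustration.

```python
def equal_width_bins(values, n_bins):
    """Map each numeric value to a nominal label: 'bin1', 'bin2', ..."""
    lo, hi = min(values), max(values)
    # Fall back to width 1.0 when all values are equal, to avoid dividing by zero.
    width = (hi - lo) / n_bins or 1.0
    labels = []
    for v in values:
        # Clamp so that the maximum value lands in the last bin.
        idx = min(int((v - lo) / width), n_bins - 1)
        labels.append(f"bin{idx + 1}")
    return labels

ages = [18, 25, 33, 47, 52, 67]  # hypothetical numeric attribute values
labels = equal_width_bins(ages, 3)
print(list(zip(ages, labels)))
```

After such a step every value of the attribute is one of a small set of labels, which is the form Apriori needs.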
weather.nominal.arff
1. Load weather.nominal.arff into a text editor and analyze the attribute types and values.
2. Is this dataset appropriate for association rule mining? If not, modify it. You may use
WEKA's preprocessing capability.
3. Apply the Apriori algorithm to the dataset.
   a. Go to the Associate tab.
   b. Choose Apriori as the Associator.
   c. Accept all default values. You may click the More button to see the synopsis for the
      different parameters.
   d. Click the Start button to run.
4. Study the output in the right panel. It should look similar to the following:
Apriori
=======
Minimum support: 0.15
Minimum metric : 0.9
Number of cycles performed: 17
Generated sets of large itemsets:
Size of set of large itemsets L(1): 12
Size of set of large itemsets L(2): 47
Size of set of large itemsets L(3): 39
Size of set of large itemsets L(4): 6
Best rules found:
 1. outlook=overcast 4 ==> play=yes 4    conf:(1)
 2. temperature=cool 4 ==> humidity=normal 4    conf:(1)
 3. humidity=normal windy=FALSE 4 ==> play=yes 4    conf:(1)
 4. outlook=sunny play=no 3 ==> humidity=high 3    conf:(1)
 5. outlook=sunny humidity=high 3 ==> play=no 3    conf:(1)
 6. outlook=rainy play=yes 3 ==> windy=FALSE 3    conf:(1)
 7. outlook=rainy windy=FALSE 3 ==> play=yes 3    conf:(1)
 8. temperature=cool play=yes 3 ==> humidity=normal 3    conf:(1)
 9. outlook=sunny temperature=hot 2 ==> humidity=high 2    conf:(1)
10. temperature=hot play=no 2 ==> outlook=sunny 2    conf:(1)

5. Can you explain what the output says?
6. Try varying the parameter values, for example minimum support, minimum confidence, and
number of rules.
7. What do you find?

WEKA's Apriori (ref: web.mac.com)
The default values for Number of rules, the decrease for Minimum support (delta factor), and
minimum Confidence are 10, 0.05, and 0.9. Rule Support is the proportion of examples covered by
both the LHS and the RHS, while Confidence is the proportion of examples covered by the LHS that
are also covered by the RHS. So if a rule's RHS and LHS together cover 50% of the cases, the rule
has 0.5 support; if the LHS of a rule covers 200 cases and of these the RHS covers 50 cases, the
confidence is 0.25. With default settings Apriori tries to generate 10 rules by starting with a
minimum support of 100% and iteratively decreasing support by the delta factor until the minimum
non-zero support is reached or the required number of rules with at least the minimum confidence
has been generated. If we examine WEKA's output, a Minimum support of 0.15 indicates the minimum
support reached in order to generate the 10 rules with the specified minimum metric, here a
confidence of 0.9. The itemset sizes generated are displayed; e.g. there are 6 four-itemsets
having the required minimum support. By default rules are sorted by confidence and any ties are
broken based on support. The number preceding ==> indicates the number of cases covered by the
LHS, and the number following the rule is the number of those cases that are also covered by the
RHS. The value in parentheses is the rule's confidence.
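
To make the reading of a rule line concrete, the following sketch parses one line of the "Best rules found" list and recomputes the confidence from the two counts. The regular expression and the function name parse_rule are assumptions written for the output format shown in this handout; other WEKA versions may print extra fields, which this pattern simply ignores.

```python
import re

# Optional rule number, LHS items, LHS count, ==>, RHS items, rule count, conf:(x)
RULE = re.compile(
    r"^\s*(?:\d+\.\s*)?(?P<lhs>.+?)\s+(?P<lhs_n>\d+)\s+==>\s+"
    r"(?P<rhs>.+?)\s+(?P<rule_n>\d+)\s+<?conf:\((?P<conf>[\d.]+)\)"
)

def parse_rule(line):
    """Split a rule line into its parts and recompute the confidence."""
    m = RULE.match(line)
    if m is None:
        raise ValueError(f"not a rule line: {line!r}")
    lhs_count = int(m["lhs_n"])    # cases covered by the LHS
    rule_count = int(m["rule_n"])  # of those, cases also covered by the RHS
    return {
        "lhs": m["lhs"].split(),
        "rhs": m["rhs"].split(),
        "lhs_count": lhs_count,
        "rule_count": rule_count,
        "confidence": rule_count / lhs_count,
        "reported_confidence": float(m["conf"]),
    }

rule = parse_rule(" 5. outlook=sunny humidity=high 3 ==> play=no 3    conf:(1)")
print(rule["lhs"], "==>", rule["rhs"], "confidence", rule["confidence"])
```

The same ratio applied to the numbers in the text (200 cases for the LHS, 50 of them also covered by the RHS) gives 50 / 200 = 0.25.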
bank.arff
1. Load bank.arff into a text editor and analyze the attribute types and values.
2. Is this dataset appropriate for association rule mining? If not, modify it. You may use
WEKA's preprocessing capability.
3. Apply the Apriori algorithm to the dataset.
4. Study the output in the right panel.
5. Check the output for several different sets of parameters.
6. Is it what you expected?
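
Steps 1 and 2 can also be checked with a short script instead of reading the header by eye. The sketch below lists the attributes declared in an ARFF header and flags the ones that are not nominal. The function name arff_attributes is made up, the sample header is hypothetical (it is not the content of bank.arff), and quoted attribute names containing spaces are not handled.

```python
def arff_attributes(text):
    """Return (name, kind) pairs from the @attribute lines of an ARFF header."""
    attrs = []
    for line in text.splitlines():
        line = line.strip()
        if line.lower().startswith("@data"):
            break  # the header ends where the data section starts
        if line.lower().startswith("@attribute"):
            _, name, spec = line.split(None, 2)
            # A nominal attribute lists its values in braces; anything else
            # (numeric, real, integer, string, date) is reported as written.
            kind = "nominal" if spec.startswith("{") else spec.lower()
            attrs.append((name, kind))
    return attrs

# Hypothetical header for illustration only.
sample = """@relation example
@attribute age numeric
@attribute region {north, south}
@attribute income real
@data
"""
needs_conversion = [name for name, kind in arff_attributes(sample) if kind != "nominal"]
print("attributes to convert before running Apriori:", needs_conversion)
```

To use it on a real file, read the file's text with open(path).read() and pass it to arff_attributes.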

marketbasket.arff
1. Perform similar steps on marketbasket.arff.

You don't have to turn in anything. However, be prepared to discuss your results and findings in
class individually. I will randomly call on you to give an explanation.