Assignment 1
Ans- A major disadvantage of association rules is that they do not take into account the
sequential information available in the data. Association rules analyze the degree of
association among the items recorded for a customer (purchase, demographic or company data),
i.e., among the columns of the data sample above; they do not relate rows to one another,
i.e., they do not associate one customer with another. Because of this limitation, association
rules are not an appropriate model for identifying relations between potential customers.
To address this shortcoming we use cluster analysis, which sorts objects into groups so that
the degree of association between two objects is maximal if they belong to the same group and
minimal otherwise.
Hence, after considering these factors, we believe cluster analysis is the preferable method
for determining the degree of association among customers.
Q.2 Consider the data in the file coursetopics.xls. These data are for purchases of online
statistics courses at statistics.com. Each row represents the courses attended by a single
Assignment-1 Group 13 Page | 2
customer. The firm wishes to assess alternative sequencings and combinations of courses.
Use Association Rules to analyse these data, and interpret several of the resulting rules.
Ans-
Here is the R code for the above given data set coursetopics.xls
Rule 1
If the courses Intro, Regression and Forecast are taken, then the student also takes Datamining.
Confidence is 71 percent and lift is 4. Since the lift is greater than 1, these courses are taken
together more often than would be expected if they were chosen independently.
Rule 2
If Intro, Survey and DOE are taken, then the student is likely to take Cat.Data.
Confidence is 80 percent and lift is 3. Since the lift is greater than 1, these courses are taken
together more often than would be expected if they were chosen independently.
Rule 3
If Intro, Datamining and Cat.Data are taken, then the student also takes Regression.
Confidence is 75 percent and lift is 3. Since the lift is greater than 1, these courses are taken
together more often than would be expected if they were chosen independently.
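The R code used for these rules is not reproduced in the text. As an illustration only, the
following Python sketch (Python is our substitution, not the R code referred to above) shows how
rules with their confidence and lift could be enumerated by brute force. The baskets, the
thresholds and the function name mine_rules are hypothetical and are not taken from
coursetopics.xls.

```python
from itertools import combinations

# Hypothetical toy baskets; NOT the contents of coursetopics.xls.
baskets = [
    {"Intro", "Regression", "Forecast", "DataMining"},
    {"Intro", "Regression", "Forecast", "DataMining"},
    {"Intro", "Regression", "Forecast"},
    {"Intro", "Survey"},
    {"Survey", "DOE"},
    {"DataMining"},
    {"Intro"},
    {"Regression"},
]


def mine_rules(baskets, min_support, min_confidence):
    """Enumerate rules 'antecedent => single item' by brute force.

    Returns tuples (antecedent, consequent, confidence, lift).
    Suitable only for a handful of items; real tools prune the search.
    """
    n = len(baskets)
    items = sorted(set().union(*baskets))

    def count(itemset):
        # Number of baskets that contain every item of the itemset.
        return sum(1 for b in baskets if itemset <= b)

    rules = []
    for size in range(2, len(items) + 1):
        for combo in combinations(items, size):
            full = frozenset(combo)
            n_full = count(full)
            if n_full == 0 or n_full / n < min_support:
                continue
            for item in combo:
                antecedent = full - {item}
                confidence = n_full / count(antecedent)
                if confidence >= min_confidence:
                    # Lift compares confidence with the item's overall share.
                    lift = confidence / (count(frozenset({item})) / n)
                    rules.append((antecedent, item, confidence, lift))
    return rules


rules = mine_rules(baskets, min_support=0.25, min_confidence=0.6)
for antecedent, consequent, confidence, lift in rules:
    print(sorted(antecedent), "=>", consequent, round(confidence, 2), round(lift, 2))
```

On this toy data the rule {Intro, Regression, Forecast} => {DataMining} appears with a lift above 1,
which is read in the same way as Rule 1 above.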
Q3. The data shown in Figure 11.7 are a subset of a dataset on cosmetic purchases given
in binary matrix form. The complete dataset (in the file Cosmetics.xls) contains data on
the purchases of different cosmetic items at a large chain drugstore. The store wants to
analyze associations among purchases of these items for purposes of point-of-sale
display, guidance to sales personnel in promoting cross sales, and guidance for piloting
an eventual time-of-purchase electronic recommender system to boost cross sales.
Consider first only the subset shown in Figure 11.7 (CH10-Assoc-Exer_Cosmetics-small.eps)
Ans. In the 1st transaction, blush, nail polish, brushes, concealer and bronzer were
purchased together from the drugstore, whereas bag and eyebrow pencils were not
purchased.
In the 5th transaction, only blush, concealer and bronzer were purchased together.
Q3 b.
i) For the first row, explain the “Conf. %” output and how it is calculated.
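It means that whenever bronzer and nail polish are present together in a market basket,
there is a 60.19% chance that brushes and concealer are also present in that transaction.
It is calculated as Support(a U c) / Support(a), i.e., the number of transactions containing all
four items divided by the number of transactions containing bronzer and nail polish.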
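As a small worked check, the Python sketch below applies this ratio to the figures quoted for the
first row (a count of 62 for Support(a U c) and a confidence of 60.19%). The antecedent support is
not quoted in the text, so the value computed here is only what those two figures imply; the
function name confidence is our own.

```python
def confidence(support_both: int, support_antecedent: int) -> float:
    """Confidence of 'a => c': share of antecedent transactions that also contain c."""
    return support_both / support_antecedent


# Figures quoted in the text for the first row.
support_both = 62            # transactions containing all four items
reported_confidence = 0.6019

# Antecedent support implied by the two figures above (an inference, not a quoted value).
implied_antecedent = round(support_both / reported_confidence)

print(implied_antecedent)
print(round(100 * confidence(support_both, implied_antecedent), 2))
```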
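ii) For the first row, explain the “Support(a)”, “Support(c)” and “Support(a U c)” output and
how it is calculated.
Support (a) = N(a), the number of transactions that contain the antecedent items (bronzer and
nail polish).
Support (c) = N(c), the number of transactions that contain the consequent items (brushes and
concealer).
Support (a U c) = N(a U c), the number of transactions that contain both the antecedent and the
consequent items.
In the 1st row, support (a U c) = 62 means that there are 62 transactions in which bronzer,
nail polish, brushes and concealer were all purchased together.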
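The counting behind these three outputs can be sketched in Python as follows. The five
transactions are hypothetical and are not rows of Cosmetics.xls; the function name support_count
is our own.

```python
# Hypothetical toy transactions (each set lists the items purchased);
# NOT taken from Cosmetics.xls.
transactions = [
    {"Bronzer", "Nail.Polish", "Brushes", "Concealer"},
    {"Bronzer", "Nail.Polish"},
    {"Brushes", "Concealer"},
    {"Bronzer", "Nail.Polish", "Brushes", "Concealer"},
    {"Blush"},
]


def support_count(itemset, transactions):
    """Number of transactions that contain every item of the itemset."""
    return sum(1 for t in transactions if itemset <= t)


a = {"Bronzer", "Nail.Polish"}   # antecedent
c = {"Brushes", "Concealer"}     # consequent

support_a = support_count(a, transactions)
support_c = support_count(c, transactions)
support_ac = support_count(a | c, transactions)  # all four items together

print(support_a, support_c, support_ac)
```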
iii) For the first row, explain the “Lift Ratio” and how it is calculated.
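It is the confidence of the rule divided by the benchmark confidence, i.e., the share of all
transactions that contain brushes and concealer. A lift ratio of 3.909 shows that a transaction
containing bronzer and nail polish is 3.909 times as likely to contain brushes and concealer as
a transaction chosen at random.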
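The relation between confidence, lift ratio and the overall share of the consequent can be checked
with a short Python sketch. Only the confidence (60.19%) and the lift ratio (3.909) are quoted in
the text; the share of transactions containing brushes and concealer is backed out from them and
should be read as an implied value, not a reported one.

```python
def lift(confidence: float, consequent_share: float) -> float:
    """Lift ratio: confidence divided by the consequent's share of all transactions."""
    return confidence / consequent_share


# Figures quoted in the text for the first row.
reported_confidence = 0.6019
reported_lift = 3.909

# Share of all transactions containing the consequent, implied by the two figures.
implied_share = reported_confidence / reported_lift

print(round(implied_share, 3))
print(round(lift(reported_confidence, implied_share), 3))
```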
iv) For the first row, explain the rule that is represented there in words.
Here the rule is that if bronzer and nail polish are present in a market basket, then brushes and
concealer are also likely to be purchased in the same transaction.
R1: If brushes are purchased, then nail polish is also purchased, with a lift of 3.571.
Since the confidence is 1, we can infer that every time brushes were purchased, nail polish
was also purchased in the same transaction.
R2: If Blush, Concealer and Eye.shadow are purchased, then Mascara is also purchased, with a
lift of 2.688. We can deduce that in 96% of the transactions in which Blush, Concealer and
Eye.shadow were purchased, Mascara was also purchased.
R3: If Blush and Eye.shadow are purchased, then Mascara is also purchased, with a lift of
2.601. We can deduce that in 92.9% of the transactions in which Blush and Eye.shadow were
purchased, Mascara was also purchased.
R4: If Blush and Eye.shadow are purchased, then Mascara is also purchased, with a lift of
2.545. We can deduce that in 90.8% of the transactions in which Blush and Eye.shadow were
purchased, Mascara was also purchased.
R5: If Concealer and Eye.shadow are purchased, then Mascara is also purchased, with a lift of
2.495. We can deduce that in 89.1% of the transactions in which Concealer and Eye.shadow were
purchased, Mascara was also purchased.
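As a rough consistency check on the five rules, the Python sketch below divides each quoted
confidence by the accompanying figure (3.571, 2.688, 2.601, 2.545, 2.495), treating that figure as
a lift ratio. If that reading is right, the result is the share of all transactions containing the
consequent, and it should be the same for every rule with the same consequent. The shares
themselves are not quoted in the text; they are only what the quoted pairs imply.

```python
# (rule, consequent, confidence, lift) as quoted in the text for R1 to R5.
quoted = [
    ("R1", "Nail.Polish", 1.000, 3.571),
    ("R2", "Mascara", 0.960, 2.688),
    ("R3", "Mascara", 0.929, 2.601),
    ("R4", "Mascara", 0.908, 2.545),
    ("R5", "Mascara", 0.891, 2.495),
]

# lift = confidence / share(consequent), so share(consequent) = confidence / lift.
implied_share = {
    rule: round(confidence / lift, 3) for rule, _, confidence, lift in quoted
}

for rule, consequent, _, _ in quoted:
    print(rule, consequent, implied_share[rule])
```

The four Mascara rules give the same implied share, which supports reading the figures as lift
ratios and not as probabilities.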
vii. Reviewing the first couple of dozen rules, comment on their redundancy, and how
you would assess their utility.
Ans:
R5: {Concealer, Eye.shadow} => {Mascara}
R7: {Concealer, Eye.shadow, Eyeliner} => {Mascara}
R2: {Blush, Concealer, Eye.shadow} => {Mascara}
R3: {Blush, Eye.shadow} => {Mascara}
Among all the association rules that have been found, a number are redundant. Here, we have
taken two such examples.
Rules 5 and 7 overlap, as do Rules 3 and 2: within each pair the consequent is the same and the
antecedent of the first rule (Rule 5, Rule 3) is a subset of the antecedent of the second
(Rule 7, Rule 2). One rule of each pair therefore largely repeats the information carried by the
other.
In Rule 2 and Rule 3, we can see that if concealer is purchased along with blush and eyeshadow
in a particular transaction, then the lift for mascara rises from 2.601 to 2.688, an increase
of 0.087.
In Rule 5 and Rule 7, we can see that if eyeliner is purchased along with concealer and
eyeshadow in a particular transaction, then the lift for mascara decreases by 0.039.
Therefore, even though some of the rules are redundant, they can be used to compare how the
purchase behavior of a customer changes when a particular item is added to the antecedent or
consequent of a rule, thereby helping the retailer to decide on the combinations of items that
provide maximum sales and profits.
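One simple way to flag candidate redundant rules is to look for pairs that share a consequent and
whose antecedents are nested. The Python sketch below does this for the four rules listed above;
the function name nested_pairs is our own, and a nested pair is only a candidate for redundancy,
to be judged further by comparing confidence and lift.

```python
from itertools import combinations

# Antecedent and consequent of the four rules listed in the text.
rules = {
    "R2": (frozenset({"Blush", "Concealer", "Eye.shadow"}), frozenset({"Mascara"})),
    "R3": (frozenset({"Blush", "Eye.shadow"}), frozenset({"Mascara"})),
    "R5": (frozenset({"Concealer", "Eye.shadow"}), frozenset({"Mascara"})),
    "R7": (frozenset({"Concealer", "Eye.shadow", "Eyeliner"}), frozenset({"Mascara"})),
}


def nested_pairs(rules):
    """Return (general, specific) pairs: same consequent, and the antecedent
    of the general rule is a proper subset of that of the specific rule."""
    pairs = []
    for (name1, (ante1, cons1)), (name2, (ante2, cons2)) in combinations(rules.items(), 2):
        if cons1 != cons2:
            continue
        if ante1 < ante2:
            pairs.append((name1, name2))
        elif ante2 < ante1:
            pairs.append((name2, name1))
    return pairs


pairs = nested_pairs(rules)
print(pairs)
```

Besides the two pairs discussed above, this check also pairs R5 with R2, since
{Concealer, Eye.shadow} is contained in {Blush, Concealer, Eye.shadow}.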
Observation 1:
If Blush and Eyeshadow are bought together, people are more likely to buy Mascara than
Concealer in the same transaction. But if they do buy Concealer along with Blush and
Eyeshadow, they are even more likely to buy Mascara as well. Concealer is pricier than
Mascara. Therefore, if the store is aiming for sales volume, Mascara should be placed closer
to Blush and Eyeshadow, since the data suggest that Mascara is more likely to be sold in the
same transaction. But if the store is aiming for a higher profit margin, it should push the sale
of Concealer. In this case, sales personnel can promote Concealer to customers who are buying
Blush and Eyeshadow, because customers who are persuaded to buy Concealer are also seen to buy
Mascara most of the time. This would help the store achieve high sales of both Concealer and
Mascara as well as more profit.
Observation 2:
{Concealer, Mascara} => {Eye.shadow} with lift 2.303
{Concealer, Eye.shadow} => {Mascara} with lift 2.495
{Mascara, Eye.shadow, Eyeliner} => {Concealer} with lift 1.708
In this case, we find that if concealer and mascara are purchased together, then eyeshadow is
also bought most of the time. Again, if concealer and eyeshadow are purchased together, then
the likelihood of mascara being bought also increases. But if only mascara and eyeshadow are
bought together, the rule with concealer as the consequent does not satisfy the required
support and lift conditions. If eyeliner is bought along with mascara and eyeshadow, however,
the likelihood of concealer being purchased also increases. Therefore,
concealer, mascara, eyeshadow and eyeliner should be placed near one another on the shelves.
Combining Observations 1 and 2, we find that eyeshadow, concealer and mascara are common to
both, while eyeliner and blush each appear in only one.
{Blush, Concealer} => {Eyeliner} with lift 1.422
Therefore, we can place all 5 items in the same aisle so that they can be treated as a bundle
and are easily available to customers. The recommender system can likewise be designed so that
these 5 items are recommended according to the purchase behavior of the customer. For example,
if Concealer and Mascara are added to the customer's cart, then it should recommend Eyeshadow
to the customer.
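A minimal sketch of such a recommender, in Python, could look up the rules whose antecedent is
already in the cart and suggest their consequents, strongest lift first. It uses only the four
rules and lift values quoted above; the function name recommend and the ranking by lift are our
own assumptions, not a design stated in the exercise.

```python
# (antecedent, consequent, lift) for the rules quoted in Observation 2 and above.
rules = [
    ({"Concealer", "Mascara"}, "Eye.shadow", 2.303),
    ({"Concealer", "Eye.shadow"}, "Mascara", 2.495),
    ({"Mascara", "Eye.shadow", "Eyeliner"}, "Concealer", 1.708),
    ({"Blush", "Concealer"}, "Eyeliner", 1.422),
]


def recommend(cart, rules):
    """Suggest items whose rule antecedent is fully in the cart,
    skipping items already in the cart, ordered by decreasing lift."""
    matches = [
        (lift, consequent)
        for antecedent, consequent, lift in rules
        if antecedent <= cart and consequent not in cart
    ]
    return [consequent for lift, consequent in sorted(matches, reverse=True)]


print(recommend({"Concealer", "Mascara"}, rules))
print(recommend({"Blush", "Concealer", "Eye.shadow"}, rules))
```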
Observation 3:
Here, both products can be treated as a bundle and should be placed together, since if lip
gloss is purchased then the probability of foundation being purchased is high, and vice
versa. In a recommender system, if a customer is seen to be buying lip gloss then foundation
should be recommended to the customer, and vice versa.