
Apriori Algorithm Analysis

The Apriori Algorithm is used for mining frequent itemsets and discovering association
rules in transactional databases. This document outlines the step-by-step analysis of a given
dataset using the Apriori principle.

Step 1: List of Transactions


Transaction ID   Items
1                Bread, Beer, Diaper, Eggs
2                Beer, Bread, Diaper, Eggs
3                Beer, Coke, Diaper, Milk
4                Beer, Bread, Diaper, Milk
5                Bread, Coke, Diaper, Milk

Minimum support is the threshold that determines whether an itemset counts as
'frequent' in the dataset. It is chosen as a percentage (e.g., 60%), meaning that a
frequent itemset must appear in at least that percentage of all transactions. The
support of an itemset is the number of transactions containing the itemset divided
by the total number of transactions. If the support is greater than or equal to the
minimum support threshold, the itemset is deemed frequent. With the five
transactions listed above and a 60% threshold, an itemset must appear in at least
3 transactions.
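
The support calculation described above can be sketched in Python as follows. This is a minimal illustration, not a reference implementation: the names TRANSACTIONS, MIN_SUPPORT, support, and is_frequent are chosen here for the example, and the 0.6 threshold is the 60% example value, assumed to be the one used throughout.

```python
# Transactions from Step 1, each stored as a set of items.
TRANSACTIONS = [
    {"Bread", "Beer", "Diaper", "Eggs"},
    {"Beer", "Bread", "Diaper", "Eggs"},
    {"Beer", "Coke", "Diaper", "Milk"},
    {"Beer", "Bread", "Diaper", "Milk"},
    {"Bread", "Coke", "Diaper", "Milk"},
]

MIN_SUPPORT = 0.6  # assumption: the 60% example threshold


def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)


def is_frequent(itemset, transactions, min_support=MIN_SUPPORT):
    """True when the itemset's support is at or above the threshold."""
    return support(itemset, transactions) >= min_support


print(support({"Milk"}, TRANSACTIONS))      # 3 of 5 transactions
print(is_frequent({"Milk"}, TRANSACTIONS))  # exactly at the threshold
print(is_frequent({"Coke"}, TRANSACTIONS))  # 2 of 5 transactions
```

Milk sits exactly at the threshold and still counts as frequent because the comparison is "greater than or equal to". On larger datasets, comparing integer transaction counts instead of fractions avoids floating-point edge cases.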

Step 2: Calculate Item Support


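The support of an item is the proportion of transactions in which the item appears. An
item's support must be greater than or equal to the minimum threshold for the item to be
considered frequent. At a 60% threshold, Diaper, Bread, Beer, and Milk are frequent, while
Eggs and Coke (0.40 each) are not.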

Item     Support
Diaper   1.00
Bread    0.80
Beer     0.80
Milk     0.60
Eggs     0.40
Coke     0.40
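
The item supports in this table can be reproduced with a short Python sketch. The variable names are illustrative, and the 0.6 threshold is the assumed 60% example value.

```python
from collections import Counter

# Transactions from Step 1.
TRANSACTIONS = [
    {"Bread", "Beer", "Diaper", "Eggs"},
    {"Beer", "Bread", "Diaper", "Eggs"},
    {"Beer", "Coke", "Diaper", "Milk"},
    {"Beer", "Bread", "Diaper", "Milk"},
    {"Bread", "Coke", "Diaper", "Milk"},
]
MIN_SUPPORT = 0.6  # assumption: the 60% example threshold

# Count how many transactions contain each item
# (each transaction is a set, so an item is counted at most once per transaction).
counts = Counter(item for t in TRANSACTIONS for item in t)
n = len(TRANSACTIONS)
item_support = {item: c / n for item, c in counts.items()}

# Keep only the items at or above the threshold.
frequent_items = sorted(i for i, s in item_support.items() if s >= MIN_SUPPORT)

# Print items from highest to lowest support.
for item, s in sorted(item_support.items(), key=lambda kv: (-kv[1], kv[0])):
    print(f"{item:<8}{s:.2f}")
print("Frequent items:", frequent_items)
```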

Step 3: Generate Frequent Itemsets of Size 2


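Candidate itemsets of size 2 are formed by pairing the individual frequent items from
Step 2. Each pair's support is then computed, and only the pairs that meet the minimum
support threshold are kept. The pairs (Bread, Milk) and (Beer, Milk) each have a support
of 0.40 and are dropped.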

Itemset         Support
Bread, Diaper   0.80
Beer, Diaper    0.80
Bread, Beer     0.60
Diaper, Milk    0.60
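
The pair generation and filtering in this step might look like the following Python sketch. FREQUENT_ITEMS is taken from the Step 2 results, the helper name support is illustrative, and the 0.6 threshold is the assumed 60% example value.

```python
from itertools import combinations

# Transactions from Step 1.
TRANSACTIONS = [
    {"Bread", "Beer", "Diaper", "Eggs"},
    {"Beer", "Bread", "Diaper", "Eggs"},
    {"Beer", "Coke", "Diaper", "Milk"},
    {"Beer", "Bread", "Diaper", "Milk"},
    {"Bread", "Coke", "Diaper", "Milk"},
]
MIN_SUPPORT = 0.6  # assumption: the 60% example threshold
FREQUENT_ITEMS = ["Beer", "Bread", "Diaper", "Milk"]  # frequent items from Step 2


def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    return sum(1 for t in transactions if itemset <= t) / len(transactions)


# Every unordered pair of frequent items is a size-2 candidate.
candidates = [frozenset(pair) for pair in combinations(FREQUENT_ITEMS, 2)]

# Compute each candidate's support and keep those meeting the threshold.
pair_support = {c: support(c, TRANSACTIONS) for c in candidates}
frequent_pairs = {c: s for c, s in pair_support.items() if s >= MIN_SUPPORT}

for pair, s in sorted(frequent_pairs.items(), key=lambda kv: -kv[1]):
    print(", ".join(sorted(pair)), f"{s:.2f}")
```

Pairing only the frequent items is safe because any pair containing Eggs or Coke cannot have higher support than those items do on their own.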

Step 4: Generate Frequent Itemsets of Size 3


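Using the frequent itemsets of size 2, we generate candidate itemsets of size 3. By the
Apriori principle, every subset of a frequent itemset must itself be frequent, so a
candidate that contains an infrequent pair can be discarded without counting. For example,
(Bread, Diaper, Milk) is discarded because (Bread, Milk) is not frequent. Support is then
calculated for the remaining candidates, and itemsets with support less than the minimum
threshold are pruned.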

Itemset               Support
Bread, Beer, Diaper   0.60
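
A minimal Python sketch of this candidate generation and pruning is shown below. The function name candidates_from is hypothetical. It joins frequent pairs by taking unions of the right size, which is a simplification of the classic prefix-based join; after the subset check it yields the same candidates on this dataset.

```python
from itertools import combinations

# Transactions from Step 1.
TRANSACTIONS = [
    {"Bread", "Beer", "Diaper", "Eggs"},
    {"Beer", "Bread", "Diaper", "Eggs"},
    {"Beer", "Coke", "Diaper", "Milk"},
    {"Beer", "Bread", "Diaper", "Milk"},
    {"Bread", "Coke", "Diaper", "Milk"},
]
MIN_SUPPORT = 0.6  # assumption: the 60% example threshold

# Frequent itemsets of size 2 from Step 3.
FREQUENT_PAIRS = {
    frozenset({"Bread", "Diaper"}),
    frozenset({"Beer", "Diaper"}),
    frozenset({"Bread", "Beer"}),
    frozenset({"Diaper", "Milk"}),
}


def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    return sum(1 for t in transactions if itemset <= t) / len(transactions)


def candidates_from(prev_frequent, k):
    """Build size-k candidates from frequent (k-1)-itemsets.

    Join step: union two frequent itemsets when the result has k items.
    Prune step: keep a candidate only if all its (k-1)-subsets are frequent.
    """
    joined = {a | b for a in prev_frequent for b in prev_frequent
              if len(a | b) == k}
    return {c for c in joined
            if all(frozenset(s) in prev_frequent
                   for s in combinations(c, k - 1))}


candidates = candidates_from(FREQUENT_PAIRS, 3)
frequent_triples = {c: support(c, TRANSACTIONS) for c in candidates
                    if support(c, TRANSACTIONS) >= MIN_SUPPORT}

for triple, s in frequent_triples.items():
    print(", ".join(sorted(triple)), f"{s:.2f}")
```

Only one candidate survives the prune step here, so only one size-3 itemset needs its support counted against the transactions.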

Conclusion
The Apriori Algorithm has been applied to the dataset to find frequent itemsets. The process
began by identifying the individual items that meet the minimum support threshold, then
built larger itemsets from them, keeping only those that also meet the threshold. The
largest frequent itemset found in this dataset is Bread, Beer, Diaper, which has size 3 and
a support of 0.60.
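
The whole level-wise process summarized here can be sketched as a single Python function. This is an illustrative sketch under the assumed 60% threshold, not an optimized implementation, and the function name apriori and its parameters are chosen for this example.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return a dict mapping each frequent itemset (frozenset) to its support."""
    n = len(transactions)

    def support(itemset):
        return sum(1 for t in transactions if itemset <= t) / n

    def keep_frequent(candidates):
        scored = {c: support(c) for c in candidates}
        return {c: s for c, s in scored.items() if s >= min_support}

    # Level 1: single items.
    items = {item for t in transactions for item in t}
    current = keep_frequent({frozenset([i]) for i in items})

    result = {}
    k = 1
    while current:
        result.update(current)
        k += 1
        prev = set(current)
        # Join frequent (k-1)-itemsets, then prune by the Apriori principle.
        joined = {a | b for a in prev for b in prev if len(a | b) == k}
        candidates = {c for c in joined
                      if all(frozenset(s) in prev
                             for s in combinations(c, k - 1))}
        current = keep_frequent(candidates)
    return result


transactions = [
    {"Bread", "Beer", "Diaper", "Eggs"},
    {"Beer", "Bread", "Diaper", "Eggs"},
    {"Beer", "Coke", "Diaper", "Milk"},
    {"Beer", "Bread", "Diaper", "Milk"},
    {"Bread", "Coke", "Diaper", "Milk"},
]
frequent = apriori(transactions, min_support=0.6)  # assumed 60% threshold

# Print itemsets grouped by size, then alphabetically.
for itemset, s in sorted(frequent.items(),
                         key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(", ".join(sorted(itemset)), f"{s:.2f}")
```

The loop stops as soon as a level produces no frequent itemsets, which on this dataset happens at size 4.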
