
Data Mining (3160714) 200170107063

Experiment No - 4

Aim: Implement the Apriori algorithm of the association rule data mining technique in any
programming language.

Date:

Competency and Practical Skills: Logic building, Programming and Analyzing

Relevant CO: CO2

Objectives: To implement the basic logic of an association rule mining algorithm with support and
confidence measures.
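Before the full algorithm, the two measures themselves can be illustrated in isolation. The snippet below is a minimal sketch using a hypothetical market-basket dataset (the item names are made up for illustration): support is the fraction of transactions containing an itemset, and confidence of a rule A -> B is the support of A ∪ B divided by the support of A.

```python
# Hypothetical mini-dataset to illustrate the two measures
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]

def support(itemset):
    # Fraction of transactions that contain every item of the itemset
    return sum(1 for t in transactions if itemset <= t) / len(transactions)

def confidence(antecedent, consequent):
    # Support of the whole rule divided by the support of its antecedent
    return support(antecedent | consequent) / support(antecedent)

print(support({"bread", "milk"}))        # 2 of 4 transactions -> 0.5
print(confidence({"bread"}, {"milk"}))   # 0.5 / 0.75 -> 0.666...
```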
Equipment/Instruments: Personal Computer, open-source software for programming

Program:

Implement the Apriori algorithm of the association rule data mining technique in any
programming language.

Code:
# Define the dataset
transactions = [
    ["I1", "I2", "I5"],
    ["I2", "I4"],
    ["I2", "I3"],
    ["I1", "I2", "I4"],
    ["I1", "I3"],
    ["I2", "I3"],
    ["I1", "I3"],
    ["I1", "I2", "I3", "I5"],
    ["I1", "I2", "I3"]
]

def apriori(transactions, min_support, min_confidence):
    # Get unique items in transactions
    unique_items = sorted({item for transaction in transactions for item in transaction})

    # Count support of the 1-itemsets
    frequent_itemsets = {frozenset([item]): 0 for item in unique_items}
    for transaction in transactions:
        for item in transaction:
            frequent_itemsets[frozenset([item])] += 1

    # Remove items that don't meet the minimum support threshold
    frequent_itemsets = {itemset: count for itemset, count in frequent_itemsets.items()
                         if count >= min_support}

    # Initial candidates are the frequent 1-itemsets
    candidate_itemsets = set(frequent_itemsets.keys())

    # Create frequent itemsets of length 2 or greater
    k = 2
    while candidate_itemsets:
        # Generate candidate itemsets of length k by joining frequent (k-1)-itemsets
        candidate_itemsets = set(
            itemset1.union(itemset2)
            for itemset1 in candidate_itemsets
            for itemset2 in candidate_itemsets
            if len(itemset1.union(itemset2)) == k)

        # Calculate support for candidate itemsets and remove those that
        # don't meet the minimum support threshold
        itemset_counts = {itemset: 0 for itemset in candidate_itemsets}
        for transaction in transactions:
            for itemset in itemset_counts:
                if itemset.issubset(transaction):
                    itemset_counts[itemset] += 1
        new_frequent = {itemset: count for itemset, count in itemset_counts.items()
                        if count >= min_support}
        frequent_itemsets.update(new_frequent)

        # Only frequent itemsets seed the next level (Apriori pruning)
        candidate_itemsets = set(new_frequent.keys())

        # Increment k
        k += 1

    # Generate association rules with single-item consequents
    rules = []
    for itemset, count in frequent_itemsets.items():
        if len(itemset) > 1:
            for item in itemset:
                left_side = itemset - frozenset([item])
                support_left = frequent_itemsets[left_side]
                confidence = count / support_left
                if confidence >= min_confidence:
                    rules.append((left_side, frozenset([item]), confidence))

    return frequent_itemsets, rules

frequent_itemsets, rules = apriori(transactions, 3, 0.6)

print("Frequent Items with Support:")
for itemset, count in frequent_itemsets.items():
    print("(", ", ".join(sorted(itemset)), "):", count)
print("Rules with confidence:")
for left, right, conf in rules:
    print("(", ", ".join(sorted(left)), "->", ", ".join(sorted(right)), "):", round(conf, 3))
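As a sanity check, the Apriori result can be compared against a brute-force enumeration of every possible itemset. The sketch below is standalone (it redefines the same nine transactions), so it runs on its own and should report exactly the itemsets with support count >= 3:

```python
from itertools import combinations

# Same dataset as in the main program
transactions = [
    ["I1", "I2", "I5"], ["I2", "I4"], ["I2", "I3"], ["I1", "I2", "I4"],
    ["I1", "I3"], ["I2", "I3"], ["I1", "I3"], ["I1", "I2", "I3", "I5"],
    ["I1", "I2", "I3"]
]
min_support = 3

items = sorted({item for t in transactions for item in t})
brute_force = {}
# Enumerate every possible itemset and count the transactions containing it
for k in range(1, len(items) + 1):
    for combo in combinations(items, k):
        candidate = frozenset(combo)
        count = sum(1 for t in transactions if candidate.issubset(t))
        if count >= min_support:
            brute_force[candidate] = count

for itemset in sorted(brute_force, key=lambda s: (len(s), sorted(s))):
    print(sorted(itemset), brute_force[itemset])
```

Brute force is exponential in the number of distinct items, which is exactly the cost Apriori avoids by pruning candidates through the downward-closure property; on this small dataset it serves only as a correctness check.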

Observations:


Conclusion:
The Apriori algorithm is an effective and widely used approach for discovering frequent itemsets and
association rules in large transaction datasets. It has been used in various applications such as market
basket analysis, customer segmentation, and web usage mining.

Quiz:

(1) What do you mean by association rule mining?


(2) What are the different measures used in the Apriori algorithm?

Suggested Reference:

• J. Han, M. Kamber, “Data Mining Concepts and Techniques”, Morgan Kaufmann

References used by the students:


https://www.geeksforgeeks.org/apriori-algorithm/
Rubric wise marks obtained:

Rubrics: Knowledge (2), Problem Recognition (2), Logic Building (2), Completeness and accuracy (2), Ethics (2), Total.
Each rubric is graded Good (2) or Average (1).

Marks
