
MASENO UNIVERSITY

Private Bag, Maseno, Kenya, Tel: 057 2021013

DEPARTMENT OF COMPUTER SCIENCE


COURSE OUTLINE
Unit Code & Name: CCS415/CCT416: Data Mining
Prerequisites: CCS 303 Design and Analysis of Algorithms, CCS 315 Intelligent Systems, CCS 319 Database Administration
Cohort: Jan 2022
Lecturer: Dr. Obuhuma James
Contact: jobuhuma@maseno.ac.ke / +254710 463 258

Purpose
To provide the student with a foundation for understanding business data mining methods, business data
visualization techniques, and data warehouse technology.

Learning Outcomes
By the end of the course, the student should be able to:
1. Understand the basic concepts and techniques in business data mining.
2. Develop skills in using recent data mining software to solve practical problems in business.
3. Gain experience in independent study and research in business.
4. Apply data mining techniques to discover knowledge for decision making.

Course Description
Introduction to data mining and knowledge discovery. Role of logic and probability in data mining.
Foundations of pattern clustering: the Ugly Duckling theorem, abstraction and similarity. Clustering
paradigms. Clustering for data mining. Inductive logic programming and knowledge discovery.
Integrating inductive and deductive reasoning for data mining. Data mining using neural networks and
genetic algorithms. Fast discovery of association rules. Discovery of frequent episodes in event sequences.
Applications of data mining to pattern classification.

Delivery Methodology
Lectures, directed reading, practical demonstrations of typical computing systems.

Learning Resources
Books, Computers, Internet, Whiteboard and Markers

Course Contents
Period / Topic / Outline

Week 1: 1. Introduction to Data Mining
▪ Data Mining
▪ Data Warehousing

Week 2: 2. Preprocessing/Data Exploration
▪ Data Preprocessing: Why?
▪ Preprocessing Tasks
▪ Preprocessing Techniques

Week 3: 3. Decision Tree Classifier and Model Evaluation
▪ What is Decision Tree?
▪ Information Theory
▪ How to build Decision Tree

Week 4: 4. Support Vector Machines
▪ A brief history of SVM
▪ Large-margin linear classifier
▪ Linear separable
▪ Nonlinear separable
▪ Creating nonlinear classifiers: kernel trick
▪ A simple example
▪ Discussion on SVM

Week 5: 5. Ensemble Algorithms
▪ Ensemble Learning: General Ideas
▪ Bootstrap Sampling
▪ Bagging
▪ Boosting
▪ Ensemble classifiers/clustering
▪ Success Story of Ensemble method
▪ General Idea of Ensemble methods

Week 6: 6. Clustering/Hierarchical/Density based
▪ Hierarchical Clustering vs. Partitional Clustering
▪ Agglomerative clustering algorithms
▪ Comparison of hierarchical clustering algorithms
▪ Divisive clustering algorithms
▪ Density-based clustering

Week 7: 7. Frequent Itemset Mining/Association Rule
▪ Frequent Itemset Mining Problem
▪ Closed itemset, Maximal itemset
▪ Apriori Algorithm
▪ FP-Growth: itemset mining without candidate generation
▪ Association Rule Mining

Week 8: 8. Graph Mining
▪ What, Why Graph Mining?
▪ Methods for Mining Frequent Subgraphs
▪ Mining Variant and Constrained Substructure Patterns
▪ Applications:

Week 9: 9. Text Data Mining
▪ Text mining, natural language processing
▪ Information extraction/Retrieval
▪ Text mining applications:
▪ Clustering/classification/categorization
▪ Text categorization methods

Week 10: 10. Time Series Data Mining
▪ Time series Data
▪ Trend analysis
▪ Data Transformation
▪ Similarity search

Week 11: 11. Web Data Mining
▪ Web mining applications
▪ Background on Web Search
▪ VIPS (VIsion-based Page Segmentation)
▪ Block-based Web Search
▪ Block-based Link Analysis

Course Assessment

Continuous Assessment Tests: 30%
▪ CATs: 15%
▪ Lab work: 15%

End of Semester Examination: 70%

Total: 100%

Course Textbook
1. “Data Mining – Concepts and Techniques,” Third Edition, Jiawei Han and Micheline Kamber,
Morgan Kaufmann, ISBN 978-0123814791
2. “The Data Warehouse Toolkit: The Complete Guide to Dimensional Modeling,” Second Edition,
Ralph Kimball and Margy Ross, Wiley, ISBN 978-0-471-20024-6
3. “Data Mining – Practical Machine Learning Tools and Techniques with Java Implementations,”
Second Edition, Ian H. Witten and Eibe Frank, Morgan Kaufmann, ISBN 978-1558605527
4. “Principles of Data Mining,” D. Hand et al., MIT Press, ISBN 9780262332521

Reference Textbooks

Course Journals

Online Resources
