Purpose
To provide the student with a foundation for understanding business data mining methods, business data
visualization techniques, and data warehouse technology.
Learning Outcomes
By the end of the course, the student should be able to:
1. Understand the basic concepts and techniques in business data mining.
2. Develop skills in using current data mining software to solve practical business problems.
3. Gain experience in conducting independent study and research in business.
4. Apply data mining techniques to discover knowledge for decision making.
Course Description
Introduction to data mining and knowledge discovery. Role of logic and probability in data mining.
Foundations of pattern clustering: the Ugly Duckling theorem, abstraction and similarity. Clustering
paradigms. Clustering for data mining. Inductive logic programming and knowledge discovery.
Integrating inductive and deductive reasoning for data mining. Data mining using neural networks and
genetic algorithms. Fast discovery of association rules. Discovery of frequent episodes in event sequences.
Applications of data mining to pattern classification.
Delivery Methodology
Lectures, directed reading, practical demonstrations of typical computing systems.
Learning Resources
Books, Computers, Internet, Whiteboard and Markers
Course Contents
Period Topic Outline
Week 1 1. Introduction to Data Mining
▪ Data Mining
▪ Data Warehousing
Week 2 2. Preprocessing/Data Exploration
▪ Data Preprocessing: Why?
▪ Preprocessing Tasks
▪ Preprocessing Techniques
Week 3 3. Decision Tree Classifier and Model Evaluation
▪ What is a Decision Tree?
▪ Information Theory
▪ How to build a Decision Tree
Week 4 4. Support Vector Machines
▪ A brief history of SVM
▪ Large-margin linear classifier
▪ Linearly separable
▪ Nonlinearly separable
▪ Creating nonlinear classifiers: kernel trick
▪ A simple example
▪ Discussion on SVM
Week 5 5. Ensemble Algorithms
▪ Ensemble Learning: General Ideas
▪ Bootstrap Sampling
▪ Bagging
▪ Boosting
▪ Ensemble classifiers/clustering
▪ Success Story of Ensemble Methods
▪ General Idea of Ensemble Methods
Week 6 6. Clustering/Hierarchical/Density-based
▪ Hierarchical Clustering vs. Partitional Clustering
▪ Agglomerative clustering algorithms
▪ Comparison of hierarchical clustering algorithms
▪ Divisive clustering algorithms
▪ Density-based clustering
Week 7 7. Frequent Itemset Mining/Association Rule
▪ Frequent Itemset Mining Problem
▪ Closed itemset, Maximal itemset
▪ Apriori Algorithm
▪ FP-Growth: itemset mining without candidate generation
▪ Association Rule Mining
Week 8 8. Graph Mining
▪ What, Why Graph Mining?
▪ Methods for Mining Frequent Subgraphs
▪ Mining Variant and Constrained Substructure Patterns
▪ Applications:
Week 9 9. Text Data Mining
▪ Text mining, natural language processing
▪ Information extraction/Retrieval
▪ Text mining applications:
▪ Clustering/classification/categorization
▪ Text categorization methods
Week 10 10. Time Series Data Mining
▪ Time Series Data
▪ Trend analysis
▪ Data Transformation
▪ Similarity search
Week 11 11. Web Data Mining
▪ Web mining applications
▪ Background on Web Search
▪ VIPS (VIsion-based Page Segmentation)
▪ Block-based Web Search
▪ Block-based Link Analysis
Course Assessment
Course Textbook
1. “Data Mining – Concepts and Techniques,” Third Edition, Jiawei Han and Micheline Kamber,
Morgan Kaufmann, ISBN 978-0123814791
2. “The Data Warehouse Toolkit: The Complete Guide to Dimensional Modeling,” Second Edition,
Ralph Kimball and Margy Ross, Wiley, ISBN 978-0-471-20024-6
3. “Data Mining – Practical Machine Learning Tools and Techniques with Java Implementations,”
Second Edition, Ian H. Witten and Eibe Frank, Morgan Kaufmann, ISBN 978-1558605527
4. “Principles of Data Mining,” D. Hand et al., MIT Press, ISBN 9780262332521
Reference Textbooks
Course Journals
Online Resources