
IV SEMESTER: G 603.4: DATA MINING
Learning Objectives:

• To introduce students to the basic concepts and techniques of data mining.
• To study the methodology of engineering legacy databases for data warehousing and data mining
to derive business rules for decision support systems.
• To develop and apply critical thinking, problem-solving, and decision-making skills.
Learning Outcomes:

• Students will be able to categorize and carefully differentiate between situations for applying
different data mining techniques: frequent pattern mining, association, correlation, classification.
• Students will be able to design and implement systems for data mining.

UNIT I
Data Warehousing and Online Analytical Processing: Data Warehouse Basic
Concepts: What Is a Data Warehouse? Differences between Operational Database
Systems and Data Warehouses. Data Warehousing: A Multitier Architecture. Data
Warehouse Models: Enterprise Warehouse, Data Mart, and Virtual Warehouse.
Extraction, Transformation, and Loading. Metadata Repository. Data Warehouse
Modelling: Data Cube and OLAP: Data Cube: A Multidimensional Data Model. Stars,
Snowflakes, and Fact Constellations: Schemas for Multidimensional Data Models.
Dimensions: The Role of Concept Hierarchies. Measures: Their Categorization and
Computation. Typical OLAP Operations. Data Warehouse Architecture.
12 HOURS
UNIT II
Data Mining: Why Data Mining? What Is Data Mining? Moving toward the Information
Age, Data Mining as the Evolution of Information Technology. What Kinds of Data Can
Be Mined: Database Data, Data Warehouses, Transactional Data, Other Kinds of Data.
What Kinds of Patterns Can Be Mined: Class/Concept Description: Characterization
and Discrimination, Mining Frequent Patterns, Associations, and Correlations,
Classification and Regression for Predictive Analysis, Cluster Analysis, Outlier Analysis.
Data Mining Tasks, Data Mining vs. KDD, Issues in Data Mining, Data Mining Metrics,
Data Mining Architecture, Data Cleaning, Data Transformation, Data Reduction, Data
Mining Primitives. 12 HOURS
UNIT III
Association Rule Mining: Introduction, Large Itemsets, Basic Algorithms: Apriori
Algorithm, Sampling Algorithm, Partitioning.
Classification: Introduction: Issues in Classification. Statistical-Based Algorithms:
Regression, Bayesian Classification. Distance-Based Algorithms: Simple Approach,
K-Nearest Neighbors. Decision Tree-Based Algorithms: ID3, C4.5 and C5.0, CART. Neural
Network-Based Algorithms: Propagation, NN Supervised Learning, Radial Basis Function
Networks, Perceptrons. 12 HOURS

UNIT IV
Clustering: Clustering Methods, Outlier Analysis. Hierarchical Algorithms:
Agglomerative, Divisive. Partitional Algorithms: K-Means, Nearest Neighbor, PAM.
Web Mining: Introduction, Web Content Mining. Web Structure Mining: PageRank,
Clever. Web Usage Mining: Preprocessing, Data Structures, Pattern
Discovery, Pattern Analysis.
Applications and Other Data Mining Methods: Support Vector Machine, Rough Sets,
Supervised and Unsupervised Learning. 12 HOURS

Text Books:
1. Jiawei Han and Micheline Kamber, "Data Mining: Concepts and Techniques",
Morgan Kaufmann Publishers, USA, 2006.
2. Dunham M. H., "Data Mining: Introductory and Advanced Topics", Pearson
Education, New Delhi, 2003.

Reference Books:
1. Arun K. Pujari, "Data Mining Techniques", Oxford University Press, London, 2003.
2. Berson, "Data Warehousing, Data Mining and OLAP", Tata McGraw Hill Ltd, New
Delhi, 2004.
3. Pang-Ning Tan, Michael Steinbach, Vipin Kumar, "Introduction to Data Mining",
Pearson Education.
4. Mehmed Kantardzic, "Data Mining: Concepts, Methods and Algorithms", John
Wiley and Sons, USA, 2003.
5. Soman K. P., Diwakar Shyam, Ajay V., "Insight into Data Mining: Theory and
Practice", PHI, 2006.
