
Module Code and Title: DIS405 Data Mining and Warehousing

Programme: BE in Information Technology


Credit: 12
Module Tutor: Tsheten Dorji

General Objective:

This course introduces the core concepts of data warehousing and data mining, together with their
associated techniques, implementations, and benefits. It also introduces the techniques of
knowledge management and the different applications of data warehousing, data mining, and
knowledge management.

Learning outcomes:
On completion of the module, students will be able to:
1. Describe the key principles of data mining and knowledge management.
2. Identify data mining and data warehousing functionalities.
3. Describe and demonstrate basic data mining algorithms, methods, and tools.
4. Apply the practices & principles of data mining and knowledge management.
5. Differentiate between techniques of data abstraction and mining.
6. Apply data pre-processing techniques: data cleaning, data integration and transformation, data
reduction, discretization, and concept hierarchy generation.
7. Identify business applications of data mining and warehousing.

Learning and teaching approach:


Approach                                   Hours per Week   Total Credit Hours
Lecture                                    3                45
Tutorial                                   1                15
Independent study/self-directed learning   4                60
Total                                                       120

Assessment approach:
[Updated section: removed the assignment and one of the Term Tests, and added the Self-Reflection
of Flipped Class Session assessment from SS2023]

Sl. No.  Mode of Assessment                                  Nos.  Marks Allocated  Marks (%)
1.       Continuous Assessment (Theory)                                             50
1.1      Mid Term Test: Closed book, one-hour duration.      1     10
         Includes all units covered up to that time.
1.2      Case Study and its presentation: Four weeks         1     10
         duration for the case study, assigned in the 6th
         week; covers contents from Unit I to Unit V.
1.3      Self-Reflection of Flipped Class Session: This      1     20
         assessment will be carried out every week after
         the Flipped Class Session through VLE.
2.       Semester End Examination: Closed book,              1     50               50
         3-hour examination.
         Total                                                                      100

Pre-requisites: DIS201 Database Management Systems


Subject matter:
Unit I: Introduction

1.1 Data mining concepts and attributes


1.2 Data warehousing – definitions and characteristics, Multi-dimensional data model, Warehouse
schema.

Unit II: Data Marts

2.1 Data marts, types of data marts, loading a data mart, metadata, data model, maintenance,
nature of data
2.2 Software components; external data, reference data, performance issues, monitoring
requirements and security in a data mart.

Unit III: Online Analytical Processing

3.1 OLTP and OLAP systems, Data Modelling, OLAP tools, State of the market
3.2 Arbor Essbase Web, MicroStrategy DSS Web, Brio Technology
3.3 Star schema for multi-dimensional view, snowflake schema; OLAP tools.

Unit IV: Developing a Data Warehouse

4.1 Building a Data Warehouse, Architectural strategies & organizational issues


4.2 Design considerations, data content, distribution of data, Tools for Data Warehousing

Unit V: Data Mining

5.1 Definitions; KDD (Knowledge Discovery in Databases) versus Data Mining


5.2 DBMS versus Data Mining, Data Mining Techniques; Issues and challenges
5.3 Applications of Data Warehousing & Data mining in Government.

Unit VI: Association Rules

6.1 Apriori algorithm, Partition algorithm, Dynamic itemset counting algorithm


6.2 FP–tree growth algorithm; generalized association rule.

Unit VII: Clustering Techniques

7.1 Clustering paradigm, Partition algorithms, CLARA – Clustering Large Applications, CLARANS
– Clustering Large Applications based on RANdomized Search
7.2 Hierarchical clustering, DBSCAN – Density-Based Spatial Clustering of Applications with Noise,
BIRCH – Balanced Iterative Reducing and Clustering using Hierarchies, CURE; Categorical
clustering, STIRR, ROCK – Robust Clustering using Links, CACTUS – Clustering Categorical
Data Using Summaries.

Unit VIII: Decision Trees

8.1 Tree construction principle, Best split, Splitting indices, Splitting criteria
8.2 Decision tree construction with pre-sorting.

Unit IX: Web Mining

9.1 Web content mining, Web structure mining, Web usage mining, and Text mining.

Unit X: Temporal and Spatial Data Mining

10.1 Basic concepts of temporal data mining, the GSP algorithm

10.2 SPADE – Sequential Pattern Discovery using Equivalence classes, SPIRIT – Sequential Pattern
mining with Regular Expression Constraints, WUM – Web Utilization Miner

Reading List:
Essential reading
Prabhu, S. (2004). Data Warehousing – Concepts, Techniques, Products and Applications (2nd ed.). India:
PHI Learning Pvt. Ltd.

Pujari, A.K. (2013). Data Mining Techniques (3rd ed.). India: Universities Press.

Additional reading
Berson, A. & Smith, S. J. (1997). Data Warehousing, Data Mining and OLAP. New Delhi: McGraw
Hill.

Anahory, S. & Murray, D. (1997). Data Warehousing in the Real World (1st ed.). India: Addison Wesley
Longman Ltd.

Dunham, M. (2002). Data Mining: Introductory & Advanced Topics (1st ed.). India: Pearson Education.

Date: 27th January 2023
