
Name of the Programme: MBA

Batch: 2020-22, Term: V

Course Code and Title: MFT5SEOQ01 Data Analytics and Data Mining
Credit Hours: 3.0
Faculty: Dr. Mahesh K C
E-mail ID: maheshkc@nirmauni.ac.in
Blog: http://www.mahesh-kcdadm.blogspot.com
Phone No.: 9825736059
Office Hours: Tuesday, 3.30 pm-4.30 pm; Friday, 3.30 pm-4.30 pm

***************************************************** ***********************

I. Course Overview:

Electronic data capture has become inexpensive thanks to innovations such as the internet,
e-commerce, e-banking, and electronic bar-code readers. Today’s statistical applications
involve large data sets with many cases and many variables. Many variables are collected on
each case, but only a few of them are useful and the majority may be irrelevant. This
enormous volume of data has the potential to predict the evolution of variables of interest
or trends in the outside environment. Data mining and applied statistical methods are the
appropriate tools for extracting knowledge from such data. Technological advances and
research in computing and statistics have led to flexible and scalable procedures that can
be used to analyze such large data sets. Applications have been reported in credit rating,
database marketing, CRM, and stock market investments. The course focuses on statistical
modeling with large data and introduces basic and advanced statistical methods and data
mining tools. The course uses the statistical computing software R.

II. Course Learning Outcomes (CLO):

After the completion of the course, students shall be able to

1. Identify the basic concepts and the importance of data mining tools and
techniques.
2. Apply, analyze and implement some of the widely used tools and techniques in
data mining.
3. Develop data analysis and modelling through R.

III. Text Book:

Shmueli, G., Bruce, P.C., Yahav, I., Patel, N.R., and Lichtendahl, K.C. Jr. (2018). Data Mining
for Business Analytics, 1st edition, Wiley.
IV. Assessment Components & Schedule:

SL. No.  Assessment Component  Weightage %  Schedule            Overall Weightage %  CLO Number
1        Quizzes                                                40                   1, 2, & 3
         Quiz-1                10           After 6th Session
         Quiz-2                10           After 13th Session
         Quiz-3                10           After 19th Session
         Quiz-4                10           After 24th Session
2        Group Assignment      20           After 20th Session  20                   1, 2, & 3
3        End-Term Exam         40           As per schedule     40                   1, 2, & 3

V. Session Plan

Session 1
Topic: Overview of data mining process
Pedagogy: Lecture
Text Book: Chapter 2, Topics 2.1-2.3, 2.5-2.6, pp. 15-43
Reading: R1, Chapter 1, Topics 1.4-1.6, pp. 6-16
Case / Exercise:
CLO No: 1

Session 2
Topic: Data Pre-processing: Standardization & Outlier Detection
Pedagogy: Lecture & Discussion of Concepts using R
Text Book:
Reading: R1, Chapter 2, Topics 2.1-2.3, 2.5-2.6, pp. 15-43
Case / Exercise:
CLO No: 1, 2

Session 3
Topic: Data Pre-processing: Achieving Normality
Pedagogy: Lecture & Discussion of Concepts using R
Text Book:
Reading: R1, Chapter 2, Topics 2.1-2.3, 2.5-2.6, pp. 15-43
Case / Exercise:
CLO No: 1, 2

Session 4
Topic: Basic Data Visualization
Pedagogy: Lecture & Discussion of Concepts using R
Text Book: Chapter 3, Topics 3.1 & 3.3, pp. 55-64
Reading:
Case / Exercise:
CLO No: 1, 2

Session 5
Topic: Imputation of Missing Data
Pedagogy: Lecture & Discussion of Concepts using R
Text Book:
Reading: R1, Chapter 28, Topic 28.1-28.3, pp. 695-699
Case / Exercise:
CLO No: 1, 2, 3

Session 6
Topic: Dimension reduction methods: Principal Component Analysis
Pedagogy: Lecture & Discussion of Concepts using R
Text Book: Chapter 4, Topic 4.1-4.5, 4.8, pp. 91-97, 101-107
Reading: R1, Chapter 4, Topic 4.1-4.7, pp. 92-110
Case / Exercise:
CLO No: 1, 2, 3

Session 7
Topic: Dimension reduction methods: Principal Component Analysis
Pedagogy: Lecture & Discussion of Concepts using R
Text Book: Chapter 4, Topic 4.1-4.5, 4.8, pp. 91-97, 101-107
Reading: R1, Chapter 4, Topic 4.1-4.7, pp. 92-110
Case / Exercise:
CLO No: 1, 2, 3

Session 8
Topic: Dimension reduction methods: Factor Analysis
Pedagogy: Lecture & Discussion of Concepts using R
Text Book: Chapter 4, Topic 4.1-4.5, 4.8, pp. 91-97, 101-107
Reading: R1, Chapter 4, Topic 4.8-4.10, pp. 110-114
Case / Exercise:
CLO No: 1, 2, 3

Session 9
Topic: Simple linear regression (revisited)
Pedagogy: Lecture & Discussion of Concepts using R
Text Book:
Reading: R2, Chapter 14, Topic 14.1-14.5 & 14.7-14.8, pp. 600-653
Case / Exercise:
CLO No: 1, 2, 3

Session 10
Topic: Multiple linear regression: Model Building
Pedagogy: Lecture & Discussion of Concepts using R
Text Book: Chapter 6, 6.1-6.3, pp. 684-703
Reading: R2, Chapter 15, 15.1-15.5, 15.8, pp. 684-703, 719-720
Case / Exercise:
CLO No: 1, 2, 3

Session 11
Topic: Multiple linear regression with Categorical predictors
Pedagogy: Lecture & Discussion of Concepts using R
Text Book:
Reading: R2, Chapter 15, 15.7, pp. 709-713
Case / Exercise:
CLO No: 1, 2, 3

Session 12
Topic: Multiple linear regression: Variable Selection Methods
Pedagogy: Lecture & Discussion of Concepts using R
Text Book:
Reading: R1, Chapter 9, Topic 9.8 & 9.11, pp. 266-268, p. 279
Case / Exercise:
CLO No: 1, 2, 3

Session 13
Topic: Performance Evaluation Techniques
Pedagogy: Lecture & Discussion of Concepts using R
Text Book: Chapter 5, Topic 5.3, pp. 122-131
Reading:
Case / Exercise:
CLO No: 1, 2, 3

Session 14
Topic: Discriminant Analysis
Pedagogy: Lecture & Discussion of Concepts using R
Text Book: Chapter 12, Topic 12.1-12.5, pp. 293-302
Reading:
Case / Exercise:
CLO No: 1, 2, 3

Session 15
Topic: Classification Problem: The k-NN technique
Pedagogy: Lecture & Discussion of Concepts using R
Text Book:
Reading: Chapter 7, Topic 7.1, pp. 173-180
Case / Exercise:
CLO No: 1, 2, 3

Session 16
Topic: Classification and Regression Trees: The CART model
Pedagogy: Lecture & Discussion of Concepts using R
Text Book: Chapter 9, Topic 9.1-9.3, pp. 205-215
Reading:
Case / Exercise:
CLO No: 1, 2, 3

Session 17
Topic: Classification and Regression Trees: The C5.0 model
Pedagogy: Lecture & Discussion of Concepts using R
Text Book: Chapter 9, Topic 9.4-9.5, pp. 216-226
Reading:
Case / Exercise:
CLO No: 1, 2, 3

Session 18
Topic: Classification and Regression Trees: The Random Forest model
Pedagogy: Lecture & Discussion of Concepts using R
Text Book: Chapter 9, Topic 9.8, p. 229
Reading:
Case / Exercise:
CLO No: 1, 2, 3

Session 19
Topic: Logistic Regression Modelling: Continuous Predictor
Pedagogy: Lecture & Discussion of Concepts using R
Text Book: Chapter 10, Topic 10.1-10.3 & 10.6, pp. 237-246, pp. 261-262 (Appendix B)
Reading:
Case / Exercise:
CLO No: 1, 2, 3

Session 20
Topic: Logistic Regression Modelling: Dichotomous Predictor
Pedagogy: Lecture & Discussion of Concepts using R
Text Book: Chapter 9, Topic 9.8, p. 229
Reading:
Case / Exercise:
CLO No: 1, 2, 3

Session 21
Topic: Logistic Regression Modelling: Polychotomous Predictor
Pedagogy: Lecture & Discussion of Concepts using R
Text Book: Chapter 10, Topic 10.1-10.3, pp. 237-246
Reading:
Case / Exercise:
CLO No: 1, 2, 3

Session 22
Topic: Model evaluation techniques for classification
Pedagogy: Lecture & Discussion of Concepts using R
Text Book: Chapter 10, Topic 10.4, pp. 247-249
Reading:
Case / Exercise:
CLO No: 1, 2, 3

Session 23
Topic: Model evaluation techniques for classification
Pedagogy: Lecture & Discussion of Concepts using R
Text Book: Chapter 10, Topic 10.4, pp. 247-249
Reading:
Case / Exercise:
CLO No: 1, 2, 3

Session 23
Topic: Naïve Bayes
Pedagogy: Lecture & Discussion of Concepts using R
Text Book: Chapter 8, Topic 8.1-8.2, pp. 187-199
Reading:
Case / Exercise:
CLO No: 1, 2, 3

Session 24
Topic: Naïve Bayes
Pedagogy: Lecture & Discussion of Concepts using R
Text Book: Chapter 8, Topic 8.1-8.2, pp. 187-199
Reading:
Case / Exercise:
CLO No: 1, 2, 3

Session 25
Topic: Clustering: Hierarchical & k-means Clustering
Pedagogy: Lecture & Discussion of Concepts using R
Text Book: Chapter 15, Topic 15.1-15.5, pp. 357-377
Reading:
Case / Exercise:
CLO No: 1, 2, 3

Session 26
Topic: Measuring Cluster Goodness
Pedagogy: Lecture & Discussion of Concepts using R
Text Book:
Reading: R1, Chapter 22, Topic 22.1-22.8, pp. 582-597
Case / Exercise:
CLO No: 1, 2, 3

Session 27
Topic: Association Rules
Pedagogy: Lecture & Discussion of Concepts using R
Text Book: Chapter 14, Topic 14.1, pp. 329-340
Reading:
Case / Exercise:
CLO No: 1, 2, 3

Session 28
Topic: Association Rules
Pedagogy: Lecture & Discussion of Concepts using R
Text Book: Chapter 14, Topic 14.1, pp. 329-340
Reading:
Case / Exercise:
CLO No: 1, 2, 3

Session 29
Topic: Ensemble Methods: Bagging and Boosting
Pedagogy: Lecture & Discussion of Concepts using R
Text Book: Chapter 13, Topic 13.1, pp. 311-315
Reading:
Case / Exercise:
CLO No: 1, 2, 3

Session 30
Topic: Ensemble Methods: Bagging and Boosting
Pedagogy: Lecture & Discussion of Concepts using R
Text Book: Chapter 13, Topic 13.1, pp. 311-315
Reading:
Case / Exercise:
CLO No: 1, 2, 3

VI. Readings:

R1. Larose, D.T. and Larose, C.D. (2015). Data Mining and Predictive Analytics, 2nd
edition, Wiley.
R2. Anderson, D.R., Sweeney, D.J., Williams, A.T., Camm, C.D., and Cochran, C.J. (2014).
Statistics for Business and Economics, 12th edition, Cengage.
R3. Ledolter, J. (2013). Data Mining and Business Analytics with R, Wiley.
R4. Kumar, U.D. (2017). Business Analytics-The Science of Data-Driven Decision Making,
Wiley.
R5. Crawley, M.J. (2013). The R-Book, 2nd edition, Wiley.

R6. Kabacoff, R.I. (2015). R in Action: Data Analysis and Graphics with R, 2nd edition,
Dreamtech Press.

Instructions
Attendance in all classes is required. Students are expected to come to class having read
the material assigned for that session. Students are also expected to learn R programming
on their own.
