Item 64/23 - Annexure - 18
MDIZ003 | Advanced Predictive Analytics | L T P J C
3 0 2 0 4
Pre-requisite | Nil | Syllabus version
1.0
Course Objectives:
1. To learn how to develop models that predict categorical and continuous outcomes,
using techniques such as decision trees, logistic regression, neural networks, and
Bayesian models.
2. To advise on when and how to use each model, and to learn how to combine two or
more models to improve prediction.
Course Outcome:
1. Understand the process of formulating objectives, selecting/collecting data, and
preparing and processing it to successfully design the model.
2. Prepare and process data for the models.
3. Gain insights from the data through Exploratory Data Analysis for feature
engineering.
4. Compare the underlying predictive modeling techniques, and analyze the performance
of the model and the quality of the results.
5. Explore hybrid models to enhance prediction performance.
6. Compare time series models and apply predictive modeling approaches using a
suitable Python package.
Module:1 | Introduction | hours
Overview of Predictive Analytics - Business Intelligence - Statistics - Challenges - Data
Modelling Obstacles - Processing Steps: CRISP-DM
Module:2 | Problem Understanding and Data Preparation | hours
Understanding Business Problem - Prediction Variable - Data Requirement - Access to
Data - Solution Method - Key Metrics - Model Performance - Diamond Prices - Case Study
- Data Collection - Preparation - Numerical Features - Encoding Categorical Features - Low
Variance Features - Near Collinearity - One-hot Encoding
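As a minimal illustration of two topics named in this module, one-hot encoding and the removal of low-variance features, the sketch below uses plain Python. The toy diamond records, the column names, and the variance threshold are assumptions made for the example; they are not taken from the case study data.

```python
from statistics import pvariance

# Toy diamond records (made-up values for illustration only).
rows = [
    {"carat": 0.3, "cut": "Ideal", "depth_flag": 1},
    {"carat": 0.7, "cut": "Good", "depth_flag": 1},
    {"carat": 1.1, "cut": "Premium", "depth_flag": 1},
    {"carat": 0.5, "cut": "Ideal", "depth_flag": 1},
]

def one_hot(rows, column):
    """Replace a categorical column with one 0/1 indicator column per level."""
    levels = sorted({r[column] for r in rows})
    encoded = []
    for r in rows:
        new = {k: v for k, v in r.items() if k != column}
        for level in levels:
            new[f"{column}_{level}"] = 1 if r[column] == level else 0
        encoded.append(new)
    return encoded

def drop_low_variance(rows, threshold=0.0):
    """Drop numeric columns whose population variance is <= threshold."""
    keep = [c for c in rows[0] if pvariance([r[c] for r in rows]) > threshold]
    return [{c: r[c] for c in keep} for r in rows]

prepared = drop_low_variance(one_hot(rows, "cut"))
print(sorted(prepared[0]))
```

The constant `depth_flag` column is dropped because its variance is zero. The indicator columns for `cut` always sum to one, so a linear model with an intercept would usually drop one level to avoid exact collinearity.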
Module:3 | Feature Engineering | hours
Dataset Understanding - Exploratory Data Analysis - Univariate - Bivariate - Multivariate -
Encoding Categorical Predictors - Engineering Numeric Predictors - Feature Selection -
Methodologies - Irrelevant Feature Effect - Overfitting - Greedy Search - Global Search
Module:4 | Predictive Modeling | 7 hours
Decision Trees - Logistic Regression - Neural Networks - k-NN - Naive Bayes - Linear
Regression.
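As a small illustration of one technique from this module, the sketch below implements a k-nearest-neighbour classifier in plain Python. The function name `knn_predict`, the training points, and the labels are assumptions made for the example.

```python
from collections import Counter
from math import dist

def knn_predict(train_X, train_y, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    nearest = sorted(zip(train_X, train_y), key=lambda pair: dist(pair[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Made-up two-feature points forming two well-separated groups.
train_X = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]
train_y = ["low", "low", "low", "high", "high", "high"]

print(knn_predict(train_X, train_y, (1.1, 1.0)))
print(knn_predict(train_X, train_y, (5.1, 5.0)))
```

Distances here are Euclidean, so in practice features measured on different scales would normally be standardised first.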
Module:5 | Model Assessment and Ensembles | 7 hours
Approaches - Batch Assessment - Rank-Ordered - Assessing Regression Models - Model
Ensembles - Bagging - Boosting - Random Forests - Heterogeneous Ensembles.
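Bagging, listed above, can be sketched in a few lines: fit one weak model per bootstrap sample and combine the models by majority vote. The example below is a minimal sketch in plain Python using one-dimensional decision stumps; the helper names, the toy data, the number of models, and the random seed are all assumptions made for illustration.

```python
import random
from collections import Counter

def fit_stump(xs, ys):
    """Find the threshold t minimising errors of the rule: predict 1 if x > t."""
    best_t, best_err = None, None
    for t in [float("-inf")] + sorted(set(xs)):
        err = sum((1 if x > t else 0) != y for x, y in zip(xs, ys))
        if best_err is None or err < best_err:
            best_t, best_err = t, err
    return best_t

def fit_bagging(xs, ys, n_models=25, seed=0):
    """Fit one stump per bootstrap sample (rows drawn with replacement)."""
    rng = random.Random(seed)
    n = len(xs)
    thresholds = []
    for _ in range(n_models):
        idx = [rng.randrange(n) for _ in range(n)]
        thresholds.append(fit_stump([xs[i] for i in idx], [ys[i] for i in idx]))
    return thresholds

def predict_bagging(thresholds, x):
    """Aggregate the stumps by majority vote."""
    votes = Counter(1 if x > t else 0 for t in thresholds)
    return votes.most_common(1)[0][0]

# Toy data: class 0 for small x, class 1 for large x.
xs = [1, 2, 3, 4, 6, 7, 8, 9]
ys = [0, 0, 0, 0, 1, 1, 1, 1]
model = fit_bagging(xs, ys)
print(predict_bagging(model, 0.5), predict_bagging(model, 9.5))
```

A random forest follows the same pattern with full decision trees and an extra step that randomly restricts the features considered at each split.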
Module:6 | Time Series Prediction | 7 hours
Statistical Models - Autoregressive Models - Moving Average Models - Autoregressive
Integrated Moving Average Models - State Space Models - Hidden Markov Models - Deep
Learning Models - Recurrent Neural Networks.
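The simplest autoregressive model in this module, AR(1), has the form x[t] = c + phi * x[t-1] + noise. The sketch below estimates c and phi by ordinary least squares and iterates the fitted recursion to forecast. It is a minimal illustration in plain Python; the function names and the noise-free toy series are assumptions made for the example.

```python
def fit_ar1(series):
    """Least-squares estimate of c and phi in x[t] = c + phi * x[t-1]."""
    prev, curr = series[:-1], series[1:]
    n = len(prev)
    mean_prev = sum(prev) / n
    mean_curr = sum(curr) / n
    cov = sum((p - mean_prev) * (q - mean_curr) for p, q in zip(prev, curr))
    var = sum((p - mean_prev) ** 2 for p in prev)
    phi = cov / var
    c = mean_curr - phi * mean_prev
    return c, phi

def forecast(last_value, c, phi, steps):
    """Iterate the fitted recursion forward from the last observed value."""
    out = []
    for _ in range(steps):
        last_value = c + phi * last_value
        out.append(last_value)
    return out

# Noise-free toy series generated by x[t] = 1 + 0.5 * x[t-1], starting at 10.
series = [10.0]
for _ in range(9):
    series.append(1 + 0.5 * series[-1])

c, phi = fit_ar1(series)
print(round(c, 6), round(phi, 6))
print([round(v, 4) for v in forecast(series[-1], c, phi, 3)])
```

Because the toy series contains no noise, the fit recovers the generating coefficients up to floating-point error; on real data the estimates would carry sampling error, and an ARIMA model would add differencing and moving-average terms.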
Module:7 | Python Stack and Case Studies | 6 hours
Anaconda - Jupyter - NumPy - pandas - Matplotlib - Seaborn - scikit-learn - TensorFlow
- Keras - Dash - Case Studies - Diamond Prices - Credit Card Defaults
Module:8 | Contemporary Issues | hours
Total Lecture hours: | 45 hours
Proceedings of the 64th Academic Council (16.12.2021)
Text Book(s)
1. | Feature Engineering and Selection: A Practical Approach for Predictive Models,
1st edition, Max Kuhn and Kjell Johnson, 2019, Taylor and Francis.
Reference Books
1. | Applied Predictive Analytics: Principles and Techniques for the Professional Data
Analyst, 1st edition, Dean Abbott, Wiley, 2014.
2. | Hands-On Predictive Analytics with Python: Master the Complete Predictive Analytics
Process, from Problem Definition to Model Deployment, 1st edition, Alvaro Fuentes,
Birmingham: Packt Publishing, 2018.
3. | Practical Time Series Analysis, 1st edition, Aileen Nielsen, O'Reilly Media, 2019.
Mode of Evaluation: CAT / Assignment / Quiz / FAT / Project / Seminar
List of Experiments
1. | House rent prediction using linear regression | 3 hours
2. | Medical diagnosis for disease classification using decision trees | 3 hours
3. | Automate email classification and response using k-NN classifiers | 2 hours
4. | Customer segmentation in a business model based on demographic, psychographic and behavior data using Naive Bayes classifiers | 3 hours
5. | Analysis of tweet data to predict the sentiments on a product | 2 hours
6. | Analyze crime data using AR and ARIMA time series techniques on reported incidents of crime based on time and location | 2 hours
7. | Construct a recommendation system based on customer transaction data using the Random Forest method | 2 hours
8. | Prediction on power consumption data to suggest ways of minimizing the usage | 2 hours
9. | Buying prediction of customers for any online product purchase | 3 hours
10. | Agricultural data analysis for yield prediction and crop selection on an Indian terrain data set | 3 hours
11. | Develop a recommender system for any real-world problem (e.g., when a user queries to find a good hospital for Covid-19 treatment) | 3 hours
12. | Develop a business model to predict the trend in Investment and Funding | hours
Total Laboratory Hours | 30 hours
Mode of Evaluation: Project/Activity
Recommended by Board of Studies | 25-10-2021
Approved by Academic Council | No. 64 | Date | 25-11-2027