
Indian Institute of Management Nagpur

BDM&DM – Final Examination


Logistic Regression

a) Logistic regression is used to model the relationship between a categorical dependent
variable and one or more independent variables, which can be on a nominal, ordinal,
interval or ratio scale.
b) Logistic regression cannot be used when the dependent variable is not discrete. It is
also unreliable when there is a high level of multicollinearity among the independent
variables.
c) Some issues with logistic regression are: there is no goodness-of-fit statistic directly
equivalent to the R2 of linear regression (only pseudo-R2 measures are available); it
requires sufficient responses in each category, otherwise the errors will be larger; and
the method of maximum likelihood can give poor estimates when the sample is small.
d) One of the many applications of logistic regression is predicting employee attrition in
an organization. Because the dependent variable is categorical (the employee left the job
or did not) and the relationship between the dependent and independent variables need
not be linear, logistic regression is a well-suited model for this situation.
The independent variables to be collected can include gender, marital status, age,
education, years of experience, salary, designation, previous years' performance ratings,
etc. This past information, together with whether each employee left the job or not, is
used as input data for logistic regression, which yields a logistic equation that predicts
the future attrition probability of an employee. Management can then focus on employees
with a high predicted probability and interview them to prevent their attrition in the
future.
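
The attrition example above can be sketched in code. The following is a minimal illustration only, not a production model: it fits a logistic equation by plain gradient ascent on the log-likelihood, using a tiny made-up data set. The two features (years of experience and last performance rating), the data values, the learning rate and the function names are all assumptions chosen for illustration.

```python
import math

def sigmoid(z):
    """Logistic function: maps any real number to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def predict_proba(weights, x):
    """Attrition probability for one employee; weights[0] is the intercept."""
    return sigmoid(weights[0] + sum(w * v for w, v in zip(weights[1:], x)))

def fit_logistic(rows, labels, lr=0.1, epochs=2000):
    """Fit the weights by gradient ascent on the average log-likelihood."""
    weights = [0.0] * (len(rows[0]) + 1)
    for _ in range(epochs):
        grad = [0.0] * len(weights)
        for x, y in zip(rows, labels):
            err = y - predict_proba(weights, x)  # observed minus predicted
            grad[0] += err
            for j, v in enumerate(x):
                grad[j + 1] += err * v
        weights = [w + lr * g / len(rows) for w, g in zip(weights, grad)]
    return weights

# Hypothetical past data: [years_of_experience, last_rating (1-5)].
employees = [[1, 2], [2, 1], [1, 3], [3, 2], [6, 4], [8, 5], [7, 4], [9, 3]]
left = [1, 1, 1, 1, 0, 0, 0, 0]  # 1 = left the job, 0 = stayed

weights = fit_logistic(employees, left)
at_risk = predict_proba(weights, [2, 2])   # short tenure, low rating
settled = predict_proba(weights, [8, 4])   # long tenure, good rating
print(f"at risk: {at_risk:.2f}, settled: {settled:.2f}")
```

In this toy data the employee with short tenure and a low rating receives the higher attrition probability, which is the kind of ranking management could use to decide whom to interview. With real data, categorical inputs such as gender or designation would first need to be encoded as numbers.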

Classification Trees

a) Classification trees are used effectively when the dependent variable is categorical.

b) Classification trees cannot be used directly when the dependent variable is continuous.
It needs to be transformed into discrete classes before the data is used in a
classification tree.
c) Issues with classification trees are: time-sensitive data requires a lot of data
preparation before it can be used; the trees discard some of the data in the data set;
and overfitting of the data is also a problem.
d) An application of classification trees is understanding customer churn. The independent
variables collected in the data set can be demographics, location, call minutes during
different periods of the day, charges for different services, total call duration, etc.
These attributes can be used to predict whether a customer is likely to churn in the
future, and companies can then take steps to prevent churn by providing different offers
to that customer.
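
To make the churn example concrete, the sketch below shows how a tree might choose a single split. It assumes a CART-style criterion (weighted Gini impurity), which is one common choice and not necessarily the one intended in this answer. The attribute day_minutes, the data values and the function names are hypothetical.

```python
def gini(labels):
    """Gini impurity of a list of 0/1 labels (0.0 means the group is pure)."""
    n = len(labels)
    if n == 0:
        return 0.0
    p = sum(labels) / n
    return 1.0 - p * p - (1.0 - p) ** 2

def best_split(values, labels):
    """Try each observed value as a threshold and return the one with the
    lowest weighted Gini impurity over the two resulting groups."""
    best_threshold, best_score = None, float("inf")
    for t in sorted(set(values))[:-1]:  # the largest value cannot split the data
        left = [y for v, y in zip(values, labels) if v <= t]
        right = [y for v, y in zip(values, labels) if v > t]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
        if score < best_score:
            best_threshold, best_score = t, score
    return best_threshold, best_score

# Hypothetical data: daytime call minutes per customer and whether they churned.
day_minutes = [120, 150, 180, 200, 260, 280, 300, 320]
churned = [0, 0, 0, 0, 1, 1, 1, 1]

threshold, impurity = best_split(day_minutes, churned)
print(f"split at day_minutes <= {threshold}, weighted Gini = {impurity}")
```

A full tree repeats this search inside each resulting group and across all attributes. Stopping the growth early or pruning afterwards is the usual guard against the overfitting issue noted in c).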

Artificial Neural Network

a) Artificial neural networks can be used effectively on any type of data set, both
continuous and categorical.
b) Artificial neural networks can be used for all types of data sets.
c) Issues with artificial neural networks are: they can be used only for data sets with a
small number of input variables; there is no guarantee that the weights assigned to the
input variables are optimal; and there is no mathematical model available to explain
what the network is doing in order to generate the outcome. An ANN does not provide a set
of mathematical rules that it used to generate its results.
d) An application of artificial neural networks is the detection of credit card fraud. The
independent variables in the data set used for modelling can be demographics, amount of
credit given in the past, history of past payments, amount of bill paid, amount of bill
pending, etc. Since the data is time dependent, using an ANN for this prediction can give
high accuracy in predicting the probability of fraud by an individual in the future,
based on past attributes.
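
The fraud example can be illustrated with the forward pass of a very small network. This is a sketch under stated assumptions, not a trained model: the layer sizes, the two scaled inputs (share of bill paid and share of bill pending) and every weight below are hand-picked, hypothetical values. In practice the weights would be learned from past data.

```python
import math

def sigmoid(z):
    """Logistic activation: maps any real number to a value in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def forward(x, hidden_weights, hidden_bias, output_weights, output_bias):
    """One hidden layer followed by a single output neuron."""
    hidden = [
        sigmoid(sum(w * v for w, v in zip(ws, x)) + b)
        for ws, b in zip(hidden_weights, hidden_bias)
    ]
    return sigmoid(sum(w * h for w, h in zip(output_weights, hidden)) + output_bias)

# Hypothetical, hand-picked parameters (two inputs, two hidden neurons).
hidden_weights = [[1.5, -2.0], [-1.0, 2.5]]
hidden_bias = [0.0, -0.5]
output_weights = [-2.0, 3.0]
output_bias = -0.5

# Inputs scaled to [0, 1]: [share_of_bill_paid, share_of_bill_pending].
low_risk = forward([0.9, 0.1], hidden_weights, hidden_bias, output_weights, output_bias)
high_risk = forward([0.2, 0.8], hidden_weights, hidden_bias, output_weights, output_bias)
print(f"low risk: {low_risk:.2f}, high risk: {high_risk:.2f}")
```

The output is a nested nonlinear function of the inputs, so no individual weight reads as a simple rule. This is the interpretability issue described in c).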