Assignment 1
Case: Marketing Head’s Conundrum
1. Perform descriptive analytics on the training data. What insights can you gain based on the
descriptive statistics? (2 points)
2. Develop a logistic regression model for propensity to buy the product/service from WSES.
Perform diagnostic tests to check statistical significance of model developed. (2 points)
Ans.
Logistic regression was performed on the training data in SPSS, with sales outcome as the
dependent variable and relative strength in the segment, profit of customer, sales value, profit %
and joint bid-WSES portion as independent variables, using 0.3 as the cut-off probability. Sales
value was found to be insignificant by the Wald test, so it was removed and the model was
re-estimated with the remaining four variables. Diagnostic tests were then performed on this
model, and the following output was obtained:
Inferences:
The following model was obtained, where Z is the log-odds of winning and
P(Y=1) = exp(Z)/(1+exp(Z)):
Z = 115.5 - 0.89*(Relative Strength in Segment) - 36.399*(Profit of Customer)
- 1.119*(Profit %) + 0.366*(Joint Bid-WSES Portion)
The omnibus test of model coefficients is statistically significant (p-value reported as 0.000,
i.e. p < 0.001), so the model as a whole is statistically significant.
The Wald test is statistically significant for every independent (predictor) variable, with all
p-values reported as 0.000.
The Hosmer-Lemeshow test gave a p-value of 0.255. Since this is above 0.05, the hypothesis that
the model fits the data is not rejected, and we conclude that the logistic model fits the data.
Nagelkerke's R2 was found to be 0.96 (96%) and Cox & Snell R2 was found to be 0.72 (72%).
These pseudo-R2 values measure how much the model with the independent variables improves on
the intercept-only model, so both indicate that the predictors account for a large share of the
variation in the sales outcome.
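As an illustration of how the fitted equation turns into a win probability, here is a minimal Python sketch. The coefficients are copied from the model above; the function name, the dictionary keys and the example input values are hypothetical and chosen only for illustration, since the units of the case variables are not stated here.

```python
import math

# Coefficients copied from the fitted model reported above.
INTERCEPT = 115.5
COEFFICIENTS = {
    "relative_strength": -0.89,
    "profit_of_customer": -36.399,
    "profit_pct": -1.119,
    "joint_bid_wses_portion": 0.366,
}


def win_probability(features):
    """Return P(Y=1) = exp(Z) / (1 + exp(Z)) for one opportunity."""
    z = INTERCEPT + sum(COEFFICIENTS[name] * value
                        for name, value in features.items())
    # Evaluate the logistic function in a numerically stable way.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    exp_z = math.exp(z)
    return exp_z / (1.0 + exp_z)


# Hypothetical opportunity: these input values are assumptions, not case data.
example = {
    "relative_strength": 60,
    "profit_of_customer": 1.0,
    "profit_pct": 20,
    "joint_bid_wses_portion": 10,
}
p_win = win_probability(example)
predicted_outcome = 1 if p_win >= 0.3 else 0  # 0.3 cut-off used in the report
```

Whether a probability exactly equal to the cut-off counts as a win is an assumption of this sketch.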
3. Compare the performance of the model using sensitivity, specificity and precision for a cut-off
probability of 0.3 in training as well as test data. (2 points)
For the model, the SPSS classification table reports observed against predicted outcomes
(0 or 1) together with the percentage classified correctly.
We have 4 possibilities:
True Positives (TP) = the number of opportunities which were correctly classified to be won
False Positives (FP) = the number of opportunities which were incorrectly classified as won
True Negatives (TN) = the number of opportunities which were correctly classified to be lost
False Negatives (FN) = the number of opportunities which were incorrectly classified as lost
                Predicted
                0       1
Observed   0    TN      FP
           1    FN      TP
Classification table for the training data:
                Predicted
                0       1
Observed   0    1456    59
           1    24      1426
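Using the formulas on the training data:
Sensitivity = TP/(TP+FN) = 1426/(1426+24) = 0.9834
Specificity = TN/(TN+FP) = 1456/(1456+59) = 0.9611
Precision = TP/(TP+FP) = 1426/(1426+59) = 0.9603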
Now, applying the same model that was fitted on the training data, we can compute the Z values and
predicted probabilities for the test dataset. With a cut-off probability of 0.3, we get the
classification table shown below:
                Predicted
                0       1
Observed   0    512     23
           1    50      415
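Using the formulas on the test data:
Sensitivity = TP/(TP+FN) = 415/(415+50) = 0.8925
Specificity = TN/(TN+FP) = 512/(512+23) = 0.9570
Precision = TP/(TP+FP) = 415/(415+23) = 0.9475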
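The scoring step described for the test data (obtain a predicted probability per opportunity, apply the 0.3 cut-off, then tabulate against the observed outcome) can be sketched in Python as follows. The function names and the six example records are hypothetical and are not taken from the case data; treating a probability equal to the cut-off as a predicted win is also an assumption.

```python
def classify(probabilities, cutoff=0.3):
    """Label an opportunity as won (1) when its probability reaches the cut-off."""
    return [1 if p >= cutoff else 0 for p in probabilities]


def confusion_counts(observed, predicted):
    """Count TN, FP, FN and TP from paired observed and predicted labels."""
    counts = {"TN": 0, "FP": 0, "FN": 0, "TP": 0}
    for actual, guess in zip(observed, predicted):
        if actual == 1 and guess == 1:
            counts["TP"] += 1
        elif actual == 0 and guess == 1:
            counts["FP"] += 1
        elif actual == 1 and guess == 0:
            counts["FN"] += 1
        else:
            counts["TN"] += 1
    return counts


# Hypothetical records, for illustration only (not the WSES test data).
observed = [1, 0, 1, 0, 1, 0]
probabilities = [0.90, 0.10, 0.35, 0.40, 0.20, 0.05]

predicted = classify(probabilities, cutoff=0.3)
table = confusion_counts(observed, predicted)
```

A low cut-off such as 0.3 labels more opportunities as wins than the default 0.5 would, which tends to raise sensitivity at the expense of specificity.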
Summary:
Performance Measure    Training Data    Test Data
Sensitivity            0.9834           0.8925
Specificity            0.9611           0.9570
Precision              0.9603           0.9475
All three performance measures are lower on the test data than on the training data, with
sensitivity dropping the most (0.9834 to 0.8925). The model therefore performs better on the
training data, but its performance on the test data is still high, so the same model can be used
for future predictions.
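The figures in the summary can be reproduced from the two classification tables with a few lines of Python. This is a minimal sketch: the cell counts are the ones reported above, while the function and variable names are chosen here for illustration.

```python
def performance(tn, fp, fn, tp):
    """Compute sensitivity, specificity and precision from confusion-table counts."""
    return {
        "sensitivity": tp / (tp + fn),  # share of actual wins predicted as wins
        "specificity": tn / (tn + fp),  # share of actual losses predicted as losses
        "precision": tp / (tp + fp),    # share of predicted wins that were wins
    }


# Cell counts taken from the training and test classification tables.
training = performance(tn=1456, fp=59, fn=24, tp=1426)
test_data = performance(tn=512, fp=23, fn=50, tp=415)

for measure in training:
    print(f"{measure:<12} {training[measure]:.4f}  {test_data[measure]:.4f}")
```

Rounded to four decimals, the printed values match the summary table.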
4. Rank the variables based on their discrimination power (2 points).
5. Use data from the previous table to understand the impact of each variable on the win/loss
probability; i.e. study the impact of a variable by changing its value while keeping other variables
constant. (3 points)
a) Does winning improve with better relative strength perception for the segment?
Ans:
From the logistic regression results table, we can observe that the coefficient of relative
strength perception for the segment is significant and negative. This means that as relative
strength in the segment increases, with the other variables held constant, the Z value of the
logistic regression equation decreases. Since P(Y=1) = exp(Z)/(1+exp(Z)) is the winning
probability and increases with Z, P(Y=1) decreases as relative strength in the segment increases.
So, according to this model, winning does not improve with better relative strength perception
for the segment.
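One way to read the sign of a coefficient is through its odds ratio: raising a variable by one unit while holding the others constant multiplies the odds of winning, P/(1-P), by exp(coefficient). The sketch below applies this to the reported coefficients; the dictionary keys are names chosen here for illustration.

```python
import math

# Coefficients copied from the fitted logistic regression model.
coefficients = {
    "relative_strength": -0.89,
    "profit_of_customer": -36.399,
    "profit_pct": -1.119,
    "joint_bid_wses_portion": 0.366,
}

# Odds ratio: factor applied to the odds of winning per one-unit increase.
odds_ratios = {name: math.exp(beta) for name, beta in coefficients.items()}

for name, ratio in odds_ratios.items():
    direction = "raises" if ratio > 1 else "lowers"
    print(f"{name}: odds ratio {ratio:.4g} ({direction} the odds of winning)")
```

An odds ratio below 1, as for relative strength in the segment (about 0.41), means each one-unit increase lowers the odds of winning, which agrees with the conclusion above.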
So, deal size has no significant relationship with the deal outcome.
f) Which of the levers are controllable and which of these are non-controllable?
Ans:
Relative strength in the segment, which maps to the Product (product category by talent
required), Region (area of the client) and Industry (of the customer), is NOT controllable.
Levers like Sales Value, Profit % and Joint Bid-WSES Portion are controllable. This is
because sales value and margins depend on the deal negotiation and can therefore be influenced
in the proposal to be submitted. Similarly, the option of bundling its product with others
makes the Joint Bid-WSES Portion lever controllable.
3. Suggest a deployment strategy for approaching customers that will result in maximum
profit. WSES can approach a maximum of 100 clients, and at least 10% of them should come from
each product. (4 points)