Customer Churn Analysis in the Telecom Domain

Rahul Jaju
X18125671
Contents

• Business Problem
• Data Preparation
• Exploratory Data Analysis
• Churn Percent of each Attribute
• Model Building
• Confusion Matrix
Business Problem

• ‘Churn’ or ‘Not Churn’
• Customers today go through a complex decision-making process when subscribing to one of the many telecom service options.
• Customer loyalty has become an issue.
• It is very important for companies to identify the attributes associated with a tendency to unsubscribe and to take preventive measures to retain those customers.
Data Preparation
› The raw data consists of 7043 rows and 21 columns; the target variable is “Churn”.
› The dataset is cleaned because it contains many missing values, invalid entries such as “Null”, and imbalanced attributes.
› We have partitioned the dataset into training and testing sets.
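The cleaning and partitioning steps can be sketched in a few lines of Python. This is a minimal illustration, not the procedure actually used for these slides: the column names, the set of missing-value markers, and the toy rows are all hypothetical. The 25% test share is also an assumption, suggested only by the 1,760 test cases in the confusion matrix against 7043 raw rows.

```python
import random

# Assumed placeholders that mark a missing value (hypothetical list).
MISSING = {"", "null", "na", "nan"}

def clean_rows(rows):
    """Drop every row in which any field is empty or a placeholder such as 'Null'."""
    cleaned = []
    for row in rows:
        values = [str(v).strip().lower() for v in row.values()]
        if not any(v in MISSING for v in values):
            cleaned.append(row)
    return cleaned

def train_test_split(rows, test_fraction=0.25, seed=42):
    """Shuffle a copy of the rows and split it into (train, test)."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]

# Toy records with made-up values, for illustration only.
rows = [
    {"tenure": "12", "TotalCharges": "350.5", "Churn": "No"},
    {"tenure": "0", "TotalCharges": "Null", "Churn": "No"},
    {"tenure": "3", "TotalCharges": "", "Churn": "Yes"},
    {"tenure": "40", "TotalCharges": "2100.0", "Churn": "No"},
    {"tenure": "5", "TotalCharges": "410.2", "Churn": "Yes"},
    {"tenure": "22", "TotalCharges": "980.0", "Churn": "No"},
]

cleaned = clean_rows(rows)
train, test = train_test_split(cleaned)
print(len(cleaned), len(train), len(test))
```

Dropping incomplete rows is only one option; imputing the missing values would keep more data at the cost of extra assumptions.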
Exploratory Data Analysis
• Cost of acquiring new customers >> cost of retaining old customers
• Almost 27% of customers ‘Churn’
• 73% of customers remain ‘Not Churn’
Churn Percent of each Attribute
• Gender

• Senior Citizen

• Partner

• Dependents

• Phone Service

• Multiple Lines
Other Attributes
• Internet Service

• Online Security

• Online Backup

• Device Protection

• Tech Support

• Streaming TV
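The churn percent for an attribute is the share of customers with each value of that attribute who churned. A minimal Python sketch of that calculation follows; the function name, the attribute spellings, and the toy records are assumptions made for illustration and do not come from the dataset.

```python
from collections import defaultdict

def churn_percent_by(rows, attribute, target="Churn", positive="Yes"):
    """Return {attribute value: percent of customers with that value who churned}."""
    totals = defaultdict(int)
    churned = defaultdict(int)
    for row in rows:
        value = row[attribute]
        totals[value] += 1
        if row[target] == positive:
            churned[value] += 1
    return {v: round(100.0 * churned[v] / totals[v], 1) for v in totals}

# Toy records with made-up values, for illustration only.
customers = [
    {"SeniorCitizen": "Yes", "Partner": "No", "Churn": "Yes"},
    {"SeniorCitizen": "Yes", "Partner": "Yes", "Churn": "No"},
    {"SeniorCitizen": "No", "Partner": "Yes", "Churn": "No"},
    {"SeniorCitizen": "No", "Partner": "No", "Churn": "Yes"},
    {"SeniorCitizen": "No", "Partner": "Yes", "Churn": "No"},
    {"SeniorCitizen": "No", "Partner": "No", "Churn": "No"},
]

for attribute in ("SeniorCitizen", "Partner"):
    print(attribute, churn_percent_by(customers, attribute))
```

The same function can be called once per attribute in the lists above to compare churn rates across categories.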
Model Building
“A model is like a pair of goggles. It puts certain things into focus.”

› Random Forest has been used for this purpose

› Why Random Forests?
• Among the most accurate learning algorithms available
• Runs efficiently on large datasets
• High performance with little need for interpretation
• Gives an estimate of which variables are important
• Can handle many variables without variable deletion.
• Generated forests can be saved for future use on other data.
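To make the idea concrete, here is a miniature random forest in plain Python. It is a teaching sketch under strong assumptions, not the model behind these slides: every tree is only one split deep (a "stump"), the three columns and all records are made up, and counting how often a feature is chosen is only a crude stand-in for a proper variable-importance measure. It does show the two ingredients that define a random forest: each tree is trained on a bootstrap sample with a random subset of features, and the forest predicts by majority vote.

```python
import random
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())

def majority(labels):
    """Most frequent label in a non-empty list."""
    return Counter(labels).most_common(1)[0][0]

def fit_stump(X, y, feature_ids):
    """Pick the 'feature == value' split with the lowest weighted Gini impurity."""
    best = None
    for f in feature_ids:
        for value in sorted({row[f] for row in X}):
            left = [label for row, label in zip(X, y) if row[f] == value]
            right = [label for row, label in zip(X, y) if row[f] != value]
            if not left or not right:
                continue  # this value does not separate the sample
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(y)
            if best is None or score < best[0]:
                best = (score, f, value, majority(left), majority(right))
    if best is None:
        # No usable split in this sample: always predict its majority class.
        return (None, None, majority(y), majority(y))
    return best[1:]

def fit_forest(X, y, n_trees=25, n_features=2, seed=0):
    """Train each stump on a bootstrap sample and a random subset of features."""
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        sample = [rng.randrange(len(X)) for _ in range(len(X))]
        feature_ids = rng.sample(range(len(X[0])), n_features)
        forest.append(fit_stump([X[i] for i in sample],
                                [y[i] for i in sample], feature_ids))
    return forest

def predict(forest, row):
    """Majority vote over all stumps."""
    votes = Counter()
    for f, value, left_label, right_label in forest:
        votes[left_label if f is None or row[f] == value else right_label] += 1
    return votes.most_common(1)[0][0]

# Hypothetical columns: contract, tech_support, partner. All values are made up.
X = [
    ("Monthly", "No", "No"), ("Monthly", "No", "No"), ("Monthly", "No", "Yes"),
    ("Monthly", "No", "No"), ("Monthly", "No", "Yes"), ("Monthly", "Yes", "No"),
    ("Yearly", "Yes", "Yes"), ("Yearly", "Yes", "Yes"), ("Yearly", "Yes", "No"),
    ("Yearly", "Yes", "Yes"), ("Yearly", "Yes", "No"), ("Yearly", "No", "Yes"),
]
y = ["Yes"] * 6 + ["No"] * 6  # churn label for each row

forest = fit_forest(X, y)
print(predict(forest, ("Monthly", "No", "No")))
print(predict(forest, ("Yearly", "Yes", "Yes")))

# Crude importance proxy: how often each column index was chosen for a split.
print(Counter(f for f, *_ in forest if f is not None))
```

A production model would grow deeper trees and use an established library; the sketch only illustrates why a forest can rank variables and combine many weak trees into one prediction.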
Calculated Confusion Matrix
Confusion matrix and statistics

              Reference
Prediction    No      Yes
No            1232    330
Yes           61      137

Accuracy: 0.7778
95% CI: (0.7577, 0.7971)
No information rate: 0.7347
P-value [Acc > NIR]: 1.673e-05
Kappa: 0.3017
McNemar's test p-value: < 2.2e-16

Sensitivity: 0.9528
Specificity: 0.2934
Pos pred value: 0.7887
Neg pred value: 0.6919
Prevalence: 0.7347
Detection rate: 0.7000
Detection prevalence: 0.8875
Balanced accuracy: 0.6231
'Positive' class: No

Definitions
• Accuracy: the share of all predictions that are correct, (TP + TN) / (TP + TN + FP + FN)
• Recall: TP (True Positive) / (True Positive + False Negative)
• Precision: TP (True Positive) / (True Positive + False Positive)
• F-measure: the harmonic mean of precision and recall

Results
• Accuracy: 78 percent
• Sensitivity: 95.3 percent
• Because the positive class is ‘No’, sensitivity is the share of non-churners correctly identified; the share of actual churners identified is the specificity, 29.3 percent.
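The reported statistics can be recomputed directly from the four cells of the confusion matrix. The sketch below does so in Python; the function name is hypothetical, and the cell assignment assumes, as the output states, that rows are predictions, columns are reference labels, and "No" is the positive class. It also evaluates the same matrix with "Yes" (churn) as the positive class, which the slides do not report.

```python
def binary_metrics(tp, fn, fp, tn):
    """Standard metrics for a 2x2 confusion matrix, given the four cell counts."""
    recall = tp / (tp + fn)        # also called sensitivity
    specificity = tn / (tn + fp)
    precision = tp / (tp + fp)     # also called positive predictive value
    return {
        "accuracy": (tp + tn) / (tp + fn + fp + tn),
        "recall": recall,
        "specificity": specificity,
        "precision": precision,
        "balanced_accuracy": (recall + specificity) / 2,
        "f1": 2 * precision * recall / (precision + recall),
    }

# Positive class "No": TP = predicted No / actually No, FP = predicted No / actually Yes,
# FN = predicted Yes / actually No, TN = predicted Yes / actually Yes.
no_as_positive = binary_metrics(tp=1232, fn=61, fp=330, tn=137)

# Same matrix with "Yes" (churn) treated as the positive class.
yes_as_positive = binary_metrics(tp=137, fn=330, fp=61, tn=1232)

for name, metrics in (("No", no_as_positive), ("Yes", yes_as_positive)):
    print(name, {key: round(value, 4) for key, value in metrics.items()})
```

With "Yes" as the positive class, recall equals the reported specificity (0.2934) and precision equals the reported negative predictive value (0.6919). The F-measure is about 0.86 for the "No" class and about 0.41 for the "Yes" class, so the model is much weaker at finding churners than the overall accuracy suggests.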