
TREND ANALYSIS AND PREDICTION OF CUSTOMER
CHURN IN TELECOMMUNICATION INDUSTRY USING
MACHINE LEARNING

By

Kashmala Sahar
Reg. #S20-025-16854

Supervised by
Mr. Muhammad Waleed Ashraf

Master of Science
In
Computer Science
at

Riphah International University,


Faisalabad Campus, Pakistan
January, 2019

TREND ANALYSIS AND PREDICTION OF CUSTOMER
CHURN IN TELECOMMUNICATION INDUSTRY USING
MACHINE LEARNING

By
Kashmala Sahar
Reg. # S20-025-16854

Supervised by
Mr. Muhammad Waleed Ashraf

A thesis submitted in partial fulfillment of requirements for the degree of

Master of Science
In
Computer Science
at
Riphah International University,
Faisalabad Campus, Pakistan
January, 2019

APPROVAL SHEET

SUBMISSION OF HIGHER RESEARCH DEGREE THESIS


The following statement is to be signed by the candidate’s supervisor(s), Dean/HOD and
must be received by the COE, prior to the dispatch of the thesis to the approved examiners.

Candidate’s Name & Reg#: Kashmala Sahar (S20-025-16854)


Program Title: Master of Science in Computer Science, MS (CS)
Faculty/Department: Computing
Thesis Title: Trend Analysis and Prediction of Customer Churn in Telecommunication
Industry Using Machine Learning
I hereby certify that the above candidate’s work, including the thesis, has been completed to
my satisfaction and that the thesis is in a format and of an editorial standard recognized by the
faculty/department as appropriate for examination. The Thesis has been checked through
Turnitin for plagiarism (test report attached).

Signature (s):

Principal Supervisor: _________


Date: _______________________
Co-Supervisor: ______________
(if any) ______________________
Date: _______________________

Plagiarism In-charge: _________

Date and Stamp: _____________

The undersigned certify that:


1. The candidate presented at a pre-completion seminar, an overview and
synthesis of major findings of the thesis, and that the research is of a standard and
extent appropriate for submission as a thesis.
2. I have checked the candidate’s thesis and its scope, format, and editorial
standards are recognized by the faculty/department as appropriate.
3. The plagiarism check has been performed. The report is attached.

Signature (s):

Principal/ Head of Department: ___________

Date: ____________

DECLARATION

I certify that the research work presented in this thesis is my own to the best of my
knowledge. All sources used and any help received in the preparation of this
dissertation have been acknowledged. I hereby declare that I have not submitted this
material, either in whole or in part, for any other degree at this or any other institution.

Note: It is the responsibility of the student to write the thesis carefully in his or her
own words. In case of any plagiarism issue or any other related issue(s), the student
will be solely responsible.

Name: Kashmala Sahar


Registration no: S20-025-16854

Signature: __________________

ACCEPTANCE CERTIFICATE
(Department will provide you these pages)

ACKNOWLEDGEMENT

First and foremost, I would like to profoundly praise the Almighty Allah SWT for
enabling me to see this great moment. I would like to express my deepest gratitude
and appreciation to my supervisor, Mr. Muhammad Waleed Ashraf, who graciously
helped me in every way I needed to get through all difficulties. I have been extremely
honored to have a supervisor who cared so much about my work and who responded
to my questions and queries so promptly. I am truly thankful to him; without his
excellent guidance this thesis would not have been possible.

I would also like to express my gratitude to the rest of the faculty members of Riphah
International University, Faisalabad Campus, who have graciously offered their time,
expertise, and wisdom and encouraged me to complete my thesis work in a better way.

DEDICATION

This thesis is dedicated to my parents for always believing in me, inspiring me,
and encouraging me to achieve my goals.

TABLE OF CONTENTS
Chapters
1. INTRODUCTION
1.1 Overview
1.2 Research Background
1.3 Problem Statement
1.4 Research Questions
1.5 Research Objectives
1.6 Research Significance
1.7 Thesis Organization
1.8 Chapter Summary
2. REVIEW OF LITERATURE
2.1 Introduction
2.2 Related Work
2.3 Discussion and Analysis
2.4 Chapter Summary
3. RESEARCH METHODOLOGY
3.1 Introduction
3.2 Research Framework
3.2.1 Data Collection
3.2.2 Data Preprocessing
3.3 Evaluation Metrics
3.4 Tools and Technology
3.5 Proposed Method
3.6 Chapter Summary
4. RESULTS AND DISCUSSION
4.1 Introduction
4.2 Design of Proposed Method
4.3 Results and Discussions
5. SUMMARY, FINDINGS, CONCLUSION & RECOMMENDATIONS
5.1 Summary
5.2 Findings
5.3 Conclusion
5.3.1 Limitations
5.4 Recommendations
5.4.1 Future Recommendations
REFERENCES
APPENDICES
A: Source Code of Proposed Method (If applicable)

LIST OF TABLES

Table 1.1 Critical literature analysis
Table 2.1 Comparison results of KNN, RF and XGB
Table 2.2 Performance results of comparison model
Table 2.3 Performance of LR and BLR
Table 3.1 Creation of confusion matrix
Table 3.2 Basic terminologies used by confusion matrix
Table 3.3 Measurement derived by confusion matrix
Table 3.4 Confusion matrix
Table 3.5 Evaluation parameters
Table 4.1 Results of classification method XGB
Table 4.2 Comparison of accuracy XGB
Table 4.3 Research of churn prediction using telecom dataset

LIST OF FIGURES

Figure 2.1 Life cycle of prediction analysis
Figure 2.2 Prediction behavior
Figure 2.3 Prediction behavior
Figure 2.4 Churn prediction using hybrid
Figure 2.5 Parallel ensemble
Figure 3.1 Customer churn framework
Figure 3.2 Customer churn dataset
Figure 3.3 Six steps for customer churn prediction
Figure 3.4 Confusion matrix
Figure 3.5 Methodology
Figure 4.1 Flow diagram for proposed method
Figure 4.2 Service type
Figure 4.3 Satisfaction of customer with signal strength
Figure 4.4 Satisfaction of customer with call rate
Figure 4.5 Satisfaction of customer with price of packages
Figure 4.6 Quality of internet
Figure 4.7 Availability of 4G services
Figure 4.8 Customer relation duration with companies
Figure 4.9 Churn rate of telecom companies
Figure 4.10 Area under curve

LIST OF ABBREVIATIONS

AUC Area Under Curve
AGBPNN Adaptive Gain with Back-Propagation Neural Network
ANN Artificial Neural Network
BLR Bayesian Boosting Logistic Regression
CRM Customer Relationship Management
CCP Customer Churn Prediction
CDS Clouding Data Server
CLV Customer Lifetime Value
DM Data Mining
DT Decision Tree
DFNN Deep Feed-Forward Neural Network
GA Genetic Algorithm
GBDT Gradient Boosting Decision Tree
ISP Internet Service Provider
KNN K-Nearest Neighbors
KSVM Kernel Support Vector Machine
LR Logistic Regression
ML Machine Learning
MLP Multi-Layer Perceptron Neural Network
RBFN Radial Basis Function Network
RST Rough Set Theory
SMOTE Synthetic Minority Over-sampling Technique
SOM Self-Organizing Map
SVM Support Vector Machine
UCI University of California Irvine
WRF Weighted Random Forest
XGBoost Extreme Gradient Boosting
XGB Extreme Gradient Boosting

ABSTRACT

In recent years, customer churn prediction has become a prominent topic in the telecom
industry. Telecommunication companies produce a massive amount of data every
minute. Churn refers to the loss of regular customers because of competitors’ offers or
network issues; in such cases, the customer may decide to discontinue their service
subscription. The churn rate has an important impact on customer lifetime value and
also affects a company's future income. Companies are therefore seeking models that
can predict customer churn, because it has a direct impact on the industry's profits.
Machine learning techniques are used to create a prediction model that can identify
which customers are likely to terminate their subscriptions. In today's competitive
business environment, customer retention is becoming an increasingly significant
task. This research examines customer trends and predicts customer churn. The
XGBoost algorithm produced the best results in this research. On a dataset with
19.26% churned and 80.73% non-churned customers, XGBoost achieved 89.39%
accuracy, 73.71% precision, 69.79% recall, and a 71.69% F1-score.

Keywords: churn, machine learning, customer trend, telecommunication, XGBoost

CHAPTER 1
INTRODUCTION
1.1 Overview
Customer behavior drives customer trends, and customer dissatisfaction drives
customer churn. The movement of customers from one company to another within a
given period is called “customer churn” (Keramati & Ardabili, 2011). The main
cause of churn is customer dissatisfaction (A. A. Ahmed & Maheswari, 2017).
Customer churn is expressed as the percentage of customers who stop using a
company's products or services over a given period. Mathematically, C(T) =
(B(T) - A(T)) / B(T) * 100, where C(T) is the customer churn percentage for time
frame T, B(T) is the total number of customers before T, and A(T) is the number of
customers remaining after T. Retaining customers is more profitable than attracting
new ones (Sivasankar & Vijaya, 2019), and the cost of retaining a customer is much
lower than the cost of acquiring a new one (Huang, Kechadi, & Buckley, 2012).
Moreover, long-standing, satisfied customers are more profitable for a firm and are
unlikely to churn (Karahoca & Karahoca, 2011), (Idris, Iftikhar, & ur Rehman, 2019).
Customer churn causes financial loss, which in turn affects the company’s reputation
(Zhao, Gao, Dong, Dong, & Dong, 2017). Churn can be analyzed to estimate how
likely customers are to leave. Predicting churners requires an adequate churn
prediction model (Dalvi, Khandge, Deomore, Bankar, & Kanade, 2016), (Chu, Tsai,
& Ho, 2007). In recent years, different machine learning algorithms have been used
to predict churn in telecommunication (Huang, Buckley, & Kechadi, 2010).
Furthermore, telecom service providers design architecture models to gain more
subscribers (Verbraken, Verbeke, & Baesens, 2014). These models focus on attracting
new customers as well as keeping existing ones, because customer retention is
cheaper and more profitable than acquiring a new customer (A. Ahmed & Linen,
2017). N. Lu, Lin, Lu, and Zhang (2012) recommended the AdaBoost technique for
churn prediction because it achieves high accuracy and allows good separation for
churn models; they also compared boosted logistic regression with a single logistic
regression model and reported LR as the best-performing method. Azeem, Usman,
and Fong (2017) evaluated the performance of various classifiers using a fuzzy model
and found that fuzzy classifiers predict customer churn more accurately on noisy
datasets. C.-F. Tsai and Lu (2009) evaluated ANN and hybrid data mining techniques
that combine ANN and SOM to achieve high prediction accuracy. Today's businesses
aim to raise customer satisfaction through customer lifecycle analysis (Rygielski,
Wang, & Yen, 2002). Customer relationship management (CRM) is a marketing
philosophy that consists of four dimensions: customer recognition, customer
attraction, customer retention, and customer growth (Hosseini, Maleki, & Gholamian,
2010).
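As a worked illustration of the churn-rate expression in the overview, the following Python sketch computes the percentage of customers lost over a period. It assumes the conventional reading in which churn is the share of the starting customer base that is no longer subscribed at the end of the period, and it ignores customers acquired during the period. The function name `churn_rate` and the sample figures are illustrative assumptions, not values from this research.

```python
def churn_rate(customers_before: int, customers_after: int) -> float:
    """Return churn as a percentage of the starting customer base.

    customers_before: B(T), customers at the start of the period.
    customers_after:  A(T), those same customers still subscribed at the end.
    Assumption: customers gained during the period are not counted.
    """
    if customers_before <= 0:
        raise ValueError("customers_before must be positive")
    if not 0 <= customers_after <= customers_before:
        raise ValueError("customers_after must lie between 0 and customers_before")
    lost = customers_before - customers_after
    return lost / customers_before * 100


# Hypothetical example: 1,000 subscribers at the start, 950 remain at the end.
example_rate = churn_rate(1000, 950)
print(f"Churn rate: {example_rate:.2f}%")
```

With these made-up numbers, 50 of 1,000 customers leave, so the churn rate is about 5%.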
1.2 Research Background
Most highly competitive companies have realized that retaining current and
valuable customers is a key managerial strategy for surviving in business. As a result,
churn management, the process of reducing the number of customers who leave the
business, has become significant. Customer turnover occurs when a company's
customers leave (C.-F. Tsai & Chen, 2010). To minimize churn, mobile service
providers must deploy churn prediction models that can accurately detect customers
who are about to leave; the classification algorithm should be chosen carefully, and
the data should be cleaned. SVM can also be used to evaluate customer churn activity
(Ben, 2020). Various machine learning algorithms, data mining algorithms, and
hybrid techniques are used for customer churn prediction. These methods help
companies identify and predict churn, retain customers, and make decisions. Most
studies use decision trees for churn prediction, but decision trees are not appropriate
for complicated problems (Lazarov & Capota, 2007). However, one study shows that
reducing the data improved the accuracy of the decision tree (Vadakattu, Panda,
Narayan, & Godhia, 2015). Machine learning algorithms are used both for customer
churn prediction and for historical analysis, including regression and techniques
such as rule-based learning, decision trees, and ANN (Qureshi, Rehman, Qamar,
Kamal, & Rehman, 2013). Nowadays telecom service providers lose valuable
customers to their competitors, which is a major reason for customer churn. Many
developments have occurred in the telecommunications industry in recent years,
including new networks, technology, and market liberalization, all of which have
increased competition.

Moreover, machine learning techniques have recently emerged to address
customer churn, which is a main concern in the telecommunications industry.
Classification is a very important phase of the churn prediction process. Many
classification techniques have been used successfully for prediction, such as Decision
Tree, Support Vector Machine, Naïve Bayes, Logistic Regression, and Artificial
Neural Network.
With advances in information technology, data volume has increased at a
phenomenal rate over the last two decades. There has also been a lot of progress in
data mining, and many new techniques and algorithms have been developed to
process data and extract information. The data collected from many sources is raw,
in the sense that it has not been processed, and the important knowledge in it is
hidden. Just as extracting minerals is known as mining, extracting this hidden
information is known as data mining. Customer churn is the most challenging
problem that the telecom industry faces. Customer churn models are designed to
recognize customers who are likely to leave (Almana, Aksoy, & Alzahrani, 2014).
Retaining existing customers remains the company's preferred choice over attracting
new ones, because attracting a new customer costs about five to six times as much as
keeping an existing one.
It is regarded as a business loss when the number of customers falls below the
expected number. Retaining current customers is necessary to increase profit and
returns. To prevent losses to the telecom industry, models must recognize customers
who intend to churn and their reasons for leaving, and must identify the
improvements needed to keep those customers. One of the serious issues in the
telecommunications industry today is retaining customers who are at high risk of
churning. Retaining existing customers is a step towards increasing profit and
revenue, and it requires a model that describes the reasons for customer churn and
how it can be overcome (H.-F. Wang & Hong, 2006).
Many researchers use various machine learning and data mining algorithms to
predict customer churn. This research uses the XGBoost (Extreme Gradient Boosting)
algorithm for customer trend analysis and churn prediction. XGBoost is used because
it is widely adopted in machine learning and data science and is known for strong
predictive performance and fast execution.
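To give a sense of the boosting idea behind XGBoost, the sketch below fits a sequence of one-split decision stumps, each trained on the errors left by the previous ones. It is a deliberately simplified, first-order gradient boosting loop written in plain Python. It is not the XGBoost library, which adds second-order gradients, regularization, and many engineering optimizations. The single feature (monthly complaints), the toy labels, and all function names are hypothetical.

```python
import math


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def fit_stump(xs, residuals):
    """Find the one-feature threshold whose two leaf means best fit the residuals."""
    best = None
    # Skip the largest value: it would leave the right-hand leaf empty.
    for threshold in sorted(set(xs))[:-1]:
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        error = sum((r - left_mean) ** 2 for r in left)
        error += sum((r - right_mean) ** 2 for r in right)
        if best is None or error < best[0]:
            best = (error, threshold, left_mean, right_mean)
    if best is None:
        raise ValueError("feature needs at least two distinct values")
    return best[1], best[2], best[3]


def fit_boosted_stumps(xs, ys, rounds=20, learning_rate=0.5):
    """Boost stumps on the negative gradient of the log loss (y - p)."""
    scores = [0.0] * len(xs)  # raw log-odds; 0.0 means a probability of 0.5
    stumps = []
    for _ in range(rounds):
        residuals = [y - sigmoid(s) for y, s in zip(ys, scores)]
        threshold, left_mean, right_mean = fit_stump(xs, residuals)
        # Store leaf values already scaled by the learning rate.
        stump = (threshold, learning_rate * left_mean, learning_rate * right_mean)
        stumps.append(stump)
        scores = [
            s + (stump[1] if x <= threshold else stump[2])
            for x, s in zip(xs, scores)
        ]
    return stumps


def predict_proba(stumps, x):
    """Sum the contributions of all stumps and map the score to a probability."""
    score = sum(left if x <= threshold else right for threshold, left, right in stumps)
    return sigmoid(score)


# Hypothetical toy data: complaints per month and whether the customer churned.
complaints = [0, 1, 1, 2, 5, 6, 7, 8]
churned = [0, 0, 0, 0, 1, 1, 1, 1]
model = fit_boosted_stumps(complaints, churned)
print(round(predict_proba(model, 1), 3), round(predict_proba(model, 7), 3))
```

Each round nudges the predicted probabilities toward the true labels, so the customer with few complaints ends up with a low churn probability and the customer with many complaints with a high one.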

Table 1.1: Critical Literature Analysis to Formulate Research Problem

Sr #: 1
Author Name: (Ben, 2020)
Title: Enhanced Churn Prediction in the Telecommunication Industry
Methodology: The algorithms ANN, SVM, logistic regression, DT, and random forest.
The best method for an accurate result: For customer churn prediction, the LR
(logistic regression) technique is better.
Limitation: The turnover is better predicted by the logistic regression model.
Furthermore, the results revealed that internet service and online security are
important factors. Churn is influenced by a number of things, including internet
services and contract types.

Sr #: 2
Author Name: (Ahmad, Jafar, & Aljoumaa, 2019)
Title: Customer churn prediction in telecom using machine learning in big data
platform
Methodology: Decision Tree (DT), Random Forest, XGBOOST (Extreme Gradient
Boosting), and Gradient Boosted Machine Tree (GBM).
The best method for an accurate result: The best and most accurate results are
obtained by the XGBOOST approach.
Limitation: Data is not balanced; results may be degraded because of non-stationary
data models.

Sr #: 3
Author Name: (Arivazhagan & Sankara, 2020)
Title: Customer Churn Prediction Model using Regression with Bayesian Boosting
technique in Data Mining
Methodology: To categorize customers, the DM supervised technique LR with
Bayesian boosting is used.
The best method for an accurate result: LR with Bayesian boosting.
Limitation: BLR takes high processing time compared to existing LR, with high bias
and low accuracy.

Sr #: 4
Author Name: (Sundarkumar, Ravi, & Siddeshwar, 2015)
Title: One-Class Support Vector Machine based under sampling: Application to
Churn prediction and Insurance Fraud detection
Methodology: SVM, LR and decision tree.
The best method for an accurate result: SVM.
Limitation: It is more useful for fraud detection than for consumer churn prediction.

Sr #: 5
Author Name: (N. Lu et al., 2012)
Title: A Customer Churn Prediction Model in Telecom Industry Using Boosting
Methodology: Logistic regression and AdaBoost.
The best method for an accurate result: Logistic regression.
Limitation: Unable to determine the causes of customer churn.

Sr #: 6
Author Name: (Almana et al., 2014)
Title: A survey on data mining techniques in customer churn analysis for the
telecom industry
Methodology: DT-based techniques, regression techniques, and neural-network-based
techniques are generally applied in customer churn.
The best method for an accurate result: In terms of accuracy, decision tree
techniques, especially C5.0 and CART, have outperformed some existing data mining
techniques such as regression.
Limitation: Because of the number of datasets, neural networks outperform the
former method, and it does not work using RULES family techniques.

Sr #: 7
Author Name: (Qureshi et al., 2013)
Title: Telecommunication subscribers' churn prediction model using machine
learning.
Methodology: Neural Networks, Decision Trees including CHAID and Exhaustive
CHAID, and K-Means clustering, used to predict active and churned consumers.
The best method for an accurate result: The F-measure is used to evaluate the output
of the various prediction models, and the Exhaustive CHAID algorithm generated the
best results.
Limitation: It does not process larger data sets with data collected over a longer
period of time, nor does it work on data from multiple nations or telecommunication
companies.

Sr #: 8
Author Name: (H.-H. Tsai, 2011)
Title: Research trends analysis by comparing data mining and customer relationship
management through bibliometric methodology
Methodology: This paper examines data mining and CRM analysis patterns using a
bibliometric method.
The best method for an accurate result: This work examines data mining and CRM
research trends using a bibliometric approach. Lotka's law explains the frequency of
publishing by writers in a certain field, and the K–S test is used to determine whether
the analysis follows Lotka's law.
Limitation: The author productivity distribution predicted by Lotka holds for data
mining, but not for CRM, according to the K–S test. CRM does not fit Lotka's law
because the number of authors who published a single article is too large.

1.3 Problem Statement


Customer retention is identified as a challenge in our feasibility report. It is
becoming a more urgent concern and a major challenge for businesses such as banks,
insurance companies, department stores, and service providers (e.g., telecom,
internet service). This issue is particularly problematic for businesses with a limited
client base, since each customer represents a large share of the company's business.
A single customer's departure therefore has a large impact on the company's income.
Churn data is typically noisy and imbalanced. A variety of classifiers can be
used to handle noisy data, but each classifier has its own set of limitations. To achieve
a higher AUC, there is a need to propose an improved model design using the
XGBoost algorithm, one that allows for visualizing, storing, and mining the data.
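Because the problem statement targets a higher AUC, a small illustration of what the metric measures may help. The sketch below computes the area under the ROC curve as the probability that a randomly chosen churner receives a higher score than a randomly chosen non-churner, counting ties as one half. The function name `auc_score` and the toy labels and scores are assumptions made for illustration only.

```python
def auc_score(labels, scores):
    """Pairwise (rank-based) estimate of the area under the ROC curve.

    labels: 1 for churners, 0 for non-churners.
    scores: model scores, where higher means more likely to churn.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one churner and one non-churner")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0  # churner correctly ranked above non-churner
            elif p == n:
                wins += 0.5  # ties count as half
    return wins / (len(positives) * len(negatives))


# Hypothetical scores for five customers (two churners, three non-churners).
toy_labels = [1, 0, 1, 0, 0]
toy_scores = [0.9, 0.3, 0.4, 0.6, 0.1]
toy_auc = auc_score(toy_labels, toy_scores)
print(f"AUC = {toy_auc:.3f}")
```

An AUC of 0.5 corresponds to random ranking and 1.0 to a perfect ranking. Unlike accuracy, the value does not depend on a single decision threshold.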
1.4 Research Question(s)
1. How can customer churn be predicted and customer trends be analyzed using a
machine learning algorithm?
1.5 Research Objective(s)
1. To predict customer churn and analyze customer trends using a machine
learning algorithm.
2. To predict early-warning causes of churn, reduce the churn rate, and maintain
customer retention.

1.6 Research Significance


The first priority of a company is customer retention. This research therefore
deals with customer satisfaction: it aims to predict “early-warning” causes of churn,
reduce the churn rate, and maintain customer retention. The goal is clear and well
defined. This research is concerned with customer defection, or turnover, and its
mission is to recognize customers who may be on the verge of defecting so that steps
can be taken to keep them before they do.
1.7 Thesis Organization
The rest of the thesis is organized as follows. Chapter 2 reviews the literature.
Chapter 3 discusses the research methodology, including the research framework,
dataset, evaluation parameters, and tools and technology. Chapter 4 presents the
proposed work, and Chapter 5 concludes the research and provides suggestions for
future work.
1.8 Chapter Summary
In this chapter, the research background was critically discussed, a problem
was identified, and research objectives were set to address the customer churn
problem. Overall, this chapter gives an overview of the whole thesis on customer
churn prediction. When a client moves from one service provider to another, this is
referred to as customer churn. Churn is a concern for every subscription or
recurring-purchase service provider. Because of its remarkable expansion in recent
years, the telecom industry is the primary subject of this research.
Almost everybody has a telecom subscription these days, thanks to convenient
connectivity and a variety of service providers. Churn prediction can help analyze
past data to identify clients who are at greater risk of churning. This allows the
telecom business to focus on a small set of customers rather than trying to retain
every customer. Individualized client retention is hard to achieve since most firms
have a large number of customers and cannot afford to devote much money,
resources, or time to it. However, if we can forecast ahead of time which customers
are likely to leave, we can focus our retention strategies exclusively on them. Churn
prediction is vital because obtaining new customers is far more expensive than
keeping existing ones.

CHAPTER 2

LITERATURE REVIEW
2.1 Introduction
Customer trend analysis examines customer trends and the competitors to
which customers are lost through churn. Currently, businesses face both internal and
external difficulties that cause customers to abandon their services. Internal problems
arise from poor service quality, such as inferior products, poor customer service, and
high prices, among other things. External difficulties, on the other hand, come from
both direct and indirect competition. In addition to attracting new customers, a
company must retain existing ones, since churned customers become new customers
of competitors. This chapter reviews the literature on which machine learning
algorithms produced accurate results for churn prediction and on the causes of churn.
2.2 Related Work
Sharma and Kumar discuss customer churn as an excellent indicator of service
quality and customer satisfaction. The telecom industry is a fast-paced, high-growth
industry made up of companies that operate on a subscription-based model. These
businesses are continuously under pressure from the large percentage of clients who
churn and switch to competitors offering competitive products and services. As a
result, they need to determine why their customers churn and seek innovative
strategies to improve customer satisfaction and increase their revenue base. CRM
(Customer Relationship Management) is a strategic management tool for managing
customer relationships, interaction, and retention. Some businesses use customer data
to better understand their customers' behavior and to gain insights that can help them
enhance customer care. When machine learning (ML) is integrated into CRM
software, it can measure churn rates, identify churn factors, and identify clients who
are likely to leave. It can also help a business determine and implement proactive
retention methods. Multiple researchers have found that keeping existing customers
is less expensive than finding new ones (K. P. Sharma, 2011).

Do et al. (2017) describe customer retention as one of the most important
concerns for every organization, as customers are the primary source of revenue.
Losing customers not only results in a loss of earnings but also puts the entire firm
in danger, which is why customer churn prediction has become one of the most
challenging problems. Their study proposes customer churn prediction for an ISP
(Internet Service Provider) organization, based on highly imbalanced data, to identify
customers who are in danger of leaving the service. The dataset is imbalanced at
98:2, with far more non-churners than churners, so reducing the class imbalance is
very important before applying models such as ANN, AdaBoost, KNN, and
XGBoost. The class imbalance was corrected using the SMOTE oversampling
approach, which generates synthetic minority-class samples so that the classifiers can
better detect subscribers at risk of quitting the service. The study concentrated
primarily on the feature engineering and modeling phases. Its goal is to provide a
solution to a common real-life problem, extremely imbalanced data in the CCP
(customer churn prediction) task, and the approach could be applied to other
problems with imbalanced data. The research also presented benchmark results for
this problem (Do, Huynh, Vo, & Vu, 2017).
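The oversampling step described above can be sketched in a few lines. The following is a simplified, SMOTE-style interpolation written in plain Python: each synthetic churner is placed at a random point on the line segment between a real minority sample and one of its nearest minority neighbours. It is not the implementation used by Do et al. The function name `smote_like_oversample`, the parameter `k`, and the two-feature toy data are assumptions for illustration.

```python
import math
import random


def smote_like_oversample(minority, n_new, k=2, seed=0):
    """Create n_new synthetic points from a list of minority-class feature tuples."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base_index = rng.randrange(len(minority))
        base = minority[base_index]
        # The k nearest minority neighbours of the chosen sample (excluding itself).
        others = [p for i, p in enumerate(minority) if i != base_index]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment, in [0, 1)
        synthetic.append(tuple(b + gap * (n - b) for b, n in zip(base, neighbour)))
    return synthetic


# Hypothetical 98:2 split: 147 non-churners and 3 churners,
# each churner described by (tenure in years, complaints per month).
churners = [(1.0, 9.0), (2.0, 8.0), (1.5, 7.5)]
n_non_churners = 147
new_churners = smote_like_oversample(churners, n_new=n_non_churners - len(churners))
print(len(churners) + len(new_churners), "churners after oversampling")
```

In practice, oversampling is applied only to the training split, so that synthetic samples do not leak into the data used for evaluation.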

Machine-learning techniques (Niken & Ohwada, 2014), data mining
techniques (Jadhav & Pawar, 2011), hybrid neural networks (C.-F. Tsai & Lu, 2009),
and other approaches have been proposed to solve the customer churn problem.
However, many real-life problems involve imbalanced data, which these approaches
have not handled well in terms of accuracy and recall. Precision would be more
helpful for cost estimation when dealing with the possibility of customer churn, while
the recall of a model is important for retaining the majority of customers. To select
suitable models and data, a balance between these measures is necessary.
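The trade-off among these measures can be made concrete with a small calculation. The sketch below derives accuracy, precision, recall, and F1-score from confusion-matrix counts and applies them to a hypothetical 98:2 dataset. The function name and all counts are illustrative assumptions, not results from any cited study.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute common metrics from confusion-matrix counts (churn = positive class)."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical data: 1,000 customers, of whom 20 (2%) churn.
# Model A labels every customer as a non-churner.
always_stay = classification_metrics(tp=0, fp=0, fn=20, tn=980)
# Model B finds 15 of the 20 churners at the cost of 30 false alarms.
useful = classification_metrics(tp=15, fp=30, fn=5, tn=950)

for name, metrics in (("always_stay", always_stay), ("useful", useful)):
    print(name, {key: round(value, 3) for key, value in metrics.items()})
```

In this made-up example the first model has the higher accuracy but finds no churners at all, which is why precision, recall, and F1-score are reported alongside accuracy on imbalanced churn data.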

Kavitha et al. (2020) discuss three algorithms selected for their usefulness.
Compared with other algorithms, Random Forest (RF), XGBoost, and Logistic
Regression (LR) achieved higher accuracy. The study uses a dataset containing
customers' information, service packages, and plans, which helps make precise
predictions that identify customers planning to switch to another service provider.
With this information, the telecom company has a clear view and can provide the best
offers and services to encourage those customers to stay. The best-performing
approach in the proposed churn model is Random Forest, which provides higher
accuracy than the other approaches (Kavitha, Kumar, Kumar, & Harish, 2020).

Pamina et al. (2019) used three well-known classifiers, KNN, RF, and
XGBoost, to develop a churn prediction model. Accuracy and F-score are compared
as evaluation indicators; the F-score is calculated because accuracy alone is
insufficient for evaluating a churn prediction model. Table 2.1 shows the results of
comparing KNN, RF, and XGBoost: XGBoost achieves the highest accuracy and
F-score (Pamina, Raja, SathyaBama, Sruthi, & VJ, 2019).
Table 2.1: Comparison results of KNN, RF, XGBoost (Pamina et al., 2019)
Classifier    Accuracy    F-score
KNN           0.754       0.495
RF            0.775       0.506
XGB           0.798       0.582
Customer Relationship Management (CRM) is a marketing strategy that helps
to understand customers in order to increase customer satisfaction, profitability, and
long-term customer relationships. It is a customer-centric marketing domain that
gives a better customer experience: individuals' needs are met through a service that
is customized to their characteristics and consumption patterns. CRM makes
processes in which groups of people interact and work together run more smoothly.
CRM combined with data mining tools simplifies the analysis process. The ability of
data mining to explore patterns in vast datasets makes it appealing, and it gives CRM
a strong technical platform for analyzing enormous amounts of complicated customer
data. Customer churn prediction is a domain in CRM that distinguishes satisfied
customers (non-churners) from churners, who have an adverse influence on the
business's growth. As a result, identifying churners is a critical step in strengthening
the company. The goal of this study is to employ data mining techniques to anticipate
customer attrition in the telecom industry. At first, this approach classified customers
as churners or non-churners based on criteria in each field, each with its own set of
rules. Then, to classify the customers, the supervised data mining technique of
logistic regression with Bayesian boosting is used. CRM is a rapidly growing field
across all domains (S. Sharma, 2018).

According to Cordoba (2014), the success of predictive analytics is based on
better decision-making. Previously, decisions could be made successfully even with
a small amount of data. However, as the volume of data has grown to unprecedented
levels, it has become difficult for humans to make judgments unaided. Data-driven
decision-making frequently relies on qualitative models developed using closed-loop
procedures, also known as cycles, as shown in Figure 2.1.

Figure 2.1: Life cycle of predictive analysis (Cordoba, 2014)


Analysis of customer behavior may assist service companies in increasing
customer satisfaction by improving the quality of their products, services, and
offerings (Rattanathavorn & Premchaiswadi, 2015). Predictive analytics, which is
often used for marketing purposes, tries to forecast the future behavior of consumers,
such as their responses to marketing events and offers. Marketing databases typically
contain very little information about customer interests, product needs, and purchasing
habits, and even less information about items that clients have purchased from other
companies (Leventhal, 2018). As shown in Figure 2.2, information behavior is divided
into a number of stages that are tightly linked.

Figure 2.2: predicting behavior (Abbasi, Lau, & Brown, 2015)
One issue that arises in predictive analytics research is client behavior
prediction (Abbasi et al., 2015). Many attempts to capture human interactive behavior
in computational form have been made in recent years (Kouzehgar,
Badamchizadeh, & Feizi-Derakhshi, 2015). As shown in Figure 2.3, this work
aims to present predictive analysis for forecasting customer behavior using a behavior
informatics and analytics approach. Customer behavior helps to determine
customer trends as well as to predict churn.

Figure 2.3: predicting behavior (Abbasi et al., 2015)


Many authors have proposed hybridizing ML algorithms such as ANN, SVM,
and Naive Bayes (Chandrakala, 2016). CRM plays a key role in the growth of a
company. In order to understand a customer's entire performance progression,
researchers may link it to other factors (García, Nebot, & Vellido, 2017). The goal of
this line of research is to create a model that can forecast churn with more accuracy,
and a variety of churn prediction techniques have been used to meet this requirement.
ANN and KSVM (the kernel form of SVM) were combined, as shown in Figure 2.4, to
achieve outstanding results. The researchers used this hybrid technique to predict
customer churn in the Indian banking sector.

Figure 2.4: churn prediction using hybrid (ANN and KSVM) (Hemalatha &
Amalanathan, 2019)
(Yanfang & Chen, 2017) suggested a method for identifying churners in an
e-commerce environment by assessing parameters such as users' online stay time,
number of logins, and site attention. The consumer retention rate is determined using a
logistic regression model with a sigmoid function. The dataset's original features are
replaced by logical induction. The data is retrieved from a real-world e-commerce
platform, and four evaluation metrics are used to conduct the analysis. User interest
rate, exchange rate, specific uses, and churn time are among the parameters
considered. Although the accuracy appears to be good, the study relies on a post set.
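As an illustration of how a logistic regression model with a sigmoid function turns behavioral parameters into a retention probability, consider the minimal sketch below. The feature names, weights, and bias are hypothetical assumptions chosen for the example; they are not the values fitted by Yanfang and Chen (2017).

```python
import math


def sigmoid(z):
    """Map any real-valued score to the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def retention_probability(features, weights, bias):
    """Logistic regression: sigmoid of the weighted sum of the features."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return sigmoid(z)


# Hypothetical, already-scaled features: stay time, login count, site attention.
weights = [0.8, 0.5, 0.3]   # assumed coefficients, not fitted on real data
bias = -1.0
p = retention_probability([1.0, 0.4, 0.0], weights, bias)
print(round(p, 3))
```

A customer would then be flagged as a likely churner when the retention probability falls below a chosen threshold, for example 0.5.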
(Mitkees, Badr, & ElSeddawy, 2017) developed a model to predict buyer
behavior using telecom industry variables. Missing values are filled in for the sample
datasets obtained from "IBM Watson Analytics". This work focuses on using data
mining prediction algorithms to predict whether or not a client will churn, as well
as on detecting churn customers by grouping them together. To obtain the topmost
patterns with confidence level one, clustering techniques and association rule mining
are used. The brand name is defined in the dataset by taking the customer's frequency
into account.
(Halibas et al., 2019) describe DM (data mining) and ML (machine learning)
as techniques that telecom businesses can employ to track client turnover. Naive
Bayes, Logistic Regression, Decision Tree, Deep Learning, Random Forest,
Generalized Linear Model, and Gradient Boosted Trees were used to classify data from
the telecom industry. Accuracy, F1-score, classification error, AUC, and
precision-recall are some of the metrics used to evaluate the outcomes. This research
looked into reducing customer turnover and enhancing customer service in the
telecom industry and found that there is currently a lack of effective customer turnover
prediction methodologies. According to Halibas et al., exploratory data analysis and
feature engineering are adopted to improve the performance of the seven classifiers
that predict customer churn. The GBT algorithm is considered the best choice for
determining the customer churn rate, as the results in Table 2.2 show.

Table 2.2: performance results of comparison model (Halibas et al., 2019)

Machine learning strategies for predicting churn in the banking industry are
compared by (Karvana, Yazid, Syalim, & Mursanto, 2019). A churner is described as
someone who stops doing business with a bank and closes their accounts. Data was
gathered through interviews, historical studies, surveys, and observations in the form
of market share information. K-fold cross-validation is used to evaluate the
models. To obtain useful training sets, this method divides a set of data into k parts
and repeats the iterations. Aside from cross-validation, there are three types of
sampling methods: stratified sampling, second percentage, and last one split. Stratified
sampling, as a whole, provides good accuracy across all classification techniques.
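The combination of k-fold evaluation and stratified sampling discussed above can be sketched as follows. This is an illustrative stand-alone implementation under simple assumptions (binary labels, round-robin assignment within each class); it is not the procedure used by Karvana et al. (2019), and the function name is hypothetical.

```python
import random
from collections import defaultdict


def stratified_k_fold(labels, k, seed=0):
    """Yield (train_idx, test_idx) pairs whose folds keep the class ratio."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal the shuffled indices of each class round-robin over the folds.
        for pos, idx in enumerate(indices):
            folds[pos % k].append(idx)
    for i in range(k):
        test_idx = sorted(folds[i])
        train_idx = sorted(j for f, fold in enumerate(folds) if f != i for j in fold)
        yield train_idx, test_idx


# Hypothetical data: 10 churners (1) and 40 non-churners (0).
labels = [1] * 10 + [0] * 40
splits = list(stratified_k_fold(labels, k=5))
for train_idx, test_idx in splits:
    print(len(train_idx), len(test_idx), sum(labels[i] for i in test_idx))
```

Each test fold here contains the same share of churners as the full dataset, which is the property that makes stratified sampling attractive for imbalanced churn data.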
A study conducted by (Hassouna, Tarhini, Elyas, & AbouTrab, 2016)
compares two classification methods for predicting churn rate. Two sets of data were
analyzed with logistic regression, a data mining technique used to estimate the
likelihood of consumer churn, and with DT, a tree-graph-based technique that depicts
the associations between the variables in the dataset. The data sets were gathered
from a variety of sources. The outcome is one of 17 factors in a UK telecoms
warehouse. The data was gathered across two different time periods.
Replacement of missing information, the discretization of numerical quantities, and
feature selection are all part of the initial phase. The datasets are analyzed using
the classification methods, and it is discovered that the decision tree outperforms
logistic regression in terms of investigating the significance of correlations between
variables.
The study by (Ullah et al., 2019) established churn forecasting models that
employ classification and clustering approaches to detect churn customers and to
identify the factors that cause customer churn in the telecom industry. Feature
selection is done in pre-processing using the information gain ratio or a correlation
feature ranking filter. The suggested methodology uses classification methods to divide
customer information into churners (turn-over customers) and non-churners. Through the
rules supplied by the attribute-selected classifier algorithm, it also identifies the
factors that contribute to client turnover. Random forest is one of the most accurate
classifiers. Following classification, the model uses the K-Means method to cluster
churn customers into groups based on cosine similarity. By recognizing the key churn
factors from the customer data used in this model, the CRM can enhance productivity
and provide valuable promotions to turnover consumers. The evaluation measure for
clustering, on the other hand, is not explored.
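To make the idea of grouping churn customers by cosine similarity concrete, the sketch below shows only the assignment step of a K-Means-style procedure: each customer vector is attached to the centroid it is most similar to. The customer vectors and centroids are hypothetical, and the full iterative algorithm of Ullah et al. (2019) is not reproduced.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def assign_to_clusters(points, centroids):
    """Return, for each point, the index of the most similar centroid."""
    return [
        max(range(len(centroids)), key=lambda c: cosine_similarity(p, centroids[c]))
        for p in points
    ]


# Hypothetical churner profiles: [voice minutes, data usage].
churners = [[10.0, 1.0], [8.0, 2.0], [1.0, 9.0], [2.0, 12.0]]
centroids = [[1.0, 0.0], [0.0, 1.0]]  # assumed: voice-heavy vs data-heavy
assignments = assign_to_clusters(churners, centroids)
print(assignments)
```

Cosine similarity compares the direction of the usage profile rather than its magnitude, so heavy and light users with the same usage mix fall into the same group.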

Customer churn forecasting is critical for CRM since it has a significant
impact on corporate profitability. This research considers three distinct
customer-oriented industries: e-commerce, finance, and telecommunications, as shown
in Figure 2.6. LR is used with Bayesian boosting to classify the customers.
Table 2.3: performance analysis of LR and BLR in E-commerce, bank, and telecom
industry (Arivazhagan & Sankara, 2020)

The datasets are evaluated with the existing LR (logistic regression) as well as the
suggested BLR (Bayesian boosting with logistic regression). The results reveal that,
although the proposed strategy takes longer to process than current methods, it
provides high accuracy in data classification because it addresses the high-bias
problem of the existing LR technique. This strategy could be used in other
customer-oriented industries in the future. Another upgraded procedure with a shorter
processing time could be explored to eliminate bias in the existing method, and
accuracy could also be enhanced (Arivazhagan & Sankara, 2020).
A client retention model containing validity, monetary, and frequency variables
was developed by (Dingli, 2017). The first dataset is made up of four (4) months of
data, whereas the second dataset is built up of eleven (11) months of production data.
In the e-commerce business, churn is determined by studying consumer transactions.
A consumer who cancels transactions within a certain time frame is referred to as a
churner. The dataset is split into two frames for churn analysis. In the first frame,
referred to as the prediction window, the number of transactions performed by a
consumer within a given time period is used to identify active customers.
Active customers are kept, whereas churners are removed from the first frame. If the
active clients continue to be active for a further time period, they are classified as
non-churners, while the rest are churners. Two classification techniques, random
forest and LR (logistic regression), are used to create the model. Random forest
delivers excellent accuracy, whereas the logistic regression classifier requires
additional improvement in order to produce the best accuracy. Regression modeling
strategies are used to determine a customer's response in a telecom firm by examining
the relationship between numerous customer-related factors. To boost customer
retention, CRM uses ML algorithms to analyze personal and social data from customers.
The parameters of logistic regression are chosen to maximize the likelihood of
observing the sample values. The outcome is determined by the customer's age and
income. Each variable is given a P-value, and those with a high P-value are removed
from the dataset, leaving the remaining variables to be analyzed using logistic
regression. However, the precision appears to be very low.
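The two-frame labeling rule described above (customers active in the first frame are kept, and those who stay active in the following period are non-churners) can be sketched as below. The data layout, the split day, and the activity threshold are assumptions made for illustration and are not taken from (Dingli, 2017).

```python
def label_customers(transactions, split_day, min_transactions=1):
    """Label customers who are active in the first frame.

    transactions: list of (customer_id, day) pairs.
    Days before split_day form the first frame; later days form the second.
    """
    first, second = {}, {}
    for customer, day in transactions:
        window = first if day < split_day else second
        window[customer] = window.get(customer, 0) + 1
    labels = {}
    for customer, count in first.items():
        if count < min_transactions:
            continue  # not active in the first frame, so not labeled
        still_active = second.get(customer, 0) >= min_transactions
        labels[customer] = "non-churner" if still_active else "churner"
    return labels


# Hypothetical transaction log: customer "c" only appears in the second frame.
transactions = [("a", 3), ("a", 40), ("b", 5), ("b", 12), ("c", 45)]
labels = label_customers(transactions, split_day=30)
print(labels)
```

Customers who first appear in the second frame receive no label, because there is no first-frame activity from which their churn could be judged.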
(Idris, Khan, & Lee, 2012) presented a genetic programming solution with
AdaBoost for churn prediction in the telecom industry. Two standard data sets were
used to evaluate the model, one from Orange Telecommunications and the other from
cell2cell, with the cell2cell dataset having 89% accuracy and the other having 63%.
(Brânduşoiu, Toderean, & Beleiu, 2016) proposed an advanced data mining
approach for predicting churn for prepaid clients, based on a dataset of 3333 call
details with 21 features as well as one dependent variable, churn, with two possible
values: Yes/No. A few features carry information regarding the quantity of incoming
and outgoing messages, as well as voicemail, for each client. To reduce the data
dimensions, the authors used principal component analysis (PCA). To forecast the
churn factor, three machine learning techniques were used: SVM, Neural
Networks, and Bayes Networks. The authors utilized AUC to assess the algorithms'
performance. For SVM, Bayes Networks, and Neural Networks, the AUC values were
99.70 percent, 99.10 percent, and 99.55 percent, respectively. The dataset is
small and does not contain any missing values.
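Since AUC recurs throughout this chapter as an evaluation measure, a minimal sketch of its rank-based interpretation is given here: AUC equals the probability that a randomly chosen churner receives a higher score than a randomly chosen non-churner. The labels and scores are hypothetical and unrelated to the cited results.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison (ties count one half)."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one churner and one non-churner")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical churn labels (1 = churner) and model scores.
labels = [1, 1, 0, 0, 0]
scores = [0.9, 0.4, 0.5, 0.2, 0.1]
auc = roc_auc(labels, scores)
print(round(auc, 4))
```

The pairwise loop is quadratic in the number of customers, which is acceptable for a small illustration; sorting-based formulations are normally preferred for large datasets.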
(Y. He, He, & Zhang, 2009) presented a customer churn prediction model using the
ANN technique to handle the problem of customer turnover in a big Chinese telecom
firm with roughly 5.23 million consumers. The overall accuracy rate was used as the
evaluation measure, and the forecast accuracy is 91.1 percent.
As stated by (Coussement, Benoit, & Van den Poel, 2015), the shift in
consumer preferences and expectations brought about by growing technology as well
as the widespread availability of numerous products and services has produced a
fiercely competitive environment in various customer service industries, including the
financial business. As a result of the dangers and disruptions posed by not only direct
competitors, but also new entrants to the market, the Canadian banking business has
become extremely competitive. For the first time in the financial industry, the major
goal of this research is to build a predictive churn model by combining structured
archive data with unstructured data from sources such as online web pages, the
number of website visits, and phone conversation logs. It also investigates the impact
of various consumer behaviors on churn decisions. Furthermore, customer loyalty has
been one of the primary focuses of most CRM initiatives. 
It has been proven that valid, reliable, and accurate churn model results
improve the efficiency of retention operations (Prasad & Madhavi, 2012). Churn can
be reduced by providing the correct product at right time (Shirazi & Mohammadi,
2019). Customer turnover is an important issue and one of the most pressing
challenges for large businesses. Companies are working to create methods to predict
prospective customer churn because it has such a direct impact on their revenues,
particularly in the telecom industry. As a result, identifying factors that contribute to
customer churn is critical in order to take the required steps to reduce churn. The
key contribution of (Ahmad et al., 2019) is the development of a churn prediction
model that helps telecom carriers estimate which customers are most likely to churn.
The model employs a machine learning approach on a big data platform together with a
novel approach to feature engineering and selection. The area under the curve (AUC)
standard measure is used to assess the model's performance, and the obtained AUC
value is 92.7 percent. Another significant contribution is the
extraction of SNA (Social Network Analysis) features from the customer social
network for use in the forecasting model. The use of SNA improved the model's
performance from 83 to 92.7 percent in terms of AUC. The model was constructed and
validated in the Spark environment, working on a massive dataset obtained by
transforming big raw data provided by the Syriatel telecom firm. The dataset, which
includes all of the customers' information over a nine-month period, was utilized to
train, test, and evaluate the system at Syriatel (Ahmad et al., 2019).
In developed nations, telecommunications has become one of the most important
industries. The degree of competition has increased as a result of technological
advancements and a rise in the number of competitors (Gerpott, Rams, & Schindler,
2001). Many studies have shown that machine learning technology is extremely
effective at predicting such situations, because the strategy adopted by ML is to
learn from past data (Umayaparvathi & Iyakutti, 2016).
(Amuda & Adeyemo, 2019) constructed a prediction model for customer
turnover in financial institutions using a multi-layer perceptron ANN
architecture. Supervised ML classifiers such as Logistic Regression, k-nearest
neighbor, Decision Tree, Random Forest, and Support Vector Machine have been
employed in previous studies. The feature engineering required by these classifiers
necessitates human intervention, resulting in over-specified and incomplete feature
selection. In response, their model was designed to eliminate human feature selection
throughout the data pre-processing phase. For the study, data on 50,000 clients
were taken from the database of a financial institution in Nigeria. Two strategies
against overfitting (Dropout and L2 regularization) were utilized to build the
multi-layer perceptron model, which was written in Python. The outcomes revealed
that the accuracy rates of the ANN software are 96.7 and 96.5 percent,
respectively, with ROC (Receiver Operating Characteristic) curve
values of 0.90 and 0.86 (Amuda & Adeyemo, 2019).
In recent decades, a variety of consumer turnover forecasting models have
been created. Modern machine learning classifiers, including random forest and
LR, are used in the most advanced models (Castanedo, Valverde, Zaratiegui, &
Vazquez, 2014). One of the most direct and successful ways to keep current customers
is for the organization to be able to predict and react to prospective churners ahead of
time. Recognizing signs of impending churn, meeting client requirements, and
maintaining and retaining loyal customers are all measures aimed at lowering the cost
of acquiring new customers (Mitkees et al., 2017).
Cellular telecom companies are growing increasingly competitive in the
telecommunications industry, and churn management has come to play a vital role in the
sector. (K. P. Sharma, 2011) used a neural network to forecast client turnover in
cellular wireless network services. The dataset was taken from the UCI ML
Database at the University of California, Irvine, and consists of 20 variables
describing 2,427 users. The Clementine DM software suite from SPSS Inc. was used
to create the neural network. Clementine provides two types of supervised NN (neural
networks), Multilayer Perceptron (MLP) and Radial Basis Function Network (RBFN),
and the reported accuracy rate is about 92%.
In the customer-centered banking business, customer churn has become a
significant issue, and banks have attempted to analyze customer interactions
in order to discover early signs and symptoms in customer behavior, such as
a decrease in transactions and account dormancy. A model was developed that
employed K-Means and Repeated Incremental Pruning to Produce Error Reduction
(RIPPER) and was implemented in Weka for customer churn analysis in the financial
system. The data was taken from a large Nigerian bank's CRM (customer relationship
management) database and transactional warehouses. The findings reveal a trend in
customer behavior and assist banks in identifying customers who are about to churn
(Oyeniyi, Adeyemo, Oyeniyi, & Adeyemo, 2015).
(Xia & Jin, 2008) applied SVM, which is based on structural risk minimization, to
customer churn analysis to improve the prediction abilities of machine learning
methods. The data was gathered from the University of California's UCI machine
learning database. When compared with ANN, Logistic Regression, Decision Tree, and
Naive Bayesian classifiers, the SVM approach has the highest accuracy rate, hit rate,
coverage rate, and lift coefficient. The SVM is therefore a useful tool for predicting
client attrition.

In the telecommunication industry, predictive and descriptive DM approaches were
employed to determine subscriber calling behavior and identify subscribers with
a high probability of churn. In the descriptive stage, customers were grouped
together based on their service usage behavior characteristics, with K-Means and
Expectation Maximization (EM) as the clustering methods. In the predictive stage, the
classifiers Decision-Stump, Rep-Tree, and M5P were applied, and Weka was used to
implement these algorithms. The outcomes demonstrate that EM outperforms K-Means in
the descriptive stage, while M5P outperforms both Decision-Stump and Rep-Tree in the
prediction stage. A boosting algorithm has also been used in the telecom industry to
forecast customer turnover. Customers were divided into two groups, and the data set
was weighted via a boosting technique. Boosting outperforms logistic
regression and does a decent job of separating churn data (Xia & Jin, 2008).
DFNN (Deep Feed Forward Neural Network) is a multilayered neural
network that performs predictive analytics on customer retention in finance
companies like banks. The data set, from UCI's machine learning archive, contains a
total of ten thousand customer records with about 14 attributes. To clean and
pre-process the data, the model uses one-hot encoding and Tukey's outlier detection
algorithm. A number of machine learning algorithms were compared to the
model, including Logistic Regression, Gaussian Naive Bayes, and DT
(Decision Tree), and the accuracy rate of DFNN is much higher than that of the
other ML algorithms (Mundada, 2019).
According to (Q.-F. Wang, Xu, & Hussain, 2019), large-scale models
have been developed to identify consumer churn or turnover in search ads.
Customers with a high likelihood of leaving the ad platform were to be identified. A
GBDT (Gradient Boosting Decision Tree) ensemble model was developed to predict
which customers will churn in the near future based on their behavior in search ads.
The GBDT used two distinct types of features, static and dynamic. Dynamic features
take into account a long-term sequence of customer activities such as impressions and
clicks, while static features take into account information such as the customer's
creation time and type. Experiments on data obtained from the Bing Ads platform, with
an AUC (area under the ROC curve) value of 0.7411, show that static and dynamic
features complement each other.

(Rosa, 2019) built a model for estimating customer attrition and presentation
in the banking sector that utilized ANN (Artificial Neural Networks). A total of 1587
customer records were extracted from the bank's Data Warehouses using SAS Base.
Other ML algorithms such as Decision Trees, LR (Logistic Regression),
and SVM (Support Vector Machine) were not considered in the research. KNN and
Random Forest are two of the most widely used algorithms. Because feature
engineering is done by hand with these classifiers, over-specified and
incomplete feature selection results. Therefore, the research developed an
analytical model using a multi-layer perceptron ANN structure to predict
customer churn and to minimize the manual feature engineering process in the data
pre-processing phase. A multi-layer perceptron ANN architecture is used to build a
consumer churn forecasting model. The model is then used as a tool to predict
churners and non-churners.
From a CRM perspective, increasing the retention rate of existing customers is more
efficient for long-term business operations than investing in customer acquisition
activities. Overall, retaining customers who have been with a company for a long time
is more profitable than randomly targeting new ones. In a variety of service fields,
studies on customer churn
have been suggested. Studies on churn analysis sought to identify and predict the
likelihood of customers leaving a company in advance by using a variety of
indicators. Early customer churn was once used to define a customer's status in the
CRM system, but that has changed. CRM emerged as a business management tool for
enhancing efficiency in retail, customer service, marketing, selling, and
supply-chain functions. As a result, CRM has evolved into two distinct categories:
operational CRM and analytical CRM. Analytical CRM focuses on building databases and
resources that describe customer features, with the special purpose of observing
customers' behavior and attitudes. A contract's churn rate is one of the
most important factors to consider. When customers do not renew their contracts after
the current contract period has expired, this is referred to as contractual churn. An
example of this type of turnover is when a customer loses interest in a particular
service or moves out of the service coverage area, making it impossible to re-enter.
When customers close their bank accounts or switch service providers, churn
problems occur. The same is true for flat-rate services such as music, TV, and film
streaming. Churn analysis is used in insurance companies, Internet service providers
(ISPs), games, and management to determine how many people are leaving a service.
Churn prediction studies are usually the first step in enhancing business outcomes.
Instead of measuring a customer's total churn, the peak value is used to identify
potential customers who are likely to leave. CAC (customer acquisition cost) or CLV
(customer lifetime value) is used to calculate customer churn costs. In the past,
researchers used analysis of variance and time-series forecasting, as well as
traditional machine-learning (ML) algorithms, to predict customer churn. The use of
deep learning algorithms for churn prediction is a relatively recent development. A
study of deep learning algorithms found that they outperformed other methods (Ahn,
Hwang, Kim, Choi, & Kang, 2020).
Churn in the telecommunications industry is a real problem. In order to
prevent its customers from switching to another telecom company, a telecom
company must have a model for predicting churn. Using Pearson correlation together
with the KNN algorithm, the goal of this research is to provide a framework for
predicting customer churn. The approach is evaluated using a 70:30 split of training
and testing data. Several algorithms that can forecast whether or not a consumer will
cancel their service and go with another company were compared. The comparison of
multiple classifiers aids in accurately predicting customer turnover as well as in
addressing the primary cause of client retention. The Pearson correlation function is
first used in the pre-processing stage. Then all of the classifiers are examined from
a variety of angles. According to the results and evaluation, the KNN algorithm
exceeds the others, with an accuracy of 80.45 percent for training and 97.78
percent for testing (Sjarif et al., 2019).
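A Pearson-correlation pre-processing step of the kind mentioned above can be sketched as a simple feature filter that keeps only the columns whose correlation with the churn label is strong enough. The feature names, the data, and the 0.3 threshold are hypothetical assumptions; the exact use of the correlation in (Sjarif et al., 2019) may differ.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no linear information
    return cov / math.sqrt(var_x * var_y)


def select_features(columns, target, threshold=0.3):
    """Keep feature names whose |correlation| with the target meets the threshold."""
    return [name for name, values in columns.items()
            if abs(pearson(values, target)) >= threshold]


# Hypothetical customer data; churn is 1 for churners and 0 otherwise.
columns = {
    "support_calls": [0, 1, 4, 5, 6, 1],
    "account_digit": [3, 7, 3, 7, 5, 5],
}
churn = [0, 0, 1, 1, 1, 0]
selected = select_features(columns, churn)
print(selected)
```

The retained columns would then be passed to the KNN classifier, with the remaining data split 70:30 into training and testing sets.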
When it comes to a dynamic and competitive firm, customers are the most
crucial asset. As a result, the Customer Relationship Management team is in charge of
building, managing, and enhancing long-term customer relationships (Vafeiadis,
Diamantaras, Sarigiannidis, & Chatzisavvas, 2015). One of the biggest challenges for
a telecom company is to keep its customers loyal. For one thing, losing customers
could result in serious financial losses (Kirui, Hong, Cheruiyot, & Kirui, 2013).

Retaining a customer costs five to six times less than acquiring a new one,
according to a recent study. Hence, telecom companies need to be able to predict
which customers will remain loyal and satisfied and which are at risk, since churn
can result in revenue loss; increased consumer retention, maintenance, and
re-acquisition costs; increased advertising costs; and organizational, scheduling, and
budgeting confusion (Kirui et al., 2013).
Consumer unhappiness with the service supplied, or other companies offering better
offerings within the customer's budget, is the most common explanation of churn
(Ahmad et al., 2019). In one study, the research is done utilizing data from the
Oracle Company's data collection, and the churn prediction model is built using the
Naive Bayes theorem (Nath & Behara, 2003). Data was evaluated for churn prediction in
a Chinese telecommunication company utilizing the C5.0 decision tree algorithm, BPN,
and neural networks. The discontent of the customer is the key cause, not the company.
As a result of this procedure, existing customers who are likely to discontinue using
the services in the near future are identified, which is churn prediction. If the
company loses clients, it will have a substantial impact on its revenue (Ahmed &
Linen, 2017).
(Effendy & Baizal, 2014) proposed a strategy for handling imbalanced data for better
customer churn prediction. The suggested technique combines sampling with WRF
(Weighted Random Forest), resulting in a more balanced dataset and a much higher
accuracy rate in churn prediction. As part of the sampling procedure, under-sampling
as well as SMOTE (Synthetic Minority Oversampling Technique) is
used. After using the combined sampling procedure, the F-measure and overall
accuracy rate increase, indicating that the deletion of data improves prediction. With
the under-sampling approach alone, meanwhile, the improvement is not considerable
(Effendy & Baizal, 2014). (Dalvi et al., 2016) implemented another machine learning
(ML) approach based on decision trees and LR (logistic regression). In the
proposed approach, the two machine learning algorithms are combined and compared.
Logistic regression is used to figure out which factors influence a customer's
decision, and decision trees are used to visualize the data. The study's findings
demonstrate an increase in churn forecast accuracy, as well as a decrease in the time
it takes to produce the forecast. However, only a few categories are considered
(Dalvi et al., 2016).
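The two sampling ideas mentioned for imbalanced churn data, under-sampling the majority class and synthesizing minority samples, can be sketched as follows. This is a deliberately simplified illustration: the synthetic rows are interpolated between randomly paired churners, whereas SMOTE proper interpolates towards one of the k nearest minority neighbours. All names and data are hypothetical and do not come from (Effendy & Baizal, 2014).

```python
import random


def undersample(majority, minority, rng):
    """Randomly keep as many majority rows as there are minority rows."""
    return rng.sample(majority, len(minority)), list(minority)


def smote_like(minority, n_new, rng):
    """Create n_new synthetic rows on segments between pairs of minority rows.

    Simplification: the partner row is random, not a k-nearest neighbour.
    """
    synthetic = []
    for _ in range(n_new):
        a, b = rng.sample(minority, 2)
        gap = rng.random()  # position along the segment from a to b
        synthetic.append([x + gap * (y - x) for x, y in zip(a, b)])
    return synthetic


rng = random.Random(42)
non_churners = [[float(i), 1.0] for i in range(20)]   # hypothetical majority
churners = [[0.0, 5.0], [2.0, 7.0], [4.0, 9.0]]       # hypothetical minority
kept_majority, kept_minority = undersample(non_churners, churners, rng)
synthetic_churners = smote_like(churners, n_new=5, rng=rng)
print(len(kept_majority), len(kept_minority), len(synthetic_churners))
```

Under-sampling discards information from the majority class, while oversampling adds artificial minority rows; combining the two, as the cited study does, is one way to trade off these effects.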

For the Macedonian telecommunication sector, four kinds of machine
learning (ML) algorithms were evaluated. On the basis of area under the curve and
execution time, the study analyzed four classifiers:
C4.5 DT (Decision Tree), LR (logistic regression), KNN, and Naive
Bayes. Several of these classifiers have been shown to have an accuracy of greater
than 90 percent, with logistic regression (LR) being the most accurate. However, this
classifier's performance is limited by its lengthy execution time and the demand for a
huge amount of memory (Petkovski, Stojkoska, Trivodaliev, & Kalajdziski, 2016).
According to (Sjarif et al., 2019), the Pearson correlation function with KNN
(k-Nearest Neighbor) can be used to improve prediction accuracy. The method is
compared against other algorithms to see which has the most accurate
performance.
CRM (Customer Relationship Management) is a broad term that
refers to a system for establishing, managing, and enhancing long-term customer
relationships. Customer Churn Prediction (CCP) is a challenging problem for
decision-makers, planners, and the machine learning sector because churn and non-churn
consumers often have similar characteristics. At the same time, the increased usage of
the "IoT" and cloud computing makes it possible to collect client data for CCP.
This research proposes a CCP model based on a
machine learning method in the context of cloud computing. It consists of three
stages: data collection, data pre-processing, and the application of
P-AGBPNN (Adaptive Gain with Back Propagation Neural Network).
Customer data can be collected using a variety of IoT devices. The data collected
by the IoT devices are sent to a CDS (cloud data server). Following that,
pre-processing occurs, during which the dataset's missing values are effectively
imputed. After that, the P-AGBPNN model is run in the cloud to determine whether the
customer is a churner or not. According to the simulation findings, this P-AGBPNN
model performs exceptionally well, with a high sensitivity of 95.50, an accuracy of
70.49, and a kappa value of 67.20. Feature selection, as well as clustering approaches,
can be used to improve the performance of the described model in the future
(Jeyakarthic & Venkatesh, 2020).
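For readers unfamiliar with the three measures reported above, the sketch below shows how sensitivity, accuracy, and Cohen's kappa are commonly computed from a binary confusion matrix. The counts are hypothetical and are not related to the results of (Jeyakarthic & Venkatesh, 2020); the function assumes that at least one churner is present and that chance agreement is below 1.

```python
def sensitivity_accuracy_kappa(tp, fp, fn, tn):
    """Return (sensitivity, accuracy, Cohen's kappa) for churn = positive class."""
    total = tp + fp + fn + tn
    sensitivity = tp / (tp + fn)          # share of churners that are detected
    accuracy = (tp + tn) / total          # observed agreement
    # Agreement expected by chance, from the row and column totals.
    expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / total ** 2
    kappa = (accuracy - expected) / (1 - expected)
    return sensitivity, accuracy, kappa


# Hypothetical balanced test set of 100 customers.
sens, acc, kappa = sensitivity_accuracy_kappa(tp=40, fp=10, fn=10, tn=40)
print(round(sens, 2), round(acc, 2), round(kappa, 2))
```

Kappa corrects accuracy for the agreement that would occur by chance, so it is never larger than the accuracy when chance agreement is positive.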

(C.-F. Tsai & Lu, 2009) used a hybrid NN approach for CCP with an American
telecommunications firm's CRM dataset. For the CCP technique, a model that combined
ANN and SOM (Self-Organizing Map) was used. ANN is first used for data reduction,
filtering non-representative data out of the training set. Following that, the
results of the first step are passed to the SOM to develop a predictive model.
According to the simulation results, the combination of ANN and SOM works well
compared with a single NN in terms of reliability and accuracy. However, it is
observed that data reduction and filtering in the first model result in the greatest
loss of samples in the training set (C.-F. Tsai & Lu, 2009).
A random sampling approach and a CCP framework based on SVM classification were
developed. The random sampling method is employed to
adjust the data distribution in order to eliminate the class imbalance difficulty due
to the lack of data, which could not improve CCP's predictive function (B. He, Shi,
Wan, & Zhao, 2014). The use of a weighted random forest has also been advised.
However, RF is frequently criticized since it is difficult to interpret,
particularly when it comes to determining the source of customer churn, which is
necessary for preventing customers from leaving. As a result, reporting
the CCP is not necessary. The researchers present an ensemble model to improve the
forecasting function of the CCP approach. When it comes to classification, ensemble
techniques are described as the aggregation of several individual classifiers (N. Lu
et al., 2012).
For CCP, an ensemble technique based on two models, rotation forest and RotBoost,
was created. RotBoost combines rotation forest with AdaBoost to improve customer
churn prediction within an ensemble methodology, with the rotation forest
component used to extract features. The simulation results show that RotBoost
performs very well compared to rotation forest in terms of accuracy, but rotation
forest obtained the highest AUC and lift values (De Bock & Van den Poel, 2012).
Furthermore, the use of GAM (Generalized Additive Model) in combination with
ensemble classification techniques, namely semi-parametric GAM, bagging, and
random subspace, is recommended. Finally, compared with training single
classifiers such as logistic regression and GAM, this combination achieves the
highest prediction accuracy (De Bock & Van den Poel, 2012).
Similarly, Pendharkar (2009) proposed using GA and NN for CCP. While the NN is
used to identify customer churn, the GA is used to explore the feature space of the
neural network. The approach is evaluated using the ROC (Receiver Operating
Characteristic) curve and compared with a z-score statistical approach; the
GA-based NN outperforms the z-score statistical technique. Furthermore, the ROC
curve has frequently been used in business to maximize profit (Maldonado, Flores,
Verbraken, Baesens, & Weber, 2015).
Óskarsdóttir et al. (2017) devised a CCP mechanism to identify more of the users
who are churning. JRip has been used to extract rules from a simple k-means
clustering model. To classify customer churn efficiently, Makhtar M (2017)
developed a customer classification approach based on rough set theory; LR, J48,
and NN may not perform as well as the rough set classification technique.
According to the report, a client is vulnerable to turnover if they have a
relationship with consumers who have already left. Many academics in the field of
machine learning consider CCP a promising problem. A large number of CCP models
exist; however, ML methods are needed to create an effective model.
Recently, CRM has gained a lot of traction and has become a major focus for
companies operating in competitive industries such as banking and telecom
(Abbasimehr & Shabani, 2019). CRM architecture is divided into two categories:
operational and analytical CRM. Operational CRM focuses on automating
customer-centric business processes, whereas analytical CRM focuses on analyzing
customer behavior to help firms make customer management and retention decisions.
Data mining is the technique of extracting and identifying hidden patterns in
massive data. Customer segmentation is an important CRM approach that helps
companies gain a better understanding of their customers' demands and habits.
Their research offers a new strategy for customer segmentation that uses time
series clustering to analyze how consumer behavior changes over time. The key
advantage of the suggested methodology is its capacity to study clients while
taking into account their changing behavior. The methodology was successfully
applied to a transaction dataset of a bank and businesses covering seven months.
The use of hierarchical clustering with multiple similarity measures allowed
meaningful groups, or clusters, of customer trends to be extracted. As a result,
banks are strongly advised to implement targeted marketing techniques to reduce
customer churn. The discovery of consumer behavioral trends over time, instead of
at a single point in time, illustrates the efficacy of the presented methodology.
The limitation of that research is that it analyzed data over a period of only
seven months; a period of twelve months would have been preferable.
Churn analysis is employed in sectors such as internet service providers (ISPs),
management, gaming, and insurance. Churn prediction studies are frequently the
first step in improving business performance. Rather than evaluating a customer's
total churn, a time frame is used to identify potential churning customers.
Customer turnover costs are determined using CAC (Customer Acquisition Cost) and
CLV (Customer Lifetime Value). In previous work, academics mostly employed
statistics, ML, and graph-based methods to forecast customer turnover, using
survival analysis and time series. Deep learning algorithms have recently been
used to perform churn prediction analysis. The use of feature engineering
approaches to process logs has a substantial impact on model performance. Unlike
previous modeling techniques, a DL model can use time-series characteristics to
turn high-dimensional sparse logs into low-dimensional dense features (Ahn et al.,
2020).
In the field of CRM, a variety of research has been conducted in numerous sectors
to retain customers and to establish effective models so that a specific set of
customers can be targeted for retention. For churn prediction, a variety of DM and
statistical approaches have been used, including decision trees, ANN, SVM,
regression models, clustering, Bayesian approaches, and others. One study presents
a hybrid learning approach for predicting churn in mobile telecommunication
networks; WEKA, a well-known ML program, was used to create the model. A recent
review looked at methods for churn prediction. In customer churn, decision
tree-based approaches, NN-based approaches, and regression methods are commonly
used. According to the review, decision tree-based algorithms, particularly C5.0
and CART, outperform several other data mining techniques, including regression,
in terms of effectiveness and accuracy. As observed in the literature study, the
most commonly used algorithms for churn prediction are DT, neural networks, and LR
(logistic regression), among which we suggest investigating the performance of DT
C5.0 and LR using publicly accessible telecom data.

2.3 Discussion and Analysis


Machine-learning (ML) classification approaches, data mining (DM) technologies,
hybrid neural networks (NN), and other approaches have been proposed to address
the customer churn problem. However, many real-world problems involve
significantly imbalanced data, which these studies have not addressed in terms of
accuracy and recall. Several models, including AdaBoost, NN, Extra Trees, KNN, and
XGBoost, were developed and compared. These are among the most common and
well-known models for classification tasks. AdaBoost (Adaptive Boosting) is an
ensemble approach that builds a strong learner by combining simple weak learners
in a linear fashion. However, AdaBoost is susceptible to noise and outliers in the
data. Extra Trees are computationally much faster than random forest.
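
As a sketch of what combining weak learners "in a linear fashion" means, the
standard AdaBoost formulation for binary labels in {-1, +1} can be written as
follows. The symbols H, h_t, alpha_t, epsilon_t, and T are notation introduced here
for illustration and are not taken from the studies reviewed.

```latex
H(x) = \operatorname{sign}\left( \sum_{t=1}^{T} \alpha_t \, h_t(x) \right),
\qquad
\alpha_t = \frac{1}{2} \ln \frac{1 - \varepsilon_t}{\varepsilon_t}
```

Here h_t is the weak learner trained in round t and epsilon_t is its weighted
training error. Because misclassified samples receive larger weights in the next
round, mislabeled or outlying records can come to dominate later rounds, which is
one way to understand the sensitivity to noise mentioned above.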

The k-NN algorithm is a non-parametric, lazy learning technique. A neural network
is modeled after the human brain, which is made up of billions of neurons. Neural
networks are either feed-forward or feedback networks. The XGBoost algorithm is
used to solve data science problems quickly and accurately.
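
The term "lazy learning" can be illustrated with a minimal k-NN sketch in plain
Python. No model is fitted in advance; all work happens when a query arrives. The
function name knn_predict, the two features, and the toy data are hypothetical and
serve only as an illustration.

```python
import math
from collections import Counter


def knn_predict(train_X, train_y, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    # Lazy learning: nothing is trained beforehand; distances to all stored
    # points are computed only at prediction time.
    distances = sorted(
        (math.dist(x, query), label) for x, label in zip(train_X, train_y)
    )
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]


# Hypothetical toy data: (monthly usage, customer-care calls) -> churn label
train_X = [(1.0, 5.0), (1.5, 4.0), (2.0, 6.0), (8.0, 1.0), (9.0, 0.0), (7.5, 1.5)]
train_y = [1, 1, 1, 0, 0, 0]

print(knn_predict(train_X, train_y, (1.2, 4.5)))  # query close to the churners
print(knn_predict(train_X, train_y, (8.5, 0.5)))  # query close to the non-churners
```

Since every prediction scans the whole training set, the cost of k-NN grows with
the number of stored records, which is the practical price of having no training
phase.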

2.4 Chapter Summary


In this chapter, different approaches and studies were compared, and the issues and
challenges reported in the literature, together with their outcomes, were
discussed. For example, AdaBoost combines weak learners linearly but is sensitive
to noise, random forest is slower than Extra Trees, and KNN is a non-parametric
lazy learning technique.

Chapter 3 presents the research methodology. XGBoost, short for Extreme Gradient
Boosting, refers to the technical objective of pushing the computational resources
of boosted tree algorithms to their limit. The XGB classifier is a decision
tree-based ensemble technique built on a gradient boosting framework. It uses the
principle of gradient descent to strengthen weak classifiers such as CART
(Classification and Regression Trees).

Figure 2.5: parallel ensemble (Labhsetwar, 2020)


The inner model independence in XGBoost distinguishes it from the AdaBoost
classification method. XGBoost also exploits parallelism during the training phase:
trees are still added one after another, but the construction of each tree is
parallelized.

CHAPTER 3
RESEARCH METHODOLOGY
3.1 Introduction
Research methods and research methodology are different in nature. Research
methods are the techniques and tools used to collect data for research, while
research methodology is the theory and conceptual framework that describes the
systematic and logical procedures taken to address the identified problems.
The essential data was obtained from primary and secondary sources using various
research approaches in order to understand and analyze the phenomenon and to
develop an in-depth understanding of the issue. The methodology of the study is
descriptive and analytical. A survey was also undertaken to collect primary data
for quantitative analysis. The purpose of this descriptive and analytical research
was to determine the reasons for which customer churn occurs. Generally, the theory
explains customer behavior and its aspects. These aspects help us to understand
what is happening in the surroundings and to make predictions about customer churn.
3.2 Research Framework
The computational framework of the proposed model is shown in Figure 3.1. This
framework consists of five phases: data acquisition, data preparation, data
preprocessing, data extraction, and decision.
Data acquisition: Because of the potential for misuse, obtaining data from the
telecom sector is a difficult undertaking. The KDD Cup 2009 provided the data set
for this investigation. It is used to study the marketing behavior of clients using
the large databases of Orange, a French telecommunications firm (Guyon, Lemaire,
Boullé, Dror, & Vogel, 2009).
Preparation of data: Because the dataset obtained cannot be applied directly to
churn prediction models, data preparation is required, in which additional
variables are added to the existing ones by observing the customers' periodic
usage behavior. These variables are vital for identifying customer behavior in
advance because they include critical information that the prediction models need.
Data preprocessing: Data preprocessing is the most vital phase of the prediction
model because the data contains redundancies, ambiguities, and errors that must be
cleaned before use. Because not all of the collected data is suitable for modeling,
the data gathered from the various sources is first collected and then cleansed.
Fields in which every record has a unique value have no significance because they
do not contribute to the prediction model, and fields containing an excessive
number of null values must also be removed.
Data extraction: The attributes for the classification process are identified. This
research worked with both quantitative and qualitative data.
Decision: By selecting a threshold value, the decision criteria classify customers
as churners or non-churners.
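
The two cleaning rules of the preprocessing phase (dropping fields whose values
are all unique and fields with too many nulls) can be sketched in plain Python as
follows. The function name select_columns, the 50% null limit, the null markers,
and the column names are assumptions made for illustration; the thesis does not
state the limit that was used.

```python
def select_columns(rows, max_null_fraction=0.5, null_markers=("", "NULL", None)):
    """Return the names of the columns that are worth keeping for modelling."""
    n_rows = len(rows)
    keep = []
    for col in rows[0]:
        values = [row[col] for row in rows]
        present = [v for v in values if v not in null_markers]
        null_fraction = 1 - len(present) / n_rows
        if null_fraction > max_null_fraction:
            continue  # excessive number of null values
        if len(present) > 1 and len(set(present)) == len(present):
            continue  # every value is distinct (ID-like), no predictive use
        keep.append(col)
    return keep


# Hypothetical records; the column names are illustrative only.
rows = [
    {"customer_id": "C1", "service_type": "prepaid", "promo_code": "NULL", "churn": "0"},
    {"customer_id": "C2", "service_type": "postpaid", "promo_code": "", "churn": "1"},
    {"customer_id": "C3", "service_type": "prepaid", "promo_code": "X9", "churn": "0"},
    {"customer_id": "C4", "service_type": "prepaid", "promo_code": "NULL", "churn": "0"},
]

print(select_columns(rows))  # ['service_type', 'churn']
```

In this toy example customer_id is dropped because all of its values are distinct,
and promo_code is dropped because three of its four values are missing.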

 
Figure 3.1: Customer Churn framework
3.2.1 Data Collection
The churn dataset describes cellular service providers, their customers, and
information about offers, voice calls, and packages. Customers can choose from a
variety of companies that provide cellular network services. Churn occurs when
customers switch cellular service providers; switching from one company to another
causes a loss of revenue for the company that is left.
A survey was conducted to collect the data, so the data comes from primary sources.
This research targets the Pakistani telecom service providers Jazz/Warid, Ufone,
Zong, and Telenor to determine which one provides the best service at low cost,
what the rate of customer churn is in these companies, and what causes the churn.
The dataset consists of 9760 records with 19 features. There are 1880 customers who
want to churn and 7880 non-churn customers, a ratio of non-churners to churners of
about 81:19.
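
The class proportions follow directly from the record counts given above, as the
short calculation below shows. The variable names are chosen here for
illustration.

```python
total = 9760
churners = 1880
non_churners = 7880
assert churners + non_churners == total

churn_rate = churners / total
non_churn_rate = non_churners / total
imbalance_ratio = non_churners / churners  # negatives per positive

print(f"churn rate:             {churn_rate:.2%}")       # 19.26%
print(f"non-churn rate:         {non_churn_rate:.2%}")   # 80.74%
print(f"negatives per positive: {imbalance_ratio:.2f}")  # 4.19
```

The last figure, the number of negative records per positive record, is the
quantity that the XGBoost documentation suggests as a starting value for
scale_pos_weight on imbalanced data. Whether such a reweighting would help on this
dataset is not examined here.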
This dataset contains two categories of variables: predictor variables and a target
variable. The predictor (independent) variables considered for this research are
gender, telecom service used, relation duration, service type, source of
attraction, preferred communication, satisfaction with call rate, contact with
customer care, rate of customer care calling, reason for contact, monthly usage,
internet quality, availability of 4G service, satisfaction with voice quality,
signal strength, satisfaction with price of packages, billing process, and
recommendation to others. Customer churn is the target (dependent) variable, a
dichotomous outcome with the two values 0 and 1. In the XGBoost model, the value 1
indicates churn, meaning that the customer switches from one company to another,
while 0 indicates that the customer stays with the existing company. Figure 3.3
shows the six steps that lead from data collection to the implementation of a
churn strategy.
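
To make the 0/1 coding of the target concrete, the sketch below shows how
predicted churn probabilities could be turned into the two class values by a
threshold. The function name to_labels, the example probabilities, and the default
threshold of 0.5 are assumptions for illustration; the threshold actually used is
not stated in this chapter.

```python
def to_labels(churn_probabilities, threshold=0.5):
    """Map predicted churn probabilities to 1 (churner) or 0 (non-churner)."""
    return [1 if p >= threshold else 0 for p in churn_probabilities]


probs = [0.08, 0.47, 0.50, 0.91]  # hypothetical model outputs

print(to_labels(probs))                 # [0, 0, 1, 1]
print(to_labels(probs, threshold=0.4))  # [0, 1, 1, 1]
```

Lowering the threshold labels more customers as churners, which tends to raise
recall at the cost of precision.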

Figure 3.2: Customer Churn Dataset

Figure 3.3: six steps for customer churn prediction
3.2.2 Data Preprocessing
Our churn prediction method consists of four phases: data sampling, data
preparation, classification, and prediction. Following the definition of churn,
data sampling randomly selects a group of customers together with their relevant
details. Preprocessing removes noise, extracts variables and features, and blends
and normalizes them. Some modeling techniques (e.g., XGBoost and decision trees) do
not require normalized variables, so for these techniques normalization in the
preprocessing step can be skipped. The main goal of the classification process is
to use modeling techniques to predict consumer behavior in the future.
The construction of the confusion matrix is shown step by step: table 3.1 shows its
layout, table 3.2 lists the basic terminology used in the confusion matrix, table
3.3 shows the measures derived from it, table 3.4 shows the actual matrix, and
figure 3.4 shows the confusion matrix as a diagram.
Table 3.1: Creation of Confusion matrix

                             Predicted Values
                             Negative    Positive
Actual Values    Negative    TN          FP
                 Positive    FN          TP

Table 3.2: Basic terminology which used in confusion-matrix

Basic terms of confusion matrix.

Terminology            Description
TP (True Positive)     Number of customers who are correctly labeled as churners.
TN (True Negative)     Number of customers who are labeled as non-churners and who actually do not churn.
FP (False Positive)    Number of customers who are labeled as churners but who actually do not churn.
FN (False Negative)    Number of customers who are labeled as non-churners but who actually churn.

Table 3.3: Measurement which derived by confusion-matrix


Terminology    Description                                                          Formula
Accuracy       Number of all correct predictions divided by the total number        (TP+TN)/Total
               of records in the dataset
Precision      Number of correct positive predictions divided by the number         TP/(TP+FP)
               of positive predictions
Error Rate     Number of all incorrect predictions divided by the total number      (FN+FP)/Total
               of records in the dataset
Recall         Number of correctly predicted churn customers divided by the         TP/(TP+FN)
               number of all churned customers

Table 3.4: Confusion matrix
                         Predicted
                         Non-churner    Churner
Actual    Non-churner    1853           117
          Churner        142            328
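
As a check on the definitions in Table 3.3, the measures can be recomputed from the
counts in Table 3.4. The sketch assumes the layout of Table 3.1 (rows are actual
classes, columns are predicted classes). The F1 score is computed with the standard
harmonic mean of precision and recall, a formula that is not listed in Table 3.3.

```python
# Counts taken from Table 3.4 (rows: actual class, columns: predicted class)
tn, fp = 1853, 117  # actual non-churners
fn, tp = 142, 328   # actual churners

total = tn + fp + fn + tp
accuracy = (tp + tn) / total
error_rate = (fp + fn) / total
precision = tp / (tp + fp)
recall = tp / (tp + fn)
f1 = 2 * precision * recall / (precision + recall)

print(f"records in matrix: {total}")        # 2440
print(f"accuracy:   {accuracy:.2%}")        # 89.39%
print(f"error rate: {error_rate:.2%}")      # 10.61%
print(f"precision:  {precision:.2%}")       # 73.71%
print(f"recall:     {recall:.2%}")          # 69.79%
print(f"F1 score:   {f1:.2%}")              # 71.69%
```

The recomputed accuracy, precision, recall, and F1 score agree with the values
reported for XGBoost in Chapter 4.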

 
Figure 3.4: Confusion matrix
Step 1: Data collection, data preprocessing, and normalization: Computer science
is closely related to data processing, big data, and deep learning, so data sets
are easily available. We accumulate data from websites that provide unrestricted
access and collect data from different sources in the telecommunication industry.
Customers and all their details are selected randomly from the chosen
telecommunication dataset. Noise is then removed from the data. Noise reduction is
an important step: it removes duplicate records, irrelevant information such as
misspellings and stray symbols, and the missing values that commonly exist in the
database, which are normally marked by the string "NULL" or by blank spaces. The
whole methodology is shown in figure 3.5.
Step 2: Extract features and variables: Feature/variable extraction plays the most
significant role in determining the prediction rates of predictive models.
Step 3: XGBoost: Extreme Gradient Boosting is used for regression and
classification and has a high execution speed.
Step 4: Classification and prediction: In the field of telecommunications, several
techniques for predicting churn have been suggested. This research is based on XGB,
implemented in Python 3.7.3. Python is considered the most popular language for
machine learning because of its extensive libraries. The following steps were
carried out in this study. Firstly, import libraries: Pandas, NumPy, XGBoost,
Matplotlib, scikit-learn, etc. Secondly, import the dataset: the dataset is in CSV
format and is made up of tabular data in plain text. The pandas read_csv() function
is used to generate a data frame from the provided dataset. Thirdly, handle missing
values: the selected dataset had been processed carefully and did not contain any
missing values. Fourthly, encode categorical data: the machine learning algorithms
used deal with numerical values, not labels. Hence label values, such as churn, are
converted into Boolean values. Fifthly, split the dataset into training and test
sets: the dataset consists of 9760 instances and the splitting ratio is 70:30.
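
The loading, encoding, and splitting steps can be sketched with the Python standard
library alone. This is a stand-in for the pandas-based workflow described above,
not the code used in the study. The miniature CSV, the column names, the helper
names encode_column and stratified_split, the random seed, and the choice of a
stratified split are all assumptions made for illustration.

```python
import csv
import io
import random

# Hypothetical miniature of the survey data (column names are illustrative).
RAW_CSV = """gender,service_type,churn
male,prepaid,no
female,postpaid,no
male,prepaid,yes
female,prepaid,no
male,postpaid,no
female,prepaid,yes
male,prepaid,no
female,postpaid,no
male,postpaid,yes
female,prepaid,no
"""


def encode_column(rows, column):
    """Replace the string categories of `column` by integer codes, in place."""
    categories = sorted({row[column] for row in rows})
    codes = {category: code for code, category in enumerate(categories)}
    for row in rows:
        row[column] = codes[row[column]]
    return codes


def stratified_split(rows, target, test_fraction=0.3, seed=42):
    """Split rows into train/test so that each class keeps its proportion."""
    rng = random.Random(seed)
    train, test = [], []
    for label in sorted({row[target] for row in rows}):
        group = [row for row in rows if row[target] == label]
        rng.shuffle(group)
        n_test = round(len(group) * test_fraction)
        test.extend(group[:n_test])
        train.extend(group[n_test:])
    return train, test


rows = list(csv.DictReader(io.StringIO(RAW_CSV)))
mappings = {col: encode_column(rows, col) for col in ("gender", "service_type", "churn")}
train, test = stratified_split(rows, target="churn")

print(mappings["churn"])      # {'no': 0, 'yes': 1}
print(len(train), len(test))  # 7 3
```

With pandas and scikit-learn, the same steps are commonly written with
pandas.read_csv() and sklearn.model_selection.train_test_split(), whose stratify
argument keeps the churn proportion equal in both subsets.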

  
Figure 3.5: Methodology
3.3 Evaluation Metrics
The evaluation parameters (hyperparameter settings) of the XGBoost algorithm, as
shown in table 3.5, are learning_rate=0.300000012, base_score=0.5,
booster='gbtree', colsample_bylevel=1, colsample_bynode=1, colsample_bytree=1,
gamma=0, max_depth=1, min_child_weight=1, n_estimators=100, alpha=0, lambda=1, and
scale_pos_weight=1.
Table 3.5: Evaluation parameters
Evaluation parameters    Values
Learning rate            0.300000012
Base score               0.5
Estimators               100
Max depth                1
Min child weight         1
Scale positive weight    1
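
For reference, the settings above can be collected in a Python dictionary. The keys
follow the keyword names of the XGBClassifier class in the xgboost Python package.
Mapping alpha and lambda to reg_alpha and reg_lambda, and "estimators" to
n_estimators, is an assumption about how the reported values correspond to that
interface.

```python
# Values from Section 3.3 and Table 3.5, keyed by XGBClassifier keyword names.
xgb_params = {
    "learning_rate": 0.3,  # reported as 0.300000012, the float32 value of 0.3
    "base_score": 0.5,
    "booster": "gbtree",
    "colsample_bylevel": 1,
    "colsample_bynode": 1,
    "colsample_bytree": 1,
    "gamma": 0,
    "max_depth": 1,        # each tree is a single split (a decision stump)
    "min_child_weight": 1,
    "n_estimators": 100,   # number of boosting rounds
    "reg_alpha": 0,        # L1 regularization (assumed to be the reported alpha)
    "reg_lambda": 1,       # L2 regularization (assumed to be the reported lambda)
    "scale_pos_weight": 1,
}

# Assumed usage; requires the xgboost package to be installed:
# from xgboost import XGBClassifier
# model = XGBClassifier(**xgb_params)
print(len(xgb_params), "parameters")
```

With max_depth=1 and n_estimators=100, the model is an additive combination of 100
one-split trees.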

3.4 Tools and Technology


The experiments for the proposed computational method were implemented in Python
using Anaconda, running on a Microsoft Windows 7 64-bit OS. The desktop PC had
8 GB of Random Access Memory (RAM) and an Intel Core i5 2.30 GHz Central
Processing Unit (CPU).
3.5 Proposed Method
XGBoost has recently become a dominant technique in applied ML and Kaggle contests
for structured or tabular data. XGBoost is a gradient-boosted decision tree
implementation optimized for speed, reliability, and performance. The name XGBoost
relates to the technical aim of pushing the limits of computational resources for
boosted tree algorithms, which is why numerous researchers employ it. The
algorithm's implementation was aimed at reducing computation time and making
efficient use of memory. One of the design goals was to make the best use of
available resources when training the model. XGBoost can be used on structured or
tabular datasets for both regression and classification predictive modeling.
XGBoost is a nonparametric machine learning technique used for applied ML. It is a
library for building high-performance gradient-boosted tree models, and it has
outperformed competing methods on a variety of demanding machine learning tasks.
The library can be used from R, Python, and the command line. This research uses
XGBoost in Python.
Gradient boosting is a method in which new models are created to predict the
residuals, or errors, of the previous models; the models are then added together to
form the final prediction. This method is applicable to both regression and
classification predictive modeling problems.
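
The idea of fitting each new model to the residuals of the previous ones can be
illustrated with a minimal gradient boosting sketch for squared-error regression,
using one-split trees (stumps) on a single feature. This is a simplified
illustration, not the XGBoost algorithm itself, which additionally uses
second-order gradient information and regularization. The function names, the toy
data, and the settings are hypothetical.

```python
def fit_stump(xs, residuals):
    """Fit the best single-split regression stump (least squares) on 1-D inputs."""
    best = None
    for threshold in sorted(set(xs))[:-1]:  # keeps both sides non-empty
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = sum((r - left_mean) ** 2 for r in left)
        sse += sum((r - right_mean) ** 2 for r in right)
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    _, threshold, left_mean, right_mean = best
    return lambda x: left_mean if x <= threshold else right_mean


def gradient_boost(xs, ys, n_rounds=50, learning_rate=0.3):
    """Additive model: start from the mean, then repeatedly fit the residuals."""
    base = sum(ys) / len(ys)
    stumps = []
    preds = [base] * len(ys)
    for _ in range(n_rounds):
        residuals = [y - p for y, p in zip(ys, preds)]  # errors of current model
        stump = fit_stump(xs, residuals)                # new model predicts them
        stumps.append(stump)
        preds = [p + learning_rate * stump(x) for x, p in zip(xs, preds)]
    return lambda x: base + learning_rate * sum(s(x) for s in stumps)


def mse(predict, xs, ys):
    return sum((y - predict(x)) ** 2 for x, y in zip(xs, ys)) / len(ys)


# Hypothetical toy data with a jump between x = 4 and x = 5
xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [1.0, 1.2, 0.9, 1.1, 3.0, 3.2, 2.9, 3.1]

mean_y = sum(ys) / len(ys)
mse_before = mse(lambda x: mean_y, xs, ys)  # constant (mean) prediction
model = gradient_boost(xs, ys)
mse_after = mse(model, xs, ys)

print(f"MSE of the mean prediction: {mse_before:.4f}")
print(f"MSE after boosting:         {mse_after:.4f}")
```

Each round corrects only a fraction (the learning rate) of the remaining error, so
the training error shrinks step by step as stumps are added.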

3.6 Chapter Summary
Customer churn forecasting is a complicated process with several decision points
for analysts. Business understanding, data interpretation, data preparation,
analysis, evaluation, and implementation are the six stages a company should
strictly follow for machine learning. This research concentrates on a few of the
most time-consuming and crucial steps, in particular data pre-processing, which
must be rigorous because the "garbage in, garbage out" principle applies to
customer churn prediction. Predicting ahead of time which customers will churn can
assist the telecommunications sector and the CRM department in identifying which
customers will leave the network. Our research treats this as a classification
problem, in which individual subscribers are identified as prospective churners or
non-churners.

CHAPTER 4
RESULTS AND DISCUSSION
4.1 Introduction
In this research, a customer churn prediction model was developed using the
XGBoost algorithm, also known as eXtreme Gradient Boosting. This ML algorithm
offers high accuracy and fast execution. Developing an efficient and optimal churn
model is a difficult task. To accomplish this aim, experiments are carried out to
increase the predictive ability of the churn model using the widely used XGBoost.
This research uses the gradient boosting framework to build ML (Machine Learning)
models. It relies on parallel tree boosting to deal with a wide range of data
science problems quickly and accurately.
4.2 Design of Proposed Method
Boosting is a learning strategy for constructing a strong classifier from a
sequence of weak classifiers. Boosting algorithms are important in dealing with the
bias-variance trade-off. As opposed to bagging algorithms, which only correct for
excessive variance in a model, boosting algorithms handle both elements (bias and
variance) and are considered more effective. There are several boosting algorithms,
such as AdaBoost, Gradient Boosting, Extreme Gradient Boosting (XGBoost), CatBoost,
and LightGBM (Light Gradient Boosting Machine). The method proposed in this
research is XGBoost.
XGBoost is an improved gradient boosting ML package. It was originally written in
C++, but it provides APIs in various other languages. The core XGBoost algorithm is
parallelizable and flexible, which means that the construction of a single tree can
be parallelized. The following are some of the advantages of using XGBoost:
I. It is one of the most powerful algorithms, with fast and excellent performance.
II. It is capable of using the entire processing capacity of current multi-core
processors.
III. It can be trained on large datasets.
IV. It consistently outperforms single-algorithm techniques.
When compared to standard gradient boosting, XGBoost gives better results and
higher performance. It has a very short training time and can even be parallelized
across clusters. Figure 4.1 shows the flow chart of the proposed method for
predicting customer churn using XGBoost.

Figure 4.1: flow diagram for proposed method
4.3 Results and Discussions
The experiments for the proposed computational method were implemented in Python
using Anaconda, running on a Microsoft Windows 7 64-bit OS. The desktop PC had
8 GB of Random Access Memory (RAM) and an Intel Core i5 2.30 GHz Central
Processing Unit (CPU).
The XGBoost method is used to predict customer churn. The model is applied to a
dataset consisting of 9760 instances with 19 features, of which 1880 are customers
who leave the company and 7880 are non-churners; the model was trained and tested
on this dataset. The first phase is data preparation, which involves filtering the
data and converting it into a consistent format before feature selection. In the
next stage, the XGBoost algorithm is used for classification and prediction.
Customer behavior is observed and evaluated during the training and testing phases.
Finally, the results are collected and analyzed: customer churn is predicted with
an accuracy of 89.39%, a precision of 73.71%, a recall of 69.79%, and an F1 score of
71.69%, with churned customers making up 19.26% and non-churned customers 80.74% of
the dataset. Table 4.1 shows the results of this research using XGBoost. With
XGBoost, this research achieves higher accuracy than the other algorithms compared
in table 4.2, and table 4.3 shows different studies of churn prediction using
telecom datasets.
Table 4.1: Results of classification method (XGB)

Performance Parameter    XGBoost
Accuracy                 89.39%
Precision                73.71%
Recall                   69.79%
F1-Score                 71.69%
AUC                      81.3%

Table 4.2: comparison of accuracy with XGBoost

              % Accuracy
Model         Churn rate 26%    Churn rate 41%
LR            80.1              75.6
NB            72.7              73.0
DT            73.4              70.7
RF            73.3              75.1
GBT           77.4              79.0
XGBoost       79.8              81.4

Table 4.3: Researches of churn prediction using telecom dataset

Authors                    Classifiers/approaches used      Results
                           for churn prediction

(Pamina et al., 2019)      KNN, random forest, and XGB      XGBoost outperforms the others;
                                                            higher accuracy achieved by XGB.

(Keramati et al., 2014)    Decision Tree, ANN, K Nearest    A hybrid technique was introduced,
                           Neighbors, and SVM               which improved the outcomes and
                                                            performance of evaluation measures.

(Huang et al., 2012)       LR, Linear Classification, DT,   A new set of features was constructed
                           Naive Bayes, Multilayer          and seven different prediction
                           Perceptron Neural Networks,      techniques were used.
                           SVM, and evolutionary data
                           mining algorithms

(J. Lu, 2002)              Survival analysis techniques     Predicts the duration of a
                                                            customer's membership.

(Vafeiadis et al., 2015)   ANN, SVM, LR, Decision Trees,    Illustrates a clear benefit of the
                           and Naive Bayes classifier       boosted versions of the models vs.
                                                            the non-boosted versions.

(Lemmens & Croux, 2006)    Bagging and Boosting Techniques  Recommended for large datasets,
                                                            using a balanced sampling scheme.

(C.-F. Tsai & Chen, 2010)  Neural network and DT            For churn prediction, neural network
                                                            and DT are used with four evaluation
                                                            measures (accuracy, F-measure, recall,
                                                            and precision), all combined with
                                                            association rules.

(Hung, Yen, & Wang, 2006)  ANN and Decision Tree            High prediction accuracy is achieved
                                                            by using customer profile, billing,
                                                            call information, and service
                                                            turnover records.

Table 4.3 lists the classifiers that have been used in the telecom industry for
customer churn prediction. This research achieves a high accuracy of 89.39%
compared to the other classifiers shown in table 4.2.

Figure 4.2: service type

We ran numerous trials of the proposed XGBoost churn model on the dataset. This
research first analyzes customer trends across the service provider companies
examined. Figure 4.2 shows which type of service customers use, pre-paid or
post-paid. Most customers use the pre-paid service.

Figure 4.3: satisfaction of customer with signals strength

Customer churn often occurs in service provider companies because of weak signal
strength, which causes customer dissatisfaction. As shown in figure 4.3, signal
strength is weakest for Ufone, which may lead a large number of customers to churn
because of signal problems; customers are next least satisfied with the signal
strength of Telenor.

Figure 4.4: satisfaction of customer with call rate

People often communicate through voice calls, which are paid for at rates that
vary between companies. Figure 4.4 shows customer satisfaction with the call rates
of the selected telecom companies: customers are less satisfied with the call rates
of Ufone, Jazz/Warid, and Telenor, while satisfaction with the call rate of Zong is
high.

Figure 4.5: satisfaction of customer price of packages

Every company offers packages at different prices, and package prices may vary
from city to city and from company to company. The results of this research suggest
that high churn may occur because of high package prices: as shown in figure 4.5,
Ufone, Telenor, and Jazz offer highly priced packages, while the customers of Zong
are pleased with the price of packages.

Figure 4.6: Quality of internet


Figure 4.7: Availability 4G service

Nowadays, the internet is a vital part of life because of its widespread use, and
it is an effective and efficient means of communication all over the world. Service
provider companies offer various internet packages, and this research examines the
quality and speed of the internet provided by these telecom companies; customer
satisfaction with internet quality is shown in figure 4.6. Some consumers are
dissatisfied with the availability of 4G service, as shown in figure 4.7. The trend
shows that the internet has become a vital part of life that nearly everyone uses,
and offering lower-priced packages is an effective method of customer retention.

Figure 4.8: Customer relation duration with company

Building a relationship with customers is one of the most successful things a
company can do. Customer relationships are inextricably linked to a company's
financial well-being. Building strong customer relationships increases customer
loyalty and retention and creates long-term clients, resulting in increased income
from recurring purchases. Figure 4.8 shows the duration of customers' relationships
with the selected telecom companies.

Sometimes even customers with a long relationship churn because of a poorly
maintained network, high call rates, and high package prices, which causes a loss
of revenue.

Figure 4.9: Churn rate of telecom companies

Figure 4.10: Area under Curve

After analyzing customer trends and applying the XGBoost algorithm, this research
obtains churn predictions: the risk of churn is highest for Ufone, followed by
Jazz/Warid, as shown in figure 4.9. The area under the curve per iteration is shown
in figure 4.10; XGBoost achieves an AUC of 81.3%.
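
The meaning of the AUC value can be illustrated with a small sketch that computes
it as the probability that a randomly chosen churner receives a higher score than a
randomly chosen non-churner. The function name roc_auc and the toy labels and
scores are hypothetical; they are not the predictions of the model reported here.

```python
def roc_auc(labels, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a correct ranking
    return wins / (len(positives) * len(negatives))


# Hypothetical example: 3 churners (label 1) and 4 non-churners (label 0)
labels = [1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.7, 0.4, 0.8, 0.3, 0.2, 0.1]

print(f"AUC = {roc_auc(labels, scores):.3f}")  # 10 of 12 pairs ranked correctly: 0.833
```

An AUC of 0.5 corresponds to random ranking and 1.0 to a perfect ranking, so a value
of 81.3% means that roughly four out of five churner/non-churner pairs are ordered
correctly by the model's scores.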

CHAPTER 5

SUMMARY, FINDINGS, CONCLUSION &


RECOMMENDATIONS
5.1 Summary
The purpose of this study is to help businesses in the telecom sector increase
their profits. Predicting churn has become one of the most significant ways for
telecom firms to protect their income. Hence, the goal of this study is to develop
a system that can forecast customer churn in a telecom firm. High AUC values are
required for such prediction models. The sample data is split into 70 percent for
training and 30 percent for testing the model.

Churn studies enable a comprehensive analysis of churn and the subsequent
exploration of various features for use in churn models. Furthermore, various
studies have been undertaken in the past, each employing a different data size and
distinct characteristics, which severely limits the use of customer churn models in
real-time applications.

Due to increased competition, predicting churn has always been a difficult
issue. To address the churn problem, this research examines customer trends to
identify which offers help a company attract customers. To improve its income,
a company needs to resolve customers' problems regarding service quality,
performance and registered complaints. As a result, the focus of this study
is on developing an efficient model. Machine learning measures such as
accuracy, precision and recall are used to assess XGBoost. In the churn
dataset, 19.26% of customers are churners, whereas 80.73% are non-churners.
The model achieves an accuracy of 89.39%, a precision of 73.71%, a recall of
69.79%, an AUC of 81.3% and an F1 score of 71.69%.
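
The F1 score is the harmonic mean of precision and recall, so it can be checked directly from the two reported values. The short sketch below does this; the helper name `f1_from` is an assumption made for illustration. Because the inputs are already rounded to two decimals, the result agrees with the reported F1 score only to within rounding.

```python
def f1_from(precision, recall):
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


precision = 0.7371  # reported precision
recall = 0.6979     # reported recall

f1 = f1_from(precision, recall)
print(f"F1 score: {f1:.2%}")
```

The computed value is approximately 71.7%, which is consistent with the 71.69% reported above.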

5.2 Findings
The major reasons for using XGBoost are its execution speed and model
performance. XGBoost is an ensemble learning method: it combines many decision
trees, each trained to correct the errors of the previous ones, into a single
model. XGBoost also supports parallel and distributed computation while using
memory efficiently.
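
The idea of combining many weak learners into a single model can be illustrated with a deliberately simplified sketch of gradient boosting. It fits one-split decision stumps to the residuals of the current prediction under squared error and adds each stump with a learning rate. This is not XGBoost's actual algorithm (which uses a regularised, second-order objective and full trees); the function names, the toy data and the treatment of 0/1 churn labels as regression targets are all assumptions made for illustration.

```python
def fit_stump(xs, residuals):
    """Return (threshold, left_value, right_value) minimising squared error.

    Assumes xs contains at least two distinct values.
    """
    best = None
    # The largest value is skipped because it would leave the right side empty.
    for threshold in sorted(set(xs))[:-1]:
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        error = (sum((r - left_mean) ** 2 for r in left)
                 + sum((r - right_mean) ** 2 for r in right))
        if best is None or error < best[0]:
            best = (error, threshold, left_mean, right_mean)
    return best[1:]


def boost(xs, ys, n_rounds=20, learning_rate=0.3):
    """Additively fit stumps to the residuals of the running prediction."""
    base = sum(ys) / len(ys)          # start from the mean label
    preds = [base] * len(ys)
    stumps = []
    for _ in range(n_rounds):
        residuals = [y - p for y, p in zip(ys, preds)]
        threshold, left_value, right_value = fit_stump(xs, residuals)
        stumps.append((threshold, left_value, right_value))
        preds = [p + learning_rate * (left_value if x <= threshold else right_value)
                 for x, p in zip(xs, preds)]
    return base, stumps, preds


def mse(ys, preds):
    """Mean squared error between labels and predictions."""
    return sum((y - p) ** 2 for y, p in zip(ys, preds)) / len(ys)


# Hypothetical single feature and 0/1 churn labels.
xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [0, 0, 0, 1, 0, 1, 1, 1]

base, stumps, preds = boost(xs, ys)
mse_before = mse(ys, [base] * len(ys))
mse_after = mse(ys, preds)
print(f"MSE before boosting: {mse_before:.3f}, after: {mse_after:.3f}")
```

Each stump on its own is a weak predictor, but the training error falls as stumps are added, which is the behaviour the ensemble description above refers to.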

XGBoost is a strong choice for difficult classification problems. It is
particularly effective when there are many features and a lot of data, when
there are outliers and many missing values, and when little feature
engineering is wanted. It has been used in many winning competition entries,
so it is worth keeping in mind for any classification task.

Little feature engineering is necessary: there is no need to scale or
normalise the data, and missing values are handled well. The importance of
each feature can be determined from the model's output, which helps with
feature selection. The model is easy to understand, outliers have a negligible
effect, and it works well with huge datasets. Execution is quick and model
performance is excellent (it has won many Kaggle competitions).

5.3 Conclusion
Customer churn is now a major concern for many organizations, especially in
the telecommunications sector. It undermines customer retention, and obtaining
new customers is much more difficult than keeping existing ones. A predictive
model of customer churn can help firms overcome this threat by identifying
at-risk customers and implementing customer retention strategies. In this
research, the proposed predictive model is built with the XGBoost algorithm.
The research is based on customer trend analysis and prediction of customer
churn in the telecom industry using the XGBoost machine learning algorithm.
The figures in the previous chapter show the customer trends and the predicted
churn: the highest churn is expected at Ufone and Jazz, for the reasons
discussed in that chapter. Understanding customer churn will help many
organizations, particularly those in the telecommunications industry, to
achieve good profits and a high level of revenue. Customer churn forecasting
is a major concern in the telecommunication sector because corporations prefer
to retain existing customers rather than acquire new ones. Compared with
previous algorithms, this research achieves higher accuracy by employing
XGBoost. The dataset describes customers' service plans, and its values are
evaluated to make a precise forecast, which makes it possible to target
consumers who are planning to shift to another firm's services. The telecom
company will have a clear picture and will be able to offer these customers
attractive incentives to stay with its service. The results indicate that the
proposed XGBoost churn model performed well. It achieves an accuracy of
89.39%, a precision of 73.71%, a recall of 69.79%, an AUC of 81.3% and an F1
score of 71.69%.
5.3.1 Limitations

Before categorical features are fed into an XGBoost model, dummy variables or
label encoding must be built manually. For larger datasets, training time is
quite long. The model is difficult to interpret and to visualize. If the
parameters are not tuned appropriately, overfitting is possible, and the large
number of hyperparameters makes tuning more difficult.
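
The manual encoding step mentioned above can be sketched with `pandas.get_dummies`, the same function used in the appendix. The rows below are hypothetical and only two of the categorical columns are shown; the category values ("Ufone", "Jazz", "Other", "Prepaid", "Postpaid") are assumptions made for illustration.

```python
import pandas as pd

# Hypothetical rows; column names follow the appendix, values are made up.
df = pd.DataFrame({
    "Network": ["Ufone", "Jazz", "Other", "Jazz"],
    "service_type": ["Prepaid", "Postpaid", "Prepaid", "Prepaid"],
    "monthly_usage_rs": [500, 1200, 300, 800],
})

# One indicator column per category; drop_first=True removes one level per
# feature so the indicators are not perfectly collinear.
encoded = pd.get_dummies(df, columns=["Network", "service_type"],
                         drop_first=True, dtype=int)
print(encoded.columns.tolist())
```

Numeric columns such as `monthly_usage_rs` pass through unchanged, while each categorical column is replaced by 0/1 indicator columns that a tree model can split on.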

5.4 Recommendations
The proposed XGBoost method can be extended with a weight function to address
the issue of imbalanced data. Further testing should compare it with other
machine learning algorithms, with the aim of reaching performance metrics of
more than 99 percent.
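
One common way to weight the minority class in XGBoost is its `scale_pos_weight` parameter, usually set to the ratio of negative to positive examples. The sketch below derives that ratio from the class shares reported in the summary; the helper name `class_weight_ratio` is an assumption, and whether this particular weighting would improve the results of this study is untested.

```python
def class_weight_ratio(n_negative, n_positive):
    """Ratio of negative to positive examples, as used for scale_pos_weight."""
    if n_positive <= 0:
        raise ValueError("need at least one positive example")
    return n_negative / n_positive


churn_share = 0.1926      # reported share of churners (positive class)
non_churn_share = 0.8073  # reported share of non-churners (negative class)

weight = class_weight_ratio(non_churn_share, churn_share)
print(f"scale_pos_weight: {weight:.2f}")

# The value would then be passed to the classifier, for example:
#   xgb.XGBClassifier(objective='binary:logistic', scale_pos_weight=weight)
```

With roughly four non-churners for every churner, each churner would count about four times as much in the training loss.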

5.4.1 Future Recommendations

Future work might focus on further study of lazy learning algorithms to
improve customer churn prediction. The study may also be expanded to capture
shifting consumer behavior by employing artificial intelligence tools for
customer trend analysis and churn prediction, with the aim of reaching even
higher accuracy, up to 99.9%.

REFERENCES
Abbasi, Ahmed, Lau, Raymond YK, & Brown, Donald E. (2015). Predicting behavior. IEEE
Intelligent Systems, 30(3), 35-43.
Abbasimehr, Hossein, & Shabani, Mostafa. (2019). A new methodology for customer
behavior analysis using time series clustering: A case study on a bank’s customers.
Kybernetes.
Ahmad, Abdelrahim Kasem, Jafar, Assef, & Aljoumaa, Kadan. (2019). Customer churn
prediction in telecom using machine learning in big data platform. Journal of Big
Data, 6(1), 1-24.
Ahmed, Ammar AQ, & Maheswari, D. (2017). Churn prediction on huge telecom data using
hybrid firefly based classification. Egyptian Informatics Journal, 18(3), 215-220.
Ahmed, Ammara, & Linen, D Maheswari. (2017). A review and analysis of churn prediction
methods for customer retention in telecom industries. Paper presented at the 2017
4th International Conference on Advanced Computing and Communication Systems
(ICACCS).
Ahn, Jaehyun, Hwang, Junsik, Kim, Doyoung, Choi, Hyukgeun, & Kang, Shinjin. (2020). A
Survey on Churn Analysis in Various Business Domains. IEEE Access, 8, 220816-
220839.
Almana, Amal M, Aksoy, Mehmet Sabih, & Alzahrani, Rasheed. (2014). A survey on data
mining techniques in customer churn analysis for telecom industry. International
Journal of Engineering Research and Applications, 4(5), 165-171.
Amuda, Kamorudeen A, & Adeyemo, Adesesan B. (2019). Customers churn prediction in
financial institution using artificial neural network. arXiv preprint arXiv:1912.11346.
Arivazhagan, B, & Sankara, SDRS. (2020). Customer Churn Prediction Model Using Regression
with Bayesian Boosting Technique in Data Mining. Ijaema. Com, 12(V), 1096-1103.
Azeem, Muhammad, Usman, Muhammad, & Fong, Alvis Cheuk M. (2017). A churn prediction
model for prepaid customers in telecom using fuzzy classifiers. Telecommunication
Systems, 66(4), 603-614.
Ben, Adeniyi. (2020). Enhanced Churn Prediction in the Telecommunication Industry.
International Journal of Innovative Research in Computer Science & Technology
(IJIRCST), 8.
Brânduşoiu, Ionuţ, Toderean, Gavril, & Beleiu, Horia. (2016). Methods for churn prediction in
the pre-paid mobile telecommunications industry. Paper presented at the 2016
International conference on communications (COMM).

Castanedo, Federico, Valverde, Gabriel, Zaratiegui, Jaime, & Vazquez, Alfonso. (2014). Using
deep learning to predict customer churn in a mobile telecommunication network.
Wise Athena LLC.
Chandrakala, D. (2016). A survey on customer churn prediction using machine learning
techniques. International Journal of Computer Applications, 154(10).
Chu, Bong-Horng, Tsai, Ming-Shian, & Ho, Cheng-Seen. (2007). Toward a hybrid data mining
model for customer retention. Knowledge-Based Systems, 20(8), 703-718.
Cordoba, Alberto. (2014). Understanding the predictive analytics lifecycle: John Wiley &
Sons.
Coussement, Kristof, Benoit, Dries Frederik, & Van den Poel, Dirk. (2015). Preventing
customers from running away! Exploring generalized additive models for customer
churn prediction. In The Sustainable Global Marketplace (pp. 238-238): Springer.
Dalvi, Preeti K, Khandge, Siddhi K, Deomore, Ashish, Bankar, Aditya, & Kanade, VA. (2016).
Analysis of customer churn prediction in telecom industry using decision trees and
logistic regression. Paper presented at the 2016 Symposium on Colossal Data
Analysis and Networking (CDAN).
De Bock, KW, & Van den Poel, D. (2012). Reconciling performance and interpretability in
customer churn prediction modeling using ensemble learning based on generalized
additive models. Retrieved from
Dingli, Marmara, & Fournier. (2017). Enhancing Customer Retention through Data Mining
Technique. Machine Learning and Applications: An International Journal (MLAIJ),
4(1/2/3).
Do, Duyen, Huynh, Phuc, Vo, Phuong, & Vu, Tu. (2017). Customer churn prediction in an
internet service provider. Paper presented at the 2017 IEEE International Conference
on Big Data (Big Data).
Effendy, Veronikha, & Baizal, ZK Abdurahman. (2014). Handling imbalanced data in customer
churn prediction using combined sampling and weighted random forest. Paper
presented at the 2014 2nd International Conference on Information and
Communication Technology (ICoICT).
García, David L, Nebot, Àngela, & Vellido, Alfredo. (2017). Intelligent data analysis
approaches to churn as a business problem: a survey. Knowledge and Information
Systems, 51(3), 719-774.

Gerpott, Torsten J, Rams, Wolfgang, & Schindler, Andreas. (2001). Customer retention,
loyalty, and satisfaction in the German mobile cellular telecommunications market.
Telecommunications Policy, 25(4), 249-269.
Halibas, Alrence Santiago, Matthew, Anju Cherian, Pillai, Indu Govinda, Reazol, Jay Harold,
Delvo, Erbeth Gerald, & Reazol, Leslyn Bonachita. (2019). Determining the
intervening effects of exploratory data analysis and feature engineering in telecoms
customer churn modelling. Paper presented at the 2019 4th MEC International
Conference on Big Data and Smart City (ICBDSC).
Hassouna, Mohammed, Tarhini, Ali, Elyas, Tariq, & AbouTrab, Mohammad Saeed. (2016).
Customer churn in mobile markets a comparison of techniques. arXiv preprint
arXiv:1607.07792.
He, Benlan, Shi, Yong, Wan, Qian, & Zhao, Xi. (2014). Prediction of customer attrition of
commercial banks based on SVM model. Procedia computer science, 31, 423-430.
He, Yue, He, Zhenglin, & Zhang, Dan. (2009). A study on prediction of customer churn in fixed
communication network based on data mining. Paper presented at the 2009 sixth
international conference on fuzzy systems and knowledge discovery.
Hemalatha, Putta, & Amalanathan, Geetha Mary. (2019). A hybrid classification approach for
customer churn prediction using supervised learning methods: Banking sector. Paper
presented at the 2019 International Conference on Vision Towards Emerging Trends
in Communication and Networking (ViTECoN).
Hosseini, Seyed Mohammad Seyed, Maleki, Anahita, & Gholamian, Mohammad Reza.
(2010). Cluster analysis using data mining approach to develop CRM methodology to
assess the customer loyalty. Expert Systems with Applications, 37(7), 5259-5264.
Huang, Bingquan, Buckley, Brian, & Kechadi, T-M. (2010). Multi-objective feature selection
by using NSGA-II for customer churn prediction in telecommunications. Expert
Systems with Applications, 37(5), 3638-3646.
Huang, Bingquan, Kechadi, Mohand Tahar, & Buckley, Brian. (2012). Customer churn
prediction in telecommunications. Expert Systems with Applications, 39(1), 1414-
1425.
Hung, Shin-Yuan, Yen, David C, & Wang, Hsiu-Yu. (2006). Applying data mining to telecom
churn management. Expert Systems with Applications, 31(3), 515-524.
Idris, Adnan, Iftikhar, Aksam, & ur Rehman, Zia. (2019). Intelligent churn prediction for
telecom using GP-AdaBoost learning and PSO undersampling. Cluster Computing,
22(3), 7241-7255.

Idris, Adnan, Khan, Asifullah, & Lee, Yeon Soo. (2012). Genetic programming and
adaboosting based churn prediction for telecom. Paper presented at the 2012 IEEE
international conference on Systems, Man, and Cybernetics (SMC).
Jadhav, Rahul J, & Pawar, Usharani T. (2011). Churn prediction in telecommunication using
data mining technology. International Journal of Advanced Computer Science and
Applications, 2(2).
Jeyakarthic, M, & Venkatesh, S. (2020). An Effective Customer Churn Prediction Model using
Adaptive Gain with Back Propagation Neural Network in Cloud Computing
Environment. Journal of Research on the Lepidoptera, 51(1), 386-399.
Karahoca, Adem, & Karahoca, Dilek. (2011). GSM churn management by using fuzzy c-means
clustering and adaptive neuro fuzzy inference system. Expert Systems with
Applications, 38(3), 1814-1822.
Karvana, Ketut Gde Manik, Yazid, Setiadi, Syalim, Amril, & Mursanto, Petrus. (2019).
Customer churn analysis and prediction using data mining models in banking
industry. Paper presented at the 2019 International Workshop on Big Data and
Information Security (IWBIS).
Kavitha, V, Kumar, G Hemanth, Kumar, SM, & Harish, M. (2020). Churn prediction of
customer in telecom industry using machine learning algorithms. International
Journal of Engineering Research and Technology (IJERT), 9(5), 181-184.
Keramati, Abbas, & Ardabili, Seyed MS. (2011). Churn analysis for an Iranian mobile
operator. Telecommunications Policy, 35(4), 344-356.
Keramati, Abbas, Jafari-Marandi, Ruholla, Aliannejadi, Mohammad, Ahmadian, Iman,
Mozaffari, Mahdieh, & Abbasi, Uldoz. (2014). Improved churn prediction in
telecommunication industry using data mining techniques. Applied Soft Computing,
24, 994-1012.
Kirui, Clement, Hong, Li, Cheruiyot, Wilson, & Kirui, Hillary. (2013). Predicting customer
churn in mobile telephony industry using probabilistic classifiers in data mining.
International Journal of Computer Science Issues (IJCSI), 10(2 Part 1), 165.
Kouzehgar, Maryam, Badamchizadeh, Mohammadali, & Feizi-Derakhshi, Mohammad-Reza.
(2015). Ant-inspired fuzzily deceptive robots. IEEE Transactions on Fuzzy Systems,
24(2), 374-387.
Labhsetwar, Shreyas Rajesh. (2020). Predictive analysis of customer churn in telecom
industry using supervised learning. ICTACT Journal on Soft Computing, 10(2), 2054-
2060.

Lazarov, Vladislav, & Capota, Marius. (2007). Churn prediction. Bus. Anal. Course. TUM
Comput. Sci, 33, 34.
Lemmens, Aurélie, & Croux, Christophe. (2006). Bagging and boosting classification trees to
predict churn. Journal of Marketing Research, 43(2), 276-286.
Leventhal, Barry. (2018). Predictive Analytics for Marketers: Using Data Mining for Business
Advantage: Kogan Page Publishers.
Lu, Junxiang. (2002). Predicting customer churn in the telecommunications industry––An
application of survival analysis modeling using SAS. Paper presented at the SAS User
Group International (SUGI27) Online Proceedings.
Lu, Ning, Lin, Hua, Lu, Jie, & Zhang, Guangquan. (2012). A customer churn prediction model
in telecom industry using boosting. IEEE Transactions on Industrial Informatics,
10(2), 1659-1665.
Makhtar M, Nafis S, MA, MK, MNA, & Deris MM. (2017). Churn classification
model for local telecommunication company based on rough set theory. Journal of
Fundamental and Applied Sciences, 9(6S), 854-868.
Maldonado, Sebastián, Flores, Álvaro, Verbraken, Thomas, Baesens, Bart, & Weber, Richard.
(2015). Profit-based feature selection using support vector machines–General
framework and an application for customer retention. Applied Soft Computing, 35,
740-748.
Mitkees, Ibrahim MM, Badr, Sherif M, & ElSeddawy, Ahmed Ibrahim Bahgat. (2017).
Customer churn prediction model using data mining techniques. Paper presented at
the 2017 13th International Computer Engineering Conference (ICENCO).
Nath, Shyam V, & Behara, Ravi S. (2003). Customer churn analysis in the wireless industry: A
data mining approach. Paper presented at the Proceedings-annual meeting of the
decision sciences institute.
Niken, P, & Ohwada, H. (2014). Applicability of machine-learning techniques in predicting
customer defection. Paper presented at the Technology Management and Emerging
Technologies (ISTMET), 2014 International Symposium on. IEEE.
Óskarsdóttir, María, Bravo, Cristián, Verbeke, Wouter, Sarraute, Carlos, Baesens, Bart, &
Vanthienen, Jan. (2017). Social network analytics for churn prediction in telco:
Model building, evaluation and network architecture. Expert Systems with
Applications, 85, 204-220.

Oyeniyi, AO, & Adeyemo, AB. (2015). Customer churn analysis in
banking sector using data mining techniques. Afr J Comput ICT, 8(3), 165-174.
Pamina, Jeyakumar, Raja, Beschi, SathyaBama, S, Sruthi, MS, & VJ, Aiswaryadevi. (2019). An
effective classifier for predicting churn in telecommunication. Jour of Adv Research
in Dynamical & Control Systems, 11.
Pendharkar, Parag C. (2009). Genetic algorithm based neural network approaches for
predicting churn in cellular wireless network services. Expert Systems with
Applications, 36(3), 6714-6720.
Petkovski, Aleksandar J, Stojkoska, Biljana L Risteska, Trivodaliev, Kire V, & Kalajdziski,
Slobodan A. (2016). Analysis of churn prediction: A case study on telecommunication
services in Macedonia. Paper presented at the 2016 24th Telecommunications
Forum (TELFOR).
Prasad, U Devi, & Madhavi, S. (2012). Prediction of churn behavior of bank customers using
data mining tools. Business Intelligence Journal, 5(1), 96-101.
Qureshi, Saad Ahmed, Rehman, Ammar Saleem, Qamar, Ali Mustafa, Kamal, Aatif, &
Rehman, Ahsan. (2013). Telecommunication subscribers' churn prediction model
using machine learning. Paper presented at the Eighth international conference on
digital information management (ICDIM 2013).
Rattanathavorn, Kanok, & Premchaiswadi, Wichian. (2015). Analysis of customer behavior in
a call center using fuzzy miner. Paper presented at the 2015 13th International
Conference on ICT and Knowledge Engineering (ICT & Knowledge Engineering 2015).
Rosa, Nelson Belém da Costa. (2019). Gauging and foreseeing customer churn in the banking
industry: a neural network approach.
Rygielski, Chris, Wang, Jyun-Cheng, & Yen, David C. (2002). Data mining techniques for
customer relationship management. Technology in society, 24(4), 483-502.
Sharma, Kumar Panigrahi. (2011). A Neural Network based Approach for Predicting
Customer Churn in Cellular Network Services. International Journal of Computer
Applications, 27(11).
Sharma, Sidhu. (2018). Customer Relationship Management Using Clustering, Classification
Technique. Journal of Computer Engineering, 20(5).

Shirazi, Farid, & Mohammadi, Mahbobeh. (2019). A big data analytics model for customer
churn prediction in the retiree segment. International Journal of Information
Management, 48, 238-253.
Sivasankar, E, & Vijaya, J. (2019). Hybrid PPFCM-ANN model: an efficient system for
customer churn prediction through probabilistic possibilistic fuzzy clustering and
artificial neural network. Neural Computing and Applications, 31(11), 7181-7200.
Sjarif, NN, Rusydi, M, Yusof, M, Hooi, D, Wong, T, Ya’akob, S, . . . Osman, MZ. (2019). A
customer Churn prediction using Pearson correlation function and K nearest
neighbor algorithm for telecommunication industry. Int. J. Advance Soft Compu.
Appl, 11(2).
Sundarkumar, G Ganesh, Ravi, Vadlamani, & Siddeshwar, V. (2015). One-class support vector
machine based undersampling: Application to churn prediction and insurance fraud
detection. Paper presented at the 2015 IEEE International Conference on
Computational Intelligence and Computing Research (ICCIC).
Tsai, Chih-Fong, & Chen, Mao-Yuan. (2010). Variable selection by association rules for
customer churn prediction of multimedia on demand. Expert Systems with
Applications, 37(3), 2006-2015.
Tsai, Chih-Fong, & Lu, Yu-Hsin. (2009). Customer churn prediction by hybrid neural networks.
Expert Systems with Applications, 36(10), 12547-12553.
Tsai, Hsu-Hao. (2011). Research trends analysis by comparing data mining and customer
relationship management through bibliometric methodology. Scientometrics, 87(3),
425-450.
Ullah, Irfan, Raza, Basit, Malik, Ahmad Kamran, Imran, Muhammad, Islam, Saif Ul, & Kim,
Sung Won. (2019). A churn prediction model using random forest: analysis of
machine learning techniques for churn prediction and factor identification in
telecom sector. IEEE Access, 7, 60134-60149.
Umayaparvathi, V, & Iyakutti, K. (2016). A survey on customer churn prediction in telecom
industry: Datasets, methods and metrics. International Research Journal of
Engineering and Technology (IRJET), 3(04).
Vadakattu, Ramakrishna, Panda, Bibek, Narayan, Swarnim, & Godhia, Harshal. (2015).
Enterprise subscription churn prediction. Paper presented at the 2015 IEEE
International Conference on Big Data (Big Data).

Vafeiadis, Thanasis, Diamantaras, Konstantinos I, Sarigiannidis, George, & Chatzisavvas, K Ch.
(2015). A comparison of machine learning techniques for customer churn prediction.
Simulation Modelling Practice and Theory, 55, 1-9.
Verbraken, Thomas, Verbeke, Wouter, & Baesens, Bart. (2014). Profit optimizing customer
churn prediction with Bayesian network classifiers. Intelligent Data Analysis, 18(1),
3-24.
Wang, Hsiao-Fan, & Hong, Wei-Kuo. (2006). Managing customer profitability in a
competitive market by continuous data mining. Industrial marketing management,
35(6), 715-723.
Wang, Qiu-Feng, Xu, Mirror, & Hussain, Amir. (2019). Large-scale ensemble model for
customer churn prediction in search ads. Cognitive Computation, 11(2), 262-270.
Xia, Guo-en, & Jin, Wei-dong. (2008). Model of customer churn prediction on support vector
machine. Systems Engineering-Theory & Practice, 28(1), 71-77.
Yanfang, Qiu, & Chen, Li. (2017). Research on E-commerce user churn prediction based on
logistic regression. Paper presented at the 2017 IEEE 2nd Information Technology,
Networking, Electronic and Automation Control Conference (ITNEC).
Zhao, Long, Gao, Qian, Dong, XiangJun, Dong, Aimei, & Dong, Xue. (2017). K-local maximum
margin feature extraction algorithm for churn prediction in telecom. Cluster
Computing, 20(2), 1401-1409.

APPENDIXES
A: SOURCE CODE OF PROPOSED METHOD

# importing libraries
import pandas as pd
import xgboost as xgb
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.model_selection import train_test_split
from sklearn.metrics import (balanced_accuracy_score, accuracy_score,
                             precision_score, recall_score, f1_score,
                             confusion_matrix, ConfusionMatrixDisplay)

plt.style.use('ggplot')

# loading dataset
df = pd.read_csv('churnData.csv')

# ensuring that monthly usage is numeric
df['monthly_usage_rs'] = pd.to_numeric(df['monthly_usage_rs'])

# converting all the spaces in the data to "_"
df.replace(' ', '_', regex=True, inplace=True)

# converting Churn values Yes to 1 and No to 0, stored as integers for XGBoost
df.loc[df['Churn'] == 'Yes', 'Churn'] = 1
df.loc[df['Churn'] == 'No', 'Churn'] = 0
df['Churn'] = df['Churn'].astype(int)

# making X dataset without Churn
X = df.drop('Churn', axis=1).copy()

# making y dataset, i.e. Churn
y = df['Churn'].copy()

# one-hot encoding of the categorical columns of X for the XGBoost implementation
X_encoded = pd.get_dummies(X, columns=[
    'Gender', 'Network', 'relation_durattion', 'service_type',
    'source_of_attraction', 'prferred_communication',
    'satisfy_with_call_rate', 'contact_with_cust_care',
    'Happy_with_intrnet_quality', '4G_available',
    'rate_of_cust_care_calling', 'reascon_of_contact_cc',
    'satisfy_with_voice_quality', 'satisfy_with_signal_strength',
    'pleaseed_with_prices', 'satisfy_with_billing_process',
    'recommend_to_others'], drop_first=True)

# making train and test data sets from X and y
X_train, X_test, y_train, y_test = train_test_split(X_encoded, y, random_state=42, stratify=y)

# implementing the eXtreme Gradient Boosting algorithm, i.e. XGBoost
# (xgboost >= 1.6 takes eval_metric and early_stopping_rounds in the constructor;
# missing=1 tells XGBoost to treat the value 1 as a missing value)
clf = xgb.XGBClassifier(objective='binary:logistic', missing=1, random_state=42,
                        eval_metric='aucpr', early_stopping_rounds=10,
                        max_depth=1, min_child_weight=1, colsample_bytree=1)
clf.fit(X_train, y_train, verbose=True, eval_set=[(X_test, y_test)])

# making predictions (class labels 0 or 1)
y_pred = clf.predict(X_test)

# converting y_test to a 1D array for the metrics and the confusion matrix
ytest_array = y_test.to_numpy()

# balanced accuracy
bal_accuracy = balanced_accuracy_score(ytest_array, y_pred)
print("Balanced Accuracy: %.2f%%" % (bal_accuracy * 100.0))

# accuracy
accuracy = accuracy_score(ytest_array, y_pred)
print("Accuracy: %.2f%%" % (accuracy * 100.0))

# precision
precision = precision_score(ytest_array, y_pred)
print("Precision: %.2f%%" % (precision * 100.0))

# recall
recall = recall_score(ytest_array, y_pred)
print("Recall: %.2f%%" % (recall * 100.0))

# F1 score
f1 = f1_score(ytest_array, y_pred)
print("F1 Score: %.2f%%" % (f1 * 100.0))

# making the confusion matrix
print(confusion_matrix(y_true=ytest_array, y_pred=y_pred))

# plotting the confusion matrix
ConfusionMatrixDisplay.from_estimator(clf, X_test, ytest_array, values_format='d',
                                      display_labels=['Did not Leave', 'Left'])
plt.show()

# Trend analysis: visualization of graphs between Churn and different attributes
# each entry is (column to plot on the x axis, plot title)
plots = [
    ('Network', 'Customer Churn and network we use'),
    ('pleaseed_with_prices', 'Customer Churn and Pleased with Prices of Packages offer by network'),
    ('4G_available', 'Customer Churn and 4G available'),
    ('Happy_with_intrnet_quality', 'Customer Churn and Happy with Internet quality'),
    ('satisfy_with_signal_strength', 'Customer Churn and satisfy with Signal Strength use'),
    ('satisfy_with_call_rate', 'Customer Churn and Satisfaction with Call rate'),
    ('relation_durattion', 'Customer Churn and relation duration'),
    ('service_type', 'Customer Churn and Service Type'),
    ('recommend_to_others', 'Customer Churn and Recommendation to others'),
]

# one bar plot of mean churn per attribute value, coloured by network
for column, title in plots:
    plt.figure(figsize=(8, 6))
    sns.barplot(x=column, y='Churn', hue='Network', data=df)
    plt.title(title)
    plt.show()
