
Research Article

Received 8 September 2013, Revised 8 June 2014, Accepted 8 June 2014 Published online 10 July 2014 in Wiley Online Library

(wileyonlinelibrary.com) DOI: 10.1002/asmb.2053

An improved customer lifetime value model based on Markov chain
Mohamed Ben Mzoughia (a)*† and Mohamed Limam (a, b)
Firms are increasingly looking to provide a satisfactory prediction of customer lifetime value (CLV), a determining metric to target
future profitable customers and to optimize marketing resources. One of the major challenges associated with the measurement
of CLV is the choice of the appropriate model for predicting customer value because of the large number of models proposed in
the literature. Earlier models to forecast CLV are relatively unsuccessful, whereas simple models often provide results which are
equivalent or even better than sophisticated ones. To predict CLV, Rust et al. (2011) proposed a framework model that performs
better than simple managerial heuristic models, but its implementation excludes cases where customer’s profit is negative and does
not handle lost-for-good situations. In this paper, we propose a modified model that handles both negative and positive profits based
on Markov chain model (MCM), hence offering a greater flexibility by covering always-a-share and lost-for-good situations. The
proposed model is compared with the Pareto/Negative Binomial Distribution (Pareto/NBD), the Beta Geometric/Negative Binomial
Distribution (BG/NBD), the MCM, and the Rust et al. (2011) models. Based on customer credit card transactions provided by a
North African retail bank, an empirical study shows that the proposed model has better forecasting performance than competing
models. Copyright © 2014 John Wiley & Sons, Ltd.

Keywords: customer lifetime value; forecasting; Markov chain model

1. Introduction

Customer lifetime value (CLV) is a customer level metric used to target profitable customers and optimize marketing
resources [1]. Pfeifer et al. [2] define CLV as the present value of the future cash flow associated with a customer. For
firms, it is interesting to predict future profitability for each customer during each period of his lifetime with the firm,
knowing that some customers could become more profitable over time, whereas others could turn out to be less profitable.
The predicted CLV can then be used to make marketing actions more effective and efficient. While traditional marketing
metrics are unable to show a return on marketing investment, targeting customers based on their predicted values can
help firms get an improved return on their marketing investment. Also, such diagnostics are not possible from aggregate
financial measures.
Research on CLV measurement has focused on particular contexts. Jackson [3] identified two major categories of merchants:
lost-for-good and always-a-share. The lost-for-good context assumes that a customer is either totally committed to
the merchant or totally lost and then committed to some other merchant. In this context, a retention rate is estimated,
generally from past data. In the always-a-share context, the customer can easily try new vendors. Switching costs are
a major factor in distinguishing one behavior from the other.
Given the large number of CLV models proposed in the literature, and the specificities of different industries, firms are
looking to provide a satisfactory prediction of the CLV metric for each customer [4]. The challenge associated with the
measurement of CLV is the choice of an adequate model to predict it.
The first CLV model, named Pareto/NBD, was proposed by Schmittlein et al. [5] and extended by Schmittlein and
Peterson [6]. This model has the advantage of using only three variables to predict the number of transactions per customer
at each future point. It belongs to the lost-for-good context because it assumes that customer lifetime
follows an exponential distribution. The Pareto/NBD is a well-known model, even though it is difficult to implement

a LARODEC, ISG, University of Tunis, Tunisia


b Dhofar University, Oman and University of Tunis, Tunisia

*Correspondence to: Mohamed Ben Mzoughia, LARODEC, ISG, University of Tunis, Tunisia.
† E-mail: mohamed.mzoughia@gmail.com

Copyright © 2014 John Wiley & Sons, Ltd. Appl. Stochastic Models Bus. Ind. 2015, 31 528–535
M. B. MZOUGHIA AND M. LIMAM

because of computational challenges related to parameter estimation. A simpler alternative to the Pareto/NBD model is
proposed by Fader et al. [7] for predicting customers' future purchases based on past purchase behavior. Their model, named
beta-geometric/NBD (BG/NBD), is easily implemented via maximum likelihood estimation.
In the same vein, Rust et al. [8] indicated that earlier models to forecast CLV have been relatively unsuccessful. They
argue that comparatively simple models can often provide results that are equivalent to or even better than those
of sophisticated models. They proposed a new model framework to measure CLV, called hereafter the Rust et al. Model
Framework (RMF), that performs considerably better than simple managerial heuristic models. The RMF considers
specific links between the different factors used to compute the CLV, such as purchase behavior Y, customer characteristics D,
marketing contacts X, purchase propensity 𝜙, gross profit 𝜋, and control variables Z.
The implementation of RMF assumes that the gross profit 𝜋 is always positive and adopts a simulation approach to
estimate purchase incidence. This methodology is appropriate for an always-a-share context but cannot be implemented in
a lost-for-good context, where customers leave the company for good after some period of inactivity.
In this paper, we develop and apply a new CLV model based on the RMF and on a Markov chain model (MCM),
handling both negative and positive profits and estimating CLV in the always-a-share context as well as in the lost-for-good context.
The remainder of this paper is organized as follows. Section 2 details the proposed model. Section 3 applies the model
to a validation sample data set and compares its performance with other methods. Finally, Section 4 presents a conclusion.

2. The proposed model

The basic formula for calculating CLV for customer i and a finite time horizon (T) introduced by Berger and Nasr [9] is


\[
CLV_i = \sum_{t=1}^{T} \frac{P_{it} - X_{it}}{(1 + d)^{t}} \tag{1}
\]

where d is the discount rate, Pit is the gross profit generated from the customer’s relationship at time t, and Xit is the variable
marketing cost.
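
As an illustration of Equation (1), the following minimal sketch discounts a stream of per-period profits and marketing costs. The function name `clv` and the sample figures are hypothetical and are not taken from the study.

```python
def clv(profits, costs, d):
    """Discounted sum of (P_it - X_it) over periods t = 1..T, as in Equation (1).

    profits and costs are per-period sequences of equal length;
    d is the per-period discount rate.
    """
    return sum(
        (p - x) / (1 + d) ** t
        for t, (p, x) in enumerate(zip(profits, costs), start=1)
    )


# Hypothetical customer: three quarters of profit and marketing cost.
example = clv([20.0, 18.0, 25.0], [5.0, 5.0, 5.0], d=0.007906)
print(round(example, 2))
```

A negative result is possible whenever discounted costs exceed discounted profits.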
We need to calculate Pit for each customer i in time t. Using Bayes’ theorem, Pit can be defined as follows:

\[
P_{it} = P(\pi_{it}, Pur_{it})\,\pi_{it} = P(\pi_{it} \mid Pur_{it} = 1)\,\pi_{it}\,P(Pur_{it} = 1) \tag{2}
\]

where 𝜋it is the gross profit and Purit is the customer purchase incidence. Purit is measured as

\[
Pur_{it} =
\begin{cases}
1 & \text{when the customer makes a purchase} \\
0 & \text{otherwise}
\end{cases}
\tag{3}
\]

Our proposed model measures separately customer gross profit P(𝜋it ∕Purit = 1)𝜋it and customer purchase incidence
P(Purit = 1). Using the framework proposed by Rust et al. [8], we model customer gross profit as a linear combination of
estimated customer marketing contact X̂ it , customer past purchase behavior Yit−1 , customer characteristics Dit , and control
variables Zt

\[
P(\pi_{it} \mid Pur_{it} = 1)\,\pi_{it} = C_0 + C_1 Y_{it-1} + C_2 Z_t + C_3 \hat{X}_{it} + V_0 D_{it} \tag{4}
\]

where C0 , C1 , C2 , C3 , and V0 are regression parameters.


Unlike the RMF formulation, which assumes that profits are always positive, Equation (4) can estimate both positive
and negative profits. Marketing contacts Xit are estimated as a linear function of the past purchase behavior, past marketing
contacts, and customer characteristics.
To estimate customer purchase incidence P(Purit = 1), we propose to use an MCM to model both customer relationship
and the CLV. MCM provides flexibility by supporting both lost-for-good and always-a-share situations; moreover, it has
been used effectively in several areas including marketing [10, 11]. MCM is a probabilistic model that predicts purchase
probabilities while considering the firm's future relationship with each customer, and it is appropriate for allocating marketing efforts [12].
The Markov chain model considers n possible states of the relationship between the firm and each customer. Each state
corresponds to a recency value, that is, the number of periods since the customer's last purchase. If a customer makes a purchase, he moves to the first state; otherwise, he moves to the next state. The

last state is an absorbing state, from which the customer cannot move to any other state. Figure 1 gives a graphical representation
of MCM with five states.


Figure 1. Graphical representation of the Markov chain model with five states.

The probabilities of moving from one state to another in a single period are called transition probabilities and can be
calculated for each customer using historical data. The matrix P is constructed from transition probabilities representing a
one-step transition matrix given by

\[
P = \begin{pmatrix}
P_1 & 1 - P_1 & 0 & 0 & 0 \\
P_2 & 0 & 1 - P_2 & 0 & 0 \\
P_3 & 0 & 0 & 1 - P_3 & 0 \\
P_4 & 0 & 0 & 0 & 1 - P_4 \\
0 & 0 & 0 & 0 & 1
\end{pmatrix}
\]

$P^t$ is the t-step transition matrix, defined to be the matrix of probabilities of moving from one state to another in exactly
t periods.
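
The construction of the one-step matrix and its t-step power can be sketched in plain Python as follows. The purchase probabilities used here are hypothetical placeholders, and the helper names `transition_matrix` and `t_step` are introduced only for illustration.

```python
def transition_matrix(p):
    """One-step matrix for purchase probabilities p = [P1, ..., P_{n-1}].

    Row r (0-based) sends the customer to state 1 with probability p[r]
    and to the next state otherwise; the last state is absorbing.
    """
    n = len(p) + 1
    P = [[0.0] * n for _ in range(n)]
    for r, pr in enumerate(p):
        P[r][0] = pr
        P[r][r + 1] = 1.0 - pr
    P[n - 1][n - 1] = 1.0
    return P


def matmul(A, B):
    """Plain matrix product of two square lists of lists."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def t_step(P, t):
    """t-step transition matrix P^t, computed by repeated multiplication."""
    n = len(P)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for _ in range(t):
        result = matmul(result, P)
    return result


# Hypothetical probabilities of purchase from recency states 1 to 4.
P = transition_matrix([0.9, 0.5, 0.3, 0.1])
P3 = t_step(P, 3)
print([round(v, 4) for v in P3[0]])  # distribution after 3 periods, starting in state 1
```

Each row of the result remains a probability distribution, and the row of the absorbing state never changes.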
The specific steps to measure purchase incidence are as follows:
(1) Initialize $R_{t-1}$ to the last recency value (1, 2, 3, 4, or 5); for example, if the last purchase occurred within the last
period (week, month, or quarter), $R_{t-1} = 1$.
(2) Compute the t-step transition matrix $P^t$. The element $P^t_{R_{t-1},1}$, in the first column and $R_{t-1}$th row, is the probability of
transition from the current state to state 1, which is the probability that a customer makes a purchase during the tth period.
(3) Generate a uniform random variable $u \sim U[0,1]$. If $P^t_{R_{t-1},1} > u$, then the purchase incidence $P(Pur_{it} = 1) = 1$, and it
is zero otherwise.
(4) Update the recency value as follows:

\[
R_t =
\begin{cases}
1 & \text{when } P(Pur_{it} = 1) = 1 \\
\min(R_{t-1} + 1, 5) & \text{otherwise}
\end{cases}
\tag{5}
\]
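
The steps above can be sketched as a small simulation. This is one possible reading, not the authors' implementation (which was written in Visual Basic): because the recency value is updated after every period, the sketch draws each period from the one-step purchase probability of the current state rather than from $P^t$. The probabilities and the function name `simulate_incidence` are hypothetical.

```python
import random


def simulate_incidence(p, last_recency, horizon, rng):
    """Simulate purchase incidence (0 or 1) for `horizon` future periods.

    p            -- purchase probabilities [P1, ..., P_{n-1}] for states 1..n-1
    last_recency -- starting state R (1-based); state n is absorbing
    rng          -- a random.Random instance, passed in for reproducibility
    """
    n = len(p) + 1
    r = last_recency
    incidence = []
    for _ in range(horizon):
        prob = p[r - 1] if r < n else 0.0  # no purchase from the absorbing state
        u = rng.random()                   # step (3): uniform draw on [0, 1)
        purchase = 1 if prob > u else 0
        incidence.append(purchase)
        r = 1 if purchase else min(r + 1, n)  # step (4), Equation (5)
    return incidence


path = simulate_incidence([0.9, 0.5, 0.3, 0.1], last_recency=1, horizon=12,
                          rng=random.Random(42))
print(path)
```

A customer who starts in, or reaches, the last state never purchases again, which is how the chain represents the lost-for-good case.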

Using Equation (2), the predicted customer profit Pit can be calculated as

\[
P_{it} = (C_0 + C_1 Y_{it-1} + C_2 Z_t + C_3 \hat{X}_{it} + V_0 D_{it})\,P(Pur_{it} = 1). \tag{6}
\]
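
A minimal sketch of Equation (6) follows, treating each explanatory variable as a single number. The coefficient values and inputs are purely hypothetical (the estimated regression parameters are not reported here), and `predicted_profit` is an illustrative name.

```python
def predicted_profit(coef, y_prev, z, x_hat, d_char, purchase):
    """Equation (6): linear gross-profit term times simulated purchase incidence.

    coef holds the regression parameters C0, C1, C2, C3 and V0;
    purchase is the 0/1 incidence drawn from the Markov chain.
    """
    gross = (coef["C0"] + coef["C1"] * y_prev + coef["C2"] * z
             + coef["C3"] * x_hat + coef["V0"] * d_char)
    return gross * purchase


# Hypothetical coefficients; a negative intercept lets the profit go below zero.
coef = {"C0": -2.0, "C1": 0.1, "C2": 0.5, "C3": 1.5, "V0": 0.002}
print(predicted_profit(coef, y_prev=100.0, z=1.0, x_hat=2.0, d_char=1000.0, purchase=1))
print(predicted_profit(coef, y_prev=0.0, z=1.0, x_hat=0.0, d_char=0.0, purchase=1))
```

The second call returns a negative value, which the linear form permits.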

Knowing gross profit Pit , variable marketing cost Xit , and the discount rate d, the proposed model predicts CLV for
each customer at any future time t, using the RMF and MCM customer relationship modeling. Future customer
profit is measured using RMF, which provides an excellent conceptual framework of relations linking explanatory factors
of profitability. MCM, meanwhile, allows us to predict purchase incidence, handling both lost-for-good and always-a-share
contexts. After explaining the model, we describe its application to a real-world data set. We then compare the performance
of the proposed model with other methods of CLV measurement.

3. Empirical application

3.1. Data
The retail banking data set is provided by a major North African retail bank. The data set contains customer card transaction
data from January 2006 to December 2011. We obtained the data about customer characteristics from the bank's

CRM system, such as customer income, customer age, and customer category. The total number of transactions is 1,395,226,
made by 12,709 customers.


The bank makes profits when customers use their cards for purchase. The profit generated by the bank is a variable
percentage of the amount of the purchase made by a customer minus the direct charges related to transactions paid to
multinational financial services providers such as Visa and MasterCard. In some cases, profit generated by the bank may be
negative. When computing the CLV, we use quarterly periods. The choice of this time unit is related both to the small
number of transactions done by customers during one quarter and to the marketing costs, which cannot be calculated for
a period of less than 3 months. The discount rate is taken as the weighted average cost of capital of 3.2% yearly, giving a
quarterly discount rate of d = 0.7906%.
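
The quarterly rate follows from compounding: $(1 + 0.032)^{1/4} - 1 \approx 0.7906\%$. A short check, with `periodic_rate` as an illustrative helper name:

```python
def periodic_rate(annual_rate, periods_per_year):
    """Per-period rate that compounds to the given annual rate."""
    return (1 + annual_rate) ** (1 / periods_per_year) - 1


d = periodic_rate(0.032, 4)
print(f"{d:.4%}")  # 0.7906%
```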
In this study, we used EViews software to perform regression analysis and Microsoft Visual Basic to run model steps and
to predict the individual CLV. The data set is extracted from the CRM database in Oracle 10g. We selected customers who
have made their first transaction between January 2006 and December 2008. The selected cohort contains 8224 customers
with 976,141 transactions. The CLV prediction is made from January 2009 to December 2011. As shown in
Figure 2, the first 12 quarters data (years 2006, 2007, and 2008) are used to estimate model parameters. For the next 12
quarters (years 2009, 2010, and 2011), data are used both to validate our model and to make a comparison with other
models.
Table I gives the complete information required to compare our model to previously cited models. Marketing actions
consist of both periodic mailing and occasional initiatives to encourage customers to use their cards to make purchases.
Marketing costs are distributed over cards and may change over periods and cardholders.
Profits and purchase behavior are obtained from the transactional database. The selected indicator representing purchase
behavior is the customer purchases amount per period. Whereas marketing costs are allocated in a non-random way, both
purchase timing and purchase amounts do not happen continuously or at known periods and can only be predicted.
By analyzing customer profit as a function of the same customer’s characteristics, we notice from Figure 3 that the
average profit per customer increases with the customer's income. This agrees with RMF, in which profit depends on the
customer's characteristics.

Figure 2. Data analysis timeline.

Table I. Data required for model comparison.

Model                    Data needed
Pareto/NBD and BG/NBD    - Cohort: time from the entry of the customer to the company until now
                         - Frequency: gross number of transactions made by the customer
                         - Recency: time between the entry date and the last purchase date
                         - Average profit per transaction
                         - Average costs per customer

MCM                      - Transactions made per customer per time unit (Boolean value)
                         - Average profits per time unit if the customer makes a transaction
                         - Average costs per customer

RMF                      - Profits, marketing costs, and purchase behavior per time unit and per customer
                         - Customer characteristics (age, incomes...)
                         - Control variables per time unit

Proposed model           Both RMF and MCM models data

MCM, Markov chain model.


Figure 3. Average profit as function of customer incomes.

Figure 4. Probability to make purchase after n quarters of inactivity.

Using historical data, we calculate the probability that a customer makes a purchase after n periods of inactivity. As shown
in Figure 4, if a customer does not make any transaction during one quarter, the probability of making one in the next quarter
is 99%. After five quarters of inactivity, however, this probability decreases to 7%. Customer purchase incidence therefore
depends on customer recency, which justifies the use of the MCM to predict customer purchase incidence.
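
One simple way to obtain such probabilities is to count, for each recency state, how often a purchase follows in the next period. The sketch below assumes each customer history is a list of 0/1 purchase flags per quarter; this data layout, the toy histories, and the function name are assumptions, not a description of the bank's data.

```python
from collections import defaultdict


def purchase_prob_by_recency(histories, max_state=5):
    """Empirical probability of purchasing next period, by current recency state.

    State 1 means the customer purchased in the most recent period;
    state k means k - 1 periods without a purchase, capped at max_state.
    """
    seen = defaultdict(int)
    bought = defaultdict(int)
    for history in histories:
        recency = None  # undefined until the first observed purchase
        for purchase in history:
            if recency is not None:
                seen[recency] += 1
                bought[recency] += purchase
            if purchase:
                recency = 1
            elif recency is not None:
                recency = min(recency + 1, max_state)
    return {state: bought[state] / seen[state] for state in sorted(seen)}


# Two toy histories (hypothetical), one flag per quarter.
probs = purchase_prob_by_recency([[1, 1, 0, 1], [1, 0, 0, 0]])
print(probs)
```

The resulting values play the role of $P_1, \ldots, P_4$ in the one-step transition matrix.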

3.2. Purchase incidence and profitability estimation


To compare predicted purchase incidence using proposed model and RMF for 12 quarters from January 2009 till December
2011, we count for each model the number of customers by average purchase incidence as

\[
P(Pur_i = 1) = \frac{1}{12} \sum_{t=1}^{12} P(Pur_{it} = 1). \tag{7}
\]

Using real data, the purchase incidence is calculated as a function of customer’s purchases during the 12 quarters of the
prediction period. If a customer i makes a purchase in each of the 12 quarters, its incidence is equal to 1. However, if he
makes n purchases in 12 quarters, the incidence is n/12.
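
Equation (7) amounts to averaging the 0/1 incidence over the 12 quarters. A minimal sketch with hypothetical purchase flags:

```python
def average_incidence(purchases):
    """Equation (7): mean of per-period purchase incidence (0 or 1)."""
    return sum(purchases) / len(purchases)


# Hypothetical customer who purchased in 3 of the 12 quarters.
flags = [1, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0]
print(average_incidence(flags))  # 0.25
```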
Figure 5 shows that the purchase incidence calculated with the proposed model fits very well with real data and confirms
the ability of MCM approach to accurately model the relationship of the customers with the company. However, RMF
displays a large number of customers with a purchase incidence less than 25% because of the simulation approach used to
measure incidence.
In the second step, we calculated the predicted gross profit generated by the customer in case of purchase P(𝜋it ∕Purit =
1)𝜋it using MCM, RMF, and the proposed model. For the Markov case, P(𝜋it ∕Purit = 1)𝜋it is the expected net contribution
to the company profits on the customers’ initial purchase and on each succeeding purchase. In that case, it is considered as
a constant value for each customer.
RMF's prediction of gross profit fills a real gap left by known models. The prediction is made using a
532

complete framework and considering customer marketing contacts, marketing purchase behavior, control variables, and
especially customer characteristics such as customer income, age, or category.


Figure 5. Number of customers per purchase incidence.

Figure 6. Predicted customer profit.

Table II. Gross profit and CLV statistics.


Statistics Gross profit CLV
Average 21.3 12.2
Median 16.9 8.1
Min −1.3 −11.2
Max 304.0 285.1
Standard deviation 21.7 20.1
Mean absolute deviation 15.8 14.4
CLV, customer lifetime value.

When we compare the average gross profit predicted by the three models, we notice from Figure 6 that the proposed
model is closer to reality and gives better results than RMF and MCM.

3.3. CLV prediction


Combining predicted customer purchase incidence and customer profitability using Equation (6) resulted in the CLV
presented in Table II.
To point out the advantage of the proposed model in predicting both customer profits and CLV, we compare the results
calculated from our model with those from the Pareto/NBD, the BG/NBD, the MCM, and RMF models. Note that we
considered the average gross profit to predict CLV using the BG/NBD and the Pareto/NBD models.
In the first step, we compare average predicted customer profits. Figure 7 shows the relative predictive performance of
the models. We plot the actual average profit at each period and compare it with the predictive performance of the proposed
model and the other models. The average customer profit is the mean profits generated from all selected customers.
Profit prediction is the most important challenge for the calculation of CLV, which is defined as the discounted difference

between profits and marketing costs. Generally, marketing costs are controlled by the company by allocating each year
the resources required for the development of its activity. Also, the prediction of these costs is difficult and may affect the


Figure 7. Predicted gross customer profit.

Figure 8. Average predicted CLV per customer.

Table III. Correct targeted customers per model.

Model            Targeted customers     Correct targeted customers      Proportion of correct
                 (predicted CLV > 0)    (predicted and real CLV > 0)    targeting (%)
Proposed model   3393                   2698                            80
MCM              5809                   3772                            65
BG/NBD           3395                   2830                            83
Pareto/NBD       3699                   2966                            80
RMF              8216                   4237                            52

CLV, customer lifetime value.

quality of prediction. In our approach, we used the RMF model to predict marketing costs as a linear function of the past
purchase behavior, past marketing contacts, and customer characteristics.
Figure 8 presents the average CLV per customer predicted by the BG/NBD, the Pareto/NBD, the MCM, the RMF models,
and our proposed model. We note that with fixed marketing costs, the CLV predicted by the BG/NBD model is positioned
below zero. We also find that our model deviates slightly from the actual CLV, but it is still the closest one. This gap
may narrow if marketing costs are fixed in advance for the whole prediction period.

3.4. Targeting customers


The main objective of predicting CLV at the customer level is to be able to target future profitable customers. To test the
performance of different models, we propose to identify customers considered profitable for each model and we test if
those customers will be profitable or not in the future.
If a model is used to target profitable customers, it should predict their CLV correctly in order to determine the optimal
allocation of resources for those customers. At this point, we calculate for each model the predicted CLV of targeted
customers and we compare it with the real CLV.
Comparing targeted customers across the different models, as shown in Table III, the BG/NBD, the Pareto/NBD models,
and our proposed model yield about 80% of correct targeted customers. RMF targets 8216 of the 8224 customers
selected in our study, yielding 52% of correct targeted customers.
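
The proportions in Table III are the ratio of correctly targeted customers to targeted customers. The following short check recomputes them from the counts reported in the table:

```python
# (targeted customers, correct targeted customers) from Table III.
table_iii = {
    "Proposed model": (3393, 2698),
    "MCM": (5809, 3772),
    "BG/NBD": (3395, 2830),
    "Pareto/NBD": (3699, 2966),
    "RMF": (8216, 4237),
}

proportions = {
    model: round(100 * correct / targeted)
    for model, (targeted, correct) in table_iii.items()
}
print(proportions)
```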


Table IV. CLV of targeted customers per model.

Model            Predicted CLV of       Real CLV of targeted    Difference (%)
                 targeted customers     customers
Proposed model   98,150                 87,342                  +12
MCM              58,206                 84,308                  −31
BG/NBD           29,661                 80,567                  −63
Pareto/NBD       48,427                 74,299                  −35
RMF              193,310                80,235                  +141

CLV, customer lifetime value.

For each model, we measure the CLV generated by the corresponding targeted customers and we compare the results to
the real CLV. From Table IV, we notice that our proposed model gives the best results, with the smallest difference from
real data, about +12%. The BG/NBD and the Pareto/NBD models, which correctly target 80% or more of their targeted
customers, give a CLV far from reality, by −63% and −35%, respectively.
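
The difference column of Table IV is the relative gap between predicted and real CLV of the targeted customers, (predicted − real) / real. Recomputing it from the figures reported in the table:

```python
# (predicted CLV, real CLV) of targeted customers from Table IV.
table_iv = {
    "Proposed model": (98150, 87342),
    "MCM": (58206, 84308),
    "BG/NBD": (29661, 80567),
    "Pareto/NBD": (48427, 74299),
    "RMF": (193310, 80235),
}

differences = {
    model: round(100 * (predicted - real) / real)
    for model, (predicted, real) in table_iv.items()
}
print(differences)
```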

4. Conclusion

Customer lifetime value is a key metric for any business activity. The difficulty encountered by companies when computing
the CLV is the choice of the appropriate model, which provides a satisfactory prediction of the CLV for each customer.
Rust et al. [8] proposed a framework to model links between factors used to measure CLV. Their model performs
better than simple managerial heuristic models; however, their model is limited to the always-a-share context and does not
support negative profits, which may be generated by important customers who could become more profitable in the future.
We propose a model based on both RMF and MCM models. Our proposed model offers significant advantages over existing
alternatives, by handling both positive and negative profits, and providing good forecasting performance. Also, MCM
approach is used to model customer relationship within a company and to measure purchase incidence for each customer.
The contribution of MCM is its flexibility to adopt both lost-for-good and always-a-share situations, unlike RMF which
considers only always-a-share situation. In our empirical study, we show that the proposed method has a better forecasting
performance than competing models.
Further work is needed to improve marketing cost prediction by including a forward-looking cost
allocation strategy. Future research may also explore the impact of competition on the customer behavior and on the CLV
measurement.

References
1. Kumar V, Reinartz WJ. Customer Relationship Management: A Databased Approach. John Wiley & Sons, Inc: New Jersey, 2006.
2. Pfeifer PE, Haskins ME, Conroy RM. Customer lifetime value, customer profitability and the treatment of acquisition spending. Journal of
Managerial Issues 2005; 17(1):11–25.
3. Jackson B. Winning and Keeping Industrial Customers. Lexington Books: Lexington, MA, 1985.
4. Venkatesan R, Kumar V. A customer lifetime value framework for customer selection and optimal resource allocation strategy. Journal of
Marketing 2004; 68(4):106–125.
5. Schmittlein DC, Morrison DG, Colombo R. Counting your customers: who are they and what will they do next? Management Science 1987;
33:1–24.
6. Schmittlein DC, Peterson RA. Customer base analysis: an industrial purchase process application. Marketing Science 1994; 13(Winter):41–67.
7. Fader PS, Hardie BGS, Lee KL. Counting your customers the easy way: an alternative to the Pareto/NBD model. Marketing Science 2005;
24(2):275–284.
8. Rust RT, Kumar V, Venkatesan R. Will the frog change into a prince? Predicting future customer profitability. International Journal of Research
in Marketing 2011; 28:281–294.
9. Berger PD, Nasr NI. Customer lifetime value: marketing models and applications. Journal of Interactive Marketing 1998; 12(1):17–30.
10. White DJ. A survey of applications of Markov decision processes. Journal of the Operational Research Society 1993; 44(11):1073–1096.
11. Bronnenberg BJ. Advertising frequency decisions in a discrete Markov process under a budget constraint. Journal of Marketing Research 1998;
35(3):399–406.
12. Pfeifer P, Carraway R. Modeling customer relationships as Markov chains. Journal of Interactive Marketing 2000; 14(2):43–55.
