Maurizio Toccu*
Alessandro Fassò*
SUMMARY
Customers are important intangible assets of a company, but they are not equally remunerative.
Therefore, to identify profitable customers, it is necessary to use a disaggregated metric called cus-
tomer lifetime value (CLV). This metric represents the present value of all future profits generated
by a customer. Using a known formula, it is possible to calculate customer lifetime value taking into
account the survival probability of each customer. Therefore, in this paper we develop a statistical
analysis of CLV using SAS. To illustrate this method we present a simple case study related to a tel-
ecommunications company. In particular, we start the study by analyzing the distribution of custo-
mers and revenues (a component of CLV) with respect to geographical area and size of customer.
We continue with an analysis of the distribution of CLV with respect to the above factors, and this
allows us to make a comparison, on average, with the distribution of revenues. Finally, we estimate
a regression model to detect the factors affecting CLV, thereby obtaining interesting results.
1. INTRODUCTION
Over the past decade, customer management has rapidly emerged as an important
area of marketing research. In particular, the transition from the traditional product-
centric approach to a customer-centric approach reflects the fact that customers are
not only critical assets (Blattberg and Deighton, 1991) but are actually the most im-
portant assets of a company. For this reason, over time, different economic and fi-
nancial metrics have been created to evaluate customers. Indeed, it may be advanta-
geous to select certain profitable customers or assign different resources to different
groups of customers. For example, Peppers and Rogers (1993) argue that selling
more goods to fewer people is not only more efficient but also more profitable.
However, such controls are not possible using aggregate financial measures. Custo-
mer lifetime value (CLV), in contrast, is a disaggregate metric that enables valuable
customers to be identified and resources to be allocated accordingly.
Specifically, CLV can be used for several purposes. The first of these is customer
acquisition. Indeed, CLV defines the upper bound of what one should spend to ac-
quire a customer (Berger and Nasr, 1998). The second is customer selection, that is,
the focus on customers with high CLV (Venkatesan and Kumar, 2004). Thirdly it
can be used for the purpose of resource allocation, with respect to more profitable
customers (Reinartz and Kumar, 2003).
As customers are not all equally valuable, there is growing interest on the part of
companies in performing quantitative marketing and statistical analysis based on
CLV. As a result, in recent years there has been an increase in the literature dedi-
cated to CLV analysis. Below we summarise significant recent literature on CLV.
Gupta, Hanssens, Hardie, Kahn, Kumar, Lin, Ravishanker and Sriram (2006)
have developed a new method for calculating CLV which uses survival methods to
estimate customer churn probability. We discuss these methods in Section 2.
Aeron, Bhaskar, Sundararajan, Kumar and Moorthy (2008) discuss the credit card
market. In particular, they argue that CLV estimation can help a credit card issuer in
making crucial decisions. The authors present a conceptual model for credit card rev-
enue and a metric for CLV designed for credit card customers.
Jen, Chou and Allenby (2009) write about the analysis of customer value in direct
marketing. Typically this analysis combines customer timing and quantity data into a
single statistic. This statistic is used to compute lifetime values in order to rank cus-
tomers for different actions and to identify prospects for cross-selling. However, cur-
rent models assume that purchase timing and quantity decisions are independent (i.e.
uncorrelated) over time, given individual-level parameters. The authors show that in
these models customer value calculations can be severely biased when timing and
quantity are dependently related. They thus propose some alternative models.
Kumar, Aksoy, Donkers, Venkatesan, Wiesel and Tillmanns (2010) argue that
customers can interact and create value for firms in different ways. Their article pro-
poses that assessing the value of customers based solely upon their transactions with
a firm may not be sufficient. Therefore they propose four components of customer
engagement value (CEV) for a firm: customer lifetime value, customer referral value,
value of customer influence and customer knowledge value.
With respect to CLV for telecommunications companies (TLCs), we cite Lu
(2003), who estimates CLV using the equation (1) proposed by Gupta et al. (2006).
Compared to that article, in this work we introduce a statistical analysis of CLV which examines its distribution with respect to geographical area and size of customer. Moreover, we use regression analysis to identify the factors that affect CLV, obtaining interesting results. Our analyses use SAS because it makes it possible to manage large datasets and to perform advanced statistical analyses.
The rest of this paper consists of the following sections. Section 2 introduces sur-
vival analysis techniques and CLV; Section 3 contains a statistical analysis of CLV
based on the dataset of a TLC company and relevant pieces of SAS code. Finally, in
Section 4 we present our conclusions.
2. STATISTICAL METHODS
In this paper we calculate CLV using the formula proposed by Gupta et al. (2006), as it allows CLV to be weighted by the survival probabilities of the customers of our TLC company.
Therefore, in the sections which follow we provide a brief introduction to the
concept of customer lifetime value and survival analysis techniques.
CLV is defined as the net present value of the cash flows attributed to the relationship with a customer over time. CLV resembles the discounted cash flow approach used in finance. However, it differs from it in two ways. Firstly, CLV is defined and estimated for each customer; that is, it is a disaggregate metric. Secondly, in contrast to the discounted cash flow approach, according to Gupta et al. (2006) CLV explicitly incorporates the possibility that a customer may defect to competitors in the future.
Specifically, Gupta et al. (2006) define CLV as follows:
CLV = \sum_{t=0}^{T} \frac{(p_t - c_t)\, r_t}{(1+i)^t} - AC \qquad (1)

where:
– p_t is the price paid by a customer at time t;
– c_t is the direct cost of servicing the customer at time t;
– i is the discount rate or cost of capital for the firm;
– r_t is the probability that the customer has survived to time t;
– AC is the acquisition cost;
– T is the time horizon for estimating CLV.
When the above formula includes the acquisition cost AC, it represents the CLV
of a potential customer. Without AC, it represents the expected residual lifetime value
of an existing customer.
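As a minimal illustration of equation (1), the following Python sketch (the paper itself uses SAS) computes CLV for a single hypothetical customer; all prices, costs, survival probabilities and the acquisition cost are invented for the example.

```python
# Illustrative sketch of equation (1): CLV as discounted, survival-weighted
# margins minus acquisition cost. All numbers are hypothetical.

def clv(prices, costs, survival, discount_rate, acquisition_cost=0.0):
    """Compute CLV = sum_t (p_t - c_t) * r_t / (1 + i)^t - AC."""
    assert len(prices) == len(costs) == len(survival)
    total = 0.0
    for t, (p, c, r) in enumerate(zip(prices, costs, survival)):
        total += (p - c) * r / (1.0 + discount_rate) ** t
    return total - acquisition_cost

# Hypothetical 4-period customer: constant margin, decaying survival.
prices   = [100.0, 100.0, 100.0, 100.0]
costs    = [ 40.0,  40.0,  40.0,  40.0]
survival = [ 1.00,  0.90,  0.80,  0.70]   # r_t: P(customer alive at t)
print(round(clv(prices, costs, survival, 0.10, 50.0), 2))
```

Dropping the `acquisition_cost` argument gives the expected residual lifetime value of an existing customer, as noted above.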
Survival data have two important features that are difficult to manage with alternative statistical methods: duration data and censored data. There are three kinds of censoring: right, left and interval. In the dataset of the above TLC company only the first of these is detected. Survival analysis is characterized by two important functions: the survival function and the hazard function. These are used to estimate survival probabilities for customers and the risk of customer churn during the study period.
The main methods used in analyzing survival data are nonparametric, semipara-
metric and parametric. In this paper only the first two are used.
Non-parametric methods are useful in implementing exploratory data analysis. In particular, Section 3 contains graphs illustrating the survival and hazard functions for customers of the above TLC company. Parametric methods, on the other hand, are more efficient when the shape of the hazard function follows a known theoretical distribution. However, we prefer to use Cox's semiparametric model because, thanks to the corrections made by Breslow (1974) and Efron (1977) to the Cox partial likelihood function, it makes it possible to handle the tied survival times that are present in the dataset of the above TLC company. Finally, the SAS procedure proc phreg, which we use to estimate the Cox model, makes it possible to handle time-dependent covariates (Allison, 2004).
A property of the Cox model is that different individuals have hazard functions that are proportional. For example, h(t|x_1)/h(t|x_2) is the ratio of the hazard functions for two individuals with prognostic factors or covariates x_1 = (x_{11}, x_{21}, ..., x_{p1}) and x_2 = (x_{12}, x_{22}, ..., x_{p2}). This ratio is constant; therefore the ratio of the risk of death (in this case churn) of two individuals is the same, no matter how long they survive.
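The constancy of this ratio can be illustrated numerically. The following Python sketch uses an arbitrary baseline hazard and invented coefficients and covariates; it is not part of the paper's SAS analysis.

```python
# Sketch of the proportionality property: under the Cox model the ratio
# h(t|x1)/h(t|x2) = exp(beta . (x1 - x2)) does not depend on t, whatever
# the baseline hazard h0(t) looks like. All values are made up.

import math

def cox_hazard(t, x, beta, baseline):
    return baseline(t) * math.exp(sum(b * xi for b, xi in zip(beta, x)))

beta = [0.5, -0.3]
x1, x2 = [1.0, 2.0], [0.0, 1.0]
baseline = lambda t: 0.1 + 0.05 * t          # arbitrary baseline hazard

ratios = [cox_hazard(t, x1, beta, baseline) / cox_hazard(t, x2, beta, baseline)
          for t in range(1, 6)]
# Every ratio equals exp(0.5*1 + (-0.3)*1) = exp(0.2), regardless of t.
print(all(abs(r - math.exp(0.2)) < 1e-12 for r in ratios))
```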
The Cox model is characterized by the partial likelihood function:
" #
Pp
exp bj xjðiÞ
Yk
j¼1
LðbÞ ¼ " #
i¼1 P P p
exp bj xjl
l2RðtðiÞ Þ j¼1
where:
– k is the number of distinct failure times;
– R(t_{(i)}) is the risk set containing all subjects at risk at time t_{(i)};
– x_{jl} are the covariates observed for individual l;
– \beta_j are the unknown coefficients of the covariates.
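The structure of the partial likelihood can be made concrete with a toy evaluation in Python. The data below are invented and contain no ties; this is illustrative only, not the estimation routine used in the paper (which relies on proc phreg).

```python
# Toy evaluation of Cox's partial likelihood: at each distinct uncensored
# failure time, the failing subject's risk score is divided by the sum of
# risk scores over everyone still at risk at that time.

import math

def partial_likelihood(beta, subjects):
    """subjects: list of (time, event, covariates); no tied event times."""
    lik = 1.0
    for time, event, x in subjects:
        if not event:
            continue                                  # censored: no factor
        num = math.exp(sum(b * xi for b, xi in zip(beta, x)))
        den = sum(math.exp(sum(b * xi for b, xi in zip(beta, xl)))
                  for t_l, _, xl in subjects if t_l >= time)  # risk set R(t)
        lik *= num / den
    return lik

# (duration, event flag, covariate vector) — invented data
subjects = [(2.0, 1, [1.0]), (3.0, 0, [0.0]), (5.0, 1, [0.0])]
print(round(partial_likelihood([0.7], subjects), 4))
```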
However, in practice, the covariates may be observed more than once during the study and their values may change with time (Lee and Wang, 2003). Therefore, in this paper, we use an extension of the Cox model, as the dataset of the studied TLC company contains time-dependent covariates x_{jl}(t_{(i)}) that are observed for individual l at the ordered uncensored event time t_{(i)}.
The hazard function states that the hazard for an individual at time t is the product of two factors:
– a baseline hazard function h_0(t), i.e. the hazard function for an individual whose covariates all have a value of zero;
– the exponential of a linear combination of the p covariates.
h(t|x) = h_0(t) \exp\left( \sum_{j=1}^{p} \beta_j x_j \right) = h_0(t) \, \exp(\beta x) \qquad (2)
where the left-hand side of (2) is the hazard function and the right-hand side is the baseline hazard multiplied by the exponential of a linear function of the covariates and their respective coefficients.

Therefore, by dividing both sides of (2) by h_0(t) and taking the logarithm, we obtain the Cox regression model, with reference to the ith case (i = 1, 2, ..., n):

\log \frac{h_i(t)}{h_0(t)} = \beta_1 x_{1i} + \beta_2 x_{2i} + \dots + \beta_p x_{pi}
An important statistic provided by the Cox model is the hazard ratio (HR), defined as e^{\beta_j} for the jth covariate. In Section 3.1 we show the interpretation of HR and explain the method for handling time-dependent covariates using SAS.
3. CASE STUDY
Table 1 reports in detail the above variables that are recorded for each customer:
Table 2 shows that Northern Italy accounts for more than 50% of customers. It is also significant that customers in Northern Italy generate about twice the total revenues of the remaining customers. Finally, a Kruskal-Wallis test allowed us to ascertain that there are statistically significant differences (p-value < .0001) in average revenues with respect to geographical area.
Table 3 shows that most of the customers are small, and these proportions are maintained across the three geographical areas. Large customers generate, on average, approximately six times the revenues of small customers.
After checking the distribution of customers with respect to the above factors, we introduced a nonparametric survival analysis, using proc lifetest, in order to examine the survival rate of customers. Specifically, proc lifetest is a SAS procedure which estimates survival and hazard functions using two methods: Kaplan-Meier and life-table. The former is suitable for small datasets with precisely measured event times, whereas the latter is suitable for large datasets or when the measurements of event times are imprecise. As our TLC dataset is large, we chose the second method.
Below we show the SAS code for proc lifetest. In the plot=(s,h) option, s requests the survival plot and h the hazard plot, while method=life selects the life-table method.

proc lifetest data=tel method=life plot=(s,h) width=1 graphics;
time duration*event(0);
strata geo_zone;
run;
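For readers without SAS, the life-table logic behind method=life can be sketched in a few lines of Python. The actuarial adjustment (censored subjects contributing half an interval of exposure) follows the standard method; the durations and events below are invented.

```python
# Minimal pure-Python sketch of the life-table (actuarial) survival
# estimator: durations are grouped into intervals of fixed width, and
# censored subjects count as half an interval of exposure.

def life_table(durations, events, width=1.0, n_intervals=5):
    surv, s = [], 1.0
    for i in range(n_intervals):
        lo, hi = i * width, (i + 1) * width
        at_risk  = sum(1 for d in durations if d >= lo)
        deaths   = sum(1 for d, e in zip(durations, events) if lo <= d < hi and e == 1)
        censored = sum(1 for d, e in zip(durations, events) if lo <= d < hi and e == 0)
        effective = at_risk - censored / 2.0          # actuarial adjustment
        q = deaths / effective if effective > 0 else 0.0
        s *= (1.0 - q)                                # conditional survival
        surv.append(round(s, 4))
    return surv

# Invented durations (months) and event flags (1 = churn, 0 = censored)
durations = [0.5, 1.2, 1.8, 2.5, 3.1, 4.0, 4.5, 4.9]
events    = [1,   0,   1,   1,   0,   1,   0,   1  ]
print(life_table(durations, events))
```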
Using this procedure, with the strata statement we produce survival and hazard
graphs stratifying with respect to geographical area (respectively Figures 1 and 2)
and then size of customer. These graphs are helpful in highlighting the difference in
the behaviour of customer survival and hazard with respect to the above factors.
The survival plot, displayed in Figure 1, shows that the customer survival prob-
ability behaves in the same way across the three geographical areas. To verify this
statement we used the Log-Rank test, which verifies the null hypothesis that the survival functions are the same in the three geographical areas. This test confirms our initial impression, and thus we do not reject the null hypothesis (p-value = 0.055). We recall that the number of customers is about 30,000; with such a large sample, a p-value of 0.055 cannot be considered small.
The hazard plot, displayed in Figure 2, shows that in the first month of the study
period the risk of customers churning increases. The risk then decreases sharply and
in the central part of the study period remains stable until the fifth month. Finally, in
the last part of the study, the risk increases again.
We also performed nonparametric analysis with respect to size of customer. How-
ever, the results are similar to those obtained with respect to geographical area.
In conclusion, the nonparametric analysis shows that the above factors do not affect the survival probability of customers of the above TLC company. It is interesting to note that this statement is also confirmed by the Cox model, which we illustrate in Table 4.
Below we show how we verified the proportionality assumption. The proportional hazards model assumes that the hazard ratio of two individuals is time-independent. This requires that covariates not be time-dependent: if some covariates vary with time, this assumption is violated (Lee and Wang, 2003).
In this section, we show two methods for verifying the above assumption: a statistical test and a graphical method. Another graphical method consists of plotting the partial residuals of a covariate (y axis) against time-to-event (x axis). Below we apply these methods and, in particular, report the results for the geographical area (geo_zone) factor.
In order to implement the graphical method we use the previous SAS code. However, in this case we substitute the lls plot request for s and h.

proc lifetest data=tel method=life plot=(lls) width=1 graphics;
time duration*event(0);
strata geo_zone;
run;
This procedure produces the plot displayed in Figure 3, which shows the log-log survivor functions for each of the three geographical areas. Since the curves are approximately parallel, we can say that, with respect to geographical area, the proportionality assumption is satisfied.
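The rationale of this diagnostic can be sketched numerically: under proportional hazards S_1(t) = S_0(t)^c, so the log-log curves differ by the constant log(c) at every time point. The following Python fragment, with invented survival values, demonstrates this.

```python
# Sketch of the log-log diagnostic: under proportional hazards,
# S1(t) = S0(t)^c, so log(-log S1(t)) - log(-log S0(t)) = log(c) is the
# same vertical offset at every t, i.e. parallel curves. Invented values.

import math

def loglog(s):
    return math.log(-math.log(s))

s0 = [0.95, 0.85, 0.70, 0.55, 0.40]       # baseline survival at 5 points
c = math.exp(0.3)                          # hazard ratio between the groups
s1 = [s ** c for s in s0]                  # PH implies S1 = S0^HR

offsets = [loglog(a) - loglog(b) for a, b in zip(s1, s0)]
print(all(abs(o - 0.3) < 1e-12 for o in offsets))
```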
Survival datasets typically take the form of one record per subject: for each subject, there is one measurement for each variable, and each record is a vector with the typical format (duration, event, covariates). However, as mentioned at the beginning of Section 3, the dataset of our TLC company contains time-dependent covariates.
The hazard ratio (HR) represents the multiplicative change in the risk of customer churn per unit increase in a covariate. For example, the estimated hazard ratio for the dummy variable flag_preact is 0.77. This means that the hazard of churn for customers who signed a contract with pre-activation is only 77% of the hazard for those who signed a contract without pre-activation. For the quantitative covariate val_mobile the hazard ratio is 0.74, i.e. a decrease of 26%: for each one-euro increase in the consumption of mobile telephone traffic, the churn hazard decreases by an estimated 26%.
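As a small illustrative helper (in Python, not part of the paper's SAS code), the conversion from a hazard ratio to a percentage change in churn risk, using the two HRs quoted above, is:

```python
# Turning reported hazard ratios into percentage changes in churn risk.
# The HRs are those quoted in the text: 0.77 (flag_preact), 0.74 (val_mobile).

def pct_change_in_risk(hazard_ratio):
    """Percent change in churn hazard implied by a hazard ratio."""
    return (hazard_ratio - 1.0) * 100.0

print(pct_change_in_risk(0.77))   # pre-activation contracts: ~ -23% hazard
print(pct_change_in_risk(0.74))   # per extra euro of mobile traffic: ~ -26%
```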
It may be observed that the selection of variables excludes the following covariates:
flag_fixed, flag_mobile, geo_zone, dur_fixed, dur_mobile, num_mobile, val_fixed.
Furthermore, Table 4 shows that the Cox model confirms the results of the nonparametric analysis. In particular, geographical area is not significant, while customer size is only slightly significant (the churn risk for large customers is lower than that for small customers) and its hazard ratio is very close to one. Therefore, we can say that, with the available data, these factors do not affect the survival probability of customers.
Below, we assess the goodness of fit of the estimated Cox model. The graph in
Figure 4 shows the deviance residuals (Therneau et al., 1990) against duration.
The residuals are distributed fairly symmetrically around zero, between -3 and 3,
and in no particular pattern. Therefore we conclude that the model fits the data suffi-
ciently well.
Finally, we evaluate the model for an average customer. Allison (2004) proposes, after estimating the parameters of the Cox model, to estimate the baseline survival function S_0(t). It is then possible to generate, via proc phreg, the estimated survival function S(t) for average values of the covariates by substitution in equation (3).
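The substitution step can be sketched as follows, using the standard Cox relation S(t|x) = S_0(t)^{exp(βx)} (presumably the paper's equation (3), which is not reproduced in this excerpt). The baseline values, coefficients and "average" covariate profile below are invented.

```python
# Sketch of Allison's step: given an estimated baseline survival S0(t),
# the survival curve for a covariate profile x follows from
# S(t|x) = S0(t) ** exp(beta . x). All numbers are hypothetical.

import math

def survival_for_profile(s0, beta, x):
    risk_score = math.exp(sum(b * xi for b, xi in zip(beta, x)))
    return [s ** risk_score for s in s0]

s0    = [0.98, 0.93, 0.86, 0.78]          # hypothetical baseline survival
beta  = [-0.26, -0.30]                     # hypothetical Cox coefficients
x_avg = [0.6, 0.4]                         # hypothetical average covariates

print([round(s, 4) for s in survival_for_profile(s0, beta, x_avg)])
```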
The aim of this subsection is not to obtain a definitive result concerning the specific
case in hand but rather to illustrate a method of CLV analysis. We discuss CLV in
terms of market distribution and show some potential applications of CLV in market
strategy analysis. In doing so, we carry out a simple example analysis and discuss the preliminary results. A more in-depth analysis, in terms of multivariate analysis and clustering, is beyond the scope of this paper.
In this section, we present a statistical analysis of customer lifetime value based on two variables: geographical area (geo_zone) and size of customer (cust_size). The goal of this analysis is to help decision-makers in the above TLC company to develop a more objective marketing strategy. To this end, we first examine the concentration of CLV, showing the Lorenz curve and CLV divided into percentiles. Secondly, we analyse the distribution of CLV and verify average differences with respect to the above factors. Finally, we estimate a regression model in order to detect the covariates which affect CLV. This has produced some interesting results.
Table 5 shows that CLV is highly concentrated in a small fraction of the customers. In fact, the top 10% of the distribution accounts for approximately 60% of CLV; a small fraction of customers therefore generates a large proportion of revenues for the studied TLC company. Specifically, the top 1% of customers each have a CLV between €3,700 and €81,375.
Figure 6 shows the Lorenz curve with customers (P axis) and CLV (Q axis). The concentration area is approximately 70%, confirming that most of the CLV is highly concentrated among a small share of customers.
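The construction of the Lorenz curve and the top-decile share can be sketched in Python. The heavily skewed CLV values below are invented; the paper reports a top 10% holding roughly 60% of total CLV.

```python
# Sketch of the concentration analysis: sort customers by CLV, build the
# Lorenz curve points and compute the share held by the top decile.

def lorenz_points(values):
    vals = sorted(values)
    total = sum(vals)
    cum, points = 0.0, [(0.0, 0.0)]
    for i, v in enumerate(vals, start=1):
        cum += v
        points.append((i / len(vals), cum / total))   # (P, Q) as in Figure 6
    return points

def top_share(values, fraction=0.10):
    vals = sorted(values, reverse=True)
    k = max(1, int(len(vals) * fraction))
    return sum(vals[:k]) / sum(vals)

clv_values = [10] * 90 + [200] * 10        # 100 customers, heavy top tail
print(round(top_share(clv_values), 3))      # share of CLV held by top 10%
```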
We shall now present the study of CLV distribution with respect to geographical
area and size of customer.
Table 6 shows that Northern Italy generates a large part of CLV. It may also be noted that, with respect to size of customer, approximately 60% of CLV is generated, on average, by small customers, of whom there are approximately four times as many as large customers, as seen in Table 3. This pattern is essentially the same across the three geographical areas.
It is interesting to verify which customers are the most profitable with respect to the above two factors. Therefore, in Table 7 we report mean CLV with respect to size of customer.
The Kruskal-Wallis test shows that there is, on average, a statistically significant difference (p-value < .0001) in the CLV generated by customers in the three geographical areas. In particular, a single customer in Northern Italy generates, on average, the highest CLV.
Therefore, recalling the differences in average revenues shown in Tables 2 and 3, it may be noted that CLV also exhibits differences with respect to the above factors. Moreover, owing to the higher costs incurred for large customers, it is interesting to note that the ratio between the average CLV of large and small customers (Table 7) is about half the ratio between their average revenues (Table 3).
To conclude our CLV analysis, we evaluate the impact of the covariates on CLV: we assess how the standardized variables of the dataset of the above TLC company influence CLV. In particular, we are interested in ascertaining how fixed and mobile telephone traffic contribute to CLV. Table 9 shows the result of our regression analysis. The model fits the data well (R² = 0.92) and the residual analysis satisfies the regression assumptions.
It may be observed that the truly significant result is the contribution made by the value generated by mobile and fixed-line telephone traffic. Specifically, the contribution to CLV of calls made from mobiles is about twice that of calls made from fixed-line telephones. In fact, as the covariates are standardized, for each sigma unit of mobile or fixed value the CLV increases by approximately €1,000 and €500 respectively.
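The per-sigma reading of standardized coefficients can be illustrated with a small Python simulation. The data are synthetic, and the true coefficients (1000 for "mobile value", 500 for "fixed value") are chosen only to mimic the ratio reported above.

```python
# Sketch of the standardized-regression reading: with covariates scaled to
# zero mean and unit variance, each coefficient is the change in CLV per
# one-sigma increase in that covariate. Synthetic data.

import random
import statistics

random.seed(42)
n = 500
mobile = [random.gauss(0, 1) for _ in range(n)]
fixed  = [random.gauss(0, 1) for _ in range(n)]
clv    = [1000 * m + 500 * f + random.gauss(0, 50)
          for m, f in zip(mobile, fixed)]

def standardize(xs):
    mu, sd = statistics.fmean(xs), statistics.pstdev(xs)
    return [(x - mu) / sd for x in xs]

zm, zf = standardize(mobile), standardize(fixed)
zc = [c - statistics.fmean(clv) for c in clv]       # centre the response

# Two-covariate least squares via the normal equations.
smm = sum(a * a for a in zm); sff = sum(a * a for a in zf)
smf = sum(a * b for a, b in zip(zm, zf))
smc = sum(a * c for a, c in zip(zm, zc)); sfc = sum(a * c for a, c in zip(zf, zc))
det = smm * sff - smf * smf
b_mobile = (sff * smc - smf * sfc) / det
b_fixed  = (smm * sfc - smf * smc) / det
print(round(b_mobile), round(b_fixed))    # ~ 1000 and ~ 500 per sigma unit
```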
4. CONCLUSIONS
ACKNOWLEDGEMENTS
We would like to thank Nunatac Srl for supporting our research by providing data for the TLC
company. We are also grateful to several anonymous reviewers for helpful comments.
REFERENCES
Aeron H., Bhaskar T., Sundararajan R., Kumar A., Moorthy J. (2008). A metric for customer
lifetime value of credit card customers. Journal of Database Marketing & Customer Strategy
Management, 15, 153-168.
Allison P. (2004). Survival analysis using SAS. A practical guide. SAS Publishing, Cary.
Berger P., Nasr N. (1998). Customer lifetime value: marketing models and applications. Jour-
nal of Interactive Marketing, 12(1), 17-30.
Blattberg R., Deighton J. (1991). Interactive marketing: exploiting the age of addressability. Sloan Management Review, 33(1).
Breslow N. (1974). Analysis of censored survival data. Biometrics, 30, 89-99.
Carpenter A., Ake C. (2003). Extending the use of proc phreg in survival analysis. Proceed-
ings of the 11th Annual Western Users of SAS Software, Inc. Users Group Conference. SAS
Institute Inc. Cary.
Efron B. (1977). The efficiency of Cox's likelihood function for censored data. Journal of the American Statistical Association, 72, 557-565.
Gupta S., Hanssens D., Hardie B., Kahn W., Kumar V., Lin N., Ravishanker N., Sriram S.
(2006). Modelling customer lifetime value. Journal of Service Research, 9(2), 139-155.
Jen L., Chou C., Allenby G. (2009). The importance of modeling temporal dependence of tim-
ing and quantity in direct marketing. Journal of Marketing Research, 46(4), 482-493.
Kumar V., Aksoy L., Donkers B., Venkatesan R., Wiesel T., Tillmanns S. (2010). Underva-
lued or overvalued customers: capturing total customer engagement value. Journal of Service
Research, 13 (3), 297-310.
Lee E., Wang J. (2003). Statistical methods for survival data analysis. Wiley, New Jersey.
Lu J. (2003). Modeling customer lifetime value using survival analysis - An application in the telecommunications industry. Proceedings of the 28th Annual SAS Users Group International Conference. SAS Institute Inc., Cary.
Malthouse E., Blattberg R. (2005). Can we predict customer lifetime value? Journal of Interactive Marketing, 19(1), 1-16.
Peppers D., Rogers M. (1993). The one to one future: building relationships one customer at a time. Doubleday Business, New York.
Reinartz W., Kumar V. (2003). The impact of customer relationship characteristics on profit-
able lifetime duration. Journal of Marketing, 67(1), 77-99.
Therneau T., Grambsch P., Fleming T. (1990). Martingale-based residuals for survival models. Biometrika, 77(4), 147-160.
Venkatesan R., Kumar V. (2004). A customer lifetime value framework for customer selection
and resource allocation strategy. Journal of Marketing, 68(4), 106-125.