Identifying cross-selling opportunities using lifestyle segmentation and survival analysis

Jake Ansell, Tina Harrison and Tom Archibald
The Management School, University of Edinburgh, Edinburgh, UK
MIP 25,4, p. 394
Received March 2006. Revised February 2007, March 2007. Accepted March 2007.
The current issue and full text archive of this journal is available at www.emeraldinsight.com/0263-4503.htm
Abstract
Purpose – To demonstrate the successful use of lifestage segmentation and survival analysis to
identify cross-selling opportunities.
Design/methodology/approach – The study applies lifestyle analysis and Cox’s regression
analysis model to behavioural and demographic data describing 10,976 UK customers of a large
international insurance company.
Findings – There are clear differences between the lifestage segments identified with respect to
customer characteristics affecting the likelihood of a second purchase from the company and the
timeframes within which that is likely to take place. The “mature” segments appear to offer greater
opportunities for retention and cross-selling than the “younger” segments.
Research limitations/implications – The study was limited by the type of data available for
analysis, which related mainly to life insurance and pension products characterised by low transaction
frequency. Different results might be expected for banking or credit-and-loan products. The findings
could be enhanced by incorporating a wider range of customer characteristics into the analysis.
Practical implications – The findings show clear differences in behaviour across the segments
identified, providing a basis on which marketing planners might differentiate marketing and
communication strategies for particular products and market segments.
Originality/value – The paper illustrates the adaptation of survival analysis methodology, familiar
in other disciplines but comparatively rare in marketing, to the cross-selling of financial services. It
shows how planners can not only identify customers most likely to repurchase but also predict the
timeframe in which that will take place.
Keywords Market segmentation, Selling methods, Customer retention, Services, Financial services
Paper type Research paper
Introduction
In mature markets, the need to adopt efficient marketing strategies becomes more
critical. The financial services sector in the UK is an example. Following successive
deregulation initiatives throughout the 1970s and 1980s (for example, abolition of the
“corset” in 1979, changes to the Building Societies’ Act in 1986 and the enactment of the
Financial Services Act 1986), competition has intensified from both traditional players
and new entrants. In this now saturated market, new customers are hard to find, and
their acquisition tends to be at the expense of competitors. Such aggressive marketing
activity is costly, and does not always lead to long-term gains since customers can
switch easily to other competitors. Consequently, such competition for customers in
mature markets leads to the phenomenon of “churn”, which has also been described as a
“revolving door” (Kamakura et al., 2003) and a “leaking bucket” (Stewart, 1998): a
constant and varying flow of customers both into and out of the business.
Marketing Intelligence & Planning, Vol. 25 No. 4, 2007, pp. 394-410. © Emerald Group Publishing Limited, 0263-4503. DOI 10.1108/02634500710754619
In order to reduce the rate of churn, financial institutions are considering ways
of strengthening the relationship with their customers. One way of forging
stronger ties with customers is through “cross-selling”: the strategy of selling other
products to a customer who has already purchased a product from the vendor,
designed to increase the customer’s reliance on the company and decrease the
likelihood of switching to another provider. Another is “up-selling”: inducing the
customer to buy enhanced products, upgrades and add-ons. Given that the costs of
acquiring new customers are increasing, cross-selling represents a cost-effective
means of generating further business from existing customers. According to
Felvey (1982), it is easier for businesses to grow in this way than by attempting to
attract new customers.
However, cross-selling can be a double-edged sword. Too much can upset customers
and make them less responsive, effectively weakening the relationship. Effective use of
the tactic requires that the basic objective of marketing is fulfilled: to offer the right
product to the right person at the right time. The customer database is instrumental in
this process, allowing the financial institution’s marketing planners to learn about its
customers from historic and current behaviour and to make predictions about future
needs and requirements. While most financial institutions gather and retain similar
data on their customers, the value derived from them varies across institutions. This is
due, in part, to the lack of suitable techniques to analyse the data, as Kamakura et al.
(2003, p. 3) note:
The development of techniques for the extraction of relevant information from the database
for strategic marketing purposes, often referred to as data mining, has lagged behind the
development of tools for collecting and storing the data.
Thus, many financial institutions possess very large amounts of data from which
limited marketing information is derived.
Cross-selling can allow a firm to develop a continuing relationship with the
customer and hence the potential for further opportunities. In order to identify optimal
cross-selling opportunities, its marketing strategists first need to be able to identify
whether an existing customer is likely to make a subsequent purchase and/or which
product they will buy. In addition to this, it is also important to establish when the
purchase is likely to occur. It may be worthwhile maintaining the relationship if
the likelihood of a purchase is high and the timeframe short, but less rewarding if the
likelihood is low and at some distance in the future. While a number of good techniques
exist to ascertain the probability of subsequent purchases occurring, the timeframe in
which individuals are likely to act remains largely unexplored. In this paper, we apply
“survival analysis” to the task of estimating the timeframe to the next purchase.
Survival analysis is a set of statistical techniques used to determine quantitatively
the impact of a set of variables (such as customer characteristics) on the time to the
occurrence of an event (such as a subsequent purchase). It has been applied
successfully in such areas as medicine (Collett, 1994) and industry (Ansell and Philips,
1994), but has not been used widely in predicting customer behaviours despite clear
potential for successful application in that domain. This paper reports the application
of survival analysis to customer data supplied by a large international insurance
company, to predict which customers are most likely to make a repeat purchase from
the company and when.
Unlike many other retail customers, those buying financial services generally do not
purchase new products frequently. Purchases tend to be at specific times in the
lifecycle, reflecting the needs and circumstances of particular lifestages. Thus, in order
to achieve better understanding and prediction of customer behaviour in this context,
our analysis also takes account of lifestage.
Cross-selling
The rationale for cross-selling, defined in the introduction as “the strategy of selling
other products to a customer who has already purchased a product from the vendor” is
not only to “increase the customer’s reliance on the company and decrease the
likelihood of switching to another provider” but also to exert a generally positive
influence on the relationship with the customer, strengthening the link between
provider and user (Kamakura et al., 2003). Increasing product holding leads to an
increased number of connection points with customers, as well as increasing the
switching costs they would face if they decided to take their custom elsewhere
(Srivastava and Shocker, 1987). The likelihood of defection is thereby reduced.
Increased product holding also creates a situation in which the company can get to
know its customers better through a greater understanding of buying patterns and
preferences. This, in turn, puts it in a better position to develop offerings that meet
customer needs. Consequently, it is argued (Kamakura et al., 2003) that cross-selling
increases the total value of a customer over the lifetime of the relationship.
Despite the apparent importance of cross-selling for relationship development and
profitability, it seems that many financial marketers rely on intuition and experience
for their related strategic decisions (Prinzie and Van den Poel, 2006). Indeed, Evans
(2002) notes that cross-selling rates remain low among banks in Europe. Furthermore,
the subject has received limited attention in the marketing literature. When it is dealt
with, the focus is on methodologies for identifying common patterns in the acquisition
of products by customers, based on ownership or usage data. While a substantial amount
of research exists on the acquisition sequence for consumer durables, such as studies
by Hebden and Pickering (1974), Kasulis et al. (1979) and Prinzie and Van den Poel
(2006), there is little in the context of financial services.
Stafford et al. (1982) followed their research into acquisition of consumer durables
with a study that found evidence for the existence of a common acquisition pattern
with respect to financial services. They describe an acquisition sequence from cheque
accounts to simple savings accounts, to insurance, stocks, bonds and mutual funds,
which was found to be relatively constant across three cohort groups. A study by
Kamakura et al. (1991) both describes and predicts purchase sequences. It investigated
the influence of the financial maturity of the customer (linked to lifestage) and the
acquisition difficulty of the service (such as resources required, level of risk and
liquidity, information costs) on the acquisition sequence. Financial services and
consumers are positioned along a continuum of “latent” difficulty/ability, expressing a
hierarchy of investment objectives. The probability that an investor owns a particular
financial product is a function of that person’s position on the continuum relative to
that of the financial product. The authors hypothesise that the more “difficult” services
are acquired in the later stages of the family life cycle. The research clearly illustrates
the link between the acquisition sequence and stages of the lifecycle. According to
Prinzie and Van den Poel (2006), these were the first researchers to make explicit use of
purchase sequence for cross-sell purposes.
Similar studies have been conducted, focusing on financial products used for asset
accumulation (Paas, 1998; Soutar and Cornish-Ward, 1997) and financial products
facilitating financial transactions (Paas, 2001). Kamakura et al. (2003) also extended
their original study to incorporate a data-augmentation tool that combines information
from the customer database and from surveys. They argue that single-source
data are not sufficient for understanding the complete buying needs of the customer:
they exclude the possibility that the customer may already hold competitors’ products.
Marketing planners need to know about each customer’s usage of their own and
competing products, but this depth of market intelligence is not readily available,
unless in survey format, and typically provides only snapshots in time.
The studies summarised here have employed various statistical models and
methods. Broadly speaking, purchase sequences have been described as either a
hierarchical process (such as the Guttman scalogram analysis and latent-trait analysis
used by Kamakura et al.) or a succession of purchases (such as the Markov models
used by Prinzie and Van den Poel). Hierarchical models assume that purchases are
consecutive; the same assumption does not apply to sequence models (Agrawal and
Srikant, 1995). While these models have achieved the marketing objective of knowing
which product to offer next and, in some cases, to whom, they do not address the
question of when to make it available. To do so requires a sequence model with a focus
on time, such as survival analysis. This paper attempts to fill that gap and, in doing so,
to fully address the marketing objective of making the right product available to the
right person at the right time.
Survival analysis
Survival analysis is a set of non-parametric, semi-parametric and parametric statistical
techniques used to determine quantitatively the impact of a set of potentially
influential variables on the elapsed time to the occurrence of an event, such as death or
the failure of a component (Prentice et al., 1981; Ansell and Ansell, 1987; Collett, 1994;
Ansell and Philips, 1994). The techniques are well established, widely accepted and
extensively used in biometrics (where survival analysis was first developed),
engineering and event history analysis. Survival analysis is being used increasingly in
the organisational behaviour and strategy fields (Blodgett, 1992; Chen and Lee, 1993;
Staber, 1992). It has also been applied to credit scoring, to predict the time to default
(Stepanova and Thomas, 2002).
This analytical technique can also be used in the context of cross-selling (and
up-selling), to predict when the next purchase will be made. In other words, the aim is to
predict when an existing customer might carry out one of the actions shown in Figure 1,
which illustrates the behaviour of a customer who has already made two purchases at
time intervals t1 and t2. The objective is to predict the next action and when it will occur
(t3). There are a number of future behaviours that the customer might exhibit, including
taking up further products, surrendering existing ones or defecting from the company
altogether. Figure 1 shows the separate events or behaviours with the passage of time
and also the customer’s progress through lifestages.
Figure 1. Predicting customer behaviour over time. [Timeline of client actions: past actions at t1 and t2, up to the time of observation, and possible future actions at t3, ranging from positive (saving plan, extend mortgage, ...) to negative (surrender policy, defection). Lifestages shown along the time axis: Young Single, Setting Up Home, Middle Age, Retired.]
In the study reported here, survival analysis is used to estimate when customers are
likely to make their next financial services purchase. The possibility also exists of
providing an indication of the timeframe beyond which subsequent purchases are
unlikely to be made.
Being able to predict when a certain behaviour is likely to occur is of particular
importance to the timing element of marketing campaigns. For example, survival
analysis can assist planners in understanding the time-points at which customers are
most likely to be receptive to marketing communications initiatives, and also those
beyond which further effort is likely to be ineffective, thereby reducing the amount of
wasted marketing effort.
Survival analysis can also be useful in forming a judgement of the value of a
customer. If the likelihood of repurchase in the near future can be established,
assumptions can be made about the future profitability of a customer’s business.
Moreover, being able to predict the specific type of purchase can provide an even more
precise indication of future worth. Though these inputs are clearly useful in the
targeting of potentially profitable customers, the details of profitability analysis are
outside the scope of this paper. For an overview, see Zeithaml (2000).
Survival analysis is particularly suited to the study of cross-selling, in allowing the
time element to be analysed. It does not necessarily make strong assumptions about
the underlying distributions, such as the frequently made assumption of normality.
The approach can deal efficiently with cases when “censoring” occurs: for example, if
the event occurs beyond the observation period or if cases have been removed before
the end of the observation period and before the event occurs. It can also be applied to
longitudinal data.
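To make the notion of censoring concrete, the following minimal Python sketch shows one way the time to a second purchase might be encoded for each customer. The function name, the dates and the end of the observation window are illustrative assumptions, not details of the data analysed in this paper.

```python
from datetime import date


def time_to_second_purchase(first, second, observation_end):
    """Return (duration_in_days, event_observed) for one customer.

    first           -- date of the first purchase
    second          -- date of the second purchase, or None if none was seen
    observation_end -- last date covered by the data extract (assumed known)
    """
    if second is not None and second <= observation_end:
        # The event (a second purchase) happened inside the window.
        return (second - first).days, True
    # No second purchase was seen: the case is right-censored, so all that
    # is known is that the waiting time is at least this long.
    return (observation_end - first).days, False


# Hypothetical customers, invented purely for illustration.
end = date(2006, 12, 31)
records = [
    time_to_second_purchase(date(2001, 3, 1), date(2003, 3, 1), end),
    time_to_second_purchase(date(2004, 6, 15), None, end),
]
print(records)
```

Each customer is thus reduced to a duration and a flag. Censored cases are retained rather than discarded, since they still carry the information that no repurchase occurred during the time observed.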
In using survival analysis to predict the time to the next event, or lifetime, the
objective is to identify the customer characteristics that affect survival. For example, in
the context of financial services, these “covariates” are to be found in customer
information held by the financial institution in question, such as age, gender, marital
status and other demographic information.
Application
Survival analysis comprises two elements: time effect and individual effect. The first of
these can be described as an underlying survivor function shared among the whole
population. The second describes the difference of the individual from a base point, in
terms of the covariates. These two functions together describe quantitatively the
probability of an event taking place for a specific individual. A useful feature is the
ability to generate separate survivor functions or hazard functions for
different groups within a population. The estimation can be non-parametric,
semi-parametric or parametric. The first two categories are best suited to exploratory
investigations, since they make limited assumptions about the distribution of the data.
Parametric regression models require the nature of that distribution to be known
(exponential, Weibull, log-normal), and the regression model is chosen
accordingly.
In the study reported here, the underlying population distribution was not known
prior to modelling. The decision was therefore made to proceed on an exploratory
basis, making the weakest assumptions about its nature. The analysis was based on
Cox’s (1972) Proportional Hazards Model, a semi-parametric model which allows the
data to determine the underlying distribution. If it had in fact been known a priori, then
a parametric model based on the known distribution would have been more
appropriate.
The analysis is divided into two parts: a description of the time structure of the
population and an assessment of the impact of the covariates on the likelihood of the
event occurring. The model assumes that the covariates have a proportional effect on
the time structure, but this can be relaxed. Thus, Cox’s model considers the probability
of an event occurring in the small interval $(t, t + dt)$, assuming the event has not
occurred before $t$. This measure is sometimes called the “hazard” and can be written as:
$$\lambda_0(t)\,\exp(x'\beta)$$
where: $\lambda_0(t)$ is the time structure; $\exp(x'\beta)$ represents the effect of the individual
characteristics or covariates, $x$; and $\beta$ is a set of parameters to be estimated.
Cox’s model is thus similar to other regression models for estimation of the specific
customer likelihood to repurchase relative to other customers, such as logistic
regression. It also provides a ranking of customers according to the likelihood of
repurchase. This information is of particular value in pinpointing the most suitable
targets for marketing campaigns.
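The ranking property can be illustrated with a short Python sketch. Because the time structure is shared by all customers, two customers can be compared through the exponential term alone, and the ratio of their hazards does not depend on time. The coefficient values and customer records below are hypothetical, chosen only to show the mechanics; they are not estimates from this study.

```python
import math


def relative_hazard(x, beta):
    """exp(x'beta): the multiplier applied to the shared time structure."""
    return math.exp(sum(xi * bi for xi, bi in zip(x, beta)))


# Hypothetical coefficients for (current age, age at first purchase,
# married, female). These are NOT the fitted values from the study.
beta = [0.05, -0.04, 0.30, -0.20]

# Hypothetical customers described by the same four covariates.
customers = {
    "A": [45, 30, 1, 0],
    "B": [45, 40, 0, 1],
    "C": [60, 55, 1, 1],
}

scores = {name: relative_hazard(x, beta) for name, x in customers.items()}

# Rank customers from most to least likely to repurchase at any given time.
ranking = sorted(scores, key=scores.get, reverse=True)

# The hazard ratio between two customers is constant over time, because
# the shared time structure cancels out of the ratio.
hazard_ratio_ab = scores["A"] / scores["B"]
print(ranking, round(hazard_ratio_ab, 3))
```

Under these assumed values, customer A would be contacted before C, and C before B.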
However, the real advantage of survival analysis over other regression models is the
information provided on the time to next purchase. In Cox’s model, time is continuous,
whereas the application of logistic regression (in credit scoring, for example) assumes
discrete episodes of time. Continuous treatment of time is of particular value in
assessing the potential for a continued customer relationship. Assessment of the
quality of fit can be in terms of the likelihood, which can be compared to $\chi^2$ statistics on
specific degrees of freedom or prediction in terms of such measures as Area Under the
Receiver Operator Curve, AUROC (Hand, 1997; Thomas et al., 2002), which measures
the ability of the model to determine the correct outcome.
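As a rough illustration of the AUROC measure, the sketch below computes it directly from its pairwise interpretation: the proportion of (repurchaser, non-repurchaser) pairs in which the repurchaser received the higher model score. The scores and outcomes are invented, and the sketch treats the outcome as a simple yes/no indicator at a fixed horizon, which is a simplifying assumption that ignores censoring.

```python
def auroc(scores, outcomes):
    """Share of (positive, negative) pairs ranked correctly; ties count half."""
    positives = [s for s, y in zip(scores, outcomes) if y]
    negatives = [s for s, y in zip(scores, outcomes) if not y]
    if not positives or not negatives:
        raise ValueError("need at least one case of each outcome")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical model scores and observed outcomes (1 = repurchased).
example_scores = [0.9, 0.8, 0.4, 0.35, 0.1]
example_outcomes = [1, 0, 1, 0, 0]
print(auroc(example_scores, example_outcomes))
```

A value of 0.5 corresponds to a ranking no better than chance, and 1.0 to a ranking that separates the two groups perfectly.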
Figure 2 shows an “idealised” survival curve, typical of some customer behaviour
we have found in the financial services sector. It plots the probability of not
repurchasing against time. The curve slopes downward, but then seems to level out at
a certain point. The proportion below the asymptote represents those who are unlikely
to repurchase; in this case, the asymptotic value is 0.25, indicating a quarter of the total
number under study. Depending on how the data are gathered, the time can be
represented in days, months or years, to show the timeframe in which repurchase is
most likely for the remaining three quarters. Combined with the repurchase likelihood
ranking mentioned above, the possibility arises of identifying the prime targets among
those likely re-buyers.
As with all forms of statistical analysis, there are limitations with this procedure. If
there are relatively few observations, the survival curves will not be as smooth as in
Figure 2 and will take the shape of a descending staircase. This makes it more difficult
to ascertain when a certain percentage of the population under investigation will have
repurchased. Equally, if the proportion of censored data (unobserved purchases) is
high, it may again be difficult to determine the timescale to a given percentage of
repurchases. Moreover, the precision of the estimates of the coefficients associated with
the covariates will be affected in both these cases.
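The staircase shape and the effect of censoring can be seen in a small worked example. The sketch below uses the Kaplan-Meier (product-limit) estimator, a standard non-parametric way of estimating a survivor function; it is offered as an illustration of the general idea rather than as the procedure used to produce the curves in this paper, and the durations are invented.

```python
def kaplan_meier(durations, observed):
    """Return [(time, survival_probability)] at each distinct event time.

    durations -- time to repurchase, or to end of observation if censored
    observed  -- True where the repurchase was actually seen
    """
    data = sorted(zip(durations, observed))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        events = 0
        leaving = 0
        # Group all customers sharing the same duration.
        while i < len(data) and data[i][0] == t:
            events += 1 if data[i][1] else 0
            leaving += 1
            i += 1
        if events:
            # The curve only steps down at times where an event occurred.
            survival *= 1.0 - events / at_risk
            curve.append((t, survival))
        # Censored cases leave the risk set without causing a step.
        at_risk -= leaving
    return curve


# Hypothetical durations in months; False marks a censored customer.
months = [6, 12, 12, 20, 30, 36]
seen = [True, True, False, True, False, False]
km_curve = kaplan_meier(months, seen)
for t, s in km_curve:
    print(t, round(s, 3))
```

With only six customers, three of them censored, the estimate drops in three large steps and then stays flat, which shows why the timescale to a given percentage of repurchases is hard to read from small or heavily censored samples.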
Clearly, care needs to be taken with the selection of the time variable employed. In
some contexts, calendar time may be less appropriate than some other timeframe, such
as time available to purchase. As with other regression methods, the choice of variables
to be included or excluded will have an impact on the outcome. The issue of co-linearity
arises where covariates are highly correlated. If an important measure is missing from
the analysis, the model might be poorly specified.
Since Cox’s model is based on an iterative approach, there are occasions when the
estimation procedure does not converge. This may be due to over-representation in the
sample of one of the groups within the older repurchase times.
Data
The data analysed were collected from a randomly generated sample of customer
records in the recently established data warehouse of a large international insurance
company. To protect confidentiality, the sample was limited to 10 percent of the entire
database, amounting to 10,976 customers in total. Although determination of the
sample size was beyond our control, it was considered to be a large enough basis for
the analysis and sufficiently representative. The demographic profile of the entire
customer database was not known, but is assumed to be similar to that of other large
financial institutions, which in turn reflects the demographic profile of the market for
retail financial services.
Figure 2. Idealised survivor function. [Probability of survival, falling from 1 and levelling out at 0.25, plotted against time.]
Data relating to both customer characteristics and product purchases were available
within the database. In terms of customer characteristics, they were: current age of the
individual; age of the individual at the time of the first purchase from the company;
gender; marital status; and “Financial ACORN” classification. Standing for A
Classification of Residential Neighbourhoods, ACORN is a geodemographic
classification based on census data. It classifies the population of the UK, by
household, into 17 “groups” and 54 “types”. Financial ACORN focuses exclusively on
the consumption of financial services.
The company studied derives the information in its database primarily from
application forms and policy details captured at the time of purchase. For most
variables, the information available was accurate and complete,
except that marital status suffered from a large proportion of omissions and the
possibility of inaccuracy. This is a challenge that all financial institutions face in
capturing such data, unless they can be updated regularly. Financial ACORN
classification is externally provided information, added to the customer files by postcode
matching. (In the UK, a unique domestic postcode is shared by about 15 households.)
The decision was made to capture behavioural data on the first five purchases
from the company only. While some customers would have made more than that
number, the vast majority within the database had made no more than two; just less than
half (44 percent) of the 10,976 customers in the sample had bought at least twice.
Owing to the relatively small proportion of customers holding multiple products,
the analysis focused only on the prediction of the second purchase.
Table I shows the ratio, ordinal and nominal variables used in the subsequent
analysis and their properties. The variables were not coarse-classified, as is often the
case in credit scoring.
The two-stage analysis procedure used consisted of an initial segmentation of the
population into lifestage segments, followed by the application of Cox’s Proportional
Hazards Model. Owing to the link identified by Kamakura et al. (1991) between the
acquisition sequence of financial products and the life cycle, the decision was taken to segment the
sample first into lifestage segments and to look within the segments. That was
achieved by K-means clustering, chosen as a rapid method for obtaining the clusters
(Punj and Stewart, 1983). Two criteria were used to determine the most appropriate
cluster solution: inspection of the squared mean error value, and the size of the clusters.
Table I. Customer information available for analysis
Variable | Description | Property
Age | The current age of the customer. Measured in years | Ratio, discrete
Age.Sdt | The age at which the customer made their first purchase from the company. Measured in years | Ratio, discrete
Cur.Mrtl | Current marital status: single, married, divorced, separated or unclassified | Nominal, discrete
ACORN | Financial ACORN classification: A – financially sophisticated, B – financially involved, C – financially moderate, D – financially inactive and a final group to represent those unclassified | Ordinal, discrete
Gender | Male or female | Nominal, discrete
Product | A range of product groups including, among others, personal pensions and investments | Nominal, discrete
The segments were defined according to the customer characteristics available from
the database, as presented in Table I. Cluster descriptors were chosen to reflect the
behaviour and characteristics of the groups, and were influenced by ACORN
terminology. For example, use of the term “moderate” relates to a basic engagement
with financial products often referred to as “foundation products” (Kamakura et al.,
1991). “Financially involved” individuals have bought a broader range, and
“sophisticated” defines those who have purchased more complex and riskier
products. The next stage of the analysis considered the propensity of each cluster to
make a subsequent purchase and the estimation of survival functions.
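The segmentation step described in this section can be sketched in a few lines of Python. The code below is a bare-bones version of K-means (Lloyd's algorithm) applied to two of the variables in Table I, reporting the within-cluster sum of squared errors and the cluster sizes for several candidate solutions. The data points, the deterministic starting rule and the use of the within-cluster sum of squares as the error criterion are all assumptions made for illustration; the software and settings actually used in the study are not reported here.

```python
from collections import Counter


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def k_means(points, k, iterations=100):
    """Return (centres, labels, within-cluster sum of squares)."""
    # Deterministic farthest-point start (an illustrative choice, so that
    # the example gives the same answer on every run).
    centres = [points[0]]
    while len(centres) < k:
        centres.append(
            max(points, key=lambda p: min(sq_dist(p, c) for c in centres))
        )
    labels = []
    for _ in range(iterations):
        # Assignment step: each customer joins the nearest centre.
        new_labels = [
            min(range(k), key=lambda j: sq_dist(p, centres[j])) for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the algorithm has converged
        labels = new_labels
        # Update step: each centre moves to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old centre if a cluster empties
                centres[j] = tuple(sum(c) / len(members) for c in zip(*members))
    wss = sum(sq_dist(p, centres[lab]) for p, lab in zip(points, labels))
    return centres, labels, wss


# Hypothetical customers: (current age, age at first purchase).
points = [(44, 27), (45, 30), (46, 34), (64, 60), (65, 61), (66, 62)]

wss_by_k = {}
sizes_by_k = {}
for k in (1, 2, 3):
    _, labels_k, wss_k = k_means(points, k)
    wss_by_k[k] = wss_k
    sizes_by_k[k] = sorted(Counter(labels_k).values(), reverse=True)
    print(k, round(wss_k, 2), sizes_by_k[k])

labels_two = k_means(points, 2)[1]
```

In this toy example the error falls sharply from one cluster to two and only modestly thereafter, while the three-cluster solution produces a cluster containing a single customer. Weighing the error value against cluster size in this way mirrors the two criteria described above.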
Findings
Cluster analysis
Cluster analysis established six clusters with distinct features, based on the
comparison of within and between cluster variability. Figure 3 shows a graphical
representation of the cluster profiles. The position of the clusters in the
two-dimensional space represents the relative age and degree of financial
sophistication (according to Financial ACORN) of the individuals they contain. The
size of the circle represents the relative size of the cluster, and the gender distributions
are highlighted within each circle. The Appendix shows the relative percentages
within each cluster exhibiting the characteristics used in the analysis.
Cluster 2 is the largest, accounting for 31 percent of the sample but containing the
least financially sophisticated customers; Clusters 3, 5 and 6 contain the most
financially sophisticated customers, but Cluster 6 is the smallest, with just 3 percent of
the total.
In terms of age, Clusters 1-3 have the lowest average ages: 46, 45 and 44,
respectively. Cluster 5 is in the middle of the age range, with an average age of 51,
leaving the oldest customers in Clusters 4 and 6, the average ages being 65 in
both. The pattern with respect to age at first purchase from the company is similar.
Cluster 3 contains the individuals who were youngest at first purchase, with an average age of 27, followed by
Clusters 2, 1 and 5, which leaves Clusters 4 and 6 with those who were the oldest at first
purchase. It is notable that the difference between age at start and current age is higher
for the younger clusters than the older. For example, the difference is 12 years in
Cluster 1 but only four in Cluster 4.
Figure 3. Six lifestage clusters. [Clusters plotted by age (vertical axis, marked at 45, 50 and 65 years) against financial sophistication (horizontal axis, low to high), with circle size showing relative cluster size and the male/female split marked within each circle. Cluster labels: 1. Financially involved adults; 2. Moderately financially active adults; 3. Sophisticated early starters; 4. Financially moderate seniors; 5. Sophisticated middle agers; 6. Sophisticated late starters.]
Turning to gender representation and marital status, the distinctions between the
clusters are less obvious. In terms of gender, Clusters 1-3 contain approximately
two-thirds men, whereas Clusters 4-6 show a more even distribution of male and female
individuals. With respect to marital status, a large proportion of the sample remained
unclassified, due to the difficulty in maintaining accurate records of a status that could
change over time. Of the customers for whom a description was at least available,
whether accurate or not, most were married. There is a slightly higher proportion of
single individuals in Clusters 1-3, compared with 4 and 5, and especially with Cluster 6.
In terms of the relative degree of financial involvement and purchase activity,
Cluster 3 contains the most financially active: almost two-thirds of those customers had
made two purchases from the company, whereas only a third of those in Clusters 4 and
6 had done so. Looking at multiple product holding, 19 percent of the individuals in
Cluster 3 had bought four products from the company, compared with only 2 percent in
Cluster 6. The rank order of financial activity seems to suggest that Cluster 3 is the
most active, Clusters 4 and 6 the least active, while Clusters 1, 2 and 5 occupy the
middle ground.
Survival analysis
The purpose of the survival analysis was to ascertain the re-purchasing propensity of
each cluster and to provide an estimate of the likely timeframe. A generic model was
produced, based on the whole sample, along with separate models for each of the
clusters 1-5. Cluster 6 was excluded from the survival analysis on account of its small
size.
The variables used were the same as in the cluster analysis. The difference between
age at start of the relationship and current age represents the duration of the
relationship with the company, which is captured in the difference between the two
measures in several of the models. Age is thus reflected by either age at the start of the
relationship or current age. These two variables were found to be highly correlated,
and it may have been sufficient to use age at the start of the relationship alone. The
decision was taken to retain current age in the analysis for three reasons. First, it had
already been used in the segmentation analysis to identify age-based lifestage
segments; second, age is a useful descriptor for the subsequent profiling of the
segments; third, there were few other customer characteristics available in the
database, and it seemed desirable to retain as much customer information as possible.
An extra variable in the analysis of Clusters 2-5 was the particular product
purchased. It could not be used in either the generic model or the model for Cluster 1
because of instability in the estimation of the parameters involved. The fitting
approach was forward stepwise selection, adding one variable to the model at each
stage until no further variable proved to be significant.
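The forward stepwise procedure described above can be sketched as a greedy loop. This is an illustrative outline only: the scoring hook `fit_score`, the `min_gain` threshold (set here to 3.84, the 5 percent critical value of a chi-square test with one degree of freedom, as an assumed stand-in for a formal significance test) and the toy contributions are all hypothetical.

```python
def forward_stepwise(candidates, fit_score, min_gain):
    """Greedy forward selection.

    candidates -- iterable of variable names
    fit_score  -- callable taking a list of variable names and returning a
                  goodness-of-fit score (higher is better); hypothetical hook
    min_gain   -- smallest improvement treated as 'significant' (an assumption
                  standing in for a formal significance test)
    """
    selected = []
    remaining = list(candidates)
    current = fit_score(selected)
    while remaining:
        # Score each model that adds exactly one of the remaining variables
        gains = {v: fit_score(selected + [v]) - current for v in remaining}
        best = max(gains, key=gains.get)
        if gains[best] < min_gain:
            break  # no remaining variable improves the fit enough
        selected.append(best)
        remaining.remove(best)
        current += gains[best]
    return selected

# Hypothetical additive contributions, for illustration only
toy_gain = {"Age": 40.0, "Age.Sdt": 25.0, "Cur.Mrtl": 12.0, "Gender": 6.0, "ACORN": 1.0}

def toy_score(variables):
    return sum(toy_gain[v] for v in variables)

chosen = forward_stepwise(toy_gain, toy_score, min_gain=3.84)
```

With these made-up contributions the loop admits four variables and stops before ACORN, whose gain falls below the threshold. In a real analysis `fit_score` would refit the survival model for each candidate set.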
Table II provides a summary of the model fitting. All models produced a
significant fit to the data at the 5 percent level of significance, with Cluster 4 the
weakest at 3.2 percent. This reflects the size of the sample and hence of the clusters:
Cluster 4 was the smallest after Cluster 6 had been withdrawn from the analysis. Using
the AUROC criterion, all provide reasonable prediction, with Cluster 3 weakest and the
general model performing best.
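For readers unfamiliar with the AUROC criterion, the sketch below computes it directly from its definition as the probability that a randomly chosen repurchaser receives a higher model score than a randomly chosen non-repurchaser. How the scores were derived from the fitted models is not stated here, so the scores and outcomes below are hypothetical.

```python
def auroc(scores, outcomes):
    """Probability that a randomly chosen repurchaser is scored above a
    randomly chosen non-repurchaser (ties count one half)."""
    pos = [s for s, y in zip(scores, outcomes) if y == 1]
    neg = [s for s, y in zip(scores, outcomes) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one customer in each outcome group")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

# Hypothetical model scores and observed outcomes (1 = second purchase made)
scores = [0.9, 0.8, 0.4, 0.35, 0.2]
outcomes = [1, 0, 1, 0, 0]
area = auroc(scores, outcomes)
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect separation, which is the scale on which the values in Table II should be read.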
The variables included in each model are shown in Table III. The analysis clearly
indicates that cluster behaviour differs in terms of the effects of variables on the
likelihood of a second purchase being made. The results for Clusters 1-3 are similar.
Current age has a positive effect on likely subsequent purchase, while age at first
purchase decreases the potential. This is explained partly by the collinearity of these
two variables. The finding means that the older the current age of a customer, the
greater the likelihood that a second purchase will be made. However, the older the
customer at the time of the first purchase from the company, the lower the likelihood of
a second purchase. This reinforces the belief that it will be beneficial to make efforts to
attract younger customers, with the aim of developing longer term relationships with
them.
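The direction of these two effects can be read from the standard form of Cox's proportional hazards model. The notation below is an assumption introduced for illustration: $x_{\text{age}}$ denotes current age, $x_{\text{start}}$ age at first purchase, and $h_0(t)$ the baseline hazard of a second purchase.

```latex
h(t \mid x) = h_0(t)\,\exp\!\left(\beta_{\text{age}}\,x_{\text{age}} + \beta_{\text{start}}\,x_{\text{start}} + \dots\right),
\qquad
\frac{h(t \mid x_{\text{age}} + 1)}{h(t \mid x_{\text{age}})} = e^{\beta_{\text{age}}}
```

A positive coefficient such as $\beta_{\text{age}} > 0$ gives a hazard ratio above one, so each additional year of current age raises the rate of second purchase, while $\beta_{\text{start}} < 0$ gives a ratio below one for each additional year of age at first purchase.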
The analysis suggests that being married increases the likelihood of a second
purchase, and being female decreases it. This is consistent with previous research,
which has found men to be more financially involved than women, in general
(Harrison, 1997). Indeed, research commissioned by the Financial Services Authority
(2001) found that, while the take-up of financial products by women had increased
throughout the 1990s, their level of involvement generally was still below that of their
male counterparts. This is more pronounced for married and co-habiting women, due
to the devolution of financial responsibility to the husband or partner when part of a
couple.
Financial ACORN exhibits an effect on Clusters 1 and 2, indicating that the greater
the financial sophistication, the greater the likelihood of repurchase. The results are the
same for the whole sample in the generic model, given the dominance of Clusters 1 and
2. Financial ACORN was not significant for Cluster 3, where product purchased had a
significant effect on the likelihood of repurchase. This could be because that cluster
contained customers with the lowest average age when they first purchased a product
from the company. It is at that point in their life cycle that important first purchases are
often made, such as mortgages, which can act as important triggers for subsequent
purchases.

Table II. Summary of the survival analysis

                      All      Cluster 1  Cluster 2  Cluster 3  Cluster 4  Cluster 5
Likelihood            1,009    234        405        199        12.2       122
DF model              20       9          14         13         5          14
Probability           0.000    0.000      0.000      0.000      0.032      0.000
Sample size           10,976   4,784      3,493      1,495      404        800
AUROC                 0.848    0.738      0.782      0.669      0.813      0.785
Lower CI for AUROC    0.839    0.721      0.764      0.637      0.752      0.748
Upper CI for AUROC    0.856    0.756      0.800      0.701      0.874      0.822

Table III. Variables included in the survival analysis for specific clusters

All Cluster 1 Cluster 2 Cluster 3 Cluster 4 Cluster 5
Variables included Age Age Age Age Cur.Mrtl Age.Sdt
Age.Sdt Age.Sdt Age.Sdt Age.Sdt Gender Cur.Mrtl
Cur.Mrtl Cur.Mrtl Cur.Mrtl Cur.Mrtl ACORN
ACORN ACORN ACORN Product Product
Gender Gender Gender
Variables excluded Product Product
Notes: Age.Sdt – age at first purchase; ACORN – Financial ACORN category; Cur.Mrtl – current
marital status; Product – type of product purchased
Clusters 4 and 5 exhibit distinctly different behaviour. Age is not significant in
either of the two, perhaps due partly to the limited spread of ages within the clusters.
Age at first purchase is not significant for Cluster 4 but is for Cluster 5, indicating that
the higher the age at first purchase the lower the likelihood that a second purchase will
be made. Gender has no effect for Cluster 5 but, in Cluster 4, being female increases the
probability of a second purchase. Being married positively affects the chance of a
second purchase in Cluster 4, but has a negative effect in Cluster 5.
With the exception of the excluded Cluster 6, Cluster 4 contains both the oldest
customers and the largest proportion of women. Given its lifestage profile, it might be
reasoned that many of those females would be widowed and in charge of their own
finances, meaning that the notion of deferring to the male head of the household in
financial matters would not necessarily hold true for this cluster. In Cluster 5, both the
Financial ACORN classification and the product purchased had an effect on
repurchasing behaviour. Neither was significant for Cluster 4.
This analysis shows clearly that Clusters 4 and 5 are different from Clusters 1, 2 and
3 in terms of the variables which affect the likelihood of a second purchase, providing
empirical support for the argument made earlier for lifestage segmentation, and
reinforces the behavioural differences between lifestages.
Turning to the time dimension of the second purchase, the analysis reveals some
interesting cluster differences, further reinforcing the differential behaviour of market
segments with respect to financial services. Clusters 1, 2 and 3 display survival curves
similar to the generic survival curve, as shown in Figure 4. Clusters 4 and 5 are similar
to each other but distinctly different from the others in terms of both the shape and
apparent asymptotic level of the curves. Figure 5 shows the survival curve for Cluster 5.
For Clusters 1-3 and the generic model, the curves decrease rapidly for the first 2,000
days, or approximately six years, and then start to become asymptotic. This suggests
that customers who will purchase again will do so within that period. Beyond this
timeframe, the likelihood of repurchase rapidly decreases. The asymptotic value from
the curves indicates the proportion of customers in this sample unlikely to make a
second purchase. For the generic model the value of 0.35 indicates that to be just over a
third of the total. The result is the same for Cluster 1, whereas the value of 0.50 for
Cluster 2 suggests that as many as half of those customers would be unlikely to make a
second purchase. Cluster 3 exhibits a better repurchase probability than the average,
with the asymptotic value of 0.22 suggesting that somewhat more than three quarters
of those customers might repurchase.
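The arithmetic behind these readings is simple enough to set out explicitly. The plateau values below are those quoted in the text; the variable names are illustrative.

```python
# Plateau (asymptotic) survival values reported in the text
plateau = {"Generic": 0.35, "Cluster 1": 0.35, "Cluster 2": 0.50, "Cluster 3": 0.22}

# Share expected to make a second purchase eventually = 1 - plateau
eventual_repurchase = {name: round(1.0 - s, 2) for name, s in plateau.items()}

# The 2,000-day window expressed in years
window_years = 2000 / 365.25
```

The 2,000-day window works out at roughly five and a half years, which the text rounds to approximately six.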
In the case of Clusters 4 and 5, the curves show a much gentler decrease and no obvious
asymptotic behaviour. This suggests that individuals in these clusters are still likely to
make a second purchase over time. However, with the exception of Cluster 6, they are
the smallest subsamples in the study.
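To show how a survival curve of this kind is built up from customer records, the following sketch uses the Kaplan-Meier product-limit estimator. This is offered as a simple non-parametric stand-in and is not necessarily how the curves in Figures 4 and 5 were produced; the durations are hypothetical.

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct event time.

    times  -- days until second purchase or censoring
    events -- 1 if the second purchase was observed, 0 if censored
    """
    survival = 1.0
    curve = []
    for t in sorted({t for t, d in zip(times, events) if d == 1}):
        # Customers still without a second purchase just before day t
        at_risk = sum(1 for u in times if u >= t)
        # Second purchases observed exactly on day t
        purchases = sum(1 for u, d in zip(times, events) if u == t and d == 1)
        survival *= 1.0 - purchases / at_risk
        curve.append((t, survival))
    return curve

# Hypothetical durations in days; 0 in `events` marks a censored customer
times = [100, 300, 300, 800, 2500, 2500]
events = [1, 1, 0, 1, 0, 0]
curve = kaplan_meier(times, events)
```

The curve stops falling after the last observed purchase, and the level at which it flattens is the plateau interpreted in the text as the share of customers unlikely to buy again. A curve that is still falling at the end of the observation period, as for Clusters 4 and 5, has no such plateau.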
Figure 4. Survival curve for the generic model (survival probability against time in days)
Figure 5. Survival curve for cluster 5 (survival probability against time in days)
Discussion
The survival analysis has shown similarities in behaviour among “Moderately
Financially Active Adults”, “Financially Involved Adults” and “Sophisticated Early
Starters” in that a proportion of those segments are unlikely to return to make another
purchase. For those who do return, the timeframe within which repurchase is likely to
take place is estimated to be about six years. While this may seem to be a long interval
at first sight, financial products are not normally purchased on a frequent basis. One
exception would be general insurance cover renewed yearly, but mortgages are
renewed or revised much less frequently and pensions may be purchased only once in a
lifetime.
“Sophisticated Middle Agers” and “Financially Moderate Seniors” displayed
convergent behaviour patterns, but differed markedly from the other three
segments. The apparent lack of asymptotic behaviour in the survival curves for
these two suggests that these individuals are still likely to make a further
purchase. The retention opportunity of these segments extends beyond the
timeframe of the others.
Thus, the findings indicate that two broader groups of segments exist, in terms
of retention propensity. The “mature” segments exhibit a greater likelihood of
retention than the “younger” segments, which is both interesting and consistent
with work by Moschis et al. (1997, 2002) suggesting that mature consumers like to
build relationships with companies. Moreover, relationships are more likely to be
maintained with mature consumers because, while they may be cynical at times,
they are more likely to trust companies and their employees than younger
customers are.
This is a significant finding in that the UK, in common with a number of other
industrialised nations, is experiencing an ageing of the population and a consequent
increase in the proportion of older people. Those over 55 are particularly attractive in
financial terms, many of them being comparatively wealthy: income rich, asset rich
and the recipients of substantial windfalls in the form of inheritance (Silman and
Poustie, 1994). Not surprisingly, they have been found to account for more than half of
all discretionary spending (The Economist, 2002).
However, the two “mature” segments identified in this study are the smallest,
accounting for only 10 percent of the sample collectively, compared with the 31 percent
in the “Moderately Financially Active Adults” segment alone. This has clear
implications for marketing strategy.
Managerial implications
The highly competitive nature of the financial services sector requires its marketing
planners to find effective approaches for maintaining relationships with customers.
Unlike other areas of consumption, the time lag between purchases is often relatively
long and the needs of customers tend to vary with lifestage. In such a context, the
importance of maintaining a relationship with customers is paramount, since the cost
of “cross-selling” and “up-selling” to existing customers is likely to be less than that of
acquiring new customers.
This paper has shown that, by using a technique such as survival analysis, it is
possible to ascertain not only the likelihood of subsequent purchases being made but
also the timeframe in which that is likely to occur. In the first instance, it is important
for marketers in financial institutions to understand the relative likelihood that a
customer or set of customers will or will not re-purchase. This can be achieved at the level
of the entire customer base, or for specific customer groups if it is known that segment
differences exist. Such knowledge forms a sound basis for decisions about the
allocation of targeting effort.
Survival analysis can further estimate the timeframe in which re-purchasing may
take place. For marketing planners, this is especially important in identifying windows
of opportunity for effective marketing communications. The consequences of being
“too early” or “too late” in this respect are well understood by marketers. If initiatives
can be timed to reach customers or prospects when they are likely to be in their “ready
to buy” phase, the impact will be enhanced and wastage will be reduced.
Owing to the relatively small number of customers holding three, four or five of one
company’s products among the almost 11,000 whose data were analysed in this study,
it was not possible to conduct a meaningful analysis of transactions after the second
purchase. In other circumstances, given the availability of the necessary data, it would
be feasible to estimate the time to subsequent repurchasing.
The current study has not considered what the second purchase might be, but that
is the subject of ongoing work by the authors. This analysis could be dealt with in
several ways, for example, by applying a competing risk model or by treating it as a
semi-Markovian problem, with times between states modelled using parametric and
non-parametric distributions with an estimated transition matrix.
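As a rough indication of what an estimated transition matrix might look like, the sketch below counts moves from one product to the next across purchase histories and normalises each row. It covers only the transition part of such a model, not the times between states, and the product names and histories are hypothetical.

```python
from collections import Counter, defaultdict

def transition_matrix(sequences):
    """Row-normalised counts of moves from one product to the next."""
    counts = defaultdict(Counter)
    for seq in sequences:
        for current, following in zip(seq, seq[1:]):
            counts[current][following] += 1
    matrix = {}
    for current, row in counts.items():
        total = sum(row.values())
        matrix[current] = {nxt: n / total for nxt, n in row.items()}
    return matrix

# Hypothetical purchase histories; product names are illustrative only
histories = [
    ["life", "pension"],
    ["life", "pension", "investment"],
    ["life", "investment"],
    ["pension", "investment"],
]
probs = transition_matrix(histories)
```

Each row gives the estimated probability of the next product conditional on the current one, which is the quantity a cross-selling campaign would need in order to decide what to offer as well as when.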
Conclusion
The objective of the study reported here was to use information gathered from a data
warehouse to develop insights into a customer base and to explore the marketing
opportunities that might arise. Based on a large sample of the customers of an
international insurance company, the paper illustrates the application of the survival
analysis method to the study of cross-selling. Cox’s proportional hazards regression has
several advantages over other similar regression techniques, chief among which is the
treatment of time as a continuum rather than as discrete episodes, which reflects the
reality of a customer relationship.
The first stage of the analysis used standard clustering approaches to segment the
sample into identifiable sub-groups, and found that those exhibited different buying
behaviours, which were potentially the basis for decisions about the most appropriate
marketing campaign strategy for each one. The study also identifies the variables or
characteristics within the identified market segments that have the greatest effect on
the likelihood of repurchasing, thus allowing a closer targeting of likely prospects for
cross-selling initiatives.
References
Agrawal, R. and Srikant, R. (1995), “Mining sequential patterns”, Proceedings of the 11th
International Conference of Data Engineering (ICDE).
Ansell, R.O. and Ansell, J.I. (1987), “Modelling the reliability of sulphur sodium batteries”,
Reliability Engineering, Vol. 17, pp. 127-37.
Ansell, J.I. and Phillips, M.J. (1994), Practical Methods for Reliability Data Analysis, Clarendon
Press, Oxford.
Blodgett, L.L. (1992), “Research notes and communications factors in the instability of
international joint ventures: an event history analysis”, Strategic Management Journal,
Vol. 13 No. 6, pp. 475-81.
Chen, K.C.W. and Lee, C-H.J. (1993), “Financial ratios and corporate endurance: a case of the oil
and gas industry”, Contemporary Accounting Research, Vol. 9 No. 2, pp. 667-94.
Collett, D. (1994), Modelling Survival Data in Medical Research, Chapman and Hall, London.
Cox, D.R. (1972), “Regression models and life tables”, Journal of the Royal Statistical Society, Series B,
Vol. 34 No. 2, pp. 187-220.
Evans, M. (2002), “Prevention is better than cure: redoubling the focus on customer retention”,
Journal of Financial Services Marketing, Vol. 7 No. 2, pp. 186-98.
Felvey, J. (1982), “Cross-selling by computer”, Bank Marketing, pp. 25-7.
Financial Services Authority (2001), “Women and personal finance: the reality of the gender
gap”, Consumer Research, Vol. 7, April.
Hand, D.J. (1997), Construction and Assessment of Classification Rules, Wiley, Chichester.
Harrison, T. (1997), “Mapping customer segments for personal financial services: replication and
validation”, Journal of Financial Services Marketing, Vol. 2 No. 1, pp. 39-54.
Hebden, J.J. and Pickering, J.F. (1974), “Patterns of acquisition of consumer durables”, Oxford
Bulletin of Economics and Statistics, Vol. 36, pp. 67-94.
Kamakura, W.A., Ramaswami, S.N. and Srivastava, R.K. (1991), “Applying latent trait analysis
in the evaluation of prospects for cross-selling of financial services”, International Journal
of Research in Marketing, Vol. 8, pp. 329-49.
Kamakura, W.A., Wedel, M., de Rossa, F. and Mazzon, J.A. (2003), “Cross-selling through
database marketing: a mixed data factor analyzer for data augmentation and prediction”,
International Journal of Research in Marketing, Vol. 20 No. 1, pp. 45-65.
Kasulis, J.L., Lusch, R.F. and Stafford, E.F. Jr (1979), “Consumer acquisition patterns for durable
goods”, Journal of Consumer Research, Vol. 6, pp. 47-57.
Moschis, G.P., Lee, E. and Mathur, A. (1997), “Targeting the mature market: opportunities and
challenges”, Journal of Consumer Marketing, Vol. 14 No. 4, pp. 282-93.
Moschis, G., Bellenger, D. and Curasi, C. (2002), “Financial service preferences and patronage
motives of older consumers”, Journal of Financial Services Marketing, Vol. 7 No. 4.
Paas, L.J. (1998), “Mokken scaling characteristic sets and acquisition patterns of durable and
financial products”, Journal of Economic Psychology, Vol. 19 No. 3, pp. 353-76.
Paas, L.J. (2001), “Acquisition patterns of products facilitating financial transactions: a
cross-national investigation”, International Journal of Bank Marketing, Vol. 19 No. 7,
pp. 266-75.
Prentice, R.L., Williams, B.J. and Peterson, A.V. (1981), “On regression analysis of multivariate
failure data”, Biometrika, Vol. 68, pp. 373-9.
Prinzie, A. and Van den Poel, D. (2006), “Investigating purchasing-sequence patterns for financial
services using Markov, MTD and MTDg models”, European Journal of Operational
Research, Vol. 170 No. 3.
Punj, G. and Stewart, D.W. (1983), “Cluster analysis in marketing research: review and
suggestions for application”, Journal of Marketing Research, Vol. 20, pp. 134-48.
Silman, R. and Poustie, R. (1994), “What they eat, buy, read and watch”, Admap, July/August,
pp. 25-8.
Soutar, G.N. and Cornish-Ward, S.T. (1997), “Ownership patterns for durable goods and financial
assets: a Rasch analysis”, Applied Economics, Vol. 29 No. 11, pp. 903-11.
Srivastava, R. and Shocker, A.D. (1987), “Strategic challenges in the financial services industry”,
in Pettigrew, A. (Ed.), The Management of Strategic Change, Basil Blackwell, Oxford.
Staber, U.H. (1992), “Organizational interdependence and organizational mortality in the
cooperative sector: a community ecology perspective”, Human Relations, Vol. 45 No. 11,
pp. 1191-212.
Stafford, E.F., Kasulis, J.J. and Lusch, R.F. (1982), “Consumer behaviour in accumulating
household financial assets”, Journal of Business Research, Vol. 10, pp. 397-417.
Stepanova, M. and Thomas, L. (2002), “Survival analysis methods for personal loan data”,
Operations Research, Vol. 50, pp. 277-89.
Stewart, K. (1998), “An exploration of customer exit in retail banking”, International Journal of
Bank Marketing, Vol. 16 No. 1, pp. 6-14.
The Economist (2002), “Over 60 and overlooked”, The Economist, US Edition, 10 August.
Thomas, L.C., Edelman, D.B. and Crook, J.N. (2002), Credit Scoring and its Applications, SIAM
(Society for Industrial and Applied Mathematics), Philadelphia, PA.
Zeithaml, V.A. (2000), “Service quality, profitability and the economic worth of customers: what
we know and what we need to learn”, Journal of the Academy of Marketing Science, Vol. 28
No. 1, pp. 67-85.
Appendix
Cluster characteristics, as percentages of sample
Cluster number
Characteristic 1 2 3 4 5 6
Cluster size
Proportion of sample 29 31 18 7 11 3
Financial ACORN
A – Financially sophisticated 100.0 100.0 100.0
B – Financially active 100.0 37.5
C – Financially moderate 40.6 44.9
D – Financially inactive 20.6 5.6
Unclassified 38.8 12.0
Current age
Mean age 45.7 45.1 43.9 64.8 50.9 65.0
Age at first purchase
Mean age at first purchase 33.4 31.7 26.7 60.8 42.3 62.6
Gender representation
Male 61.5 63.9 66.7 57.2 58.3 57.5
Female 38.5 36.1 33.3 42.8 41.7 42.5
Current marital status
Single 7.0 8.1 7.1 1.3 3.3 0.2
Married 18.4 15.4 16.6 12.1 24.0 7.6
Separated 0.4 0.4 0.4 0.1 0.5 0.3
Divorced 0.9 0.9 0.8 1.5 1.4 0.3
Widowed 0.2 0.1 0.1 1.2 0.2 0.8
Unclassified 73.0 75.1 75.0 83.8 70.6 90.7
No of purchases with the same company
1 product 53.5 58.4 36.2 74.9 60.3 80.5
2 products 11.2 10.6 10.1 13.4 13.2 10.3
3 products 12.2 9.9 14.6 5.9 11.3 6.2
4 products 11.6 9.9 19.0 3.4 8.6 1.6
5 or more products 10.6 10.4 19.1 2.0 6.2 1.0
Table AI.
Corresponding author
Tina Harrison can be contacted at: tina.harrison@ed.ac.uk
To purchase reprints of this article please e-mail: reprints@emeraldinsight.com

Or visit our web site for further details: www.emeraldinsight.com/reprints
