

Using Segmented Models for Better Decisions

Summary

Experienced modelers readily understand the value to be derived
from developing multiple models based on population segment
splits, rather than a single model for the entire population, in any
model development project. Analysis of data and model development
on subpopulations reveals unique predictive patterns that build greater
precision into the segmented model. However, experienced modelers
are also aware of the pitfalls in segmentation analysis. It can be
extremely time-consuming, and result in over-fitting—segmentation
trees with excessive leaf nodes that may or may not add value in
predicting the target of interest. A common response to over-fitting,
selecting the best and second-best splits to “grow” the segmentation
tree, limits the possibility of finding the most promising initial splits,
and splits at each node.

Segmented models are collections of models, where each model is developed for a different
portion of the total population. Since the relationship between predictors and the target is
often different in each subpopulation, building a segmented scorecard frequently results in
more powerful predictions than a single scorecard. Hence segmented models are widely
recognized as a highly effective way to increase predictive power and capture interaction
effects, while maintaining a completely sustainable and tractable scoring formula.

However, the traditional process for deriving such systems is often too laborious to justify the
effort. Simple decision tree growth algorithms are ill-suited to the task. This paper describes a
solution that intelligently automates the search for optimal segmented models, and allows the
analyst to quickly discover, compare, engineer and document the most suitable segmented
models for their predictive modeling application.

Based on experience in developing thousands of segmented models, FICO developed
a segmentation search algorithm within the Segmented Scorecard module of FICO®
Model Builder that enables modelers to combine pure machine learning with their own
domain knowledge. With that, modelers can efficiently test innumerable combinations of
segmentation variables, split points and split sequencing, and quickly derive the best possible
segmentation schemes. Further, the tool provides unique collaboration features to accelerate
the analysis, testing, documentation and implementation of the final segmented model, even
when its segmentation scheme is defined or partially defined a priori.

The Motivation for Segmented Model Development

The process of predictive model development requires the identification of features in the data
that exhibit clear and interpretable patterns with respect to the target variable of interest. For
many predictive modeling applications it is common to develop a single predictive model for
the entire population. However, segmented models can often produce better decision-enabling
predictions—through higher precision, lower bias, or both—than a single, standalone model.

With a segmented model, a segmentation tree is used to define nested population splits,
and individual predictive models are developed for the leaf nodes of that segmentation tree. Each
leaf node is commonly referred to as a subpopulation, and the predictive model, typically a
scorecard, for each subpopulation is unique to that population, including unique choices of
design constraints (fine and coarse binnings, user-defined restrictions, variable selections, etc.).
Other training elements (e.g., target variable, sample weight, reason code list) are generally
identical across the subpopulations. Typically the individual leaf-node models are scaled (or
“aligned”) in such a manner that they all predict on a coherent, shared scale. By aligning all leaf
node models to a single scale, score interpretation is possible without any reference to the
subpopulation from which it arose. Thus, the scaling objective is another “training” element
common to all leaf node models.

Single scorecard (left) vs. segmented scorecard (right): a collection of three scorecards works as
a hybrid of models to form a cohesive system that predicts more accurately than a single scorecard.

June 2014 ©2014 Fair Isaac Corporation. All rights reserved.

Consider, for example, an analyst charged with the task of ranking a population of prospective
customers by the likelihood that each will respond to an offer for a new low-interest credit
card. One of the key predictors of responsiveness to this offer is the “age” of our prospective
customers. As a standalone predictor, age might exhibit the following pattern:

FIGURE 2: PROBABILITY OF RESPONSE VS. AGE
[Plot of P (Response) against age, for ages 15 to 95.]

That is, when the entire population is ranked by age, the analyst notices that younger
prospects are more likely to respond, and older prospects are less likely to respond. This
discovery could be leveraged to accomplish the task at hand by including age in a predictive
model (e.g., a scorecard) such that younger prospects receive a higher score (i.e., indicating
higher response likelihood) and older prospects receive a lower score. To build a good model,
the analyst will continue searching the data to find other predictors in addition to “age” that
also exhibit useful patterns with respect to the target, and contribute favorably to the total
power of the scorecard.


[FIGURE 3: P (Response) vs. Age, for ages 15 to 95, plotted separately for the lower income
sub-segment and the higher income sub-segment.]

During the course of this investigation, however, the analyst also notices that not all younger
prospects are highly likely to respond to this particular offer. In fact, the nice, intuitive response
pattern shown above begins to look quite different when the population is split into
sub-segments representing different levels of income.


It turns out that the relation of response to age is much stronger in the lower income
sub-segment when compared to the overall population. In fact, it appears that “age” becomes
almost useless as a predictor of response when income is high.

Rather than build a single model for the entire population, this discovery leads the analyst
to build a separate model in each of these subpopulations created by splitting on income,
with age included as a predictor in the lower income scorecard but not in the higher
income scorecard. As the analyst continues with her analysis in these two subpopulations,
she discovers some additional variables that exhibit different predictive patterns across the
two groups. This leads her to develop a segmented model that yields much more accurate
estimates of responsiveness than can be accomplished with a single model trained from the
total population.
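The two-segment system described above can be sketched in a few lines. This is only an illustration: the income threshold, the field names (`income`, `age`, `prior_responses`) and the point weights are hypothetical and do not come from the analysis in the text.

```python
# Illustrative only: threshold, field names and weights are assumptions.
INCOME_SPLIT = 50_000  # hypothetical split point on annual income


def score_lower_income(p):
    # Age is a predictor in this segment: younger prospects score higher.
    return 600 - 2.0 * p["age"] + 15 * p["prior_responses"]


def score_higher_income(p):
    # Age is left out of this segment's model, where it carries little signal.
    return 480 + 15 * p["prior_responses"]


def segmented_score(p):
    """Route the prospect through the segmentation tree, then apply the leaf model."""
    if p["income"] <= INCOME_SPLIT:
        return score_lower_income(p)
    return score_higher_income(p)


young_low = {"income": 30_000, "age": 25, "prior_responses": 1}
old_low = {"income": 30_000, "age": 65, "prior_responses": 1}
young_high = {"income": 90_000, "age": 25, "prior_responses": 1}
old_high = {"income": 90_000, "age": 65, "prior_responses": 1}
```

In the lower income segment the younger prospect scores higher than the older one, while in the higher income segment the two scores are identical because age is not part of that leaf model.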

The segmented model is a strong predictor because it is able to capture the heterogeneous
nature of population subsets. For example, in credit card account management, the spending
habits (and more importantly, risk patterns connected to those spending habits) are known to
differ between revolvers (card holders carrying a balance each month) and transactors (card
holders paying their statements in full each month). For auto insurance providers, the relevance
of different predictors may change by age group. For example, academic measures such as
grade point averages might be useful in predicting driving risk for new drivers, but are entirely
irrelevant for more experienced drivers.

In some cases, we find the oft-imagined (but rarely encountered) perfect interaction, where
the pattern of a good predictor is even reversed along two sides of a subpopulation split. For
subpopulation A, the variable is ascending in risk, while for subpopulation B, it is descending in
risk. However, this is quite rare. More commonly, we see partial reversals—where the bottom of
a u-shaped pattern (or the top of an n-shaped pattern) varies by subpopulation, and thus there
are local areas of a predictor where one subpopulation sees a rising pattern, and the other still
sees a declining pattern.

Successful segmentation schemes create predictive lift by locating and capturing the
additional signal arising from three areas that might vary across well-designed subpopulations:

1. Differences in available information (selected, relevant predictors).

An example of this is a split on “no delinquency versus any delinquency”. On one side of
that split you will have models that include measures (severity, frequency and recency) of
delinquency, and on the other side, by definition, you will have no such predictors.

2. Differences in predictive magnitude (relative strengths of predictors).

A possible example of differences in predictive magnitude is a split on driver age. For very
young drivers, the grade point average (GPA) is a strong predictor, but for post-high school and
post-college age drivers GPA is less of a predictor (if any at all).

3. Differences in predictive pattern (slope or peak of in-common predictors).

In a credit scoring model, on both sides of a split on modest versus no delinquency the
revolving credit utilization will be predictive, but it might be more predictive on the no
delinquency side.

Figure 4 illustrates cases [1] and [2]. Figure 3 is an example of [3].


FIGURE 4: Why Build Segmented SCORECARDS?

Predictors and Their Patterns Can Vary Across Subpopulations
Past Payment

Dirty Clean

Usage and Signal Strength

Highest utilization ✓
Net fraction revolving burden ✓
Length of credit history ✓ ✓
# of inquiries in last 12 months ✓ ✓
Months since last 30+ days past due ✓
Severity of worst delinquency ever ✓

The Shortcomings of the Traditional Approach to Segmented Model

In the absence of an automated approach, some analysts use a manual search process.
Typically, this involves starting with the entire population and building a scorecard to use
as a baseline. Then, the search begins by using their business knowledge to split the entire
population into two segments. The modeler then builds a model for each segment to see if
making that split enabled them to build a segmented model that is more predictive than the
baseline scorecard. If the baseline is better, the modeler tries making a different initial split, again
based on business expertise. If the segmented model is better, the modeler continues, using
business knowledge again to select a variable and a split threshold to divide one of the two
populations, thus creating three segments. The modeler builds a model for each segment
and compares the result to the baseline and to the segmented model with two segments. The modeler
continues recursively, until either predictive power stops improving, or the modeler runs out of
time or energy.

Another common practice seen in the industry is to begin with decision tree techniques
such as CHAID or C&RT to develop a decision tree model that predicts the target of interest.
Techniques such as cross validation are used to prune the tree back and reduce over-fitting.
As you can see in the example in Figure 5, even once the tree is pruned it may have more
segments than you would like for developing a segmented model.



[FIGURE 5: Pruned decision tree for AutoLoanRisk5. The root node is 87.4% Good (100000 records).
The tree splits on CbUtilization, Occupation, FinanceCompany, CbInquiriesLast5Months and
LoanToValueRatio, with node Good rates ranging from 33.6% to 99.0%.]

Of course, the fitting objective of CHAID and C&RT trees is to attain purity in the leaf nodes with
respect to the target, and to make the tree itself a direct predictor of the target variable.
While these are familiar tree-growing algorithms, and they provide useful guidance, they have
no specific abilities to ensure the tree will represent a good segmentation scheme for building a
segmented model.

On the other hand, the goal of the segmentation tree is not to predict the outcome directly, but
rather to discover distinct populations to develop models on that will result in the most powerful
scoring system—based on variations of the predictors and their patterns across leaf models.
Given this distinction, how can we make best use of CHAID and C&RT tools?

Rather than simply proceeding to develop a segmented model based on the best tree that
comes directly out of a CHAID or C&RT algorithm, it is common to use the insights from the
algorithm to develop a handful of candidate segmentations to consider. In the example shown
in Figure 6, we investigated two segmentations, selecting the best and second best initial split
using C&RT.

[FIGURE 6: Two candidate segmentation trees, showing bad rate and record count at each node.]

Candidate 1
AutoLoansRisk5: 12.60% (100000)
    CbUtilization<=75.003: 8.94% (81488)
        Occupation=(Banker Doctor Executive Lawyer): 2.65% (29227)
        Occupation=(Manager Other): 12.46% (52261)
    CbUtilization>75.003: 28.70% (18512)
        FICO® Score<=502.5745: 51.98% (6787)
        FICO® Score>502.5745: 15.22% (11725)

Candidate 2
AutoLoansRisk5: 12.60% (100000)
    FICO® Score<=500.0055: 26.58% (21956)
        LoanToValueRatio<=89.9335: 18.53% (9825)
        LoanToValueRatio>89.9335: 33.11% (12131)
    FICO® Score>500.0055: 8.66% (78044)
        Income<=5009.24: 19.14% (9453)
        Income>5009.24: 7.22% (68591)

Looking at the first candidate, we can see that the tree is effective in finding subpopulations
that have high and low bad rates respectively (51.98% vs 2.65%). The second candidate is
not as effective at finding subpopulations with high and low bad rates, although that does
not necessarily mean it is an inferior tree for building a segmented model. Indeed, a poorly
predicting tree can be an excellent segmentation tree. This is because finding a tree with
purity in the leaf nodes is a very different goal than finding a tree that identifies interactions
that will allow productive leaf-node modeling!

A Much-Improved Approach to Segmented Scorecard Development

FICO has been a worldwide leader in the development of highly effective and robust segmented
model systems for more than 25 years. We have developed unique technologies that help
streamline the search process and identify the combination of splitters and split points to
define an overall segmentation tree that is best for segmented modeling. Several years ago,
we designed the Adaptive Random Trees technology to use a global procedure for finding a
great segmented model. After extensive research, our analysts have furthered their thinking
on segmentation modeling and have come up with an even better method for searching
thousands of combinations of segmentations and models to find the best segmented model.

The Segmented Scorecard module in FICO® Model Builder searches the vast space of potential
segmentation trees to identify an optimized segmented model system. Based on
user-specified splitter variables, the module’s search algorithm analyzes thousands of candidate
splits using scorecards to calculate precise performance estimates, ensuring only the most
powerful splits are chosen. It then ranks and returns the best segmented model for further
refinement and allows for immediate tuning of the subpopulation models.

Even if you plan to use existing segmentation schemes, the Segmented Scorecard module
helps you build better model suites in less time by centralizing the definition and management
of all leaf node models, enabling collaborative, parallel development, and providing a
complete deployment of the modeling system without manual recoding.

How FICO® Segmented Models Work

Rather than using inspiration from decision tree analysis to manually construct a handful of
candidate segmentations, FICO’s best practice solution begins by considering every possible
initial split and then developing scorecards with automated binning and variable selections for
each leaf node. The true segmented model divergence is computed to identify the best splits
as the trees are grown. While a greedy algorithm would immediately focus on building out
the two-way split that resulted in the best two-leaf segmented model, FICO looks at a diverse
handful of the best initial splits and proceeds recursively to identify the best segmented
models with three leaves, four leaves, etc., until the analyst’s stopping criterion is met. This is
often based on a maximum tree size (number of leaves) or a minimum count requirement
for each leaf node. This approach keeps track of not just the best split at each node, but rather a
number of promising splits.
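The text does not spell out how divergence is computed. As a rough sketch, the snippet below assumes a definition commonly used in scorecard work: the squared gap between the mean scores of goods and bads, divided by the average of the two class variances. The pooling of aligned leaf scores in `segmented_divergence` is likewise an assumption about what "segmented model divergence" means, not a description of the module's internals.

```python
from statistics import mean, pvariance


def divergence(good_scores, bad_scores):
    """Squared gap between class mean scores over the average class variance.

    Assumed definition; requires some score variance in at least one class.
    """
    gap = mean(good_scores) - mean(bad_scores)
    avg_variance = (pvariance(good_scores) + pvariance(bad_scores)) / 2
    return gap ** 2 / avg_variance


def segmented_divergence(leaves):
    """Pool the (already aligned) scores from every leaf, then measure once."""
    goods = [s for leaf in leaves for s in leaf["good_scores"]]
    bads = [s for leaf in leaves for s in leaf["bad_scores"]]
    return divergence(goods, bads)


# Hypothetical aligned scores from a two-leaf segmented model.
leaves = [
    {"good_scores": [1, 3], "bad_scores": [-1, 1]},
    {"good_scores": [2], "bad_scores": [0]},
]
system_divergence = segmented_divergence(leaves)
```

Because the leaf models share one scale, the system-level measure can be compared directly with the divergence of a single baseline scorecard on the same records.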

A key principle of FICO’s patented Segmented Scorecard module is the automated training of
scorecards to quantify the true performance of each candidate split as the tree is recursively
grown. An additional benefit is obtained from retaining the best N splits at each node for
further growth, creating a variety of similarly performing but differently structured trees, and
avoiding the pitfalls frequently associated with greedy searches. The algorithm retains the
most promising candidate segmented models. To streamline review, when the algorithm
terminates, comparison reports document summarized and detailed information for the most
predictive segmented models.


Data Requirements of FICO® Segmented Models

Generally, developing a segmented model has data requirements similar to those of developing a
single model for the entire dataset. The dataset should contain a target variable to predict
(e.g., has the account ever been 60+ days past due), a set of candidate predictor variables, and
optionally a sample weight for each observation. In addition to these general requirements,
it is important that the dataset have enough records, such that when the data is partitioned
into segments, each segment has enough examples to build a robust model. If the target is a
binary outcome, then each segment must have enough examples of each of the outcomes
(e.g., goods and bads) to build a model.

Suppose that the modeler wants to have at least 500 examples of good customers and at
least 500 examples of bad customers for building a model for a single segment. Assume also
that the modeler will use bootstrap validation so that all 500 examples of the good and bad
customers can be used for training the model for that segment. Then, for a segmented model
with two segments, at least 2,000 records would be required. Typically, where segmented
models have anywhere between 5 and 50 segments, the number of records required for
segmented model development could be well above 10,000 records. If the modeler wishes
to use 50% of the data for training and 50% for testing, rather than bootstrap, then that
requirement doubles. The key point is that the modeler must consider the number of records
available for training in each segment and potentially acquire a much larger sample for
developing a segmented model than for building a single model.
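The arithmetic in this example can be written out as a small helper. It is a lower bound only: it assumes every segment contains exactly the minimum number of goods and bads, which real populations with rare bads will not. The function name and parameters are illustrative.

```python
import math


def min_records(n_segments, goods_per_segment=500, bads_per_segment=500,
                train_fraction=1.0):
    """Fewest records that could satisfy the per-segment minimums.

    train_fraction is the share of the data used for training: 1.0 for
    bootstrap validation, 0.5 for a 50/50 train/test split.
    """
    per_segment = goods_per_segment + bads_per_segment
    return math.ceil(n_segments * per_segment / train_fraction)


two_segments_bootstrap = min_records(2)                    # 2,000 records
two_segments_holdout = min_records(2, train_fraction=0.5)  # requirement doubles
fifty_segments = min_records(50)                           # upper end of the typical range
```

When bads are a small share of the population, the number of records needed to reach 500 bads in every segment will be far higher than this floor.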

User Inputs

In addition to providing the data, segmented modeling allows the analyst to provide input
to guide the development of the segmented model. The first key input that an analyst can
provide is a scorecard developed on the entire dataset. This serves as a benchmark against
which to compare all of the segmented models that will be developed and evaluated.

As discussed above, the analyst may wish to constrain the segmentation to ensure a particular
minimum number of examples will exist in each segment. This can be specified in terms
of the total examples in the segment, the required number of examples of the rare outcome
(e.g., the number of bads), or both. Adding these constraints ahead of time will
ensure that the resulting candidate segmentations will be useful for segmented model
development.
In order to search for a segmentation, the algorithm will need to know the candidate list
of variables to use as splitters. The list of splitters can include the same candidate variables
that will be used in the models developed for each segment, or can include different
candidate variables. For each splitter, the algorithm will also need to know the desired
potential split points. The split points are defined in a binning library so that the analyst can
select split points that are at the desirable level of granularity. For example, when segmenting
on a variable like “Income,” it would make sense to constrain the split points to values such as
$30k, $40k, $50k, ... $100k, $200k, etc. so that, first, each potential split contains sufficient
examples to build a model, and second, so that the split points are round numbers that are
presentable to a model reviewer rather than a figure such as $30,531, which would make the
resulting segmentation difficult to interpret. The binning functionality and rounding rules in
FICO® Model Builder allow the analyst to quickly define meaningful candidate split points for
each splitter variable.
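As a rough illustration of the two criteria above (round, presentable split points and enough records on each side), the sketch below filters a list of candidate cut points against a minimum count. The income values, the candidate list and the function name are made up for the example and are not part of FICO® Model Builder.

```python
def viable_split_points(values, candidates, min_count):
    """Keep candidate cut points that leave at least min_count records on each side."""
    viable = []
    for cut in candidates:
        below = sum(1 for v in values if v <= cut)
        above = len(values) - below
        if below >= min_count and above >= min_count:
            viable.append(cut)
    return viable


# Hypothetical incomes and round-number candidate split points.
incomes = [22_000, 28_500, 30_531, 41_000, 47_250,
           52_000, 58_900, 75_000, 120_000, 210_000]
round_cuts = [30_000, 40_000, 50_000, 100_000, 200_000]

print(viable_split_points(incomes, round_cuts, min_count=3))  # [40000, 50000]
```

In practice the minimum would be expressed in hundreds of records, and often separately for the rare outcome, but the filtering logic is the same.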


With the candidate split points defined, the analyst may wish to provide business guidance to
the search process. This can be done by providing a partial ordering on the splitter variables,
defining starting trees to grow out further, or by manually defining fixed segmentation trees
for comparison. First, sometimes business users want to specify in which order the variables
are used in the segmented model. For example, the user might require that the first split be
on “Product Type,” the second split be on “Risk Score,” or on “Attrition Score,” and that the
algorithm is then free to choose any variables for the subsequent splits. It is also common that
businesses will want to define the first few splits in a segmented model. For example, the first
split might be mandated to be on the distinction between transactors and revolvers. Then
the algorithm might be asked to consider making additional splits for the transactors, for the
revolvers, or for both. Finally, the user may wish to specify one or more champion trees that are
to be evaluated by automatically building models for each segment they define without any
modification to the given champion trees.
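One simple way to picture the partial ordering on splitters is a lookup from tree depth to the variables permitted at that depth, falling back to the full candidate list where nothing is mandated. This is a hypothetical representation for illustration; the variable names beyond those in the example above ("Income", "Utilization") are invented, and the actual settings format of the module is not described here.

```python
def allowed_splitters(depth, all_splitters, required_by_depth):
    """Splitters the search may use at a given tree depth (0 = first split)."""
    return required_by_depth.get(depth, all_splitters)


ALL_SPLITTERS = {"Product Type", "Risk Score", "Attrition Score",
                 "Income", "Utilization"}

# First split on Product Type, second on Risk Score or Attrition Score,
# then any splitter is allowed.
ORDERING = {
    0: {"Product Type"},
    1: {"Risk Score", "Attrition Score"},
}

first_level = allowed_splitters(0, ALL_SPLITTERS, ORDERING)
third_level = allowed_splitters(2, ALL_SPLITTERS, ORDERING)
```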

Once a segmentation is defined, it will be evaluated by building a model for each segment.
That process can also be controlled. The analyst may desire to specify the candidate variables
to be used in the model that will be automatically built for each segment. If the model is a
scorecard, the analyst may also desire to define the initial binning of each candidate variable.
With guidance provided to the algorithm, the system can proceed to identify the optimized
segmented model.

Search Algorithm

The following diagram (Figure 7) illustrates the search for the optimized set of segmented
models. The algorithm will return the top segmented model found. FICO® Model Builder will
also provide a comparison of the structure and performance of the top models, an analysis
of the segmentation logic, segment statistics, variable usage in each segment and details
regarding the variables and bins in each segment’s model.

There are a number of key points to notice when reviewing the algorithm:

• First, the algorithm starts with a population of trees where each tree splits on a different
candidate splitter variable. The algorithm then attempts to grow out each of the trees.
This is important to avoid a greedy result where even though the best first split might
be on Variable A, a better result may be obtained by splitting on Variable B and making
subsequent splits.
• Second, the quality of a segmented model is evaluated not based on how well the tree
predicts, but on how well the segmented model predicts when the tree is used to route
data to segment-level models that are specialized for the segment.
• Third, the algorithm returns not only the best segmented model, but the top segmented
models along with segmented models for any segmentations manually entered by the
analyst. This enables the analyst to understand performance with respect to alternative
segmented models that vary in terms of complexity, palatability and model performance.



FIGURE 7: Search algorithm flowchart

• Input: Train Data, Validation Data, Settings, Parent Model.
• Initialize: Start with given initial trees or one empty tree.
• Initial Statistics: Compute initial statistics on the data.
• More Trees to Grow? Are any leaf nodes further splittable? If no, output the population of
trees with their optimization measure, and the best tree. If yes, continue with the steps below.
• Next Split Node: For every tree find the next node to split (node with greatest number
of records).
• Generate Splits: For every splitter variable generate candidate splits for the next split
node of every tree.
• Node and Split Stats: Train new pairs of predictive scorecard models for each side of every
candidate split. Compute net increase in objective function to assess the value of each
candidate split.
• Choose Best N Splits: For each input tree, retain the top N augmented trees, as ranked by
increase in objective function from their newly added splits.
• Iteration Result: Take the chosen split on each tree to generate the next tree population.
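The loop in the diagram can be sketched in a few dozen lines of Python. This is a toy under stated assumptions, not FICO's implementation: each leaf model is a one-variable least-squares line standing in for a scorecard, the objective is the reduction in squared error rather than the module's own measure, the cap on the population size after each iteration is an added assumption, and all names (`fit_sse`, `search`, `keep_n`) are hypothetical.

```python
from statistics import mean


def fit_sse(rows):
    """Fit y ~ a + b*x by least squares on one segment; return its squared error."""
    xs = [r["x"] for r in rows]
    ys = [r["y"] for r in rows]
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx if sxx else 0.0
    a = my - b * mx
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))


def total_sse(tree):
    return sum(fit_sse(rows) for _, rows in tree)


def search(rows, candidate_splits, keep_n=2, max_leaves=3, min_count=5):
    """Grow a population of trees; a tree is a list of (path, rows) leaves."""
    population = [[((), rows)]]  # start with one empty tree
    while len(population[0]) < max_leaves:
        grown = []
        for tree in population:
            # Next split node: the leaf holding the most records.
            i = max(range(len(tree)), key=lambda k: len(tree[k][1]))
            path, leaf = tree[i]
            scored = []
            for var, cut in candidate_splits:
                left = [r for r in leaf if r[var] <= cut]
                right = [r for r in leaf if r[var] > cut]
                if min(len(left), len(right)) < min_count:
                    continue
                # Train a model on each side and measure the net improvement.
                gain = fit_sse(leaf) - fit_sse(left) - fit_sse(right)
                children = [(path + ((var, "<=", cut),), left),
                            (path + ((var, ">", cut),), right)]
                scored.append((gain, tree[:i] + children + tree[i + 1:]))
            scored.sort(key=lambda s: s[0], reverse=True)
            grown.extend(t for _, t in scored[:keep_n])  # best N per input tree
        if not grown:
            break  # no leaf is further splittable
        grown.sort(key=total_sse)
        population = grown[:keep_n]  # assumption: cap the population size
    return population


# Toy data: y falls with x when income <= 40 and is flat otherwise.
rows = [{"income": inc, "x": age, "y": (10 - 0.1 * age) if inc <= 40 else 5.0}
        for inc in (20, 30, 40, 60, 80, 100) for age in range(20, 71, 10)]
splits = [("income", c) for c in (30, 40, 60, 80)] + [("x", 40)]
best = search(rows, splits, max_leaves=2)[0]
```

On this toy data the best two-leaf tree splits on income at 40, because that is the only split after which both leaf models fit their segment almost exactly. The split is chosen by how well the leaf models perform, not by how well the tree alone separates the target.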


Giving the Analyst Control

The flexible nature of the Segmented Scorecard module gives the analyst unprecedented
control over the segmented model. This tool assumes that the analyst’s best thinking will bring
about the best model and therefore gives the analyst room to experiment to best determine
the path to effective decisions.

The Segmented Scorecard module in FICO® Model Builder provides capabilities to:

• Automatically rank the performance of each segmented model in a variety of measures.
• Visualize and quickly navigate segmentation trees.
• Enable business experts to refine discovered trees to infuse domain knowledge, or add
splits required by the business process.
• Empower analysts to tune every aspect of the search algorithm, including:
    • Comparing results against existing trees.
    • Specifying initial, partial trees to be empirically grown.
    • Setting binning, constraint and variable selection criteria to use in automated
    model training.
    • Controlling the trade-off between depth of search and execution time.
    • Specifying splitters to consider for the first, or any, level of the tree.
    • Controlling maximum tree size.

Global lender improves profit by 26%

A global lender wanted to improve its bottom line by reducing its bad debt rates. FICO applied
its segmented modeling technology to the lender’s model development process. The result was
a new segmented scorecard designed to achieve maximum precision in predicting delinquency
and default behavior. The lender projects a 26% profit increase (or $440k per 100k new
applicants) to its portfolio. This approach is also projected to yield a significant 8.5% profit
improvement over the lender’s old approach of manually designing a segmented model system.

Business Value of Segmented Models

By automating the process of identifying optimized segmented models, the Segmented
Scorecard module delivers substantial benefits to your model development process. The
Segmented Scorecard module allows you to:

• Boost the precision of your models. You can realize the bottom-line benefits of
an optimally tuned segmented model on nearly every modeling project. Segmented models
naturally capture the most important interactions in the dataset. Unlike other powerful
modeling paradigms, such as neural networks, the resulting segmented model gives the
user immediate insights into the interactions among variables and the predictive patterns
that are most important for modeling the populations in each segment.
• Accelerate the model development process. Automating key steps dramatically
reduces the time to develop effective segmented models. The Segmented Scorecard
module not only finds great segmentations, but guides the user through the process
of choosing, refining and deploying the final segmented model with all the required
elements such as reason code assignments and score scaling parameters.


• Improve efficiency and reduce costs. The segmentation discovery process can yield
superior segmentation trees with fewer leaves than manually created trees, reducing the
number of models required, and thus lowering the costs to develop, deploy and maintain
the best segmented model.

Segmented Modeling in Practice

FICO has applied segmented modeling across a wide range of industry applications. As the
inventor of this technology, FICO’s own model development teams have benefited from
the accelerated segmented modeling process and have been able to reap the benefits of
segmented modeling on virtually any modeling project with ample data. Our clients have
also benefited from much more effective segmentation solutions enabled by a smarter tree
search technology.

Segmented Scorecard Module Availability

The Segmented Scorecard is available as an optional module for the FICO® Model Builder
platform. The Segmented Scorecard module builds upon Model Builder’s core capabilities for
importing data, visualizing and exploring predictive patterns, defining predictive variables,
creating scorecards and other types of predictive models, evaluating their quality and swiftly
deploying predictive analytics.

FICO (NYSE: FICO) is a leading analytics software company, helping businesses in 90+ countries make better decisions that drive higher levels of growth, profitability
and customer satisfaction. The company’s groundbreaking use of Big Data and mathematical algorithms to predict consumer behavior has transformed entire
industries. FICO provides analytics software and tools used across multiple industries to manage risk, fight fraud, build more profitable customer relationships,
optimize operations and meet strict government regulations. Many of our products reach industry-wide adoption—such as the FICO® Score, the standard measure of
consumer credit risk in the United States. FICO solutions leverage open-source standards and cloud computing to maximize flexibility, speed deployment and reduce
costs. The company also helps millions of people manage their personal credit health. Learn more at

For more information
North America: +1 888 342 6336
Latin America & Caribbean: +55 11 5189 8222
Europe, Middle East & Africa: +44 (0) 207 940 8718
Asia Pacific: +65 6422 7700

FICO and “Make every decision count” are trademarks or registered trademarks of Fair Isaac Corporation in the United States and in other countries. Other product and company names herein may be trademarks of their
respective owners. © 2014 Fair Isaac Corporation. All rights reserved.

3085WP 06/14 PDF