Chapter 19 A Predictive Model For Disability Underwriting Profitability ...

19 A PREDICTIVE MODEL FOR
DISABILITY UNDERWRITING PROFITABILITY
19.1 BACKGROUND
A disability insurer developed a new underwriting manual for small groups. As with other rate
manuals, the company’s underwriting manual provides pricing, based on a number of variables,
to be applied to groups that apply for disability insurance. Daniel Skwire, writing in Chapter
25 (“The Rate Manual”) in Daniel Skwire (ed.) Group Insurance (7th ed., Actex Learning, 2016
[12]) defines the rate manual as consisting of “rates which vary by allowable case characteris-
tics (such as age, gender, and family composition) and with rating factors to be applied for
other rating characteristics, like geographic area, group size, industry, trend factor, and mor-
bidity factors” (Skwire, ed. [12]). Chapter 26 (“Long term Disability,” in Skwire [12]) ad-
dresses the type and size of rate adjustments appropriate for a number of risk characteristics
when underwriting and pricing a disability risk:
• Social Security Offsets (the probability that a claimant will receive a disability benefit
  from Social Security, and the amount of that benefit);
• Plan variations
  ◦ Benefits as a percentage of income, between 50% and 70%;
  ◦ Maximum benefits;
  ◦ Elimination period (3-, 6- and 12-month elimination periods);
  ◦ Benefit period (to age 65, to Social Security normal retirement age, or for life);
  ◦ Definition of disability (“any reasonable occupation” vs. “own occupation,” with
    the latter increasing the basic premium from the “any reasonable occupation” level);
  ◦ Offsets (e.g., for workers compensation benefits, state sickness plans, or pension
    benefits);
  ◦ Limits on certain conditions, such as mental/nervous, alcoholism and drug abuse.
• Group size and composition;
• Employee contribution and participation rates;
• Employment type (white collar vs. manual);
• Industry; and
• Average earnings.
As we discussed in Chapter 2, a detailed rating manual developed using age, sex, and other
factors should enable the carrier to make reasonably accurate predictions of the likely cost of
disability coverage for employer groups. Rating factors, however, are frequently developed in
a single dimensional way and combined within a multivariate rating model. Within individual
rate cells, therefore, it is possible that manual rates will not accurately reflect the combination
of risk factors that the specific rate cell represents. Consequently, some rating “cells” will ex-
hibit greater profitability than others.
In this chapter, we consider the feasibility and process of refining a basic manual rating model
using predictive modeling. By identifying those cells or combinations of rating factors that
represent better or worse risks, a refined model can be obtained that would better identify risk
factors associated with disability costs, and allow the company’s underwriters to segment and
rate risks more appropriately. This chapter describes the modeling and model evaluation pro-
cess.
19.2 THE MODEL
The model was developed to predict Excess Profit Margin (EPM) based on five years of active
and lapsed policy data. The manual rates contain a margin for profit, risk and contingencies,
so any profitability within the book greater than zero represents excess profit over and above
that assumed in pricing. The data is adjusted and summarized over five years at the employer
group (“policy”) level. The original 5 years of data contained 10,438 policies. Adjustments
were applied in order to limit the effect of large policies. Larger policies were truncated at a
maximum of 500 lives for each policy. A large number of variables are available for modeling
purposes. Variables are described (including, where appropriate, values of categorical varia-
bles) in Table 19.1.
One derived variable was added to the database described in Table 19.1. This variable, called
EPM_Set, is a categorized dependent variable in which the continuous EPM Dependent variable
is categorized by “0.1” intervals. This is necessitated because Quest and C5.0 trees (tested in this
project) accept only a categorical variable for their target field.
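The categorization behind EPM_Set can be sketched as follows. This is a minimal illustration, not the insurer's actual procedure: the function name `epm_set`, the label format, and the choice of half-open intervals (lower bound included, upper bound excluded) are assumptions.

```python
import math

def epm_set(epm: float, width: float = 0.1) -> str:
    """Return the label of the interval [lo, lo + width) that contains epm.

    Assumption: intervals are half-open; the source only says that the
    continuous EPM is categorized by 0.1 intervals.
    """
    # Round before flooring so that values such as 0.3 / 0.1
    # (2.9999999999999996 in floating point) land in the intended bin.
    index = math.floor(round(epm / width, 9))
    lo = index * width
    hi = (index + 1) * width
    return f"{lo:.1f} <= EPM < {hi:.1f}"
```

For example, `epm_set(0.0999)` falls in the first positive interval, while `epm_set(0.1)` falls in the next one.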
The need for a model that could be used by underwriters in an operational setting suggested the
use of a tree form, which would generate profitable and unprofitable combinations of risk factors.
19.3 CHOICE OF DECISION TREE
Three different types of decision trees were considered for developing the model: Quest, C5.0,
and C&R Tree. The evaluation was performed using Clementine 9.0, manufactured by SPSS.
Clementine makes different models available. Decision trees were evaluated by running train-
ing and testing datasets to help us decide on the ultimate decision tree model.
Test of Three Different Decision Tree Models
Short Description – Decision Trees
• QUEST. The Quick, Unbiased, Efficient Statistical Tree method is efficient to compute
  and avoids other methods’ biases in favor of predictors with many categories. Predictor
  fields can be numeric ranges, but the target field must be categorical. All splits are binary.
• C5.0. This method splits the sample based on the field that provides the maximum information
  gain at each level to produce either a decision tree or a rule set. The target field
  must be categorical. Multiple splits into more than two subgroups are allowed.
• C&RT. The Classification and Regression Trees method is based on minimization of
  impurity measures. A node is considered “pure” if 100% of cases in the node fall into a
  specific category of the target field. Target and predictor fields can be range or categorical;
  all splits are binary (only two subgroups).
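To make the notion of node “purity” concrete, the sketch below computes the Gini index, one commonly used impurity measure for categorical targets, and the size-weighted impurity of a binary split. The chapter does not say which impurity measure was used in the study (for a continuous target such as EPM, a variance-based measure is typical), so the choice of Gini and the function names are assumptions made for illustration.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a node: 0.0 when every case has the same category."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())

def split_impurity(left, right):
    """Size-weighted impurity of a binary split into two child nodes."""
    n = len(left) + len(right)
    return (len(left) / n) * gini(left) + (len(right) / n) * gini(right)
```

A tree-growing procedure of this kind evaluates candidate binary splits and keeps the one with the lowest weighted impurity; a split that separates two categories perfectly scores 0.0.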
TABLE 19.1
Independent Variables

Independent
Variable Name             Explanation                                  Categories
SIC category¹             SIC Group (defined by the first two          Group 1: 01-47, 49, 74-77
                          digits of a case’s SIC code)                 Group 2: 48, 60-69, 73, 81, 84-89
                                                                       Group 3: 80
                                                                       Group 4: 50-59, 70-72, 78-79
                                                                       Group 5: 82-83, 90-99
Region indicator          Region                                       Northeast (1), South (2),
                                                                       Midwest (3), West (4)
M_LT_40 index             % Males < 40 (based on covered salary)       < 20% (1), ≥ 20% (2)
M_OT_40 index             % Males ≥ 40 (based on covered salary)       < 40% (1), ≥ 40% (2)
F_LT_40 index             % Females < 40 (based on covered salary)     < 20% (1), ≥ 20% (2)
F_OT_40 index             % Females ≥ 40 (based on covered salary)     < 20% (1), ≥ 20% (2)
PWC index                 Professional/White-collar % (based on        < 60% (1), ≥ 60% (2)
                          covered salary)
Avgsalary index           Average Covered Salary                       < $45k (1), ≥ $45k (2)
Ownocc index              Own Occupation Period                        < 5 years (1), > 5 years (2)
Integrated indicator      Integrated Indicator                         No (0), Yes (1)
Case size                 Case Size                                    < 100 (1), ≥ 100 (2)
Ben% Group                Benefit %                                    ≤ 60% (1), > 60% (2)
SSI indicator             Social Security Integration                  Full (1), Other (2)
BC index                  Blue-collar %                                < 10% (1), ≥ 10% (2)
Contributory ind          Contributory Indicator                       Yes (1), No (2)
Multi-location ind        Multi-location Indicator                     Yes (1), No (2)
Pop Den ind               Population Density                           ≥ 1 million (1), < 1 million (2)
EP index                  Elimination Period                           ≤ 90 days (1), > 90 days (2)
Ben Per                   Benefit Period                               < to age 65 (1), ≥ to age 65 (2)
MN limit                  Mental/Nervous Limit                         Yes (1), No (2)
DA limit                  Drug/Alcohol Limitation                      Yes (1), No (2)
Definition of Disability  Definition of Disability                     Partial and Residual (1), Other (2)
COLA indicator            Cost of Living Adjustment Indicator          Yes (1), No (2)
Agency polcnt index       Agency Policy Count                          < 10 (1), ≥ 10 (2)
Maximum Benefit           Maximum Benefit                              < $10,000 (1), ≥ $10,000 (2)

¹ The Standard Industrial Classification (abbreviated SIC) is a United States government system for classifying
industries by a four-digit code. Established in 1937, it is being replaced by the six-digit North American Industry
Classification System, which was released in 1997. This insurer continues to use the SIC codes.
The dataset contains 10,438 records covering 1999 to 2003. This
dataset is divided into three subsets: training (4,233 records—about 40%), testing (3,129 rec-
ords—about 30%), and evaluation (3,089—about 30%). The training dataset is used to develop
a model, and the testing dataset is used to check the developed models. If a test result is not
satisfactory, the model is adjusted. The models that pass the testing phase are evaluated using
the evaluation datasets.
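A three-way split of this kind can be sketched as below. The chapter does not describe how records were assigned to the subsets, so the random shuffle, the fixed seed, and the function name `split_dataset` are assumptions; the sketch therefore does not reproduce the exact subset counts quoted above.

```python
import random

def split_dataset(records, fractions=(0.4, 0.3, 0.3), seed=0):
    """Shuffle records and split them into training, testing and evaluation sets.

    Assumption: simple random assignment with approximate 40/30/30 shares.
    """
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    n = len(shuffled)
    n_train = round(n * fractions[0])
    n_test = round(n * fractions[1])
    train = shuffled[:n_train]
    test = shuffled[n_train:n_train + n_test]
    # Whatever remains after the first two subsets becomes the evaluation set.
    evaluation = shuffled[n_train + n_test:]
    return train, test, evaluation

train, test, evaluation = split_dataset(range(10438))
```

Keeping the evaluation subset untouched until the final step is what allows it to serve as an independent check on models that were adjusted against the testing subset.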
19.3.1 TREE MODEL RESULTS
Four models were tested with the three different types of decision trees. The four models are
based on different choices of independent variables:
1. Model 1 included all independent variables as input variables.

2. Model 2 used only a subset of nine variables from an initial model.
3. Model 3 is based on company (demographic) variables only (i.e. excluding variables that per-
tain to benefit design). Company information variables are variables that are inherent to an
employer group because of its location or business type. SIC category, Region indicator,
M_LT_40_index, M_OT_40 index, and F_OT_40 index are company information variables.
4. Inputs used with Model 4 are variables such as Avgsalary index, Ownocc index, Integrated
indicator, Ben% Group, SSI indicator, Contributory ind, Ep index, Ben per, MN limit, DA
limit, Definition of Disability, COLA ind, and Maximum benefit, excluding variables per-
taining to the company profile.
Company and Benefit variables were tested separately to determine whether each of the two
groups of variables influences the model’s performance.
Each model was run at least 30 times in each type of decision tree resulting in approximately
100 runs for each of three types of tree algorithms. The testing indicated that Model 1, which
included all independent variables as input, outperformed the other three models. The C&R
Tree performs better than either of the two other types of decision trees. Thus, the final model
choice was Model 1 (all variables) within the C&R tree. Note that C & R Tree accepts any
variable types for its target field, and therefore the original dependent variable (continuous)
could be used.
The following are the results of training and testing of the three decision trees with all inde-
pendent variables as inputs.
TABLE 19.2
Correct/Incorrect Assignment of Groups by Three Models

            40% Training          30% Test
Model       Correct    Wrong      Correct    Wrong
Quest       1,667      2,566      1,191      1,938
C5.0        2,567      1,666      1,190      1,939
C&R T       1,885      2,348      1,395      1,734
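The counts in Table 19.2 convert directly into the percentage-correct rates quoted in Section 19.3.2. The short calculation below reproduces that arithmetic; the dictionary layout and function name are illustrative only.

```python
# (correct, wrong) counts taken from Table 19.2
counts = {
    "Quest": {"train": (1667, 2566), "test": (1191, 1938)},
    "C5.0": {"train": (2567, 1666), "test": (1190, 1939)},
    "C&R T": {"train": (1885, 2348), "test": (1395, 1734)},
}

def pct_correct(correct, wrong):
    """Share of groups assigned correctly, as a percentage to one decimal."""
    return round(100.0 * correct / (correct + wrong), 1)

rates = {
    model: {subset: pct_correct(*pair) for subset, pair in subsets.items()}
    for model, subsets in counts.items()
}
```

The large gap between the training and testing rates for C5.0 is the over-fitting pattern discussed in the text, while the C&R Tree rates are almost identical across the two subsets.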
TABLE 19.3
Evaluation Statistics for C&R T Model

                       40% Training    30% Test
Minimum Error          -125.83         -302.21
Maximum Error          1.88            1.88
Mean Error             0.00            -0.046
Mean Absolute Error    0.861           0.916
Standard Deviation     3.362           5.73
Linear Correlation     0.123           -0.005
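Statistics of the kind shown in Table 19.3 can be computed from paired actual and predicted values as sketched below. The sign convention for the error (actual minus predicted), the use of the population standard deviation, and the function names are assumptions, since the chapter does not define them.

```python
from math import sqrt
from statistics import mean, pstdev

def pearson(xs, ys):
    """Pearson linear correlation between two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

def error_summary(actual, predicted):
    """Summary statistics of prediction errors (assumed: actual - predicted)."""
    errors = [a - p for a, p in zip(actual, predicted)]
    return {
        "min_error": min(errors),
        "max_error": max(errors),
        "mean_error": mean(errors),
        "mean_abs_error": mean(abs(e) for e in errors),
        "std_dev": pstdev(errors),
        "linear_correlation": pearson(actual, predicted),
    }

# Toy values, not data from the study
summary = error_summary([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
```

A mean error near zero combined with a large minimum or maximum error, as in Table 19.3, indicates that a small number of groups are predicted very poorly even though the errors roughly cancel on average.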
19.3.2 ANALYSIS AND CHOICE OF TREE MODEL (C&R TREE)
The C&R Tree was chosen for two reasons:
• The C&R Tree showed the best performance in preliminary testing. The Quest model
  shows a “correct” rate of 39.4% in training and 38.1% in testing. C5.0 has the highest
  correct rate in training (60.6%) but only 38.0% in testing, because of over-fitting to
  the training data. The C&R Tree model predicts a continuous dependent variable (Excess
  Profit), while for the C5.0 and Quest models “correct” means that the dependent variable
  falls in the predicted discrete category. For example, if the EPM variable is 0.0999,
  then the EPM_Set variable falls in the range “0.0 ≤ EPM < 0.1.” If the model predicts
  “0.1 ≤ EPM < 0.2” for its dependent variable (EPM_Set) when the actual EPM is 0.0999,
  this observation will be classified as incorrect, because 0.0999 belongs to the
  category “0.0 ≤ EPM < 0.1,” even though the actual value is only 0.0001 below the
  boundary of the predicted category. The C&R Tree shows 44.5% in training and 44.6% in
  testing. The C&R Tree is therefore preferable to, and more stable than, the other two
  models.
• The C&R Tree allows the use of a continuous dependent variable for its target field.
19.4 CHOICE OF VARIABLES AND DEVELOPMENT OF FINAL MODEL
Although the C&R Tree was chosen for final model development, the percentage of correct
assignments using the C&R Tree was less than 50% (44.6%). To overcome this accuracy prob-
lem and to increase the model performance, the first and second predictive variables are first
selected and then “fixed.” The data are then divided into four subgroups according to the cho-
sen two variables as the first and second nodes in each decision tree. The decision process is
then rerun for each subgroup. This process results in different models for each subgroup, which
may have some additional operational implications for implementation but can improve model
accuracy. The C&R Tree is then rerun within each subgroup to develop each of the four models.
After this process, subgroups were appended again to develop an overall decision tree model.
Figure 19.1 illustrates the output of the tree classification process for one subgroup (Male >
40). Table 19.1 also provides an interpretation of variable values.
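The subgroup procedure described above can be sketched as follows. The field names (`region`, `agency`, `epm`), the toy records, and the use of a simple subgroup mean in place of a fitted C&R Tree are all assumptions made to keep the illustration self-contained; the chapter does not state which two variables were fixed.

```python
from collections import defaultdict
from statistics import mean

def partition_by_two(records, first_var, second_var):
    """Group records by the values of the two fixed splitting variables."""
    groups = defaultdict(list)
    for rec in records:
        groups[(rec[first_var], rec[second_var])].append(rec)
    return dict(groups)

def fit_subgroup_models(records, first_var, second_var, fit):
    """Fit one model per subgroup; `fit` maps a list of records to a model."""
    parts = partition_by_two(records, first_var, second_var)
    return {key: fit(rows) for key, rows in parts.items()}

def mean_epm(rows):
    """Stand-in for a per-subgroup tree: predicts the subgroup's mean EPM."""
    return mean(r["epm"] for r in rows)

# Hypothetical records with two binary splitting variables
records = [
    {"region": 1, "agency": 1, "epm": 0.2},
    {"region": 1, "agency": 2, "epm": -0.1},
    {"region": 2, "agency": 1, "epm": 0.0},
    {"region": 2, "agency": 2, "epm": 0.3},
    {"region": 1, "agency": 1, "epm": 0.4},
]
models = fit_subgroup_models(records, "region", "agency", mean_epm)
```

With two binary variables this yields four subgroup models, which can then be combined into a single lookup keyed by the two fixed variables.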
SAMPLE TREE DIAGRAM (LIMITED TO NODES 1-12 OF FINAL MODEL)
[Tree diagram not reproduced. Target: $R-EPM 99-03. The root node (Node 0, n = 4138, 100.00%,
predicted 0.00) splits on Region Indicator (1.00 vs. 2.00, 3.00, 4.00). The Region 1 branch
(Node 1, n = 1393, 33.66%, predicted 0.13) splits on Agency Polcnt Index into Node 2 (n = 990,
23.92%, predicted 0.20) and Node 11 (n = 403, 9.74%, predicted -0.03). The other branch (Node 16)
splits on F_OT_40 Index into Node 17 (n = 2094, 50.60%, predicted 0.00) and Node 30 (n = 651,
15.73%, predicted -0.26). Lower-level splits use Sic Category, Ownocc Index, PWC Index,
Multi Location Ind, Ben% Group, Pop Den Ind, Avg Salary Index, and Region Indicator.]
FIGURE 19.1
19.5 MODEL EVALUATION
As the partial model in Figure 19.1 shows, the C&R Tree process identifies “nodes,” or groups
of independent variables that segment the entire database into groups with common characteris-
tics that predict the subgroup’s profitability. The model may be evaluated in at least two ways:
1. Statistically in terms of “fit.” We will examine the statistical performance of the model first
in this section; or
2. In terms of whether the model serves a business purpose. In this case, a comparison model
exists—the rate manual—and the alternative, tree-based model may be evaluated in terms of
its ability to discriminate between potentially profitable and potentially unprofitable groups.
We will examine the business performance of the model in the latter part of this section.
19.5.1 STATISTICAL EVALUATION
The model was evaluated by assessing the correlation between predicted and actual profitabil-
ity, using a number of different datasets.
Using the entire dataset (100% of the data)
N = 10,453
Minimum Error -0.9600
Maximum Error 301.9000
Mean Error 0.0286
Standard Deviation 4.8747
Correlation 0.0200
Significant at the 95% level
As an “overall” validation, the fact that the linear correlation statistic is significant at the 95%
level is a good sign. It indicates a positive, though weak, correlation between the predicted
EPM and the actual EPM.
Using the training set (50% of all data)
N = 5,178
Minimum Error -0.9600
Maximum Error 301.9000
Mean Error 0.0221
Standard Deviation 4.7685
Correlation 0.0240
Significant at the 95% level
As with any of the validation statistics, there is some variation in the data not explained by the
model. In general, however, the model is capable of distinguishing higher EPMs from lower EPMs.
Again, the relationship explained by the model is significant at the 95% level; that is, a
correlation of this size would be unlikely to arise by chance if predicted and actual EPM were
unrelated.
The statistically significant correlation statistics seen when the sample size (N) is larger are a
positive sign. They suggest that the model can be used to predict EPM and to assess which variables
help determine which employer groups are more likely to have a higher EPM.
There still is significant variation around the predicted EPM numbers on a case-by-case basis,
but taken across the entire population, the model will have a positive impact on identifying
candidates who are likely to have higher EPM numbers.
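With samples this large, even a small correlation can be statistically significant. The chapter does not say which significance test was applied; the sketch below assumes the standard t statistic for a Pearson correlation, t = r·sqrt(n − 2)/sqrt(1 − r²), and applies it to the correlations and sample sizes reported above.

```python
from math import sqrt

def correlation_t_statistic(r, n):
    """t statistic for testing a Pearson correlation r from n pairs against zero.

    Assumption: the usual test with n - 2 degrees of freedom.
    """
    return r * sqrt(n - 2) / sqrt(1.0 - r * r)

# Values reported in this section
t_full = correlation_t_statistic(0.0200, 10453)   # entire dataset
t_train = correlation_t_statistic(0.0240, 5178)   # training set
```

Under this assumed test, the statistic is approximately 2.05 for the entire dataset and approximately 1.73 for the training set. For samples of this size the 5% critical values are close to the normal values of 1.96 (two-sided) and 1.645 (one-sided), so the training-set result is significant at the 95% level only on a one-sided basis.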
19.5.2 BUSINESS EVALUATION
The dependent variable for this model is the level of excess profitability. Table 19.4 summa-
rizes the excess profitability of the book of business by model node. The baseline excess prof-
itability of the entire book is zero. The underlying book, of course, may generate profits at the
level anticipated in pricing. The dependent variable is profit in excess of the pricing level. As
the table shows, certain nodes are relatively profitable and others are relatively unprofitable.
Overall, the relatively profitable nodes are large, so that identification of these nodes holds the
promise of increased profitability of the book.
The model assigns groups prospectively to different nodes (“predicted number in node”) de-
pending on the values of the group’s variables. We can then identify the actual classification
of groups based on each group’s outcomes. Similarly, we can predict the profit of groups in
each node, based on the model, and compare predicted and actual profit for each node. Ideally,
a model would correctly assign groups to each node so that expected and actual numbers are
the same. Because it is highly unlikely that a model will do this, we examine, instead, ways
that management may use the model results to improve its underwriting process.
TABLE 19.4
Predicted and Actual Profitability of Groups, by Model Node

                Predicted        Predicted        Actual           Actual
Node            Number in Node   Average Profit   Number in Node   Average Profit
1               17               (3.03)           17               (0.60)
2               212              0.19             243              0.07
3               513              (0.20)           609              (0.06)
4               225              0.09             258              0.10
5               168              (0.40)           2                0.02
6               86               (0.27)           76               0.16
7               160              0.11             181              0.04
8               47               0.53             47               (0.01)
9               284              (0.13)           291              0.03
10              336              0.27             374              0.04
11              385              0.38             392              (0.07)
12              79               0.08             83               0.08
13              3,022            0.06             2,952            0.02
14              592              0.27             641              0.21
15              133              (1.07)           132              (0.03)
16              2,484            0.07             2,495            (0.08)
17              345              (0.33)           325              (0.10)
18              1,100            0.11             1,110            0.08
19              249              (0.13)           210              (0.11)
Total           10,438           0.04             10,438           0.00
Total Profit                     411.1                             41.22
Table 19.5 shows that the predictive model, while it does not accurately predict the level of
profit within individual nodes, predicts both the number and direction of the profitability by
node with reasonable accuracy. Thirteen of the nineteen nodes are predicted accurately in terms
of direction, accounting for 7,135 or 68% of all groups. Overall profitability of these groups
(even with those that are predicted to be profitable but turn out to be unprofitable in actuality)
is 0.035. In total, profits amount to $247.8 million, or about 6 times the level of the book
using the rate manual only.
TABLE 19.5
Directional Predictive Accuracy

                         Predicted   Predicted   Actual      Actual
                         Number      Average     Number      Average     Directionally
Node                     in Node     Profit      in Node     Profit      Correct
1                        17          (3.03)      17          (0.60)      ✓
2                        212         0.19        243         0.07        ✓
3                        513         (0.20)      609         (0.06)      ✓
4                        225         0.09        258         0.10        ✓
5                        168         (0.40)      2           0.02        x
6                        86          (0.27)      76          0.16        x
7                        160         0.11        181         0.04        ✓
8                        47          0.53        47          (0.01)      x
9                        284         (0.13)      291         0.03        x
10                       336         0.27        374         0.04        ✓
11                       385         0.38        392         (0.07)      x
12                       79          0.08        83          0.08        ✓
13                       3,022       0.06        2,952       0.02        ✓
14                       592         0.27        641         0.21        ✓
15                       133         (1.07)      132         (0.03)      ✓
16                       2,484       0.07        2,495       (0.08)      x
17                       345         (0.33)      325         (0.10)      ✓
18                       1,100       0.11        1,110       0.08        ✓
19                       249         (0.13)      210         (0.11)      ✓
Total                    10,438      0.039       10,438      0.00
Total Profit                         411.1                   41.22       247.8
Directionally correct                            7,135                   13
Directionally incorrect                          3,303                   6
Profit Margin                        0.039                               0.035
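The directional-accuracy figures in Table 19.5 can be reproduced from the node-level values in Table 19.4. In the sketch below a node counts as directionally correct when predicted and actual average profit have the same sign; the group counts use the actual number in each node, which is the basis that matches the 7,135 total in the table. Variable names are illustrative.

```python
# node: (predicted average profit, actual number in node, actual average profit)
nodes = {
    1: (-3.03, 17, -0.60), 2: (0.19, 243, 0.07), 3: (-0.20, 609, -0.06),
    4: (0.09, 258, 0.10), 5: (-0.40, 2, 0.02), 6: (-0.27, 76, 0.16),
    7: (0.11, 181, 0.04), 8: (0.53, 47, -0.01), 9: (-0.13, 291, 0.03),
    10: (0.27, 374, 0.04), 11: (0.38, 392, -0.07), 12: (0.08, 83, 0.08),
    13: (0.06, 2952, 0.02), 14: (0.27, 641, 0.21), 15: (-1.07, 132, -0.03),
    16: (0.07, 2495, -0.08), 17: (-0.33, 325, -0.10), 18: (0.11, 1110, 0.08),
    19: (-0.13, 210, -0.11),
}

def same_direction(predicted, actual):
    """True when predicted and actual profit are both positive or both negative."""
    return (predicted > 0) == (actual > 0)

correct_nodes = [k for k, (p, _, a) in nodes.items() if same_direction(p, a)]
groups_correct = sum(nodes[k][1] for k in correct_nodes)
total_groups = sum(n for _, n, _ in nodes.values())
share_correct = groups_correct / total_groups
```

This gives 13 directionally correct nodes covering 7,135 of the 10,438 groups, or about 68% of the book.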
Table 19.6 identifies only those nodes (11) that are predicted to be profitable. These nodes
account for considerable profitability, both at the group level and in total.
TABLE 19.6
Predicted and Actual Profitability by Node

                         Predicted   Predicted   Actual      Actual                     Predicted
                         Number      Average     Number      Average     Directionally  to be
Node                     in Node     Profit      in Node     Profit      Correct        Profitable
1                        17          (3.03)      17          (0.60)      ✓
2                        212         0.19        243         0.07        ✓              ✓
3                        513         (0.20)      609         (0.06)      ✓
4                        225         0.09        258         0.10        ✓              ✓
5                        168         (0.40)      2           0.02        x
6                        86          (0.27)      76          0.16        x
7                        160         0.11        181         0.04        ✓              ✓
8                        47          0.53        47          (0.01)      x              ✓
9                        284         (0.13)      291         0.03        x
10                       336         0.27        374         0.04        ✓              ✓
11                       385         0.38        392         (0.07)      x              ✓
12                       79          0.08        83          0.08        ✓              ✓
13                       3,022       0.06        2,952       0.02        ✓              ✓
14                       592         0.27        641         0.21        ✓              ✓
15                       133         (1.07)      132         (0.03)      ✓
16                       2,484       0.07        2,495       (0.08)      x              ✓
17                       345         (0.33)      325         (0.10)      ✓
18                       1,100       0.11        1,110       0.08        ✓              ✓
19                       249         (0.13)      210         (0.11)      ✓
Total                    10,438      0.039       10,438      0.00
Total Profit                         411.1                   41.22                      126.6
Directionally correct                            7,135                   13             11
Directionally incorrect                          3,303                   6
Profit Margin                        0.039                               0.034          0.014
19.6 USING THE MODEL FOR UNDERWRITING
A question that arises is why a predictive model is required at all in underwriting. The model
results suggest that there are some “nodes” or groupings of underwriting risk factors that are
more profitable than others. The obvious implication of this result is that the rate manual that
was the basis for the study should be updated. Updating a rate manual, however, is a large
undertaking. Using a predictive model (that can be easily and frequently updated) in combina-
tion with the Rate Manual is therefore a practical and cost-effective solution.
In Table 19.6, we demonstrated that profitability of individual subgroups, as identified by the
nodes of the C&R T model, is different. Since the underlying rating structure of the rate manual
is constructed to generate approximately equal profitability across the entire book, Tables 19.4,
19.5 and 19.6 demonstrate that the model provides information that is additional to that in the
rating manual. Depending on how this information is used, profitability of the book may be
increased, compared with the level of the rating manual.
Using the model for underwriting requires two things:
1. A means of practically applying the risk factors that result in the model “node” classifications.
Node 1, for example, an unprofitable node, consists of groups that meet the following
criteria:
Multi-location Indicator: 1 = Multi-locations
Own Occupation Indicator: 1 = Less than 5 years
COLA Indicator: 1 = Yes
Accounts that meet these criteria could either be avoided or an additional rating factor
could be introduced, as discussed below.
Conversely, Node 11 is a profitable node. Groups in Node 11 meet the following criteria:
Multi-location Indicator: 1 = Multi-locations
Own Occupation Indicator: 2 = Greater than 5 years
PWC (Prof White-collar % based on covered salary): 1 = Less than 60%
Benefit Percentage: 1 = Less than 60%
Accounts that meet these criteria are potentially more profitable and therefore more attractive
and could be pursued.
2. A reaction mechanism. The model is reasonably effective at discriminating between potentially
unprofitable and profitable accounts. In Table 19.7, we compare different underwriting
strategies.
Scenario 1 is the baseline (rating manual) scenario.
Scenario 2 is a straightforward underwriting strategy in which the predicted unprofitable
groups are excluded and only predicted profitable groups are written. Under this strategy,
profitability in total is about 3 times the base case, while the number of cases written falls
by 16%.
Scenario 3 combines an underwriting and pricing strategy. Groups that are predicted to be
unprofitable are rated up by 10%. Assuming that the increased rates do not result in loss of
business, the overall profitability of the block will be higher than Scenario 2, at about 5
times that of the base case.
Scenario 4 requires that those groups that are identified as potentially profitable and
confirmed by actual experience (testing) of the models be written. This results in a reduction
of about 44% in the number of groups written but results in significantly higher profits.
In Scenario 5, cases are accepted if the model indicates profitability (and this is confirmed
by the actual model test) while unprofitable groups are rated up 10%. In this scenario, some
groups reject the rating increase or seek another carrier, so the number of groups written
falls by about one-third while the profitability of the book is significantly increased. Over-
all, Scenario 5 increases total profitability, although profitability per account is lower, be-
cause it identifies a larger number of potentially profitable accounts than Scenario 4.
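The node criteria listed under item 1 can be applied operationally as simple rules. The sketch below encodes only the two nodes described there (Node 1 and Node 11), using the category codes of Table 19.1; the dictionary keys and the function name are hypothetical and do not come from the insurer's systems.

```python
def classify_node(group):
    """Return the model node (1 or 11) that a group falls into, if either.

    `group` maps hypothetical field names to the category codes of Table 19.1.
    Only the two nodes described in the text are encoded; other groups
    return None.
    """
    if group["multi_location"] == 1 and group["own_occ"] == 1 and group["cola"] == 1:
        return 1   # described in the text as an unprofitable node
    if (group["multi_location"] == 1 and group["own_occ"] == 2
            and group["pwc"] == 1 and group["ben_pct"] == 1):
        return 11  # described in the text as a profitable node
    return None

# Hypothetical applicant group
applicant = {"multi_location": 1, "own_occ": 2, "cola": 2, "pwc": 1, "ben_pct": 1}
node = classify_node(applicant)
```

An underwriter could use the returned node to decide whether to decline, rate up, or pursue the account under the scenarios compared in Table 19.7.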
TABLE 19.7
Book Profitability Under Different Underwriting Strategies

                                              Average       Total
          Underwriting                        Profit        Profit        Cases      Percentage
Scenario  Decision                            per case      (millions)    written    of book
1         Accept all cases as rated by        $3,947        $41.2         10,438     100%
          rate manual.
2         Accept all cases predicted to be    $14,426       $126.6        8,776      84%
          profitable; reject all predicted
          unprofitable cases.
3         Accept all cases predicted to be    $19,870       $207.4        10,438     100%
          profitable; rate up all predicted
          unprofitable cases by +10%.
4         Accept all positive profit cases    $60,613       $354.1        5,842      56%
          for which model is directionally
          correct; reject remaining cases.
5         Accept all positive profit cases    $55,374       $377.1        6,810      65%
          for which model is directionally
          correct; rate directionally
          correct unprofitable cases +10%;
          reject remaining cases.
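The derived columns of Table 19.7 follow from the total profit and the number of cases written. The short calculation below reproduces the average profit per case, the percentage of the book, and the profit multiple relative to the baseline; the names used are illustrative.

```python
# scenario: (total profit in $ millions, cases written), from Table 19.7
scenarios = {
    1: (41.2, 10438),
    2: (126.6, 8776),
    3: (207.4, 10438),
    4: (354.1, 5842),
    5: (377.1, 6810),
}
BOOK_SIZE = 10438  # cases in the baseline book

def avg_profit_per_case(total_millions, cases):
    """Average profit per case in whole dollars."""
    return round(total_millions * 1_000_000 / cases)

def pct_of_book(cases):
    """Cases written as a whole-number percentage of the baseline book."""
    return round(100 * cases / BOOK_SIZE)

summary = {
    s: {
        "avg_profit": avg_profit_per_case(total, cases),
        "pct_of_book": pct_of_book(cases),
        "multiple_of_base": round(total / scenarios[1][0], 1),
    }
    for s, (total, cases) in scenarios.items()
}
```

Scenario 4 has the highest profit per case, while Scenario 5 has the highest total profit because it writes more cases.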
While no model will ever replace an underwriter’s judgment (particularly our analysis of the
model and possible underwriter reactions), this analysis shows that a predictive model, used in
combination with underwriter rules, has a capacity to increase profitability of a book of busi-
ness.
