
Insurance Analytics

Prof. Julien Trufin

Academic year 2020-2021

Performance evaluation

Introduction

Context

• Actuaries resort to advanced statistical tools to accurately assess the risk
profile of the policyholders.
• If the data are subdivided into groups determined by many features, actuaries
are often faced with sparsely populated risk classes so that simple averages
become suspect and regression models are needed.
• Regression models predict a response variable from a function of features.
• Actuarial pricing models are generally calibrated so that a measure of the
goodness-of-fit is optimized (deviance or log-likelihood, in most cases).
- These criteria include in-sample errors and out-of-sample errors (predictive
performance criteria).


• In this chapter, we aim to evaluate the performance of a candidate premium
based on the following two aspects :
- the variability of the resulting premium amounts, as larger premium
differentials induce more lift.
- the ability of the premium income to match the true one for increasing risk
profiles.
• The first objective can be formalized with the help of the convex order that
can be characterized by means of the Lorenz curves.
• The second objective can be assessed by means of concentration curves.

True and working pure premiums

Regression function

• Consider a response Y and a set of features X1 , . . . , Xp gathered in the
vector X .
• The dependence structure inside the random vector (Y , X1 , . . . , Xp ) is
exploited to extract the information contained in X about Y .
• In actuarial pricing, the target is µ(X ) = E[Y |X ].
• µ(X ) is generally unknown and approximated by a (working, or actual)
premium π(X ).
• The merits of a given pricing tool can be assessed using the pair
(µ(X), π(X)).
- The premium π(X ) has to be as close as possible to the true premium µ(X ).


Technical assumptions

• All predictors π(X ) under consideration and µ(X ) are continuous random
variables.
• A predictor π(X ) is supposed to be correct on average, that is,

E[π(X )] = E[µ(X )] = E[Y ].


Notation

• Fπ (t) the distribution function of π(X ), i.e.

Fπ (t) = P[π(X ) ≤ t], t ≥ 0,

• fπ the probability density function of π(X ), i.e.

Fπ(t) = ∫₀ᵗ fπ(s) ds, t ≥ 0.

• Fπ⁻¹ the associated quantile function (or Value-at-Risk), defined as the
generalized inverse of Fπ , i.e.

Fπ⁻¹(α) = inf{t | Fπ(t) ≥ α} for a probability level α.

• Our continuity assumption ensures that the identity Fπ(Fπ⁻¹(α)) = α holds
true for all probability levels α.
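As an aside, the generalized inverse can be sketched for an empirical distribution in a few lines of Python (the function name and the sample are ours, purely illustrative):

```python
import numpy as np

def quantile_fn(sample, alpha):
    """Generalized inverse of the empirical cdf: inf{t | F(t) >= alpha}."""
    s = np.sort(np.asarray(sample, dtype=float))
    # F(s[k]) = (k + 1) / n, so the smallest t with F(t) >= alpha is s[ceil(n * alpha) - 1].
    k = int(np.ceil(alpha * s.size)) - 1
    return s[max(k, 0)]

print(quantile_fn([3.0, 1.0, 2.0, 4.0], 0.5))   # -> 2.0
```

Here F(2) = 0.5 ≥ 0.5 while F(1) = 0.25 < 0.5, so the infimum is attained at 2.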


Convex order

• The more π(X ) is dispersed, the more information it contains about the true
premium.
- The constant predictor π(X ) = E[Y ], the least dispersed one, does not bring
any information about the relative riskiness of the different policies.
• Definition :
Consider two non-negative random variables Z1 and Z2 . Then, Z1 is said to
be smaller than Z2 in the convex order, henceforth denoted as Z1 ≼cx Z2 , if

E[g(Z1 )] ≤ E[g(Z2 )]

for all the convex functions g for which the expectations exist.
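A quick Monte Carlo illustration (a simulation sketch, not part of the course material): two Gamma variables with common mean 1 but different variances satisfy the defining inequality for typical convex test functions g.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000

# Same mean k * s = 1, different variances k * s^2: 0.5 versus 2.0.
z1 = rng.gamma(shape=2.0, scale=0.5, size=n)
z2 = rng.gamma(shape=0.5, scale=2.0, size=n)

# Z1 ≼cx Z2, so E[g(Z1)] <= E[g(Z2)] for convex g (checked here by simulation).
for g in (np.square, lambda z: np.maximum(z - 1.0, 0.0), lambda z: z ** 4):
    assert g(z1).mean() <= g(z2).mean()

# In particular, the variances are ordered.
assert z1.var() < z2.var()
```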


• We have
Z1 ≼cx Z2 ⇒ V[Z1 ] ≤ V[Z2 ].
⇒ ≼cx is a variability order : it only applies to random variables with the
same expected value and compares the dispersion of these variables.
• We can interpret Z1 ≼cx Z2 as “Z2 is more variable than Z1 ”.
- The variability in question extends beyond a simple comparison of standard
deviations.

Performance curves

Concentration curve

• Definition :
The concentration curve of the true premium µ(X ) with respect to the
working premium π(X ) is defined as

α ↦ CC[µ(X), π(X); α] = E[µ(X) I[π(X) ≤ Fπ⁻¹(α)]] / E[µ(X)].

• Interpretation :
CC[µ(X), π(X); α] represents the proportion of the total true premium
income corresponding to the sub-portfolio π(X) ≤ Fπ⁻¹(α), i.e. to the
100α% of contracts with the smallest premiums π.


• Idea :
Policies with low-risk profiles are at risk of leaving the portfolio, being
attracted by a competitor.
- It is therefore important not to over-charge this group of policyholders.
- Hence the importance of the concentration curve to assess the appropriateness
of the premium π.


From premiums to ranks

• Notice that
π(X) ≤ Fπ⁻¹(α) ⇔ Fπ(π(X)) ≤ α.

- It is enough to consider the ranking induced by the predictor.
- We are free to replace every predictor π(X) with the corresponding rank

Π = Fπ(π(X)) ∼ Uni(0, 1).

• Π is the rank of a policyholder, once all contracts have been ordered
according to their corresponding premiums.
• A concentration curve alone is not enough to assess the performance of π :
it only depends on the rank induced by π, not on the actual values of π.
⇒ We also rely on the Lorenz curve.
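In practice, the ranks Π can be computed directly from the premiums; a minimal numpy sketch (the helper name is ours):

```python
import numpy as np

def uniform_ranks(pi_hat):
    """Pi_i = F_pi(pi(X_i)): empirical ranks of the premiums, scaled to (0, 1]."""
    pi_hat = np.asarray(pi_hat, dtype=float)
    ranks = np.argsort(np.argsort(pi_hat))   # 0-based position of each premium
    return (ranks + 1) / pi_hat.size

print(uniform_ranks([120.0, 80.0, 150.0, 95.0]))   # ranks 0.75, 0.25, 1.0, 0.5
```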


Lorenz curve

• Definition :
The Lorenz curve LC associated with the predictor π(X ) is defined as

α ↦ LC[π(X); α] = CC[π(X), π(X); α]
                = E[π(X) I[π(X) ≤ Fπ⁻¹(α)]] / E[π(X)].

• Interpretation :
A Lorenz curve is thus strictly related to dispersion (or variability) by
definition.
- It is known that increasing the predictor π(X ) in the convex order moves its
Lorenz curve lower.


Concentration curve and Lorenz curve

• If π(X ) = µ(X ) then

LC[π(X ); α] = CC[µ(X ), π(X ); α]

for all probability levels α.


- The sub-portfolio corresponding to π(X) ≤ Fπ⁻¹(α) is in equilibrium.
• In practice, π(X) only approximates µ(X) : π(X) ≠ µ(X).
⇒ We resort to the pair of curves CC[µ(X ), π(X ); α] and LC[π(X ); α] to
evaluate performance of a pricing model.
• A large difference between the two performance curves thus suggests that
π(X ) poorly approximates µ(X ).


Estimating the concentration curve


 
• µ(X) is not observed in reality ⇒ how can we estimate CC[µ(X), π(X); α] ?
• Actually, from the law of total expectation :

CC[µ(X), π(X); α] = CC[Y, π(X); α] = E[Y I[π(X) ≤ Fπ⁻¹(α)]] / E[Y].

⇒ A concentration curve can also be interpreted as the proportion of the
total losses Y attributable to the sub-portfolio gathering a given proportion
α of policies with the lowest predictions.


• We can equivalently replace the pure premium µ(X ) with the response Y in
the concentration curve.
• Assuming the samples (Yi , X i ), i = 1, . . . , n, to be iid, the concentration
curve can be estimated as follows :

ĈC[µ(X), π(X); α] = ĈC[Y, π(X); α]
                  = (1 / (n Ȳ)) Σ_{i : π̂(X i) ≤ F̂π⁻¹(α)} Yi
                  = ( Σ_{i : π̂(X i) ≤ F̂π⁻¹(α)} Yi ) / ( Σ_{i=1}^n Yi ).

• ĈC expresses the total sub-portfolio loss in relative terms, as a percentage of
the aggregate loss at the entire portfolio level.
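The estimator above translates almost literally into code; a sketch (we use np.quantile as the empirical F̂π⁻¹, which differs from the generalized inverse only by its interpolation convention):

```python
import numpy as np

def concentration_curve(y, pi_hat, alpha):
    """Share of the total losses carried by the 100*alpha% smallest premiums."""
    y, pi_hat = np.asarray(y, dtype=float), np.asarray(pi_hat, dtype=float)
    cutoff = np.quantile(pi_hat, alpha)          # empirical F_pi^{-1}(alpha)
    return y[pi_hat <= cutoff].sum() / y.sum()

# Half of the portfolio (the two smallest premiums) carries 1 of 4 claims:
print(concentration_curve([0, 1, 0, 3], [1.0, 2.0, 3.0, 4.0], 0.5))   # -> 0.25
```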


Estimating the Lorenz curve

• The empirical version of the Lorenz curve is obtained as

L̂C[π(X); α] = ( Σ_{i : π̂(X i) ≤ F̂π⁻¹(α)} π̂(X i) ) / ( Σ_{i=1}^n π̂(X i) ).

• L̂C expresses the percentage of the total premium income corresponding to
the 100α% smallest premiums when the latter are computed using a predictor
π.
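Since LC[π(X); α] = CC[π(X), π(X); α], the Lorenz estimator is the same computation applied to the premiums themselves; a self-contained sketch:

```python
import numpy as np

def lorenz_curve(pi_hat, alpha):
    """Share of the total premium income carried by the 100*alpha% smallest premiums."""
    pi_hat = np.asarray(pi_hat, dtype=float)
    small = pi_hat <= np.quantile(pi_hat, alpha)   # empirical F_pi^{-1}(alpha)
    return pi_hat[small].sum() / pi_hat.sum()

# With premiums 1, 2, 3, 4, the lower half collects (1 + 2) / 10 = 30% of the income:
print(lorenz_curve([1.0, 2.0, 3.0, 4.0], 0.5))   # -> 0.3
```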


Properties

• The Lorenz curve inherits the properties of the concentration curve as a
special case.
• Monotonicity :
The concentration curve is based on the function
t ↦ E[Y I[π(X) ≤ t]] / E[Y] evaluated at quantiles of π(X). This function is
clearly non-decreasing, starting from (0, 0) to reach (1, 1).
⇒ α ↦ CC[Y, π(X); α] is non-decreasing and satisfies

lim_{α→0} CC[Y, π(X); α] = 0 and lim_{α→1} CC[Y, π(X); α] = 1.


• Line of independence/equality :
- If Y and π(X ) are independent then

CC[µ(X), π(X); α] = E[Y] P[π(X) ≤ Fπ⁻¹(α)] / E[Y] = α.

- If π(X ) brings a lot of information about the true premium µ(X ), then the
concentration curve should be far from the line of independence.


• Line of independence/equality :
- Proposition :
If µ(X ) is positively expectation dependent on π(X ), that is, if the inequality
 
E[µ(X)] ≥ E[µ(X) | π(X) ≤ t]

holds for all t, then

CC[µ(X ), π(X ); α] ≤ α for all probability levels α.

Proof :
It suffices to write
   
E[µ(X) I[π(X) ≤ t]] / E[Y] = P[π(X) ≤ t] E[µ(X) | π(X) ≤ t] / E[Y]
                           ≤ P[π(X) ≤ t].

The announced result then follows by replacing t with Fπ⁻¹(α).


• Convexity :
- Proposition :
The concentration curve α 7→ CC[µ(X ), π(X ); α] is convex if, and only if,
µ(X ) is positively regression dependent on π(X ), that is, if the function
 
t ↦ E[µ(X) | π(X) = t]

is non-decreasing.
- The increments of the function

CC[Y, π(X); α + ∆] − CC[Y, π(X); α] = E[Y I[Fπ⁻¹(α) < π(X) ≤ Fπ⁻¹(α + ∆)]] / E[Y]

are thus non-decreasing in α.


Measuring goodness-of-lift

• Performance of a predictor :
- The performance of a predictor π(X ) is assessed by means of the respective
positions of the two curves

α 7→ LC[π(X ); α] and α 7→ CC[µ(X ), π(X ); α].

- As the total expected incomes under π and µ both match the total expected
loss, the two ratios are directly comparable.
- As actuaries, we would like that the graph of CC is as close as possible to the
graph of LC.
⇒ The smaller the area between the two curves the better.

Comparison of the performances of two predictors

Concentration and Lorenz curves

• We have two predictors π1 and π2 .


• Definition :
The premium π1 (X 1 ) is more discriminatory than π2 (X 2 ) if, and only if,

π2 (X 2 ) ≼cx π1 (X 1 ) ⇔ LC[π1 (X 1 ); α] ≤ LC[π2 (X 2 ); α] for all α

and the inequality

CC[µ(X ), π1 (X 1 ); α] ≤ CC[µ(X ), π2 (X 2 ); α]

holds for all probability levels α.


• Proposition :
If
π2 (X 2 ) ≼cx π1 (X 1 ) and (Y , Π2 ) ≼conc (Y , Π1 )
then predictor π1 (X 1 ) is more discriminatory than predictor π2 (X 2 ) for
response Y .
• π1 (X 1 ) is more discriminatory than π2 (X 2 ) if π1 (X 1 ) is simultaneously
more variable (in the sense of the convex order) and more correlated (in the
sense of the concordance order) with the response Y than π2 (X 2 ).


Integrated concentration and Lorenz curves

• The preference relation proposed earlier only forms a partial ranking :
- Two predictors might well be incomparable because their respective
concentration or Lorenz curves intersect.
• In such a case, we can base the comparison on the integral of the
concentration curves :
ICC[µ(X), π(X); α] = ∫₀^α CC[µ(X), π(X); ξ] dξ
                   = ∫₀^α E[µ(X) I[Π ≤ ξ]] / E[Y] dξ
                   = E[µ(X) (α − Π)+] / E[Y]
                   = Cov(µ(X), (α − Π)+) / E[Y] + E[(α − Π)+],

where

E[(α − Π)+] = ∫₀^α (α − ξ) dξ = α²/2.


• Again, as

E[Y (α − Π)+] = E[ E[Y (α − Π)+ | X] ] = E[µ(X) (α − Π)+],

we are allowed to replace µ(X) with Y in the definition of the integrated
concentration curve.
• ICC is the integral of the concentration curve over the whole interval [0, 1],
i.e.

ICC = ICC[µ(X), π(X); 1]
    = Cov(µ(X), 1 − Π) / E[Y] + 1/2
    = 1/2 − Cov(µ(X), Π) / E[Y].
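The identity ICC = 1/2 − Cov(µ(X), Π)/E[Y] can be sanity-checked by simulation in a synthetic setup where the true premium is observed (all choices below, including the lognormal µ and the perfect predictor π = µ, are ours and purely illustrative):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100_000

mu = rng.lognormal(mean=0.0, sigma=0.5, size=n)    # a known "true premium" mu(X)
Pi = (np.argsort(np.argsort(mu)) + 1) / n          # ranks of a perfect predictor pi = mu

# ICC as the numerical (trapezoidal) integral of the concentration curve over [0, 1] ...
alphas = np.linspace(0.0, 1.0, 201)
cc = np.array([mu[Pi <= a].sum() for a in alphas]) / mu.sum()
icc_integral = ((cc[1:] + cc[:-1]) / 2 * np.diff(alphas)).sum()

# ... against the closed form 1/2 - Cov(mu(X), Pi) / E[Y], with E[Y] = E[mu(X)].
icc_formula = 0.5 - np.cov(mu, Pi)[0, 1] / mu.mean()

assert abs(icc_integral - icc_formula) < 0.01
```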


Some useful insurance metrics : Area Between the Curves

• The area between the two curves CC and LC turns out to be a better
performance indicator.
• This area between the curves, ABC in short, is given by

ABC[π(X)] = ∫₀¹ ( CC[Y, π(X); α] − LC[π(X); α] ) dα
          = (1 / E[π(X)]) ∫₀¹ ( E[Y I[Π ≤ α]] − E[π(X) I[Π ≤ α]] ) dα
          = (1 / E[π(X)]) ∫₀¹ ∫₀^∞ ( P[π(X) ≤ y, Π ≤ α] − P[Y ≤ y, Π ≤ α] ) dy dα
          = ( Cov(π(X), Π) − Cov(Y, Π) ) / E[π(X)].
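Both routes to ABC, integrating CC − LC and the covariance identity, can be compared on simulated data (the portfolio below is synthetic and purely illustrative):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 100_000

y = rng.gamma(shape=2.0, scale=0.5, size=n)        # losses, mean 1
pi_hat = y * rng.lognormal(sigma=0.3, size=n)      # a noisy working premium
pi_hat *= y.mean() / pi_hat.mean()                 # enforce E[pi] = E[Y] in-sample

Pi = (np.argsort(np.argsort(pi_hat)) + 1) / n      # premium ranks
alphas = np.linspace(0.0, 1.0, 201)
cc = np.array([y[Pi <= a].sum() for a in alphas]) / y.sum()
lc = np.array([pi_hat[Pi <= a].sum() for a in alphas]) / pi_hat.sum()

# ABC as the numerical (trapezoidal) integral of CC - LC ...
diff = cc - lc
abc_integral = ((diff[1:] + diff[:-1]) / 2 * np.diff(alphas)).sum()

# ... against ABC = (Cov(pi, Pi) - Cov(Y, Pi)) / E[pi].
abc_formula = (np.cov(pi_hat, Pi)[0, 1] - np.cov(y, Pi)[0, 1]) / pi_hat.mean()

assert abs(abc_integral - abc_formula) < 0.01
```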

Numerical examples

Assumptions

• π(X) ∼ Gam(µ, σ²) with µ = 1.
- Ordered in the ≼cx-sense with σ.
• µ(X) ∼ Gam(µ, σY²) with µ = 1.
• Remark : E[µ(X)] = E[π(X)] = 1.
• Dependence structures linking µ(X ) and π(X ) :
- Clayton copula :

Cθ(u, v) = (u^(−θ) + v^(−θ) − 1)^(−1/θ), θ > 0.

- Frank’s copula :

Cθ(u, v) = −(1/θ) ln( 1 + (exp(−θu) − 1)(exp(−θv) − 1) / (exp(−θ) − 1) ), θ ≠ 0.

Both copulas express positive dependence ⇒ the previous results hold true.
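The Gam(1, 1)/Clayton(τ = 0.5) configuration appearing in the tables that follow can be reproduced approximately by simulation; a sketch (our own, not from the slides) using the Marshall–Olkin gamma-frailty sampler for the Clayton copula, with θ = 2τ/(1 − τ):

```python
import numpy as np

rng = np.random.default_rng(3)
n = 200_000
tau = 0.5
theta = 2 * tau / (1 - tau)                   # Clayton parameter (theta = 2)

# Marshall-Olkin construction: U_j = (1 + E_j / W)^(-1/theta), W ~ Gamma(1/theta).
w = rng.gamma(shape=1 / theta, size=n)
e = rng.exponential(size=(2, n))
u1, u2 = (1 + e / w) ** (-1 / theta)

# Gam(1, 1) margins are unit exponentials, so invert the exponential cdf.
mu, pi = -np.log1p(-u1), -np.log1p(-u2)

Pi = (np.argsort(np.argsort(pi)) + 1) / n
abc = (np.cov(pi, Pi)[0, 1] - np.cov(mu, Pi)[0, 1]) / pi.mean()
print(abc)   # should land close to the 9.66% reported for this configuration
```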


Variability

Line type     π(X)       µ(X)         Copula C           ABC
medium dash   Gam(1, 1)  Gam(1, 2)    Clayton(τ = 0.5)   6.33%
short dash    Gam(1, 1)  Gam(1, 1)    Clayton(τ = 0.5)   9.66%
dotted        Gam(1, 1)  Gam(1, 0.5)  Clayton(τ = 0.5)   13.08%


Dependence

Line type     π(X)       µ(X)       C                   ABC
medium dash   Gam(1, 1)  Gam(1, 1)  Clayton(τ = 0.75)   3.46%
short dash    Gam(1, 1)  Gam(1, 1)  Clayton(τ = 0.50)   9.66%
dotted        Gam(1, 1)  Gam(1, 1)  Clayton(τ = 0.25)   17.04%


Crossing copulas
• Consider a Clayton copula C1 and a Frank copula C2 .
• There exists a function f such that C1 (u, v ) − C2 (u, v ) ≤ 0 if v ≤ f (u) and
C1 (u, v ) − C2 (u, v ) ≥ 0 if v ≥ f (u).
⇒ Not ordered according to the concordance order.
Line type    π(X)       µ(X)       C                  ABC
short dash   Gam(1, 1)  Gam(1, 1)  Frank(τ = 0.5)     7.79%
dotted       Gam(1, 1)  Gam(1, 1)  Clayton(τ = 0.5)   9.66%


Non-regression dependent copula impact


• Consider the mixture

C(u, v) = (1 − θ) min{u, v} + θ max{0, u + v − 1}.

- Does not exhibit positive quadrant dependence.
- Positive expectation dependence if, and only if, θ ≤ 1/2.

Line type   π(X)       µ(X)       C         ABC
dotted      Gam(1, 1)  Gam(1, 1)  θ = 0.8   10%

Case study

Data set

• French motor third-party liability insurance portfolio freMTPL2freq, available
in the CASdatasets package in R.
- 678,013 observations ;
- Response Y : number of claims ;
- 9 explanatory variables (X = (X1 , . . . , X9 )) :
Policyholder : age, density of inhabitants in the home city, region, area,
bonus-malus ;
Car : power, age, brand, fuel type ;
- Exposure-to-risk.
• Partition the data set into
- Training set of 610,000 observations ;
- Validation set comprising the remaining observations.


Models

• Models of Noll et al. (2018) for the predictors πk (X k ) :
- glm1 : Poisson GLM with a log-link function and all explanatory variables
- glm3 : same as glm1 but without area and region variables
- pbm1 : boosted SBS (Standardized Binary Splits) tree (depth = 1, iterations
= 30)
- pbm3 : boosted SBS tree (depth = 3, iterations = 50)
- pbm3.s2 : boosted SBS tree (depth = 3, iterations = 50, shrinkage = 0.5)
- glm1.pbm3 : boosted SBS tree starting from glm1 fit (depth = 3, iterations =
50)
- nn : shallow neural network (20 neurons with one hidden layer).


In- and out-of-sample errors


Goodness-of-lift metrics



References
Denuit, M., Sznajder, D., Trufin, J. (2019).
Model selection based on Lorenz and concentration curves, Gini indices and convex order.
Insurance : Mathematics and Economics 89, 128-139.
Frees, E.W., Meyers, G., Cummings, A.D. (2011).
Summarizing insurance scores using a Gini index.
Journal of the American Statistical Association 106, 1085-1098.
Frees, E.W., Meyers, G., Cummings, A.D. (2013).
Insurance ratemaking and a Gini index.
Journal of Risk and Insurance 81, 335-366.
Gourieroux, C. (1992).
Courbes de performance, de sélection et de discrimination.
Annales d’Économie et de Statistique 28, 107-123.
Gourieroux, C., Jasiak, J. (2011).
The Econometrics of Individual Risk : Credit, Insurance, and Marketing.
Princeton University Press.
