Credibility Estimation of Distribution Functions with Applications to Experience Rating in General Insurance

To cite this article: Xiaoqiang Cai, Limin Wen, Xianyi Wu & Xian Zhou (2015) Credibility Estimation of Distribution Functions with Applications to Experience Rating in General Insurance, North American Actuarial Journal, 19:4, 311-335, DOI: 10.1080/10920277.2015.1057649

http://www.tandfonline.com/action/journalInformation?journalCode=uaaj20

North American Actuarial Journal, 19(4), 311–335, 2015
Copyright © Society of Actuaries
DOI: 10.1080/10920277.2015.1057649

Credibility Estimation of Distribution Functions with Applications to Experience Rating in General Insurance

Xiaoqiang Cai (1), Limin Wen (2), Xianyi Wu (3), and Xian Zhou (4)

(1) Department of Systems Engineering and Engineering Management, Chinese University of Hong Kong, Hong Kong, China
(2) Institute of Mathematics and Information Science, Jiangxi Normal University, Jiangxi, China
(3) Center of International Finance and Risk Management, East China Normal University, Shanghai, China
(4) Department of Applied Finance and Actuarial Studies, Macquarie University, Sydney, Australia

This article presents a new credibility estimation of the probability distributions of risks under Bayes settings in a completely nonparametric framework. In contrast to Ferguson's Bayesian nonparametric method, it does not need to specify a mathematical form of the prior distribution (such as a Dirichlet process). We then show the applications of the method in general insurance premium pricing, a procedure commonly known as experience rating, which uses the insured's claim experience to calculate a proper premium under a given premium principle (referred to as a risk measure). Because this method estimates the probability distributions of losses, not just the means and variances, it provides a unified nonparametric framework for experience rating under arbitrary premium principles. It retains the advantages of the well-known Bühlmann and Ferguson approaches while overcoming their drawbacks. We first establish a linear Bayes method and prove its strong consistency in nonparametric settings that require only knowledge of the first two moments of the loss distributions considered as a stochastic process. Then an empirical Bayes method is developed for the more general situation where a portfolio of risks is observed but no knowledge is available or assumed on their loss and prior distributions, including their moments. It is shown to be asymptotically optimal. The performance of our estimates in comparison with traditional methods is also evaluated through theoretical analysis and numerical studies, which show that our approach produces premium estimates close to the optima.

1. INTRODUCTION

Pricing or measuring risks is one of the central tasks of financial enterprises and regulators in risk management. Numerous risk measures have been proposed for this purpose, including value-at-risk, conditional value-at-risk, coherent risk measures, and distortion risk measures, to name just a few; see, for example, Dhaene et al. (2006), Natarajan et al. (2009), Szegö (2002), Wu and Zhou (2006), and the references therein. In general insurance, which compensates insured losses in property, health, business, employment, etc., risks are priced by premium calculation principles, or premium principles for short, which are the translations of risk measures in insurance markets. Excellent reviews of premium principles can be found in Kaas et al. (2001, chapter 5), Sundt (1999), and Young (2004), among others. In this article, the terms "risk measure" and "premium principle" are used interchangeably to indicate a rule $H(X)$ that assigns a premium to a risk $X$ in terms of a functional of its distribution function.

In practice, determining an appropriate premium principle is only the first step in risk pricing, because the distributions involved are generally unknown, so that $H(X)$ can be estimated only by using the insured's claim experience together with effective statistical methods. These procedures form the so-called experience rating. We are concerned with solutions to this problem under a Bayesian setting: The decumulative distribution function $\Pr(X > x)$ (abbreviated as ddf hereafter) of a risk $X$ is identified by an unknown and unobservable parameter (vector) $\theta$, written formally as $S(x,\theta) = \Pr(X > x \mid \theta)$, and $\theta$ is a random variable. The distribution of $\theta$, denoted by $\pi(\theta)$ (specified or unspecified), is referred to as a prior distribution in statistics and a structural function in actuarial science. The majority of the literature has focused on situations where $S(x,\theta)$ is fully specified given $\theta$ and $\pi(\theta)$ is known (or assumed); thus the problems can be solved with the standard parametric Bayesian methodology through posterior update. Relevant work in this area includes Heilmann (1989), Klugman (1992), Makov et al. (1996), Pai (1997), Schmidt (1998), and Gomez et al. (2000, 2006). Although the applications of the parametric Bayesian methodology have been extensively investigated for general insurance, the reality is that the knowledge of the mechanism underlying the contingent loss is generally insufficient to specify $\pi(\theta)$, making these applications impractical. To deal with such situations, one approach is to allow some unknown parameters in $\pi(\theta)$ but retain an assumed mathematical form of $\pi(\theta)$. Then the risk pricing can be carried out by means of empirical Bayes analysis introduced by Robbins (1955, 1964).

Address correspondence to Xianyi Wu, Department of Statistics and Actuarial Science, East China Normal University, 200241 Shanghai, China. E-mail: xywu@stat.ecnu.edu.cn

In many insurance practices, however, even a mathematical formula for $S(x,\theta)$ is not available due to the scarcity of information. Under this circumstance, the analysis can be done only on a distribution-free or nonparametric basis. The best-known solution in the actuarial science community so far is Bühlmann's approach to the net premium principle (known as credibility theory), whose optimality is demonstrated by Bühlmann (1967). Under the net premium principle, Bühlmann's credibility premium of a future claim $X_{n+1}$ can be simply expressed as a weighted average between the empirical individual mean (indicating the information delivered by historical data of the risk itself) and the collective mean (indicating the aggregated information obtained from all possible insureds):

$$z\,\bar X_n + (1 - z)\,\mu_0, \tag{1.1}$$

where $\bar X_n = n^{-1}\sum_{i=1}^{n} X_i$ is the empirical mean of the claims data $\{X_1, X_2, \ldots, X_n\}$, $\mu_0 = E[X]$ is known as the collective mean of the risk, and $z$ is the credibility factor. This linear weighted average has been so well received that it is almost regarded as a synonym for credibility theory. Quite a few remarkable contributions have been reported in this direction since Bühlmann's pioneering work, including the weighted credibility models of Bühlmann and Straub (1970), the credibility for linear regression models by Hachemeister (1975), the strong consistency of credibility estimates by Schmidt (1991), the asymptotic optimality of empirical credibility by Mashayekhi (2002), and the credibility for seemingly unrelated regression models by Pitselis (2004); see Norberg (2004) or Bühlmann and Gisler (2005) for a more comprehensive review.
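The weighted average in (1.1) can be illustrated with a short sketch. It assumes the standard Bühlmann credibility factor $z = n\tau^2/(\sigma^2 + n\tau^2)$, with $\sigma^2 = E[\mathrm{Var}(X\mid\theta)]$ and $\tau^2 = \mathrm{Var}(E[X\mid\theta])$ taken as known; the function name and the numbers are hypothetical and serve only as an illustration.

```python
def buhlmann_premium(claims, collective_mean, sigma2, tau2):
    """Return (premium, z) under the net premium principle.

    Assumes sigma2 = E[Var(X | theta)] and tau2 = Var(E[X | theta]) are known.
    """
    n = len(claims)
    z = n * tau2 / (sigma2 + n * tau2)    # credibility factor
    individual_mean = sum(claims) / n     # empirical mean of the risk itself
    return z * individual_mean + (1.0 - z) * collective_mean, z


# Hypothetical inputs: three observed claims and assumed structure parameters.
premium, z = buhlmann_premium([12.0, 8.0, 10.0], collective_mean=20.0,
                              sigma2=4.0, tau2=2.0)
print(z, premium)
```

With more observations the factor `z` moves toward 1, so the premium leans more heavily on the risk's own experience.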

Another important solution stream to this problem originated from the seminal paper of Ferguson (1973) and a following paper

by Antoniak (1974) on Bayesian nonparametrics. It was subsequently applied to experience rating by Zehnwirth (1977, 1979,

1981) and Lau et al. (2006), among others. Its idea is briefly outlined below; more details can be found in the monograph of Ghosh

and Ramamoorthi (2003).

Let $X$ and $\theta$ be two random variables on a measurable space $(\mathbb{R}, \mathcal{B})$, where $X$ has a (conditional) ddf $S(x,\theta) = \Pr(X > x \mid \theta)$ and the distribution of $\theta$ is determined by a finite Borel measure $\alpha(\cdot)$ on the real line $\mathbb{R}$ such that $\Pr(\theta \in A) = \alpha(A)/\alpha(\mathbb{R})$ for all $A \in \mathcal{B}$. We can consider $\{S(x,\theta), x \in \mathbb{R}\}$ as a stochastic process indexed by $x$ in the sense that, given each $x \in \mathbb{R}$, $S(x,\theta)$ is a random variable (as a function of $\theta$). Denote by $P_\theta$ the probability measure with ddf $S(x,\theta)$ such that $P_\theta((x,\infty)) = S(x,\theta)$ for all $x \in \mathbb{R}$. Then for each $A \in \mathcal{B}$, $P_\theta(A)$ is a random variable with distribution determined by $\alpha(\cdot)$, and $\{P_\theta(A), A \in \mathcal{B}\}$ is a stochastic process indexed by $A \in \mathcal{B}$. It is in the above sense that $P_\theta$ is considered as a random probability measure and referred to as a Dirichlet process with time horizon $\mathcal{B}$ and prior distribution determined by $\alpha(\cdot)$.

After losses $X_1, \ldots, X_n$ are observed, the posterior distribution of $P_\theta$ is also a Dirichlet process but with $\alpha$ being updated to $\alpha + \sum_{i=1}^{n} \delta_{X_i}$, where $\delta_{X_i}$ is the probability measure degenerated at $X_i$. Consequently, $S(x,\theta)$ can be estimated by the posterior mean

$$\hat S(x,\theta) = \frac{\alpha((x,\infty))}{\alpha(\mathbb{R}) + n} + \frac{n}{\alpha(\mathbb{R}) + n}\, S_n(x), \tag{1.2}$$

where $S_n(x) = n^{-1}\sum_{i=1}^{n} I(X_i > x)$. The corresponding empirical Bayes versions when the prior Dirichlet process contains a few hyperparameters are discussed in Zehnwirth (1981). Then the experience rating can be carried out by inserting the estimated ddf of $X$ into any premium principle.
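As a minimal sketch of the posterior mean (1.2), the snippet below writes the Dirichlet measure as a total mass $\alpha(\mathbb{R})$ times a prior ddf, so that $\alpha((x,\infty)) = \alpha(\mathbb{R}) \times$ `prior_ddf(x)`. The function name, the constant prior ddf, and the data are hypothetical choices for illustration only.

```python
def dp_posterior_ddf(x, claims, prior_ddf, total_mass):
    """Posterior mean of S(x, theta) under a Dirichlet process prior, as in (1.2).

    total_mass plays the role of alpha(R); total_mass * prior_ddf(x) plays the
    role of alpha((x, infinity)). Both are assumptions of this sketch.
    """
    n = len(claims)
    s_n = sum(1 for c in claims if c > x) / n      # empirical ddf S_n(x)
    return (total_mass * prior_ddf(x) + n * s_n) / (total_mass + n)


# Hypothetical example: four claims, prior ddf equal to 0.2 at x = 2.5,
# and total prior mass alpha(R) = 6.
estimate = dp_posterior_ddf(2.5, [1.0, 2.0, 3.0, 4.0],
                            prior_ddf=lambda x: 0.2, total_mass=6.0)
print(estimate)
```

The total mass acts like a prior sample size: the larger it is relative to `n`, the closer the estimate stays to the prior ddf.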

For Bühlmann-type solutions, unfortunately, as noted by Bühlmann (1970) and Gerber (1980), it is not easy to directly transplant Bühlmann's method to other premium principles, so almost all contributions in this area have been limited to the net premium principle. The few exceptions are, chronologically, the variance premium principle (Bühlmann 1970, chapter 4), the Esscher premium principle and a Bühlmann-type credibility estimation of the variance premium principle (Gerber 1980; Goovaerts et al. 1990; Pan et al. 2008), and the general weighted loss function premium principle (due to Furman and Zitikis 2008) studied by Wen et al. (2009). These exceptions show that it is not feasible to directly apply the idea of Bühlmann (1967) to experience ratemaking under arbitrary premium principles. On the other hand, for Ferguson-type solutions, while the approach provides a unified solution for all premium principles, it requires the assumption of a Dirichlet prior and thus has an apparent drawback: like any other prior in Bayesian methodology, the Dirichlet prior is proposed mainly for mathematical convenience but is hardly justifiable in practice. Without the Dirichlet assumption, the estimator in (1.2) would lose its justification, and the risk premium $H(X_{n+1})$ by Ferguson's approach would not be credible when the true prior is far from the Dirichlet.

In this article we introduce a new distribution-free approach for experience rating under arbitrary premium principles without precise specification of the prior distribution. It has two direct advantages: (1) truly distribution-free settings as in Bühlmann's credibility theory and (2) experience rating for arbitrary premium principles as achieved by Ferguson's Bayesian nonparametric method. The main idea is to first derive an estimate of $S(x,\theta)$ by minimizing the $L^2$-distance and then embed the estimated ddf, say $\hat S(x,\theta)$, into the premium principle functional to obtain an estimate of the corresponding premium, $\hat H(X,\theta) = H(\hat S(x,\theta))$. Jewell (1974), by comparison, estimated $S(x,\theta)$ separately for every fixed point $x$, so that the estimators are not necessarily ddfs and hence may not produce empirical premiums when plugged into the mathematical formulae of premium principles. Our estimates work smoothly for empirical ratemaking because they use a ddf to estimate $S(x,\theta)$.

Specifically, we consider the following two models:

(1) Model I: The first two moments of $S(x,\theta)$ with respect to the prior can be specified. In this case the data $X_1, X_2, \ldots, X_n, \ldots$ are (conditionally) i.i.d. copies of $X$ given $\theta$. The investigation of Model I brings insight into and motivates the estimation methods of the more practically meaningful Model II next.

(2) Model II: This is a general model in which we do not need to specify any moments of $S(x,\theta)$. Suppose that we have a portfolio of risks $X_i$, $i = 1, 2, \ldots, K$, where each $X_i$ is identified with a parameter (vector) $\theta_i$ and contributes a series of data $X_{i1}, X_{i2}, \ldots, X_{in_i}, \ldots$, which are i.i.d. copies of $X_i$, given $\theta_i$. The purpose is again to estimate the distributions and then assign proper premiums to the future claims $X_{i,n_i+1}$, $i = 1, 2, \ldots, K$, under a certain premium principle.

This case belongs to the framework of empirical Bayes methods first introduced by Robbins (1955, 1964), then applied to credibility theory by Norberg (1980) under the net premium principle, and by Zehnwirth (1981) for any premium principle under Dirichlet priors.

Our work leads to a new and purely nonparametric estimation method for the ddf under the framework of Bayesian methodology.

To overcome the restriction of Dirichlet priors, we develop an approach in a quite different way from the so-called Bayesian

nonparametrics initiated by Ferguson (1973). It turns out, however, that our estimation happens to coincide with that of Ferguson

(1973) when the prior distribution is Dirichlet; see Remark 2.2 in Section 2 for more details. This is a surprising by-product and

shows that our results actually generalize those of Ferguson (1973).

The remainder of the article is organized as follows. In Section 2, starting with a naive idea of using Bühlmann's method considered by Jewell (1974), we show by a counterexample that Jewell's credibility does not estimate a ddf by a function that is itself a ddf. We then develop the criterion for deriving the linear Bayes estimation and discuss the optimal estimation of $S(x,\theta)$ under Models I and II together with the asymptotic properties of the estimators. Section 3 treats the experience rating under a number of well-known premium principles. Concluding remarks are given in Section 4. To smooth the flow of presentation, most of the technical proofs and auxiliary materials are relegated to appendices.

We first study Model I; that is, both the ddf $S(x,\theta)$ of the risk $X$ and the prior distribution $\pi(\theta)$ of the unknown parameter $\theta$ are unspecified, but their first two moments are known. This assumption will be removed when the empirical Bayes estimation method is used. This section first analyzes a naive application of Bühlmann's idea to estimate $S(x,\theta)$ (Jewell 1974) to show that such a scheme does not work for Model I and that a new framework is therefore needed.

Write

$$S_0(x) = E_\pi[S(x,\theta)], \quad \tau_0^2(x) = \mathrm{Var}_\pi(S(x,\theta)), \quad\text{and}\quad \sigma_0^2(x) = E_\pi\big[\mathrm{Var}(I_{\{X>x\}} \mid \theta)\big], \tag{2.1}$$

where $I_{\{X>x\}}$ is the indicator of the event $\{X > x\}$ and $E_\pi$ and $\mathrm{Var}_\pi$ indicate that the expectations are computed with respect to the distribution of $\theta$. We also denote $\mu(\theta) = E[X \mid \theta]$, $\sigma^2 = E_\pi[\mathrm{Var}(X \mid \theta)]$, and $\tau^2 = \mathrm{Var}_\pi(E[X \mid \theta])$. The existence of the expectations in (2.1) is obvious because the random variables involved take values in $[0, 1]$.

2.1. The Performance Measure to Be Optimized

Following Jewell (1974), because $S(x,\theta) = E[I(X > x) \mid \theta]$ for a fixed $x$, the idea of Bühlmann (1967) suggests a credibility estimate of $S(x,\theta)$ given by

$$\tilde S(x,\theta) = Z(x)\,S_n(x) + (1 - Z(x))\,S_0(x), \tag{2.2}$$

where $S_n(x) = n^{-1}\sum_{i=1}^{n} I(X_i > x)$ is the empirical ddf based on the available data set $(X_1, X_2, \ldots, X_n)$, and $Z(x) = n\tau_0^2(x)/[\sigma_0^2(x) + n\tau_0^2(x)]$ is the credibility factor with $\sigma_0^2(x)$ and $\tau_0^2(x)$ defined in (2.1); $x$ is included to indicate the dependence on $x$. Noting that

$$\sigma_0^2(x) = S_0(x) - E[S^2(x,\theta)] \quad\text{and}\quad \tau_0^2(x) = E[S^2(x,\theta)] - S_0^2(x) \tag{2.3}$$

and inserting $Z(x)$ into (2.2), $\tilde S(x,\theta)$ can be more precisely rewritten as

$$\tilde S(x,\theta) = \sum_{i=1}^{n+1} \frac{(n-i+1)\,E[S^2(x,\theta)] - (n-i)\,S_0^2(x) - S_0(x)\,E[S^2(x,\theta)]}{(n-1)\,E[S^2(x,\theta)] - n S_0^2(x) + S_0(x)}\; I_{[X_{(i-1)},\,X_{(i)})}(x), \tag{2.4}$$

where $-\infty = X_{(0)} < X_{(1)} \le X_{(2)} \le \cdots \le X_{(n)} < X_{(n+1)} = \infty$ are the order statistics of the sample.

This $\tilde S(x,\theta)$, however, is not generally monotone in $x$ and thus not suitable as an estimate of the ddf, as explicitly shown in the counterexample below.

Counterexample: Consider the situation where $X$ is nonnegative with $E[S^2(x,\theta)] = q[(1-x)_+]^2$ and $S_0(x) = q(1-x)_+$ for all $x \ge 0$ and a known constant $q \in (0,1)$, where $(a)_+ = \max(a, 0)$. One example is that $\theta$ takes only the values 0 (with probability $1-q$) and 1 (with probability $q$), $X$ degenerates at 0 for $\theta = 0$, and $S(x,1) = (1-x)_+$ for all $x \ge 0$, so that $\sigma_0^2(x) = qx(1-x)$ and $\tau_0^2(x) = q(1-q)(1-x)^2$ for all $x \in [0,1]$ and zero otherwise. For $x \in (X_{(n)}, 1)$, where $S_n(x) = 0$, the credibility estimate of $S(x,\theta)$ is

$$\tilde S(x,\theta) = (1 - Z(x))\,S_0(x) = \frac{qx(1-x)}{n(1-q) - (n(1-q) - 1)x},$$

with derivative

$$\frac{d\tilde S(x,\theta)}{dx} = \frac{q\big[(\sqrt{n(1-q)} - 1)x - \sqrt{n(1-q)}\big]\big[(\sqrt{n(1-q)} + 1)x - \sqrt{n(1-q)}\big]}{[n(1-q) - (n(1-q) - 1)x]^2}.$$

It is easy to check that $d\tilde S(x,\theta)/dx$ is strictly positive at every $x \in (0, b)$ and negative at every $x \in (b, 1]$, where $b = \sqrt{n(1-q)}/(\sqrt{n(1-q)} + 1) \in (0,1)$. Consequently, as long as $X_{(n)} < b$, $\tilde S(x,\theta)$ is increasing in $x \in (X_{(n)}, b)$ and decreasing in $x \in (b, 1)$. Thus $\tilde S(x,\theta)$ is not monotone.
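The counterexample can be checked numerically. The sketch below evaluates $(1 - Z(x))S_0(x)$ on a grid above the largest observation, using the pointwise structure functions $\sigma_0^2(x) = qx(1-x)$ and $\tau_0^2(x) = q(1-q)(1-x)^2$ from the text; the values $n = 5$ and $q = 0.5$ and the function name are arbitrary illustrative choices.

```python
import math


def naive_estimate_above_sample(x, n, q):
    """(1 - Z(x)) * S0(x) for x in (X_(n), 1), where the empirical ddf is 0."""
    sigma2_x = q * x * (1.0 - x)                 # sigma_0^2(x)
    tau2_x = q * (1.0 - q) * (1.0 - x) ** 2      # tau_0^2(x)
    z_x = n * tau2_x / (sigma2_x + n * tau2_x)   # pointwise credibility factor
    return (1.0 - z_x) * q * (1.0 - x)


n, q = 5, 0.5
root = math.sqrt(n * (1.0 - q))
b = root / (root + 1.0)                          # turning point from the text

grid = [0.05 * k for k in range(1, 20)]          # 0.05, 0.10, ..., 0.95
values = [naive_estimate_above_sample(x, n, q) for x in grid]
pairs = list(zip(grid, values))

# Increasing below b, decreasing above b: not a valid ddf on this range.
rising = all(v1 < v2 for (_, v1), (x2, v2) in zip(pairs, pairs[1:]) if x2 <= b)
falling = all(v1 > v2 for (x1, v1), (_, v2) in zip(pairs, pairs[1:]) if x1 >= b)
print(b, rising, falling)
```

If the largest observation lies below `b`, the estimate rises before it falls, which is exactly the failure of monotonicity described above.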

The lack of monotonicity in the naive credibility estimate $\tilde S(x,\theta)$ makes it difficult to, for example,

- give an intuitive interpretation for $\tilde S(x,\theta)$ and
- compute premiums under $\tilde S(x,\theta)$.

This difficulty is obviously due to the dependence of the credibility factor $Z(x)$ on $x$. A natural and tractable remedy is to restrict the credibility factor $Z$ to be a constant free of $x$, construct a convex set of ddfs containing the empirical distributions, and then find an optimal estimate $\hat S(x,\theta)$ in that convex set. In this article, we propose to obtain such an optimal estimate $\hat S(x,\theta)$ by seeking a function of the form

$$\alpha_0(x) + \sum_{i=1}^{n} \alpha_i\, I(X_i > x) \tag{2.5}$$

that solves

$$\min_{\alpha_0(x),\,\alpha_1,\ldots,\alpha_n} E\int_{-\infty}^{+\infty} \Big[S(x,\theta) - \alpha_0(x) - \sum_{i=1}^{n} \alpha_i\, I(X_i > x)\Big]^2 dx, \tag{2.6}$$

where $\alpha_0(x)$ is a nonincreasing function of $x$, independent of the claims history, and $\alpha_1, \ldots, \alpha_n$ are real-valued decision variables of the optimization problem. Though initially it appears that we need to require the estimating function in (2.5) to be a ddf, this turns out to be unnecessary because the solution to (2.6) meets this condition automatically and $\alpha_0(x)$ is proportional to the marginal ddf $S_0(x)$ of $X$; see Theorem 2.1 below.

Remark 2.1. If $X$ has a finite mean, it can be easily checked that the integral in (2.6) as well as $\int \sigma_0^2(x)\,dx$ and $\int \tau_0^2(x)\,dx$ are finite and all the theory below is valid. Otherwise, the integral in (2.6) may be infinite, so that the criterion (2.6) fails; an easy remedy is to use a probability distribution function $W(x)$ and replace $dx$ with $dW(x)$. Then all the results below remain valid under $W(x)$.

The following theorem gives an estimator of the ddf by solving the optimization problem (2.6).

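Theorem 2.1. The credibility estimator of the ddf $S(x,\theta)$, which minimizes (2.6), is given by

$$\hat S(x,\theta) = Z\,S_n(x) + (1 - Z)\,S_0(x), \tag{2.7}$$

where

$$Z = \frac{n\tau_0^2}{n\tau_0^2 + \sigma_0^2} \quad\text{with}\quad \tau_0^2 = \int \tau_0^2(x)\,dx \quad\text{and}\quad \sigma_0^2 = \int \sigma_0^2(x)\,dx, \tag{2.8}$$

which is referred to as the credibility factor as well. Moreover, the mean integrated squared error for the optimal estimate is

$$E\int_{-\infty}^{+\infty} \big[\hat S(x,\theta) - S(x,\theta)\big]^2 dx = \frac{\sigma_0^2\,\tau_0^2}{\sigma_0^2 + n\tau_0^2}. \tag{2.9}$$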

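Proof. By temporarily writing $Y_i = I(X_i > x)$ and using the notation in (2.1), the integrand in (2.6) can be computed by

$$\begin{aligned}
E\Big[S(x,\theta) - \alpha_0(x) - \sum_{i=1}^{n} \alpha_i Y_i\Big]^2
&= \mathrm{Var}\Big(S(x,\theta) - \sum_{i=1}^{n} \alpha_i Y_i\Big) + \Big[\Big(1 - \sum_{i=1}^{n} \alpha_i\Big) S_0(x) - \alpha_0(x)\Big]^2 \\
&= \sum_{i=1}^{n} \alpha_i^2\,\sigma_0^2(x) + \Big(1 - \sum_{i=1}^{n} \alpha_i\Big)^2 \tau_0^2(x) + \Big[\Big(1 - \sum_{i=1}^{n} \alpha_i\Big) S_0(x) - \alpha_0(x)\Big]^2
\end{aligned}$$

and thus is minimized with respect to $\alpha_0(x)$ at $\alpha_0(x) = \big(1 - \sum_{i=1}^{n} \alpha_i\big) S_0(x)$. The corresponding minimum is

$$\mathrm{Var}\Big(S(x,\theta) - \sum_{i=1}^{n} \alpha_i Y_i\Big) = \sum_{i=1}^{n} \alpha_i^2\,\sigma_0^2(x) + \Big(1 - \sum_{i=1}^{n} \alpha_i\Big)^2 \tau_0^2(x).$$

Integrating it with respect to $x$ and then minimizing with respect to the $\alpha_i$ leads to $\alpha_j = \tau_0^2/(n\tau_0^2 + \sigma_0^2)$, $j = 1, 2, \ldots, n$, and the final minimum $\sigma_0^2\tau_0^2/(\sigma_0^2 + n\tau_0^2)$. This completes the proof.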
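The estimator (2.7) and the minimization step of the proof can be sketched as follows. The structure parameters are taken from the two-point prior of the counterexample with $q = 0.5$, for which integrating $\sigma_0^2(x)$ and $\tau_0^2(x)$ over $[0,1]$ gives $\sigma_0^2 = q/6$ and $\tau_0^2 = q(1-q)/3$. The function names and the claims data are hypothetical, and `integrated_risk` restricts attention to a common weight for all $\alpha_i$, which is where the optimum lies by symmetry.

```python
def credibility_factor(n, sigma2, tau2):
    """Credibility factor Z of (2.8), given the integrated structure parameters."""
    return n * tau2 / (n * tau2 + sigma2)


def credibility_ddf(x, claims, collective_ddf, sigma2, tau2):
    """Credibility estimate (2.7): Z * S_n(x) + (1 - Z) * S_0(x)."""
    n = len(claims)
    z = credibility_factor(n, sigma2, tau2)
    s_n = sum(1 for c in claims if c > x) / n
    return z * s_n + (1.0 - z) * collective_ddf(x)


def integrated_risk(weight, n, sigma2, tau2):
    """Integrated objective after the inner minimization, all alpha_i = weight."""
    return n * weight ** 2 * sigma2 + (1.0 - n * weight) ** 2 * tau2


q, n = 0.5, 4
sigma2, tau2 = q / 6.0, q * (1.0 - q) / 3.0


def collective_ddf(x):
    # S_0(x) = q * (1 - x)_+ for x >= 0, and 1 for x < 0 (nonnegative risk).
    return 1.0 if x < 0 else q * max(1.0 - x, 0.0)


best = tau2 / (n * tau2 + sigma2)                   # alpha_j from the proof
min_risk = integrated_risk(best, n, sigma2, tau2)   # should match (2.9)
estimate = credibility_ddf(0.5, [0.2, 0.4, 0.6, 0.8], collective_ddf, sigma2, tau2)
print(best, min_risk, estimate)
```

Because `Z` does not depend on `x`, the estimate is a convex combination of two ddfs and is therefore itself nonincreasing in `x`.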

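Remark 2.2. Let $(\Omega, \mathcal{A}, P)$ be a probability space and $S(x,\theta) = \Pr(X > x \mid \theta) = P_\theta((x,\infty))$, where $P_\theta$ is a random probability measure on $(\mathbb{R}, \mathcal{B})$ (a stochastic process indexed by $B \in \mathcal{B}$, for which the randomness comes from $\theta$). If $P_\theta$ is a Dirichlet process with parameter $\alpha(\cdot)$ (a finite Borel measure on the real line $\mathbb{R}$) as defined in Ferguson (1973), then, writing $\alpha(x)$ for $\alpha((x,\infty))$, $S(x,\theta) \sim \mathrm{Beta}(\alpha(x), \alpha(\mathbb{R}) - \alpha(x))$ with density

$$\pi(s \mid x) = \frac{\Gamma(\alpha(\mathbb{R}))}{\Gamma(\alpha(x))\,\Gamma(\alpha(\mathbb{R}) - \alpha(x))}\; s^{\alpha(x)-1}(1-s)^{\alpha(\mathbb{R})-\alpha(x)-1},$$

so that

$$S_0(x) = E[S(x,\theta)] = \frac{\alpha(x)}{\alpha(\mathbb{R})} \quad\text{and}\quad E[S^2(x,\theta)] = \frac{\alpha(x)(\alpha(x)+1)}{\alpha(\mathbb{R})(\alpha(\mathbb{R})+1)}.$$

Consequently, by (2.3),

$$\sigma_0^2(x) = \frac{\alpha(x)(\alpha(\mathbb{R}) - \alpha(x))}{\alpha(\mathbb{R})(\alpha(\mathbb{R})+1)} \quad\text{and}\quad \tau_0^2(x) = \frac{\alpha(x)(\alpha(\mathbb{R}) - \alpha(x))}{\alpha^2(\mathbb{R})(\alpha(\mathbb{R})+1)}.$$

Inserting these into (2.8) gives the credibility factor $Z = n\tau_0^2/(n\tau_0^2 + \sigma_0^2) = n/(\alpha(\mathbb{R}) + n)$. Thus the estimate in (2.7) is the same as (1.2). In this respect, Theorem 2.1 acts as a linearized version of Ferguson's theory with weaker prior assumptions: Theorem 2.1 does not require the Dirichlet prior as in Ferguson (1973).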
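The key step in Remark 2.2 is that $\sigma_0^2(x)/\tau_0^2(x) = \alpha(\mathbb{R})$ at every $x$, so the ratio survives integration and $Z = n/(\alpha(\mathbb{R}) + n)$. A small sketch of this check, using the Beta moments and (2.3); the function name and the numerical values of $\alpha(x)$ and $\alpha(\mathbb{R})$ are arbitrary illustrations.

```python
def beta_structure(a_x, a_total):
    """Pointwise structure functions under a Dirichlet process prior.

    a_x stands for alpha((x, infinity)) and a_total for alpha(R).
    """
    s0 = a_x / a_total                                   # E[S(x, theta)]
    es2 = a_x * (a_x + 1.0) / (a_total * (a_total + 1.0))  # E[S^2(x, theta)]
    sigma2_x = s0 - es2                                  # first part of (2.3)
    tau2_x = es2 - s0 ** 2                               # second part of (2.3)
    return sigma2_x, tau2_x


a_total, n = 6.0, 4
ratios = []
for a_x in (0.5, 1.0, 2.5, 5.0):
    sigma2_x, tau2_x = beta_structure(a_x, a_total)
    ratios.append(sigma2_x / tau2_x)                     # equals alpha(R) each time

z = n / (n + ratios[0])                                  # n / (alpha(R) + n)
print(ratios, z)
```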

Since $Z \in (0, 1)$, as a weighted sum of the empirical ddf $S_n(x)$ and the marginal ddf $S_0(x)$ of $X$ (the so-called collective ddf), the credibility estimator (2.7) is clearly a ddf. In addition, $Z \to 1$ as $n \to \infty$ and $Z \to 0$ as $n \to 0$, which allows the classical credibility interpretation: More data lead to a more credible empirical ddf $S_n(x)$, where $n = 0$ indicates the extreme situation where no sample is observed. Furthermore, we have the following theorem on the strong consistency of the estimator.

Theorem 2.2. $\lim_{n\to\infty} \sup_x |\hat S(x,\theta) - S(x,\theta)| = 0$ almost surely (a.s.).

Proof. This follows from the well-known Glivenko-Cantelli theorem on empirical distributions, together with $Z \to 1$:

$$\sup_x |\hat S(x,\theta) - S(x,\theta)| \le Z \sup_x |S(x,\theta) - S_n(x)| + (1 - Z) \sup_x |S(x,\theta) - S_0(x)| \to 0.$$

We now consider the general situation where both the conditional ddf $S(x,\theta)$ and the structure function $\pi(\theta)$ are completely unknown, including the structure parameters $S_0(x)$, $\sigma_0^2$, and $\tau_0^2$. We propose an empirical Bayes method to estimate $S_0(x)$, $\sigma_0^2$, and $\tau_0^2$ based on the claim experiences over a number of risks in the same portfolio. More specifically, let $X_1, X_2, \ldots, X_K$ denote $K$ risks under observation. Each $X_i$ has a ddf characterized by a risk parameter $\theta_i$ and contributes a sequence of claim experiences denoted by the vector $\mathbf{X}_i = (X_{i1}, X_{i2}, \ldots, X_{in_i})$ over $n_i$ time periods, $i = 1, \ldots, K$, subject to the following usual assumptions.

Assumption 2.1

1. Conditional on $\theta_i$, the random variables $X_{ij}$ ($j = 1, 2, \ldots, n_i$) are i.i.d. copies of $X_i$ with common unknown ddf and moments: $S(x,\theta_i) = \Pr(X_{ij} > x \mid \theta_i)$ and $\sigma^2(x,\theta_i) = \mathrm{Var}(I_{(X_{ij} > x)} \mid \theta_i)$, $i = 1, 2, \ldots, K$, $j = 1, 2, \ldots, n_i$.

2. The random variables $\theta_1, \ldots, \theta_K$ are i.i.d. with a common but unknown prior distribution $\pi(\theta)$, so that $S_0(x) = E[S(x,\theta)]$, $\tau_0^2(x) = \mathrm{Var}(S(x,\theta))$, and $\sigma_0^2(x) = E[\sigma^2(x,\theta)]$.

The task of estimating the individual ddf in a distribution-free setting is then accomplished in two steps. The first step involves homogeneous estimation of the ddfs, and the second estimates the structural parameters $\sigma_0^2$ and $\tau_0^2$.

Write $S_i(x) = n_i^{-1}\sum_{j=1}^{n_i} I(X_{ij} > x)$, $i = 1, 2, \ldots, K$. To obtain the credibility estimator of $S(x,\theta_i)$, consider the class of ddfs

$$\mathcal{L} = \Big\{ \sum_{s=1}^{K}\sum_{t=1}^{n_s} a_{st}\, I(X_{st} > x) :\; a_{st} \in \mathbb{R},\; \sum_{s=1}^{K}\sum_{t=1}^{n_s} a_{st} = 1 \Big\} \tag{2.10}$$

and the minimization problem

$$\min_{g \in \mathcal{L}} E\int_{-\infty}^{+\infty} [g - S(x,\theta_i)]^2\, dx. \tag{2.11}$$

The solution is stated in the next theorem. The proof of the first part is similar to that of Theorem 2.1 and hence omitted; the proof of the second part is given in Section A.1.1 of the appendices.

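Theorem 2.3. The homogeneous credibility estimator of $S(x,\theta_i)$ as the solution to (2.11) is

$$\hat S^{h}(x,\theta_i) = Z_i\,S_i(x) + (1 - Z_i)\,\bar S_Z(x), \tag{2.12}$$

where

$$Z_i = \frac{n_i\tau_0^2}{\sigma_0^2 + n_i\tau_0^2}, \quad i = 1, 2, \ldots, K, \quad\text{and}\quad \bar S_Z(x) = \frac{1}{\sum_{r=1}^{K} Z_r}\sum_{r=1}^{K} Z_r\,S_r(x). \tag{2.13}$$

The mean integrated squared error of $\hat S^{h}(x,\theta_i)$ is

$$E\int \big[\hat S^{h}(x,\theta_i) - S(x,\theta_i)\big]^2 dx = \frac{\sum_{r\ne i} Z_r + 1}{\sum_{r\ne i} Z_r + Z_i}\cdot\frac{\sigma_0^2\,\tau_0^2}{\sigma_0^2 + n_i\tau_0^2}. \tag{2.14}$$

The mean integrated squared error in (2.14)

- is decreasing in every $n_r$, $r = 1, 2, \ldots, K$, indicating that adding samples to any policies improves the estimate;
- converges to zero as $n_i \to \infty$, regardless of how the other $n_r$ ($r \ne i$) change, indicating that increasing the sample size of policy $i$ will pinpoint the true $S(x,\theta_i)$; and
- does not tend to zero if $n_i$ does not tend to infinity, indicating that observing the claims from the policies other than $i$ cannot pinpoint the true risk characteristic of policy $i$.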

One can also consider the class

$$\mathcal{L}_1 = \Big\{ a_0\,\alpha_0(x) + \sum_{s=1}^{K}\sum_{t=1}^{n_s} a_{st}\, I(X_{st} > x) :\; a_0, a_{st} \in \mathbb{R}, \text{ and } \alpha_0(x) \text{ is an arbitrary ddf} \Big\} \tag{2.15}$$

to derive an inhomogeneous estimator. The resulting estimator, however, is just the same as the one presented in Equations (2.7)–(2.8). This is a typical phenomenon in credibility theory; see, for example, Bühlmann and Gisler (2005), section 3.1.4.

We now proceed to the estimation of the structure parameters $\sigma_0^2$ and $\tau_0^2$, which needs the estimation of $S_0(x)$. At this moment, however, the credibility-weighted average in (2.13), written $\bar S_Z(x)$ here, is not a suitable estimate of $S_0(x)$ because the $Z_i$'s involve $\sigma_0^2$ and $\tau_0^2$ (cf. [2.13]). On the other hand, if we use an estimator that minimizes $E\int (\bar S_w(x) - S_0(x))^2\,dx$, where $\bar S_w(x) = \sum_{i=1}^{K} w_i S_i(x)$, with respect to $w_1, w_2, \ldots, w_K$ subject to $\sum_{i=1}^{K} w_i = 1$, the solution is also $\bar S_Z(x)$. Therefore, it is reasonable to select the unbiased estimate

$$\bar S(x) = \frac{1}{N}\sum_{i=1}^{K} n_i\,S_i(x), \quad\text{where } N = \sum_{i=1}^{K} n_i \text{ depends on the portfolio size } K. \tag{2.16}$$

Recall that under the general strong law of large numbers, if the summands are mutually independent, then $c_n^{-1}\sum_{j=1}^{n}(X_j - E[X_j]) \to 0$ a.s. for any sequence $\{c_n\}$ of real numbers, provided that $\sum_{k=1}^{\infty} c_k^{-2}\,\mathrm{Var}(X_k) < \infty$; see, e.g., theorem 3.1 of DasGupta (2008, p. 35). It is then easy to see that $\bar S(x)$ is strongly consistent: Taking $c_K = N$, since $\mathrm{Var}[n_K S_K(x)] = \mathrm{Var}[\sum_{j=1}^{n_K} I(X_{Kj} > x)] \le 2n_K^2$, it follows that $\bar S(x) \to S_0(x)$ a.s. as $K \to \infty$ under the condition

$$\sum_{K=1}^{\infty} \frac{n_K^2}{N^2} < \infty. \tag{2.17}$$

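In view of the definitions of $\sigma_0^2(x)$ and $\tau_0^2(x)$, the parameters $\sigma_0^2$ and $\tau_0^2$ can be estimated using

$$SSE(x) = \sum_{i=1}^{K}\sum_{j=1}^{n_i} \big[I(X_{ij} > x) - S_i(x)\big]^2 \quad\text{and}\quad SSA(x) = \sum_{i=1}^{K} n_i\big[S_i(x) - \bar S(x)\big]^2, \tag{2.18}$$

with $\bar S(x)$ the pooled estimate in (2.16), an idea borrowed from the analysis of variance (ANOVA). The estimators can be formally defined as

$$\hat\sigma_0^2 = \frac{1}{N - K}\int SSE(x)\,dx \tag{2.19}$$

and

$$\hat\tau_0^2 = \frac{N}{N^2 - \sum_{i=1}^{K} n_i^2}\Big[\int SSA(x)\,dx - \frac{K - 1}{N - K}\int SSE(x)\,dx\Big]. \tag{2.20}$$

Order the claims of individual $i$ increasingly as $X_{i(1)}, X_{i(2)}, \ldots, X_{i(n_i)}$ and all $N = n_1 + \cdots + n_K$ claims jointly as $R_1, R_2, \ldots, R_N$. Then some algebraic computations give rise to

$$\int SSE(x)\,dx = \sum_{i=1}^{K}\sum_{j=1}^{n_i}\Big(\frac{2j - 1}{n_i} - 1\Big) X_{i(j)} \tag{2.21}$$

and

$$\int SSA(x)\,dx = \sum_{i=1}^{K}\frac{1}{n_i}\sum_{j=1}^{n_i}(2n_i - 2j + 1)\,X_{i(j)} - \frac{1}{N}\sum_{j=1}^{N}(2N - 2j + 1)\,R_j. \tag{2.22}$$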
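The closed forms (2.21) and (2.22) can be checked against a direct integration of the step functions $SSE(x)$ and $SSA(x)$ in (2.18). The sketch below is an illustration under stated assumptions: the function names and the data set are hypothetical, the data contain no ties, and the direct integration is exact because both integrands are constant between consecutive pooled order statistics and vanish outside their range.

```python
def ddf(sample, x):
    """Empirical ddf of one risk: share of its claims exceeding x."""
    return sum(1 for v in sample if v > x) / len(sample)


def integrals_closed_form(data):
    """Integrals of SSE(x) and SSA(x) via (2.21) and (2.22)."""
    total = sum(len(row) for row in data)
    pooled = sorted(v for row in data for v in row)
    int_sse = sum(((2 * j - 1) / len(row) - 1.0) * v
                  for row in data for j, v in enumerate(sorted(row), 1))
    within = sum((2 * len(row) - 2 * j + 1) * v / len(row)
                 for row in data for j, v in enumerate(sorted(row), 1))
    pooled_term = sum((2 * total - 2 * j + 1) * v
                      for j, v in enumerate(pooled, 1)) / total
    return int_sse, within - pooled_term


def integrals_direct(data):
    """Integrate the step functions in (2.18) interval by interval."""
    total = sum(len(row) for row in data)
    pooled = sorted(v for row in data for v in row)
    int_sse = int_ssa = 0.0
    for lo, hi in zip(pooled, pooled[1:]):
        mid, width = (lo + hi) / 2.0, hi - lo
        s = [ddf(row, mid) for row in data]
        s_bar = sum(len(row) * si for row, si in zip(data, s)) / total
        # For one risk, sum_j (I_j - S_i)^2 equals n_i * S_i * (1 - S_i).
        int_sse += width * sum(len(row) * si * (1.0 - si) for row, si in zip(data, s))
        int_ssa += width * sum(len(row) * (si - s_bar) ** 2 for row, si in zip(data, s))
    return int_sse, int_ssa


# Hypothetical portfolio of three risks with unequal numbers of claims.
data = [[1.0, 3.0, 7.5], [2.0, 2.5], [0.5, 4.0, 6.0, 9.0]]
closed = integrals_closed_form(data)
direct = integrals_direct(data)
print(closed, direct)
```

The closed forms need only sorted claims, so they avoid evaluating the empirical ddfs on a grid.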

The theorem below provides the properties of the estimators (2.19) and (2.20); its proof is given in Section A.1.2 of the appendices.

Theorem 2.4. $\hat\sigma_0^2$ and $\hat\tau_0^2$ have the following properties.

1. $\hat\sigma_0^2$ and $\hat\tau_0^2$ are unbiased estimators of $\sigma_0^2$ and $\tau_0^2$, respectively.

2. Under condition (2.17), $\hat\sigma_0^2 \to \sigma_0^2$ and $\hat\tau_0^2 \to \tau_0^2$ almost surely as $K \to \infty$.

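Finally, by inserting the estimates $\hat\sigma_0^2$ and $\hat\tau_0^2$ of the structure parameters into (2.12) and (2.13), we get empirical Bayes estimators of the ddfs $S(x,\theta_i)$ as

$$\hat S^{EB}(x,\theta_i) = \hat Z_i\,S_i(x) + (1 - \hat Z_i)\,\hat S_0(x), \quad i = 1, 2, \ldots, K, \tag{2.23}$$

where

$$\hat Z_i = \frac{n_i\hat\tau_0^2}{\hat\sigma_0^2 + n_i\hat\tau_0^2}, \quad i = 1, 2, \ldots, K, \quad\text{and}\quad \hat S_0(x) = \frac{1}{\sum_{r=1}^{K}\hat Z_r}\sum_{r=1}^{K}\hat Z_r\,S_r(x). \tag{2.24}$$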
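The two-step procedure can be sketched end to end: estimate the structure parameters by (2.19) and (2.20) using the closed forms (2.21) and (2.22), then form the estimator (2.23) with the weights (2.24). The function names and data are hypothetical. Two choices below are assumptions of this sketch rather than part of the text: a negative $\hat\tau_0^2$ is truncated at zero, and when all estimated credibility factors vanish the pooled ddf (2.16) is used in place of (2.24). The sketch also requires $N > K$.

```python
def ddf(sample, x):
    """Empirical ddf of one risk: share of its claims exceeding x."""
    return sum(1 for v in sample if v > x) / len(sample)


def structure_estimates(data):
    """Return (sigma2_hat, tau2_hat) from (2.19)-(2.22); requires N > K."""
    k = len(data)
    sizes = [len(row) for row in data]
    total = sum(sizes)
    pooled = sorted(v for row in data for v in row)
    int_sse = sum(((2 * j - 1) / len(row) - 1.0) * v
                  for row in data for j, v in enumerate(sorted(row), 1))
    int_ssa = (sum((2 * len(row) - 2 * j + 1) * v / len(row)
                   for row in data for j, v in enumerate(sorted(row), 1))
               - sum((2 * total - 2 * j + 1) * v
                     for j, v in enumerate(pooled, 1)) / total)
    sigma2 = int_sse / (total - k)
    tau2 = (total / (total ** 2 - sum(n * n for n in sizes))
            * (int_ssa - (k - 1) / (total - k) * int_sse))
    return sigma2, max(tau2, 0.0)   # truncation at zero is an assumption


def eb_ddf(x, i, data):
    """Empirical Bayes estimate (2.23) of the ddf of risk i at the point x."""
    sigma2, tau2 = structure_estimates(data)
    z = [len(row) * tau2 / (sigma2 + len(row) * tau2) if tau2 > 0 else 0.0
         for row in data]
    if sum(z) > 0:
        s0 = sum(zr * ddf(row, x) for zr, row in zip(z, data)) / sum(z)
    else:
        # Fallback (assumption): pooled ddf (2.16) when every factor is zero.
        total = sum(len(row) for row in data)
        s0 = sum(len(row) * ddf(row, x) for row in data) / total
    return z[i] * ddf(data[i], x) + (1.0 - z[i]) * s0


# Hypothetical portfolio: two risks with clearly different claim levels.
data = [[1.0, 2.0, 3.0], [10.0, 11.0, 12.0]]
sigma2_hat, tau2_hat = structure_estimates(data)
estimate = eb_ddf(5.0, 0, data)
print(sigma2_hat, tau2_hat, estimate)
```

In this example the between-risk variation dominates, so the estimated credibility factors are close to 1 and each risk's estimate stays near its own empirical ddf.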

We now establish the asymptotic optimality of the empirical Bayes estimator $\hat S^{EB}(x,\theta_i)$ (a similar treatment for the credibility estimate of Bühlmann [1967] can be found in Mashayekhi [2002], whereas an earlier discussion is Norberg [1980]). For simplicity, we discuss the balanced case in which all the $n_i$ take the same value, say $n_i \equiv n$, so that the $Z_i$ are all the same over $i = 1, 2, \ldots, K$ (and so are the $\hat Z_i$) and, consequently,

$$\hat S_0(x) = \frac{1}{K}\sum_{r=1}^{K} S_r(x).$$

We also require the condition

$$\int_{-\infty}^{+\infty}\Big[\frac{\sigma_0^2(x)}{n} + \tau_0^2(x)\Big]^{\delta/(2+\delta)} dx < \infty \quad\text{for some constant } \delta > 0. \tag{2.25}$$

The result is stated in the theorem below, and its proof is found in Section A.1.3 of the appendices.

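Theorem 2.5. Under conditions (2.17) and (2.25), the empirical Bayes estimators $\hat S^{EB}(x,\theta_i)$ defined by (2.23) and (2.24) are asymptotically optimal in the sense that

$$\lim_{K\to\infty}\max_{1\le i\le K} E\Big[\int_{-\infty}^{+\infty}\big(\hat S^{EB}(x,\theta_i) - S(x,\theta_i)\big)^2 dx - \int_{-\infty}^{+\infty}\big(\hat S(x,\theta_i) - S(x,\theta_i)\big)^2 dx\Big] = 0, \tag{2.26}$$

where $\hat S(x,\theta_i)$ is the linear estimator defined by (2.7) and (2.8).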

As mentioned earlier, in this article we are concerned with premium principles $H(X)$ that are expressed as a functional of the ddf of $X$; for example, for a nonnegative loss $X$ the net premium principle is expressed as $H(X) = E[X] = \int_0^\infty S(x)\,dx$, where $S(x)$ indicates the ddf of $X$. Having estimated the distributions, the application to experience ratemaking is apparent: replace the ddf $S(x)$ by its estimate $\hat S(x)$; for example, $\hat H(X) = \int_0^\infty \hat S(x)\,dx$ for the net premium principle. Details for every premium principle can be found below.
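For a nonnegative risk, integrating the credibility ddf $Z S_n(x) + (1-Z)S_0(x)$ reproduces the weighted average $Z\bar X_n + (1-Z)\mu_0$, because each ddf integrates to its mean. The sketch below checks this with a midpoint rule. The exponential collective ddf, the value of $Z$, the truncation point, and the function names are all hypothetical choices for illustration.

```python
import math


def credibility_ddf(x, claims, z, collective_ddf):
    """Z * S_n(x) + (1 - Z) * S_0(x) for a given credibility factor z."""
    s_n = sum(1 for c in claims if c > x) / len(claims)
    return z * s_n + (1.0 - z) * collective_ddf(x)


def net_premium(ddf, upper, step=0.001):
    """Midpoint-rule approximation of the integral of ddf over [0, upper]."""
    cells = int(round(upper / step))
    return step * sum(ddf((k + 0.5) * step) for k in range(cells))


claims, z, mu0 = [1.0, 2.0, 3.0], 0.6, 4.0


def collective_ddf(x):
    # Assumed collective ddf: exponential with mean mu0.
    return math.exp(-x / mu0)


premium = net_premium(lambda x: credibility_ddf(x, claims, z, collective_ddf),
                      upper=80.0)
closed_form = z * sum(claims) / len(claims) + (1.0 - z) * mu0
print(premium, closed_form)
```

The same `net_premium` integrator can be reused with any other ddf estimate, which is the point of working with distributions rather than with means alone.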

For a premium principle $H$, write $H(X \mid \theta)$ for the premium of a risk $X$ that is calculated by applying $H$ to $X$ with ddf $S(x,\theta)$, referred to as the risk premium of $X$. By replacing $S(x,\theta)$ with the credibility estimator $\hat S(x,\theta)$ (cf. [2.7]), or with the empirical Bayes estimator of $S(x,\theta_i)$ (cf. [2.23]) under Model II for risk $i$, an estimator $\hat H(X \mid \theta)$ of the risk premium $H(X \mid \theta)$ is obtained. We discuss the properties of the experience premium obtained by inserting $\hat S(x,\theta)$ for $S(x,\theta)$ in Section 3.1 and compare it with existing methods in Section 3.2. The performance of the empirical Bayes estimator of $S(x,\theta_i)$ and the corresponding version of experience ratemaking are discussed in Section 3.3 by means of numerical studies.

Because the target of pricing is to pinpoint the risk premium $H(X \mid \theta)$, it is a basic requirement that the estimator $\hat H(X \mid \theta)$ should pinpoint the risk premium if the experience provides perfect statistical information. In statistical language, this is achieved by the strong consistency of the ratemaking, in the sense that the estimate $\hat H(X \mid \theta)$ approaches the true value of $H(X \mid \theta)$ with probability 1 as the available information increases without limit. We first present the strong consistency of $\hat H(X \mid \theta)$ in the theorem below, which follows directly from Theorems 2.1 and 2.2.

Theorem 3.1. The estimator $\hat H(X \mid \theta)$ is strongly consistent for $H(X \mid \theta)$ if the premium principle $H$ is continuous with respect to the $L^\infty$-norm of ddfs, where the $L^\infty$-norm of a function $f(x)$ on $\mathbb{R}$ is denoted and defined by $\|f(x)\|_\infty = \sup_{x\in\mathbb{R}} |f(x)|$. In other words, if the estimator $\hat S(x,\theta)$ satisfies $\sup_{x\in\mathbb{R}} |\hat S(x,\theta) - S(x,\theta)| \to 0$ a.s., then $\hat H(X \mid \theta) \to H(X \mid \theta)$ a.s.

According to Theorem 3.1, the strong consistency of $\hat H(X \mid \theta)$ is guaranteed under certain regularity conditions. This is a significant improvement over such literature as Gerber (1980), where the credibility estimator is not generally (strongly) consistent, and Pan et al. (2008) and Wen et al. (2009), where the consistency of the credibility estimators needs to be proved separately in every case. Quite a few premium principles $H$ can be represented as a continuous function of expectations of certain functions of $X$, for example, Kamps' premium $H(X \mid \theta) = E[X(1 - e^{-\lambda X}) \mid \theta]/E[(1 - e^{-\lambda X}) \mid \theta]$. When the functions are bounded and continuous on the support of $X$, it is well known that $H$ is continuous with respect to the weak convergence of the distribution of $X$ (Portmanteau theorem; cf. DasGupta 2008, theorem 1.4), which is a stronger requirement than the continuity with respect to the $L^\infty$-norm in our Theorem 3.1. If the limiting ddf is continuous in $x$, however, these two conditions are equivalent; see theorem 1.3 of DasGupta (2008).

There are many well-known and extensively discussed premium principles (cf. Young 2004) for which strong consistency of the experience ratemaking can be easily checked, although not as a result of Theorem 3.1. They include the net premium $H(X \mid \theta) = E[X \mid \theta]$, variance premium $H(X \mid \theta) = E[X \mid \theta] + \lambda\,\mathrm{Var}(X \mid \theta)$, modified variance premium $H(X \mid \theta) = E[X \mid \theta] + \lambda\,\mathrm{Var}(X \mid \theta)/E[X \mid \theta]$, standard deviation premium $H(X \mid \theta) = E[X \mid \theta] + \lambda\sqrt{\mathrm{Var}(X \mid \theta)}$, Esscher premium $H(X \mid \theta) = E[Xe^{hX} \mid \theta]/E[e^{hX} \mid \theta]$, Kamps' premium $H(X \mid \theta) = E[X(1 - e^{-\lambda X}) \mid \theta]/E[(1 - e^{-\lambda X}) \mid \theta]$, conditional tail expectation premium $H(X \mid \theta) = E[X \mid X > \xi, \theta]$, and exponential premium $H(X \mid \theta) = \lambda^{-1}\log[E(e^{\lambda X} \mid \theta)]$. The following are two exceptions in which the consistency is not straightforward.

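1. Dutch premium principle: $H(X\mid\Theta) = E[X\mid\Theta] + \beta E[(X - \alpha E[X\mid\Theta])_+ \mid\Theta]$, where $0 < \beta \le 1$ and $\alpha \ge 1$. Observe that $E_{S_n}[(X - \alpha E[X\mid\Theta])_+] = n^{-1}\sum_{i=1}^n (X_i - \alpha\mu(\Theta))_+$, where $E_{S_n}[g(X)]$ denotes the expectation of $g(X)$ with respect to the distribution generated by $S_n(x)$. The estimate of $H(X\mid\Theta)$ is then given by

$$\hat H(X\mid\Theta) = \hat\mu(\Theta) + \beta\left[\frac{Z}{n}\sum_{i=1}^n \big(X_i - \alpha\hat\mu(\Theta)\big)_+ + (1-Z)\int \big(x - \alpha\hat\mu(\Theta)\big)_+\,dF_0(x)\right], \tag{3.1}$$

where $F_0(x) = 1 - S_0(x)$ is the marginal cumulative distribution function of $X$, $\hat\mu(\Theta) = Z\bar X_n + (1-Z)\mu_0$, and $\bar X_n = n^{-1}\sum_{i=1}^n X_i$ (which will be used thereafter). Its consistency is proved in Section A.2.1 of the appendices.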

2. Distortion premium principle:

$$H(X\mid\Theta) = \int_0^\infty g\big(S(x,\Theta)\big)\,dx - \int_{-\infty}^0 g\big(1 - S(x,\Theta)\big)\,dx. \tag{3.2}$$

The corresponding estimate is

$$\hat H(X\mid\Theta) = \int_0^\infty g\big(ZS_n(x) + (1-Z)S_0(x)\big)\,dx - \int_{-\infty}^0 g\big(ZF_n(x) + (1-Z)F_0(x)\big)\,dx,$$

where the credibility factor is given by (2.8). For example, if $X$ has bounded support, then $\hat H(X\mid\Theta) \xrightarrow{a.s.} H(X\mid\Theta)$ as $n\to\infty$ if $g$ is a Lipschitz function on $[0,1]$ such that $|g(x) - g(y)| \le C|x - y|$ for some constant $C$, which follows easily from the Glivenko-Cantelli theorem. For the case of unbounded support, the consistency still holds if $g$ is a Lipschitz function on $[0,1]$; see Section A.2.2 of the appendices for more details.
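The distortion-premium estimate above can be sketched numerically for non-negative claims, where only the first integral in (3.2) is present. This is an illustrative sketch, not the authors' implementation: the function name, the midpoint rule, the exponential choice of $S_0$, the value of $Z$ and the claim data are all assumptions made for the example.

```python
import math

def distortion_premium_estimate(claims, z, s0, g, upper, steps=20000):
    """Midpoint-rule value of the integral over [0, upper] of
    g(z * S_n(x) + (1 - z) * S_0(x)), for non-negative claims."""
    n = len(claims)
    dx = upper / steps
    total = 0.0
    for k in range(steps):
        x = (k + 0.5) * dx
        s_n = sum(1 for c in claims if c > x) / n   # empirical ddf S_n(x)
        total += g(z * s_n + (1.0 - z) * s0(x)) * dx
    return total

# Hypothetical inputs, chosen only for illustration.
claims = [0.4, 1.3, 2.2, 0.9, 3.1]
mu0 = 2.0                                  # assumed collective mean
z = 0.6                                    # assumed credibility factor
s0 = lambda x: math.exp(-x / mu0)          # assumed marginal ddf (exponential)

# With g(u) = u the estimate reduces to Z * mean(claims) + (1 - Z) * mu0.
net = distortion_premium_estimate(claims, z, s0, lambda u: u, upper=60.0)
# A Lipschitz distortion (dual-power with exponent 2) loads the premium.
loaded = distortion_premium_estimate(
    claims, z, s0, lambda u: 1.0 - (1.0 - u) ** 2, upper=60.0)
print(net, loaded)
```

With the identity distortion the result should agree with the net-premium form $Z\bar X_n + (1-Z)\mu_0$ up to discretization error, which gives a quick sanity check on the integration grid.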

In this section, we will evaluate the performance of this newly proposed approach by comparing it with some well-known

representative methods in the literature. More specifically, the experience rating developed in the last section is compared with the

methods of Bühlmann (1967) for a net premium, Bühlmann (1970) for a variance premium, and the credibility premiums of Gerber (1980) and Pan et al. (2008) under the Esscher premium principle. This section will use the notation $\mathbf{X}_n = (X_1, X_2, \ldots, X_n)$.

Under the net premium, $H(X\mid\Theta) = E[X\mid\Theta] = \mu(\Theta)$. Replacing $S(x,\Theta)$ with the estimate $\hat S(x,\Theta)$ presented in (2.7), the estimate of $H(X\mid\Theta)$ is

$$\hat\mu(\Theta) = \hat H(X\mid\Theta) = Z\bar X_n + (1-Z)\mu_0, \tag{3.3}$$

which has the same form as Bühlmann's credibility premium $\hat\mu^c(\Theta) = Z^c\bar X_n + (1-Z^c)\mu_0$ but uses a different credibility factor, where the superscript $c$ denotes "classical," $Z^c = n\tau^2/(n\tau^2+\sigma^2)$ is the credibility factor, $\tau^2 = \mathrm{Var}(\mu(\Theta))$, and $\sigma^2 = E[\sigma^2(\Theta)]$. Thus $\hat\mu^c(\Theta)$ differs from $\hat\mu(\Theta)$ only in the credibility factors $Z^c$ and $Z = n\tau_0^2/(n\tau_0^2+\sigma_0^2)$. The following theorem provides the expected squared errors of the two estimators, of which the proof is straightforward.

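Theorem 3.2. The expected squared errors of $\hat\mu(\Theta)$ and $\hat\mu^c(\Theta)$ are, respectively,

$$E[\hat\mu(\Theta) - \mu(\Theta)]^2 = Z^2\frac{\sigma^2}{n} + (1-Z)^2\tau^2 \quad\text{and}\quad E[\hat\mu^c(\Theta) - \mu(\Theta)]^2 = (Z^c)^2\frac{\sigma^2}{n} + (1-Z^c)^2\tau^2. \tag{3.4}$$

As a result,

$$\lim_{n\to\infty}\frac{E[\hat\mu(\Theta) - \mu(\Theta)]^2}{E[\hat\mu^c(\Theta) - \mu(\Theta)]^2} = 1. \tag{3.5}$$

To see how (3.5) follows from (3.4), just plug $Z = n\tau_0^2/(n\tau_0^2+\sigma_0^2)$ and $Z^c = n\tau^2/(n\tau^2+\sigma^2)$ in (3.4) to obtain

$$E[\hat\mu(\Theta) - \mu(\Theta)]^2 = \frac{n\tau_0^4\sigma^2 + \sigma_0^4\tau^2}{(n\tau_0^2+\sigma_0^2)^2} \quad\text{and}\quad E[\hat\mu^c(\Theta) - \mu(\Theta)]^2 = \frac{\tau^2\sigma^2}{n\tau^2+\sigma^2}.$$

Hence

$$\lim_{n\to\infty}\frac{E[\hat\mu(\Theta) - \mu(\Theta)]^2}{E[\hat\mu^c(\Theta) - \mu(\Theta)]^2} = \lim_{n\to\infty}\frac{(\tau_0^4\sigma^2 + \sigma_0^4\tau^2/n)(\tau^2 + \sigma^2/n)}{(\tau_0^2 + \sigma_0^2/n)^2\,\tau^2\sigma^2} = \frac{\tau_0^4\sigma^2\tau^2}{(\tau_0^2)^2\,\tau^2\sigma^2} = 1.$$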

The estimator $\hat\mu^c(\Theta)$ is optimal and hence theoretically better than $\hat\mu(\Theta)$. The limit in (3.5), however, indicates that the two are actually asymptotically equivalent. On the other hand, the worst estimate of the conditional mean should be the collective mean $\mu_0 = E[\mu(\Theta)]$, which has an expected squared loss $\tau^2$, as it does not take into account any information from the historical data. The following example presents a clear figure on the performance of $\hat\mu(\Theta)$ for small sample sizes, which shows that $\hat\mu(\Theta)$ lies between $\mu_0$ and $\hat\mu^c(\Theta)$.
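The passage from (3.4) to (3.5) can be checked numerically. The sketch below evaluates both expected squared errors for increasing $n$; the structural parameter values are hypothetical and chosen only for illustration, and the function names are not from the source.

```python
def mse_credibility(z, n, sigma2, tau2):
    """Expected squared error of z * Xbar_n + (1 - z) * mu0, as in (3.4)."""
    return z * z * sigma2 / n + (1.0 - z) ** 2 * tau2

def credibility_factor(n, between, within):
    """n*between / (n*between + within): the common form of Z and Z^c."""
    return n * between / (n * between + within)

# Hypothetical structural parameters (assumptions for this example).
sigma2, tau2 = 4.0, 1.0        # sigma^2 = E[Var(X|Theta)], tau^2 = Var(mu(Theta))
sigma0_2, tau0_2 = 0.9, 0.3    # ddf-based counterparts sigma_0^2 and tau_0^2

def mse_ratio(n):
    """Ratio of the two expected squared errors in (3.5) at sample size n."""
    z = credibility_factor(n, tau0_2, sigma0_2)
    zc = credibility_factor(n, tau2, sigma2)
    return (mse_credibility(z, n, sigma2, tau2)
            / mse_credibility(zc, n, sigma2, tau2))

for n in (5, 50, 500, 50000):
    print(n, round(mse_ratio(n), 6))
```

The ratio stays at or above 1, since $Z^c$ minimizes the quadratic error among such weights, and drifts toward 1 as $n$ grows.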

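Example 3.1. Assume that $X_1, X_2, \ldots, X_n$ are i.i.d. as $S(x,\Theta) = e^{-\Theta x}$, $x > 0$, and $\Theta \sim \text{Gamma}(\alpha,\beta)$ with density $\pi(\theta) = \beta^\alpha\theta^{\alpha-1}e^{-\beta\theta}/\Gamma(\alpha)$, $\theta > 0$, $\alpha > 2$, $\beta > 0$, where $\alpha$ and $\beta$ are known quantities. By some algebraic computations, the expected squared errors of $\hat\mu(\Theta)$, $\hat\mu^c(\Theta)$, and $\mu_0$ can be shown to satisfy the equalities

$$\frac{E[(\hat\mu(\Theta) - \mu(\Theta))^2]}{E[(\hat\mu^c(\Theta) - \mu(\Theta))^2]} = 1 + \frac{\alpha^2 n}{(\alpha-1)(n+2\alpha-1)^2} \quad\text{and}\quad \frac{E[(\hat\mu^c(\Theta) - \mu(\Theta))^2]}{E[(\mu_0 - \mu(\Theta))^2]} = \frac{\alpha-1}{n+\alpha-1}. \tag{3.6}$$

The two equalities clearly show that the estimate $\hat\mu(\Theta)$ is slightly worse than $\hat\mu^c(\Theta)$ and that both have MSEs that are only of order $n^{-1}$ relative to the MSE of the collective premium $\mu_0$.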
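The two ratios in (3.6) can be reproduced from the general error formula (3.4). In the sketch below, the closed forms for $\sigma^2$, $\tau^2$, $\sigma_0^2$ and $\tau_0^2$ under the exponential-Gamma model were derived for this example (taking $\sigma_0^2 = \int E[S(1-S)]\,dx$ and $\tau_0^2 = \int \mathrm{Var}(S)\,dx$) and are not quoted from the source; the function names are hypothetical.

```python
def example_3_1_ratios(n, alpha, beta):
    """MSE ratios computed from (3.4) with structural parameters of the
    exponential-Gamma model (closed forms derived for this sketch)."""
    sigma2 = beta**2 / ((alpha - 1) * (alpha - 2))        # E[Var(X|Theta)]
    tau2 = beta**2 / ((alpha - 1) ** 2 * (alpha - 2))     # Var(E[X|Theta])
    sigma0_2 = beta / (2 * (alpha - 1))                   # integral of E[S(1-S)]
    tau0_2 = beta / (2 * (alpha - 1) * (2 * alpha - 1))   # integral of Var(S)
    z = n * tau0_2 / (n * tau0_2 + sigma0_2)
    zc = n * tau2 / (n * tau2 + sigma2)

    def mse(w):
        return w * w * sigma2 / n + (1 - w) ** 2 * tau2

    return mse(z) / mse(zc), mse(zc) / tau2

def closed_form_3_6(n, alpha):
    """Right-hand sides of the two equalities in (3.6)."""
    r1 = 1 + alpha**2 * n / ((alpha - 1) * (n + 2 * alpha - 1) ** 2)
    r2 = (alpha - 1) / (n + alpha - 1)
    return r1, r2

print(example_3_1_ratios(10, 3.0, 1.0))
print(closed_form_3_6(10, 3.0))
```

Under these formulas the credibility factors simplify to $Z = n/(n+2\alpha-1)$ and $Z^c = n/(n+\alpha-1)$, so $\beta$ cancels from both ratios.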

Under the variance premium principle, the risk premium is $H(X\mid\Theta) = E[X\mid\Theta] + \lambda\,\mathrm{Var}(X\mid\Theta)$ for some positive constant $\lambda$. Thus

$$\hat H(X\mid\Theta) = Z\bar X_n + (1-Z)\mu_0 + \lambda\left[ZD_n^2 + (1-Z)(\sigma^2+\tau^2) + Z(1-Z)(\bar X_n - \mu_0)^2\right], \tag{3.7}$$

where $D_n^2 = n^{-1}\sum_{i=1}^n (X_i - \bar X_n)^2$. Recall Bühlmann's credibility premium (Bühlmann 1970, chapter 4):

$$H_{Bul}(\mathbf{X}_n) = b\bar X_n + (1-b)\mu_0 + \lambda\left[cs_n^2 + (1-c)\sigma^2 + (1-b)\tau^2\right], \tag{3.8}$$

where

$$s_n^2 = \frac{1}{n-1}\sum_{i=1}^n (X_i - \bar X_n)^2, \quad b = \frac{n\tau^2}{n\tau^2+\sigma^2}, \quad c = \frac{\mathrm{Var}(\sigma^2(\Theta))}{\mathrm{Var}(\sigma^2(\Theta)) + E[\mathrm{Var}(s_n^2\mid\Theta)]}.$$

Because the quantity $\mathrm{Var}(s_n^2\mid\Theta)$ in the credibility factor $c$ involves the fourth moment of $X_i$ given $\Theta$, Bühlmann suggested using $2\sigma^4(\Theta)/(n-1)$ as an approximation under certain assumptions, so that $c$ was in fact approximated by $\mathrm{Var}(\sigma^2(\Theta))/\{\mathrm{Var}(\sigma^2(\Theta)) + E[2\sigma^4(\Theta)/(n-1)]\}$. Therefore, in contrast to Bühlmann's version, a direct advantage of (3.7) is that we need only the first two moments of the risk distribution.
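Formula (3.7) is the mean plus a loaded variance of the mixture that puts weight $Z$ on the empirical distribution and $1-Z$ on the marginal distribution. A minimal sketch follows; the function name and all numerical inputs are hypothetical, and the credibility factor is passed in rather than computed.

```python
def variance_premium_current(claims, z, mu0, sigma2, tau2, loading):
    """Variance premium of the credibility mixture, following (3.7).
    sigma2 + tau2 is the marginal variance of X; z is the credibility factor."""
    n = len(claims)
    xbar = sum(claims) / n
    d2 = sum((x - xbar) ** 2 for x in claims) / n      # D_n^2, divisor n
    mean_mix = z * xbar + (1 - z) * mu0
    var_mix = (z * d2 + (1 - z) * (sigma2 + tau2)
               + z * (1 - z) * (xbar - mu0) ** 2)
    return mean_mix + loading * var_mix

# Hypothetical inputs, chosen only for illustration.
claims = [1.0, 2.0, 3.0, 6.0]
print(variance_premium_current(claims, z=0.7, mu0=2.0,
                               sigma2=1.5, tau2=0.5, loading=0.2))
```

Setting `z=1` returns the purely empirical premium $\bar X_n + \lambda D_n^2$, while `z=0` returns the collective premium $\mu_0 + \lambda(\sigma^2+\tau^2)$.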

The following example numerically illustrates how close $\hat H(X\mid\Theta)$ in (3.7) is to Bühlmann's credibility estimator $H_{Bul}$.

Example 3.2. Poisson-exponential model: Let $X$ be Poisson distributed with $\Pr(X = k\mid\Theta) = \Theta^k e^{-\Theta}/k!$ given $\Theta$, $k = 0, 1, \ldots$, and let $\Theta$ have an exponential prior density $\pi(\theta) = \beta e^{-\beta\theta}$ ($\theta > 0$), such that $\mu(\Theta) = \Theta$, $\sigma^2(\Theta) = \Theta$, $\mu_0 = 1/\beta$, $\tau^2 = 1/\beta^2$ and $\sigma^2 = 1/\beta$. The risk and collective premiums are $H(X\mid\Theta) = (1+\lambda)\Theta$ and $H_{col}(X) = 1/\beta + \lambda(\beta+1)/\beta^2$, respectively. The posterior distribution of $\Theta$ given $\mathbf{X}_n$ is $\text{Gamma}(n\bar X_n + 1, n+\beta)$, with (conditional) mean $(n\bar X_n+1)/(n+\beta)$ and variance $(n\bar X_n+1)/(n+\beta)^2$. The estimators of $H(X\mid\Theta)$ include the Bayes premium

$$H_B(\mathbf{X}_n) = E[X_{n+1}\mid\mathbf{X}_n] + \lambda\,\mathrm{Var}(X_{n+1}\mid\mathbf{X}_n) = \frac{(n+\beta)(\lambda+1)+\lambda}{(n+\beta)^2}\,(n\bar X_n + 1),$$

the premium $H_{cu}(\mathbf{X}_n)$ in (3.7), Bühlmann's credibility premium $H_{Bul}$ in (3.8), and the collective premium $H_{col}(X)$, where and henceforth, the subscript "cu" indicates the current experience premium. Consider the mean squared error of these estimators as $V_{\&} = E[(H_{\&}(\mathbf{X}_n) - H(X\mid\Theta))^2]$ for & = B, Bul, cu, col. While $V_{col} = (\lambda+1)^2/\beta^2 + \lambda^2/\beta^4$ and $V_B = \beta^{-2}[M^2 n\beta + (Mn - 1 - \lambda)^2 + (M\beta + Mn - 1 - \lambda)^2]$ are both exact, where $M = [(n+\beta)(\lambda+1)+\lambda]/(n+\beta)^2$, the values of $V_{Bul}$ and $V_{cu}$ can be approximated only by the Monte Carlo method. We approximated the values of $V_{Bul}$ and $V_{cu}$ for fixed $\lambda = 0.2$ and a variety of $\beta$ values with sample size $n = 30$ and $n = 100$. Accordingly, we also computed their relative efficiencies $\text{Eff}_{\&} = (V_{col} - V_{\&})/(V_{col} - V_B)$ ($\text{Eff}_{col} = 0$ and $\text{Eff}_B = 1$, and a larger value of $\text{Eff}_{\&}$ stands for higher efficiency of the method &). The results are reported in Table 1, which shows that (1) the estimate $H_{cu}(\mathbf{X}_n)$ is better than the collective premium $H_{col}$, (2) $V_{cu}$ is slightly larger than $V_{Bul}$ but the differences are negligible as the sample size increases, and (3) the estimates $H_{cu}(\mathbf{X}_n)$ and $H_{Bul}(\mathbf{X}_n)$ are both very close to the Bayes premium $H_B(\mathbf{X}_n)$.
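The two exact quantities in Example 3.2 are easy to evaluate directly. The sketch below codes the expressions for $V_{col}$ and $V_B$ as read from the example, with the prior rate written as `beta` and the loading as `lam`; those symbol names are a reading of the extracted text, not confirmed notation.

```python
def v_col(beta, lam):
    """Exact MSE of the collective premium in the Poisson-exponential model."""
    return (lam + 1) ** 2 / beta**2 + lam**2 / beta**4

def v_bayes(beta, lam, n):
    """Exact MSE of the Bayes premium H_B = M * (n * Xbar_n + 1)."""
    m = ((n + beta) * (lam + 1) + lam) / (n + beta) ** 2
    return (m * m * n * beta
            + (m * n - 1 - lam) ** 2
            + (m * beta + m * n - 1 - lam) ** 2) / beta**2

lam = 0.2
for beta in (0.2, 0.5, 1.0):
    print(beta, round(v_col(beta, lam), 4),
          round(v_bayes(beta, lam, 30), 4), round(v_bayes(beta, lam, 100), 4))
```

For $\lambda = 0.2$ these expressions give $V_{col} = 61$ at $\beta = 0.2$ and $V_{col} = 6.4$ at $\beta = 0.5$, in line with the $V_{col}$ column of Table 1.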

The Esscher premium principle (Bühlmann 1980), expressed as $H(X) = E[Xe^{hX}]/E[e^{hX}]$, is the optimal solution to minimizing the expected exponentially weighted loss $\min_{P\in\mathbb{R}} E[e^{hX}(X-P)^2]$. Gerber (1980) was the first to propose a version of its credibility premium (referred to as "Gerber's premium" below). Recent work by Pan et al. (2008) found that Gerber's premium does not converge to the risk premium in general and suggested a new credibility premium that does so (written as "Pan's premium" below). Generally, it is very difficult to compute Gerber's and Pan's premiums; see Pan et al. (2008) or Wen et al. (2009) for detailed accounts. The various versions of Esscher premiums can be regarded as the solutions to the unified minimization problem

$$\min_{P\in\mathcal{P}} E\left[(X_{n+1} - P)^2 e^{hX_{n+1}}\right] \tag{3.9}$$

TABLE 1

Numerical Results of V& and Eff & for n = 30 and n = 100

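Column 1 is the prior parameter $\beta$; columns 2 to 7 refer to $n = 30$ and columns 8 to 13 to $n = 100$.

$\beta$ $V_B$ $V_{cu}$ $\text{Eff}_{cu}$ $V_{Bul}$ $\text{Eff}_{Bul}$ $V_{col}$ $V_B$ $V_{cu}$ $\text{Eff}_{cu}$ $V_{Bul}$ $\text{Eff}_{Bul}$ $V_{col}$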

0.2 0.2405 0.4050 0.9822 0.3937 0.9832 61.000 0.0720 0.0958 0.9996 0.1180 0.9992 61.000

0.3 0.1530 0.1582 0.9886 0.1538 0.9893 20.938 0.0479 0.0842 0.9982 0.0779 0.9985 20.938

0.4 0.1189 0.1365 0.9907 0.1546 0.9863 10.562 0.0359 0.0496 0.9986 0.0503 0.9986 10.562

0.5 0.0947 0.18 0.9885 0.0840 0.9939 6.4000 0.0270 0.0306 0.9994 0.0285 0.9997 6.4000

0.6 0.0786 0.0803 0.9886 0.0792 0.9919 4.3086 0.0238 0.0288 0.9988 0.0270 0.9992 4.3086

0.7 0.0671 0.0769 0.9906 0.0720 0.9929 3.1053 0.0180 0.0207 0.9991 0.0199 0.9993 3.1053

0.8 0.0585 0.0847 0.9829 0.0876 0.9812 2.3476 0.0160 0.0174 0.9993 0.0170 0.9995 2.3476

0.9 0.0518 0.0399 0.9916 0.0383 0.9927 1.8387 0.0158 0.0162 0.9998 0.0160 0.9999 1.8387

1.0 0.0465 0.0656 0.9795 0.0664 0.9789 1.4800 0.0142 0.0171 0.9980 0.0171 0.9980 1.4800

Note: Vcol is independent of the sample size and thus is shared by n = 30 and n = 100.

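with the class $\mathcal{P}$ of admissible premiums chosen as follows:

- $\mathcal{P}$ = the collection of all measurable functions $g(\Theta)$ of the parameter $\Theta$ for the risk premium $H[X_{n+1}\mid\Theta] = E[X_{n+1}e^{hX_{n+1}}\mid\Theta]/E[e^{hX_{n+1}}\mid\Theta]$
- $\mathcal{P}$ = the collection of all measurable functions $P(X_1, X_2, \ldots, X_n)$ of the samples for the Bayes premium $H_B(\mathbf{X}_n) = E[X_{n+1}e^{hX_{n+1}}\mid\mathbf{X}_n]/E[e^{hX_{n+1}}\mid\mathbf{X}_n]$
- $\mathcal{P}_G = \{q_0 + \sum_{i=1}^n q_iX_i : q_0, q_1, \ldots, q_n \in \mathbb{R}\}$ for Gerber's premium $H_G(\mathbf{X}_n) = Z^G\bar X_n + (1 - Z^G E^*[\mu(\Theta)]/H_{col}(X))\,H_{col}(X)$ and
- $\mathcal{P}_P = \{p + q\sum_{i=1}^n X_ie^{hX_i}/\sum_{i=1}^n e^{hX_i} : p, q \in \mathbb{R}\}$ for Pan's premium $H_P(\mathbf{X}_n) = Z^P H_n + (1 - Z^P E^*[h_n(\Theta)]/H(X))\,H(X)$, where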

ZP = , ZG = ,

Var [hn ( )] + E Var Hn Xn | Var [( )] + n1 E [Var(X| )]

$H_n = \sum_{i=1}^n X_ie^{hX_i}/\sum_{i=1}^n e^{hX_i}$ and $h_n(\Theta) = E(H_n\mid\Theta)$, with $E^*$, $\mathrm{Var}^*$, and $\mathrm{Cov}^*$ denoting, respectively, the expectation, variance, and covariance with respect to a fictitious distribution of $\Theta$, defined in terms of density by $\pi^*(\theta) = \pi(\theta)m_h(\theta)/m_h$ with $m_h(\theta) = E(e^{hX}\mid\Theta = \theta)$ and $m_h = E[m_h(\Theta)] = E(e^{hX})$. See, for example, Pan et al. (2008) or Wen et al. (2009). In addition, $\tilde Z = Z\bar h_n/(Z\bar h_n + (1-Z)h_0)$, where $\bar h_n = n^{-1}\sum_{i=1}^n e^{hX_i}$, $h_0 = E[e^{hX}]$ and $Z$ is given by (2.8) in Theorem 2.1. Note that the individual and collective premiums are independent of $n$.
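The adjusted credibility factor built from the empirical and marginal means of $e^{hX}$ arises when the Esscher principle is applied to the mixture that puts weight $Z$ on the empirical distribution and $1-Z$ on the marginal distribution. The sketch below makes that algebra explicit. It is an illustration under assumptions: the function name, the claim data and the value of $Z$ are hypothetical, and the Poisson-Gamma moment formulas in the usage part were derived for this example (they require $\beta > e^h - 1$).

```python
import math

def esscher_current_premium(claims, h, z, m0, m1):
    """Esscher premium of the mixture z * F_n + (1 - z) * F_0.
    m0 = E[exp(hX)] and m1 = E[X exp(hX)] under the marginal F_0."""
    n = len(claims)
    h_bar = sum(math.exp(h * x) for x in claims) / n       # mean of exp(h X_i)
    num_emp = sum(x * math.exp(h * x) for x in claims) / n
    z_tilde = z * h_bar / (z * h_bar + (1 - z) * m0)       # adjusted factor
    h_n = num_emp / h_bar                                  # empirical Esscher premium
    premium = z_tilde * h_n + (1 - z_tilde) * (m1 / m0)
    return premium, z_tilde

# Usage under a Poisson-Gamma model with hypothetical parameter values.
alpha, beta, h = 3.0, 6.0, 0.6
t = math.exp(h) - 1.0
m0 = (beta / (beta - t)) ** alpha
m1 = math.exp(h) * alpha * beta**alpha / (beta - t) ** (alpha + 1)
claims = [0, 1, 0, 2, 1]
premium, z_tilde = esscher_current_premium(claims, h, z=0.5, m0=m0, m1=m1)
print(premium, z_tilde)
```

The returned premium equals the direct ratio $[Z\,n^{-1}\sum X_ie^{hX_i} + (1-Z)m_1]/[Z\bar h_n + (1-Z)m_0]$, which is why the weight on the empirical Esscher premium is $\tilde Z$ rather than $Z$.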

Since the Esscher premium principle is obtained by minimizing the exponentially weighted quadratic error in (3.9), we use the weighted quadratic loss $E[L(H_{\&}(\mathbf{X}_n))] = E[(X_{n+1} - H_{\&}(\mathbf{X}_n))^2 e^{hX_{n+1}}]$, for & = col, B, P, G, and cu, to measure the closeness of the experience premiums $H_{\&}(\mathbf{X}_n)$. This is similar to what we have done in the case of Bühlmann's credibility formula with quadratic errors in Section 3.2.1. Note that it can be represented in terms of the risk premium $H(X\mid\Theta)$ as

$$E[L(H_{\&}(\mathbf{X}_n))] = E\left[(X_{n+1} - H(X\mid\Theta))^2 e^{hX_{n+1}}\right] + E\left[(H(X\mid\Theta) - H_{\&}(\mathbf{X}_n))^2 e^{hX_{n+1}}\right], \tag{3.10}$$

where the first term of the right-hand side is independent of &. Hence, comparing $E[L(H_{\&}(\mathbf{X}_n))]$ can be reduced to comparing

$$V_{\&} = E\left[(H(X\mid\Theta) - H_{\&}(\mathbf{X}_n))^2 e^{hX_{n+1}}\right] = E\left[(H(X\mid\Theta) - H_{\&}(\mathbf{X}_n))^2 m_h(\Theta)\right]. \tag{3.11}$$

Obviously $E[L(H_B(\mathbf{X}_n))] \le E[L(H_{\&}(\mathbf{X}_n))]$ for all & and $E[L(H_{\&}(\mathbf{X}_n))] \le E[L(H_{col}(\mathbf{X}_n))]$ for & = B, P, and G. Thus, $H_B(\mathbf{X}_n)$ is the best of $H_{\&}(\mathbf{X}_n)$ over all values of &. While we do not generally know whether $H_{cu}$ is better than $H_P$, $H_G$, and $H_{col}$, it is highly interesting that $H_{cu}(\mathbf{X}_n)$ is optimal under the Bernoulli-Uniform model: It coincides with the Bayes premium $H_B(\mathbf{X}_n)$. This is stated in Example 3.3 below. Example 3.4 compares $V_{\&}$ for & = G, P and cu under the Poisson-Gamma model.

Example 3.3 (Bernoulli-Uniform Model). Let $X$ be a Bernoulli variable given $\Theta$ with $\Pr(X = 1\mid\Theta) = 1 - \Pr(X = 0\mid\Theta) = \Theta$ and $\Theta$ uniformly distributed over the interval $(0, 1)$. Then $H_{cu}(\mathbf{X}_n) = H_B(\mathbf{X}_n)$; see Section A.3.2 of the appendices for a proof.

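Example 3.4 (Poisson-Gamma Model). Let $X_i \overset{\text{i.i.d.}}{\sim} \text{Poisson}(\Theta)$ and $\Theta \sim \text{Gamma}(\alpha,\beta)$ with density $\pi(\theta) = \beta^\alpha\theta^{\alpha-1}e^{-\beta\theta}/\Gamma(\alpha)$, $\theta > 0$, $\alpha > 2$, $\beta > 0$. It follows that $E(X\mid\Theta) = \mathrm{Var}(X\mid\Theta) = \Theta$ and, given $\mathbf{X}_n$, the posterior distribution of $\Theta$ is $\text{Gamma}(\alpha + n\bar X_n, \beta + n)$. The corresponding premiums are listed in Table 2, where

- The term "approx" indicates that Pan's premium $H_P(\mathbf{X}_n)$ can be computed only by a Monte Carlo approximation (an algorithm is presented in Algorithm A.1)
- For $H_{cu}(\mathbf{X}_n)$, the credibility factor is $\tilde Z = Z\bar h_n/(Z\bar h_n + (1-Z)h_0)$, with $Z = n\tau_0^2/(n\tau_0^2+\sigma_0^2)$,

$$\tau_0^2 = \sum_{i=1}^\infty\sum_{j=1}^\infty \frac{\min(i,j)}{i!\,j!}\left[\frac{\beta^\alpha\,\Gamma(\alpha+i+j)}{\Gamma(\alpha)(\beta+2)^{\alpha+i+j}} - \frac{\beta^{2\alpha}\,\Gamma(\alpha+i)\Gamma(\alpha+j)}{\Gamma(\alpha)^2(\beta+1)^{2\alpha+i+j}}\right] \tag{3.12}$$

and

$$\sigma_0^2 = \frac{\alpha}{\beta} - \sum_{i=1}^\infty\sum_{j=1}^\infty \frac{\min(i,j)\,\beta^\alpha\,\Gamma(\alpha+i+j)}{i!\,j!\,\Gamma(\alpha)(\beta+2)^{\alpha+i+j}}; \tag{3.13}$$

see Section A.3.4 of the appendices for proofs of both (3.12) and (3.13) and
- It is also interesting that

$$H_G(\mathbf{X}_n) = H_B(\mathbf{X}_n). \tag{3.14}$$

TABLE 2

Experience Premiums under Poisson-Gamma Model

Premium | Denotation | Expression
Individual | $H(X\mid\Theta)$ | $\Theta e^h$
Collective | $H_{col}(X)$ | $\alpha e^h/(\beta - e^h + 1)$
Bayes | $H_B(\mathbf{X}_n)$ | $(\alpha + n\bar X_n)e^h/(\beta + n - e^h + 1)$
Pan | $H_P(\mathbf{X}_n)$ | approx
Gerber | $H_G(\mathbf{X}_n)$ | $(\alpha + n\bar X_n)e^h/(\beta + n - e^h + 1)$
Current | $H_{cu}(\mathbf{X}_n)$ | $\tilde Z H_n + (1-\tilde Z)H_{col}(X)$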

We then compute $V_{\&}$. While $V_{col} = E[(H(X) - H(X\mid\Theta))^2 e^{hX_{n+1}}] = \alpha\beta^\alpha e^{2h}/(\beta - e^h + 1)^{\alpha+2}$ is easy to obtain, the quantities $V_{\&}$ for & = B, P can only be computed numerically. In the following computation, $h = 0.6$ and $\beta = 6$ are taken. For different values of $\alpha$, the sample sizes $n = 10$ and $n = 100$ are considered. For the pairs of $(\alpha, n)$, because $H_G(\mathbf{X}_n) = H_B(\mathbf{X}_n)$, only the values of $V_{\&}$ for & = col, B, and P are necessary. The outcomes of a numerical experiment are listed in Table 3, where the efficiency, which is defined as $\text{Eff}_{\&} = (V_{col} - V_{\&})/(V_{col} - V_B)$, is a measure of how well the experience premium $H_{\&}(\mathbf{X}_n)$ performs, by comparing with the best $H_B(\mathbf{X}_n)$. Table 3 shows that $V_B < V_P < V_{cu} < V_{col}$ for $n = 10$ and $V_B < V_P \approx V_{cu} < V_{col}$ for $n = 100$. It is also apparent that although $H_{cu}$ is not the best, it is sufficiently good for practical use.

We conclude this section with two remarks: (1) there are cases where $H_{cu}$ is optimal, and (2) even if $H_{cu}$ is not optimal, it is very close to the optimum, which is strongly supported by the numerical results in Table 3, where the lowest efficiency of $H_{cu}$ is 0.9217 at $\alpha = 7.5$ and $n = 10$ (a very small sample size).
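The efficiency measure is a simple ratio, so individual entries of Table 3 can be recomputed from the tabulated $V$ values. The sketch below does this for three rows at $n = 10$; the numbers are copied from Table 3, while the function and variable names are hypothetical.

```python
def efficiency(v_col, v_method, v_bayes):
    """Eff = (V_col - V_method) / (V_col - V_B): 0 for the collective
    premium, 1 for the Bayes premium."""
    return (v_col - v_method) / (v_col - v_bayes)

# (first column of Table 3, V_B, V_cu, V_col) for n = 10.
rows_n10 = [
    (2.0, 0.1128, 0.1740, 1.2776),
    (5.0, 0.4718, 0.7885, 4.9698),
    (7.5, 1.0700, 1.8302, 10.7752),
]
for param, v_b, v_cu, v_c in rows_n10:
    print(param, round(efficiency(v_c, v_cu, v_b), 4))
```

The recomputed values match the $\text{Eff}_{cu}$ column to four decimals, which is a convenient consistency check when transcribing the table.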

TABLE 3

Numerical Results of V& and Eff & for n = 10 and n = 100

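Column 1 is the Gamma parameter $\alpha$; columns 2 to 7 refer to $n = 10$ and columns 8 to 13 to $n = 100$.

$\alpha$ $V_B$ $V_{cu}$ $\text{Eff}_{cu}$ $V_P$ $\text{Eff}_P$ $V_{col}$ $V_B$ $V_{cu}$ $\text{Eff}_{cu}$ $V_P$ $\text{Eff}_P$ $V_{col}$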

2.0 0.1128 0.1740 0.9475 0.1521 0.9663 1.2776 0.0151 0.0368 0.9828 0.0369 0.9827 1.2776

2.5 0.1320 0.1847 0.9668 0.1697 0.9762 1.7191 0.0217 0.0398 0.9893 0.0391 0.9897 1.7191

3.0 0.2087 0.2574 0.9758 0.2423 0.9833 2.2207 0.0277 0.0517 0.9891 0.0515 0.9892 2.2207

3.5 0.2245 0.3002 0.9705 0.2829 0.9772 2.7889 0.0292 0.0651 0.9870 0.0654 0.9869 2.7889

4.0 0.3390 0.5018 0.9474 0.4518 0.9635 3.4311 0.0390 0.1177 0.9768 0.1121 0.9784 3.4311

4.5 0.3898 0.5987 0.9445 0.5144 0.9669 4.1551 0.0484 0.1423 0.9771 0.1453 0.9764 4.1551

5.0 0.4718 0.7885 0.9296 0.6411 0.9624 4.9698 0.0733 0.2139 0.9713 0.2166 0.9707 4.9698

5.5 0.4463 0.7589 0.9425 0.6646 0.9598 5.8848 0.0647 0.1809 0.9800 0.1678 0.9823 5.8848

6.0 0.5736 0.8912 0.9499 0.7999 0.9643 6.9106 0.1004 0.2412 0.9793 0.2403 0.9795 6.9106

6.5 0.6296 1.1399 0.9313 0.9642 0.9550 8.0590 0.1181 0.3274 0.9736 0.3052 0.9764 8.0590

7.0 0.8354 1.3253 0.9424 1.1425 0.9639 9.3425 0.1248 0.3946 0.9707 0.3773 0.9726 9.3425

7.5 1.0700 1.8302 0.9217 1.5353 0.9520 10.7752 0.1745 0.6393 0.9562 0.6018 0.9597 10.7752

8.0 1.3422 2.0515 0.9357 1.7321 0.9646 12.3724 0.1542 0.5338 0.9689 0.5107 0.9708 12.3724

TABLE 4

Averages of 100 × ISEs of the estimates of the ddf

Policy No. i 1 2 3 4 5 6 7 8 9 10

ISE of

S(x, i ) 1.832 1.752 1.801 1.797 1.870 1.859 1.838 1.739 1.861 1.796

ISE of

S (x, i ) 1.914 1.799 1.868 1.854 1.937 1.936 1.903 1.793 1.933 1.870

ISE of

S(x, i ) 1.973 1.858 1.979 1.909 2.006 2.011 1.958 1.860 1.991 1.915

This section reports the numerical results of a small simulation study conducted under Model II so as to show the closeness of the empirical Bayes estimate of the ddf to the true $S(x,\Theta_i)$, with comparisons to the inhomogeneous and homogeneous estimates, as well as the performance of empirical premiums obtained by plugging in the empirical Bayes estimate for $S(x,\Theta)$ under, respectively, the net, variance, modified variance, and standard deviation premium principles. This simulation was performed under the following settings:

- The size of the simulation is set to $K = 10$ and $n_i = 20$, $i = 1, 2, \ldots, 10$.
- The experiential claims $X_{ij}$ were sampled from the exponential density $f(x,\theta_i) = \theta_i\exp(-\theta_i x)$, $x > 0$, in which the risk parameter $\Theta_i$ followed the Gamma distribution with shape parameter $\alpha = 4$ and scale parameter $\beta = 3$.
- To fix the premium principles, their risk-loading coefficients (i.e., [risk premium − mean loss] divided by the standard deviation) were set to a common value 0.3.

The corresponding averages of the integrated squared errors (ISE) of the empirical Bayes, inhomogeneous, and homogeneous estimates of the ddf, that is, the averages of $\int (\hat S_i(x) - S(x,\Theta_i))^2\,dx$ with $\hat S_i$ standing for each of the three estimates in turn, were computed and, after being multiplied by 100 to bring the values to a moderate scale, are listed in Table 4. In this table, in terms of the average of the ISEs, the empirical Bayes estimation is slightly worse than the inhomogeneous estimation, and the latter is slightly worse than the homogeneous estimation. This loss of accuracy is clearly caused by the additional estimation of the unknown structure parameters $\sigma_0^2$, $\tau_0^2$, and $S_0(x)$.

This simulation also computed the averages of the squared errors (SE) of the empirical premiums obtained by plugging in the empirical Bayes estimate of $S(x,\Theta_i)$. The squares of differences between the empirical and theoretical premiums, under the net, variance, modified variance, and standard deviation principles, after being multiplied by 10, are listed in Table 5.

To measure the efficiency of the empirical premium computed from the empirical Bayes estimate of $S(x,\Theta)$, we computed the quantities

$$\text{Eff} = \frac{\text{ASE(col)} - \text{ASE(EB)}}{\text{ASE(col)} - \text{ASE(Bayes)}}, \tag{3.15}$$

where ASE($H$) is the average of the squared errors obtained by applying premium principle $H$: "col" means the collective premium $H(X)$, "EB" the empirical premium computed from the empirical Bayes estimate of $S(x,\Theta)$, and "Bayes" the Bayesian premium computed as follows. Under the probability distribution setting in the simulation, the predictive distribution of a future loss, such as $X_{i,n_i+1}$, given $\{X_{ij}, j = 1, 2, \ldots, n_i\}$ is

TABLE 5

Averages of 10 × SEs for Empirical Premiums

Policy No. i 1 2 3 4 5 6 7 8 9 10

SE of Net 0.731 0.658 0.778 0.679 0.848 0.699 0.794 0.683 0.764 0.753

SE of Var. 5.310 5.246 40.51 5.794 13.30 4.771 7.171 8.373 10.28 11.93

SE of ModVar 1.579 1.522 1.842 1.460 1.771 1.562 1.695 1.548 1.704 1.691

SE of StDev 1.303 1.189 1.439 1.188 1.484 1.249 1.403 1.232 1.374 1.362

TABLE 6

Efficiency of Empirical Premiums with Respect to Collective Premium

Policy No. i 1 2 3 4 5 6 7 8 9 10

Eff of Net 0.988 0.974 1.000 0.991 0.971 0.982 0.978 0.973 0.956 0.957

Eff of Var. 0.868 0.614 0.793 0.917 0.834 0.851 0.906 0.949 0.889 0.801

Eff of ModVar 0.934 0.930 0.958 0.956 0.937 0.931 0.943 0.931 0.922 0.924

Eff of StD 0.975 0.966 0.991 0.985 0.965 0.970 0.973 0.963 0.948 0.953

the Pareto distribution with shape parameter $n_i + \alpha$ and scale parameter $\sum_{j=1}^{n_i} X_{ij} + \beta$, so that

$$E[X_{i,n_i+1}\mid X_{i1},\ldots,X_{in_i}] = \frac{\sum_{j=1}^{n_i} X_{ij} + \beta}{n_i + \alpha - 1} \quad\text{and}\quad \mathrm{Var}(X_{i,n_i+1}\mid X_{i1},\ldots,X_{in_i}) = \frac{(n_i+\alpha)\big(\sum_{j=1}^{n_i} X_{ij} + \beta\big)^2}{(n_i+\alpha-1)^2(n_i+\alpha-2)}.$$

The Bayes premium for $X_{i,n_i+1}$ was then computed by substituting the predictive distribution into a risk premium for the risk distribution $S(x,\Theta)$ under the net, variance, modified variance, and standard deviation premium principles. The resulting efficiencies from the simulation under the four principles above are listed in Table 6, which shows that the empirical Bayes premiums under all four premium principles are of high efficiencies, though they vary over premium principles.
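The simulation design and the Bayes benchmark can be sketched as follows. This is an illustration under assumptions rather than the authors' code: the Gamma parameter $\beta = 3$ is treated here as a rate, which is the reading under which the stated Pareto predictive distribution holds, and the predictive variance uses the standard Pareto type II (Lomax) formula. All function names are hypothetical.

```python
import random

ALPHA, BETA = 4.0, 3.0   # Gamma prior parameters; BETA read as a rate (assumption)

def simulate_portfolio(k=10, n_i=20, seed=0):
    """Draw k risk parameters and n_i exponential claims per policy."""
    rng = random.Random(seed)
    portfolio = []
    for _ in range(k):
        theta = rng.gammavariate(ALPHA, 1.0 / BETA)   # second argument is a scale
        claims = [rng.expovariate(theta) for _ in range(n_i)]
        portfolio.append((theta, claims))
    return portfolio

def predictive_mean_var(claims, alpha=ALPHA, beta=BETA):
    """Mean and variance of the Pareto (Lomax) predictive distribution."""
    a = len(claims) + alpha        # shape
    s = sum(claims) + beta         # scale
    mean = s / (a - 1)
    var = a * s * s / ((a - 1) ** 2 * (a - 2))
    return mean, var

for theta, claims in simulate_portfolio()[:3]:
    mean, var = predictive_mean_var(claims)
    # Bayes net premium and Bayes standard deviation premium (loading 0.3)
    print(round(1 / theta, 3), round(mean, 3), round(mean + 0.3 * var ** 0.5, 3))
```

The first printed column is the true conditional mean $1/\Theta_i$, so the output gives a quick visual check of how close the Bayes net premium is for each simulated policy.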

4. CONCLUDING REMARKS

We have developed a completely nonparametric estimation for loss distributions and established a unified distribution-free

approach to experience rating for arbitrary premium principles. The method combines the advantages of Bühlmann's credibility
theory and Ferguson's nonparametric Bayes premiums and thus provides a powerful tool to generate appropriate experience rating

given the growing body of premium principles developed in general insurance. It is demonstrated under a number of principles

that, although this new approach does not guarantee theoretical optimality, it does produce solutions that are close to the optima.

In examples we have examined (Section 3.2.3) for the Esscher premium principle, the efficiencies with respect to the optimal

premium range between 92.17% and 97.58% even with a small sample size of n = 10 (cf. Table 3).

This new approach can be broadly applied in almost all premium pricing problems in general insurance, including health care,

income protection, property, financial products, and business. More broadly, our distribution-free approach to estimate distribution

functions can be applied to many other areas, such as reserve evaluation (including incurred but not reported and reported but

not settled claims) to predict outstanding claim losses, Bonus-Malus insurance systems (cf. Ferreira 1974; Lemaire 1995) that

give premium discount to low risks in the past year, the optimal claim decision problem of policyholders (see, e.g., Haehling von

Lanzenauer 1974; Braun et al. 2006), health care cost analysis (Bertsimas et al. 2008; Enthoven and Fuchs 2006; Stephens et al.

2005), and simulation of health insurance markets (Feldman and Dowd 1982). This approach is also useful in economics, finance,

and other areas where previous experiences influence present and future risks.

The data structure we have used is, however, limited to the Bühlmann type (conditionally i.i.d.). The extension of the approach
to the Bühlmann-Straub model (Bühlmann and Straub 1970) is not difficult. There are, however, further interesting topics for future
research, including problems where the data possess certain types of hierarchical settings, losses or risks of regression structures

dependent on covariates, and correlation structures such as panel data. It will be desirable and of practical importance to investigate

if results parallel to what we have found here could be derived for problems with different data settings. On the other hand, our

approach has been established by means of optimal estimation of the risk distributions under the L2 -distance measure, where

optimization could be performed based on derivative equations. It will be interesting to investigate if distribution-free approaches

of comparable performance could be developed under other distance measures. Another interesting topic is to theoretically identify

the conditions under which the experience ratings deduced by inserting the estimated distribution would agree with existing ones
such as Bühlmann's credibilities for the net premium principle and variance premiums, and Gerber's and Pan's versions for
Esscher premiums.

FUNDING

The authors acknowledge the support of GRF Grants No. 410211 and 410213 from the Research Grants Council of Hong

Kong, for X. Q. Cai, NSFC Grant No. 71361015, Jiangxi Provincial Natural Science Foundation Grant No. 20142BAB201013,

No. 2013M540534 from the China Postdoctoral Science Foundation, No. 2014T70615 from the China Postdoctoral Fund Special

Project for L. M. Wen, and Shanghai Philosophy and Social Science Foundation Grant No. 2010BJB004, the 111 Project under

Grant No. B14019, and NSFC Grant No. 71371074 for X. Y. Wu.

REFERENCES

Antoniak, C. E. 1974. Mixtures of Dirichlet Processes with Applications to Bayesian Nonparametric Problems. Annals of Statistics 2(6): 1152–1174.
Bertsimas, D., M. V. Bjarnadottir, M. A. Kane, J. C. Kryder, R. Pandey, S. Vempala, and G. Wang. 2008. Algorithmic Prediction of Health-Care Costs. Operations Research 56: 1382–1392.
Braun, M., P. S. Fader, E. T. Bradlow, and H. Kunreuther. 2006. Modeling the "Pseudodeductible" in Insurance Claims Decisions. Management Science 52(8): 1258–1272.
Bühlmann, H. 1967. Experience Rating and Credibility. ASTIN Bulletin 4: 199–207.
Bühlmann, H. 1970. Mathematical Methods in Risk Theory. Berlin: Springer-Verlag.
Bühlmann, H. 1980. An Economic Premium Principle. ASTIN Bulletin 11: 52–60.
Bühlmann, H., and A. Gisler. 2005. A Course in Credibility Theory and Its Applications. Amsterdam: Springer.
Bühlmann, H., and E. Straub. 1970. Glaubwürdigkeit für Schadensätze. Bulletin of the Swiss Association of Actuaries 70(1): 111–133.
DasGupta, A. 2008. Asymptotic Theory of Statistics and Probability. New York: Springer Science+Business Media.
Dhaene, J., S. Vanduffel, M. J. Goovaerts, R. Kaas, Q. Tang, and D. Vyncke. 2006. Risk Measures and Comonotonicity: A Review. Stochastic Models 22: 573–606.
Enthoven, A. C., and V. R. Fuchs. 2006. Employment-Based Health Insurance: Past, Present, and Future. Health Affairs 25: 1538–1547.
Feldman, R. D., and B. E. Dowd. 1982. Simulation of a Health Insurance Market with Adverse Selection. Operations Research 30: 1027–1042.
Ferreira, J. 1974. The Long-Term Effects of Merit-Rating Plans on Individual Motorists. Operations Research 22: 954–978.
Ferguson, T. 1973. A Bayesian Analysis of Some Non-parametric Problems. Annals of Statistics 1(2): 209–230.
Furman, E., and R. Zitikis. 2008. Weighted Premium Principles. Insurance: Mathematics and Economics 42(1): 459–465.
Gerber, H. U. 1980. Credibility for Esscher Premium. Mitteilungen der Vereinigung schweiz. Versicherungsmathematiker 3: 307–312.
Ghosh, J. K., and R. V. Ramamoorthi. 2003. Bayesian Nonparametrics. Springer Series in Statistics. New York: Springer-Verlag.
Gómez, E., A. Hernández, and F. J. Vázquez-Polo. 2000. Robust Bayesian Premium Principles in Actuarial Science. Journal of the Royal Statistical Society, Series D 49(2): 241–252.
Gómez, E., A. Hernández, and F. J. Vázquez-Polo. 2006. On the Use of Posterior Regret Γ-Minimax Actions to Obtain Credibility Premiums. Insurance: Mathematics and Economics 39(1): 115–121.
Goovaerts, M. J., R. Kaas, A. E. Van Heerwaarden, and T. Bauwelinckx. 1990. Effective Actuarial Methods. Amsterdam: North-Holland.
Hachemeister, C. A. 1975. Credibility for Regression Models with Application to Trend. In Credibility: Theory and Applications, Proceedings of the Berkeley Actuarial Research Conference on Credibility, New York: Academic, pp. 129–163.
Haehling von Lanzenauer, C. 1974. Optimal Claim Decisions by Policyholders in Automobile Insurance with Merit-Rating Structures. Operations Research 22: 979–990.
Heilmann, W. R. 1989. Decision Theoretic Foundations of Credibility Theory. Insurance: Mathematics and Economics 8: 77–95.
Jewell, W. S. 1974. The Credible Distribution. ASTIN Bulletin 7(3): 237–269.
Kaas, R., M. Goovaerts, J. Dhaene, and M. Denuit. 2001. Modern Actuarial Risk Theory. New York: Kluwer Academic.
Klugman, S. A. 1992. Bayesian Statistics in Actuarial Science: With Emphasis on Credibility. Boston: Kluwer.
Lau, J. W., T. K. Siu, and H. Yang. 2006. On Bayesian Mixture Credibility. ASTIN Bulletin 36(2): 573–588.
Lemaire, J. 1995. Bonus-Malus Systems in Automobile Insurance. New York: Kluwer Academic.
Makov, U. E., A. F. M. Smith, and Y. H. Liu. 1996. Bayesian Methods in Actuarial Science. Journal of the Royal Statistical Society, Series D 45(4): 503–515.
Mashayekhi, M. 2002. On Asymptotic Optimality in Empirical Bayes Credibility. Insurance: Mathematics and Economics 31: 285–295.
Natarajan, K., D. Pachamanova, and M. Sim. 2009. Constructing Risk Measures from Uncertainty Sets. Operations Research 57(5): 1129–1141.
Norberg, R. 1980. Empirical Bayes Credibility. Scandinavian Actuarial Journal 1980: 172–194.
Norberg, R. 2004. Credibility Theory. In Encyclopedia of Actuarial Science, edited by J. Teugels and B. Sundt. Chichester, UK: Wiley.
Pai, J. S. 1997. Bayesian Analysis of Compound Loss Distributions. Journal of Econometrics 79(1): 129–146.
Pan, M., R. Wang, and X. Wu. 2008. On the Consistency of Credibility Premiums Regarding Esscher Principle. Insurance: Mathematics and Economics 42: 119–126.
Pitselis, G. 2004. A Seemingly Unrelated Regression Model in a Credibility Framework. Insurance: Mathematics and Economics 34: 37–54.
Robbins, H. 1955. An Empirical Bayes Approach to Statistics. In Proceedings of the Third Berkeley Symposium on Mathematical Statistics and Probability 1: 157–164.
Robbins, H. 1964. The Empirical Bayes Approach to Statistical Decision Problems. Annals of Mathematical Statistics 35: 1–20.
Schmidt, K. D. 1991. Convergence of Bayes and Credibility Premiums. ASTIN Bulletin 20(2): 167–172.
Schmidt, K. D. 1998. Bayesian Models in Actuarial Mathematics. Mathematical Methods of Operations Research 48: 117–146.
Stephens, C. R., H. Waelbroeck, and S. Talley. 2005. Predicting Healthcare Costs Using GAs. In Proceedings of the 2005 Workshops on Genetic and Evolutionary Computation, June 25–26, Washington, D.C., GECCO '05. ACM, New York, pp. 159–163. http://doi.acm.org/10.1145/1102256.1102291.
Sundt, B. 1999. An Introduction to Non-life Insurance Mathematics. 4th edition. Karlsruhe: Verlag Versicherungswirtschaft.
Szegö, G. 2002. Measures of Risk. Journal of Banking and Finance 26: 1253–1272.
Wen, L., X. Wu, and X. Zhao. 2009. The Credibility Estimators under Generalized Weighted Loss Functions. Journal of Industrial and Management Optimization 5(4): 893–910.
Wu, X., and X. Zhou. 2006. A New Characterization of Distortion Premiums Via Countable Additivity for Comonotonic Risks. Insurance: Mathematics and Economics 38: 324–334.
Young, V. R. 2004. Premium Principles. In Encyclopedia of Actuarial Science, edited by J. Teugels and B. Sundt, pp. 1322–1331. New York: Wiley.
Zehnwirth, B. 1977. The Mean Credibility Formula is a Bayes Rule. Scandinavian Actuarial Journal 212–216.
Zehnwirth, B. 1979. Credibility and the Dirichlet Process. Scandinavian Actuarial Journal 13–23.
Zehnwirth, B. 1981. A Note on the Asymptotic Optimality of the Empirical Bayes Distribution Function. Annals of Statistics 9: 221–224.

Discussions on this article can be submitted until July 1, 2016. The authors reserve the right to reply to any discussion. Please see

the Instructions for Authors found online at http://www.tandfonline.com/uaaj for submission instructions.

APPENDICES

A.1. Proofs of Theorems in Section 2.3

A.1.1. Proof of Theorem 2.3

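Proof. Note that the mean squared error of the homogeneous estimator $\tilde S(x,\Theta_i)$ can be decomposed, through the inhomogeneous estimator $\hat S(x,\Theta_i)$, as

$$\begin{aligned} E\int \big(\tilde S(x,\Theta_i) - S(x,\Theta_i)\big)^2 dx &= E\int \big(\hat S(x,\Theta_i) - S(x,\Theta_i) + \tilde S(x,\Theta_i) - \hat S(x,\Theta_i)\big)^2 dx \\ &= E\int \big(\hat S(x,\Theta_i) - S(x,\Theta_i)\big)^2 dx + 2E\int \big(\tilde S(x,\Theta_i) - \hat S(x,\Theta_i)\big)\big(\hat S(x,\Theta_i) - S(x,\Theta_i)\big) dx \\ &\quad + E\int \big(\tilde S(x,\Theta_i) - \hat S(x,\Theta_i)\big)^2 dx. \end{aligned} \tag{A.1}$$

First,

$$E\int \big(\hat S(x,\Theta_i) - S(x,\Theta_i)\big)^2 dx = (1-Z_i)\tau_0^2. \tag{A.2}$$

Second, it follows from the equalities $\tilde S(x,\Theta_i) - \hat S(x,\Theta_i) = (1-Z_i)\sum_{r=1}^K Z_r(S_r(x) - S_0(x))/\sum_{r=1}^K Z_r$ and $\mathrm{Cov}\big(\hat S(x,\Theta_i) - S(x,\Theta_i), S_r(x)\big) = 0$, $r = 1, 2, \ldots, K$, that

$$E\big[\big(\tilde S(x,\Theta_i) - \hat S(x,\Theta_i)\big)\big(\hat S(x,\Theta_i) - S(x,\Theta_i)\big)\big] = 0. \tag{A.3}$$

Third, as $\int \mathrm{Var}(S_r(x))\,dx = \sigma_0^2/n_r + \tau_0^2 = \tau_0^2/Z_r$,

$$E\int \big(\tilde S(x,\Theta_i) - \hat S(x,\Theta_i)\big)^2 dx = \int \mathrm{Var}\big(\tilde S(x,\Theta_i) - \hat S(x,\Theta_i)\big) dx = \frac{(1-Z_i)^2}{\big(\sum_{r=1}^K Z_r\big)^2}\sum_{r=1}^K Z_r^2\int \mathrm{Var}(S_r(x))\,dx = \frac{\tau_0^2(1-Z_i)^2}{\sum_{r=1}^K Z_r}. \tag{A.4}$$

Inserting (A.2), (A.3), and (A.4) into (A.1) leads to the desired equality:

$$E\int \big(\tilde S(x,\Theta_i) - S(x,\Theta_i)\big)^2 dx = \frac{\tau_0^2(1-Z_i)^2}{\sum_{r=1}^K Z_r} + (1-Z_i)\tau_0^2 = \frac{\sum_{r\ne i} Z_r + 1}{\sum_{r\ne i} Z_r + Z_i}\cdot\frac{\sigma_0^2\tau_0^2}{\sigma_0^2 + n_i\tau_0^2}.$$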

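Proof. Define $Y_{ij} = I(X_{ij} > x)$ and write $Y_i = (Y_{i1}, Y_{i2}, \ldots, Y_{i,n_i})'$, $i = 1, 2, \ldots, K$. Then $E[Y_i] = S_0(x)\mathbf{1}$ and $\mathrm{Var}(Y_i) = \sigma_0^2(x)I + \tau_0^2(x)\mathbf{1}\mathbf{1}'$, where $I$ is the identity matrix and $\mathbf{1}$ is the column vector of 1s, both with proper dimensions. Since $\sum_{j=1}^{n_i}\big(I(X_{ij} > x) - S_i(x)\big)^2 = Y_i'(I - \mathbf{1}\mathbf{1}'/n_i)Y_i$, it is easy to check

$$E[SSE(x)] = E\sum_{i=1}^K Y_i'\Big(I - \frac{\mathbf{1}\mathbf{1}'}{n_i}\Big)Y_i = \sum_{i=1}^K \mathrm{trace}\Big[\Big(I - \frac{\mathbf{1}\mathbf{1}'}{n_i}\Big)\mathrm{Var}(Y_i)\Big] = \sum_{i=1}^K (n_i - 1)\sigma_0^2(x),$$

which implies the unbiasedness of $\hat\sigma_0^2$. Furthermore, write $\mathbf{n} = (n_1, n_2, \ldots, n_K)'$, $N = \mathrm{diag}(n_1, n_2, \ldots, n_K)$ (the diagonal matrix with diagonal elements $n_1, n_2, \ldots, n_K$) and $S = (S_1(x), S_2(x), \ldots, S_K(x))'$. Then $E[S] = S_0(x)\mathbf{1}$ and $\mathrm{Var}(S) = \sigma_0^2(x)N^{-1} + \tau_0^2(x)I$, because

$$\mathrm{Var}(S_i(x)) = \mathrm{Var}(\mathbf{1}'Y_i/n_i) = n_i^{-2}\,\mathbf{1}'\big(\sigma_0^2(x)I + \tau_0^2(x)\mathbf{1}\mathbf{1}'\big)\mathbf{1} = \sigma_0^2(x)/n_i + \tau_0^2(x).$$

Hence

$$\begin{aligned} E[SSA(x)] &= S_0^2(x)\,\mathbf{1}'\Big(N - \frac{\mathbf{n}\mathbf{n}'}{\mathbf{1}'\mathbf{n}}\Big)\mathbf{1} + \mathrm{trace}\Big[\Big(N - \frac{\mathbf{n}\mathbf{n}'}{\mathbf{1}'\mathbf{n}}\Big)\big(\sigma_0^2(x)N^{-1} + \tau_0^2(x)I\big)\Big] \\ &= (K-1)\sigma_0^2(x) + \frac{\big(\sum_{i=1}^K n_i\big)^2 - \sum_{i=1}^K n_i^2}{\sum_{i=1}^K n_i}\,\tau_0^2(x), \end{aligned}$$

which implies the unbiasedness of $\hat\tau_0^2$.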

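We next prove the consistency of $\hat\sigma_0^2$. First note that (2.21) yields

$$
\hat\sigma_0^2=\frac{\sum_{i=1}^{K}T_i}{\sum_{i=1}^{K}(n_i-1)},\qquad\text{where } T_i=\sum_{j=1}^{n_i}\frac{2j-n_i-1}{n_i}\,X_{i(j)}.
$$

Since

$$
\begin{aligned}
E[T_i^2]
&=E\Big[\sum_{j=1}^{n_i}\frac{2j-n_i-1}{n_i}X_{i(j)}\Big]^2
\le\frac{1}{n_i}\sum_{j=1}^{n_i}(2j-n_i-1)^2\;E\Big[\frac{1}{n_i}\sum_{j=1}^{n_i}X_{i(j)}^2\Big]\\
&=\frac{1}{n_i}\sum_{j=1}^{n_i}\big[4j^2-4j(n_i+1)+(n_i+1)^2\big]\,E\big[X_{i1}^2\big]
=\frac{n_i^2-1}{3}\,E\big[X_{i1}^2\big],
\end{aligned}
$$

we have

$$
\sum_{K=1}^{\infty}\frac{\mathrm{Var}(T_K)}{\big(\sum_{i=1}^{K}(n_i-1)\big)^2}
\le\sum_{K=1}^{\infty}\frac{E[T_K^2]}{\big(\sum_{i=1}^{K}(n_i-1)\big)^2}
\le\frac{1}{3}E\big[X_{ij}^2\big]\sum_{K=1}^{\infty}\frac{n_K^2-1}{\big(\sum_{i=1}^{K}(n_i-1)\big)^2}<\infty
$$

due to condition (2.17). Thus the consistency of $\hat\sigma_0^2$ follows from Kolmogorov's strong law of large numbers for independent but not identically distributed series. To show the consistency of $\hat\tau_0^2$, note the expression

$$
\frac{1}{\sum_{i=1}^{K}n_i}\int\mathrm{SSA}(x)\,dx
=\frac{1}{\sum_{i=1}^{K}n_i}\sum_{i=1}^{K}n_i\int(S_i(x)-S_0(x))^2\,dx-\int\big(S_0(x)-\bar S(x)\big)^2\,dx.
$$

First, as $\int|x|\,|d\bar S(x)|\to\int|x|\,|dS_0(x)|$ by the strong law of large numbers and $\max_x|S_0(x)-\bar S(x)|\to0$ (the Glivenko-Cantelli theorem), under condition (2.17) we have, integrating by parts,

$$
\begin{aligned}
\int\big(S_0(x)-\bar S(x)\big)^2\,dx
&=-2\int x\big(S_0(x)-\bar S(x)\big)\,d\big(S_0(x)-\bar S(x)\big)\\
&=2\int x\big(S_0(x)-\bar S(x)\big)\,d\bar S(x)-2\int x\big(S_0(x)-\bar S(x)\big)\,dS_0(x)\\
&\le2\max_x\big|S_0(x)-\bar S(x)\big|\Big[\int|x|\,|d\bar S(x)|+\int|x|\,|dS_0(x)|\Big]\to0.
\end{aligned}
$$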


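We next treat $\sum_{i=1}^{K}n_i\int(S_i(x)-S_0(x))^2\,dx\big/\sum_{i=1}^{K}n_i$. Note that, integrating by parts,

$$
\begin{aligned}
\int(S_i(x)-S_0(x))^2\,dx
&=2\int x\,(S_i(x)-S_0(x))\,d(F_i(x)-F_0(x))\\
&=2\int_0^{\infty}x\,(S_i(x)-S_0(x))\,d(F_i(x)-F_0(x))+2\int_{-\infty}^{0}x\,(F_0(x)-F_i(x))\,d(F_i(x)-F_0(x))\\
&=2\Big[\int_0^{\infty}xS_i(x)\,dF_i(x)+\int_0^{\infty}xS_0(x)\,dF_0(x)-\int_{-\infty}^{0}xF_i(x)\,dF_i(x)-\int_{-\infty}^{0}xF_0(x)\,dF_0(x)\\
&\qquad+\int_{-\infty}^{0}xF_0(x)\,dF_i(x)+\int_{-\infty}^{0}xF_i(x)\,dF_0(x)-\int_0^{\infty}xS_0(x)\,dF_i(x)-\int_0^{\infty}xS_i(x)\,dF_0(x)\Big].
\end{aligned}
$$

Define $H_i(x)=S_i(x)I(x\ge0)+F_i(x)I(x<0)$ and $H_0(x)=S_0(x)I(x\ge0)+F_0(x)I(x<0)$. Then

$$
\begin{aligned}
\int(S_i(x)-S_0(x))^2\,dx
&=2\Big[\int|x|H_i(x)\,dF_i(x)+\int|x|H_0(x)\,dF_0(x)-\int|x|H_0(x)\,dF_i(x)-\int|x|H_i(x)\,dF_0(x)\Big]\\
&\le2\Big[\int|x|H_i(x)\,dF_i(x)+\int|x|H_0(x)\,dF_0(x)\Big].
\end{aligned}
$$

Thus

$$
\begin{aligned}
\mathrm{Var}\Big(n_i\int(S_i(x)-S_0(x))^2\,dx\Big)
&\le n_i^2\,E\Big[\int(S_i(x)-S_0(x))^2\,dx\Big]^2\\
&\le4n_i^2\,E\Big[\int|x|H_i(x)\,dF_i(x)+\int|x|H_0(x)\,dF_0(x)\Big]^2\\
&\le8n_i^2\Big\{E\Big[\int|x|H_i(x)\,dF_i(x)\Big]^2+E\Big[\int|x|H_0(x)\,dF_0(x)\Big]^2\Big\}\\
&\le8n_i^2\,E\Big[\int x^2\,dF_i(x)+\int x^2\,dF_0(x)\Big]=16n_i^2\,E\big[X_{i1}^2\big].
\end{aligned}
$$

Consequently,

$$
\frac{1}{\sum_{i=1}^{K}n_i}\sum_{i=1}^{K}n_i\Big[\int(S_i(x)-S_0(x))^2\,dx-E\int(S_i(x)-S_0(x))^2\,dx\Big]\to0.
$$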
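The order-statistic form of $\hat\sigma_0^2$ used in this proof can be illustrated with a short Python sketch. It is an illustration under stated assumptions, not the authors' code: the function names are hypothetical, and `t_direct` integrates $\sum_j\big(I(X_{ij}>x)-S_i(x)\big)^2=n_iS_i(x)(1-S_i(x))$ exactly over the steps of the empirical survival function so that it can be compared with $T_i$.

```python
def t_order_stat(xs):
    """T_i = sum_j (2j - n - 1) / n * X_(j), with j = 1..n over the sorted sample."""
    n = len(xs)
    return sum((2 * j - n - 1) / n * x
               for j, x in enumerate(sorted(xs), start=1))


def t_direct(xs):
    """Integral over x of n * S_n(x) * (1 - S_n(x)), where S_n is the
    empirical survival function; it is piecewise constant between order statistics."""
    s = sorted(xs)
    n = len(s)
    total = 0.0
    for j in range(1, n):
        surv = (n - j) / n          # on [X_(j), X_(j+1)) exactly n - j points exceed x
        total += n * surv * (1 - surv) * (s[j] - s[j - 1])
    return total


def sigma0_sq_hat(samples):
    """Pooled estimator: sum_i T_i / sum_i (n_i - 1)."""
    return (sum(t_order_stat(x) for x in samples)
            / sum(len(x) - 1 for x in samples))


if __name__ == "__main__":
    claims = [[3.0, 1.0, 4.0, 1.5, 9.0], [2.0, 2.5, 0.5]]   # made-up data
    print(t_order_stat(claims[0]), t_direct(claims[0]))
    print(sigma0_sq_hat(claims))
```

The first two printed numbers coincide, which is the identity behind the expression for $T_i$.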

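Proof. Write $D=E\int_{-\infty}^{+\infty}\big[\widetilde{S}(x,\Theta_i)-S(x,\Theta_i)\big]^2dx-E\int_{-\infty}^{+\infty}\big[\widehat{S}(x,\Theta_i)-S(x,\Theta_i)\big]^2dx$. It can be rearranged as

$$
\begin{aligned}
D&=E\int_{-\infty}^{+\infty}\big(\widetilde{S}(x,\Theta_i)+\widehat{S}(x,\Theta_i)-2S(x,\Theta_i)\big)\big(\widetilde{S}(x,\Theta_i)-\widehat{S}(x,\Theta_i)\big)\,dx\\
&=E\int_{-\infty}^{+\infty}\big(\widetilde{S}(x,\Theta_i)-\widehat{S}(x,\Theta_i)\big)^2dx
+2E\int_{-\infty}^{+\infty}\big(\widehat{S}(x,\Theta_i)-S(x,\Theta_i)\big)\big(\widetilde{S}(x,\Theta_i)-\widehat{S}(x,\Theta_i)\big)\,dx.
\end{aligned}
$$

By the Cauchy-Schwarz inequality,

$$
\begin{aligned}
|D|&\le E\int_{-\infty}^{+\infty}\big(\widetilde{S}(x,\Theta_i)-\widehat{S}(x,\Theta_i)\big)^2dx
+2\Big\{E\int_{-\infty}^{+\infty}\big(\widehat{S}(x,\Theta_i)-S(x,\Theta_i)\big)^2dx\;E\int_{-\infty}^{+\infty}\big(\widetilde{S}(x,\Theta_i)-\widehat{S}(x,\Theta_i)\big)^2dx\Big\}^{1/2}\\
&=E\int_{-\infty}^{+\infty}\big(\widetilde{S}(x,\Theta_i)-\widehat{S}(x,\Theta_i)\big)^2dx
+2\Big\{\frac{\sigma_0^2\tau_0^2}{\sigma_0^2+n\tau_0^2}\,E\int_{-\infty}^{+\infty}\big(\widetilde{S}(x,\Theta_i)-\widehat{S}(x,\Theta_i)\big)^2dx\Big\}^{1/2},
\end{aligned}
\tag{A.5}
$$

so it suffices to show that

$$
\lim_{K\to\infty}\max_{1\le i\le K}E\int_{-\infty}^{+\infty}\big(\widetilde{S}(x,\Theta_i)-\widehat{S}(x,\Theta_i)\big)^2dx=0. \tag{A.6}
$$

Note $|\hat Z_i-Z_i|\le1$ and

$$
|\hat Z_i-Z_i|
=\Big|\frac{n\hat\tau_0^2}{\hat\sigma_0^2+n\hat\tau_0^2}-\frac{n\tau_0^2}{\sigma_0^2+n\tau_0^2}\Big|
=\frac{\big|\hat\sigma_0^2/\hat\tau_0^2-\sigma_0^2/\tau_0^2\big|}{n\big(1+\hat\sigma_0^2/n\hat\tau_0^2\big)\big(1+\sigma_0^2/n\tau_0^2\big)}
\le\Big|\frac{\hat\sigma_0^2}{\hat\tau_0^2}-\frac{\sigma_0^2}{\tau_0^2}\Big|.
$$

It follows that

$$
\max_{1\le i\le K}|\hat Z_i-Z_i|\le\min\Big\{\Big|\frac{\hat\sigma_0^2}{\hat\tau_0^2}-\frac{\sigma_0^2}{\tau_0^2}\Big|,\,1\Big\}=A\ \text{(say)}.
$$

Since $\widetilde{S}(x,\Theta_i)-\widehat{S}(x,\Theta_i)=(1-\hat Z_i)\big[\hat S_0(x)-S_0(x)\big]+(\hat Z_i-Z_i)(S_i(x)-S_0(x))$, we see that

$$
\begin{aligned}
\big[\widetilde{S}(x,\Theta_i)-\widehat{S}(x,\Theta_i)\big]^2
&\le2(1-\hat Z_i)^2\Big[\frac{1}{K}\sum_{r=1}^{K}S_r(x)-S_0(x)\Big]^2+2(\hat Z_i-Z_i)^2(S_i(x)-S_0(x))^2\\
&\le2\Big[\frac{1}{K}\sum_{r=1}^{K}(S_r(x)-S_0(x))\Big]^2+2A^2(S_i(x)-S_0(x))^2.
\end{aligned}
$$

Consequently,

$$
\begin{aligned}
\max_{1\le i\le K}E\int_{-\infty}^{+\infty}\big(\widetilde{S}(x,\Theta_i)-\widehat{S}(x,\Theta_i)\big)^2dx
&\le\max_{1\le i\le K}\Big\{\frac{2}{K^2}E\int_{-\infty}^{+\infty}\Big[\sum_{r=1}^{K}(S_r(x)-S_0(x))\Big]^2dx+2E\int_{-\infty}^{+\infty}A^2(S_i(x)-S_0(x))^2dx\Big\}\\
&=\frac{2}{K}\Big(\frac{\sigma_0^2}{n}+\tau_0^2\Big)+2\max_{1\le i\le K}E\int_{-\infty}^{+\infty}A^2(S_i(x)-S_0(x))^2dx.
\end{aligned}
$$

Moreover, by Hölder's inequality,

$$
\begin{aligned}
E\int_{-\infty}^{+\infty}A^2(S_i(x)-S_0(x))^2dx
&\le\big\{E[A^{2+\delta}]\big\}^{2/(2+\delta)}\Big\{E\int_{-\infty}^{+\infty}(S_i(x)-S_0(x))^{2(2+\delta)/\delta}dx\Big\}^{\delta/(2+\delta)}\\
&\le\big\{E[A^{2+\delta}]\big\}^{2/(2+\delta)}\Big\{E\int_{-\infty}^{+\infty}(S_i(x)-S_0(x))^2dx\Big\}^{\delta/(2+\delta)}\\
&=\big\{E[A^{2+\delta}]\big\}^{2/(2+\delta)}\Big\{\int_{-\infty}^{+\infty}\Big[\frac{\sigma_0^2(x)}{n}+\tau_0^2(x)\Big]dx\Big\}^{\delta/(2+\delta)}.
\end{aligned}
$$

Therefore,

$$
\max_{1\le i\le K}E\int_{-\infty}^{+\infty}\big[\widetilde{S}(x,\Theta_i)-\widehat{S}(x,\Theta_i)\big]^2dx
\le\frac{2}{K}\Big(\frac{\sigma_0^2}{n}+\tau_0^2\Big)
+2\big\{E[A^{2+\delta}]\big\}^{2/(2+\delta)}\Big\{\int_{-\infty}^{+\infty}\Big[\frac{\sigma_0^2(x)}{n}+\tau_0^2(x)\Big]dx\Big\}^{\delta/(2+\delta)}\to0\quad\text{as }K\to\infty,
$$

which establishes (A.6).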
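The bound on the estimated credibility factor used in this proof is easy to probe numerically. The sketch below is illustrative only: the function names are hypothetical, the factor is taken as $Z=n\tau_0^2/(\sigma_0^2+n\tau_0^2)$ with a common sample size $n$, and the variance components and their estimates are drawn at random from positive values.

```python
import random


def credibility_factor(n, sigma2, tau2):
    """Z = n * tau2 / (sigma2 + n * tau2), assumed form with equal sample sizes."""
    return n * tau2 / (sigma2 + n * tau2)


def gap_and_bound(n, sigma2, tau2, sigma2_hat, tau2_hat):
    """Return |Z_hat - Z| and the bound min(|sigma2_hat/tau2_hat - sigma2/tau2|, 1)."""
    gap = abs(credibility_factor(n, sigma2_hat, tau2_hat)
              - credibility_factor(n, sigma2, tau2))
    bound = min(abs(sigma2_hat / tau2_hat - sigma2 / tau2), 1.0)
    return gap, bound


def check_random_cases(trials=1000, seed=0):
    """True if the bound holds in every randomly generated case."""
    rng = random.Random(seed)
    for _ in range(trials):
        n = rng.randint(1, 50)
        values = [rng.uniform(0.01, 5.0) for _ in range(4)]
        gap, bound = gap_and_bound(n, *values)
        if gap > bound + 1e-12:
            return False
    return True


if __name__ == "__main__":
    print(check_random_cases())
```

Because the gap is at most $n^{-1}$ times the difference of the variance ratios, it shrinks as the estimates of $\sigma_0^2$ and $\tau_0^2$ converge.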

A.2.1. Dutch's Premium Principle

( ))+ (Xi ( ))+ | |

( ) ( )| and

( ) = ZX n + (1 Z)0 E[X| ] a.s.,

Z n

Z

n

(Xi

( ))+ (Xi ( ))+ Z |

( ) ( )| 0a.s.

n n

i=1 i=1

This is equivalent to

Z 1

n n

lim (Xi

( ))+ = lim Z lim (Xi ( ))+ = E [X ( )]+ . (A.7)

n n n n n

i=1 i=1

It follows that

(x

( ))+ dF 0 (x) (x ( ))+ dF 0 (x) + (x

( ))+ (x ( ))+ dF 0 (x)

(x ( ))+ dF 0 (x) + |

( ) ( )| dF 0 (x)

= (x ( ))+ dF 0 (x) + |

( ) ( )| .

Thus

(1 Z) (x

( ))+ dF 0 (x) (1 Z) (x ( ))+ dF 0 (x) + |

( ) ( )| 0 (A.8)

Z

n

(X| ) =

H ( ) + (Xi

( ))+ + (1 Z) (x

( ))+ dF 0 (x)

n i=1

converges to E[X] + E [(X ( ))+ ] almost surely. This completes the proof.


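Proof. First, note that for any $S(x)$, $H(S(x))$ can be represented also as $H(S(x))=\int x\,g'(S(x))\,dF(x)$. We thus have

$$
\begin{aligned}
\widehat{H}(X|\Theta)-H(X|\Theta)
&=\int x\,g'\big(ZS_n(x)+(1-Z)S_0(x)\big)\,d\big(ZF_n(x)+(1-Z)F_0(x)\big)-\int x\,g'(S(x,\Theta))\,dF(x,\Theta)\\
&=\int x\big[g'\big(ZS_n(x)+(1-Z)S_0(x)\big)-g'(S(x,\Theta))\big]\,d\big(ZF_n(x)+(1-Z)F_0(x)\big)\\
&\quad+\int x\,g'(S(x,\Theta))\,d\big(ZF_n(x)+(1-Z)F_0(x)\big)-\int x\,g'(S(x,\Theta))\,dF(x,\Theta).
\end{aligned}
$$

Therefore, the consistency follows from Theorem 2.2, the strong law of large numbers, and $Z\to1$ as $n\to\infty$:

$$
\begin{aligned}
&\Big|\int x\big[g'\big(ZS_n(x)+(1-Z)S_0(x)\big)-g'(S(x,\Theta))\big]\,d\big(ZF_n(x)+(1-Z)F_0(x)\big)\Big|\\
&\quad\le\int|x|\,\big|g'\big(ZS_n(x)+(1-Z)S_0(x)\big)-g'(S(x,\Theta))\big|\,d\big(ZF_n(x)+(1-Z)F_0(x)\big)\\
&\quad\le C\max_x\big|ZS_n(x)+(1-Z)S_0(x)-S(x,\Theta)\big|\int|x|\,\big|d\big(ZF_n(x)+(1-Z)F_0(x)\big)\big|\to0
\end{aligned}
$$

and

$$
\int x\,g'(S(x,\Theta))\,d\big(ZF_n(x)+(1-Z)F_0(x)\big)-\int x\,g'(S(x,\Theta))\,dF(x,\Theta)\to0.
$$

Consequently, $\widehat{H}(X|\Theta)\to H(X|\Theta)$ as $n\to\infty$.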
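The representation $H(S(x))=\int x\,g'(S(x))\,dF(x)$ that opens this proof can be checked against the more familiar form $\int g(S(x))\,dx$ for a nonnegative risk. The Python sketch below is only an illustration: the exponential claim distribution, the distortion $g(u)=\sqrt{u}$, the truncation point, and all function names are assumptions made for the example and do not come from the article.

```python
import math


def trapezoid(f, a, b, steps):
    """Composite trapezoid rule for f on [a, b]."""
    h = (b - a) / steps
    total = 0.5 * (f(a) + f(b))
    for k in range(1, steps):
        total += f(a + k * h)
    return total * h


def distortion_premium_two_ways(lam, upper=40.0, steps=200_000):
    """Distortion premium of an Exp(lam) risk under g(u) = sqrt(u), computed as
    (a) the integral of g(S(x)) dx and (b) the integral of x * g'(S(x)) dF(x)."""
    surv = lambda x: math.exp(-lam * x)          # S(x)
    dens = lambda x: lam * math.exp(-lam * x)    # density of F
    g = math.sqrt
    g_prime = lambda u: 0.5 / math.sqrt(u)
    via_g = trapezoid(lambda x: g(surv(x)), 0.0, upper, steps)
    via_g_prime = trapezoid(lambda x: x * g_prime(surv(x)) * dens(x),
                            0.0, upper, steps)
    return via_g, via_g_prime


if __name__ == "__main__":
    # For this choice both integrals equal 2 / lam in closed form.
    print(distortion_premium_two_ways(lam=2.0))
```

Both numbers agree with $2/\lambda$ up to discretization error, which is the integration-by-parts identity behind the representation.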

A.3.1. Proof of Example 3.1

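Proof. By the modeling assumptions it is easy to see $\mu(\Theta)=1/\Theta$, $\mu_0=\beta/(\alpha-1)$, $\sigma^2=\beta^2/[(\alpha-1)(\alpha-2)]$, and $\tau^2=\beta^2/[(\alpha-1)^2(\alpha-2)]$. Hence $Z^c=n/(n+\alpha-1)$ and $\hat\mu^c(\Theta)=(n\bar X_n+\beta)/(n+\alpha-1)$. On the other hand, as $S_0(x)=(\beta/(\beta+x))^{\alpha}$ and $E[S(x,\Theta)^2]=(\beta/(\beta+2x))^{\alpha}$, we have

$$
\tau_0^2(x)=\Big(\frac{\beta}{\beta+2x}\Big)^{\alpha}-\Big(\frac{\beta}{\beta+x}\Big)^{2\alpha}
\quad\text{and}\quad
\sigma_0^2(x)=\Big(\frac{\beta}{\beta+x}\Big)^{\alpha}-\Big(\frac{\beta}{\beta+2x}\Big)^{\alpha},
$$

implying $\tau_0^2=\beta/[2(2\alpha-1)(\alpha-1)]$ and $\sigma_0^2=\beta/[2(\alpha-1)]$. The credibility factor and the estimator of (3.3) are then given, respectively, by

$$
Z=\frac{n}{n+2\alpha-1}
\quad\text{and}\quad
\hat\mu(\Theta)=\frac{n}{n+2\alpha-1}\,\bar X_n+\frac{2\alpha-1}{n+2\alpha-1}\cdot\frac{\beta}{\alpha-1}.
$$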
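The integrated variance components in this example can be reproduced by numerical integration. The following Python sketch is a check under stated assumptions: it takes $S(x,\Theta)=e^{-\Theta x}$ with a gamma prior having shape $\alpha$ and rate $\beta$ (the setting that yields the formulas above), uses illustrative values $\alpha=3$, $\beta=2$, and the function names are hypothetical.

```python
def integrate_half_line(f, steps=100_000):
    """Midpoint rule for the integral of f over [0, inf), after the
    substitution x = t / (1 - t) with t in (0, 1)."""
    h = 1.0 / steps
    total = 0.0
    for k in range(steps):
        t = (k + 0.5) * h
        x = t / (1.0 - t)
        total += f(x) / (1.0 - t) ** 2     # dx = dt / (1 - t)^2
    return total * h


def variance_components(alpha, beta):
    """Return (sigma_0^2, tau_0^2) by integrating the pointwise components."""
    s0 = lambda x: (beta / (beta + x)) ** alpha          # E[S(x, Theta)]
    m2 = lambda x: (beta / (beta + 2 * x)) ** alpha      # E[S(x, Theta)^2]
    sigma0_sq = integrate_half_line(lambda x: s0(x) - m2(x))
    tau0_sq = integrate_half_line(lambda x: m2(x) - s0(x) ** 2)
    return sigma0_sq, tau0_sq


def credibility_factor(n, alpha, beta):
    """Z = n * tau_0^2 / (n * tau_0^2 + sigma_0^2)."""
    sigma0_sq, tau0_sq = variance_components(alpha, beta)
    return n * tau0_sq / (n * tau0_sq + sigma0_sq)


if __name__ == "__main__":
    alpha, beta, n = 3.0, 2.0, 10
    print(variance_components(alpha, beta))            # close to (0.5, 0.1)
    print(credibility_factor(n, alpha, beta), n / (n + 2 * alpha - 1))
    print(n / (n + alpha - 1))                          # classical factor Z^c
```

For these values the distribution-based factor $n/(n+2\alpha-1)$ is smaller than the classical factor $n/(n+\alpha-1)$, so it places more weight on the collective information.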


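Proof. First, $\pi(\theta|X_n)\propto\pi(\theta)\prod_{i=1}^{n}f(X_i,\theta)\propto\theta^{\sum_{i=1}^{n}X_i}(1-\theta)^{n-\sum_{i=1}^{n}X_i}\sim\mathrm{Beta}\big(\sum_{i=1}^{n}X_i+1,\;n-\sum_{i=1}^{n}X_i+1\big)$. Next, as $E[Xe^{hX}|\Theta]=\Theta e^{h}$ and $m_h(\Theta)=E[e^{hX}|\Theta]=1+\Theta(e^{h}-1)$, the Bayes premium can be given by

$$
H_B(X_n)=\frac{E[X_{n+1}e^{hX_{n+1}}|X_n]}{E[e^{hX_{n+1}}|X_n]}
=\frac{E[\Theta e^{h}|X_n]}{E[1+\Theta(e^{h}-1)|X_n]}
=\frac{(n\bar X+1)e^{h}}{(n\bar X+1)(e^{h}-1)+n+2}.
$$

On the other hand,

$$
S_0(x)=\begin{cases}1,&\text{if }x<0,\\ 1/2,&\text{if }0\le x<1,\\ 0,&\text{if }x\ge1,\end{cases}
\qquad
E\big[S(x,\Theta)^2\big]=\begin{cases}1,&\text{if }x<0,\\ 1/3,&\text{if }0\le x<1,\\ 0,&\text{if }x\ge1,\end{cases}
$$

so that $\tau_0^2=1/12$, $\sigma_0^2=1/6$, and $Z=n/(n+2)$. Thus $H_{cu}(X_n)=\big(\sum_{i=1}^{n}X_ie^{hX_i}+e^{h}\big)\big/\big(\sum_{i=1}^{n}e^{hX_i}+1+e^{h}\big)$. Because $X_i$ takes values 0 and 1 only, we have $X_ie^{hX_i}=X_ie^{h}$ and $e^{hX_i}=e^{h}X_i+1-X_i$. Straightforward computation then gives $H_{cu}(X_n)=H_B(X_n)$.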
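The "straightforward computation" at the end of this proof can be confirmed numerically. The Python sketch below evaluates the two closed forms from the proof on simulated 0/1 data; the function names, the simulated claim indicators, and the value of $h$ are illustrative assumptions.

```python
import math
import random


def bayes_premium(xs, h):
    """H_B = (sum(x) + 1) e^h / ((sum(x) + 1)(e^h - 1) + n + 2)."""
    n, s = len(xs), sum(xs)
    return (s + 1) * math.exp(h) / ((s + 1) * (math.exp(h) - 1) + n + 2)


def credibility_premium(xs, h):
    """H_cu = (sum x_i e^{h x_i} + e^h) / (sum e^{h x_i} + 1 + e^h)."""
    numerator = sum(x * math.exp(h * x) for x in xs) + math.exp(h)
    denominator = sum(math.exp(h * x) for x in xs) + 1 + math.exp(h)
    return numerator / denominator


if __name__ == "__main__":
    rng = random.Random(0)
    claims = [1 if rng.random() < 0.3 else 0 for _ in range(25)]  # made-up 0/1 data
    print(bayes_premium(claims, h=0.2), credibility_premium(claims, h=0.2))
```

The two premiums coincide for every binary sample, since $\sum_i e^{hX_i}=e^{h}\sum_iX_i+n-\sum_iX_i$.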

A.3.3. The Monte Carlo Approximation of Pan's Credibility Premiums under the Esscher Principle

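Step 1. Randomly sample $m$ values $\theta_i$, $i=1,2,\ldots,m$, from the distribution $\mathrm{Gamma}(\alpha,\beta-e^{h}+1)$ and compute their sample mean $\bar\theta=\frac{1}{m}\sum_{i=1}^{m}\theta_i$;

Step 2. For each $\theta_i$, generate $r$ samples, each of which consists of $n$ i.i.d. values from $\mathrm{Poisson}(\theta_i)$: $\{x_{ij1},x_{ij2},\ldots,x_{ijn}\}$, $j=1,2,\ldots,r$. For each $j$, compute $H_{ij}=\sum_{s=1}^{n}x_{ijs}e^{hx_{ijs}}\big/\sum_{s=1}^{n}e^{hx_{ijs}}$ and let $U_i=\bar H_i=\frac{1}{r}\sum_{j=1}^{r}H_{ij}$ and $V_i=(r-1)^{-1}\sum_{j=1}^{r}(H_{ij}-\bar H_i)^2$.

Step 3. Let $a=e^{h}(m-1)^{-1}\sum_{i=1}^{m}(\theta_i-\bar\theta)U_i$, $b=(m-1)^{-1}\sum_{i=1}^{m}(U_i-\bar U)^2$, $c=m^{-1}\sum_{i=1}^{m}V_i$, and $d=m^{-1}\sum_{i=1}^{m}U_i$. Then $Z^{(P)}\approx a/(b+c)$ and $E[h_n(\Theta)]\approx d$.

Step 4. Finally, the credibility estimator $H_P(X_n)$ can be computed by

$$
H_P(X_n)\approx\frac{a}{b+c}\cdot\frac{\sum_{i=1}^{n}X_ie^{hX_i}}{\sum_{i=1}^{n}e^{hX_i}}+\frac{\alpha e^{h}}{\beta-e^{h}+1}-\frac{ad}{b+c}. \tag{A.9}
$$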
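Steps 1 to 4 translate directly into a short simulation. The Python sketch below is one possible reading of the steps, not the authors' implementation: it uses only the standard library, treats the second gamma parameter as a rate (so the scale passed to `gammavariate` is its reciprocal), draws Poisson variates with Knuth's multiplication method, and uses hypothetical function names and default simulation sizes.

```python
import math
import random


def poisson(rng, lam):
    """Knuth's multiplication sampler; adequate for the moderate means used here."""
    limit, k, p = math.exp(-lam), 0, 1.0
    while True:
        p *= rng.random()
        if p <= limit:
            return k
        k += 1


def esscher_ratio(xs, h):
    """sum x e^{hx} / sum e^{hx} for one sample."""
    return (sum(x * math.exp(h * x) for x in xs)
            / sum(math.exp(h * x) for x in xs))


def pan_premium(data, alpha, beta, h, m=200, r=20, seed=0):
    """Monte Carlo approximation of (A.9) for the observed sample `data`."""
    rng = random.Random(seed)
    n = len(data)
    rate = beta - math.exp(h) + 1          # assumed to be positive
    # Step 1: draw theta_i from Gamma(alpha, rate) and average them.
    thetas = [rng.gammavariate(alpha, 1.0 / rate) for _ in range(m)]
    theta_bar = sum(thetas) / m
    # Step 2: r Poisson samples of size n per theta_i, giving U_i and V_i.
    U, V = [], []
    for theta in thetas:
        H = [esscher_ratio([poisson(rng, theta) for _ in range(n)], h)
             for _ in range(r)]
        mean_h = sum(H) / r
        U.append(mean_h)
        V.append(sum((v - mean_h) ** 2 for v in H) / (r - 1))
    # Step 3: the moment estimates a, b, c, d.
    u_bar = sum(U) / m
    a = math.exp(h) * sum((t - theta_bar) * u for t, u in zip(thetas, U)) / (m - 1)
    b = sum((u - u_bar) ** 2 for u in U) / (m - 1)
    c = sum(V) / m
    d = u_bar
    # Step 4: plug into (A.9).
    z = a / (b + c)
    return z * esscher_ratio(data, h) + alpha * math.exp(h) / rate - z * d


if __name__ == "__main__":
    print(pan_premium([1, 0, 2, 3, 1], alpha=2.0, beta=1.0, h=0.1))
```

Fixing the seed makes the approximation reproducible; larger `m` and `r` reduce the Monte Carlo error at a proportional cost in running time.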

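Proof. Since $S(x,\Theta)=\sum_{k=x+1}^{\infty}e^{-\Theta}\Theta^{k}/k!$, we can write

$$
E[S(x,\Theta)]=\frac{1}{\Gamma(\alpha)}\sum_{k=x+1}^{\infty}\frac{\beta^{\alpha}\,\Gamma(\alpha+k)}{k!\,(\beta+1)^{\alpha+k}}
\quad\text{and}\quad
E\big[S(x,\Theta)^2\big]=\frac{1}{\Gamma(\alpha)}\sum_{i=x+1}^{\infty}\sum_{j=x+1}^{\infty}\frac{\beta^{\alpha}\,\Gamma(\alpha+i+j)}{i!\,j!\,(\beta+2)^{\alpha+i+j}}.
$$

Consequently,

$$
\sum_{x=0}^{\infty}E[S(x,\Theta)]
=\frac{1}{\Gamma(\alpha)}\sum_{x=0}^{\infty}\sum_{k=x+1}^{\infty}\frac{\beta^{\alpha}\,\Gamma(\alpha+k)}{k!\,(\beta+1)^{\alpha+k}}
=\frac{\alpha}{\beta}\sum_{k=0}^{\infty}\frac{\Gamma(\alpha+1+k)}{k!\,\Gamma(\alpha+1)}\Big(\frac{\beta}{\beta+1}\Big)^{\alpha+1}\Big(\frac{1}{\beta+1}\Big)^{k}
=\frac{\alpha}{\beta},
$$

where the last equality holds because the summands are the probabilities of a negative binomial distribution. It follows that

$$
\sigma_0^2=\sum_{x=0}^{\infty}\Big\{E[S(x,\Theta)]-E\big[S(x,\Theta)^2\big]\Big\}
=\frac{\alpha}{\beta}-\sum_{i=1}^{\infty}\sum_{j=1}^{\infty}\frac{\min(i,j)\,\beta^{\alpha}\,\Gamma(\alpha+i+j)}{i!\,j!\,\Gamma(\alpha)\,(\beta+2)^{\alpha+i+j}}
$$

and

$$
\begin{aligned}
\tau_0^2&=\sum_{x=0}^{\infty}\mathrm{Var}(S(x,\Theta))=\sum_{x=0}^{\infty}E\big[S(x,\Theta)^2\big]-\sum_{x=0}^{\infty}\big\{E[S(x,\Theta)]\big\}^2\\
&=\sum_{x=0}^{\infty}\sum_{i=x+1}^{\infty}\sum_{j=x+1}^{\infty}\frac{\beta^{\alpha}\,\Gamma(\alpha+i+j)}{i!\,j!\,\Gamma(\alpha)\,(\beta+2)^{\alpha+i+j}}
-\sum_{x=0}^{\infty}\Big(\sum_{k=x+1}^{\infty}\frac{\beta^{\alpha}\,\Gamma(\alpha+k)}{k!\,\Gamma(\alpha)\,(\beta+1)^{\alpha+k}}\Big)^{2}\\
&=\sum_{x=0}^{\infty}\sum_{i=x+1}^{\infty}\sum_{j=x+1}^{\infty}\frac{\beta^{\alpha}\,\Gamma(\alpha+i+j)}{i!\,j!\,\Gamma(\alpha)\,(\beta+2)^{\alpha+i+j}}
-\sum_{x=0}^{\infty}\sum_{i=x+1}^{\infty}\sum_{j=x+1}^{\infty}\frac{\beta^{2\alpha}\,\Gamma(\alpha+i)\,\Gamma(\alpha+j)}{i!\,j!\,\Gamma(\alpha)^2\,(\beta+1)^{2\alpha+i+j}}\\
&=\sum_{x=0}^{\infty}\sum_{i=x+1}^{\infty}\sum_{j=x+1}^{\infty}\frac{\beta^{\alpha}}{i!\,j!\,\Gamma(\alpha)}\Big[\frac{\Gamma(\alpha+i+j)}{(\beta+2)^{\alpha+i+j}}-\frac{\beta^{\alpha}\,\Gamma(\alpha+i)\,\Gamma(\alpha+j)}{\Gamma(\alpha)\,(\beta+1)^{2\alpha+i+j}}\Big]\\
&=\sum_{i=1}^{\infty}\sum_{j=1}^{\infty}\sum_{x=0}^{\min(i,j)-1}\frac{\beta^{\alpha}}{i!\,j!\,\Gamma(\alpha)}\Big[\frac{\Gamma(\alpha+i+j)}{(\beta+2)^{\alpha+i+j}}-\frac{\beta^{\alpha}\,\Gamma(\alpha+i)\,\Gamma(\alpha+j)}{\Gamma(\alpha)\,(\beta+1)^{2\alpha+i+j}}\Big]\\
&=\sum_{i=1}^{\infty}\sum_{j=1}^{\infty}\frac{\min(i,j)\,\beta^{\alpha}}{i!\,j!\,\Gamma(\alpha)}\Big[\frac{\Gamma(\alpha+i+j)}{(\beta+2)^{\alpha+i+j}}-\frac{\beta^{\alpha}\,\Gamma(\alpha+i)\,\Gamma(\alpha+j)}{\Gamma(\alpha)\,(\beta+1)^{2\alpha+i+j}}\Big].
\end{aligned}
$$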
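The double sums for $\sigma_0^2$ and $\tau_0^2$ converge quickly and can be evaluated by truncation. The Python sketch below is an illustration under stated assumptions: it takes $X|\Theta\sim\mathrm{Poisson}(\Theta)$ with a gamma prior of shape $\alpha$ and rate $\beta$, truncates both indices at a hypothetical cutoff, works on the log scale with `math.lgamma` to avoid overflow, and uses made-up parameter values and function names.

```python
import math


def negbin_pmf(k, alpha, beta):
    """P(X = k) for the Poisson-gamma mixture (negative binomial)."""
    log_p = (math.lgamma(alpha + k) - math.lgamma(k + 1) - math.lgamma(alpha)
             + alpha * math.log(beta / (beta + 1)) - k * math.log(beta + 1))
    return math.exp(log_p)


def second_moment_term(i, j, alpha, beta):
    """beta^alpha Gamma(alpha+i+j) / (i! j! Gamma(alpha) (beta+2)^(alpha+i+j))."""
    log_t = (alpha * math.log(beta) + math.lgamma(alpha + i + j)
             - math.lgamma(i + 1) - math.lgamma(j + 1) - math.lgamma(alpha)
             - (alpha + i + j) * math.log(beta + 2))
    return math.exp(log_t)


def variance_components(alpha, beta, cutoff=80):
    """Truncated double sums; returns (sigma_0^2, tau_0^2)."""
    pmf = [negbin_pmf(k, alpha, beta) for k in range(cutoff + 1)]
    weighted_m2 = 0.0     # sum over i, j of min(i, j) * second-moment term
    weighted_prod = 0.0   # sum over i, j of min(i, j) * P(X = i) * P(X = j)
    for i in range(1, cutoff + 1):
        for j in range(1, cutoff + 1):
            w = min(i, j)
            weighted_m2 += w * second_moment_term(i, j, alpha, beta)
            weighted_prod += w * pmf[i] * pmf[j]
    sigma0_sq = alpha / beta - weighted_m2
    tau0_sq = weighted_m2 - weighted_prod
    return sigma0_sq, tau0_sq


if __name__ == "__main__":
    sigma0_sq, tau0_sq = variance_components(alpha=2.0, beta=3.0)
    n = 10
    print(sigma0_sq, tau0_sq, n * tau0_sq / (n * tau0_sq + sigma0_sq))
```

A useful consistency check is that $\sigma_0^2+\tau_0^2$ must equal $\sum_{x\ge0}S_0(x)(1-S_0(x))$, where $S_0$ is the survival function of the negative binomial mixture.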

Proof. It is easy to see that the Bayes premium is

$$
H_B(X_n)=\frac{(\alpha+n\bar X_n)\,e^{h}}{\beta+n-e^{h}+1}. \tag{A.10}
$$

E [Var(X| )] = /( eh + 1). Therefore,

ZG = 1

= ,

Var (( )) + n E [Var(X| )] + n eh + 1

+ nX n eh

HG Xn = Z G Xn + Hcol (X) Z G E [( )] = . (A.11)

+ n eh + 1
