
An actuarial model for assessing general practitioners' prescription costs

Giorgio Spedicato


Abstract Monitoring general practitioners' prescription costs is important for allocating national health insurance resources efficiently. To this end, this paper proposes a methodology based on non-life actuarial models. In our approach, the frequency and the cost of patients' drug prescriptions are modelled by means of Generalized Additive Models for Location, Scale and Shape (GAMLSS). The total cost of the drug prescriptions of the pool of patients is then modelled by means of convolutions, following a classical risk theory approach. An example based on a quasi-real dataset illustrates the proposed methodology.

Keywords: GAMLSS, public health insurance, drug prescriptions coverage, predictive models

1 Introduction

Monitoring general practitioners' (GPs) drug prescription costs is important for allocating the National Health Insurance (NHI) budget efficiently. The prolonged economic downturn has increased the pressure on governments toward rationalization and budget restrictions. For example, the NHI policy discussion in Italy [3] has drawn attention to standard costs of service, which should represent the efficient price of any service granted by the NHI. This paper aims to show a rational approach to assessing the standard cost of drug prescriptions charged to the NHI.

Drug prescription expenditure has been widely studied by health econometricians and medical researchers. In particular, [26] analysed GPs' drug prescription costs in Ireland. The yearly total cost was estimated by means of a linear regression model, based on aggregate demographic variables of each GP's pool of patients. [21] conducted a similar study in Northern Italy, applying multiple linear regression and a LISREL model based on both patient- and GP-level demographic variables. [9] studied the effect of GPs' age and sex on the number and cost of drug prescriptions in the Catalunya region. Finally, [10] applied a panel data econometric model to data from the Catalunya region. In summary, the medical literature confirms the availability of data and the importance of using statistical models; moreover, empirical studies show that both patient- and GP-level demographic variables play a significant role in determining the total cost of the yearly prescriptions. However, the approaches proposed in the medical literature have three drawbacks: the number and the cost of single prescriptions are not taken into account separately; linear regression models are used, while generalized linear models (GLMs) seem more adequate; and only the expected value of the total cost of the yearly prescriptions is modelled, without taking variability into account.

Even though no actuarial literature exists on this topic, some well-known actuarial approaches can be usefully applied in this context. In particular, we propose a new methodology that combines four techniques widely used in non-life insurance actuarial practice.

The first technique consists of the convolution of stochastic distributions (see e.g. [4] for a theoretical introduction). In particular, risk theory models the total cost of claims by convolution of the distributions of the number and of the cost of claims. One of the major applications of risk theory in the actuarial context is the estimation of insurers' Solvency Capital Requirement ([2]); see an example in [20]. Here, we propose to model the distribution of the yearly total cost of the drug prescriptions of a single patient as a convolution of the stochastic number and cost of the prescriptions associated with the patient.

The second technique is an extension of GLMs, namely Generalized Additive Models for Location, Scale and Shape (GAMLSS) [19]. GLMs are widely used in non-life rate-making ([1] and [5]). In particular, over-dispersed Poisson and Gamma GLMs are applied to model the frequency and the severity of claims as a function of policyholders' characteristics in order to assess the risk premium of insurance coverages (see [25] for details). However, the variability of the number and cost of claims is rarely taken into consideration in standard rate-making. GAMLSS make it possible to model as a function of covariates not only the mean, but also the other parameters that completely define the conditional distribution of the dependent variable. Very few actuarial applications of GAMLSS exist. In particular, GAMLSS have been proposed to assess the frequency and the cost of claims in the Australian market in [6] and to analyse mortality trends in [24]. Moreover, in order to assess the premium risk Solvency II capital requirement, [22] applies GAMLSS to better take into account portfolio heterogeneity. Here, we propose to model the frequency and cost of prescription drugs for each patient by means of GAMLSS, in order to estimate location and dispersion parameters as a function of patient characteristics.

The third technique is represented by models for lapse probability and conversion rate, widely used in actuarial practice to predict drop-outs and arrivals, given that a policyholders' portfolio is an open collectivity (see e.g. [25] and [23] for a practical discussion). We propose to model the probability that any subject leaves the GP because of death or other causes, as well as the probability that a new subject enters the GP's pool of patients.

The fourth technique consists of approximating the total loss distribution of a portfolio by a theoretical distribution (see [14] for details). We extend this approach to approximate the yearly total cost of drug prescriptions arising from a GP's pool of patients.

The paper is structured as follows: the methodology is introduced in Section 2, and an example based on a quasi-real data set is discussed in Section 3. Finally, Section 4 provides conclusions and suggestions for further research.

2 The methodology

This section introduces the theoretical tools which are the basis of the new methodology

proposed.


2.1 Risk theory

One of the goals of risk theory is modelling the total cost of a policyholders' portfolio. Given that patients are heterogeneous, we follow the so-called individual risk theory approach to model the distribution of the yearly total cost of prescription drugs. In particular, the yearly total cost $T$ of prescription drugs can be expressed as the sum of the single patients' costs $t_i$, $i = 1, \dots, N$, that is:

$$T = \sum_{i=1}^{N} t_i, \qquad (1)$$

where both $T$ and $t_i$, $i = 1, \dots, N$, are random variables.

Then, the yearly cost of prescription drugs $t_i$ for patient $i$ can be seen as a convolution of the costs $c_{ij}$, $j = 1, \dots, n_i$, of the single drug prescriptions of patient $i$, that is:

$$t_i = \sum_{j=1}^{n_i} c_{ij}, \qquad (2)$$

where $n_i$ represents the stochastic number of drug prescriptions during the exposure period for patient $i$, $c_{ij}$ represents the $n_i$ stochastic costs of those prescriptions, and the sum is zero when $n_i = 0$.

2.2 GAMLSS

A GLM extends the classical linear model to the case in which the dependent variable is not conditionally Gaussian; here, the expected value of the dependent variable $y_i$ is expressed as a function of covariates through the GLM link function, that is:

$$\begin{cases} E[y_i] = \mu_i = g^{-1}(\eta_i) = f(x_i) \\ \mathrm{var}[y_i] = \phi V(\mu_i) \end{cases} \qquad (3)$$

where $g(\cdot)$ is the link function (so that $g^{-1}(\cdot)$ maps the linear predictor $\eta_i$ to the mean), $V(\mu_i)$ is a function that depends on the distribution family and $\phi$ is a constant that can be estimated from the data (see [1] for details). However, the standard GLM framework leads to restrictive modelling of the variance of $y_i$, since it depends on the covariates only through $\mu_i$.

A recent extension of GLMs, the GAMLSS family, overcomes such limitations. GAMLSS make it possible to model up to four parameters of the distribution of $y_i$ as functions of covariates (i.e. location $\mu_i$, scale $\sigma_i$ and shape parameters $\nu_i$ and $\tau_i$). Then, we have:

$$\begin{cases} \mu_i = f_1(x_i) \\ \sigma_i = f_2(x_i) \\ \nu_i = f_3(x_i) \\ \tau_i = f_4(x_i) \end{cases} \qquad (4)$$

The distribution of $y_i$ is therefore fully characterized by a set of flexible equations. In particular, equation (4) implies that the moments of $y_i$ can be directly expressed as a function of covariates after a convenient parametrization, that is:

$$\begin{cases} E[y_i] = f(x_i) \\ \mathrm{var}[y_i] = g(x_i) \end{cases} \qquad (5)$$


The current GAMLSS R package [19] supports more than 60 distributions, non-linear and non-parametric relationships (e.g. cubic splines, loess and non-parametric smoothers) and random effect modelling; moreover, it provides a full set of diagnostic tools.

In order to assess the total cost of drug prescriptions in (2) for a GP's pool of patients, we propose to model $n_i$ and $c_{ij}$ within the GAMLSS framework as functions of patients' characteristics. This yields expressions for $E[n_i]$, $\mathrm{var}[n_i]$, $E[c_i]$ and $\mathrm{var}[c_i]$ following equation (5). We propose a count data regression model for $n_i$ and a regression model with a positive-valued distribution for $c_i$. Suitable candidates for $n_i$ are the Poisson (POI) and Negative Binomial (NB) distributions, which are widely used in non-life actuarial practice; their advantage is that closed forms for the moments exist as functions of the distribution parameters. Formulas (6) and (7) show convenient parametrizations of the POI and NB probability mass functions, respectively:

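$$p_Y(y \mid \mu) = \frac{e^{-\mu}\mu^{y}}{y!}, \qquad E[Y] = \mu, \qquad \mathrm{var}[Y] = \mu \qquad (6)$$

$$p_Y(y \mid \mu, \sigma) = \frac{\Gamma\left(y + \frac{1}{\sigma}\right)}{\Gamma\left(\frac{1}{\sigma}\right)\Gamma(y+1)} \left(\frac{\sigma\mu}{1+\sigma\mu}\right)^{y} \left(\frac{1}{1+\sigma\mu}\right)^{1/\sigma}, \qquad E[Y] = \mu, \qquad \mathrm{var}[Y] = \mu + \sigma\mu^{2} \qquad (7)$$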
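The moments stated for the NB parametrization can be checked numerically. The following is a minimal sketch, assuming the probability mass function of formula (7); the function names and the values of `mu` and `sigma` are illustrative choices, not quantities estimated in the paper.

```python
import math

def nb_pmf(y, mu, sigma):
    """Negative Binomial pmf in the (mu, sigma) parametrization of formula (7)."""
    r = 1.0 / sigma  # shape of the underlying Gamma mixing distribution
    log_p = (math.lgamma(y + r) - math.lgamma(r) - math.lgamma(y + 1)
             + y * math.log(sigma * mu / (1.0 + sigma * mu))
             - r * math.log(1.0 + sigma * mu))
    return math.exp(log_p)

def numeric_moments(pmf, upper=5000):
    """Total mass, mean and variance obtained by summing the pmf over 0..upper-1."""
    probs = [pmf(y) for y in range(upper)]
    mean = sum(y * p for y, p in enumerate(probs))
    second = sum(y * y * p for y, p in enumerate(probs))
    return sum(probs), mean, second - mean ** 2

mu, sigma = 3.3, 2.5  # illustrative values, not estimates from the paper
total, mean, var = numeric_moments(lambda y: nb_pmf(y, mu, sigma))
print(total, mean, var, mu + sigma * mu ** 2)
```

Working on the log scale with `math.lgamma` avoids overflow of the Gamma function for large counts.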

Suitable candidates for $c_i$, in turn, are the Gamma (GA) and Inverse Gaussian (IG) distributions. Equations (8) and (9) show convenient parametrizations of the GA and IG density functions, respectively:

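$$f_Y(y \mid \mu, \sigma) = \frac{1}{(\sigma^{2}\mu)^{1/\sigma^{2}}} \, \frac{y^{\frac{1}{\sigma^{2}} - 1} \, e^{-y/(\sigma^{2}\mu)}}{\Gamma\left(\frac{1}{\sigma^{2}}\right)}, \qquad E[Y] = \mu, \qquad \mathrm{var}[Y] = \sigma^{2}\mu^{2} \qquad (8)$$

$$f_Y(y \mid \mu, \sigma) = \frac{1}{\sqrt{2\pi\sigma^{2}y^{3}}} \exp\left(-\frac{(y-\mu)^{2}}{2\mu^{2}\sigma^{2}y}\right), \qquad E[Y] = \mu, \qquad \mathrm{var}[Y] = \sigma^{2}\mu^{3} \qquad (9)$$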
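As a sanity check of the two density parametrizations, the sketch below integrates them numerically and recovers the stated means and variances. It assumes the densities of equations (8) and (9); the parameter values are illustrative and the midpoint rule is only one of many possible integration schemes.

```python
import math

def ga_pdf(y, mu, sigma):
    """Gamma density in the (mu, sigma) parametrization of equation (8)."""
    k = 1.0 / sigma ** 2          # shape
    scale = sigma ** 2 * mu       # scale, so that the mean is k * scale = mu
    log_f = (k - 1.0) * math.log(y) - y / scale - k * math.log(scale) - math.lgamma(k)
    return math.exp(log_f)

def ig_pdf(y, mu, sigma):
    """Inverse Gaussian density in the (mu, sigma) parametrization of equation (9)."""
    expo = -((y - mu) ** 2) / (2.0 * mu ** 2 * sigma ** 2 * y)
    return math.exp(expo) / math.sqrt(2.0 * math.pi * sigma ** 2 * y ** 3)

def moments(pdf, upper, steps=200_000):
    """Total mass, mean and variance by midpoint-rule integration on (0, upper)."""
    h = upper / steps
    m0 = m1 = m2 = 0.0
    for k in range(steps):
        y = (k + 0.5) * h
        w = pdf(y) * h
        m0 += w
        m1 += y * w
        m2 += y * y * w
    return m0, m1, m2 - m1 ** 2

# Illustrative parameters (not estimates from the paper).
ga = moments(lambda y: ga_pdf(y, 20.0, 0.5), upper=2000.0)
ig = moments(lambda y: ig_pdf(y, 20.0, 0.2), upper=2000.0)
print(ga, ig)
```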

2.3 Lapse probability and conversion rate

In order to optimize the proposed tariffs, actuaries usually fit models for lapse probability and conversion rates, which take into account existing customers' drop-outs and new policyholders' flows, respectively. The standard approach is logistic regression with covariates describing the policyholders' demographic profile and the competitiveness of the market environment (see [25]). Lapse and conversion modelling makes it possible to properly define the effective period of exposure at risk, $e_i$, for each subject during the time of the study.

Our application models the drug prescription costs of a pool of patients during one calendar year. However, each patient can enter the pool after the beginning of the year, e.g. after a change of residence, and can leave the pool before the end of the year, e.g. because of death. Therefore the effective exposure period becomes a stochastic variable, $e_i$, that must be modelled in order to properly assess $t_i$. We assume the expected value of $n_i$ to be proportional to $e_i$. GLM modelling handles this by means of offsets, as [1] shows: the $\ln(e_i)$ term enters the link equation as an offset, i.e. with its coefficient fixed at 1, as equation (10) shows.

$$\begin{cases} E[n_i] = e_i \exp\left(x_i^{T}\beta\right) \\ \ln(E[n_i]) = \ln(e_i) + x_i^{T}\beta \end{cases} \qquad (10)$$
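The effect of the offset can be illustrated with a few lines of code. This is a minimal sketch of equation (10) under a log link; the coefficient vector `beta` and the covariates are hypothetical values chosen only for illustration.

```python
import math

def expected_count(exposure, x, beta):
    """E[n_i] = e_i * exp(x_i' beta), i.e. a log link with ln(e_i) as offset."""
    linear_predictor = math.log(exposure) + sum(xk * bk for xk, bk in zip(x, beta))
    return math.exp(linear_predictor)

beta = [0.2, 0.015]   # hypothetical coefficients: intercept and age effect
x = [1.0, 50.0]       # intercept term and a 50-year-old patient
full_year = expected_count(1.0, x, beta)
half_year = expected_count(0.5, x, beta)
print(full_year, half_year)
```

Because the coefficient of $\ln(e_i)$ is fixed at 1, halving the exposure halves the expected number of prescriptions, whatever the covariates.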

However, since the exposure variable is stochastic in our application, the exposure entering equation (10) is modified to take into account the contribution of inflows and outflows:

$$e_i = 1 - e_i^{l} + e_i^{nb} \qquad (11)$$

Equation (11) expresses the exposure of the i-th patient, $e_i$, as the algebraic sum of three components: the exposure amount, 1, that would be achieved if the patient stayed within the pool for the full calendar year, less the fraction of year exposure, $e_i^{l}$, that shall not be counted if the patient leaves the pool before the year end, plus the exposure contribution, $e_i^{nb}$, of new patients that share the demographic profile of the i-th patient.

$e_i^{l} = q_i I_d$ can be expressed as the product of a Bernoulli random variable $q_i$ and a uniform (0,1) random variable $I_d$. In particular, $q_i$ indicates whether the i-th patient leaves the pool within the year, while $I_d$ represents the fraction of year lost. By using a uniform distribution, we are assuming that the lapse probability is constant throughout the year.

Similarly, we can express the exposure due to the flow of new patients as $e_i^{nb} = \sum_{j=1}^{m_i} I_j^{nb}$, where $m_i$ represents the random number of new patients and is modelled by a Poisson distribution of parameter $\lambda_i$. Moreover, we are assuming that the $m_i$ new patients share the demographic profile of the i-th patient. The interpretation of $I_j^{nb}$ parallels that of $I_d$.
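The stochastic exposure of equation (11) can be sampled as in the following sketch. It is an illustration under stated assumptions: the lapse probability 0.02 and the new-patient rate 0.03 are the flat values used later in the empirical application (the death probability is ignored here), and the function names and the Poisson sampler are choices made for this example.

```python
import math
import random

def sample_poisson(lam, rng):
    """Poisson draw by Knuth's multiplication method (adequate for small lam)."""
    threshold, k, prod = math.exp(-lam), 0, rng.random()
    while prod > threshold:
        k += 1
        prod *= rng.random()
    return k

def sample_exposure(q_leave, lam_new, rng):
    """One draw of e_i = 1 - e_i^l + e_i^nb as in equation (11)."""
    leaves = rng.random() < q_leave                    # Bernoulli lapse indicator
    lost = rng.random() if leaves else 0.0             # uniform fraction of year lost
    n_new = sample_poisson(lam_new, rng)               # number of look-alike entrants
    gained = sum(rng.random() for _ in range(n_new))   # their fractions of year
    return 1.0 - lost + gained

rng = random.Random(2011)
draws = [sample_exposure(q_leave=0.02, lam_new=0.03, rng=rng) for _ in range(100_000)]
mean_exposure = sum(draws) / len(draws)
print(mean_exposure)
```

With these inputs the theoretical mean exposure is 1 - 0.02 * 0.5 + 0.03 * 0.5 = 1.005 patient/years per row, which the simulated average should approach.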

2.4 Loss distribution modeling

Many actuarial applications use loss distribution modelling to assess the shape of claim costs. Loss distribution modelling fits the parameters of theoretical distributions to real data in order to fully characterize the distribution that best fits the empirical claim costs under study. Fitting distributions requires choosing theoretical functions as candidates, estimating their parameters and assessing their goodness of fit. Another application of loss distribution modelling lies in approximating the total cost of the insurer's portfolio, T, by a simple theoretical distribution. The book [14] provides a comprehensive treatment of loss distribution modelling. An analytical expression of the loss distribution makes it possible to obtain key moments (e.g. mean and variance) and other statistics in closed form, instead of using simulation analysis, which can be time-consuming. However, real data are very often difficult to summarize by a theoretical distribution, due to data quality problems or excessive heterogeneity.

Loss distribution fitting is applied in two ways in this paper. The first is the selection of the conditional distributions for $n_i$ and $c_{ij}$ when performing GAMLSS modelling. Plots of normalized quantile residuals (see [8] for details) supported the assessment of the reasonableness of the chosen conditional distributions. The second is the closed-form approximation of the shape of T by means of a log-normal distribution, following the approach outlined in [15].


2.5 The estimation procedure

In order to estimate $T$ we define the distributions of $n_i$ and $c_i$ by means of GAMLSS predictive models. Patients with full-year exposure are used to calibrate the model for $n_i$.

The distributions of $t_i$ and $T$ can be obtained empirically by means of Monte Carlo simulation. In particular, a random realization from the distribution of the total cost $t_i$ for patient $i$ can be simulated using the convolution algorithm:

1. Sample one realization of the effective yearly exposure of the i-th patient, $e_i$.
2. Select the number of prescription drugs, k, at random from the assumed prescription drugs frequency distribution $n_i$.
3. Do the following k times: select the prescription drug cost, z, at random from the assumed prescription drugs cost distribution $c_i$.
4. The total cost $t_i$ is the sum of the k costs, z, selected in step 3.

Then, if the outlined process is repeated for all N patients of the general practitioner's portfolio and the resulting costs are summed, we obtain one random realization from the distribution of the total cost $T$.

Finally, in order to obtain the distribution of $t_i$ or $T$ it is necessary to repeat the previous steps M times (M >> 0).
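The convolution algorithm above can be sketched as follows. This is an illustration, not the implementation used in the paper (which relies on R and GAMLSS): it assumes an NB frequency and an IG cost as in formulas (7) and (9), keeps the exposure fixed at 1 instead of sampling it, and uses a hypothetical homogeneous pool with made-up parameters. The samplers (Gamma-mixed Poisson, Michael-Schucany-Haas) are standard techniques swapped in for this example.

```python
import math
import random

def sample_poisson(lam, rng):
    """Poisson draw by Knuth's multiplication method (fine for moderate lam)."""
    threshold, k, prod = math.exp(-lam), 0, rng.random()
    while prod > threshold:
        k += 1
        prod *= rng.random()
    return k

def sample_nb(mu, sigma, rng):
    """NB(mu, sigma) draw as a Gamma-mixed Poisson: var = mu + sigma * mu^2."""
    lam = rng.gammavariate(1.0 / sigma, sigma * mu)  # shape 1/sigma, scale sigma*mu
    return sample_poisson(lam, rng)

def sample_ig(mu, sigma, rng):
    """IG(mu, sigma) draw (Michael-Schucany-Haas), with var = sigma^2 * mu^3."""
    shape = 1.0 / sigma ** 2
    y = rng.gauss(0.0, 1.0) ** 2
    root = math.sqrt(4.0 * mu * shape * y + (mu * y) ** 2)
    x = mu + mu * mu * y / (2.0 * shape) - mu / (2.0 * shape) * root
    return x if rng.random() <= mu / (mu + x) else mu * mu / x

def simulate_patient(mu_n, sigma_n, mu_c, sigma_c, exposure, rng):
    """Steps 2-4: draw k prescriptions, then sum k independent costs."""
    k = sample_nb(exposure * mu_n, sigma_n, rng)  # exposure must be > 0 here
    return sum(sample_ig(mu_c, sigma_c, rng) for _ in range(k))

def simulate_pool(patients, n_sims, rng):
    """n_sims realizations of T, each the sum of t_i over the pool."""
    return [sum(simulate_patient(*p, rng) for p in patients) for _ in range(n_sims)]

rng = random.Random(42)
# Hypothetical pool: (mu_n, sigma_n, mu_c, sigma_c, exposure) for 50 identical patients.
patients = [(3.3, 1.0, 20.0, 0.25, 1.0)] * 50
totals = simulate_pool(patients, n_sims=400, rng=rng)
mean_total = sum(totals) / len(totals)
print(mean_total)
```

For this pool the theoretical expected total is 50 * 3.3 * 20 = 3300, so the simulated average should fall close to that value.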

3 An empirical application

3.1 Data sources and preparation

3.1.1 Data sources

An empirical application is presented in this study to exemplify numerically the framework outlined previously. We assess the distribution of the yearly total cost of drug prescriptions for the pool of patients of a target GP. The data sources used in the application are:

1. A data set, the prescriptions data set (PDS), containing the number of prescriptions of 6,000+ patients to their GPs [11]. Each row of the PDS contains the number of prescriptions during a whole year (dependent variable) plus a wide choice of demographic data. The PDS will be used to calibrate the frequency model. We have not challenged the reliability of the PDS, as it was impossible to perform such a task. Moreover, the PDS has been collected on patients between 25 and 65 years of age. All analyses will therefore be limited to the corresponding age span, without loss of generality.
2. A life table split by sex, used to model the probability of death as a function of age (source [13]).
3. A data set in the same format as the VDS containing the demographic data of 600 patients, henceforth the target data set (TDS). The TDS represents the pool of patients of a GP that we code as XY. The distribution of T for XY is to be assessed by the methodology proposed in this paper.
4. A data set containing a sample of drug costs along with the age and sex of the patient for whom the prescription was required. This data set, henceforth the Costs Data Set (CDS), will be used to calibrate the drug prescription cost model. It was collected in Spring 2011 thanks to the cooperation of an Italian drugstore.
5. A function that models the probability of drop-out due to reasons other than death (lapse probability). Due to limited data availability, we have set this probability to a flat value of 2.0%, after a discussion with a panel of experienced GPs.
6. A function that gives the rate of newly enrolled patients (conversion rate). Due to limited data availability, we have set this rate to a flat value of 3.0%, after a discussion with a panel of experienced GPs.

Standard lapse and conversion models deployed by personal lines pricing actuaries use logistic regression to predict the yearly lapse probability of each policyholder. The variables used in such regression models consist of policyholder demographics, policyholder purchasing behaviour and market competitiveness.

In our problem it is clear that the risk of entering and dropping out of the pool is not uniform among patients. Age is indeed a systematic risk factor, but we did not have a data source from which to build predictive models with covariates for lapse and conversion rates; we therefore chose a flat lapse rate to model drop-out for reasons other than death. Although simple, this approach makes it possible to simulate the patient flows of an open collectivity.

As the aim of the paper is to demonstrate the feasibility of the process, we did not seek datasets that completely match the real problem. The PDS and VDS come from a German study on the yearly number of visits to GPs conducted in the 1980s. We have assumed that the number of visits to the doctor is a perfect proxy for the number of drug prescriptions and that the populations sampled in the PDS and VDS datasets are representative of the target population, i.e. northern Italy NHI patients. On the other hand, the CDS represents a sample of drug prescription amounts collected in Spring 2011 thanks to the cooperation of a drugstore in Nibionno (Italy). The number and the cost data sets were not collected on the same subjects. This does not limit the analysis, as the cost distribution has been assumed independent of the distribution of the number of drug prescriptions once the effect of structural variables (like age and sex) is taken into account. Nevertheless, the employed data sources allowed us to adequately exemplify the operative methodology discussed in Section 2.5.

3.2 Predictive models estimation

GAMLSS can be fitted by means of an R package ([18]).

Since the purpose of this article is to illustrate the application of an actuarial methodology to a health economics problem, the modelling stage has been kept simple, and an approach resembling the usual pricing practice in non-life insurance has been followed.

In the PDS the average number of drug prescriptions is 3.33 and the corresponding standard deviation is 6.03. The average of the sampled costs of drug prescriptions is 20.3 and the corresponding standard deviation is 24.1.

Two predictive models, for $n_i$ and $c_{ij}$, were fitted using the GAMLSS framework. The model building process consisted in experimenting with and assessing different distributional assumptions for the dependent variables, the significance of candidate predictors and their functional relationship within the regression equation, as described in [17]. Finally, the following decisions were taken with respect to the selected models:

- The negative binomial has been chosen as the underlying distribution for the frequency of prescriptions, while the inverse Gaussian has been chosen as the underlying distribution for the cost of a single drug prescription. They were parametrized using formulas (7) and (9), respectively.
- Cubic splines have been used in both the frequency and the cost model to handle non-linear marginal relationships between the continuous covariates and the dependent variables. Splines are suggested (see e.g. [12]) in applied statistical modelling to overcome the naive assumption of marginal linearity. Another approach to this issue, widely used in personal lines ratemaking (as described in [25]), consists in binning continuous variables into categorical variables with properly chosen brackets and using such binned variables in the regression models.
- In this exploratory study, we have performed no analysis of interactions between predictors.
- An exposure variable has been added to the dataset, with constant value 1 (assuming every patient of the PDS to have been observed for a whole year without censoring), $e_i = 1$. The regression model for the number of prescriptions has a $\ln(e_i)$ term as offset. This offset term had to be inserted in the formula explicitly, as required by the GAMLSS package. Even if $e_i = 1$ for all records in the PDS, within the simulation process described further on $e_i$ becomes a random variable that takes into account the open collectivity structure.

The inspection of the marginal effects plots of the frequency and cost models in figures 1 and 2 leads to the following conclusions:

- The relationship between age and $n_i$ is positive and almost linear.
- Females experience more drug prescriptions than males.
- The relationship between handicap percentage and drug prescriptions is positive and shows non-linear behaviour.
- The relationship between income and drug prescriptions is negative and almost linear.
- The cost of prescriptions seems to have a parabolic behaviour with age, as it increases sharply, peaks at about 55 years and then seems to drop.

As previously mentioned, the most relevant advantage of GAMLSS models is that further parameters in addition to $\mu$ can be fitted as a function of covariates, as shown in equation (4). The analysis has shown that modelling $\sigma(n_i)$ as a function of patient age improves the GAIC goodness of fit index considerably. On the other hand, goodness of fit did not improve when a regression relationship between $\sigma(c_i)$ and either age or sex was introduced.

The diagnostic plots of the GAMLSS models for the number and cost of prescriptions are reported in figures 3 and 4, respectively. GAMLSS diagnostic plots show the normalized quantile residuals against the fitted values and against the position in the data base, the kernel density plot of the residuals and a normal qq-plot. Normalized quantile residuals are a generalized version of residuals [8] that follow a normal distribution by construction. They are useful to assess the correctness of the probabilistic distribution of the model being tested. See [17] for further details.

The residual analysis of the frequency model in figure 3 shows that the body of the distribution has been fitted fairly well, while the fit of the rightmost tail is not perfect due to residual over-dispersion. The diagnostic plot of the cost model in figure 4 shows that the chosen probabilistic distribution fits the empirical data very well.

Fig. 1 Drug prescriptions frequency model: marginal effects plot, parameter μ

Fig. 2 Drug prescriptions costs model: marginal effects plot, parameter μ

Fig. 3 GAMLSS diagnostic output of drugs prescriptions frequency model

3.3 The simulation process

The cost distribution of the yearly amount of drug prescriptions for the TDS has been initially simulated as follows:

1. The TDS has been duplicated into two distinct datasets: the first one representing the patients in force at the beginning of the period (henceforth IFP), the second one representing the patients (henceforth NP) that would enter the GP pool after the beginning of the period.
2. The following steps have been repeated m = 1, ..., M = 1000 times in order to simulate the distribution of the total cost of the drug prescriptions of the pool of patients:
   (a) The exposure in terms of patient/years, $E$, has been determined for the rows of both the IFP and NP datasets, as follows:
       - For IFP patients, one number $I_i$ has been drawn from a Bernoulli variable with probability equal to $q_i^{(d)} + q_i^{(l)}$, where $q_i^{(d)}$ and $q_i^{(l)}$ represent the probability of lapse due to death and to other causes, respectively. Due to the limitations of the collected data, the model assumes that only age and sex affect the lapse probability, with the contribution of other causes set at the flat value $q_i^{(l)} = 0.02$. In case $I_i = 1$, the yearly exposure of the i-th patient is drawn from a uniform [0, 1] distribution. The exposure for IFP dataset records is then expressed as $e_i = 1 \cdot \mathbb{1}(I_i = 0) + U(0,1) \cdot \mathbb{1}(I_i = 1)$.
       - For the NP dataset, the exposure has been determined by first sampling, for each row, a number $in_i$ from a Poisson distribution with rate parameter 0.03. $in_i$ represents the number of patients with the same demographic characteristics as the patient in the i-th row that will enter the dataset within the year. For each row, the convolution approach has been applied to determine the total exposure of the new incoming patients, sampling $in_i$ outcomes from a uniform [0, 1] distribution.
   (b) $E[n_i]$, $\mathrm{var}[n_i]$, $E[c_i]$ and $\mathrm{var}[c_i]$ have been predicted for each row of the IFP and NP datasets using the GAMLSS models calibrated in the previous step. Therefore $n_i$ and $c_i$ are fully defined, since both the $\mu$ and $\sigma$ parameters are known for each of them.
   (c) The convolution process has been applied to $n_i$ and $c_i$ to determine the total cost of drug prescriptions in the year, as shown in formula (2). The parameters of the number and cost distributions have been estimated in the previous step.
   (d) The simulated amounts $t_i$, numbers $n_i$ and exposures $e_i$ have been summed over the IFP and NP datasets, and the two results added together, in order to determine the yearly patient exposure $E$, the number of prescriptions $N$ and the total cost $T$ for the analysed pool of patients.

Fig. 4 GAMLSS diagnostic output of drugs prescriptions costs model

The object-oriented structure of R makes it possible to perform the simulation process using the predict methods applied to the GAMLSS regression models estimated in Section 3.2. The simulation steps have proved to be quite slow, as several hours were needed to simulate the yearly total expenditure of the 600-patient group of the hypothetical general practitioner XY using just M = 750 simulations on a standard desktop PC. A shortcut would therefore be useful to apply the proposal operationally.

In [15] the log-normal distribution has been suggested to fit the total loss distribution of a personal line non-life portfolio. This suggestion has been followed, and the log-normal distribution has been fitted to the total prescription cost distribution simulated in the previous step by the Monte Carlo approach. The R fitdistrplus package [7] was used to estimate the parameters and to assess the goodness of fit, both graphically and using suitable statistical tests (Anderson-Darling and Kolmogorov-Smirnov).

Fig. 5 General practitioner XY yearly total cost of drug prescriptions: log-normal distribution fit

The fitting results in figure 5 show that the log-normal distribution can provide a very good fit of $T$. Moreover, the p-values of the two goodness of fit tests were not significant. Therefore, if the parameters of the log-normal distribution of $T$ were known in advance, there would be no need to conduct a time-consuming Monte Carlo simulation to assess the distribution of $T$. We will show that it is possible to know these parameters in advance. As $T$ is the sum of independent $t_i$ observations, equation (12) follows.

$$\begin{cases} E[T] = \sum_{i=1}^{N} E(t_i) \\ \mathrm{var}[T] = \sum_{i=1}^{N} \mathrm{var}(t_i) \end{cases} \qquad (12)$$

Moreover, each $t_i$ represents an outcome of a compound distribution. Following [4], the expected value and the variance of $t_i$ can be obtained in closed form from equation ??. All terms in ?? are obtained from the previously fitted GAMLSS models.
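For a compound sum such as (2), the standard risk theory moments are recalled below as a reference. They assume that the single costs are identically distributed as $c_i$, independent of each other and of $n_i$; the second pair of lines specializes them to the NB and IG parametrizations of formulas (7) and (9), where the subscripted symbols $\mu_{n,i}$, $\sigma_{n,i}$, $\mu_{c,i}$, $\sigma_{c,i}$ are notation introduced here for the fitted frequency and cost parameters of patient $i$.

```latex
\begin{aligned}
E[t_i] &= E[n_i]\,E[c_i], \\
\mathrm{var}[t_i] &= E[n_i]\,\mathrm{var}[c_i] + \mathrm{var}[n_i]\,E[c_i]^{2}, \\[4pt]
E[t_i] &= \mu_{n,i}\,\mu_{c,i}, \\
\mathrm{var}[t_i] &= \mu_{n,i}\,\sigma_{c,i}^{2}\,\mu_{c,i}^{3}
  + \left(\mu_{n,i} + \sigma_{n,i}\,\mu_{n,i}^{2}\right)\mu_{c,i}^{2}.
\end{aligned}
```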

Since the theoretical expected value and variance of $T$ are known, the parameters of the log-normal approximation of the total amount distribution can be evaluated directly using the method of moments formulas (13). The direct estimation of the parameters $\mu_T$ and $\sigma_T$ completely defines the distribution of $T$.

$$\begin{cases} \mu_T = \ln(E(T)) - \frac{1}{2}\ln\left(1 + \frac{\mathrm{var}(T)}{E^{2}(T)}\right) \\ \sigma_T^{2} = \ln\left(1 + \frac{\mathrm{var}(T)}{E^{2}(T)}\right) \end{cases} \qquad (13)$$
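The method of moments formulas (13) translate directly into code. The sketch below is illustrative: it takes as input the simulated mean and standard deviation reported in Table 3 (rather than the closed-form moments) and derives a log-normal percentile; the function names are chosen for this example.

```python
import math
from statistics import NormalDist

def lognormal_from_moments(mean, variance):
    """Method-of-moments log-normal parameters (mu_T, sigma2_T) as in formulas (13)."""
    sigma2 = math.log(1.0 + variance / mean ** 2)
    mu = math.log(mean) - 0.5 * sigma2
    return mu, sigma2

def lognormal_quantile(p, mu, sigma2):
    """Quantile of the log-normal: exp(mu + sigma * z_p)."""
    return math.exp(mu + math.sqrt(sigma2) * NormalDist().inv_cdf(p))

mean_T, sd_T = 39967.79, 2721.93   # mean and SD reported in Table 3
mu_T, sigma2_T = lognormal_from_moments(mean_T, sd_T ** 2)
p995 = lognormal_quantile(0.995, mu_T, sigma2_T)
print(mu_T, sigma2_T, p995)
```

Any percentile used for budgeting or monitoring can be obtained the same way, without rerunning the simulation.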

Therefore the total cost distribution can be almost perfectly approximated using a quite

simple analytical distribution.


3.4 Results

The outlined algorithm has been applied to the TDS data set, which represents the demographic data of the 600 patients of general practitioner XY. Tables 1, 2 and 3 show key statistics of the general practitioner's patient/years, number of prescriptions and total cost of prescriptions. The 99.5% percentile has been added for the number and the total amount. Such a figure may be used to budget and monitor the drug prescription expenditure of GP XY.

mean      Q1        Q3
602.84    599.69    606.10

Table 1 Doctor XY patient/years distribution

mean       SD        Q1         Q3         p99.5
1952.96    125.33    1868.50    2038.00    2260.52

Table 2 Doctor XY number of prescriptions distribution

mean        SD         Q1          Q3          p99.5
39967.79    2721.93    38182.61    41814.40    46118.52

Table 3 Doctor XY total cost of prescriptions distribution

4 Conclusions and further research

4.1 Discussion of results

This article has shown how non-life actuarial techniques can be successfully applied to a health economics problem. We have used GAMLSS to evaluate the frequency and the cost of drug prescriptions, following an approach that closely resembles personal lines rate-making. The predictive models we propose can be used to assess and explain which demographic risk factors significantly affect the number and the cost of drug prescriptions paid by the NHI. Moreover, the convolution approach of collective risk theory has been used to assess the distribution of the yearly expenditures of a GP's pool of patients. The assessed total cost distribution can be used to monitor the prescriptions granted by the GP with a statistically grounded approach.
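The convolution idea can be illustrated on a toy scale. The sketch below is a simplified Python example under stated assumptions, not the procedure used in the paper. It assumes that each patient's yearly cost has already been discretized on a common grid, and the three probability vectors are made-up numbers. It convolves the independent per-patient distributions to obtain the distribution of the pool total and reads a percentile from it.

```python
from functools import reduce


def convolve(p, q):
    """Distribution of the sum of two independent costs on the same unit grid."""
    out = [0.0] * (len(p) + len(q) - 1)
    for i, p_i in enumerate(p):
        for j, q_j in enumerate(q):
            out[i + j] += p_i * q_j
    return out


def quantile(pmf, level):
    """Smallest grid point whose cumulative probability reaches `level`."""
    cumulative = 0.0
    for k, p_k in enumerate(pmf):
        cumulative += p_k
        if cumulative >= level:
            return k
    return len(pmf) - 1


# Hypothetical per-patient yearly cost distributions on a grid of cost units:
# entry k is the probability that the patient costs k units in the year.
patient_pmfs = [
    [0.5, 0.3, 0.2],
    [0.2, 0.5, 0.3],
    [0.7, 0.2, 0.1],
]

# Independence between patients is what justifies plain convolution here.
total_pmf = reduce(convolve, patient_pmfs)
mean_total = sum(k * p_k for k, p_k in enumerate(total_pmf))
p995 = quantile(total_pmf, 0.995)
print(mean_total, p995)
```

The nested loop is quadratic in the grid size, which is acceptable for a sketch. A pool of several hundred patients on a fine grid would call for a faster convolution or for simulation.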

A relevant limitation of the modelling approach followed here is that pandemic events are not handled properly, since each patient is assumed to be independent of the others. A pandemic would affect many patients at the same time through contagion, especially if they are spatially close (as with catastrophic insurance losses). On the other hand, seasonal diseases do not represent an issue because of the yearly period of observation (fractional exposures have appeared to be a small issue in this problem).

The approach followed in this paper has modelled all the prescriptions granted by the GP, without creating sub-models, e.g. for disease groups. Further subdivisions of drug prescriptions might be useful for a deeper analysis of the risk factors influencing the frequency and the cost of homogeneous groups of drug expenditures.

We think that the most valuable use of the proposed model within health economics would be a rational assessment of the standard cost of drug prescriptions for a GP's pool of patients. Assessing the distributions of $T$ and $t_i$ would yield statistics useful for the planning and budgeting process, such as:

- The expected value and any desired dispersion measure.
- Extreme percentiles (e.g. the 99th), which may be used as thresholds for further action in order to investigate potential inefficiencies or abuses.

If the predictive models were calibrated on a certified sample, they could be used to estimate the standard cost of yearly drug prescriptions for any GP's pool of patients, given the patients' demographics. The use of standard costs for government-provided services has been gaining importance in the period of budget pressure that Italy and many OECD countries are facing. At the same time, the developed model would yield the percentiles of the drug prescription cost distribution, which can be used to monitor expenditures, e.g. by prioritizing routine audits of individual GPs' drug prescriptions.

Moreover, the proposed methodology can easily be used to estimate drug prescription costs taking into account inflation and changes in the coverage offered by the NHI, such as the application of a yearly deductible or a coinsurance percentage. On the actuarial side of the analysis, another relevant application lies in estimating the multi-year actuarial present value of the drug prescription costs for any patient, given his or her demographic profile. The GAMLSS models for the number and the cost of prescriptions let us obtain a yearly average total cost (a pure premium), $pr_x$, of any patient as a function of age $x, x+1, \ldots$ and other demographic variables. Once assumptions have been made about the future inflation rate $i_t$, the financial discount rate $v_t$ and the probability of survival ${}_t p_x$, the lifetime actuarial present value of the drug prescription cost for any patient in the pool can be expressed by formula 14.

$$c_i = \sum_{t=0}^{\omega - x} (1 + i_t)^t \; {}_t p_x \; v_t \; pr_{x+t} \tag{14}$$
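Formula 14 can be evaluated numerically once the yearly inputs are available. The following Python sketch is only an illustration: the function name and all input figures are hypothetical, $v_t$ is treated as the discount factor applied to the cash flow of year $t$ (an assumption), and the horizon is truncated to three years for brevity rather than running over the whole lifetime.

```python
def lifetime_apv(pure_premiums, survival_probs, inflation_rates, discount_factors):
    """Sum over t of (1 + i_t)^t * tp_x * v_t * pr_{x+t}, following formula 14."""
    total = 0.0
    for t, (pr, p, i, v) in enumerate(
        zip(pure_premiums, survival_probs, inflation_rates, discount_factors)
    ):
        total += (1.0 + i) ** t * p * v * pr
    return total


# Hypothetical three-year horizon for one patient (all figures are made up).
pure_premiums = [60.0, 62.0, 65.0]                  # pr_x, pr_{x+1}, pr_{x+2}
survival_probs = [1.0, 0.99, 0.97]                  # tp_x for t = 0, 1, 2
inflation_rates = [0.02, 0.02, 0.02]                # i_t, constant here
discount_factors = [1.0, 1 / 1.03, 1 / 1.03 ** 2]   # v_t at a flat 3% rate

apv = lifetime_apv(pure_premiums, survival_probs, inflation_rates, discount_factors)
print(round(apv, 2))
```

In practice the pure premiums would come from the fitted frequency and cost models, and the survival probabilities from a mortality table.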

4.2 Further research

Finally, the proposed approach can certainly be applied to more traditional actuarial applications, such as personal lines rate-making or capital modelling.

The discussed model shows a rational approach to assessing the cost distribution of a general practitioner's drug prescriptions. Actuarial techniques common in general insurance pricing and risk management practice have been applied to a health economics problem, together with GAMLSS models, a frontier method in regression modelling.

As a further research direction, predictive models more flexible than the standard log-linear regression framework should be tested in order to better assess the frequency and the cost of prescriptions. More refined models for patient lapses and patient conversions can be built, following what is done in pricing optimization for personal lines rate-making [23], once the data availability issue has been solved.

We suggest expanding the study by increasing the sample of physicians analysed and of patient transactions. The use of GP-level data would be a valuable improvement in the explanatory power of the data set. In fact, the literature (e.g. [21]) has shown that GP characteristics such as length of practice affect the outcome significantly. A larger sample could increase the consistency of the model estimates and, last but not least, the inclusion of GP-level variables could improve the explanatory and predictive power of the model.

Acknowledgements The author wishes to thank Dr. Stefania Giacalone for having provided the drug prescription cost data, and Simona Minotti for her outstanding contribution in reviewing the document. The data analysis in this paper was performed with R, a statistical software released under the GNU General Public License (GPL). For more information on R, the interested reader is referred to R Development Core Team [16].

References

1. Duncan Anderson, Sholom Feldblum, Claudine Modlin, Doris Schirmacher, Ernesto Schirmacher, and Neeza Thandi. A practitioner's guide to generalized linear models. Technical report, Casualty Actuarial Society, 2007.
2. CEIOPS. QIS5 technical specifications, July 2010.
3. Cermlab. Alla ricerca di standard per la sanità federalista. http://www.cermlab.it/argomenti.php?group=sanita&item=43, 02 2010.
4. C.D. Daykin, T. Pentikäinen, and M. Pesonen. Practical risk theory for actuaries. Monographs on statistics and applied probability. Chapman & Hall, 1994.
5. Piet de Jong and Gillian Heller. Generalized linear models for insurance data. Cambridge University Press, New York, first edition, 2008.
6. Piet de Jong, Mikis Stasinopoulos, Robert Stasinopoulos, and Gillian Heller. Mean and dispersion modeling for policy claim cost. Scandinavian Actuarial Journal, 2007.
7. Marie Laure Delignette-Muller, Regis Pouillot, Jean-Baptiste Denis, and Christophe Dutang. fitdistrplus: help to fit of a parametric distribution to non-censored or censored data, 2010. R package version 0.1-3.
8. Peter Dunn and Gordon K. Smyth. Randomized quantile residuals. J. Computat. Graph. Statist, 5:236–244, 1996.
9. E. Fernandez-Liz, P. Modamio, A. Catalan, C. F. Lastra, T. Rodriguez, and E. L. Marino. Identifying how age and gender influence prescription drug use in a primary health care environment in Catalonia, Spain. Br J Clin Pharmacol, 65:407–417, Mar 2008.
10. M. Garcia-Goni and P. Ibern. Predictability of drug expenditures: an application using morbidity data. Health Econ, 17:119–126, Jan 2008.
11. Professor W. Greene. German health care usage data. Online: http://pages.stern.nyu.edu/~wgreene/Econometrics/PanelDataSets.htm, 1997.
12. Frank E. Harrel. title. Technical report, Vanderbilt University School of Medicine, 2011.
13. Istat. Geodemo Istat: Tavole di mortalità regionali, 2011. Online; accessed 26-June-2011.
14. S.A. Klugman, H.H. Panjer, and G.E. Willmot. Loss Models: From Data to Decisions (Book, Solutions Manual, and ExamPrep). John Wiley & Sons, 2009.
15. D.E. Papush, G.S. Patrik, and F. Podgaits. Approximations of the aggregate loss distribution. In CAS Forum, pages 175–186, 2001.
16. R Development Core Team. R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria, 2010. ISBN 3-900051-07-0.
17. Bob Rigby and Mikis Stasinopoulos. A flexible regression approach using GAMLSS in R, 11 2009.
18. R. A. Rigby and D. M. Stasinopoulos. Generalized additive models for location, scale and shape (with discussion). Applied Statistics, 54:507–554, 2005.
19. Robert Rigby and Mikis Stasinopoulos. Generalized additive models for location, scale and shape (with discussion). Applied Statistics, 54:507–554, 2005.
20. Nino Savelli and Gianpaolo Clemente. Hierarchical structures in the aggregation of premium risk for insurance underwriting. Scandinavian Actuarial Journal, 1:1, 2010.
21. G. Simon, C. Francescutti, S. Brusin, and F. Rosa. Variation in drug prescription costs and general practitioners in an area of north-east Italy. The use of current data. Epidemiol Prev, 18:224–229, Dec 1994.
22. Giorgio Alfredo Spedicato. Solvency II premium risk modeling under the direct compensation CARD system. PhD thesis, La Sapienza, Università di Roma, 2011.
23. James Tanser. Pretium manual. Towers Watson, 3.1 edition, 2010.
24. Gary Venter. Mortality trend models. Casualty Actuarial Society Forum, 1:1, 2011.
25. Geoff Werner and Claudine Modlin. Basic Ratemaking, 2009.
26. Keith Wilson-Davis and William G. Stevenson. Predicting prescribing costs: A model of Northern Ireland general practices. Pharmacoepidemiology and Drug Safety, 1(6):341–345, 1992.
