Contents
1 Preparation from Probability Theory
2 Statistical Indicators
2.1 Introduction to Economic Statistics
2.2 Statistical frequency series
2.3 Classification algorithm
2.4 Classification of statistical indicators
2.5 Average measures
2.5.1 The arithmetic mean
2.5.2 The harmonic mean
2.5.3 The geometric mean
2.5.4 The quadratic mean
2.5.5 Absolute moments
2.5.6 Properties of the means
2.6 Position measures
2.6.1 The mode
2.6.2 The median
2.6.3 Quintiles
2.6.4 Properties of the position measures
2.7 Variation measures
2.7.1 Simple measures of dispersion
2.7.2 Average deviation measures
2.7.3 Shape measures
interest
5.1 A general model of interest
5.2 Equivalence of investments
5.3 Simple interest
5.3.1 Basic formulas
5.3.2 Simple interest with variable rate
5.3.3 Equivalence by simple interest
5.4 Compound interest
5.4.1 Basic formulas
5.4.2 Nominal rate and effective rate
5.4.3 Compound interest with variable rate
5.5 Loans
5.6 Problems
annuities
7.1 A general model. Classifications
7.2 Single claim
7.3 Life annuities-immediate
7.3.1 Whole life annuities
7.3.2 Deferred whole life annuities
7.4 Temporary life annuities
7.5 Life annuities-immediate with k-thly payments
7.5.1 Whole life annuities with k-thly payments
7.5.2 Deferred whole life annuities with k-thly payments
7.5.3 Temporary life annuities with k-thly payments
7.6 Pension
7.6.1 Annually pension
10 Bonus-Malus system
10.1 A general model
10.2 Bayes model based on a mixed Poisson distribution
10.3 Gamma distribution for the average number of accidents
10.4 Problems
Theme 1
Preparation from Probability Theory
We group here the principal notions and results from probability theory that
were used in this course.
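Definition 1.1. Let $\Omega$ be any set. We denote by $\mathcal{P}(\Omega)$ the set of all subsets of $\Omega$, i.e. $\mathcal{P}(\Omega) = \{A \mid A \subseteq \Omega\}$.
Definition 1.2. A topology on the set $\Omega$ is a family $\mathcal{T}$ of subsets of $\Omega$ s.t.
$\emptyset, \Omega \in \mathcal{T}$,
$A, B \in \mathcal{T} \Rightarrow A \cap B \in \mathcal{T}$,
$(A_i)_{i \in I} \subseteq \mathcal{T} \Rightarrow \bigcup_{i \in I} A_i \in \mathcal{T}$.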
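$A \in \mathcal{B} \Rightarrow \Omega \setminus A \in \mathcal{B}$,
$(A_i)_{i \in \mathbb{N}^*} \subseteq \mathcal{B} \Rightarrow \bigcup_{i=1}^{\infty} A_i \in \mathcal{B}$.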
iI
iI
iI
surable spaces (i , Bi ), i I.
d
O
B1 .
i=1
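$\mu(\emptyset) = 0$,
$(A_i)_{i \in \mathbb{N}^*} \subseteq \mathcal{B}$ mutually disjoint ($A_i \cap A_j = \emptyset$, $\forall\, i \ne j$) $\Rightarrow \mu\Big(\bigcup_{i=1}^{\infty} A_i\Big) = \sum_{i=1}^{\infty} \mu(A_i)$.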
$\mu\Big(\bigcup_{i=1}^{n} A_i\Big) = \sum_{k=1}^{n} (-1)^{k-1} \sum_{1 \le i_1 < \dots < i_k \le n} \mu(A_{i_1} \cap \dots \cap A_{i_k})$;
[
n=1
An ) = lim (An );
n
n=1
An )
An ) = lim (An );
n=1
(An ).
n=1
o
1 .
Definition 1.11. In the context of the above proposition, we say that $(p_\omega)_{\omega \in \Omega}$ defined by $p_\omega = \mu(\{\omega\})$ is the discrete (or countable) probability distribution of the discrete (or countable) probability $\mu$.
Remark 1.1. In the setting of the above definition, $(p_\omega)_{\omega \in \Omega}$ is a vector if $\Omega$ is finite and a sequence if $\Omega$ is infinite.
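Remark 1.2. If $(\Omega, \mathcal{B}, P)$ is a probability space, $X : \Omega \to \Omega_1$ is a function and $x \in \Omega_1$ s.t. $\{\omega \in \Omega \mid X(\omega) = x\} \in \mathcal{B}$, then we denote
$P(X = x) = P(\{\omega \in \Omega \mid X(\omega) = x\})$.
Similarly one uses the notation $P(X < x)$, $P(X > x)$, $P(X \le x)$, $P(X \ge x)$, $P(X \ne x)$, $P(X \in A)$, where $A \subseteq \Omega_1$.
Also, if $Y : \Omega \to \Omega_1$ is another function s.t. $\{\omega \in \Omega \mid X(\omega) = Y(\omega)\} \in \mathcal{B}$, then we denote
$P(X = Y) = P(\{\omega \in \Omega \mid X(\omega) = Y(\omega)\})$.
Similarly one uses the notation $P(X < Y)$, $P(X > Y)$, $P(X \le Y)$, $P(X \ge Y)$, $P(X \ne Y)$.
Also, if $Z : \Omega \to \Omega_2$ is another function and $z \in \Omega_2$ s.t. $\{\omega \in \Omega \mid Z(\omega) = z\} \in \mathcal{B}$, then we denote
$P(X = x, Z = z) = P(\{\omega \in \Omega \mid X(\omega) = x \text{ and } Z(\omega) = z\})$.
Similarly one uses the notation $P(X < x, Z < z)$, $P(X \in A, Y \in B)$, $P(X = x, Y = y, Z = z)$, etc.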
iJ
P (B A)
, B B
P (A)
n
X
i=1
Proposition 1.6. Let $\mu$ be a distribution on $\mathbb{R}^d$. Then its distribution function $F$ verifies the following properties:
1) $\Delta^{(d)}(F; a; b) \ge 0$, $\forall\, a, b \in \mathbb{R}^d$ s.t. $a \le b$ (i.e. $a_i \le b_i$, $\forall\, i \in \{1, \dots, d\}$);
2) $F$ is right continuous, i.e. $\lim_{x \searrow a} F(x) = F(a)$, $\forall\, a \in \mathbb{R}^d$;
3) $\lim_{x \to \infty} F(x) = 1$;
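Definition 1.21. Let $(\Omega, \mathcal{B}, \mu)$ be a measure space and let $f : \Omega \to \overline{\mathbb{R}}$ be a measurable function (with respect to the Borel fields $\mathcal{B}$ and $\mathcal{B}^1$).
a) If $f \in S(\Omega, \mathcal{B})$, then the Lebesgue integral of the function $f$ with respect to the measure $\mu$ is defined by
$\int f\,d\mu = \sum_{a \in f(\Omega)} a\,\mu(f^{-1}(\{a\}))$.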
c) f is called Lebesgue
integrable with respect to the measure (-Lebesgue
Z
integrable) if
fields B and B ).
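a) (Linearity) For every $\alpha_1, \alpha_2 \in \mathbb{R}$ the function $\alpha_1 f_1 + \alpha_2 f_2$ is $\mu$-Lebesgue integrable and
$\int (\alpha_1 f_1 + \alpha_2 f_2)\,d\mu = \alpha_1 \int f_1\,d\mu + \alpha_2 \int f_2\,d\mu$.
b) (Monotonicity) If $f_1 \le f_2$, then $\int f_1\,d\mu \le \int f_2\,d\mu$.
If $f_1 \le f_2$ and $\mu(\{x \mid f_1(x) < f_2(x)\}) > 0$, then $\int f_1\,d\mu < \int f_2\,d\mu$.
c) $\left|\int f\,d\mu\right| \le \int |f|\,d\mu$.
d) $f$ is finite $\mu$-a.e., i.e. $\mu(\{x \mid f(x) = \pm\infty\}) = 0$.
$\int g\,d\mu = \int f\,d\mu$.
f) If $|g| \le f$ $\mu$-a.e., then $g$ is $\mu$-Lebesgue integrable and $\int g\,d\mu \le \int f\,d\mu$.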
Theorem 1.1. (Lebesgue's dominated convergence theorem) Let $(\Omega, \mathcal{B}, \mu)$ be a measure space and let the functions $f, f_n, g : \Omega \to \overline{\mathbb{R}}$, for
that is, if either integral exists so does the other and they are equal.
ii) (Change of variable formula) Let U, V Rd be two non-empty open
where J (y) = det
i
(y)
(J is called the Jacobian of ).
yj
i,j{1,...,d}
b1
bd
f (x)dd (x) =
[a,b]
a1
f (x)dd (x) =
Rd
(where the Riemann integrals from the right side are improper).
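Proposition 1.13. Let $(\Omega_1, \mathcal{B}_1, P_1), \dots, (\Omega_n, \mathcal{B}_n, P_n)$ be probability spaces, $n \in \mathbb{N}^*$. Then there exists a unique probability $P$ on the measurable space $\Big(\prod_{i=1}^{n} \Omega_i, \bigotimes_{i=1}^{n} \mathcal{B}_i\Big)$ s.t.
$P\Big(\prod_{i=1}^{n} A_i\Big) = \prod_{i=1}^{n} P_i(A_i)$, $\forall\, A_i \in \mathcal{B}_i$, $i \in \{1, \dots, n\}$.
The probability space $\Big(\prod_{i=1}^{n} \Omega_i, \bigotimes_{i=1}^{n} \mathcal{B}_i, \bigotimes_{i=1}^{n} P_i\Big)$ is called the product of the probability spaces $(\Omega_1, \mathcal{B}_1, P_1), \dots, (\Omega_n, \mathcal{B}_n, P_n)$.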
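Proposition 1.14. Let $I$ be an infinite index set and, for every $i \in I$, let $(\Omega_i, \mathcal{B}_i, P_i)$ be a probability space. Then there exists a unique probability $P$ on the measurable space $\Big(\prod_{i \in I} \Omega_i, \bigotimes_{i \in I} \mathcal{B}_i\Big)$ s.t.
$P \circ \mathrm{pr}_J^{-1} = \bigotimes_{j \in J} P_j$
The probability space $\Big(\prod_{i \in I} \Omega_i, \bigotimes_{i \in I} \mathcal{B}_i, \bigotimes_{i \in I} P_i\Big)$ is called the product of the probability spaces $(\Omega_i, \mathcal{B}_i, P_i)$, $i \in I$.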
Definition 1.27. Let $(\Omega, \mathcal{B}, P)$ be a probability space and $(\Omega_1, \mathcal{B}_1)$ be a measurable space. A function $X : \Omega \to \Omega_1$ which is measurable (with respect to the Borel fields $\mathcal{B}$ and $\mathcal{B}_1$) is called a random variable (r.v., random element).
If $(\Omega_1, \mathcal{B}_1) = (\mathbb{R}^d, \mathcal{B}^d)$, then $X$ is called a $d$-dimensional r.v. (random vector). In particular, if $d = 1$ then $X$ is called a real-valued r.v.
Proposition 1.15. Let $(\Omega, \mathcal{B}, P)$ be a probability space, $(\Omega_1, \mathcal{B}_1)$ be a measurable space and $X : \Omega \to \Omega_1$ be a random variable. Then $\mu = P \circ X^{-1}$ is a probability on the space $(\Omega_1, \mathcal{B}_1)$.
Definition 1.28. In the context of the above proposition, the probability $\mu = P \circ X^{-1}$ is called the distribution (probability distribution) of the r.v. $X$ (with respect to the probability $P$).
The distribution function (probability distribution function, cumulative probability distribution function) of the r.v. $X$ (with respect to the probability $P$) is the distribution function of its distribution $P \circ X^{-1}$, i.e. the function
$F_X : \mathbb{R}^d \to [0, 1]$, $F_X(x) = P(X < x)$, $\forall\, x \in \mathbb{R}^d$.
A $d$-dimensional r.v. $X$ is called discrete if the image $X(\Omega)$ is a countable set. A $d$-dimensional r.v. $X$ is called continuous (with respect to the probability $P$) if its distribution function $F_X$ is continuous.
Remark 1.5. Let $X$ be a $d$-dimensional discrete r.v. Then its distribution function $F_X$ (with respect to any probability $P$) is also discrete, i.e. the image $F_X(\mathbb{R}^d)$ is a countable set.
Proposition 1.16. Let $(\Omega, \mathcal{B}, P)$ be a probability space and $X : \Omega \to \mathbb{R}^d$ be a $d$-dimensional r.v.
a) If $X$ is discrete, then its distribution $\mu = P \circ X^{-1}$ is also discrete.
b) If $X$ is continuous (with respect to $P$), then its distribution $\mu = P \circ X^{-1}$ is also continuous.
Definition 1.29. A function $p : \mathbb{R}^d \to [0, \infty)$ that is measurable (with respect to the Borel fields $\mathcal{B}^d$ and $\mathcal{B}^1$) is called a probability density function (probability function, density function) if $p$ is $\lambda^d$-Lebesgue integrable and
$\int_{\mathbb{R}^d} p(x)\,d\lambda^d(x) = 1$.
d -a.e.,
Definition 1.31. Let $X_1, \dots, X_n$ be random variables defined on the probability space $(\Omega, \mathcal{B}, P)$, where $X_i$ is a $d_i$-dimensional r.v., for every $i \in \{1, \dots, n\}$. Consider the $(d_1 + \dots + d_n)$-dimensional r.v. $X = (X_1, \dots, X_n) : \Omega \to \mathbb{R}^{d_1} \times \dots \times \mathbb{R}^{d_n}$.
The distribution of $X$ (with respect to $P$) is called the joint distribution of the r.v. $X_1, \dots, X_n$ (with respect to $P$).
For any $i \in \{1, \dots, n\}$, the distribution of $X_i$ (with respect to $P$) is called a marginal distribution of the r.v. $X$ (with respect to $P$).
Definition 1.32. Let $(\Omega, \mathcal{B}, P)$ be a probability space, $I$ be a non-empty index set and $(X_i)_{i \in I}$ be a family of random variables defined on this space, $X_i$ taking values in a measurable space $(\Omega_i, \mathcal{B}_i)$, for every $i \in I$. The r.v. $X_i$, $i \in I$, are called independent (with respect to the probability $P$) if
$P(X_{i_1} \in A_{i_1}, \dots, X_{i_k} \in A_{i_k}) = P(X_{i_1} \in A_{i_1}) \cdots P(X_{i_k} \in A_{i_k})$,
for every finite non-empty subset $\{i_1, \dots, i_k\} \subseteq I$ of indices, $i_1 < \dots < i_k$, $k \in \mathbb{N}^*$, and for all events $A_{i_1} \in \mathcal{B}_{i_1}, \dots, A_{i_k} \in \mathcal{B}_{i_k}$.
Proposition 1.18. Let $X_1, \dots, X_n$ be random variables defined on the probability space $(\Omega, \mathcal{B}, P)$, where $X_i$ is a $d_i$-dimensional r.v., for every $i \in \{1, \dots, n\}$, $n \in \mathbb{N}^*$. Let $\mu_1, \dots, \mu_n$ be the distributions of $X_1, \dots, X_n$, respectively (with respect to $P$), and let $F_{X_1}, \dots, F_{X_n}$ be the distribution functions of $X_1, \dots, X_n$, respectively (with respect to $P$).
a) The following assertions are equivalent:
a1) The r.v. $X_1, \dots, X_n$ are independent (with respect to $P$);
a2) For all events $A_1 \in \mathcal{B}^{d_1}, \dots, A_n \in \mathcal{B}^{d_n}$ we have
$P(X_1 \in A_1, \dots, X_n \in A_n) = P(X_1 \in A_1) \cdots P(X_n \in A_n)$;
a3) The distribution $\mu$ of $X = (X_1, \dots, X_n)$ (with respect to $P$) verifies the equality $\mu = \mu_1 \otimes \dots \otimes \mu_n$;
a4) The distribution function $F_X$ of $X = (X_1, \dots, X_n)$ (with respect to $P$) verifies the equality
$F_X(x) = F_{X_1}(x_1) \cdots F_{X_n}(x_n)$, $\forall\, x = (x_1, \dots, x_n) \in \mathbb{R}^{d_1} \times \dots \times \mathbb{R}^{d_n}$.
b) We assume moreover that the r.v. $X_1, \dots, X_n$ are discrete. Then $X_1, \dots, X_n$ are independent (with respect to $P$) if and only if
$P(X_1 = x_1, \dots, X_n = x_n) = P(X_1 = x_1) \cdots P(X_n = x_n)$,
for every $x_1 \in \mathbb{R}^{d_1}, \dots, x_n \in \mathbb{R}^{d_n}$.
c) We assume moreover that the r.v. $X_1, \dots, X_n$ are continuous, with the
|x|r d(x).
Er () Er (X) =
R
xd(x).
R
d) If $\operatorname{var}(X) < \infty$ and $\operatorname{var}(Y) < \infty$, then the covariance of the r.v. $X$ and $Y$ (with respect to the probability $P$) is defined by
$\operatorname{cov}(X, Y) = E\big([X - E(X)][Y - E(Y)]\big)$.
E(X) E() =
xA
xA
[x E()]r ({x});
xA
hX
i2
X
X
var (X) var () =
[x E()]2 ({x}) =
x2 ({x})
x({x}) ,
xA
xA
xA
Er (X) Er () =
|x|r p(x)dmL (x);
ZR
Er (X) Er () =
xr p(x)dmL (x);
ZR
E(X) E() =
xp(x)dmL (x);
R
Erc ()
Z
=
ZR
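Remark 1.7. In the context of the above proposition, if the probability density function $p$ is continuous or, more generally, Riemann integrable on every compact interval, then from Theorem 1.6 we have:
$\overline{E}_r(X) = \int_{-\infty}^{\infty} |x|^r p(x)\,dx$; $E_r(X) = \int_{-\infty}^{\infty} x^r p(x)\,dx$;
$E(X) = \int_{-\infty}^{\infty} x\,p(x)\,dx$; $E_r^c(X) = \int_{-\infty}^{\infty} [x - E(\mu)]^r p(x)\,dx$;
$\operatorname{var}(X) = \int_{-\infty}^{\infty} [x - E(\mu)]^2 p(x)\,dx = \int_{-\infty}^{\infty} x^2 p(x)\,dx - \Big[\int_{-\infty}^{\infty} x\,p(x)\,dx\Big]^2$.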
If
Rd
Z
If
Rd
X
i=0
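Remark 1.9. In the above definition, $\langle t, x \rangle$ denotes the inner product (scalar product) of vectors $t = (t_1, \dots, t_d)$ and $x = (x_1, \dots, x_d)$, i.e. $\langle t, x \rangle = \sum_{i=1}^{d} t_i x_i$. For $d = 1$ we have:
$\varphi_X(t) \equiv \varphi_\mu(t) = \int_{\mathbb{R}} e^{itx}\,d\mu(x)$, $\forall\, t \in \mathbb{R}$,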
xA
Rd
r X
(0), r n, r N .
tr
P Xn1
P X 1 .
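Proposition 1.29. In the context of the above definition, we have the following equivalences:
a) $\mu_n \xrightarrow{w} \mu$ if and only if $\lim_{n \to \infty} \int f\,d\mu_n = \int f\,d\mu$ for every bounded, uniformly continuous function $f : \Omega_1 \to \mathbb{R}$.
b) $\mu_n \xrightarrow{w} \mu$ if and only if $\lim_{n \to \infty} \mu_n(A) = \mu(A)$ for every $A \in \mathcal{B}_1$ s.t. $\mu(\partial A) = 0$, where $\partial A$ is the boundary of the set $A$.
c) $X_n \xrightarrow{d} X$ if and only if $\lim_{n \to \infty} \int f(X_n)\,dP =$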
Theme 2
Statistical Indicators
2.1 Introduction to Economic Statistics
Economic Statistics is the science that deals with the collection, classification, analysis and interpretation of numerical facts or data from economics. Using probability theory, it imposes order and regularity on aggregates of disparate elements of the same population.
Statistical population (statistical collectivity): the totality of elements with the same properties that form the object of the investigation.
Statistical unit: the basic element of the statistical population, which is observed within the statistical research; each individual element of the population is a unit.
Statistical characteristic: a common property of all the population units.
Statistical variable: a statistical characteristic which can take different values from one unit to another (or from one group of units to another).
Statistical indicator: a numerical expression of an economic category, obtained by a statistical calculation that characterizes a variable.
Statistical sample: the part of the statistical population that is actually investigated.
Descriptive Statistics: methods for representing and describing the statistical population (data summarizing, tabulation and presentation; analysis of data uniformity, consistency and symmetry; construction of indicators, index numbers, time series; correlation and regression, ...).
Inferential Statistics: methods for drawing conclusions about the whole statistical population by studying the properties of a statistical sample (estimation of population parameters; construction of confidence intervals; testing statistical hypotheses).
The main steps of a statistical research:
1. data collection;
2. data analysis;
3. drawing conclusions and interpreting the results.
The detailed steps of a statistical research:
1. Establishing the objective of the research.
2. Defining and identifying the population to be studied according to the objective.
3. Establishing the set of characteristics according to the information we need to obtain.
4. Analyzing the already existing databases about the studied population, that is, analyzing the secondary data sources.
5. If the secondary data are insufficient, organizing a total research or a partial research (by sampling).
6. Organizing the data collection, which means deciding where, when and how to collect the data for each unit, individually or collectively (using a common recording format, such as a list).
7. Data recording, using a data analysis program such as EXCEL, STATISTICA, MINITAB or SPSS.
8. Data summarizing and presentation (by tables, series, graphs, ...).
9. Data analysis using descriptive statistics and inferential statistics methods.
10. Drawing conclusions and interpreting the results (in research reports).
2.2 Statistical frequency series
Simple frequency distribution (single variation frequency distribution or univariate data): the statistical variable is one-dimensional.
Multidimensional frequency distribution: the statistical variable is multidimensional.
Frequency:
Absolute frequency, denoted by $n_i$, represents the number of units corresponding to a certain variant or falling into a certain class (interval).
Relative frequency, denoted by $f_i$, represents the share of the absolute frequency corresponding to a variant or a class in the total number of frequencies: $f_i = \dfrac{n_i}{n}$, where $n = \sum_i n_i$ is the volume of the distribution.
Cumulated frequencies can be obtained from absolute frequencies or from relative frequencies, and represent the number (or share) of units whose variable value is lower than or equal to the upper limit of the current class.
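The three kinds of frequency can be illustrated with a minimal Python sketch. It uses the failure counts that appear in Exercise 2.1 of these notes; the function name `frequency_table` is an illustrative choice, not something defined in the course.

```python
from collections import Counter
from itertools import accumulate

# Failure counts from Exercise 2.1 (25 hourly records).
data = [12, 15, 29, 23, 17, 7, 10, 14, 14, 27, 22, 8, 5,
        19, 6, 15, 20, 17, 16, 17, 23, 19, 9, 28, 5]

def frequency_table(values):
    """Return rows (variant, n_i, f_i, cumulated n_i, cumulated f_i)."""
    counts = Counter(values)
    n = len(values)                          # volume of the distribution
    variants = sorted(counts)                # distinct values, increasing
    absolute = [counts[x] for x in variants]
    cumulated = list(accumulate(absolute))   # running sums of n_i
    return [(x, n_i, n_i / n, c_i, c_i / n)
            for x, n_i, c_i in zip(variants, absolute, cumulated)]

table = frequency_table(data)
for row in table:
    print(row)
```

The last row always has cumulated absolute frequency $n$ and cumulated relative frequency 1, which is a quick sanity check on any hand-built table.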
2.3 Classification algorithm
A class (interval) of variation of the values (data) of a statistical distribution is defined between two boundaries: its lower and upper limit. The class size (the interval size) represents the difference between the upper limit and the lower limit.
Data grouping should address the following main issues:
the purpose of the classification is to obtain synthetic data;
the grouped results should be homogeneous groups;
their frequency distribution should be as close as possible to the normal distribution (Gauss bell).
The classification algorithm consists of the following steps:
1. Compute the amplitude of the distribution:
$A = \text{maximum value} - \text{minimum value}$.
3. Compute the class size $d = \dfrac{A}{r}$, where $r$ is the number of classes (rounded to an integer!).
4. Construct the classes, by starting with the minimum value and adding the class size $d$ step by step.
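The steps above can be sketched in Python as follows. This is a sketch under stated assumptions: the rule for choosing the number of classes $r$ is not legible in this copy of the notes, so $r$ is passed in by the caller; the class size is rounded up so that the $r$ classes cover the whole amplitude; and the last class is treated as closed on the right. The function name `classify` is hypothetical.

```python
import math

def classify(values, r):
    """Group values into r equal classes [l_{i-1}, l_i); the last class is closed."""
    lo, hi = min(values), max(values)
    amplitude = hi - lo                        # A = maximum value - minimum value
    d = max(1, math.ceil(amplitude / r))       # class size, rounded up (assumption)
    classes = []
    for i in range(r):
        lower = lo + i * d
        upper = lower + d
        is_last = i == r - 1
        # Count the values in [lower, upper); the last class also takes x == upper.
        n_i = sum(1 for x in values
                  if lower <= x < upper or (is_last and x == upper))
        classes.append((lower, upper, n_i))
    return classes

data = [12, 15, 29, 23, 17, 7, 10, 14, 14, 27, 22, 8, 5,
        19, 6, 15, 20, 17, 16, 17, 23, 19, 9, 28, 5]
classes = classify(data, 6)                    # r = 6 is an illustrative choice
for lower, upper, n_i in classes:
    print(lower, upper, n_i)
```

With a different rounding rule for $d$ (for example rounding to the nearest integer) the classes may fail to reach the maximum value, which is why rounding up is used here.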
Exercise 2.1. The numbers of failures of a piece of equipment, recorded over the last 25 hours, are as follows: 12, 15, 29, 23, 17, 7, 10, 14, 14, 27, 22, 8, 5, 19, 6, 15, 20, 17, 16, 17, 23, 19, 9, 28, 5.
a. Construct a frequency distribution and a relative frequency distribution for these data.
b. Construct a line chart for these data.
c. Group these data using the above classification algorithm.
d. Construct a frequency distribution and a relative frequency distribution for the obtained classes (grouped data).
2.4 Classification of statistical indicators
| | Nominal scale | Ordinal scale | Interval scale | Ratio scale |
| Measures of position | Mode | Median, Quintiles | Arithmetic mean | Geometric mean |
| Measures of dispersion | | Quintiles | Standard deviation | Percentage of variation |
| Measures of association | Contingency coefficient | Rank correlation coefficients | Correlation, regression | all the previous methods |
| Significance tests | Chi-square | sign test | t test, Fisher test | all the previous tests |
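2.5 Average measures
2.5.1 The arithmetic mean
For a simple distribution with an ungrouped set of values (data), the arithmetic mean is
$\bar{x} = \dfrac{1}{n} \sum_{i=1}^{n} x_i$,
where $x_1, \dots, x_n$ are the values of the distribution, $n$ being the volume (the size) of the distribution (the number of recorded values).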
For a frequency distribution obtained from a classification by variants, the arithmetic mean (or the weighted arithmetic mean) is
$\bar{x} = \dfrac{\sum_{i=1}^{r} n_i x_i}{\sum_{i=1}^{r} n_i} = \sum_{i=1}^{r} f_i x_i$,
where $x_1, \dots, x_r$ are the variants (the distinct values of the distribution), $n_1, \dots, n_r$ are the corresponding absolute frequencies and $f_1, \dots, f_r$ are the corresponding relative frequencies, $r$ being the number of variants.
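For a frequency distribution obtained from a classification by classes (intervals), the arithmetic mean (or the weighted arithmetic mean) is
$\bar{x} = \dfrac{\sum_{i=1}^{r} n_i x_i}{\sum_{i=1}^{r} n_i} = \sum_{i=1}^{r} f_i x_i$,
where $x_i = \dfrac{l_{i-1} + l_i}{2}$ is the midpoint of the class $[l_{i-1}, l_i)$.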
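The three versions of the arithmetic mean can be compared on the failure data of Exercise 2.1 with a short Python sketch. The grouping into six classes of size 4 is an illustrative assumption, and the helper names are hypothetical.

```python
from collections import Counter

data = [12, 15, 29, 23, 17, 7, 10, 14, 14, 27, 22, 8, 5,
        19, 6, 15, 20, 17, 16, 17, 23, 19, 9, 28, 5]

def mean_ungrouped(values):
    """Simple arithmetic mean: (1/n) * sum of x_i."""
    return sum(values) / len(values)

def mean_weighted(points, freqs):
    """Weighted mean sum(n_i * x_i) / sum(n_i); points are variants or class midpoints."""
    return sum(n_i * x for x, n_i in zip(points, freqs)) / sum(freqs)

# Classification by variants: distinct values with their absolute frequencies.
counts = Counter(data)
variants = sorted(counts)
by_variants = mean_weighted(variants, [counts[x] for x in variants])

# Classification by classes (assumed grouping): limits 5, 9, ..., 29, last class closed.
limits = [5, 9, 13, 17, 21, 25, 29]
class_freqs = [5, 3, 5, 6, 3, 3]
midpoints = [(a + b) / 2 for a, b in zip(limits, limits[1:])]
by_classes = mean_weighted(midpoints, class_freqs)

print(mean_ungrouped(data), by_variants, by_classes)
```

The mean by variants reproduces the ungrouped mean exactly, while the mean by classes is only an approximation, because every value in a class is replaced by the class midpoint.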
2.5.2 The harmonic mean
$\bar{x}_h = \dfrac{\sum_{i=1}^{r} n_i}{\sum_{i=1}^{r} \dfrac{n_i}{x_i}} = \dfrac{1}{\sum_{i=1}^{r} \dfrac{f_i}{x_i}}$.
2.5.3 The geometric mean
For a frequency distribution obtained from a classification by variants or by classes (intervals), the geometric mean (or the weighted geometric mean) is
$\bar{x}_g = \sqrt[n]{\prod_{i=1}^{r} x_i^{n_i}} = \prod_{i=1}^{r} x_i^{f_i}$,
where $n = \sum_{i=1}^{r} n_i$.
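A small Python sketch of the weighted harmonic and geometric means follows. It assumes strictly positive values, computes the geometric mean through logarithms to avoid overflow in the product, and uses made-up variants and frequencies purely for illustration.

```python
import math

def harmonic_mean(points, freqs):
    """Weighted harmonic mean: sum(n_i) / sum(n_i / x_i), for x_i > 0."""
    return sum(freqs) / sum(n_i / x for x, n_i in zip(points, freqs))

def geometric_mean(points, freqs):
    """Weighted geometric mean: n-th root of prod(x_i ** n_i), for x_i > 0."""
    n = sum(freqs)
    return math.exp(sum(n_i * math.log(x) for x, n_i in zip(points, freqs)) / n)

def arithmetic_mean(points, freqs):
    return sum(n_i * x for x, n_i in zip(points, freqs)) / sum(freqs)

# Illustrative data (not from the notes): variants 1, 2, 4 with frequencies 1, 2, 1.
points, freqs = [1, 2, 4], [1, 2, 1]
h = harmonic_mean(points, freqs)
g = geometric_mean(points, freqs)
a = arithmetic_mean(points, freqs)
print(h, g, a)
```

For positive data the three means are always ordered as harmonic ≤ geometric ≤ arithmetic, with equality only when all values coincide.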
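2.5.4 The quadratic mean
2.5.5 Absolute moments
For a simple distribution with an ungrouped set of values (data), the $j$-th absolute moment is
$\overline{m}_j = \dfrac{1}{n} \sum_{i=1}^{n} |x_i|^j$,
and the $j$-th moment is
$m_j = \dfrac{1}{n} \sum_{i=1}^{n} x_i^j$.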
For a frequency distribution obtained from a classification by variants or by classes (intervals), the $j$-th absolute moment is
$\overline{m}_j = \dfrac{\sum_{i=1}^{r} n_i |x_i|^j}{\sum_{i=1}^{r} n_i} = \sum_{i=1}^{r} f_i |x_i|^j$,
and the $j$-th moment is
$m_j = \dfrac{\sum_{i=1}^{r} n_i x_i^j}{\sum_{i=1}^{r} n_i} = \sum_{i=1}^{r} f_i x_i^j$.
We remark that $m_1 = \bar{x}$.
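As a hedged illustration, the moments of a grouped distribution can be computed with one small Python function. The data are made up for the example, and the function name `moment` is hypothetical.

```python
def moment(points, freqs, j, absolute=False):
    """j-th moment sum(n_i * x_i**j) / sum(n_i); with absolute=True uses |x_i|**j."""
    n = sum(freqs)
    return sum(n_i * (abs(x) if absolute else x) ** j
               for x, n_i in zip(points, freqs)) / n

# Illustrative data (not from the notes): variants -2, 1, 3 with frequencies 1, 2, 1.
points, freqs = [-2, 1, 3], [1, 2, 1]
m1 = moment(points, freqs, 1)                      # equals the arithmetic mean
abs_m1 = moment(points, freqs, 1, absolute=True)
m2 = moment(points, freqs, 2)
print(m1, abs_m1, m2)
```

The moment and the absolute moment coincide for even $j$ and for non-negative data; they differ here for $j = 1$ only because one variant is negative.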
2.5.6 Properties of the means
6. The quadratic mean is more strongly influenced by the large values of the variable. This mean is used to compute the standard deviations.
2.6 Position measures
2.6.1 The mode
For a frequency distribution obtained from a classification by variants, the mode is the variant with the highest frequency.
For a frequency distribution obtained from a classification by classes (intervals), the mode is
$M_o = \dfrac{l_{i-1} + l_i}{2} - \dfrac{d}{2} \cdot \dfrac{f_{i+1} - f_{i-1}}{f_{i-1} - 2 f_i + f_{i+1}}$,
where $[l_{i-1}, l_i)$ is the interval with the maximum frequency, called the modal interval, $d = l_i - l_{i-1}$ is the size of the modal interval, $f_i$ is the frequency of the modal interval, and $f_{i-1}$, $f_{i+1}$ are the frequencies of the preceding and following intervals.
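2.6.2 The median
For a simple distribution with an ungrouped set of values sorted in increasing order, $x_1 \le x_2 \le \dots \le x_n$, the median is
$M_e = \begin{cases} x_{\frac{n+1}{2}}, & \text{if } n \text{ is odd}, \\ \dfrac{x_{\frac{n}{2}} + x_{\frac{n}{2}+1}}{2}, & \text{if } n \text{ is even}. \end{cases}$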
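For a frequency distribution obtained from a classification by variants, let $x_1, \dots, x_r$ be the variants, and let $n_1, \dots, n_r$ be the corresponding absolute frequencies.
The median estimation procedure consists of the following steps:
1. Compute the median location:
$Me_{loc} = \begin{cases} \dfrac{n}{2}, & \text{if } n \ge 100, \\ \dfrac{n+1}{2}, & \text{if } n < 100, \end{cases}$
where $n = \sum_{i=1}^{r} n_i$.
2. Compute the cumulated absolute frequencies $\sum_{j=1}^{i} n_j$.
3. The median is the variant corresponding to the first cumulated frequency greater than or equal to the median location:
$M_e = x_{i^*}$, where $i^* = \min\{i \mid \sum_{j=1}^{i} n_j \ge Me_{loc}\}$.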
For a frequency distribution obtained from a classification by intervals (classes), let $[l_0, l_1), [l_1, l_2), \dots, [l_{r-1}, l_r]$ be the intervals, and let $n_1, \dots, n_r$ be the corresponding absolute frequencies of these intervals.
The median estimation procedure consists of the following steps:
1. Compute the median location:
$Me_{loc} = \begin{cases} \dfrac{n}{2}, & \text{if } n \ge 100, \\ \dfrac{n+1}{2}, & \text{if } n < 100, \end{cases}$
where $n = \sum_{i=1}^{r} n_i$.
3. Compute the median interval $[l_{i^*-1}, l_{i^*})$ as the interval corresponding to the minimum (or first) cumulated frequency greater than or equal to the median location:
$i^* = \min\{i \mid \sum_{j=1}^{i} n_j \ge Me_{loc}\}$;
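4. Compute the median by interpolation inside the median interval $[l_{i^*-1}, l_{i^*})$ of size $d$:
$M_e = l_{i^*-1} + d \cdot \dfrac{Me_{loc} - \sum_{j=1}^{i^*-1} n_j}{n_{i^*}}$,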
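The median procedures can be sketched in Python as follows. The rule for the median location is taken from the text; the interpolation inside the median interval is the standard linear one and is an assumption here, since the corresponding formula is only partly legible in this copy. The grouping into classes is illustrative.

```python
from collections import Counter
from itertools import accumulate

data = [12, 15, 29, 23, 17, 7, 10, 14, 14, 27, 22, 8, 5,
        19, 6, 15, 20, 17, 16, 17, 23, 19, 9, 28, 5]

def median_ungrouped(values):
    xs = sorted(values)
    n = len(xs)
    if n % 2 == 1:
        return xs[(n + 1) // 2 - 1]            # x_{(n+1)/2}, shifted to 0-based index
    return (xs[n // 2 - 1] + xs[n // 2]) / 2   # mean of x_{n/2} and x_{n/2+1}

def median_location(n):
    return n / 2 if n >= 100 else (n + 1) / 2

def median_by_variants(variants, freqs):
    loc = median_location(sum(freqs))
    for x, cumulated in zip(variants, accumulate(freqs)):
        if cumulated >= loc:                   # first cumulated frequency >= location
            return x

def median_by_classes(limits, freqs):
    loc = median_location(sum(freqs))
    before = 0                                 # cumulated frequency of earlier classes
    for i, n_i in enumerate(freqs):
        if before + n_i >= loc:
            d = limits[i + 1] - limits[i]
            return limits[i] + d * (loc - before) / n_i   # assumed linear interpolation
        before += n_i

counts = Counter(data)
variants = sorted(counts)
me_simple = median_ungrouped(data)
me_variants = median_by_variants(variants, [counts[x] for x in variants])
me_classes = median_by_classes([5, 9, 13, 17, 21, 25, 29], [5, 3, 5, 6, 3, 3])
print(me_simple, me_variants, me_classes)
```

As with the mean, the value obtained from classes differs slightly from the exact median because the positions of the values inside each class are no longer known.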
2.6.3 Quintiles
2.6.4 Properties of the position measures
1. The mode is a measure of central tendency widely used in sales analysis. Its main advantage is that it can also be computed for qualitative variables; its main disadvantage is that a distribution may be multi-modal (have more than one modal value).
2. The main advantage of the median is that extreme values do not affect it as strongly as they affect the mean. The median is also easy to compute and can be used for ordinal qualitative data. Its main disadvantage is that it does not take into account all the observations.
3. For a symmetrical unimodal distribution, the mean, the median and the mode are identical. For a skewed distribution the mean, the median and the mode are located in different places.
Exercise 2.2. The numbers of failures of a piece of equipment, recorded over the last 25 hours, are as follows: 12, 15, 29, 23, 17, 7, 10, 14, 14, 27, 22, 8, 5, 19, 6, 15, 20, 17, 16, 17, 23, 19, 9, 28, 5.
Calculate the above statistical indicators in each of the following cases:
a. simple distribution with an ungrouped set of values;
b. frequency distribution obtained from a classification by variants.
c. frequency distribution obtained from a classification by intervals.
2.7 Variation measures
2.7.1 Simple measures of dispersion
$\dfrac{(M_e - Q_1) + (Q_3 - M_e)}{2} = \dfrac{Q_3 - Q_1}{2}$.
It measures how far from the median we should go on either side before including 50% of the observations.
4. The individual deviations:
the absolute deviation: $d_i = x_i - \bar{x}$;
the relative deviation: $d'_i = \dfrac{x_i - \bar{x}}{\bar{x}} \cdot 100$.
They provide information only for each recorded value and do not express the overall variation.
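2.7.2 Average deviation measures
1. The mean absolute deviation:
For a simple distribution with an ungrouped set of values (data), the mean absolute deviation is
$MAD = \dfrac{1}{n} \sum_{i=1}^{n} |x_i - \bar{x}|$.
For a frequency distribution obtained from a classification by variants or by classes (intervals), the mean absolute deviation is
$MAD = \dfrac{\sum_{i=1}^{r} n_i |x_i - \bar{x}|}{\sum_{i=1}^{r} n_i} = \sum_{i=1}^{r} f_i |x_i - \bar{x}|$.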
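2. The variance:
For a simple distribution with an ungrouped set of values (data), the variance is
$\sigma^2 = \dfrac{1}{n} \sum_{i=1}^{n} (x_i - \bar{x})^2$,
and the rectified variance is
$s^2 = \dfrac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2$.
For a frequency distribution obtained from a classification by variants or by classes (intervals), the variance is
$\sigma^2 = \dfrac{\sum_{i=1}^{r} n_i (x_i - \bar{x})^2}{\sum_{i=1}^{r} n_i} = \sum_{i=1}^{r} f_i (x_i - \bar{x})^2$,
and the rectified variance is
$s^2 = \dfrac{\sum_{i=1}^{r} n_i (x_i - \bar{x})^2}{\sum_{i=1}^{r} n_i - 1}$.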
The rectified variance is an unbiased estimate: its average over many samples extracted from the same population tends to the population variance.
The variance is expressed in the squared measurement units of the variable, so it has no direct concrete interpretation. It is used to compute the standard deviation and other variation and correlation measures.
2
, > 0,
2
we have:
at least 75% of the values will fall within 2 standard deviations
from the mean of the distribution ( = 2);
at least 88.89% of the values will fall within 3 standard deviations from the mean ( = 3).
4. The coefficient of variation: $v = \dfrac{\sigma}{\bar{x}}$.
The rectified coefficient of variation: $v' = \dfrac{s}{\bar{x}}$.
Some average measures of dispersion are expressed in the same concrete measurement units as the variable. When comparing two or more distributions we cannot use these measures, because the measurement units may differ. This inconvenience is overcome by using the relative dispersion measures. The coefficient of variation is the main relative dispersion measure.
If $v > 0.5$ then the mean is not representative for the data set and the population is heterogeneous.
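The average deviation measures of this subsection can be computed together, as in the following Python sketch on the ungrouped failure data of Exercise 2.1. The dictionary keys are illustrative names, not notation from the notes.

```python
import math

data = [12, 15, 29, 23, 17, 7, 10, 14, 14, 27, 22, 8, 5,
        19, 6, 15, 20, 17, 16, 17, 23, 19, 9, 28, 5]

def dispersion(values):
    """Mean absolute deviation, variance, rectified variance and coefficients of variation."""
    n = len(values)
    mean = sum(values) / n
    squares = sum((x - mean) ** 2 for x in values)     # sum of squared deviations
    variance = squares / n                             # sigma^2
    rectified = squares / (n - 1)                      # s^2
    sigma = math.sqrt(variance)
    return {
        "mean": mean,
        "mad": sum(abs(x - mean) for x in values) / n,
        "variance": variance,
        "rectified_variance": rectified,
        "sigma": sigma,
        "v": sigma / mean,                             # coefficient of variation
        "v_rectified": math.sqrt(rectified) / mean,
    }

result = dispersion(data)
for name, value in result.items():
    print(name, round(value, 4))
```

For these data the coefficient of variation is about 0.44, below the 0.5 threshold quoted in the text, so the mean can still be regarded as representative.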
5. The central moments:
For a simple distribution with an ungrouped set of values (data), the $j$-th central moment is
$m^c_j = \dfrac{1}{n} \sum_{i=1}^{n} (x_i - \bar{x})^j$.
We remark that $m^c_2 = \sigma^2$.
Remark 2.1. For the frequency distributions obtained from a classification by classes (intervals) of equal size $d$, one uses the following Sheppard's corrections for the first four moments and central moments:
$M_1 = m_1$; $M^c_1 = m^c_1 = 0$;
$M_2 = m_2 - \dfrac{d^2}{12}$; $M^c_2 = m^c_2 - \dfrac{d^2}{12}$;
$M_3 = m_3 - \dfrac{d^2}{4} m_1$; $M^c_3 = m^c_3$;
$M_4 = m_4 - \dfrac{d^2}{2} m_2 + \dfrac{7 d^4}{240}$; $M^c_4 = m^c_4 - \dfrac{d^2}{2} m^c_2 + \dfrac{7 d^4}{240}$.
2.7.3 Shape measures
For a perfectly symmetric distribution the mean, the median and the mode are equal. This is the case for the Gauss bell shape (the normal distribution). In this case the influence of the random factors is characterized by a certain regularity: the influences are distributed evenly in both directions around the arithmetic mean.
To analyze the shape of an arbitrary distribution one needs to compare the mean, the median and the mode. An arbitrary distribution can be symmetric, slightly skewed or highly skewed. For a skewed distribution the mean, the median and the mode are located in different places. More precisely:
$\dfrac{3(\bar{x} - M_e)}{\sigma}$.
$\dfrac{(Q_3 - M_e) - (M_e - Q_1)}{(Q_3 - M_e) + (M_e - Q_1)} = \dfrac{Q_1 + Q_3 - 2 M_e}{Q_3 - Q_1}$.
It takes values between -1 and 1 and is also close to zero for a symmetrical distribution.
4. The excess coefficient:
$E_s = \dfrac{m^c_4}{\sigma^4} - 3$.
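The shape measures whose formulas survive in this section can be sketched in Python as follows. The function names are hypothetical; the first coefficient, $3(\bar{x} - M_e)/\sigma$, is the one commonly called Pearson's second skewness coefficient, and the quartiles are passed in directly because their definition is not legible in this copy.

```python
import math

def central_moment(values, j):
    """j-th central moment (1/n) * sum((x_i - mean)**j)."""
    n = len(values)
    mean = sum(values) / n
    return sum((x - mean) ** j for x in values) / n

def skewness_mean_median(values, median):
    """3 * (mean - median) / sigma."""
    mean = sum(values) / len(values)
    sigma = math.sqrt(central_moment(values, 2))
    return 3 * (mean - median) / sigma

def skewness_quartiles(q1, median, q3):
    """(Q1 + Q3 - 2 * Me) / (Q3 - Q1), always between -1 and 1."""
    return (q1 + q3 - 2 * median) / (q3 - q1)

def excess(values):
    """m4c / sigma**4 - 3, using sigma**4 = (m2c)**2."""
    return central_moment(values, 4) / central_moment(values, 2) ** 2 - 3

values = [1, 2, 3, 4, 5]                     # symmetric illustrative data
print(skewness_mean_median(values, 3))       # symmetric data give zero skewness
print(skewness_quartiles(2, 3, 6))           # longer right side gives a positive value
print(excess(values))                        # flatter than the normal curve: negative
```

A negative excess coefficient indicates a distribution flatter than the normal curve, a positive one a more peaked distribution.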
Theme 3
Two-dimensional statistical distributions
3.1
This method is used to approximate a function when only a partial set of its values is known. In this way we obtain the trend of the given function.
Let $f : A \subseteq \mathbb{R} \to \mathbb{R}$ be a function and let
$f(x_i)$, $i \in \{1, \dots, n\}$
be the given values, where $x_1, x_2, \dots, x_n \in A$.
We will approximate the function $f$ by a trend function $g : A \to \mathbb{R}$.
Usually, $g$ is a polynomial function
$g(x) = a_0 + a_1 x + a_2 x^2 + \dots + a_k x^k$, where $k \le n$
(in particular, $g$ can be linear, $g(x) = a_0 + a_1 x$, or quadratic, $g(x) = a_0 + a_1 x + a_2 x^2$), a hyperbolic function
$g(x) = a_0 + \dfrac{a_1}{x}$,
an exponential function
$g(x) = a_0 + a_1 e^x$,
or a logarithmic function
$g(x) = a_0 + a_1 \ln x$.
The Least Squares Method consists of the following steps:
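The coefficients $a_0, \dots, a_k$ are chosen to minimize the error sum of squares $\sum_{i=1}^{n} [f(x_i) - g(x_i)]^2$. Setting its partial derivatives with respect to $a_0, \dots, a_k$ equal to zero gives a linear system, that is
$\begin{cases} n a_0 + a_1 \sum_{i=1}^{n} x_i + a_2 \sum_{i=1}^{n} x_i^2 + \dots + a_k \sum_{i=1}^{n} x_i^k = \sum_{i=1}^{n} f(x_i) \\ a_0 \sum_{i=1}^{n} x_i + a_1 \sum_{i=1}^{n} x_i^2 + a_2 \sum_{i=1}^{n} x_i^3 + \dots + a_k \sum_{i=1}^{n} x_i^{k+1} = \sum_{i=1}^{n} x_i f(x_i) \\ \dots \\ a_0 \sum_{i=1}^{n} x_i^k + a_1 \sum_{i=1}^{n} x_i^{k+1} + a_2 \sum_{i=1}^{n} x_i^{k+2} + \dots + a_k \sum_{i=1}^{n} x_i^{2k} = \sum_{i=1}^{n} x_i^k f(x_i) \end{cases}$ (3.1)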
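In particular, for a linear trend function $g(x) = a_0 + a_1 x$ the system becomes
$\begin{cases} n a_0 + a_1 \sum_{i=1}^{n} x_i = \sum_{i=1}^{n} f(x_i) \\ a_0 \sum_{i=1}^{n} x_i + a_1 \sum_{i=1}^{n} x_i^2 = \sum_{i=1}^{n} x_i f(x_i). \end{cases}$ (3.2)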
Remark 3.1. If we select two or more trend functions of different types, the best approximation is given by the minimum error sum of squares $\sum_{i=1}^{n} [f(x_i) - g(x_i)]^2$.
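Example 3.1. The sales of a company for the last five months are as follows:
| Month | Jan | Feb | Mar | Apr | May |
| Sales | 20 | 25 | 35 | 45 | 60 |
[Figure: plot of the sales values 20, 25, 35, 45, 60 against the coded months $x = -2, \dots, 2$.]
For the linear trend $g(x) = a_0 + a_1 x$, the system (3.2) becomes
$\begin{cases} 5 a_0 + a_1 \sum_{i=1}^{5} x_i = \sum_{i=1}^{5} f(x_i) \\ a_0 \sum_{i=1}^{5} x_i + a_1 \sum_{i=1}^{5} x_i^2 = \sum_{i=1}^{5} x_i f(x_i). \end{cases}$
The coefficients of this system are calculated in the following table (see the columns corresponding to $x_i$, $f(x_i)$, $x_i^2$, $x_i f(x_i)$):
| $i$ | $x_i$ | $f(x_i)$ | $x_i^2$ | $x_i f(x_i)$ | $f(x_i) - g(x_i)$ |
| 1 | -2 | 20 | 4 | -40 | 3 |
| 2 | -1 | 25 | 1 | -25 | -2 |
| 3 | 0 | 35 | 0 | 0 | -2 |
| 4 | 1 | 45 | 1 | 45 | -2 |
| 5 | 2 | 60 | 4 | 120 | 3 |
| $\sum$ | 0 | 185 | 10 | 100 | |
Therefore
$5 a_0 = 185$, $10 a_1 = 100$,
and hence
$a_0 = 37$, $a_1 = 10$,
so the linear trend is $g(x) = 37 + 10x$.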
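For the quadratic trend $g(x) = a_0 + a_1 x + a_2 x^2$, the system (3.1) becomes
$\begin{cases} 5 a_0 + a_1 \sum_{i=1}^{5} x_i + a_2 \sum_{i=1}^{5} x_i^2 = \sum_{i=1}^{5} f(x_i) \\ a_0 \sum_{i=1}^{5} x_i + a_1 \sum_{i=1}^{5} x_i^2 + a_2 \sum_{i=1}^{5} x_i^3 = \sum_{i=1}^{5} x_i f(x_i) \\ a_0 \sum_{i=1}^{5} x_i^2 + a_1 \sum_{i=1}^{5} x_i^3 + a_2 \sum_{i=1}^{5} x_i^4 = \sum_{i=1}^{5} x_i^2 f(x_i). \end{cases}$
| $i$ | $x_i$ | $f(x_i)$ | $x_i^2$ | $x_i^3$ | $x_i^4$ | $x_i f(x_i)$ | $x_i^2 f(x_i)$ |
| 1 | -2 | 20 | 4 | -8 | 16 | -40 | 80 |
| 2 | -1 | 25 | 1 | -1 | 1 | -25 | 25 |
| 3 | 0 | 35 | 0 | 0 | 0 | 0 | 0 |
| 4 | 1 | 45 | 1 | 1 | 1 | 45 | 45 |
| 5 | 2 | 60 | 4 | 8 | 16 | 120 | 240 |
| $\sum$ | 0 | 185 | 10 | 0 | 34 | 100 | 390 |
Therefore
$5 a_0 + 10 a_2 = 185$, $10 a_1 = 100$, $10 a_0 + 34 a_2 = 390$,
and hence
$a_0 = \dfrac{239}{7}$, $a_1 = 10$, $a_2 = \dfrac{10}{7}$.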
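The Least Squares Method for a polynomial trend can be checked numerically with the following Python sketch, which builds the normal equations (3.1) and solves them exactly with rational arithmetic. The sales values are those of Example 3.1; the function names and the choice of Gauss-Jordan elimination are illustrative assumptions, not the method prescribed by the notes.

```python
from fractions import Fraction

def fit_polynomial(xs, ys, k):
    """Solve the normal equations for a degree-k trend; returns [a_0, ..., a_k]."""
    # Row u of the augmented matrix: sum_v a_v * sum_i x_i**(u+v) = sum_i x_i**u * f(x_i).
    rows = [[Fraction(sum(x ** (u + v) for x in xs)) for v in range(k + 1)]
            + [Fraction(sum(x ** u * y for x, y in zip(xs, ys)))]
            for u in range(k + 1)]
    # Gauss-Jordan elimination; raises StopIteration if the system is singular.
    for col in range(k + 1):
        pivot = next(r for r in range(col, k + 1) if rows[r][col] != 0)
        rows[col], rows[pivot] = rows[pivot], rows[col]
        pivot_value = rows[col][col]
        rows[col] = [entry / pivot_value for entry in rows[col]]
        for r in range(k + 1):
            if r != col and rows[r][col] != 0:
                factor = rows[r][col]
                rows[r] = [a - factor * b for a, b in zip(rows[r], rows[col])]
    return [row[-1] for row in rows]

def error_sum_of_squares(xs, ys, coeffs):
    """Sum of [f(x_i) - g(x_i)]**2 for the polynomial with the given coefficients."""
    return sum((y - sum(a * x ** j for j, a in enumerate(coeffs))) ** 2
               for x, y in zip(xs, ys))

xs = [-2, -1, 0, 1, 2]            # coded months
ys = [20, 25, 35, 45, 60]         # sales from Example 3.1
linear = fit_polynomial(xs, ys, 1)
quadratic = fit_polynomial(xs, ys, 2)
print(linear, error_sum_of_squares(xs, ys, linear))
print(quadratic, error_sum_of_squares(xs, ys, quadratic))
```

Following Remark 3.1, the trend with the smaller error sum of squares is the better approximation; for these data the quadratic trend fits more closely than the linear one.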
3.2
m
X
j=1
n
X
fij ,
fi
fij , fj =
i=1
m
X
j=1
n
X
i=1
y1
f11
...
...
yj
f1j
...
...
ym
f1m
Total
f1
xi
..
.
fi1
...
fij
...
fim
fi
xn
Total
fn1
f1
...
...
fnj
fj
...
...
fnm
fm
fn
(absolute frequencies);
y1
f11
...
...
yj
f1j
...
...
ym
f1m
Total
f1
xi
..
.
fi1
...
fij
...
fim
fi
xn
Total
fn1
f1
...
...
fnj
fj
...
...
fnm
fm
fn
(relative frequencies).
Let the two-dimensional distribution of Z = (X, Y ) as above.
The conditional distribution of Y given the event X = xi has the
values y1 , . . . , ym , the corresponding absolute frequencies fi1 , . . . , fim ,
and the corresponding relative frequencies
fi1
fim
fim
fi1
= ,...,
= , suppose that fi > 0.
fi
fi
fi
fi
f1j
fnj
f1j
fnj
= ,...,
= , suppose that fj > 0.
fj
fj
fj
fj
muv =
fij
n X
m
X
i=1 j=1
i=1 j=1
We remark that
m10 = x and m01 = y,
where
n
P
x=
fi xi
i=1
n
P
i=1
=
fi
n
X
i=1
fi xi
m
P
y=
fj yj
j=1
m
P
m
X
fj
fj yj
j=1
j=1
$\alpha_j = m_{X/Y=y_j} = \dfrac{\sum_{i=1}^{n} f_{ij} x_i}{f_{\bullet j}}$, where $f_{\bullet j} = \sum_{i=1}^{n} f_{ij}$.
The function
$A(y_j) = \alpha_j$, $j \in \{1, \dots, m\}$
is called the regression function of the mean of $X$ with respect to $Y$.
We remark that the regression functions $B(x)$ and $A(y)$ can be approximated by the Least Squares Method.
3.3
We remark that
2
mc20 = X
and mc02 = Y2 ,
where
n P
m
P
2
X
i=1 j=1
n P
m
P
=
fij
n
X
fi (xi x)2
i=1
i=1 j=1
and
n P
m
P
Y2
i=1 j=1
n P
m
P
=
fij
m
X
fj (yj y)2
j=1
i=1 j=1
The conditional variance of $Y$ given the event $X = x_i$ (the variance inside the $x_i$-group) is the variance of the conditional distribution of $Y$ given $X = x_i$, i.e.
$\sigma^2_{Y/X=x_i} = \dfrac{\sum_{j=1}^{m} f_{ij} (y_j - \beta_i)^2}{f_{i\bullet}}$,
where $\beta_i$ denotes the conditional mean of $Y$ given $X = x_i$ and $f_{i\bullet} = \sum_{j=1}^{m} f_{ij}$.
(The overall variation results from the combination of the random factors acting within each group and the essential factors that determine the variation from one group to another.)
3.4
The correlations (the dependence) that can be found between two variables
X and Y are classified as follows:
- According to the way of change we can have:
  - positive correlation (direct dependence): if X is increasing then Y will also increase, and if X is decreasing then Y will also decrease;
  - negative correlation (opposite dependence): if X is increasing then Y will decrease, and if X is decreasing then Y will increase.
- According to the intensity of the correlation we can have:
  - high intensity (strong or tight);
  - medium intensity;
  - low intensity.
- According to the shape of the correlation we can have:
  - linear correlation;
  - nonlinear correlation, such as exponential growth or logarithmic decrease, for example.
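Let Z = (X, Y) have the two-dimensional distribution given above. The degree
of correlation between the variables X and Y can be measured by using the
following indicators.
1. The covariance of X and Y (or of the two-dimensional distribution
of Z = (X, Y)), i.e.
cov(X, Y) = m^c_{11} = \frac{\sum_{i=1}^{n}\sum_{j=1}^{m} f_{ij} (x_i - \bar{x})(y_j - \bar{y})}{\sum_{i=1}^{n}\sum_{j=1}^{m} f_{ij}} = \sum_{i=1}^{n}\sum_{j=1}^{m} f^*_{ij} (x_i - \bar{x})(y_j - \bar{y}).
2. The coefficient of correlation of X and Y, i.e.
\rho = \rho(X, Y) = \frac{cov(X, Y)}{\sigma_X \sigma_Y}.
It takes values between -1 and 1.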
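As a minimal sketch of these two indicators, the following code computes the covariance and the coefficient of correlation from a two-way table of absolute frequencies. It assumes the usual definition of the correlation coefficient (covariance divided by the product of the standard deviations); the data and the function name are hypothetical.

```python
import math


def covariance_and_correlation(xs, ys, table):
    """Covariance and correlation coefficient from absolute frequencies f_ij.

    table[i][j] is the frequency of the pair (xs[i], ys[j]).
    """
    n_total = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]           # f_i.
    col_totals = [sum(col) for col in zip(*table)]     # f_.j
    x_bar = sum(f * x for f, x in zip(row_totals, xs)) / n_total
    y_bar = sum(f * y for f, y in zip(col_totals, ys)) / n_total
    cov = sum(table[i][j] * (xs[i] - x_bar) * (ys[j] - y_bar)
              for i in range(len(xs)) for j in range(len(ys))) / n_total
    var_x = sum(f * (x - x_bar) ** 2 for f, x in zip(row_totals, xs)) / n_total
    var_y = sum(f * (y - y_bar) ** 2 for f, y in zip(col_totals, ys)) / n_total
    rho = cov / math.sqrt(var_x * var_y)
    return cov, rho


# Hypothetical table: X takes the values 1, 2 and Y the values 1, 2, 3.
cov, rho = covariance_and_correlation([1, 2], [1, 2, 3], [[2, 1, 0], [0, 1, 2]])
print(cov, rho)
```

A table concentrated on an increasing straight line gives a coefficient of 1, and one concentrated on a decreasing line gives -1.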
Y
(x x).
X
Y
(x x).
X
Y
(x x).
X
Y2 /X
Y2
3.0 3.1 3.2 3.3 3.4 3.5 3.6 3.7 3.8 3.9 4.0
1
1
1
1
2
1
2
2
1
3
5
2
1
2
1
3
5
4
1
1
1
1
1
10
3
1
1
1
1
11
2
1
1
1
12
1
1
1
2
1
1
2
2
1
3.5
u1
x1
y1
u2
x2
y2
...
...
...
un
xn
yn
\rho_S = 1 - \frac{6 \sum_{i=1}^{n} d_i^2}{n^3 - n},
where
d_i = a_i - b_i, i \in \{1, \ldots, n\}
(the rank differences between the variables).
We remark that Spearman's coefficient of correlation of the ranks
\rho_S is precisely the coefficient of correlation \rho(A, B) of A and B, where A
and B are the statistical variables that represent the ranks of X and
Y, respectively. The distributions of A and B are represented in the
following table:
Units     | u_1  u_2  ...  u_n
A values  | a_1  a_2  ...  a_n
B values  | b_1  b_2  ...  b_n
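The rank-difference formula can be sketched as follows. The code assumes that there are no ties and that rank 1 is given to the smallest value (a convention chosen here for illustration); the helper names are hypothetical.

```python
def ranks(values):
    """Ranks 1..n of the values (rank 1 = smallest); assumes no ties."""
    order = sorted(range(len(values)), key=lambda idx: values[idx])
    result = [0] * len(values)
    for rank, idx in enumerate(order, start=1):
        result[idx] = rank
    return result


def spearman(x, y):
    """Spearman's coefficient 1 - 6 * sum(d_i^2) / (n^3 - n)."""
    n = len(x)
    a, b = ranks(x), ranks(y)
    d_squared = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return 1 - 6 * d_squared / (n ** 3 - n)


print(spearman([10, 25, 13, 14], [17, 23, 15, 12]))  # hypothetical observations
```

Identical rankings give 1 and completely reversed rankings give -1.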
n
X
2(P Q)
,
n2 n
Pi , Q =
i=1
n
X
Qi ,
i=1
1
10
17
11
30
28
2
25
23
12
15
13
3 4 5
13 14 28
15 12 26
13 14 15
23 4 26
25 10 27
6
16
18
16
12
5
7
6
8
17
21
19
8
8
13
18
19
14
9
24
20
19
29
29
10
17
22
20
18
24
Theme 4
Time series and forecasting
Usually, a time series Y = (y_i)_i (i being the time) is influenced by the
following factors (components):
- the trend (the tendency);
- the cyclical factor;
- the seasonal factor;
- the random factor (the irregular factor).
The main decomposition models for a time series Y = (y_i)_i are:
- The additive model:
  y_i = T_i + C_i + S_i + R_i,
  where T_i, C_i, S_i, R_i represent the trend, the cyclical, the seasonal and
  the random components, respectively.
  This model assumes that the components are independent and that they
  have the same measurement unit.
- The multiplicative model:
  y_i = T_i \cdot C_i \cdot S_i \cdot R_i.
  This model assumes that the components depend on each other or that they
  have different measurement units.
4.1
4.2
The cyclical variation of a time series is the component that tends to oscillate
above and below the trend line for periods longer than 1 year (if the time
series is composed of annual data). This component explains most of the
variation that remains unexplained by the trend component.
The cyclical component can be expressed as:
- The cyclical variation:
  C_i = y_i - \tilde{y}_i,
  where
  y_i is the value of the time series Y at time i;
  \tilde{y}_i = T_i is the estimated trend value of the time series Y at the same
  time i.
- The cycle:
  \frac{y_i}{\tilde{y}_i} \cdot 100.
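A minimal sketch of both expressions is given below. The trend values are the estimated sales tabulated in this section; the observed values are an assumption, back-computed here as trend plus the tabulated cyclical variation, and the function name is hypothetical.

```python
def cyclical_components(observed, trend):
    """Return the cyclical variations y_i - trend_i and the cycles 100 * y_i / trend_i."""
    variation = [y - t for y, t in zip(observed, trend)]
    cycle = [100 * (y / t) for y, t in zip(observed, trend)]
    return variation, cycle


# Estimated trend values (from the sales table of this section).
trend = [5.8, 5.9, 6.0, 6.1, 6.2, 6.3, 6.4, 6.5, 6.6]
# Observed sales: an assumption, reconstructed as trend + cyclical variation.
observed = [5.7, 5.9, 6.0, 6.2, 6.3, 6.3, 6.4, 6.4, 6.6]

variation, cycle = cyclical_components(observed, trend)
print([round(c, 1) for c in variation])
print([round(c, 1) for c in cycle])
```

A cycle value above 100 means the series lies above its trend at that time, and a value below 100 means it lies below.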
Estimated sales (\tilde{y}_i) | 5.8   5.9  6  6.1  6.2  6.3  6.4  6.5   6.6
Cyclical variation            | -0.1  0    0  0.1  0.1  0    0    -0.1  0
4.3
The seasonal variation of a time series is the repetitive and predictable movement around the trend line within 1 year or less. To detect the seasonal
variation, the time intervals need to be measured in small periods such as
quarters, months or weeks.
Let Y = (y_i)_{i=\overline{1,n}} be a time series, and let k be the number of equal
periods in each year.
The seasonal component can be expressed as:
The moving average value for each time interval:
If k is odd, the moving average value corresponding to y_i is
\bar{y}_i = \frac{1}{k} \left( y_{i-\frac{k-1}{2}} + \cdots + y_i + \cdots + y_{i+\frac{k-1}{2}} \right),
for all i \in \left\{ 1 + \frac{k-1}{2}, \ldots, n - \frac{k-1}{2} \right\}.
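The moving average for odd k can be sketched as follows. The function name and the sample series are hypothetical; the result is keyed by the 1-based index i used in the formula.

```python
def moving_average_odd(y, k):
    """Centered moving averages of an odd length k.

    Returns a dict {i: average}, where i is the 1-based time index and runs
    from 1 + (k-1)/2 to n - (k-1)/2.
    """
    if k < 1 or k % 2 == 0:
        raise ValueError("k must be a positive odd integer")
    half = (k - 1) // 2
    n = len(y)
    return {
        pos + 1: sum(y[pos - half:pos + half + 1]) / k
        for pos in range(half, n - half)
    }


print(moving_average_odd([1, 2, 3, 4, 5], 3))  # hypothetical series
```

The first and last (k-1)/2 terms of the series get no moving average, because the window would run past the ends of the data.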
Quarter II
130
133
137
136
141
Theme 5
The interest
5.1
65
D
(S0 , t), S0 0, t 0
t
S
(S0 , t), S0 0, t 0.
t
F (S0 , x)dx, S0 0, t 0;
D(S0 , t) =
0
F (S0 , x)dx, S0 0, t 0
St S(S0 , t) = S0 +
0
F (100, x)dx.
S
(S0 , t)
t
S(S0 , t)
, t 0
ln S
F (S0 , t)
(S0 , t) =
, t 0.
t
S(S0 , t)
66
(x)dx
St S(S0 , t) = S0 e
, S0 0, t 0;
Z t
(x)dx
D(S0 , t) = S0 e 0
1 , S0 0, t 0.
0
5.2
Equivalence of investments
Definition 5.6. A multiple (financial) investment consists of n initial values S_{01}, S_{02}, ..., S_{0n} invested over the times t_1, t_2, ..., t_n, with the
annual interest rates i_1, i_2, ..., i_n (or with the annual interest percentages
p_1, p_2, ..., p_n). Let D(S_{01}, t_1), D(S_{02}, t_2), ..., D(S_{0n}, t_n) be the corresponding interests, and let S_1 = S(S_{01}, t_1), S_2 = S(S_{02}, t_2), ..., S_n = S(S_{0n}, t_n)
be the corresponding final values. The sums \sum_{k=1}^{n} S_{0k}, \sum_{k=1}^{n} D(S_{0k}, t_k) and \sum_{k=1}^{n} S_k
are called the total initial value, the total interest and the total final value of the given multiple investment, respectively. This multiple investment
can be written in two forms:
\begin{pmatrix} S_{01} & t_1 & i_1 \\ S_{02} & t_2 & i_2 \\ \vdots & \vdots & \vdots \\ S_{0n} & t_n & i_n \end{pmatrix} (if the initial values are known), or
\begin{pmatrix} t_1 & i_1 & S_1 \\ t_2 & i_2 & S_2 \\ \vdots & \vdots & \vdots \\ t_n & i_n & S_n \end{pmatrix} (if the final values are known).
Definition 5.7. We saythat two multiple
are equivalent
by
investments
0
S01 t1 i1
S01
t01 i01
0
0
S02 t2 i2 S 0
I 02 t2 i2
interest and we denote ..
.. ..
.
..
.. if the corre .
. . ..
.
.
0
0
0
S0n tn in
S0m tm im
n
m
P
P
0
sponding total interest are equal, i.e.
D(S0k , tk ) =
D(S0k
, t0k ).
k=1
k=1
67
are equivalent
0
0
0
t1 i1 S1
t1 i1 S1
t0 i0 S 0
t2 i2 S2
2
P 2 2
and we denote .. .. .. ..
..
.. if the corresponding
. . .
.
.
.
0
0
0
tn in Sn
tm im Sm
n
m
P
P
0
total initial values are equal, i.e.
S0k =
S0k
.
k=1
Definition 5.8. If
S01 t1 i1
S02 t2 i2
..
.. ..
.
. .
S0n tn in
k=1
I
I
I
(CI)
S0 , t, i S0 , t(CI) , i S0 , t, i(CI) ,
(CI)
then the initial value S0 , the time of investment t(CI) and the annual interest rate i(CI) are called commonly replacements by interest.
Definition
S01 t1
S02 t2
..
..
.
.
S0n tn
5.9. If
i1
i2
I
..
.
in
(M I)
S0
t1 i1
S01 t1 i(M I)
S01 t(M I) i1
(M I)
(M I)
(M I)
i2
t2 i2
S0
I S02 t2 i
I S02 t
.
..
..
..
..
..
.. .. ..
..
.
.
.
.
.
.
. .
(M I)
(M I)
(M I)
S0n tn i
S0n t
in
S0
tn in
(M I)
then the initial value S0 , the time of investment t(M I) and the annual
interest rate i(M I) are called meanly replacements by interest.
Definition 5.10.
t1 i1
t2 i2
.. ..
. .
tn in
If
S1
S2
..
.
P
P
P
t, i, S (CP ) t(CP ) , i, S t, i(CP ) , S .
Sn
then the final value S (CP ) , the time of investment t(CP ) and the annual interest rate i(CP ) are called commonly replacements by present value.
Definition 5.11. If
t1 i1 S1
t2 i2 S2
P
.. .. ..
. . .
tn in Sn
t1 i1 S (M P )
t2 i2 S (M P )
.. ..
..
. .
.
(M P )
tn in S
t(M P ) i1 S1
t1 i(M P ) S1
(M P )
t(M P ) i2 S2
S2
P t2 i
..
.. .. ..
..
.. ,
.
.
. .
.
.
(M P )
(M P )
t
in Sn
tn i
Sn
68
then the final value S (M P ) , the time of investment t(M P ) and the annual
interest rate i(M P ) are called meanly replacements by present value.
5.3
5.3.1
Simple interest
Basic formulas
Definition 5.12. If the principal is not actualized over the time of investment, then we say that we obtain a simple interest.
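Proposition 5.3. For simple interest we have:
D \equiv D(S_0, t) = S_0 i t = \frac{S_0 p t}{100};
S_t = S_0 (1 + it); \quad S_0 = \frac{S_t}{1 + it};
i = \frac{D}{S_0 t} = \frac{S_t - S_0}{S_0 t}, \quad t = \frac{D}{S_0 i} = \frac{S_t - S_0}{S_0 i}.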
Corollary 5.3. If t =
Remark 5.6. If the time of investment t is given as the period from the initial
date (d_1, m_1, y_1) to the final date (d_2, m_2, y_2) (where d_i, m_i, y_i represent the
day, the number of the month and the year of the date), then we have three
conventions (procedures) to calculate the simple interest:
1. D = \frac{S_0 i h}{365} or D = \frac{S_0 i h}{366};
2. D = \frac{S_0 i h}{360};
3. D = \frac{S_0 i h}{360},
where
h = 360(y_2 - y_1) + 30(m_2 - m_1) + d_2 - d_1
(this assumes that all months have 30 days, and is called the 30-day month
convention).
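The 30-day month convention can be sketched in a few lines. The function names and the sample dates are hypothetical; dates are passed as (day, month, year) triples, in the order used by the remark.

```python
def days_30_360(start, end):
    """Number of days h between two (day, month, year) dates, 30-day months."""
    d1, m1, y1 = start
    d2, m2, y2 = end
    return 360 * (y2 - y1) + 30 * (m2 - m1) + d2 - d1


def simple_interest_30_360(s0, i, start, end):
    """Simple interest D = S0 * i * h / 360 under the 30-day month convention."""
    return s0 * i * days_30_360(start, end) / 360


# Hypothetical example: 1000 u.c. at 10% from 15 March 2020 to 15 September 2020.
print(days_30_360((15, 3, 2020), (15, 9, 2020)))
print(simple_interest_30_360(1000, 0.10, (15, 3, 2020), (15, 9, 2020)))
```

Here h = 180, so the interest is half of one year's interest.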
5.3.2
m
X
ik tk
k=1
St = S0 1 +
m
X
!
ik tk
k=1
S0 =
1+
St
m
P
k=1
5.3.3
70
k=1
it
n
P
(M I)
S0
n
P
S0k ik tk
, t
(CI)
k=1
n
P
, t
(M I)
i k tk
k=1
5.4
5.4.1
k=1
S0 i
n
P
S0k ik tk
n
P
S0k ik tk
, i
(CI)
S0 t
n
P
S0k ik tk
k=1
n
P
S0k ik tk
k=1
, i
(M I)
S0k ik
k=1
S0k ik tk
k=1
n
P
.
S0k tk
k=1
Compound interest
Basic formulas
St
(1 + i)t
1
1
=
,
u
1+i
71
S0 = St v t .
Remark 5.8. If
t = n + \frac{h}{k}
(n being the integer part and \frac{h}{k} being the fractional part, i.e. the time
of investment covers only h periods from a total of k equal periods in the last
year), then we have two conventions (procedures) to calculate the compound
interest:
1. The rational procedure: we apply compound interest for the integer part and simple interest for the fractional part, and hence
S_t \equiv S_{n+\frac{h}{k}} = S_0 (1 + i)^n \left( 1 + i \frac{h}{k} \right) (the compounding formula);
D = S_0 \left[ (1 + i)^n \left( 1 + i \frac{h}{k} \right) - 1 \right] (the interest formula).
2. The commercial procedure: we extend the compound interest to the
fractional part, and hence
S_t \equiv S_{n+\frac{h}{k}} = S_0 (1 + i)^{n+\frac{h}{k}} = S_0 (1 + i)^n \sqrt[k]{(1 + i)^h} (the compounding formula);
D = S_0 \left[ (1 + i)^{n+\frac{h}{k}} - 1 \right] = S_0 \left[ (1 + i)^n \sqrt[k]{(1 + i)^h} - 1 \right] (the interest formula).
5.4.2
1 + i = \left( 1 + \frac{j_k}{k} \right)^k,
where
i_k = \frac{j_k}{k} is called the interest rate per interest period (period
interest rate);
i is called the effective rate or the real rate (annual interest rate).
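The two procedures for a fractional investment time, and the passage from a period rate to the effective rate, can be compared with a short sketch. It assumes the usual relation 1 + i = (1 + j_k/k)^k between the effective rate i and the period rate i_k = j_k/k; all function names and numbers are hypothetical.

```python
def final_value_rational(s0, i, n, h, k):
    """Compound interest for n whole years, simple interest for the fraction h/k."""
    return s0 * (1 + i) ** n * (1 + i * h / k)


def final_value_commercial(s0, i, n, h, k):
    """Compound interest extended to the fractional part h/k."""
    return s0 * (1 + i) ** (n + h / k)


def effective_rate(j_k, k):
    """Effective annual rate for k interest periods per year at period rate j_k / k."""
    return (1 + j_k / k) ** k - 1


# Hypothetical example: 1000 u.c. at 10% for 2 years and 6 months (h = 6, k = 12).
print(final_value_rational(1000, 0.10, 2, 6, 12))
print(final_value_commercial(1000, 0.10, 2, 6, 12))
print(effective_rate(0.12, 12))
```

For 0 < h < k the rational procedure gives a slightly larger final value than the commercial one, since 1 + i h/k is at least (1 + i)^{h/k}.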
5.4.3
m
Y
l=1
hl
(1 + il )
1 + il
kl
nl
l=1
St = S0
m
Y
(1 + il )tl
l=1
5.5
Loans
The amortization table for a loan of size (original balance) V_0 u.c. over
n years at an annual interest rate i has the following form:

Year | Remaining principal | Interest part     | Principal part | Payment (Rate)    | Remaining principal
     | (year's start)      |                   |                |                   | (year's end)
1    | V_0                 | d_1 = V_0 i       | Q_1            | T_1 = d_1 + Q_1   | V_1 = V_0 - Q_1
2    | V_1                 | d_2 = V_1 i       | Q_2            | T_2 = d_2 + Q_2   | V_2 = V_1 - Q_2
...  |                     |                   |                |                   |
k    | V_{k-1}             | d_k = V_{k-1} i   | Q_k            | T_k = d_k + Q_k   | V_k = V_{k-1} - Q_k
...  |                     |                   |                |                   |
n    | V_{n-1}             | d_n = V_{n-1} i   | Q_n            | T_n = d_n + Q_n   | V_n = V_{n-1} - Q_n = 0
Obviously, we have:
V_0 = Q_1 + Q_2 + \cdots + Q_n, \quad V_{n-1} = Q_n,
T_n = Q_n u, \quad T_{k+1} - T_k = Q_{k+1} - Q_k u,
where u = 1 + i is the annual compounding factor.
There are two main procedures to calculate the payments of a loan:
1. The fixed-principal amortization: Q_1 = Q_2 = \cdots = Q_n = Q.
In this case, we have:
Q = \frac{V_0}{n};
T_{k+1} - T_k = -Q i (arithmetic progression);
T_k = Q[1 + (n - k + 1) i].
2. The fixed-rate amortization: T_1 = T_2 = \cdots = T_n = T.
In this case, we have:
T = V_0 \frac{i}{1 - v^n}; \quad Q_k = V_0 \frac{i}{u^n - 1} u^{k-1},
where u = 1 + i is the annual compounding factor, and v = \frac{1}{u} = \frac{1}{1+i}
is the annual discounting factor.
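Both procedures can be traced year by year with a small sketch that follows the columns of the amortization table. The function name, its arguments and the loan data are hypothetical.

```python
def amortization_table(v0, i, n, fixed_principal=True):
    """Rows (year, interest part, principal part, payment, remaining principal).

    fixed_principal=True  -> Q_1 = ... = Q_n = V0 / n
    fixed_principal=False -> constant payment, Q_k = V0 * i / (u^n - 1) * u^(k-1)
    """
    u = 1 + i                      # annual compounding factor
    balance = v0                   # V_{k-1}, remaining principal at the year's start
    rows = []
    for k in range(1, n + 1):
        interest = balance * i     # d_k = V_{k-1} * i
        if fixed_principal:
            principal = v0 / n
        else:
            principal = v0 * i / (u ** n - 1) * u ** (k - 1)
        payment = interest + principal   # T_k = d_k + Q_k
        balance -= principal             # V_k = V_{k-1} - Q_k
        rows.append((k, interest, principal, payment, balance))
    return rows


# Hypothetical loan: 1000 u.c. over 3 years at 10% per year.
for row in amortization_table(1000, 0.10, 3, fixed_principal=False):
    print(tuple(round(value, 2) for value in row))
```

With fixed principal the payments decrease by Q i each year; with the fixed-rate procedure every payment equals V_0 i / (1 - v^n), and in both cases the remaining principal after the last year is zero up to rounding.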
Remark 5.9. Inflation changes the purchasing power of money. After
n years, the purchasing power of S_n u.c. is reduced to
S_0 = \frac{S_n}{(1 + a_1)(1 + a_2) \cdots (1 + a_n)},
5.6
74
Problems
Theme 6
Introduction to Actuarial Math
6.1
In an insurance model the insurer agrees to pay the insured one or more
amounts called claims (claim payments), at fixed times or when the
insured event occurs. In return for these claims, the insured pays one or
more amounts called premiums.
Usually the insured events are random events.
For a mutually advantageous insurance, the present value (at the initial
moment of the insurance) of the premiums needs to be equal to the present
value of the claims. These values are also called actuarial present values.
Definition 6.1. For a given insurance, the single premium payable at the
initial moment of the insurance is
P = E(X),
where E(X) denotes the mean of the random variable X that represents the
present value of the claim.
Theorem 6.1. Let A be an insurance consisting of the partial insurances
A_1, A_2, ..., A_n (n \in N^*), and let P_1, P_2, ..., P_n be the single premiums corresponding to these partial insurances. Then the single premium of the total
insurance A is
P = P_1 + P_2 + \cdots + P_n.
Proof. Let X be the random variable that represents the present value of the
total insurance A and let X_1, X_2, ..., X_n be the random variables representing the present values of the partial insurances A_1, A_2, ..., A_n, respectively. We
have
X = X_1 + X_2 + \cdots + X_n,
and hence
P = E(X) = E(X_1 + X_2 + \cdots + X_n) = E(X_1) + E(X_2) + \cdots + E(X_n)
= P_1 + P_2 + \cdots + P_n.
6.2
Biometric functions
6.2.1
m|n qx
(6.1)
(6.2)
77
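{}_0p_x = 1; \quad {}_0q_x = 0; (6.3)
{}_1p_x = p_x; \quad {}_1q_x = q_x; (6.4)
{}_{0|n}q_x = {}_nq_x; (6.5)
{}_{n+m}p_x = {}_np_x \cdot {}_mp_{x+n}; (6.6)
{}_np_x = p_x \cdot p_{x+1} \cdot \ldots \cdot p_{x+n-1}; (6.7)
{}_nq_x = q_x + p_x q_{x+1} + p_x p_{x+1} q_{x+2} + \ldots + p_x p_{x+1} \ldots p_{x+n-2} q_{x+n-1}; (6.8)
{}_{m|n}q_x = {}_mp_x \cdot {}_nq_{x+m}; (6.9)
{}_{m|n}q_x = {}_{m+n}q_x - {}_mq_x = {}_mp_x - {}_{m+n}p_x. (6.10)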
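Proof. Equalities (6.1), (6.2), (6.3), (6.4) and (6.5) are obvious.
Denote by A(x, y) the event that a person of age x will attain age y.
Then
A(x, x + n + m) = A(x, x + n) \cap A(x + n, x + n + m),
and the events A(x, x + n) and A(x + n, x + n + m) are independent. Hence
{}_{n+m}p_x = P(A(x, x + n + m)) = P(A(x, x + n)) \cdot P(A(x + n, x + n + m)) = {}_np_x \cdot {}_mp_{x+n}
(where P(A) represents the probability of event A). Using (6.6) and (6.4)
we have
{}_np_x = p_x \cdot p_{x+1} \cdot \ldots \cdot p_{x+n-1}.
Also, we have
{}_nq_x = P(\overline{A(x, x + n)})
= P\left( \overline{A(x, x + 1)} \cup [A(x, x + 1) \cap \overline{A(x + 1, x + 2)}] \cup [A(x, x + 2) \cap \overline{A(x + 2, x + 3)}] \cup \ldots \cup [A(x, x + n - 1) \cap \overline{A(x + n - 1, x + n)}] \right)
= q_x + p_x q_{x+1} + p_x p_{x+1} q_{x+2} + \ldots + p_x p_{x+1} \ldots p_{x+n-2} q_{x+n-1}.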
m|n qx
78
6.2.2
(6.11)
Proof. Obviously,
lx = E(X),
where X is the random variable that represents the number of survivors at
age x. Let
0
...
n
...
l0
X:
x (0) . . . x (n) . . . x (l0 )
be the distribution of X, where, for any n {0, . . . , l0 }, x (n) denotes the
probability that the number of survivors at age x is equal to n. We have
x (n) = Cln0 (x p0 )n (x q0 )l0 n , n {0, . . . , l0 }.
Then X has a binomial distribution of parameters l0 and p(0, x). Therefore
lx = E(X) = l0 x p0 .
79
(6.12)
(6.13)
(6.14)
(6.15)
where
dx = lx lx+1 .
(6.16)
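Proof. Equation (6.12) is an immediate consequence of (6.11) and (6.7). Using (6.6) and (6.11) we have
{}_np_x = \frac{{}_{x+n}p_0}{{}_xp_0} = \frac{l_{x+n}}{l_0} \cdot \frac{l_0}{l_x} = \frac{l_{x+n}}{l_x}, and {}_nq_x = 1 - {}_np_x = \frac{l_x - l_{x+n}}{l_x}.
Also,
{}_{m|n}q_x = {}_mp_x \cdot {}_nq_{x+m} = \frac{l_{x+m}}{l_x} \cdot \frac{l_{x+m} - l_{x+m+n}}{l_{x+m}} = \frac{l_{x+m} - l_{x+m+n}}{l_x}.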
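The survival and death probabilities can be read directly from the numbers of survivors l_x. The sketch below uses the first male l_x values (ages 18 to 23) of the life table given later in this theme; the function names are hypothetical.

```python
def n_p_x(l, x, n):
    """Probability that a person of age x survives n more years: l_{x+n} / l_x."""
    return l[x + n] / l[x]


def n_q_x(l, x, n):
    """Probability that a person of age x dies within n years."""
    return (l[x] - l[x + n]) / l[x]


def deferred_q(l, x, m, n):
    """Probability that a person of age x survives m years and dies in the next n."""
    return (l[x + m] - l[x + m + n]) / l[x]


# Male survivors l_x for ages 18..23, taken from the life table of this theme.
l_male = {18: 100000, 19: 99930, 20: 99860, 21: 99760, 22: 99654, 23: 99543}

print(n_p_x(l_male, 18, 2))          # survive from 18 to 20
print(n_q_x(l_male, 18, 1))          # q_18
print(deferred_q(l_male, 18, 1, 2))  # die between ages 19 and 21
```

The value of q_18 obtained this way, 0.0007, agrees with the q_x column of the table.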
6.2.3
80
e_x = \frac{1}{2} + \frac{1}{l_x} \sum_{n=1}^{\omega - x} l_{x+n}. (6.17)
Proof. Obviously,
e_x = E(Y),
where Y is the random variable that represents the future lifetime for a
person of age x. Let
1
1
1 !
... n +
... x +
Y :
2
2
2
x (0) . . . x (n) . . . x ( x)
be the distribution of Y , where, for any n {0, . . . , x}, x (n) represents
the probability that a person of age x will live only n (i.e. will die at age
x + n). We have
x (n) = n|1 qx = n px qx+n , n {0, . . . , x},
Using (6.13) and (6.16) it follows that
x (n) =
dx+n
lx+n lx+n+1
lx+n dx+n
=
=
, n {0, . . . , x}. (6.18)
lx
lx+n
lx
lx
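Hence
e_x = E(Y) = \sum_{n=0}^{\omega - x} \left( n + \frac{1}{2} \right) \frac{l_{x+n} - l_{x+n+1}}{l_x}
= \frac{1}{l_x} \sum_{n=0}^{\omega - x} \left[ \left( n + \frac{1}{2} \right) l_{x+n} - \left( n + 1 + \frac{1}{2} \right) l_{x+n+1} + l_{x+n+1} \right]
= \frac{1}{l_x} \left[ \frac{1}{2} l_x - \left( \omega - x + 1 + \frac{1}{2} \right) l_{\omega+1} + \sum_{n=0}^{\omega - x} l_{x+n+1} \right] = \frac{1}{2} + \frac{1}{l_x} \sum_{n=1}^{\omega - x} l_{x+n},
since l_{\omega+1} = 0.
Remark 6.8. According to (6.17) and (6.13) we obtain:
e_x = \frac{1}{2} + \sum_{n=1}^{\omega - x} {}_np_x, \quad x \in N, x \le \omega. (6.19)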
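The closed formula for the average remaining lifetime can be compared with the direct computation of E(Y). The sketch uses a deliberately tiny, hypothetical table of survivors (not real data), with the limiting age taken as the last age at which l_x is positive; the function names are hypothetical as well.

```python
def limiting_age(l):
    """Last age with a positive number of survivors (omega in the text)."""
    return max(age for age, survivors in l.items() if survivors > 0)


def life_expectancy(l, x):
    """e_x = 1/2 + (1 / l_x) * sum of l_{x+n} for n = 1, ..., omega - x."""
    omega = limiting_age(l)
    return 0.5 + sum(l[x + n] for n in range(1, omega - x + 1)) / l[x]


def life_expectancy_direct(l, x):
    """E(Y): deaths at age x + n are assumed to occur, on average, at n + 1/2."""
    omega = limiting_age(l)
    return sum((n + 0.5) * (l[x + n] - l.get(x + n + 1, 0))
               for n in range(0, omega - x + 1)) / l[x]


toy_table = {0: 100, 1: 60, 2: 20, 3: 0}   # hypothetical survivors, omega = 2
print(life_expectancy(toy_table, 0), life_expectancy_direct(toy_table, 0))
```

Both computations give 1.3 for the toy table, as the telescoping argument in the proof predicts.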
6.2.4
81
Life tables
Nr. of survivors
lx
l0 = 100000
Nr. of deaths
dx
Probab. of death
qx
ex
= 100
Usually, the values lx are derived by a census. The values dx , qx and ex are
calculated according to (6.16), (6.15) and (6.17), respectively.
The following actuarial table shows the life expectancy for the Romanian
population in 2008 (www.pensiileprivate.ro).
x   | l_x MALE | l_x FEMALE | q_x MALE | q_x FEMALE | x + e_x MALE | x + e_x FEMALE
18  | 100000   | 100000     | 0.0007   | 0.0004     | 66.7         | 73
19  | 99930    | 99960      | 0.0007   | 0.0004     | 66.7         | 73
20  | 99860    | 99920      | 0.001    | 0.0004     | 66.8         | 73
21  | 99760    | 99880      | 0.0011   | 0.0004     | 66.8         | 73
22  | 99654    | 99838      | 0.0011   | 0.0004     | 66.9         | 73
23  | 99543    | 99794      | 0.0012   | 0.0005     | 66.9         | 73.1
24  | 99425    | 99748      | 0.0012   | 0.0005     | 67           | 73.1
25  | 99302    | 99700      | 0.0013   | 0.0005     | 67           | 73.1
26  | 99173    | 99651      | 0.0014   | 0.0005     | 67.1         | 73.1
27  | 99032    | 99597      | 0.0015   | 0.0006     | 67.1         | 73.2
28  | 98880    | 99539      | 0.0017   | 0.0006     | 67.2         | 73.2
29  | 98716    | 99477      | 0.0018   | 0.0007     | 67.3         | 73.2
30  | 98540    | 99412      | 0.0019   | 0.0007     | 67.3         | 73.2
31  | 98353    | 99342      | 0.0021   | 0.0008     | 67.4         | 73.3
32  | 98144    | 99261      | 0.0023   | 0.0009     | 67.5         | 73.3
33  | 97914    | 99167      | 0.0026   | 0.0011     | 67.6         | 73.3
34  | 97664    | 99062      | 0.0028   | 0.0012     | 67.7         | 73.4
35  | 97392    | 98945      | 0.003    | 0.0013     | 67.7         | 73.4
36  | 97100    | 98817      | 0.0036   | 0.0015     | 67.8         | 73.5
37  | 96754    | 98670      | 0.0041   | 0.0017     | 68           | 73.5
38  | 96356    | 98507      | 0.0047   | 0.0018     | 68.1         | 73.6
39  | 95905    | 98325      | 0.0052   | 0.002      | 68.2         | 73.7
lx
MALE
95402
94849
94231
93548
92804
91998
91133
90209
89228
88191
87101
85960
84731
83417
82024
80556
79017
77374
75633
73803
71891
69907
67814
65625
63353
61011
58614
56111
53524
50875
48183
45471
42371
38981
35399
31727
28059
24483
21072
17886
lx
FEMALE
98127
97911
97670
97404
97114
96799
96461
96088
95683
95245
94774
94272
93717
93112
92456
91752
91000
90182
89302
88361
87361
86304
85125
83829
82423
80911
79301
77493
75501
73342
71032
68588
65336
61387
56877
51959
46789
41524
36311
31280
qx
MALE
0.0058
0.0065
0.0072
0.008
0.0087
0.0094
0.0101
0.0109
0.0116
0.0124
0.0131
0.0143
0.0155
0.0167
0.0179
0.0191
0.0208
0.0225
0.0242
0.0259
0.0276
0.0299
0.0323
0.0346
0.037
0.0393
0.0427
0.0461
0.0495
0.0529
0.0563
0.0682
0.08
0.0919
0.1037
0.1156
0.1275
0.1393
0.1512
0.163
qx
FEMALE
0.0022
0.0025
0.0027
0.003
0.0032
0.0035
0.0039
0.0042
0.0046
0.0049
0.0053
0.0059
0.0065
0.007
0.0076
0.0082
0.009
0.0098
0.0105
0.0113
0.0121
0.0137
0.0152
0.0168
0.0183
0.0199
0.0228
0.0257
0.0286
0.0315
0.0344
0.0474
0.0604
0.0735
0.0865
0.0995
0.1125
0.1255
0.1386
0.1516
x + ex
MALE
68.4
68.5
68.7
68.9
69.1
69.3
69.5
69.8
70
70.3
70.5
70.8
71.1
71.4
71.7
72
72.3
72.7
73
73.4
73.7
74.1
74.5
74.9
75.3
75.7
76.1
76.6
77
77.4
77.9
78.3
78.8
79.4
80
80.6
81.3
82
82.7
83.4
82
x + ex
FEMALE
73.7
73.8
73.9
74
74.1
74.2
74.3
74.4
74.5
74.6
74.7
74.8
75
75.1
75.3
75.4
75.6
75.8
76
76.2
76.3
76.5
76.7
77
77.2
77.4
77.7
77.9
78.2
78.5
78.8
79.1
79.5
79.9
80.4
81
81.6
82.2
82.9
83.6
6.3
lx
MALE
14970
12352
10045
8050
6356
4942
3785
2854
2118
1546
1111
785
545
372
250
165
107
68
42
26
15
lx
FEMALE
26538
22170
18232
14757
11751
9205
7091
5370
3996
2922
2099
1480
1024
696
463
303
194
122
75
45
26
qx
MALE
0.1749
0.1868
0.1986
0.2105
0.2223
0.2342
0.2461
0.2579
0.2698
0.2816
0.2935
0.3054
0.3172
0.3291
0.3409
0.3528
0.3647
0.3765
0.3884
0.4002
1
qx
FEMALE
0.1646
0.1776
0.1906
0.2037
0.2167
0.2297
0.2427
0.2557
0.2688
0.2818
0.2948
0.3078
0.3208
0.3339
0.3469
0.3599
0.3729
0.3859
0.399
0.412
1
x + ex
MALE
84.2
85
85.8
86.6
87.4
88.3
89.1
90
90.9
91.7
92.6
93.5
94.4
95.3
96.2
97
97.8
98.6
99.3
99.8
101.8
83
x + ex
FEMALE
84.3
85.1
85.9
86.7
87.5
88.3
89.2
90
90.9
91.7
92.6
93.5
94.4
95.2
96.1
97
97.8
98.6
99.3
99.8
101.9
Problems
Exercise 6.1. Calculate the probability that a 30-year-old person will live
at least 35 years but at most 55 years.
Exercise 6.2. Consider a family consisting of a 45-year-old husband and a 43-year-old
wife.
a) Calculate the probability that both spouses will die in the same year.
b) Calculate the probability that both spouses will die at the same age.
Exercise 6.3. Calculate the average remaining lifetime and the life expectancy for a 50-year-old person.
Exercise 6.4. Calculate the probability that a 60-year-old person will die
before the integer number of years of his average remaining lifetime.
Exercise 6.5. For a 35-year-old person, calculate the life expectancy and
the age of death having the maximum probability.
Theme 7
Life annuities
7.1
In a person insurance, the claims are payments made while the insured survives.
We have the following classifications.
1. By period, the claims can be:
   - annual (annuities);
   - semiannual;
   - quarterly;
   - monthly.
2. By amount, the claims can be:
   - constant;
   - variable.
3. By time of payment, the claims can be:
   - annuity-due, when the claims are paid at the beginning of each
     period;
   - annuity-immediate, when the claims are paid at the end of
     each period.
4. By time of first payment, the claims can be:
   - immediate;
   - deferred.
84
85
7.2
Single claim
{}_nE_x = \frac{D_{x+n}}{D_x}, (7.1)
where
D_x = v^x l_x, (7.2)
v = \frac{1}{1+i} being the annual discounting factor, i being the annual interest
rate.
Proof. For a mutually advantageous insurance, the single premium {}_nE_x needs
to be equal to the present value of the single claim, that is
{}_nE_x = E(X),
where X is the random variable that represents the present value of the claim.
We have
X = v^n, if the insured survives at least n years from the time of insurance issue, and X = 0 otherwise.
Hence the distribution of X is
X : \begin{pmatrix} v^n & 0 \\ {}_np_x & {}_nq_x \end{pmatrix}.
Therefore
{}_nE_x = E(X) = v^n \cdot {}_np_x + 0 \cdot {}_nq_x = v^n \frac{l_{x+n}}{l_x} = \frac{v^{x+n} l_{x+n}}{v^x l_x} = \frac{D_{x+n}}{D_x}.
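{}_{y+z}E_x = {}_yE_x \cdot {}_zE_{x+y}, \quad x, y, z \in N s.t. x + y \le \omega. (7.3)
7.3
Life annuities-immediate
7.3.1
Whole life annuities
a_x = \frac{N_{x+1}}{D_x}, (7.4)
where
N_x = D_x + D_{x+1} + \cdots + D_\omega. (7.5)
Proof. By Theorem 6.1 we have
a_x = {}_1E_x + {}_2E_x + \cdots + {}_{\omega-x}E_x.
Using (7.1) and (7.5) we obtain
a_x = \frac{D_{x+1}}{D_x} + \frac{D_{x+2}}{D_x} + \cdots + \frac{D_\omega}{D_x} = \frac{N_{x+1}}{D_x}.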
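The commutation numbers D_x and N_x, and the two premiums built from them, can be sketched as follows. The table of survivors is a tiny hypothetical one with consecutive ages (not real data), the interest rate is arbitrary, and the function names are introduced only for illustration.

```python
def commutation(l, i):
    """Commutation numbers D_x = v^x * l_x and N_x = D_x + D_{x+1} + ... + D_omega.

    `l` maps consecutive ages to numbers of survivors; `i` is the annual rate.
    """
    v = 1.0 / (1.0 + i)                 # annual discounting factor
    D = {x: v ** x * lx for x, lx in l.items()}
    N, running = {}, 0.0
    for x in sorted(l, reverse=True):   # accumulate from the oldest age down
        running += D[x]
        N[x] = running
    return D, N


def pure_endowment(D, x, n):
    """Single premium for 1 u.c. paid after n years if the insured survives."""
    return D[x + n] / D[x]


def whole_life_annuity_immediate(D, N, x):
    """Single premium a_x = N_{x+1} / D_x for 1 u.c. at the end of each year."""
    return N.get(x + 1, 0.0) / D[x]


toy_table = {0: 100, 1: 60, 2: 20}      # hypothetical survivors, omega = 2
D, N = commutation(toy_table, 0.10)
print(whole_life_annuity_immediate(D, N, 0))
print(sum(pure_endowment(D, 0, n) for n in (1, 2)))
```

The two printed values coincide, which is the content of the proof: the annuity is the sum of the single claims payable after 1, 2, ..., omega - x years.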
87
Nx+1
.
Dx
7.3.2
r| ax
Nx+r+1
.
Dx
(7.6)
= r+1 Ex + r+2 Ex + + x Ex .
Dx+r+1 Dx+r+2
D
Nx+r+1
+
+ +
=
.
Dx
Dx
Dx
Dx
(7.7)
7.4
88
Nx+1 Nx+r+1
.
Dx
(7.8)
Dx+1 Dx+2
Dx+r
Nx+1 Nx+r+1
+
+ +
=
.
Dx
Dx
Dx
Dx
7.5
(7.9)
In this case the claims are payable at the end of each k-th period of the year.
7.5.1
89
kh
h
Dx + Dx+1 .
k
k
(7.10)
Dx+1 Dx
kh
h
=
Dx + Dx+1 .
k
k
k
Nx+1 k 1
+
.
Dx
2k
(7.11)
a(k)
x
1 XX
=
h Ex .
k n=0 h=1 n+ k
(7.12)
x k
k x
1 X X Dx+n+ hk
1 XX k h
h
=
=
Dx+n + Dx+n+1
k n=0 h=1 Dx
k Dx h=1 n=0
k
k
!
x
x
k
1 X khX
hX
=
Dx+n +
Dx+n+1
k Dx h=1
k n=0
k n=0
90
k
1 X kh
h
=
(Dx + Nx+1 ) + Nx+1
k Dx h=1
k
k
k
1
k(k + 1)
1 X kh
Dx + Nx+1 =
k
Dx + kNx+1
=
k Dx h=1
k
k Dx
2k
=
Nx+1 k 1
+
.
Dx
2k
a(k)
x = ax +
a(1)
x
7.5.2
(k)
r| ax
Nx+r+1 k 1 Dx+r
+
.
Dx
2k
Dx
(7.13)
rx k
1 X X
h Ex .
k n=0 h=1 r+n+ k
(7.14)
91
rx k
1 X X
(k)
=
r Ex n+ h Ex+r = r Ex ax+r ,
k
k n=0 h=1
Corollary 7.6. Let x, r N, x + r , k N and T 0. The single premium payable by a person of age x for an r-year deferred whole life
annuity-immediate of T u.c. per each k-th period of the year is
Nx+r+1 k 1 Dx+r
(k)
+
.
T k r| ax = T k
Dx
2
Dx
Remark 7.6. By (7.13), (7.1), (7.11) and (7.6) it follows that
= r Ex ax+r , x, r N s.t. x + r , k N ,
(k)
0| ax
= a(k)
x , x N, x , k N ,
(k)
x| ax
(1)
r| ax
7.5.3
(k)
(k)
r| ax
(7.15)
= 0, x N, x , k N ,
= r| ax , x, r N, x .
92
(k)
ax: re
1 XX
1 XX
1 XX
=
Ex =
h Ex
h Ex
n+ h
n+
k n=0 h=1 k
k n=0 h=1 k
k n=r h=1 n+ k
x k
rx k
1 XX
1 X X
(k)
r| ax(k) ,
=
h Ex
h Ex = ax
k n=0 h=1 n+ k
k n=0 h=1 r+n+ k
and using (7.11) and (7.13) we obtain the equality from enounce.
Corollary 7.7. Let x, r N, x + r , k N and T 0. The single
premium payable by a person of age x for an r-year temporary life annuityimmediate of T u.c. per each k-th period of the year is
Nx+1 Nx+r+1 k 1
Dx+r
(k)
T k ax: re = T k
+
1
.
Dx
2
Dx
Remark 7.7. By (7.11), (7.16), (7.13), (7.15) and (7.8) it follows that
(k)
(k)
a(k)
x = ax: re + r| ax , x, r N s.t. x + r , k N ,
(k)
(k)
ax: re = a(k)
x r Ex ax+r , x, r N s.t. x + r , k N ,
(k)
ax: 0e = 0, x N, x , k N ,
(k)
ax: xe = a(k)
x , x N, x , k N ,
(1)
ax: re = ax: re , x, r N, x .
7.6
7.6.1
Pension
Annually pension
r| ax
ax: re
Nx+r+1
.
Nx+1 Nx+r+1
(7.17)
93
Proof. For a mutually advantageous insurance, the present values (at the
initial moment of the insurance) of the premiums need to be equal to the
present value of the pensions. By Definition 7.6, Corollary 7.4, Definition 7.5
and Proposition 7.3 we have
Px: re (r| ax ) ax: re = r| ax , so Px: re (r| ax )
Nx+1 Nx+r+1
Nx+r+1
=
,
Dx
Dx
7.6.2
Nx+r+1
.
Nx+1 Nx+r+1
Monthly pension
(12)
(12)
r| ax
(12)
ax: re
24Nx+r+1 + 11Dx+r
.
24(Nx+1 Nx+r+1 ) + 11(Dx Dx+r )
(7.18)
Proof. For a mutually advantageous insurance, the present values (at the
initial moment of the insurance) of the premiums need to be equal to the
present value of the pensions. By Definition 7.10, Corollary 7.7, Definition
7.9 and Proposition 7.6 we have
(12)
(12)
(12)
Px: re (r| a(12)
x ) 12 ax: re = 12 r| ax ,
so
(12)
Px: re (r| a(12)
x )
Nx+1 Nx+r+1 11
12
+
Dx
2
Dx+r
Nx+r+1 11 Dx+r
1
= 12
+
,
Dx
Dx
2 Dx
94
7.7
24Nx+r+1 + 11Dx+r
.
24(Nx+1 Nx+r+1 ) + 11(Dx Dx+r )
Problems
Exercise 7.1. Calculate the single premium payable by a 30-year-old person
for a single claim of 10000$ over 35 years if the person survives. The annual
interest percentage is 8%.
Exercise 7.2. Calculate the single premium payable by a 30-year-old person for a whole life annuity-immediate of 12000 RON per year. The annual
interest percentage is 14%.
Exercise 7.3. Calculate the single premium payable by a 30-year-old person
for a 35-year deferred whole life annuity-immediate of 12000 RON per year.
The annual interest percentage is 14%.
Exercise 7.4. Calculate the single premium payable by a 30-year-old person
for a 35-year temporary life annuity-immediate of 12000 RON per year. The
annual interest percentage is 14%.
Exercise 7.5. Calculate the single premium payable by a 30-year-old person
for a whole life annuity-immediate of 1000 RON per month. The annual
interest percentage is 14%.
Exercise 7.6. Calculate the single premium payable by a 30-year-old person
for a 35-year deferred whole life annuity-immediate of 1000 RON per month.
The annual interest percentage is 14%.
Exercise 7.7. Calculate the single premium payable by a 30-year-old person
for a 35-year temporary life annuity-immediate of 1000 RON per month. The
annual interest percentage is 14%.
Exercise 7.8. Calculate the annuity-immediate premium payable by a
30-year-old person for an annuity-immediate pension of 12000 RON per year.
The annual interest percentage is 14% and the age of retirement is 65 years.
Exercise 7.9. Calculate the monthly-immediate premium payable by a
30-year-old person for a monthly pension of 1000 RON per month. The annual
interest percentage is 14% and the age of retirement is 65 years.
Theme 8
Life insurances
8.1
In a life insurance, the single claim is payable at the moment of death, if the
death occurs in the period covered by the insurance. The life insurance can
be:
- immediate and unlimited, when the claim is paid at the moment
  of death, whenever this occurs;
- deferred, when the claim is paid only if the insured dies after a fixed
  term from the time of insurance issue;
- temporary (limited), when the claim is paid only if the insured
  dies within a fixed term from the time of insurance issue.
8.2
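A_x = \frac{M_x}{D_x}, (8.1)
where
M_x = C_x + C_{x+1} + \cdots + C_\omega, \quad C_x = d_x v^{x+\frac{1}{2}},
v = \frac{1}{1+i} being the annual discounting factor, i being the annual interest
rate.
Proof. For a mutually advantageous insurance, the single premium A_x needs
to be equal to the present value of the single claim, that is
A_x = E(X),
where X is the random variable that represents the present value of the claim.
Assuming that the deaths are uniformly distributed throughout the year, and recalling from (6.18) that the probability that a person of age x dies at age x + n is \frac{d_{x+n}}{l_x}, we
have
A_x = E(X) = \sum_{n=0}^{\omega - x} \frac{d_{x+n}}{l_x} v^{n+\frac{1}{2}} = \sum_{n=0}^{\omega - x} \frac{d_{x+n} v^{x+n+\frac{1}{2}}}{l_x v^x} = \sum_{n=0}^{\omega - x} \frac{C_{x+n}}{D_x} = \frac{M_x}{D_x}.
T A_x = T \frac{M_x}{D_x}.
A_x = \sqrt{v} (1 - i a_x). (8.3)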
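Relation (8.3) between the whole life insurance and the whole life annuity-immediate can be verified numerically. The sketch assumes that the claim is paid, on average, in the middle of the year of death (discount factor v^{n+1/2}, as in the proof above); the table of survivors is tiny and hypothetical, and the function names are introduced only for illustration.

```python
import math


def whole_life_insurance(l, i, x):
    """A_x = M_x / D_x with C_y = d_y * v^(y + 1/2) and D_x = v^x * l_x."""
    v = 1.0 / (1.0 + i)
    omega = max(l)                       # last tabulated age
    m_x = sum((l[y] - l.get(y + 1, 0)) * v ** (y + 0.5)
              for y in range(x, omega + 1))
    d_x = v ** x * l[x]
    return m_x / d_x


def whole_life_annuity_immediate(l, i, x):
    """a_x: sum over n >= 1 of v^n * l_{x+n} / l_x."""
    v = 1.0 / (1.0 + i)
    omega = max(l)
    return sum(v ** (y - x) * l[y] for y in range(x + 1, omega + 1)) / l[x]


toy_table = {0: 100, 1: 60, 2: 20}       # hypothetical survivors, omega = 2
rate = 0.10
a_0 = whole_life_annuity_immediate(toy_table, rate, 0)
print(whole_life_insurance(toy_table, rate, 0))
print(math.sqrt(1 / (1 + rate)) * (1 - rate * a_0))
```

Both printed values agree, so the insurance premium can be obtained from the annuity premium without computing the numbers M_x.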
8.3
97
r| Ax
r| Ax
Mx+r
.
Dx
(8.4)
r| Ax
= E(r| X),
v x+ 2
x ( x)
.
r| Ax
= E(r| X) =
x (n)v
n+ 12
n=r
n=r
x
X
Cx+n
Dx
n=r
x
X
dx+n
lx
n+ 12
1
x
X
dx+n v x+n+ 2
n=r
lx v x
Mx+r
.
Dx
T r| Ax = T
Mx+r
.
Dx
r| Ax
= r Ex Ax+r , x, r N s.t. x + r ,
0| Ax
= Ax , x N, x .
(8.5)
r| Ax
r Ex
i r| ax , x, r N s.t. x + r .
(8.6)
8.4
98
Ax:
= the single premium payable by a person of age x for a r-year
1
re
term life insurance of 1 u.c. (payable at the moment of death only if
the insured die within r years following insurance issue).
Proposition 8.3. For any x, r N s.t. x + r , we have
Mx Mx+r
.
Ax:
=
1
re
Dx
Proof. Similar to Proposition 8.1 we have
(8.7)
Ax:
= E(Xre ),
1
re
where Xre is the random variable having the distribution
1
1
1
v2
v 1+ 2 . . .
v r1+ 2
0
0
...
Xre :
x (0) x (1) . . . x (r 1) x (r) x (r + 1) . . .
0
x ( x)
.
= E(Xre ) =
Ax:
1
re
r1
X
x (n)v
n+ 21
n=0
n=0
r1
X
Cx+n
n=0
Dx
r1
X
dx+n
lx
n+ 21
1
r1
X
dx+n v x+n+ 2
n=0
lx v x
Mx Mx+r
.
Dx
Mx Mx+r
T Ax:
=
T
.
1
re
Dx
Remark 8.3. By (8.1), (8.7), (8.4) and (8.5) it follows that
Ax = Ax:
+ r| Ax , x, r N s.t. x + r ,
1
re
(8.8)
Ax:
= Ax r Ex Ax+r , x, r N s.t. x + r ,
1
re
Ax:
= 0, x N, x .
1
0e
Remark 8.4. By (8.8), (8.3), (8.6) and (7.9) it follows that
Ax:
=
v
1
a
, x, r N s.t. x + r .
1
r
x
x:
re
re
(8.9)
8.5
99
Problems
Exercise 8.1. Calculate the single premium payable by a 30-year-old person
for a whole life insurance of 1000 RON. The annual interest percentage is 14%.
Exercise 8.2. Calculate the single premium payable by a 30-year-old person
for a 35-year deferred life insurance of 1000 RON. The annual interest percentage
is 14%.
Exercise 8.3. Calculate the single premium payable by a 30-year-old person
for a 35-year term life insurance of 1000 RON. The annual interest percentage is
14%.
Theme 9
Collective annuities and
insurances
Next, we consider an insured group of m persons having the ages x1 , x2 , . . . , xm
(m N , xj N, xj j {1, . . . , m}).
9.1
[k]
x1 x2 ...xm
survive n years;
np
k = the probability that at least k of the group members will
x1 x2 ...xm
survive n years;
n px1 x2 ...xm
x1 x2 ...xm
101
Also, we denote by \hat{x} the minimum age of the group, i.e.
\hat{x} = \min\{x_1, x_2, \ldots, x_m\}.
Remark 9.2. Obviously, if n > \omega - \tilde{x} (\tilde{x} being the maximum age of the group) then {}_np_{x_1 x_2 \ldots x_m} = 0.
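The group probabilities defined in this section (all members survive, exactly k survive, at least k survive) can be sketched by brute-force enumeration. The code assumes that the lives of the members are independent and takes the individual n-year survival probabilities as given; the probabilities and the function names are hypothetical.

```python
import math
from itertools import product


def survivor_distribution(p):
    """dist[k] = probability that exactly k members survive.

    `p` lists the individual survival probabilities; lives are assumed independent.
    """
    dist = [0.0] * (len(p) + 1)
    for outcome in product([0, 1], repeat=len(p)):   # 1 = survives, 0 = dies
        prob = 1.0
        for alive, p_j in zip(outcome, p):
            prob *= p_j if alive else 1.0 - p_j
        dist[sum(outcome)] += prob
    return dist


def prob_all(p):
    """Probability that all members survive (product of individual probabilities)."""
    return math.prod(p)


def prob_exactly(p, k):
    return survivor_distribution(p)[k]


def prob_at_least(p, k):
    return sum(survivor_distribution(p)[k:])


p = [0.9, 0.8]    # hypothetical n-year survival probabilities of two members
print(prob_all(p), prob_exactly(p, 1), prob_at_least(p, 1))
```

The enumeration grows as 2^m, so it is only meant as a check for small groups; each individual probability can be taken as l_{x_j+n} / l_{x_j} from a life table.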
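Proposition 9.1. Let n, k \in N s.t. k \le m. We have:
{}_np_{x_1 x_2 \ldots x_m} = \frac{l_{x_1+n}}{l_{x_1}} \cdot \frac{l_{x_2+n}}{l_{x_2}} \cdot \ldots \cdot \frac{l_{x_m+n}}{l_{x_m}}; (9.1)
{}_np^{[k]}_{x_1 x_2 \ldots x_m} = \sum_{s=0}^{m-k} (-1)^s C_{k+s}^{s} \sum_{1 \le i_1 < \ldots < i_{k+s} \le m} {}_np_{x_{i_1} x_{i_2} \ldots x_{i_{k+s}}}; (9.2)
{}_np^{k}_{x_1 x_2 \ldots x_m} = \sum_{s=0}^{m-k} (-1)^s C_{k+s-1}^{s} \sum_{1 \le i_1 < \ldots < i_{k+s} \le m} {}_np_{x_{i_1} x_{i_2} \ldots x_{i_{k+s}}}. (9.3)
9.2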
where
Dx1 ,x2 ,...,xm = lx1 lx2 . . . lxm v
x1 +x2 +...+xm
m
(9.4)
(9.5)
1
v =
being the annual discounting factor, i being the annual interest
1+i
rate.
Corollary 9.1. Let a group of m persons having the ages x1 , x2 , . . . , xm ,
where m N , xj N, xj j {1, . . . , m}. Let n N and T 0.
The single premium payable by the group for a single claim of T u.c. over n
years if all of the members survive is
T n Ex1 ,x2 ,...,xm = T
9.3
102
[k]
x1 ,x2 ,...,xm
x1 ,x2 ,...,xm
[k]
x1 ,x2 ,...,xm
k
nE
x1 ,x2 ,...,xm
mk
X
(9.6)
1i1 <...<ik+s m
s=0
mk
X
s
(1)s Ck+s
s
(1)s Ck+s1
(9.7)
1i1 <...<ik+s m
s=0
[k]
x1 ,x2 ,...,xm
=T
mk
X
s=0
s
(1)s Ck+s
1i1 <...<ik+s m
2. The single premium payable by the group for a single claim of T u.c.
over n years if at least k of the members survive is
T nE
k
x1 ,x2 ,...,xm
9.4
=T
mk
X
s=0
s
(1)s Ck+s1
1i1 <...<ik+s m
103
ax1 ,x2 ,...,xm = the single premium payable by the group for a whole jointlife annuity-immediate of 1 u.c. per year (payable at the end of each
year while all of the members survive).
Proposition 9.4. We have
ax1 ,x2 ,...,xm =
(9.8)
where
Nx1 ,x2 ,...,xm =
e
x
X
(9.9)
n=0
x
e = max{x1 , x2 , . . . , xm } being the maximum age of the group.
Corollary 9.3. Let a group of m persons having the ages x1 , x2 , . . . , xm ,
where m N , xj N, xj j {1, . . . , m}. Let T 0. The single
premium payable by the group for a whole joint-life annuity-immediate of T
u.c. per year is
T ax1 ,x2 ,...,xm = T
9.5
[k]
x1 ,x2 ,...,xm
annuity-immediate of 1 u.c. per year payable (at the end of each year)
while exactly k of the members survive;
a
x1 ,x2 ,...,xm
annuity-immediate of 1 u.c. per year payable (at the end of each year)
while at least k of the members survive.
Proposition 9.5. We have
a
[k]
x1 ,x2 ,...,xm
x1 ,x2 ,...,xm
mk
X
s=0
mk
X
s=0
s
(1)s Ck+s
(9.10)
1i1 <...<ik+s m
s
(1)s Ck+s1
X
1i1 <...<ik+s m
(9.11)
104
[k]
x1 ,x2 ,...,xm
=T
mk
X
s
(1)s Ck+s
1i1 <...<ik+s m
s=0
2. The single premium payable by the group for a whole life annuityimmediate of T u.c. per year payable while at least k of the members
survive is
T a
9.6
k
x1 ,x2 ,...,xm
=T
mk
X
s
(1)s Ck+s1
1i1 <...<ik+s m
s=0
(9.12)
9.7
105
r| a
[k]
x1 ,x2 ,...,xm
r| a
x1 ,x2 ,...,xm
r| a
[k]
x1 ,x2 ,...,xm
k
x1 ,x2 ,...,xm
mk
X
(9.13)
1i1 <...<ik+s m
s=0
s
(1)s Ck+s
mk
X
s
(1)s Ck+s1
(9.14)
1i1 <...<ik+s m
s=0
[k]
x1 ,x2 ,...,xm
=T
mk
X
s=0
s
(1)s Ck+s
1i1 <...<ik+s m
2. The single premium payable by the group for an r-year deferred whole
life annuity-immediate of T u.c. per year payable while at least k of the
members survive is
T r| a
x1 ,x2 ,...,xm
=T
mk
X
s=0
s
(1)s Ck+s1
X
1i1 <...<ik+s m
9.8
106
(9.15)
9.9
[k]
x1 ,x2 ,...,xm : re
temporary life annuity-immediate of 1 u.c. per year payable (at the end
of each year) while exactly k of the members survive (during the next r
years);
a
x1 ,x2 ,...,xm : re
temporary life annuity-immediate of 1 u.c. per year payable (at the end
of each year) while at least k of the members survive (during the next
r years).
107
[k]
x1 ,x2 ,...,xm : re
k
x1 ,x2 ,...,xm : re
mk
X
(9.16)
1i1 <...<ik+s m
s=0
s
(1)s Ck+s
mk
X
s
(1)s Ck+s1
(9.17)
1i1 <...<ik+s m
s=0
=T
[k]
mk
X
x1 ,x2 ,...,xm : re
s
(1)s Ck+s
1i1 <...<ik+s m
s=0
2. The single premium payable by the group for an r-year temporary life
annuity-immediate of T u.c. per year payable while at least k of the
members survive is
T a
k
x1 ,x2 ,...,xm : re
9.10
=T
mk
X
s
(1)s Ck+s1
1i1 <...<ik+s m
s=0
Ax1 ,x2 ,...,xm = the single premium payable by the group for an insurance
of 1 u.c. payable at the moment of the first death, whenever this occurs.
Proposition 9.10. We have
(9.18)
where
\[ M_{x_1,x_2,\dots,x_m} = \sum_{n=0}^{\omega-\tilde{x}} C_{x_1+n,x_2+n,\dots,x_m+n}, \tag{9.19} \]
$\tilde{x} = \max\{x_1, x_2, \dots, x_m\}$ being the maximum age of the group, with
\[ C_{x_1,x_2,\dots,x_m} = \left( l_{x_1} l_{x_2} \cdots l_{x_m} - l_{x_1+1} l_{x_2+1} \cdots l_{x_m+1} \right) v^{\frac{x_1+x_2+\dots+x_m}{m} + \frac{1}{2}}, \tag{9.20} \]
$v = \frac{1}{1+i}$ being the annual discounting factor, $i$ being the annual interest rate.
Corollary 9.9. Consider a group of $m$ persons having the ages $x_1, x_2, \dots, x_m$, where $m \in \mathbb{N}^*$, $x_j \in \mathbb{N}$, $x_j \le \omega$ for all $j \in \{1, \dots, m\}$. Let $T \ge 0$. The single premium payable by the group for an insurance of $T$ u.c. payable at the moment of the first death, whenever this occurs, is $T A_{x_1,x_2,\dots,x_m}$.
9.11
$A^{[k]}_{x_1,x_2,\dots,x_m}$ = the single premium payable by the group for an insurance of 1 u.c. payable at the moment of the $k$-th death, whenever this occurs.
Proposition 9.11. We have
\[ A^{[k]}_{x_1,x_2,\dots,x_m} = \sum_{s=0}^{k-1} (-1)^s C_{m-k+s}^{s} \sum_{1 \le i_1 < \dots < i_{m-k+s+1} \le m} A_{x_{i_1},\dots,x_{i_{m-k+s+1}}}. \]
\[ T\,A^{[k]}_{x_1,x_2,\dots,x_m} = T \sum_{s=0}^{k-1} (-1)^s C_{m-k+s}^{s} \sum_{1 \le i_1 < \dots < i_{m-k+s+1} \le m} A_{x_{i_1},\dots,x_{i_{m-k+s+1}}}. \]
9.12
Problems
Exercise 9.1. Consider a group of 4 members of 55, 53, 30, and 28 years
old.
a) Calculate the probability that all of the members survive 15 years.
b) Calculate the probability that exactly 2 of the members survive 25 years.
c) Calculate the probability that at least 3 of the members survive 20 years.
d) Calculate the probability that at most 3 of the members survive 10 years.
Exercise 9.2. Calculate the single premium payable by a family of two
persons of 32 and 30 years old for a single claim of 20000$ over 35 years if
both members will be alive. The annual interest percent is 12%.
Exercise 9.3. Calculate the single premium payable by a family of two
persons of 32 and 30 years old for a single claim of 20000$ over 35 years if
just one member will be alive. The annual interest percent is 12%.
Exercise 9.4. Calculate the single premium payable by a family of two
persons of 32 and 30 years old for a single claim of 20000$ over 35 years if at
least one member will be alive. The annual interest percent is 12%.
Exercise 9.5. Calculate the single premium payable by a family of three
persons of 46, 44 and 22 years old for a life annuity-immediate of 10000$ per
year while all of the members survive. The annual interest percent is 12%.
Exercise 9.6. Calculate the single premium payable by a family of three
persons of 46, 44 and 22 years old for a life annuity-immediate of 10000$ per
year while exactly two of the members survive. The annual interest percent
is 12%.
Exercise 9.7. Calculate the single premium payable by a family of three
persons of 46, 44 and 22 years old for a life annuity-immediate of 10000$ per
year while at least two of the members survive. The annual interest percent
is 12%.
Exercise 9.8. Calculate the single premium payable by a family of two
persons of 42 and 37 years old for a 10-year deferred life annuity-immediate
of 10000$ per year while all of the members survive. The annual interest
percent is 12%.
Exercise 9.9. Calculate the single premium payable by a family of two
persons of 42 and 37 years old for a 10-year deferred life annuity-immediate
of 10000$ per year while just one member survives. The annual interest
percent is 12%.
Theme 10
Bonus-Malus system in
automobile insurance
10.1
A general model
The Bonus-Malus system is the best-known system of goods insurance, especially car insurance. In this type of insurance, policies are categorized based on the characteristics of the insured vehicle (the insured good) and on the Bonus-Malus level, given by the previous number of claims. The insurance period for goods is usually one year. In this case, a policy remains in a certain payment class for one year and can then be transferred to another payment class, based on the number of accidents in the previous year. If the insured vehicle did not have any accident, then the new payment class will be better, so the premium will be reduced (bonus). As the number of accidents grows, the new class will be worse, so the premium will be increased (malus).
Definition 10.1. A Bonus-Malus insurance system can be represented as S = (C, D, T, ), where:
• $C = \{1, \dots, c\}$ represents the set of payment classes ($c \in \mathbb{N}^*$). If $i > j$, $i, j \in C$, we say that $i$ is a better class than $j$.
• $D = \{0, \dots, r\}$ represents the set of possible annual numbers of accidents for an insurance policy ($r \in \mathbb{N}^*$).
• $T : C \times D \to C$ is a function called the rule of passing of the system; for any $i \in C$ and $j \in D$, $T(i, j)$ represents the payment class to which every insurance policy from class $i$ that had $j$ accidents during the current year is transferred the next year. The function $T(i, j)$ increases in $i$ (for any fixed $j$) and decreases in $j$ (for any fixed $i$).
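To make the monotonicity requirements on the rule of passing concrete, here is a small Python sketch. The rule below (one class up after a claim-free year, two classes down per accident) is purely hypothetical and is not taken from the text. "Increases" and "decreases" are read here in the non-strict sense, which is an assumption of this sketch.

```python
def make_rule(c, bonus=1, malus=2):
    """Hypothetical rule of passing on classes 1..c.

    A claim-free year moves the policy `bonus` classes up; each accident
    moves it `malus` classes down. The result is clamped to 1..c.
    """
    def T(i, j):
        target = i + bonus if j == 0 else i - malus * j
        return max(1, min(c, target))
    return T

def is_valid_rule(T, c, r):
    """Check that T maps into C, is non-decreasing in i and non-increasing in j."""
    classes = range(1, c + 1)
    accidents = range(0, r + 1)
    in_range = all(1 <= T(i, j) <= c for i in classes for j in accidents)
    incr_in_i = all(T(i, j) <= T(i + 1, j) for i in range(1, c) for j in accidents)
    decr_in_j = all(T(i, j) >= T(i, j + 1) for i in classes for j in range(0, r))
    return in_range and incr_in_i and decr_in_j

T = make_rule(c=10)
print(T(5, 0), T(5, 1), T(1, 3))   # classes reached from class 5 and class 1
print(is_valid_rule(T, c=10, r=5))
```

A rule that rewards accidents, such as `lambda i, j: min(10, i + j)`, fails the check because it is not decreasing in `j`.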
10.2
Remark 10.2. The frequency index represents the ratio between the posterior mean and the prior mean of $X_{n+1}$.
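Proposition 10.1. For any $n \in \mathbb{N}^*$ and $x_1, \dots, x_n \in D$ we have
\[ I_{n+1}(x_1, \dots, x_n) = \frac{E(\Lambda \mid X_1 = x_1, \dots, X_n = x_n)}{E(\Lambda)} \tag{10.2} \]
\[ \phantom{I_{n+1}(x_1, \dots, x_n)} = \frac{\text{premium for year } n+1 \text{ given } (x_1, \dots, x_n)}{\text{initial premium, from year 1}}. \tag{10.3} \]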
and not on the distribution of the accidents during these $n$ years. Therefore the values of the frequency indexes are tabulated according to the year $n$ and the total number of accidents $\sum_{i=1}^{n} x_i$.
10.3
We will apply the described model in the particular case when the r.v. $\Lambda$ (which represents the average number of annual accidents for a random insurance policy) has a Gamma prior distribution of parameters $a$ and $b$, where $a, b > 0$, with the probability density function
\[ f(\lambda) = \frac{1}{\Gamma(a) b^a} \lambda^{a-1} e^{-\lambda/b}, \quad \lambda > 0. \]
Then the r.v. $X$ (which represents the number of accidents during one year for a random policy) has a mixed Poisson-Gamma distribution of parameters $a$ and $b$, which is equivalent to a Negative Binomial distribution of parameters $a$ and $\frac{1}{b+1}$. So $X \sim BN\left(a, \frac{1}{b+1}\right)$.
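The equivalence between the mixed Poisson-Gamma distribution and the Negative Binomial distribution can be checked numerically. The sketch below integrates the Poisson probability against the Gamma density with a simple midpoint rule and compares the result with the Negative Binomial probability for $p = \frac{1}{b+1}$. The values $a = 2$, $b = 0.5$, the integration range and the step count are illustrative assumptions, as are the function names.

```python
import math

def nb_pmf(x, a, p):
    """P(X = x) for BN(a, p) with real a > 0:
    Gamma(a + x) / (Gamma(a) x!) * p^a * (1 - p)^x."""
    log_coef = math.lgamma(a + x) - math.lgamma(a) - math.lgamma(x + 1)
    return math.exp(log_coef + a * math.log(p) + x * math.log(1 - p))

def mixed_poisson_gamma_pmf(x, a, b, steps=100_000, upper=50.0):
    """Midpoint-rule approximation of the integral over lambda of
    Po(lambda)(x) * f(lambda), with f the Gamma(a, b) density above."""
    log_const = -math.lgamma(a) - a * math.log(b) - math.lgamma(x + 1)
    h = upper / steps
    total = 0.0
    for s in range(steps):
        lam = (s + 0.5) * h
        # log of e^{-lam} lam^x / x!  times  lam^{a-1} e^{-lam/b} / (Gamma(a) b^a)
        total += math.exp((a - 1 + x) * math.log(lam) - lam * (1 + 1 / b) + log_const)
    return total * h

a, b = 2.0, 0.5                      # illustrative values only
for x in range(4):
    print(x, mixed_poisson_gamma_pmf(x, a, b), nb_pmf(x, a, 1 / (b + 1)))
```

The two columns agree up to the integration error, which is consistent with the Negative Binomial mean $a \cdot \frac{1 - p}{p} = ab$ for $p = \frac{1}{b+1}$.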
Step 0 of the previous algorithm requires the estimation of the parameters $a$ and $b$ for the prior distribution of the r.v. $\Lambda$. In the next example we will apply the maximum likelihood estimation method to estimate these parameters, based on the number of accidents during one year.
Step 1 consists of calculating the a posteriori distribution for the r.v. $\Lambda$, that is, the distribution of the conditioned r.v. $\Lambda \mid (X_1 = x_1, \dots, X_n = x_n)$. According to Bayes' formula, this distribution has the probability density function
\[ f(\lambda \mid x_1, \dots, x_n) = \frac{f(\lambda) P(X_1 = x_1, \dots, X_n = x_n \mid \Lambda = \lambda)}{\int_0^{\infty} f(t) P(X_1 = x_1, \dots, X_n = x_n \mid \Lambda = t)\,dt}. \]
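Using the hypothesis that for any $\lambda > 0$ the conditioned random variables $X_1 \mid (\Lambda = \lambda), \dots, X_n \mid (\Lambda = \lambda)$ are independent and identically distributed with $X \mid (\Lambda = \lambda)$ (i.e. $X_i \mid (\Lambda = \lambda) \sim Po(\lambda)$ for any $i \in \{1, \dots, n\}$), it follows that
\begin{align*}
P(X_1 = x_1, \dots, X_n = x_n \mid \Lambda = \lambda) &= P(X_1 = x_1 \mid \Lambda = \lambda) \cdots P(X_n = x_n \mid \Lambda = \lambda) \\
&= P(X = x_1 \mid \Lambda = \lambda) \cdots P(X = x_n \mid \Lambda = \lambda) \\
&= e^{-\lambda} \frac{\lambda^{x_1}}{x_1!} \cdots e^{-\lambda} \frac{\lambda^{x_n}}{x_n!} = e^{-n\lambda} \frac{\lambda^{\sum_{i=1}^{n} x_i}}{x_1! \cdots x_n!},
\end{align*}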
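so
\[ f(\lambda \mid x_1, \dots, x_n) = \frac{\dfrac{1}{\Gamma(a) b^a}\, \lambda^{a-1} e^{-\lambda/b}\, e^{-n\lambda}\, \dfrac{\lambda^{\sum_{i=1}^{n} x_i}}{x_1! \cdots x_n!}}{\dfrac{1}{\Gamma(a) b^a\, x_1! \cdots x_n!} \displaystyle\int_0^{\infty} t^{a-1} e^{-t/b}\, e^{-nt}\, t^{\sum_{i=1}^{n} x_i}\,dt} = \frac{\lambda^{a + \sum_{i=1}^{n} x_i - 1}\, e^{-\frac{\lambda(1+bn)}{b}}}{\displaystyle\int_0^{\infty} t^{a + \sum_{i=1}^{n} x_i - 1}\, e^{-\frac{t(1+bn)}{b}}\,dt}. \]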
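Step 2 consists of calculating the posterior distribution for the r.v. $X_{n+1}$, that is, the distribution of the conditioned r.v. $X_{n+1} \mid (X_1 = x_1, \dots, X_n = x_n)$. According to the total probability formula, this distribution is given by
\[ P(X_{n+1} = x \mid X_1 = x_1, \dots, X_n = x_n) = \int_0^{\infty} P(X_{n+1} = x \mid X_1 = x_1, \dots, X_n = x_n, \Lambda = \lambda)\, f(\lambda \mid x_1, \dots, x_n)\,d\lambda, \quad x \in \mathbb{N}. \]
Using the hypothesis that for any $\lambda > 0$ the conditioned random variables $X_1 \mid (\Lambda = \lambda), \dots, X_n \mid (\Lambda = \lambda), X_{n+1} \mid (\Lambda = \lambda)$ are independent and identically distributed with $X \mid (\Lambda = \lambda)$, it follows that
\[ P(X_{n+1} = x \mid X_1 = x_1, \dots, X_n = x_n, \Lambda = \lambda) = P(X_{n+1} = x \mid \Lambda = \lambda) = P(X = x \mid \Lambda = \lambda), \]
so
\[ P(X_{n+1} = x \mid X_1 = x_1, \dots, X_n = x_n) = \int_0^{\infty} P(X = x \mid \Lambda = \lambda)\, f(\lambda \mid x_1, \dots, x_n)\,d\lambda. \]
Since, by Step 1, the posterior distribution of $\Lambda$ is a Gamma distribution of parameters $a + \sum_{i=1}^{n} x_i$ and $\frac{b}{1+bn}$, we get that the posterior distribution of the r.v. $X_{n+1}$ is a mixed Poisson-Gamma distribution of parameters $a + \sum_{i=1}^{n} x_i$ and $\frac{b}{1+bn}$, which is equivalent to a Negative Binomial distribution of parameters $a + \sum_{i=1}^{n} x_i$ and $\frac{1+bn}{1+b+bn}$. So
\[ X_{n+1} \mid (X_1 = x_1, \dots, X_n = x_n) \sim BN\left( a + \sum_{i=1}^{n} x_i,\ \frac{1+bn}{1+b+bn} \right). \]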
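Step 3 consists of calculating the frequency indexes $I_{n+1}(x_1, \dots, x_n)$. According to formula (10.1) we have
\[ I_{n+1}(x_1, \dots, x_n) = \frac{E(X_{n+1} \mid X_1 = x_1, \dots, X_n = x_n)}{E(X_{n+1})}, \quad x_1, \dots, x_n \in D. \]
According to the formula of the mean for the Negative Binomial distribution, from $X \sim BN\left(a, \frac{1}{b+1}\right)$ it follows that
\[ E(X_{n+1}) = E(X) = a \cdot \frac{1 - \frac{1}{b+1}}{\frac{1}{b+1}} = ab, \]
and from $X_{n+1} \mid (X_1 = x_1, \dots, X_n = x_n) \sim BN\left( a + \sum_{i=1}^{n} x_i, \frac{1+bn}{1+b+bn} \right)$ it follows that
\[ E(X_{n+1} \mid X_1 = x_1, \dots, X_n = x_n) = \left( a + \sum_{i=1}^{n} x_i \right) \frac{1 - \frac{1+bn}{1+b+bn}}{\frac{1+bn}{1+b+bn}} = \frac{b \left( a + \sum_{i=1}^{n} x_i \right)}{1 + bn}, \]
so
\[ I_{n+1}(x_1, \dots, x_n) = \frac{a + \sum_{i=1}^{n} x_i}{a(1 + bn)} = \frac{1 + \frac{1}{a} \sum_{i=1}^{n} x_i}{1 + bn}, \quad x_1, \dots, x_n \in D. \tag{10.4} \]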
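Formula (10.4) is straightforward to evaluate. The following Python sketch computes the frequency index directly and also recomputes it as the ratio of the two Negative Binomial means used in the derivation, as a consistency check. The parameter values $a = 2$, $b = 0.1$ are illustrative only, and the function names are not from the text.

```python
def frequency_index(a, b, n, total_accidents):
    """Formula (10.4): (1 + total/a) / (1 + b*n)."""
    return (1 + total_accidents / a) / (1 + b * n)

def posterior_mean_ratio(a, b, n, total_accidents):
    """Same quantity, computed as posterior mean over prior mean of X_{n+1}."""
    prior_mean = a * b                                   # mean of BN(a, 1/(b+1))
    p_post = (1 + b * n) / (1 + b + b * n)               # posterior BN parameter
    posterior_mean = (a + total_accidents) * (1 - p_post) / p_post
    return posterior_mean / prior_mean

a, b = 2.0, 0.1                                          # illustrative values only
print(frequency_index(a, b, n=3, total_accidents=1))
print(posterior_mean_ratio(a, b, n=3, total_accidents=1))
```

Note that the index depends on the claims history only through `n` and the total number of accidents, not on the years in which the accidents occurred.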
After calculating the frequency indexes using formula (10.4), the premiums
for year n + 1 can be obtained (at Step 4) by using the formula (10.3).
Example 10.1. We apply the model discussed to the following data set, recorded by a French insurance company during 1979. The data set consists of $m = 1044454$ policyholders.

No. of accidents (j)    Absolute frequency (m_j)
0                       881705
1                       142217
2                       18088
3                       2118
4                       273
5                       53
Total                   1044454
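According to our model, $X \sim BN\left(a, \frac{1}{b+1}\right)$.
By using the maximum likelihood estimation method, the estimated values of parameters $a > 0$ and $b$ verify the following equations
\[ \frac{1}{b+1} = \frac{a}{a + \overline{X}}, \tag{10.5} \]
\[ \sum_{j=1}^{5} m_j \left( \frac{1}{a} + \frac{1}{a+1} + \dots + \frac{1}{a+j-1} \right) - m \ln\left( 1 + \frac{\overline{X}}{a} \right) = 0, \tag{10.6} \]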
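where $\overline{X}$ is the mean of the sample formed by the recorded data. We have
\[ \overline{X} = \sum_{j=0}^{5} \frac{j\, m_j}{m} \approx 0.178183051. \tag{10.7} \]
For the given values of $m_j$, $m$ and $\overline{X}$, we derive that the equation (10.6) has a unique positive solution, namely
\[ a \approx 1.672974126. \]
By (10.5) we obtain that
\[ b = \frac{\overline{X}}{a} \approx 0.106506758. \]
Substituting these values into formula (10.4), we obtain
\[ I_{n+1}(x_1, \dots, x_n) = \frac{1 + \frac{1}{1.672974126} \sum_{i=1}^{n} x_i}{1 + 0.106506758\, n}, \quad x_1, \dots, x_n \in D. \tag{10.8} \]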
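The estimates of $a$ and $b$ can be reproduced numerically from the frequency table of Example 10.1. The sketch below solves equation (10.6) by plain bisection and then applies (10.5). The bracket $[0.1, 50]$ and the tolerance are choices made for this illustration, not values from the text.

```python
import math

# Frequency table of Example 10.1: number of accidents -> number of policies.
counts = {0: 881705, 1: 142217, 2: 18088, 3: 2118, 4: 273, 5: 53}
m = sum(counts.values())
x_bar = sum(j * mj for j, mj in counts.items()) / m      # sample mean (10.7)

def score(a):
    """Left-hand side of equation (10.6) as a function of a."""
    harmonic_part = sum(mj * sum(1.0 / (a + i) for i in range(j))
                        for j, mj in counts.items())
    return harmonic_part - m * math.log(1 + x_bar / a)

def bisect(f, lo, hi, tol=1e-12):
    """Plain bisection; assumes f(lo) and f(hi) have opposite signs."""
    f_lo = f(lo)
    assert f_lo * f(hi) < 0, "root is not bracketed"
    while hi - lo > tol:
        mid = (lo + hi) / 2
        f_mid = f(mid)
        if f_lo * f_mid <= 0:
            hi = mid
        else:
            lo, f_lo = mid, f_mid
    return (lo + hi) / 2

a_hat = bisect(score, 0.1, 50.0)
b_hat = x_bar / a_hat                                    # from (10.5)
print(m, x_bar, a_hat, b_hat)
```

Bisection is used here only because it needs no derivative and no external library; any one-dimensional root finder would serve equally well.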
Using the constructed model (Classical Bonus-Malus System) we consider the case when the maximum number of consecutive years is $n = 10$ and the maximum number of accidents is 5. According to formula (10.8) we obtain the following table containing the values of frequency indexes $I_{n+1}(x_1, \dots, x_n)$ based on year $n$ (rows) and the total number of accidents $\sum_{i=1}^{n} x_i$ (columns).

 n     0        1        2        3        4        5
 0     1.0000
 1     0.9037   1.4439   1.9842   2.5244   3.0646   3.6048
 2     0.8244   1.3172   1.8099   2.3027   2.7955   3.2882
 3     0.7579   1.2108   1.6638   2.1168   2.5698   3.0228
 4     0.7012   1.1204   1.5396   1.9587   2.3779   2.7971
 5     0.6525   1.0425   1.4326   1.8226   2.2126   2.6027
 6     0.6101   0.9748   1.3395   1.7042   2.0689   2.4336
 7     0.5729   0.9153   1.2578   1.6002   1.9426   2.2851
 8     0.5399   0.8627   1.1854   1.5082   1.8309   2.1537
 9     0.5106   0.8158   1.1210   1.4262   1.7313   2.0365
10     0.4842   0.7737   1.0631   1.3526   1.6421   1.9315
For example, consider a policyholder who had only one accident in the first three years and whose initial premium was 200 u.c. According to formula (10.3) and the values from the table of frequency indexes, the premium for the fourth year is obtained as $200 \cdot I_{3+1}(x_1, x_2, x_3)$ for $\sum_{i=1}^{3} x_i = 1$, that is $200 \cdot 1.2108 = 242.16$ u.c.
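The same calculation can be carried out year by year for a whole claims history. The Python sketch below applies formulas (10.3) and (10.8) with the estimates of $a$ and $b$ from Example 10.1. The history `[1, 0, 0]` (the single accident placed in the first year) is an assumption made for illustration, since the example only fixes the total over three years.

```python
A_HAT = 1.672974126   # estimate of a from Example 10.1
B_HAT = 0.106506758   # estimate of b from Example 10.1

def premiums(initial_premium, accidents_per_year, a=A_HAT, b=B_HAT):
    """Premiums for years 1, 2, ..., len(accidents_per_year) + 1.

    The premium for year n + 1 is the initial premium multiplied by the
    frequency index I_{n+1}, which depends only on n and on the total
    number of accidents in the first n years.
    """
    result = [initial_premium]
    total = 0
    for n, x_n in enumerate(accidents_per_year, start=1):
        total += x_n
        index = (1 + total / a) / (1 + b * n)
        result.append(initial_premium * index)
    return result

schedule = premiums(200, [1, 0, 0])   # hypothetical history: accident in year 1
for year, premium in enumerate(schedule, start=1):
    print(year, round(premium, 2))
```

The fourth-year premium obtained this way differs from 242.16 u.c. by about one hundredth of a unit, because the text multiplies by the index rounded to four decimals.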
10.4
Problems
Exercise 10.1. Calculate and extend the above table of frequency indexes.
Exercise 10.2. The initial premium for a policyholder was 300 u.c. Calculate the premium for the next 12 years, if the policyholder had only one
accident in the second year, two accidents in the sixth year and one accident
in the seventh year.
Exercise 10.3. a) A policyholder had only one accident in the first year.
Calculate the number of years over which the premium will be less than the
initial premium.
b) The same question for a policyholder who had three accidents in the first
year.
Theme 11
Some optimization models
11.1
Portfolio planning
i=1
\[ \sum_{i=1}^{n} m_i p_i \]
\[ p^{\top} V p = \sum_{i=1}^{n} \sum_{j=1}^{n} V_{ij}\, p_i p_j \]
11.2
Regional planning
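Hence $(x_{ikl})_{i,k,l}$ is an optimal solution for the following linear programming problem
\[ \begin{cases}
\max \displaystyle\sum_{i=1}^{n} \sum_{k=1}^{K} \sum_{l=1}^{L} (b_{ikl} - c_{ikl})\, x_{ikl} \quad \text{s.t.} \\
\displaystyle\sum_{k=1}^{K} \sum_{l=1}^{L} s_{ikl}\, x_{ikl} \le S_i, \quad \forall i \in \{1, \dots, n\}, \\
\displaystyle\sum_{i=1}^{n} \sum_{l=1}^{L} x_{ikl} = F_k, \quad \forall k \in \{1, \dots, K\}, \\
x_{ikl} \ge 0, \quad \forall i \in \{1, \dots, n\},\ \forall k \in \{1, \dots, K\},\ \forall l \in \{1, \dots, L\}.
\end{cases} \tag{P11.2.1} \]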
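When the planner knows a lower bound $M$ for the total bidding power, then $(x_{ikl})_{i,k,l}$ is an optimal solution for the following problem, according to the MEP:
\[ \begin{cases}
\max \; -\displaystyle\sum_{i=1}^{n} \sum_{k=1}^{K} \sum_{l=1}^{L} x_{ikl} \ln x_{ikl} \quad \text{s.t.} \\
\displaystyle\sum_{k=1}^{K} \sum_{l=1}^{L} s_{ikl}\, x_{ikl} \le S_i, \quad \forall i \in \{1, \dots, n\}, \\
\displaystyle\sum_{i=1}^{n} \sum_{l=1}^{L} x_{ikl} = F_k, \quad \forall k \in \{1, \dots, K\}, \\
\displaystyle\sum_{i=1}^{n} \sum_{k=1}^{K} \sum_{l=1}^{L} (b_{ikl} - c_{ikl})\, x_{ikl} \ge M, \\
x_{ikl} \ge 0, \quad \forall i \in \{1, \dots, n\},\ \forall k \in \{1, \dots, K\},\ \forall l \in \{1, \dots, L\}.
\end{cases} \tag{P11.2.2} \]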
Adding auxiliary variables to the inequality constraints, this problem becomes a linear programming problem with partial entropic perturbation.
Another model for household location is obtained by choosing the minimization of the total journey-to-work transportation costs. By refining the elements of the above model, we assume that the following elements are now known, for any $i, j \in \{1, \dots, n\}$, $k \in \{1, \dots, K\}$ and $l \in \{1, \dots, L\}$:
• $b_{ijkl}$ = the budget that a type $k$ household is willing to allocate for purchasing and living in a type $l$ house in zone $i$ and working in zone $j$ (under the assumption that every household has only one key worker);
• $c_{ijkl}$ = the cost that must be allocated by a type $k$ household to living in a type $l$ house in zone $i$ and working in zone $j$;
• $L_{il}$ = the number of type $l$ houses available in zone $i$;
11.3
When planning the industrial production of a country or region, one important problem consists of estimating the technical coefficient matrix $A = (a_{ij})_{i,j \in \{1, \dots, n\}}$, where $n$ is the number of industry sectors and, for any $i, j \in \{1, \dots, n\}$, $a_{ij}$ represents the amount of input from sector $i$ to sector $j$ per unit of the output of sector $j$. Therefore
\[ a_{ij} = \frac{z_{ij}}{X_j}, \quad i, j \in \{1, \dots, n\}, \]
where $z_{ij}$ represents the sales input from sector $i$ to sector $j$, and $X_j$ represents the total output of sector $j$. Also, we have
Xi =
n
X
j=1
j=1
n
X
i=1
where
\[ \sum_{i=1}^{n} l_i = \sum_{j=1}^{n} c_j. \]
where $A^{(0)} = (a^{(0)}_{ij})_{i,j \in \{1, \dots, n\}}$, with the assumption that $a_{ij} = 0$ for any $i, j \in \{1, \dots, n\}$ for which $a^{(0)}_{ij} = 0$.
\[ K = \left\{ (i, j) \;\middle|\; a^{(0)}_{ij} > 0,\ i, j \in \{1, \dots, n\} \right\} \]
and
(0)