
Introduction to Statistics

Prof. Christos Agiakloglou


Statistics
Basic Concepts
• Population vs Sample.
• Measures of Summarizing data.
• Parameter vs. Estimator & Estimate.
• Random Variable.
• Moments.
• Probability distributions.
• Statistical Inference.
Population vs. Sample
• Population
A Population is the whole set of all observations of a specific
phenomenon, e.g., all U of I students, all publicly traded firms in
the US, all historically available monthly returns on the S&P 500.
• Sample
A Sample is a subset of a population, e.g., this class of U of I
students, small-cap publicly traded firms in the US, monthly
returns on the S&P 500 for the last 10 years.
• Use Sample to make inference about Population.
• Random Sample
A sample in which every member of the population has the same
chance of being selected, so that it tends to reflect the qualitative
and quantitative characteristics of the population.
The objective
• Analyze the behavior of a variable which is given by a
data set.
• Find two to four numbers that describe the
behavior of the variable.
• Do we use the population? No, simply
because it is too large to work with, the data are too costly to gather, and
the outcome may not be worth the effort, i.e., one number, the mean.
• Dealing with the whole population is not statistics; it is another type
of accounting.
• Use Sample to make inference about Population.
• A sample means a small number of observations, i.e., one that is
easy to work with.
Descriptive Statistics
• Summarize and present numerically the behavior of
a variable in an informative way:
– Measures of central tendency
– Measures of dispersion
– Skewness
– Kurtosis
• These measures are also known in statistics as
moments, i.e., first, second, third and fourth.
Measures of Central Tendency
• Measures of central tendency describe the location
of the center of the data.
– Mean
– Median
– Mode
• By mean we basically mean the arithmetic mean,
although there is also the geometric mean, as well
as the weighted mean.
The Arithmetic Mean
• The arithmetic mean is the “average value” of a data set.
• Equal weights and same measurement units.
• The Population Mean (μ) for given N observations, i.e., X1,
X2, …, XN, is defined as:
$$\mu = \frac{1}{N} \sum_{i=1}^{N} X_i$$
• The Sample Mean ($\bar{X}$) for given n observations, i.e., X1,
X2, …, Xn, is defined as:
$$\bar{X} = \frac{1}{n} \sum_{i=1}^{n} X_i$$
The Median
• The median is the midpoint of a data set after the
observations have been sorted in ascending or
descending order.
• If the number of observations of the data set is odd,
the median is the observation in the (n + 1)/2th position.
• If the number of observations of the data set is even,
the median is the average of the observations in the n/2th and
(n + 2)/2th positions.
• Same measurement units.
Example of Calculating Mean & Median
Five-year annualized total returns of five growth mutual funds

Name of Fund                        Annualized total return (%)
PBHG Growth                         28.5
Dean Witter Developing Growth       17.2
AIM Aggressive Growth               25.4
Twentieth Century Giftrust          28.8
Robertson Stevens Emerging Growth   22.6

Mean = 24.5%   Median = 25.4%
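The two figures in this example can be checked with a minimal Python sketch. The variable names are illustrative and only the standard library `statistics` module is assumed.

```python
from statistics import mean, median

# Annualized total returns (%) of the five growth funds in the example.
fund_returns = [28.5, 17.2, 25.4, 28.8, 22.6]

fund_mean = mean(fund_returns)      # sum of the values divided by n
fund_median = median(fund_returns)  # middle value after sorting (n = 5 is odd)

print(round(fund_mean, 1), fund_median)
```

With five observations the median is simply the third value of the sorted list.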


Median vs. Mean
• Which one is the better measure of central tendency?
• The answer is clearly the median.
• The mean is strongly affected by extreme observations
whereas the median is not, e.g., for 1, 2, 3, 4, 5 the
mean = 3 and the median = 3, but for 1, 2, 3, 4, 5 and 100 the
mean = 19.17 and the median = 3.5.
• Why, then, do we use the mean?
• Because, as we shall see, the distribution of the sample
mean is well defined and we can make statistical inference.
• The median, though, does not use all the information about the
size and magnitude of the observations.
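The sensitivity of the mean to a single extreme observation can be illustrated with a short Python sketch. The list names are illustrative, and the data are the small example used in the comparison above.

```python
from statistics import mean, median

base = [1, 2, 3, 4, 5]
with_outlier = base + [100]  # same data plus one extreme observation

base_mean, base_median = mean(base), median(base)
outlier_mean, outlier_median = mean(with_outlier), median(with_outlier)

# The mean jumps from 3 to about 19.17, the median only from 3 to 3.5.
print(base_mean, base_median)
print(round(outlier_mean, 2), outlier_median)
```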
The Mode
• The mode is the most frequently occurring value in the
data set.
• The data set may have more than one mode, or no mode in
the sense that each value of the data set appears only once.
(A data set with two modes is called bimodal.)
• The mode can be easily revealed in the case of grouped data.
For example, data on stock returns can take any value and hence
it will be difficult to define the mode. However, if we group the
data into intervals, the modal interval may be easily detected.
• The mode is also useful in cases in which the concept of the
mean does not exist, e.g., what is the typical hair color in
this class or what is the typical car in this city.
Example – Monthly Stock Returns
Month Xi Observations in ascending order
1 1.4 0.5
2 1.6 0.8
3 2.5 0.9
4 0.5 1.1
5 1.5 1.3
6 0.9 1.4
7 1.1 1.5
8 1.5 1.5
9 1.3 1.6
10 2.1 1.9
11 0.8 2.1
12 1.9 2.5
Sum 17.1
mean 1.425
median 1.45
mode 1.5
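As a rough check of this table, the sketch below recomputes the sum, mean, median and mode in Python. It assumes Python 3.8 or later for `statistics.multimode`, and the variable names are illustrative.

```python
from statistics import mean, median, multimode

# Monthly stock returns from the table (months 1 to 12).
monthly_returns = [1.4, 1.6, 2.5, 0.5, 1.5, 0.9, 1.1, 1.5, 1.3, 2.1, 0.8, 1.9]

ordered = sorted(monthly_returns)
n = len(ordered)

# n is even, so the median averages the observations in positions n/2 and
# (n + 2)/2 (positions 6 and 7, i.e. zero-based indices 5 and 6).
manual_median = (ordered[n // 2 - 1] + ordered[n // 2]) / 2

total = sum(monthly_returns)
average = mean(monthly_returns)
modes = multimode(monthly_returns)  # list of all most frequent values

print(round(total, 1), round(average, 3), round(manual_median, 2), modes)
```

`multimode` returns a list, so a bimodal data set would yield two values and a data set in which every value appears once would return all of them.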
The Geometric Mean
• The geometric mean is used to average rates of growth over
time or compute the growth rate of a variable.
• Given Xi > 0, for i = 1, 2, …, n, the geometric mean (G) is
defined as:
$$G = \sqrt[n]{X_1 X_2 \cdots X_n}$$
• However, in the case of returns and for a given holding time
period the geometric mean return (RG) is defined as:
$$R_G = \sqrt[T]{(1 + R_1)(1 + R_2) \cdots (1 + R_T)} - 1$$
• The geometric mean return is also referred to as compound
return.
Arithmetic vs. Geometric Mean Return
Year   Holding Period Return for Canadian Equities (%)
2000   -1.6
2001   17.7
2002   25.4
2003   2.6
2004   -12.6
Arithmetic vs. Geometric Mean Return
• Arithmetic mean = (-1.6+17.7+25.4+2.6-12.6)/5 = 6.3%
• Geometric mean
– Convert returns into decimal form, i.e., (Rt /100): -0.016,
0.177, 0.254, 0.026, -0.126
– Add one to obtain: 0.984, 1.177, 1.254, 1.026, 0.874
– Thus, the geometric mean is 5.43%, i.e.,
$$R_G = \sqrt[5]{(0.984)(1.177)(1.254)(1.026)(0.874)} - 1 = 0.0543$$
• Comment: the geometric mean is always less than the
arithmetic mean, or equal to it in the case of no variability in the
data, i.e., when all observations are the same.
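The steps above can be sketched in Python as follows. The function names are hypothetical and `math.prod` assumes Python 3.8 or later.

```python
import math

# Holding period returns (%) for Canadian equities, 2000 to 2004.
returns_pct = [-1.6, 17.7, 25.4, 2.6, -12.6]


def arithmetic_mean_return(returns_pct):
    """Simple average of the percentage returns."""
    return sum(returns_pct) / len(returns_pct)


def geometric_mean_return(returns_pct):
    """T-th root of the product of (1 + R_t), minus one, in percent."""
    growth_factors = [1 + r / 100 for r in returns_pct]
    product = math.prod(growth_factors)
    return (product ** (1 / len(growth_factors)) - 1) * 100


arith = arithmetic_mean_return(returns_pct)
geo = geometric_mean_return(returns_pct)
print(round(arith, 2), round(geo, 2))
```

The geometric figure comes out below the arithmetic one, as the comment above states for data with variability.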
Arithmetic vs. Geometric Mean
• Statistically the arithmetic mean is better.
• Intuitively, though, the arithmetic mean sometimes does not
make sense for financial applications.
• Example 1: Suppose a share of Microsoft costs $100 today.
One year later, the stock trades at $200, while at the end of
the second year the stock falls back to $100 (Microsoft pays
no dividends). R1 = 100%, R2 = -50%, arithmetic mean =
25%, i.e., (100-50)/2 = 25%, but the share is back to its initial
price. The geometric mean return is 0%, which matches the fact
that the investor has gained nothing.
• Example 2: Capital Asset Pricing Model (CAPM)
$$r_e = r_f + \beta (r_m - r_f)$$
Which measure of the equity premium should we use? The arithmetic
or the geometric mean of historical returns?
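Returning to Example 1, a minimal Python sketch makes the contrast between the two averages concrete. The price path is the one given in the example and the variable names are illustrative.

```python
# Share price today, after year 1, and after year 2 (no dividends).
prices = [100.0, 200.0, 100.0]

# Holding period return for each year: (P_t - P_{t-1}) / P_{t-1}
yearly_returns = [(p1 - p0) / p0 for p0, p1 in zip(prices, prices[1:])]

arithmetic = sum(yearly_returns) / len(yearly_returns)

# Compound the yearly growth factors, then take the T-th root.
growth = 1.0
for r in yearly_returns:
    growth *= 1 + r
geometric = growth ** (1 / len(yearly_returns)) - 1

print(yearly_returns, arithmetic, geometric)
```

The arithmetic mean reports a 25% average gain, while the geometric mean reports zero, in line with the share ending at its initial price.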
Measures of Dispersion
• We want to find out how the data is spread around its
central location.
– The Range.
– The Mean Absolute Deviation (MAD).
– The Variance.
• The concept of dispersion is related to risk. For example,
think of the mean of returns as the reward of an investment and
of the dispersion of returns as the risk of obtaining that mean
return.
The Range
• The range of a data set is defined as the difference between
the maximum and the minimum value of the data set, i.e.,
Range = Max value – Min value
• Small range means small dispersion whereas large range
means large dispersion.
• It does not use all observations of the data set and
therefore it discards valuable information.
• It is based only on two observations, the max and the min.
• For example, the data set may be closely clustered around its
center, yet the range cannot pick this up if the min and the max
are extreme observations.
The Mean Absolute Deviation
• The MAD is the average value of the absolute values of the
distance of each observation from its mean.
For Population:
$$MAD = \frac{\sum_{i=1}^{N} |X_i - \mu|}{N}$$
For Sample:
$$MAD = \frac{\sum_{i=1}^{n} |X_i - \bar{X}|}{n}$$
• The MAD uses all observations of the data set and in that
sense it is a better measurement of dispersion than the Range.
• However, it gives equal weight to each deviation, and because the
absolute value is not differentiable at zero it is awkward to use in
calculus-based derivations.
• It is, though, expressed in the same measurement units as the data.
Example – Monthly Stock Returns
Month Xi |Xi - mean|
1 1.4 0.025
2 1.6 0.175
3 2.5 1.075
4 0.5 0.925
5 1.5 0.075
6 0.9 0.525
7 1.1 0.325
8 1.5 0.075
9 1.3 0.125
10 2.1 0.675
11 0.8 0.625
12 1.9 0.475
Sum 17.1 5.1
mean 1.425
MAD 0.425
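The range and the MAD for this data set can be recomputed with the sketch below. The helper names `value_range` and `mean_absolute_deviation` are hypothetical and written out by hand for illustration.

```python
# Monthly stock returns from the table (months 1 to 12).
monthly_returns = [1.4, 1.6, 2.5, 0.5, 1.5, 0.9, 1.1, 1.5, 1.3, 2.1, 0.8, 1.9]


def value_range(data):
    """Difference between the largest and the smallest observation."""
    return max(data) - min(data)


def mean_absolute_deviation(data):
    """Average absolute distance of each observation from the mean."""
    center = sum(data) / len(data)
    return sum(abs(x - center) for x in data) / len(data)


data_range = value_range(monthly_returns)
mad = mean_absolute_deviation(monthly_returns)
print(round(data_range, 1), round(mad, 3))
```

The range uses only two of the twelve observations, whereas the MAD uses all of them.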
The Variance
• The variance is the average value of the squared deviations
from the mean.
For Population:
$$\sigma^2 = \frac{\sum_{i=1}^{N} (X_i - \mu)^2}{N}$$
For Sample:
$$s^2 = \frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1}$$
• The Variance is always positive or zero (non-negative).
• It will be zero if all observations are equal to each other and
hence equal to their mean.
• It is the best measurement of dispersion.
Comments on the Variance
• It gives different weights to each deviation from the mean
by squaring it. Thus we get a better estimate of how the
observations are allocated around their central location.
• Recall that the mean is strongly affected by extreme
observations.
• Its meaning, though, is very difficult to explain.
• The variance has sign (positive), magnitude and
measurement units (the square of the units of the data).
• The only thing we can say is that the smaller the variance,
the more closely the data are clustered around the mean.
• But how do we define a small or a large variance?
• Well known statistical behavior.
Example – Monthly Stock Returns
Month Xi (Xi - mean) (Xi - mean)^2
1 1.4 -0.025 0.000625
2 1.6 0.175 0.030625
3 2.5 1.075 1.155625
4 0.5 -0.925 0.855625
5 1.5 0.075 0.005625
6 0.9 -0.525 0.275625
7 1.1 -0.325 0.105625
8 1.5 0.075 0.005625
9 1.3 -0.125 0.015625
10 2.1 0.675 0.455625
11 0.8 -0.625 0.390625
12 1.9 0.475 0.225625
Sum 17.1 0 3.5225
mean 1.425
Variance 0.320227
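The table can be reproduced with a short Python sketch that computes the sum of squared deviations and then applies both divisors. The helper name is hypothetical, and the standard library functions `statistics.variance` (divisor n - 1) and `statistics.pvariance` (divisor N) are used only as a cross-check.

```python
from statistics import pvariance, variance

# Monthly stock returns from the table (months 1 to 12).
monthly_returns = [1.4, 1.6, 2.5, 0.5, 1.5, 0.9, 1.1, 1.5, 1.3, 2.1, 0.8, 1.9]


def sum_squared_deviations(data):
    """Sum of (X_i - mean)^2 over all observations."""
    xbar = sum(data) / len(data)
    return sum((x - xbar) ** 2 for x in data)


n = len(monthly_returns)
ss = sum_squared_deviations(monthly_returns)
sample_variance = ss / (n - 1)   # treats the 12 months as a sample
population_variance = ss / n     # treats the 12 months as the population

print(round(ss, 4), round(sample_variance, 6), round(population_variance, 6))
```

The value reported in the table corresponds to the sample formula with divisor n - 1 = 11.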
The Standard Deviation
• The standard deviation is the positive square root of the
variance.
For Population:
$$\sigma = \sqrt{\sigma^2}$$
For Sample:
$$s = \sqrt{s^2}$$
• The standard deviation has the same measurement units as the data and
therefore it can be used directly to express the dispersion of
the data.
• For our example, s = 0.5659.
The Coefficient of Variation
• The coefficient of variation (CV) expresses the dispersion of
the data set relative to its mean.
For Population:
$$CV = \frac{\sigma}{\mu}$$
For Sample:
$$CV = \frac{s}{\bar{X}}$$
• Note that it is independent of measurement units. Therefore it
can be used directly for comparison.
• For our example CV = 0.5659/1.425 = 0.3971.
• The coefficient of variation for returns measures the amount of
risk (standard deviation) per unit of mean return.
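A minimal Python sketch of the sample standard deviation and the coefficient of variation for the monthly returns example follows. It assumes the sample formulas (divisor n - 1), which is what `statistics.stdev` implements.

```python
from statistics import mean, stdev

# Monthly stock returns from the running example (months 1 to 12).
monthly_returns = [1.4, 1.6, 2.5, 0.5, 1.5, 0.9, 1.1, 1.5, 1.3, 2.1, 0.8, 1.9]

s = stdev(monthly_returns)        # positive square root of the sample variance
cv = s / mean(monthly_returns)    # dispersion per unit of mean return

print(round(s, 4), round(cv, 4))
```

Because the CV is a ratio of two quantities in the same units, it can be compared across series measured on different scales.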
Skewness
• Skewness measures the degree of asymmetry in the data.
• The distribution of the data is called skewed, if it is not
symmetric.
– Skewed to the right (Positively skewed) if the distribution has a
long tail to the right.
– Skewed to the left (Negatively skewed) if the distribution has a
long tail to the left.
• Example: Suppose that the distribution of holding period
returns for HP shares is positively skewed. That means
that investors should expect frequent small losses and a
few extreme gains.
• Mean – Median – Mode and Skewness.
The Coefficient of Skewness
• The sample coefficient of skewness is computed as follows:
$$b_1 = \frac{\frac{1}{n} \sum_{i=1}^{n} (X_i - \bar{X})^3}{s^3}$$
• Note it is independent of measurement units.
• For symmetry the value of the coefficient b1 must be close to
zero.
• Under normality, an absolute value greater than 0.5 is
considered unusually large.
• For our example b1 = 0.2054.
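The coefficient can be sketched in Python as follows. The function name `skewness_coefficient` is hypothetical, and the sketch assumes that s is the sample standard deviation (divisor n - 1) while the third moment is averaged over n, which is the combination that reproduces the value quoted for the example.

```python
from statistics import mean, stdev

# Monthly stock returns from the running example (months 1 to 12).
monthly_returns = [1.4, 1.6, 2.5, 0.5, 1.5, 0.9, 1.1, 1.5, 1.3, 2.1, 0.8, 1.9]


def skewness_coefficient(data):
    """Average cubed deviation from the mean, divided by s cubed."""
    n = len(data)
    xbar = mean(data)
    s = stdev(data)  # sample standard deviation, divisor n - 1
    third_moment = sum((x - xbar) ** 3 for x in data) / n
    return third_moment / s ** 3


b1 = skewness_coefficient(monthly_returns)
print(round(b1, 4))  # about 0.2054, a mild positive skew
```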
Pearson’s Coefficient of Skewness
• The coefficient is defined as:
$$SK = \frac{3(\bar{X} - \text{Median})}{s}$$
• The value of the Pearson’s coefficient can range from –3 to 3.
– A value of zero (or very close to zero) indicates a
symmetrical distribution.
– A value closer to 3 indicates skewed to the right (positive
skewness).
– A value closer to -3 indicates skewed to the left (negative
skewness).
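As an illustration, Pearson's coefficient can be computed for the monthly returns example with the sketch below. The function name is hypothetical, and the resulting value is computed here rather than taken from the slides.

```python
from statistics import mean, median, stdev

# Monthly stock returns from the running example (months 1 to 12).
monthly_returns = [1.4, 1.6, 2.5, 0.5, 1.5, 0.9, 1.1, 1.5, 1.3, 2.1, 0.8, 1.9]


def pearson_skewness(data):
    """Three times (mean - median), divided by the sample standard deviation."""
    return 3 * (mean(data) - median(data)) / stdev(data)


sk = pearson_skewness(monthly_returns)
print(round(sk, 4))
```

Here the mean (1.425) lies slightly below the median (1.45), so this measure comes out slightly negative even though b1 is slightly positive. Both values are close to zero, and the two measures need not agree in sign for nearly symmetric data.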
Kurtosis
• Kurtosis is another measure of the shape of the
distribution of the data, which indicates the degree to which
the data is clustered around its mean.
• It indicates whether the distribution is more or less
peaked around its mean than a normal distribution.
– Leptokurtic distribution (more peaked around the mean, with fatter tails)
– Mesokurtic distribution (similar to a normal)
– Platykurtic distribution (less peaked around the mean, with thinner tails).
The Coefficient of Kurtosis
• The sample coefficient of kurtosis is computed as follows:
$$b_2 = \frac{\frac{1}{n} \sum_{i=1}^{n} (X_i - \bar{X})^4}{s^4}$$
• It is also independent of measurement units.
• For all normal distributions kurtosis is equal to 3.
• For our example b2 = 2.0861.
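A matching Python sketch for the kurtosis coefficient follows. As with skewness, the function name is hypothetical and the sketch assumes the fourth moment is averaged over n while s uses the divisor n - 1, which reproduces the value quoted for the example.

```python
from statistics import mean, stdev

# Monthly stock returns from the running example (months 1 to 12).
monthly_returns = [1.4, 1.6, 2.5, 0.5, 1.5, 0.9, 1.1, 1.5, 1.3, 2.1, 0.8, 1.9]


def kurtosis_coefficient(data):
    """Average fourth-power deviation from the mean, divided by s to the fourth."""
    n = len(data)
    xbar = mean(data)
    s = stdev(data)  # sample standard deviation, divisor n - 1
    fourth_moment = sum((x - xbar) ** 4 for x in data) / n
    return fourth_moment / s ** 4


b2 = kurtosis_coefficient(monthly_returns)
print(round(b2, 4))  # about 2.0861, below the normal benchmark of 3
```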
Kurtosis
• Example: Distributions of asset returns when there are
jumps in asset prices.
– This can be the case with markets where there is
discontinuous trading, such as securities markets.
– Information related to asset prices that becomes
available when markets are closed (such as weekends)
can cause jumps in prices when markets reopen.
– This causes higher frequencies of large negative and positive
returns than would be the case in markets with
continuous trading.
