
Introduction to Econometrics [ET2013]

Teresa Randazzo

Ca’ Foscari University of Venice


teresa.randazzo@unive.it

(Introduction to Econometrics) 1
Contacts

I Email: teresa.randazzo@unive.it

I Room: A.005

I Office hour: Thursday 16-18

Organizational Details
I Timetable:
I Monday, 15:45-17:15, San Giobbe 8A
I Tuesday, 15:45-17:15, San Giobbe 8A
I Wednesday, 15:45-17:15, San Giobbe 5A
I Thursday, 08:45-9:15, San Giobbe 4A (Dr. Michele Costola)

I Prerequisites:
I Statistics
I Probability

I Assessment: written examination (exercises / comments on Gretl/R/Stata output / theory)

I Learning outcomes: at the end of the course the student will be familiar with the main econometric techniques which can be used to analyse cross-section and time series data.

Readings

I Lecture slides and supplementary materials on moodle

I Reference book: Verbeek M. (2009). A Guide to Modern Econometrics. John Wiley & Sons Ltd, Chapters 1, 2, 3, 4, 8.

Econometrics

I Econometrics uses economic theory and statistical techniques to analyse economic data.

I Econometricians typically analyse nonexperimental data, as data from controlled experiments are hard to come by in the social sciences.

I We can use econometrics to:
    I estimate relationships between economic variables
    I test economic theories and hypotheses
    I validate theoretical models
    I make predictions or forecasts about economic variables
    I evaluate government and business policies

Econometrics

1. Our interest is to understand how variables are related: state precisely the question you aim to answer.

2. Specify an economic model based on economic theory (e.g. demand equations, pricing equations, optimal taxation...).

3. Turn the economic model into an econometric model.

4. Collect data on the variables and use appropriate statistical methods to estimate the parameters.

The structure of economic data

I Econometric analysis requires data


I experimental data
I observational data

I Different types of economic data:


I cross-sectional data
I time series data
I panel/longitudinal data

I Econometric methods depend on the nature of the data of interest

Example # 1: Economic Model of Energy consumption
I Aim to explain energy consumption behaviour and to identify its determinants.
I In economics the relationship between economic variables is expressed using the mathematical concept of a function.

I Economic model of energy consumption:

y = f (x1, x2, x3, ...)

where:
y = energy consumption
x1 = temperature
x2 = price index
x3 = income
x4 = ...
Example # 2: Wage and background characteristics
I Question of interest: What is the effect of additional education on
earned wage?
I Economic model:

y = f (x1, x2, x3, x4...)

where:
y = observed hourly wage
x1 = age
x2 = gender
x3 = years of formal education
x4 = years of workforce experience

I Econometric model:
y = b0 + b1 x1 + b2 x2 + b3 x3 + b4 x4 + u
Types of Data
I Cross-sectional data are data on different entities (workers, consumers, firms, governmental units, etc.) for a single time period. Most often cross-section data are data for micro units: individuals, households, companies, etc. (e.g. the Household Budget Survey for the year yyyy, the Manufacturing Statistics for the year yyyy, the Population Census for the year yyyy)
I Time series data are data for a single entity (person, firm, country) collected at multiple time periods. Most often time-series data are macro data (e.g. the Index of Manufacturing Production, the Consumer Price Index and financial statistics such as money stock, exchange rates, interest rates, bank deposits, etc.)
I Panel data, also called longitudinal data, are data for multiple entities in which each entity is observed at two or more time periods (e.g. household consumption over several years, GDP growth of European countries)

Course Outline

1. Short recall of algebra/statistics/probability
    I Matrix Algebra
    I Random Variables
    I Expected value, variance, covariance, correlation, skewness, kurtosis, etc.
    I The Normal distribution, the t distribution, the F distribution

2. Basic ingredients of regression analysis (Ch. 2)
    I The classical linear regression model (OLS)

3. Interpreting and comparing regression models (Ch. 3)

4. Heteroskedasticity and autocorrelation (Ch. 4)
    I The generalized linear regression model

5. Univariate time series models (Ch. 8)

Matrix Algebra

I Matrix definition

I Summation and Product Operators

I Matrix Manipulation

I Properties of Matrices

Matrix Algebra
Vectors
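A vector is an ordered sequence of elements arranged in a row or column:

$$
a = \begin{pmatrix} a_1 \\ a_2 \\ a_3 \end{pmatrix}, \qquad a' = \begin{pmatrix} a_1 & a_2 & a_3 \end{pmatrix}
$$

$$
a = \begin{pmatrix} a_1 \\ a_2 \\ a_3 \end{pmatrix}, \qquad b = \begin{pmatrix} b_1 \\ b_2 \\ b_3 \end{pmatrix}
$$

Addition and Subtraction

$$
c = a + b = \begin{pmatrix} a_1 + b_1 \\ a_2 + b_2 \\ a_3 + b_3 \end{pmatrix}
$$
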
Matrix Algebra
Vectors

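Vector Multiplication

$$
k \times a = k \begin{pmatrix} a_1 \\ a_2 \\ \vdots \\ a_n \end{pmatrix} = \begin{pmatrix} k a_1 \\ k a_2 \\ \vdots \\ k a_n \end{pmatrix}
$$

$$
a'b = \begin{pmatrix} a_1 & a_2 & \dots & a_n \end{pmatrix} \times \begin{pmatrix} b_1 \\ b_2 \\ \vdots \\ b_n \end{pmatrix} = a_1 b_1 + a_2 b_2 + \dots + a_n b_n = \sum_{i=1}^{n} a_i b_i = b'a
$$

$$
a'a = \sum_{i=1}^{n} a_i^2
$$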
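
The vector operations on these two slides can be sketched in a few lines of plain Python. This is only an illustration: the vectors `a`, `b` and the scalar `k` are made-up numbers, and the helper names `add`, `scale` and `inner` are not part of the course material.

```python
# Vectors as plain Python lists; the numbers are illustrative only.
a = [1, 2, 3]
b = [4, 5, 6]
k = 2

def add(u, v):
    """Element-wise sum of two vectors of equal length."""
    return [ui + vi for ui, vi in zip(u, v)]

def scale(k, u):
    """Multiply every element of u by the scalar k."""
    return [k * ui for ui in u]

def inner(u, v):
    """Inner product u'v = sum_i u_i v_i."""
    return sum(ui * vi for ui, vi in zip(u, v))

c = add(a, b)      # a + b
ka = scale(k, a)   # k x a
ab = inner(a, b)   # a'b
aa = inner(a, a)   # a'a, the sum of squared elements

print(c, ka, ab, aa)
```

Swapping the arguments of `inner` gives the same number, which is the statement a'b = b'a.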

Matrix Algebra
Matrix

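The order of a matrix is given by the number of rows and the number of columns.

$$
A = \begin{pmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{pmatrix}, \qquad A' = \begin{pmatrix} a_{11} & a_{21} & a_{31} \\ a_{12} & a_{22} & a_{32} \\ a_{13} & a_{23} & a_{33} \end{pmatrix}
$$

A symmetric matrix satisfies $A' = A$, that is, $a_{ij} = a_{ji}$ for $i \neq j$:

$$
A = \begin{pmatrix} 1 & -1 & 4 \\ -1 & 0 & 3 \\ 4 & 3 & 2 \end{pmatrix} = A'
$$
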
Matrix Algebra
Matrix

Transpose properties

1. (A + B)' = A' + B'

2. (AB)' = B'A'

3. (ABC)' = C'B'A'

4. (A')' = A

Matrix Algebra
Matrix Multiplication

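$$
\underset{n \times k}{A} \; \underset{k \times s}{B} = \underset{n \times s}{C}, \qquad c_{ij} = \sum_{l=1}^{k} a_{il} b_{lj}
$$

$$
AB = \begin{pmatrix} 1 & 2 & 3 \\ 2 & 0 & 4 \end{pmatrix} \times \begin{pmatrix} 1 & 6 \\ 0 & 1 \\ 1 & 1 \end{pmatrix} = \begin{pmatrix} 4 & 11 \\ 6 & 16 \end{pmatrix}
$$

$$
BA = \begin{pmatrix} 1 & 6 \\ 0 & 1 \\ 1 & 1 \end{pmatrix} \times \begin{pmatrix} 1 & 2 & 3 \\ 2 & 0 & 4 \end{pmatrix} = \begin{pmatrix} 13 & 2 & 27 \\ 2 & 0 & 4 \\ 3 & 2 & 7 \end{pmatrix}
$$

$$
AB \neq BA
$$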
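
The two products on this slide can be reproduced with a short sketch in plain Python. The helper names `matmul` and `transpose` are illustrative choices, not course code; the matrices are the ones from the slide.

```python
def transpose(M):
    """Swap rows and columns of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*M)]

def matmul(A, B):
    """Product of an n x k matrix A and a k x s matrix B."""
    k = len(B)
    if any(len(row) != k for row in A):
        raise ValueError("inner dimensions must match")
    return [[sum(A[i][l] * B[l][j] for l in range(k))
             for j in range(len(B[0]))]
            for i in range(len(A))]

A = [[1, 2, 3],
     [2, 0, 4]]
B = [[1, 6],
     [0, 1],
     [1, 1]]

AB = matmul(A, B)   # 2 x 2
BA = matmul(B, A)   # 3 x 3, so AB and BA cannot be equal
print(AB)           # [[4, 11], [6, 16]]
print(BA)           # [[13, 2, 27], [2, 0, 4], [3, 2, 7]]

# Transpose rule (AB)' = B'A'
print(transpose(AB) == matmul(transpose(B), transpose(A)))  # True
```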

Matrix Algebra
Square Matrices
The unit or identity matrix of order $n \times n$ is

$$
I_n = \begin{pmatrix} 1 & 0 & \dots & 0 \\ 0 & 1 & \dots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \dots & 1 \end{pmatrix}
$$

For a matrix $A$ of order $m \times n$: $I_m A = A I_n = A$

Diagonal matrix

$$
\begin{pmatrix} \lambda_1 & 0 & \dots & 0 \\ 0 & \lambda_2 & \dots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \dots & \lambda_n \end{pmatrix}
$$

Idempotent Matrix

$$
A = A^2 = A^3 = \dots
$$

Matrix Algebra
Square Matrix
An important property of a square matrix is the trace, which is the sum of the elements on the principal diagonal, that is

$$
\operatorname{tr}(A) = \sum_i a_{ii}
$$

If the matrix $A$ is of order $m \times n$ and $B$ is of order $n \times m$, then $AB$ and $BA$ are both square matrices and $\operatorname{tr}(AB) = \operatorname{tr}(BA)$.

The rank of a matrix

It is defined as the maximum number of linearly independent columns (or rows) in the matrix. If $A$ is $n \times m$,

$$
\operatorname{rank}(A) \le \min(m, n)
$$

A matrix has full rank if

$$
\operatorname{rank}(A) = \min(m, n)
$$
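
As a hedged illustration of the trace and rank definitions, the sketch below reuses the 2 x 3 and 3 x 2 matrices from the multiplication slide. The rank is computed by Gaussian elimination with exact fractions, which is one possible method and not necessarily the one used in the course; all function names are illustrative.

```python
from fractions import Fraction

def matmul(A, B):
    """Product of two conformable matrices stored as lists of rows."""
    return [[sum(A[i][l] * B[l][j] for l in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]

def trace(M):
    """Sum of the elements on the principal diagonal of a square matrix."""
    return sum(M[i][i] for i in range(len(M)))

def rank(M):
    """Number of pivots found by Gaussian elimination (exact arithmetic)."""
    rows = [[Fraction(x) for x in row] for row in M]
    n_rows, n_cols = len(rows), len(rows[0])
    r = 0  # index of the next pivot row
    for c in range(n_cols):
        pivot = next((i for i in range(r, n_rows) if rows[i][c] != 0), None)
        if pivot is None:
            continue  # no pivot in this column
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, n_rows):
            factor = rows[i][c] / rows[r][c]
            rows[i] = [x - factor * y for x, y in zip(rows[i], rows[r])]
        r += 1
        if r == n_rows:
            break
    return r

A = [[1, 2, 3], [2, 0, 4]]      # 2 x 3
B = [[1, 6], [0, 1], [1, 1]]    # 3 x 2

print(trace(matmul(A, B)), trace(matmul(B, A)))  # equal traces
print(rank(A))                   # full rank: min(2, 3)
print(rank(matmul(B, A)))        # 3 x 3 but not full rank
print(rank([[1, 2], [2, 4]]))    # second row is twice the first
```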
Matrix Algebra
Determinant

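A determinant is a scalar associated with a square matrix.

$$
A = \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix}, \qquad |A| = a_{11} a_{22} - a_{21} a_{12}
$$

$$
B = \begin{pmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{pmatrix}
$$

$$
|B| = a_{11} \begin{vmatrix} a_{22} & a_{23} \\ a_{32} & a_{33} \end{vmatrix} - a_{12} \begin{vmatrix} a_{21} & a_{23} \\ a_{31} & a_{33} \end{vmatrix} + a_{13} \begin{vmatrix} a_{21} & a_{22} \\ a_{31} & a_{32} \end{vmatrix}
$$
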
Matrix Algebra
Inverse Matrix

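The determinant is used to calculate the matrix inverse. Given

$$
A = \begin{pmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{pmatrix}
$$

we can define the matrix of the associated cofactors:

$$
C = \begin{pmatrix} C_{11} & C_{12} & C_{13} \\ C_{21} & C_{22} & C_{23} \\ C_{31} & C_{32} & C_{33} \end{pmatrix}
$$

The adjoint of $A$ is the transpose of the cofactor matrix, and the inverse is the adjoint divided by the determinant:

$$
C' = \operatorname{adj} A, \qquad A^{-1} = \frac{1}{|A|} \operatorname{adj} A
$$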
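
The cofactor route to the inverse can be sketched as follows. This is a minimal illustration for small integer matrices, assuming cofactor expansion along the first row for the determinant; the function names are made up, and the example matrix is the symmetric matrix shown earlier in the slides. For large matrices this approach is slow and other methods are normally preferred.

```python
from fractions import Fraction

def minor(M, i, j):
    """Matrix M with row i and column j removed."""
    return [row[:j] + row[j + 1:] for r, row in enumerate(M) if r != i]

def det(M):
    """Determinant by cofactor expansion along the first row."""
    if len(M) == 0:
        return 1  # convention for the empty matrix, ends the recursion
    return sum((-1) ** j * M[0][j] * det(minor(M, 0, j))
               for j in range(len(M)))

def inverse(M):
    """Inverse as adj(M) / |M|, where adj(M) is the transposed cofactor matrix."""
    d = det(M)
    if d == 0:
        raise ValueError("matrix is singular: no inverse exists")
    n = len(M)
    cof = [[(-1) ** (i + j) * det(minor(M, i, j)) for j in range(n)]
           for i in range(n)]
    adj = [list(col) for col in zip(*cof)]  # C' = adj M
    return [[Fraction(x, d) for x in row] for row in adj]

def matmul(A, B):
    return [[sum(A[i][l] * B[l][j] for l in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]

A = [[1, -1, 4],
     [-1, 0, 3],
     [4, 3, 2]]

A_inv = inverse(A)
print(det(A))             # determinant of A
print(matmul(A, A_inv))   # should reproduce the identity matrix
```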

Matrix Algebra
Inverse Matrix

I Only a square matrix can have an inverse

I Not all square matrices have an inverse

I A matrix that has an inverse is said to be non-singular and a matrix that has no inverse is said to be singular

I If an inverse exists it is unique

I Inverses have the following properties:

$$
(A^{-1})^{-1} = A, \qquad (AB)^{-1} = B^{-1} A^{-1}, \qquad (A')^{-1} = (A^{-1})'
$$

Recall of Statistics and Probability
Random Variable
I A random variable (RV) is a variable whose value is unknown until it is observed
I An RV is a numerical summary of a random outcome such that each event (an event is a set of one or more outcomes) is associated with a probability.
I A random variable that can only take the values zero and one is called a Bernoulli random variable

I Two types of RV:
    I Discrete random variable: takes on only a discrete set of values, like 0, 1, 2, ... (e.g. a die roll)
    I Continuous random variable: takes on a continuum of possible values
I An RV is represented by the cumulative distribution function (c.d.f.) and by the probability mass/density function (p.m.f./p.d.f.)

Recall of Statistics and Probability
p.m.f. and c.d.f. of a discrete RV

I For a discrete random variable X, its probability mass function (p.m.f.) f(.) is specified by giving the values

$$
f(x) = P(X = x)
$$

for all x in the range of X
I A function f can only be a probability mass function if it satisfies certain conditions:
    (1.) As f(x) represents the probability that the variable X takes the value x, f(x) can never be negative, so $f(x) \ge 0$ for all x
    (2.) If we sum over all values of x (in the range of X), the total must be equal to one:

$$
\sum_x f(x) = \sum_x P(X = x) = 1
$$

Recall of Statistics and Probability
p.m.f. and c.d.f. of a discrete RV

I The cumulative probability distribution is the probability that the random variable is less than or equal to a particular value.
I In other words, for a random variable X the c.d.f. is a function $F_X$ that, when evaluated at a point x, gives the probability that the random variable will take on a value less than or equal to x:

$$
F_X(x) = \Pr(X \le x)
$$

I If f is the probability mass function of a discrete random variable X with range $\{x_1, x_2, \dots\}$ and F is its cumulative distribution function, then

$$
F(x) = \sum_{x_i \le x} f(x_i)
$$

Recall of Statistics and Probability
p.m.f. and c.d.f. of a discrete RV
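Roll of a fair die

Sample space: $S = \{1, 2, 3, 4, 5, 6\}$
Because the die is fair, each of the six faces is equally likely.
I p.m.f.

$$
\Pr[X = x] = f_X(x) = \frac{1}{6}, \qquad x = 1, 2, \dots, 6
$$

I c.d.f.

$$
\Pr[X \le x] = F_X(x) =
\begin{cases}
1/6 & x = 1 \\
2/6 & x = 2 \\
3/6 & x = 3 \\
4/6 & x = 4 \\
5/6 & x = 5 \\
6/6 & x = 6
\end{cases}
$$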
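
The fair-die example can be checked with a small sketch that builds the p.m.f. as a dictionary and obtains the c.d.f. by summing it. The names `pmf` and `cdf` are illustrative; exact fractions are used so that the two p.m.f. conditions can be verified without rounding.

```python
from fractions import Fraction

faces = range(1, 7)
pmf = {x: Fraction(1, 6) for x in faces}   # f(x) = 1/6 for each face

def cdf(x):
    """F(x) = sum of f(x_i) over all x_i <= x."""
    return sum(p for xi, p in pmf.items() if xi <= x)

# The two conditions for a valid p.m.f.
print(all(p >= 0 for p in pmf.values()))   # non-negative
print(sum(pmf.values()) == 1)              # sums to one

for x in faces:
    print(x, pmf[x], cdf(x))               # fractions print in lowest terms
```

Because the c.d.f. is a step function, `cdf(2.5)` returns the same value as `cdf(2)`.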
Recall of Statistics and Probability
p.d.f. and c.d.f. of a continuous RV

I A continuous random variable can take on a continuum of possible values
I Because there is a continuum of values, a probability distribution that lists the probability of each possible value is not suitable; instead, the probabilities are summarized by the probability density function (p.d.f.).
I The area under the probability density function between any two points is the probability that the random variable falls between those two points:

$$
P(a < X < b) = \int_a^b f(x)\,dx = F(b) - F(a)
$$

Recall of Statistics and Probability
p.d.f. and c.d.f. of a continuous RV

I Taking the integral of f(x) over all possible outcomes gives

$$
\int_{-\infty}^{+\infty} f(x)\,dx = 1
$$

I If X takes values within a certain range only, it is implicitly assumed that f(x) = 0 anywhere outside this range
I The definition for the p.d.f. of a continuous random variable differs from the definition for the p.m.f. of a discrete random variable by simply changing the summations that appeared in the discrete case to integrals in the continuous case

Recall of Statistics and Probability
p.d.f. and c.d.f. of a continuous RV

I The cumulative distribution function (c.d.f.) for continuous random variables is a straightforward extension of that of the discrete case. As above, all we need to do is replace the summation with an integral
I The c.d.f. of a continuous random variable X is defined as

$$
F(x) = P\{X \le x\} = \int_{-\infty}^{x} f(t)\,dt
$$

I For discrete random variables F(x) is, in general, a non-decreasing step function. For continuous random variables, F(x) is a non-decreasing continuous function
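
To make the link between a density and its c.d.f. concrete, here is a minimal numerical sketch. The density f(x) = 2x on [0, 1] is a hypothetical example chosen because its c.d.f., F(x) = x^2, is easy to check by hand; the midpoint-rule integrator is likewise just one simple choice.

```python
def f(x):
    """Hypothetical density: f(x) = 2x on [0, 1], zero elsewhere."""
    return 2.0 * x if 0.0 <= x <= 1.0 else 0.0

def integrate(g, lo, hi, steps=10_000):
    """Midpoint-rule approximation of the integral of g over [lo, hi]."""
    width = (hi - lo) / steps
    return width * sum(g(lo + (i + 0.5) * width) for i in range(steps))

def F(x):
    """c.d.f.: f is zero outside [0, 1], so integrate over [0, min(x, 1)]."""
    upper = min(max(x, 0.0), 1.0)
    return integrate(f, 0.0, upper)

total = integrate(f, 0.0, 1.0)    # area under the whole density
prob = integrate(f, 0.25, 0.5)    # P(0.25 < X < 0.5)

print(total)
print(prob, F(0.5) - F(0.25))     # the two numbers should agree
```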

Recall of Statistics and Probability
p.d.f. and c.d.f. of a continuous RV

Student’s commuting time from home to school

Recall of Statistics and Probability
Joint probability distribution

I In economics, we are usually interested in answering questions that involve more than one RV

I In general two RVs can be
    I dependent: being a college graduate and getting a job
    I independent: rolling two dice

I Working with more than one RV requires a joint probability function

Recall of Statistics and Probability
Joint probability distribution

Let X and Y be two discrete RVs; then (X, Y) have a joint distribution that is described by the joint probability mass function of (X, Y)

$$
f(x, y) = P(X = x, Y = y)
$$

When the two variables are continuous, the joint p.d.f. $f(x, y)$ gives probabilities by integration:

$$
P(a_1 \le X \le b_1, \; a_2 \le Y \le b_2) = \int_{a_2}^{b_2} \int_{a_1}^{b_1} f(x, y)\,dx\,dy
$$

If X and Y are independent, the joint density factorizes into the product of the marginal densities:

$$
f(x, y) = f(x) f(y)
$$

Recall of Statistics and Probability
Expected Value

The expected value of a random variable X with p.m.f. or p.d.f. f(x), denoted E(X), is the long-run average value of the random variable over many repeated trials or occurrences

Discrete RV:

$$
E(X) = \mu_x = \sum_x x f(x)
$$

Continuous RV:

$$
E(X) = \mu_x = \int x f(x)\,dx
$$

Another measure of location is the median, which is the value m for which we have $P\{X \le m\} \ge 1/2$ and $P\{X \ge m\} \ge 1/2$

When a distribution is symmetric around its mean, the mean and the median are identical!

Recall of Statistics and Probability
Expected Value

Properties

I E(a) = a

I E(x + y) = E(x) + E(y)

I E(ax) = aE(x)

I E(ax + b) = aE(x) + b

I Independent RVs: E(xy) = E(x)E(y)

I In general (dependent RVs): E(xy) = E(x)E(y) + cov(x, y)

Recall of Statistics and Probability
Covariance

I It is a measure of linear dependence between two random variables
I It measures how two random variables vary together
I When two variables are independent the covariance is zero
I If the variables are dependent the covariance can be positive or negative; it can also be zero when the dependence is not linear

$$
Cov(X, Y) = \sigma_{xy} = E[(X - \mu_x)(Y - \mu_y)]
$$

$$
Cov(X, Y) = E(XY) - E(X)E(Y)
$$

I Properties
    I Cov(X, X) = Var(X)
    I Cov(aX, bY) = ab Cov(X, Y)

Recall of Statistics and Probability
Variance
The variance of a random variable X, denoted var(X), is the expected value of the square of the deviation of X from its mean:

$$
var(X) = \sigma_x^2 = E[(X - E(X))^2] = E[(X - \mu_x)^2] = E(X^2) - [E(X)]^2
$$

The standard deviation is the square root of the variance and is denoted $\sigma_x$

Properties
I The variance is never negative
I $Var(a) = 0$
I $Var(aX) = a^2 Var(X)$
I $Var(aX + b) = a^2 Var(X)$
I $Var(X + Y) = Var(X) + Var(Y) + 2Cov(X, Y)$
I $Var(X - Y) = Var(X) + Var(Y) - 2Cov(X, Y)$
I $Cov^2(X, Y) \le Var(X) Var(Y)$
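
The expectation, covariance and variance rules can be verified on a small discrete example. The joint p.m.f. below is hypothetical (the probabilities are made up for illustration), and the helpers `E`, `var` and `cov` are illustrative names; exact fractions keep the identities free of rounding error.

```python
from fractions import Fraction as Fr

# Hypothetical joint p.m.f. of (X, Y); the probabilities are made up.
joint = {(0, 0): Fr(3, 10), (0, 1): Fr(2, 10),
         (1, 0): Fr(1, 10), (1, 1): Fr(4, 10)}

def E(g):
    """Expected value of g(X, Y) under the joint p.m.f."""
    return sum(g(x, y) * p for (x, y), p in joint.items())

def var(g):
    """Variance as the expected squared deviation from the mean."""
    mu = E(g)
    return E(lambda x, y: (g(x, y) - mu) ** 2)

def cov(g, h):
    """Covariance as E(GH) - E(G)E(H)."""
    return E(lambda x, y: g(x, y) * h(x, y)) - E(g) * E(h)

def X(x, y):
    return x

def Y(x, y):
    return y

print(E(X), E(Y), cov(X, Y))
# E(XY) = E(X)E(Y) + cov(X, Y)
print(E(lambda x, y: x * y) == E(X) * E(Y) + cov(X, Y))
# Var(X + Y) = Var(X) + Var(Y) + 2 Cov(X, Y)
print(var(lambda x, y: x + y) == var(X) + var(Y) + 2 * cov(X, Y))
# Var(aX + b) = a^2 Var(X), here with a = 3 and b = 2
print(var(lambda x, y: 3 * x + 2) == 9 * var(X))
# Cov^2(X, Y) <= Var(X) Var(Y)
print(cov(X, Y) ** 2 <= var(X) * var(Y))
```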
Recall of Statistics and Probability
Correlation

The correlation is an alternative measure of dependence between X and Y
The correlation between X and Y is the covariance between X and Y divided by the product of their standard deviations:

$$
\rho = \frac{Cov(X, Y)}{\sigma_x \sigma_y}
$$

Properties
I It has a value between -1 and +1
I Two variables are uncorrelated if $\rho = 0$
I Zero correlation does not imply independence
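
Correlation only captures linear dependence. A standard textbook-style illustration, sketched here under the assumption that X is uniform on {-1, 0, 1} and Y = X^2 (an example not taken from the slides), shows two variables that are clearly dependent and yet have zero covariance, hence zero correlation.

```python
from fractions import Fraction as Fr

support = [-1, 0, 1]   # X is uniform on these values (assumed example)
p = Fr(1, 3)           # P(X = x) for each x in the support

E_x = sum(x * p for x in support)            # E(X)
E_y = sum(x ** 2 * p for x in support)       # E(Y) with Y = X^2
E_xy = sum(x * x ** 2 * p for x in support)  # E(XY) = E(X^3)

cov_xy = E_xy - E_x * E_y
print(cov_xy)          # zero covariance, so rho = 0

# Dependence check: Y = 0 happens exactly when X = 0
p_joint = p            # P(X = 0, Y = 0)
p_product = p * p      # P(X = 0) * P(Y = 0)
print(p_joint, p_product)   # they differ, so X and Y are not independent
```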

Recall of Statistics and Probability
Joint probability distribution

# Example

I Joint distribution:
P(X = 0, Y = 1) = 0.15
I Conditional distribution:
P(Y = 0|X = 0) = P(X = 0, Y = 0)/P(X = 0) =??
Recall of Statistics and Probability
Conditional Distribution

Conditional expectation:

$$
E(Y | X = x) = \sum_{i} y_i P(Y = y_i | X = x)
$$

Conditional variance:

$$
var(Y | X = x) = \sum_{i} [y_i - E(Y | X = x)]^2 P(Y = y_i | X = x)
$$

Properties
I $E[E(Y|X)] = E(Y)$ (law of iterated expectations)
I $E(aX + bZ | Y) = aE(X|Y) + bE(Z|Y)$
I $E(aX | X) = aX$
I $E(X|Y) = E(X)$ if X and Y are independent
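
The conditional moments and the law of iterated expectations can be sketched on a small joint distribution. The table below is hypothetical: its probabilities are invented for illustration and are not the joint distribution shown in the lecture. The function names are also illustrative.

```python
from fractions import Fraction as Fr

# Hypothetical joint p.m.f. with keys (x, y); NOT the table from the lecture.
joint = {(0, 0): Fr(3, 10), (0, 1): Fr(2, 10),
         (1, 0): Fr(1, 10), (1, 1): Fr(4, 10)}

def marginal_x(x):
    """P(X = x), obtained by summing the joint p.m.f. over y."""
    return sum(p for (xi, _), p in joint.items() if xi == x)

def cond_exp_y(x):
    """E(Y | X = x) = sum_i y_i P(Y = y_i | X = x)."""
    return sum(y * p for (xi, y), p in joint.items() if xi == x) / marginal_x(x)

def cond_var_y(x):
    """var(Y | X = x) = sum_i [y_i - E(Y | X = x)]^2 P(Y = y_i | X = x)."""
    m = cond_exp_y(x)
    return sum((y - m) ** 2 * p
               for (xi, y), p in joint.items() if xi == x) / marginal_x(x)

e_y = sum(y * p for (_, y), p in joint.items())             # E(Y)
lie = sum(cond_exp_y(x) * marginal_x(x) for x in (0, 1))    # E[E(Y|X)]

print(cond_exp_y(0), cond_exp_y(1), cond_var_y(0))
print(lie == e_y)   # law of iterated expectations
```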

Recall of Statistics and Probability
Conditional Distribution

I Given the following conditional distribution

Compute
I E (Y |X = 0)
I E (Y |X = 1)
I var (Y |X = 0)

Recall of Statistics and Probability
Moments

I The mean of X, E(X), is also called the first moment of X

I The expected value of the square of X, $E(X^2)$, is called the second moment of X
I In general, the expected value of $X^r$ is called the rth moment of the random variable X
I The skewness is a function of the first, second, and third moments of X
I The kurtosis is a function of the first through fourth moments of X

Recall of Statistics and Probability
Skewness

The skewness of a distribution shows how much a distribution deviates from symmetry

$$
Skewness = \frac{E[(X - \mu_x)^3]}{\sigma_x^3}
$$

I A distribution has positive (negative) skewness if it has a long right (left) tail
I When the distribution is symmetric, $E[(X - \mu_x)^3] = 0$ and the skewness is zero

Recall of Statistics and Probability
Skewness & Kurtosis

Recall of Statistics and Probability
Kurtosis
The kurtosis of a distribution is a measure of how much mass is in its tails and, therefore, is a measure of how much of the variance of X arises from extreme values
The greater the kurtosis of a distribution, the more likely are outliers (extreme values)

$$
Kurtosis = \frac{E[(X - \mu_x)^4]}{\sigma_x^4}
$$

I Because $E[(X - \mu_x)^4]$ cannot be negative, the kurtosis cannot be negative
I The kurtosis of a normally distributed random variable is 3
I A distribution with kurtosis exceeding 3 is called leptokurtic
I A distribution with kurtosis less than 3 is called platykurtic
I Like skewness, the kurtosis is unit free, so changing the units of X does not change its kurtosis.
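
As a worked illustration of the skewness and kurtosis formulas, the sketch below computes both for the fair-die distribution used earlier. The helper `central_moment` is an illustrative name; the die is symmetric around 3.5, so its third central moment should vanish, and its flat shape should give a kurtosis below 3.

```python
from fractions import Fraction as Fr

pmf = {x: Fr(1, 6) for x in range(1, 7)}   # fair die

def central_moment(r):
    """E[(X - mu)^r] for the discrete distribution in pmf."""
    mu = sum(x * p for x, p in pmf.items())
    return sum((x - mu) ** r * p for x, p in pmf.items())

variance = central_moment(2)
# sigma^3 is generally irrational, so the skewness is computed in floats
skewness = float(central_moment(3)) / float(variance) ** 1.5
# sigma^4 = variance^2, so the kurtosis stays an exact fraction
kurtosis = central_moment(4) / variance ** 2

print(variance)
print(skewness)
print(kurtosis, float(kurtosis))   # below 3: platykurtic
```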
Recall of Statistics and Probability
The Normal Distribution
The probability density function of a normally distributed random variable (the normal p.d.f.) is

$$
f(x) = \frac{1}{\sigma \sqrt{2\pi}} \exp\left[ -\frac{1}{2} \left( \frac{x - \mu}{\sigma} \right)^2 \right]
$$

The normal density with mean $\mu$ and variance $\sigma^2$ is symmetric around its mean and has 95% of its probability between $\mu - 1.96\sigma$ and $\mu + 1.96\sigma$

Recall of Statistics and Probability
The Standard Normal Distribution
Using the transformation $Z = \frac{X - \mu}{\sigma}$ we get the standard normal distribution, which is a normal distribution with mean $\mu = 0$ and variance $\sigma^2 = 1$

# Example:
Suppose X is distributed N(1,4). What is the probability that X ≤ 2?
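
One way to work through this example numerically is to standardize and evaluate the standard normal c.d.f. with the error function from Python's `math` module. The sketch assumes that N(1, 4) denotes mean 1 and variance 4, so that the standard deviation is 2; `normal_cdf` is an illustrative helper name.

```python
import math

def normal_cdf(x, mu=0.0, sigma=1.0):
    """P(X <= x) for X ~ N(mu, sigma^2), via Z = (X - mu) / sigma."""
    z = (x - mu) / sigma
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

mu, sigma = 1.0, 2.0    # assumption: N(1, 4) means variance 4, so sigma = 2

z = (2.0 - mu) / sigma              # standardized value: 0.5
p = normal_cdf(2.0, mu, sigma)      # P(X <= 2) = P(Z <= 0.5)
print(z, round(p, 4))

# Share of probability between mu - 1.96 sigma and mu + 1.96 sigma
within = (normal_cdf(mu + 1.96 * sigma, mu, sigma)
          - normal_cdf(mu - 1.96 * sigma, mu, sigma))
print(round(within, 4))
```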
Recall of Statistics and Probability
The χ2 Distribution
The $\chi^2$ distribution is the distribution of the sum of m squared independent standard normal random variables.
This distribution depends on m, which is called the degrees of freedom of the chi-squared distribution

Suppose we have a standard normal random variable Z obtained by the transformation $Z = \frac{X - \mu}{\sigma}$ of a normal random variable X:

$$
X \sim N(\mu, \sigma^2), \qquad Z \sim N(0, 1)
$$

Taking the square of Z we get $Z^2 = \left( \frac{X - \mu}{\sigma} \right)^2$. The sum of m independent such squares follows the $\chi^2_m$ distribution:

$$
\sum_{i=1}^{m} Z_i^2 = \sum_{i=1}^{m} \left( \frac{X_i - \mu}{\sigma} \right)^2 \sim \chi^2_m
$$
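
The construction above can be mimicked by simulation. The sketch below draws normal variables, standardizes them, and sums m squares; the parameter values, seed and sample size are arbitrary choices for illustration. It relies on the known fact that a chi-squared variable with m degrees of freedom has mean m, so the simulated average should be close to m.

```python
import random

rng = random.Random(0)   # fixed seed so the run is reproducible

def chi2_draw(m, mu=0.0, sigma=1.0):
    """One draw from chi-squared(m): sum of m squared standardized normals."""
    total = 0.0
    for _ in range(m):
        x = rng.gauss(mu, sigma)   # X ~ N(mu, sigma^2)
        z = (x - mu) / sigma       # Z ~ N(0, 1)
        total += z * z
    return total

m = 5
draws = [chi2_draw(m, mu=10.0, sigma=3.0) for _ in range(20_000)]
mean = sum(draws) / len(draws)

print(min(draws) >= 0.0)   # a sum of squares is never negative
print(mean)                # should be close to m
```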
Recall of Statistics and Probability
The t Distribution
The Student t distribution with m degrees of freedom is the distribution of the ratio of a standard normal random variable to the square root of an independently distributed chi-squared random variable with m degrees of freedom divided by m

$$
t_m = \frac{Z}{\sqrt{\chi^2_m / m}}
$$

I The Student t distribution depends on the degrees of freedom m.
I The Student t distribution has a bell shape similar to that of the normal distribution, but when m is small (20 or less), it has more mass in the tails; that is, it is a "fatter" bell shape than the normal.
I When m is 30 or more, the Student t distribution is well approximated by the standard normal distribution, and the $t_\infty$ distribution equals the standard normal distribution.
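
The fatter tails for small m can be seen in a rough simulation that builds t draws directly from the definition. The degrees of freedom (5 and 30), the cutoff 1.96, the seed and the number of draws are arbitrary illustrative choices; for a standard normal, about 5% of the mass lies beyond 1.96 in absolute value.

```python
import math
import random

rng = random.Random(1)   # fixed seed so the run is reproducible

def t_draw(m):
    """One draw of Z / sqrt(chi2_m / m), with Z and chi2_m independent."""
    z = rng.gauss(0.0, 1.0)
    chi2 = sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(m))
    return z / math.sqrt(chi2 / m)

def tail_share(m, n=20_000, cutoff=1.96):
    """Simulated share of draws with |t| above the cutoff."""
    return sum(abs(t_draw(m)) > cutoff for _ in range(n)) / n

small = tail_share(5)    # few degrees of freedom: heavy tails
large = tail_share(30)   # close to the normal benchmark of 0.05

print(small, large)
```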
Recall of Statistics and Probability
The F Distribution

The F distribution with m and n degrees of freedom, $F_{m,n}$, is the distribution of the ratio of two independent chi-squared random variables, each divided by its degrees of freedom

$$
F_{m,n} = \frac{\chi^2_m / m}{\chi^2_n / n}
$$

An important special case of the F distribution arises when the denominator degrees of freedom is large enough that the $F_{m,n}$ distribution can be approximated by the $F_{m,\infty}$ distribution. In this limiting case, the denominator random variable is the mean of infinitely many squared standard normal random variables, and that mean is 1 because the mean of a squared standard normal random variable is 1
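
The limiting argument can be illustrated by simulating the denominator of the F ratio. In this sketch the degrees of freedom (5 and 500), the seed and the number of draws are arbitrary choices, and the helper names are illustrative; the point is only that the scaled chi-squared in the denominator concentrates around 1 as n grows.

```python
import random

rng = random.Random(2)   # fixed seed so the run is reproducible

def chi2_over_df(df):
    """A chi-squared(df) draw divided by df: a mean of df squared normals."""
    return sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(df)) / df

def f_draw(m, n):
    """One draw from F(m, n) built directly from the definition."""
    return chi2_over_df(m) / chi2_over_df(n)

def mean_abs_dev_from_one(values):
    """Average distance of the values from 1."""
    return sum(abs(v - 1.0) for v in values) / len(values)

den_small = [chi2_over_df(5) for _ in range(1_000)]
den_large = [chi2_over_df(500) for _ in range(1_000)]

mad_small = mean_abs_dev_from_one(den_small)
mad_large = mean_abs_dev_from_one(den_large)

print(mad_small, mad_large)   # the large-n denominator stays close to 1
print(f_draw(3, 500))         # an F draw whose denominator is nearly 1
```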
