where
– β0 is the intercept
– β1, …, βk are the partial slope coefficients of xi1, …, xik on yi
• each measures the "ceteris paribus" effect of a unit change in its regressor on y
– εi is the error term
• This is equivalent to
yi = β0 + β1xi1 + β2xi2 + … + βkxik + εi
• In vector form, the econometric model can be
rewritten as
y = Xβ + ε (4)
where
– y is the dependent variable (an n×1 column vector)
– X is the matrix of regressors, including a column of ones for the intercept (n×(k+1))
– β is the vector of parameters to be estimated ((k+1)×1)
– ε is the vector of error terms (an n×1 column vector)
The Gauss-Markov Assumptions for
Multiple Regression
• Linearity in Parameters
• Random sampling
• No Perfect Collinearity
– In the sample none of the independent variables is
constant, and there are no exact linear relationships
among the independent variables
• None of the independent variables is a multiple of another
• None has perfect correlation with a linear combination of the
others. Otherwise some variable would be redundant, and we
could not "identify" the parameters
– X has full column rank (k+1)
• Zero Conditional Mean:
– The error ε has an expected value of zero given
any values of the independent variables
E(ε|x1, x2, ... xk) = 0 implies E(ε) = 0
• Spherical disturbance:
– Constant variance-covariance matrix of the error term
E(εε′) = σ²I, where I is the n×n identity matrix:

          ⎡ σ²  0  …  0  ⎤
E(εε′) =  ⎢ 0   σ² …  0  ⎥
          ⎢ …   …  …  …  ⎥
          ⎣ 0   0  …  σ² ⎦   (5)
Estimation… (cont’d)
• Notes:
– The assumption of full column rank (= k+1) for the matrix X
ensures that X′X is invertible, and hence that the OLS
estimator b can be obtained.
• "X′X is invertible" means that (X′X)⁻¹ exists
– We know that minimizing the sum of squared residuals e′e yields the
normal equations X′Xb = X′y, and that X′X is invertible
– Then, solving for b, we have
b = (X′X)⁻¹X′y
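The estimator b = (X′X)⁻¹X′y can be checked numerically. Below is a minimal numpy sketch on simulated data; the sample size, coefficients, and noise level are invented for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
n, k = 200, 2

# Design matrix with an intercept column (full column rank k+1 = 3)
X = np.column_stack([np.ones(n), rng.normal(size=(n, k))])
beta = np.array([1.0, 2.0, -0.5])
y = X @ beta + rng.normal(scale=0.1, size=n)

# b = (X'X)^{-1} X'y, computed by solving the normal equations X'X b = X'y
b = np.linalg.solve(X.T @ X, X.T @ y)
print(b)  # close to beta = [1.0, 2.0, -0.5]
```

Solving the normal equations with `np.linalg.solve` is numerically preferable to explicitly forming the inverse `(X′X)⁻¹`.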
• Variance of the error term (σ2)
– The variances of the β̂'s depend on the variance of the error
term.
– In practice, the error variance σ² is unknown, so we must
estimate it
– Following the logic we used in the simple linear regression
model, we estimate it as
s² = e′e / (n − k − 1), where e = y − Xb is the vector of OLS residuals
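The degrees-of-freedom correction n − k − 1 can be illustrated on simulated data. A hedged numpy sketch (the true σ and all data are invented):

```python
import numpy as np

rng = np.random.default_rng(1)
n, k = 500, 3
sigma = 0.7  # true error standard deviation (known here only because we simulate)

X = np.column_stack([np.ones(n), rng.normal(size=(n, k))])
beta = np.array([0.5, 1.0, -1.0, 2.0])
y = X @ beta + rng.normal(scale=sigma, size=n)

b = np.linalg.solve(X.T @ X, X.T @ y)
e = y - X @ b                   # OLS residuals
s2 = (e @ e) / (n - k - 1)      # unbiased estimate of sigma^2
print(s2)                       # close to sigma**2 = 0.49
```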
Properties… (cont’d)
• Unbiasedness
– Under the GMAs, the OLS estimator is unbiased for β
– Proof:
• we know that y = Xβ + ε and b = (X′X)⁻¹X′y
• substituting the former into the latter, we find
b = β + (X′X)⁻¹X′ε, so E(b|X) = β since E(ε|X) = 0
• Minimum variance
– Among all linear unbiased estimators, β̂ has minimum
variance (the Gauss-Markov theorem)
• Var(β̂|X) = σ²(X′X)⁻¹
• Normality
– If ε ∼ N(0, σ²I), it follows that:
β̂ ∼ N[β, σ²(X′X)⁻¹]
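Unbiasedness can be seen in a small Monte Carlo experiment: re-drawing the errors many times and averaging the OLS estimates recovers β. A sketch with invented values:

```python
import numpy as np

rng = np.random.default_rng(2)
n, reps = 100, 2000
beta = np.array([1.0, -2.0])
X = np.column_stack([np.ones(n), rng.normal(size=n)])  # regressors held fixed

# Draw fresh errors in each replication and re-estimate
estimates = np.empty((reps, 2))
for r in range(reps):
    y = X @ beta + rng.normal(size=n)
    estimates[r] = np.linalg.solve(X.T @ X, X.T @ y)

print(estimates.mean(axis=0))  # close to beta = [1.0, -2.0]
```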
Measure of Goodness of Fit
• By analogy with the simple regression model,
we can define:
R² = SSE/SST = 1 − SSR/SST
Measure of Goodness of Fit
• How well does the estimated model fit the
data?
– One GOF measure is the standard error of the regression
• The decomposition SST = SSE + SSR is also valid for the multiple linear regression model
– We can define the same three sums of squares-SST, SSE,
SSR- as in simple regression, and R2 is still the ratio of the
explained sum of squares (SSE) to the total sum of squares
(SST):
• Note that
– only SSR depends on the regressors; by adding more regressors, SSR will never
rise (and will typically fall).
– Thus, R² can be made artificially large by adding irrelevant regressors
• So, we should not be impressed by a high value of R² in a model
with a long list of explanatory variables
• R² can be used to compare different models with
the same dependent variable and the same number of
explanatory variables.
• However, when comparing models with the same
dependent variable but a different number of
independent variables, R² is not a good measure
• Adjusted-R2
– To avoid this problem, R2 can be adjusted by
accounting for the degrees of freedom
– Adjusted-R2 is calculated as follows:
Adjusted-R² = 1 − (1 − R²)(n − 1)/(n − k − 1)
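The contrast between R² and adjusted R² can be seen by adding a pure-noise regressor: R² never falls, while the adjustment penalizes the lost degree of freedom. A minimal numpy sketch on invented data:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 100
x = rng.normal(size=n)
y = 1.0 + 2.0 * x + rng.normal(size=n)

# Model 1: the relevant regressor; Model 2: plus an irrelevant noise column
X1 = np.column_stack([np.ones(n), x])
X2 = np.column_stack([X1, rng.normal(size=n)])

def r2_adj(X, y):
    b = np.linalg.solve(X.T @ X, X.T @ y)
    ssr = ((y - X @ b) ** 2).sum()          # residual sum of squares
    sst = ((y - y.mean()) ** 2).sum()       # total sum of squares
    n_obs, n_params = X.shape               # n_params = k + 1
    r2 = 1 - ssr / sst
    adj = 1 - (1 - r2) * (n_obs - 1) / (n_obs - n_params)
    return r2, adj

print(r2_adj(X1, y))
print(r2_adj(X2, y))  # R2 never falls when a regressor is added
```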
• Partialling out:
– Let's focus on the case of two explanatory variables (besides the
constant), so k = 2:
yi = β0 + β1xi1 + β2xi2 + εi
• Step 1: Regress yi on xi2 and store the residuals ui2
• Step 2: Regress xi1 on xi2 and store the residuals ui1
• Step 3: Regress ui2 on ui1; the slope is exactly the OLS estimate
of β1 from the full regression
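The three-step partialling-out procedure above can be verified numerically: the slope from the residual-on-residual regression matches the coefficient from the full regression exactly. A sketch on invented, deliberately correlated data:

```python
import numpy as np

rng = np.random.default_rng(4)
n = 300
x2 = rng.normal(size=n)
x1 = 0.5 * x2 + rng.normal(size=n)            # regressors are correlated
y = 1.0 + 2.0 * x1 - 1.0 * x2 + rng.normal(size=n)

def resid(v, w):
    """Residuals from an OLS regression of v on w (intercept included)."""
    W = np.column_stack([np.ones(len(v)), w])
    return v - W @ np.linalg.solve(W.T @ W, W.T @ v)

u_y2 = resid(y, x2)    # Step 1: purge x2 from y
u_12 = resid(x1, x2)   # Step 2: purge x2 from x1
# Step 3: slope of u_y2 on u_12 (no intercept needed; residuals have mean zero)
b1 = (u_12 @ u_y2) / (u_12 @ u_12)
print(b1)  # equals the coefficient on x1 from the full regression of y on (1, x1, x2)
```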
• Let us estimate the effect of education on wages,
taking also into account the effect of experience
– Start by regressing education on experience (even if it
seems silly...), storing the residuals of this regression
– Then regress wages on the previously stored residuals
• What if we regress wages on education and experience directly?
– We obtain exactly the same coefficient on education as in the
two-step (partialling-out) procedure
• How do we interpret the beta coefficients?
– This estimated equation shows that
• a one standard deviation increase in oxide decreases price
by .34 standard deviations and
• a one standard deviation increase in crime reduces price by
.14 standard deviations.
– Thus, the same relative movement of pollution has a
larger effect on housing prices than crime does.
– Size of the house, as measured by number of rooms
(rooms), has the largest standardized effect.
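Standardized (beta) coefficients can be obtained by z-scoring every variable before running OLS; they equal the raw slopes rescaled by sx/sy. A sketch on invented data loosely echoing the housing example (variable names and values are hypothetical, not the textbook dataset):

```python
import numpy as np

rng = np.random.default_rng(5)
n = 400
nox = rng.normal(5, 1.5, n)       # hypothetical pollution measure
rooms = rng.normal(6, 0.7, n)     # hypothetical house size
price = 30 - 2.0 * nox + 8.0 * rooms + rng.normal(scale=5, size=n)

def z(v):
    """Standardize a variable to mean 0 and standard deviation 1."""
    return (v - v.mean()) / v.std()

# Regress standardized price on standardized regressors; with all means
# zero, no intercept column is needed
Xz = np.column_stack([z(nox), z(rooms)])
bz = np.linalg.solve(Xz.T @ Xz, Xz.T @ z(price))
print(bz)  # effect of a one-s.d. change in each regressor, in s.d. of price
```

The identity bz_j = b_j · s_xj / s_y links these to the unstandardized slopes.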
Hypothesis testing and Inferences
• A hypothesis test is a statistical inference
technique that assesses whether the
information provided by the data (sample)
supports or does not support a particular
conjecture or assumptions about the
population.
• Testing the significance of a single coefficient
– Hypothesis testing about an individual partial regression
coefficient can be conducted using the t-statistic as
usual.
• Testing multiple coefficients
– One of the things we can do in the multiple regression model, which we
cannot do in the simple regression model, is to test hypotheses that
involve several of the regression coefficients simultaneously.
• Post hoc t Tests
– Besides restricted least squares, we can also test linear
restrictions and (within that) equality of coefficients using a
post hoc t-test.
a) Equality of regression coefficients
• This can be tested using a t-statistic.
b) Test for linear restrictions
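One common way to test a linear restriction is to compare the restricted and unrestricted residual sums of squares with an F-statistic, F = [(SSRr − SSRu)/q] / [SSRu/(n − k − 1)]. A hedged numpy sketch testing β1 = β2 on invented data where the restriction is true:

```python
import numpy as np

rng = np.random.default_rng(6)
n = 200
x1, x2 = rng.normal(size=n), rng.normal(size=n)
y = 1.0 + 1.5 * x1 + 1.5 * x2 + rng.normal(size=n)  # true model has beta1 = beta2

# Unrestricted model: y on (1, x1, x2)
Xu = np.column_stack([np.ones(n), x1, x2])
bu = np.linalg.solve(Xu.T @ Xu, Xu.T @ y)
ssr_u = ((y - Xu @ bu) ** 2).sum()

# Restricted model imposing beta1 = beta2: y on (1, x1 + x2)
Xr = np.column_stack([np.ones(n), x1 + x2])
br = np.linalg.solve(Xr.T @ Xr, Xr.T @ y)
ssr_r = ((y - Xr @ br) ** 2).sum()

q = 1                        # number of restrictions
df = n - Xu.shape[1]         # residual degrees of freedom, n - k - 1
F = ((ssr_r - ssr_u) / q) / (ssr_u / df)
print(F)  # compare with the F(q, df) critical value; typically small here
          # since the restriction holds in the data-generating process
```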
Multiple Regression with Dummy
Independent Variable
• The multiple regression model often contains
qualitative factors, which are not measured in
any units, as independent variables:
– gender, race or nationality
– employment status or home ownership
– temperatures before 1900 and after (including)
1900
• Such qualitative factors often come in the form of
binary information and are captured by defining a
zero-one variable, called dummy variables.
– take value 1 in a category and value 0 "otherwise".
– "Otherwise" can represent one or more other
categories
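Including a dummy regressor works exactly like any other column of X; its coefficient measures the shift between the two categories, holding other regressors fixed. A minimal sketch on invented wage data (names and magnitudes are hypothetical):

```python
import numpy as np

rng = np.random.default_rng(7)
n = 500
female = rng.integers(0, 2, size=n)     # dummy: 1 in one category, 0 otherwise
educ = rng.normal(12, 2, size=n)
wage = 2.0 + 0.5 * educ - 1.8 * female + rng.normal(size=n)

# The dummy enters X as an ordinary zero-one column
X = np.column_stack([np.ones(n), educ, female])
b = np.linalg.solve(X.T @ X, X.T @ wage)
print(b)  # b[2] is the estimated category gap, close to -1.8 here
```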
• Warning: Dummy variables can be used only
as regressors. Why?
– Because a binary dependent variable violates the normality
assumption on the error term, which renders the t-statistics
invalid.
• If the dependent variable is binary, you need to use a Logit
or Probit regression model, which employs a maximum
likelihood (ML) estimator.
End of chapter 3
10Q!