Simple Linear Regression
Population Model
• Cross-sectional analysis
• Assume that the sample is collected randomly from the population.
• We want to know how y varies with changes in x.
• What if y is affected by factors other than x?
• What is the functional form?
• How can we distinguish causality from correlation?
• Consider the following model, which holds in the population: y = β0 + β1x + u
Population Model
• We allow for other factors to affect y by including u (the error term).
• If the other factors in u are held fixed, ∆u = 0, then x has a linear effect on y: ∆y = β1∆x.
• Linearity: a one-unit change in x has the same effect on y, regardless of the starting value of x.
• The goal of empirical work is to estimate β0 and β1 (the population parameters).
• β0 and β1 are not directly observable.
• We estimate β0 and β1 using data and ASSUMPTIONS.
A simple assumption
• The average value of u, the error term, in the population is 0: E(u) = 0.
• This is not a restrictive assumption, since we can always use β0 to normalize E(u) to 0.
• Show this!
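A sketch of the usual normalization argument (written in LaTeX; the symbol α0 for a possibly nonzero error mean is introduced here for illustration and does not appear on the slides):

```latex
% Suppose the error has mean E(u) = alpha_0, possibly nonzero.
% Define v = u - alpha_0, so E(v) = 0, and fold alpha_0 into the intercept:
\[
  y = \beta_0 + \beta_1 x + u
    = (\beta_0 + \alpha_0) + \beta_1 x + v ,
  \qquad v = u - \alpha_0 ,\ E(v) = 0 .
\]
% The redefined intercept absorbs the mean of the error, so assuming
% E(u) = 0 costs nothing as long as an intercept is included.
```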
Zero conditional mean / Mean independence
• We need to make a crucial assumption about how u and x are related.
• We want it to be the case that knowing something about x does not give us any information about u, so that they are completely unrelated.
• E(u|x) = E(u) = 0, which implies
• E(y|x) = β0 + β1x
• This is the most crucial and most challenging assumption for the interpretation of β1 as a causal parameter.
[Figure: E(y|x) = β0 + β1x as a linear function of x; for any value of x (e.g., x1, x2), the distribution of y, f(y), is centered about E(y|x).]
Ordinary Least Squares
• The basic idea of regression is to estimate the population parameters from a sample.
• Let {(xi, yi): i = 1, …, n} denote a random sample of size n from the population.
• For each observation in this sample, it will be the case that yi = β0 + β1xi + ui.
• ui is unobserved.
To calculate the estimates of the coefficients that minimize the differences between the data points and the line, use the formulas:
b1 = cov(X, Y) / s²x
b0 = ȳ − b1x̄
The regression equation that estimates the equation of the first-order linear model is:
ŷ = b0 + b1x
Example 17.1: Relationship between odometer reading and a used car's selling price.
• A car dealer wants to find the relationship between the odometer reading (independent variable x) and the selling price (dependent variable y) of used cars.
• A random sample of 100 cars is selected, and the data recorded.
• Find the regression line.

Car   Odometer   Price
1     37388      5318
2     44758      5061
3     45833      5008
4     30862      5795
5     31705      5784
6     34010      5359
…     …          …
Solution
• Solving by hand.
• To calculate b0 and b1 we need to calculate several statistics first (n = 100):
x̄ = 36,009.45;  s²x = Σ(xi − x̄)² / (n − 1) = 43,528,688
ȳ = 5,411.41;  cov(X, Y) = Σ(xi − x̄)(yi − ȳ) / (n − 1) = −1,356,256
b1 = cov(X, Y) / s²x = −1,356,256 / 43,528,688 = −.0312
b0 = ȳ − b1x̄ = 5,411.41 − (−.0312)(36,009.45) = 6,533
ŷ = b0 + b1x = 6,533 − .0312x
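A minimal Python sketch of the same by-hand computation (numpy assumed available; only the six listed data rows are used, so the numbers will not match the full n = 100 sample):

```python
import numpy as np

# First six observations from the table above (odometer reading, price);
# the full example uses n = 100 cars, so these estimates are only illustrative.
x = np.array([37388, 44758, 45833, 30862, 31705, 34010], dtype=float)
y = np.array([5318, 5061, 5008, 5795, 5784, 5359], dtype=float)

n = len(x)
x_bar, y_bar = x.mean(), y.mean()

# Sample variance of x and sample covariance of x and y (divide by n - 1).
s2_x = ((x - x_bar) ** 2).sum() / (n - 1)
cov_xy = ((x - x_bar) * (y - y_bar)).sum() / (n - 1)

# OLS slope and intercept.
b1 = cov_xy / s2_x
b0 = y_bar - b1 * x_bar

print(f"b1 = {b1:.4f}, b0 = {b0:.1f}")  # slope should be negative: higher mileage, lower price
```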
Alternative approach to deriving the OLS estimates
• To derive the OLS estimates we need to realize that our main assumption, E(u|x) = E(u) = 0, also implies that
• Cov(x, u) = E(xu) = 0
• Why? Remember from basic probability that Cov(X, Y) = E(XY) − E(X)E(Y).
• We can write these 2 restrictions just in terms of x, y, β0 and β1, since u = y − β0 − β1x.
Alternative approach, continued
• If one uses calculus to solve the minimization problem for the two parameters, you obtain the following first-order conditions, which are the same as those we obtained before, multiplied by n:
Σ (yi − β̂0 − β̂1xi) = 0
Σ xi(yi − β̂0 − β̂1xi) = 0
Deriving OLS, continued
• We can write our 2 restrictions just in terms of x, y, β0 and β1, since u = y − β0 − β1x:
E(y − β0 − β1x) = 0
E[x(y − β0 − β1x)] = 0
• These are called moment restrictions.
• β̂0 and β̂1 are the estimates from the data, chosen to satisfy the sample analogues of these restrictions.
More derivation
• The first sample moment condition gives β̂0 = ȳ − β̂1x̄. Plug this into the second equation!
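Carrying out that substitution (standard algebra, shown here as a worked derivation in LaTeX):

```latex
% Plug beta0-hat = ybar - beta1-hat * xbar into the second sample moment
% condition sum_i x_i (y_i - beta0-hat - beta1-hat x_i) = 0 and solve.
\begin{align*}
  \sum_i x_i\bigl(y_i - (\bar y - \hat\beta_1 \bar x) - \hat\beta_1 x_i\bigr) &= 0 \\
  \sum_i x_i (y_i - \bar y) &= \hat\beta_1 \sum_i x_i (x_i - \bar x) \\
  \hat\beta_1 &= \frac{\sum_i (x_i - \bar x)(y_i - \bar y)}{\sum_i (x_i - \bar x)^2},
\end{align*}
% using the identities sum_i x_i (y_i - ybar) = sum_i (x_i - xbar)(y_i - ybar)
% and sum_i x_i (x_i - xbar) = sum_i (x_i - xbar)^2.
```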
Summary of the OLS slope estimate
• The slope estimate is the sample covariance between x and y divided by the sample variance of x.
• If x and y are positively correlated, the slope will be positive.
• If x and y are negatively correlated, the slope will be negative.
• We only need x to vary in our sample (so that the sample variance of x is not zero).
More OLS
• Intuitively, OLS fits a line through the sample points such that the sum of squared residuals is as small as possible, hence the term "least squares."
• The residual, û, is an estimate of the error term, u, and is the difference between the sample point and the fitted line (the sample regression function).
Sample regression line, sample data points, and the associated estimated error terms
[Figure: scatter of sample points around the fitted line ŷ = β̂0 + β̂1x, with the residuals ûi shown as vertical distances from each point to the line.]
A short simulation
Residuals and fitted values are uncorrelated, by construction!
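A minimal simulation sketch of this claim (Python with numpy assumed available; the model, coefficients, and sample size below are illustrative choices, not taken from the slides):

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulate one sample from an arbitrary linear model (numbers chosen for illustration).
n = 500
x = rng.normal(size=n)
u = rng.normal(scale=2.0, size=n)
y = 1.0 + 0.5 * x + u

# OLS slope and intercept by hand.
b1 = np.cov(x, y, ddof=1)[0, 1] / np.var(x, ddof=1)
b0 = y.mean() - b1 * x.mean()

fitted = b0 + b1 * x
resid = y - fitted

# Both quantities are (numerically) zero by construction of the OLS first order conditions.
print("mean of residuals:", resid.mean())
print("corr(fitted, residuals):", np.corrcoef(fitted, resid)[0, 1])
```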
Algebraic Properties of OLS
• The sum of the OLS residuals is zero: the coefficients were chosen precisely so that the residuals sum to zero (the first order condition for the intercept).
• Thus, the sample average of the OLS residuals is zero as well.
• The sample covariance (and correlation) between the regressor and the OLS residuals is zero.
• Because fitted values are linear functions of the xi, fitted values and residuals are uncorrelated too.
• The OLS regression line always goes through the mean of the sample.
• If we plug in x̄, we predict ȳ; that is, the point (x̄, ȳ) is on the OLS regression line:
ȳ = β̂0 + β̂1x̄
Algebraic Properties of OLS
• Residuals sum to zero: Σ ûi = 0.
• ȳ = (1/n) Σ ŷi, since yi = ŷi + ûi and the residuals sum to zero.
• The sample covariance between x and the residuals is always zero: Σ xiûi = 0.
• The fitted values and residuals are uncorrelated too: Σ ŷiûi = 0.
• The OLS regression line always goes through the mean of the sample: ȳ = β̂0 + β̂1x̄.
Goodness of Fit
• How do we think about how well our sample regression line fits our sample data?
• We can compute the fraction of the total sum of squares (SST) that is explained by the model; we call this the R-squared of the regression.
• We can think of each observation as being made up of an explained part and an unexplained part: yi = ŷi + ûi.
• We then define the following:
SST = Σ (yi − ȳ)² is the total sum of squares.
SSE = Σ (ŷi − ȳ)² is the explained sum of squares.
SSR = Σ ûi² is the residual sum of squares.
Proving SST = SSE + SSR
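A sketch of the standard decomposition argument, using the residuals ûi and fitted values ŷi defined above:

```latex
% Write y_i - ybar = uhat_i + (yhat_i - ybar), square, and sum over i.
\begin{align*}
  \text{SST} &= \sum_i (y_i - \bar y)^2
             = \sum_i \bigl[\hat u_i + (\hat y_i - \bar y)\bigr]^2 \\
             &= \sum_i \hat u_i^2
               + 2\sum_i \hat u_i(\hat y_i - \bar y)
               + \sum_i (\hat y_i - \bar y)^2 \\
             &= \text{SSR} + 0 + \text{SSE},
\end{align*}
% The cross term vanishes because the residuals sum to zero and are
% uncorrelated with the fitted values (the OLS first order conditions).
```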
Goodness of Fit
• Then SST = SSE + SSR.
• R² = SSE/SST = 1 − SSR/SST
• R² is the coefficient of determination. It is interpreted as the fraction of the sample variation in y that is explained by x.
• An R² of zero means no linear relationship between y and x.
• An R² of one means a perfect linear relationship.
• As R² increases, the data points fall closer and closer to the OLS regression line.
• R² never decreases when another explanatory variable is added to the model.
• R² is a useful summary measure, but it does not tell us about causality.
Unbiasedness of OLS
• So far: when we apply OLS to a sample, the residuals always average to zero, regardless of any underlying model.
• Now we will study the statistical properties of the OLS estimators, referring to a population model and assuming random sampling.
• How do estimators behave across different samples of data?
• Will we get the right answer if we repeatedly sample?
• We need to find the expected value of β̂1 across all possible random samples, and determine whether we are right on average.
• Unbiasedness: E(β̂1) = β1 (and E(β̂0) = β0).
Unbiasedness of OLS
• β̂1 is the estimate from one specific sample.
• Different samples will generate different estimates β̂1.
• Unbiasedness means that if we could take as many random samples as we want and compute β̂1 each time, the average of those estimates would be β1.
Unbiasedness of OLS
• Assume the population model is linear in parameters: y = β0 + β1x + u.
• Assume a random sample of size n, {(xi, yi): i = 1, 2, …, n}, from the population.
• Thus we can write the sample model yi = β0 + β1xi + ui.
• Assume E(u|x) = 0 and thus E(ui|xi) = 0.
• Assume there is variation in the xi.
• How do we show the OLS estimator is unbiased, i.e., E(β̂1) = β1?
Substituting the population model into the slope formula gives β̂1 = β1 + Σ(xi − x̄)ui / Σ(xi − x̄)² (derivation sketched below). The last term is the slope coefficient from a regression of ui on xi. But this is an imaginary regression, since ui is unobserved.
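A sketch of the conditional-expectation argument (standard steps under the four assumptions above):

```latex
% Substitute y_i = beta_0 + beta_1 x_i + u_i into the slope formula and take
% expectations conditional on x_1, ..., x_n.
\begin{align*}
  \hat\beta_1
    &= \frac{\sum_i (x_i - \bar x)\,y_i}{\sum_i (x_i - \bar x)^2}
     = \beta_1 + \frac{\sum_i (x_i - \bar x)\,u_i}{\sum_i (x_i - \bar x)^2}, \\
  E(\hat\beta_1 \mid x_1,\dots,x_n)
    &= \beta_1 + \frac{\sum_i (x_i - \bar x)\,E(u_i \mid x_i)}{\sum_i (x_i - \bar x)^2}
     = \beta_1 ,
\end{align*}
% since E(u_i | x_i) = 0 under the zero conditional mean assumption; the result
% holds for every realization of the x's, so unconditionally E(beta1-hat) = beta_1.
```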
Monte Carlo Simulation
• Suppose we have the following population model: y = 3 + 2x + u,
where x and u are independent (drawn from some chosen distributions).
• We will estimate the OLS coefficients 1,000 times, once per simulated sample.
Monte Carlo Simulation
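A minimal Python sketch of such a simulation (the distributions of x and u and the sample size are assumptions made here; the slide does not specify them):

```python
import numpy as np

rng = np.random.default_rng(42)

beta0_true, beta1_true = 3.0, 2.0
n, reps = 200, 1000          # sample size and number of replications (assumed)

b1_estimates = np.empty(reps)
for r in range(reps):
    x = rng.normal(loc=0.0, scale=1.0, size=n)   # assumed distribution for x
    u = rng.normal(loc=0.0, scale=2.0, size=n)   # assumed distribution for u, independent of x
    y = beta0_true + beta1_true * x + u

    # OLS slope for this sample.
    b1_estimates[r] = np.cov(x, y, ddof=1)[0, 1] / np.var(x, ddof=1)

# If OLS is unbiased, the average estimate should be close to the true slope of 2.
print("average of the 1000 slope estimates:", b1_estimates.mean())
print("standard deviation across samples:  ", b1_estimates.std(ddof=1))
```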
Unbiasedness of OLS
• Unbiasedness is a property of the estimation procedure (the rule); it is not a property of any particular estimate!
• The proof of unbiasedness depends on all four of the assumptions listed above.
Variance of the OLS estimators
• Now we need to capture the uncertainty in the sampling process: the dispersion in the sampling distribution of the estimators.
• The assumptions so far are not sufficient to tell us anything about the variance of the estimator.
• To simplify the calculation, assume homoscedasticity (constant variance):
• u has the same variance given any value of x: Var(u|x) = σ².
• σ² is the variance of the unobserved factors, other than x, that influence y.
Homoskedastic Case
[Figure: the conditional densities f(y|x) at x1 and x2 have the same spread around the line E(y|x) = β0 + β1x.]
Heteroskedastic Case
[Figure: the conditional densities f(y|x) at x1, x2, and x3 have different spreads around the line E(y|x) = β0 + β1x.]
Variance of OLS estimators
The average value of y is allowed to change with x.
The variance does not change with x (homoscedastic).
Sampling variance of OLS
Var(β̂1|x) = σ² / SSTx, where SSTx = Σ(xi − x̄)².
Read Wooldridge Section 2-5b (variance of the OLS estimators) to derive this result.
Sampling variance of β̂1
• This formula is not valid if the homoscedasticity assumption does not hold.
• Remember, homoscedasticity is not used to show unbiasedness!
• As σ² increases, so does Var(β̂1); the more noise in the relationship between y and x (i.e., the larger the variability in u), the harder it is to learn something about β1.
• As SSTx rises, Var(β̂1) decreases; more variation in xi is good.
• Now we need to estimate σ², the error variance.
Estimating σ²
• Replace each ui with its estimate, the OLS residual ûi = yi − β̂0 − β̂1xi.
• Note that ûi is not the same as ui: the residuals depend on the estimated coefficients.
• The unbiased estimator of σ² uses a degrees-of-freedom adjustment. Under the FIVE ASSUMPTIONS it is:
σ̂² = (1/(n − 2)) Σ ûi² = SSR / (n − 2)
• The standard error of the regression (an estimate of the standard deviation of the error in the regression) is σ̂ = √σ̂².
• Stata calls it the root mean squared error (RMSE).
• Given σ̂, we can now estimate sd(β̂0) and sd(β̂1); for example, se(β̂1) = σ̂ / √SSTx.
Gauss-Markov Assumptions
1. Linear in parameters
2. Random sampling
3. Sample variation in x: Σ(xi − x̄)² > 0
4. Zero conditional mean / conditional independence: E(u|x) = 0
5. Homoscedasticity: Var(u|x) = σ²
Under these five assumptions, the OLS estimators are the Best Linear Unbiased Estimators (BLUE).
– Best: in the class of linear unbiased estimators (LUE), OLS has the smallest variance; no linear unbiased estimator does better than OLS.
Robust Standard Errors
• Homoscedasticity is the exception. In real life, errors are usually heteroscedastic.
• Unbiasedness does not depend on the assumption about the variance of the error.
• If the errors are heteroscedastic, Var(ui|xi) = σi², and
Var(β̂1) = Σ(xi − x̄)² σi² / (SSTx)²
Robust Standard Errors
• A valid estimator of Var(β̂1) under heteroscedasticity of any form (including homoscedasticity) is
V̂ar(β̂1) = Σ(xi − x̄)² ûi² / (SSTx)²
• Use the option "robust" in Stata.
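A minimal numpy sketch contrasting the usual and the robust variance formulas for the slope (the data are simulated here only so there is something to compute):

```python
import numpy as np

rng = np.random.default_rng(1)

# Simulated data with heteroscedastic errors (illustrative only).
n = 400
x = rng.uniform(0, 10, size=n)
u = rng.normal(scale=0.5 + 0.3 * x)          # error spread grows with x
y = 1.0 + 0.8 * x + u

# OLS fit by hand.
x_dev = x - x.mean()
sst_x = (x_dev ** 2).sum()
b1 = (x_dev * y).sum() / sst_x
b0 = y.mean() - b1 * x.mean()
resid = y - (b0 + b1 * x)

# Usual (homoscedasticity-based) standard error of b1.
sigma2_hat = (resid ** 2).sum() / (n - 2)
se_classic = np.sqrt(sigma2_hat / sst_x)

# Heteroscedasticity-robust standard error of b1 (White-type form above).
se_robust = np.sqrt((x_dev ** 2 * resid ** 2).sum() / sst_x ** 2)

print("classic se(b1):", se_classic)
print("robust  se(b1):", se_robust)
```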
Statistical Inference
Assumptions of the Classical Linear Model (CLM)
• So far, we know that under the Gauss-Markov assumptions, OLS is BLUE.
• To do classical hypothesis testing, we need to add another assumption (beyond the Gauss-Markov assumptions). Why?
• Under the Gauss-Markov assumptions alone, the distribution of β̂1 can have any shape.
• Assume that u is independent of x and that u is normally distributed with zero mean and variance σ²: u ∼ Normal(0, σ²).
• So the CLM assumptions are: Gauss-Markov assumptions + normality assumption.
CLM Assumptions (cont’d)
• Under the CLM assumptions, OLS is not only BLUE, but also has the minimum variance among ALL unbiased estimators (not just among linear estimators).
• We can summarize the population assumptions of the CLM as follows:
• Conditional on x, y has a normal distribution with mean linear in x and constant variance σ²:
y|x ∼ Normal(β0 + β1x, σ²)
• Normality does not always hold in practice.
• Non-normality of the errors is not a serious problem with large sample sizes.
Normal Sampling Distributions
• Under the CLM assumptions, β̂j ∼ Normal(βj, Var(β̂j)), so that (β̂j − βj)/sd(β̂j) ∼ Normal(0, 1).
The t test
• Under the CLM assumptions: (β̂j − βj) / se(β̂j) ∼ t(n − k − 1).
• Note this is a t distribution (rather than a normal) because we estimate σ² by σ̂².
• Knowing the sampling distribution of the standardized estimator allows us to carry out hypothesis tests.
• Start with a null hypothesis.
• For example, H0: βj = 0.
• If we fail to reject the null, we conclude that xj has no statistically detectable effect on y, controlling for the other x's.
T-test (cont’d)
• To perform the test, we first form the t statistic for β̂j: t = β̂j / se(β̂j) (for the null H0: βj = 0).
• We then use the t statistic along with a rejection rule to determine whether to reject or "accept" (fail to reject) the null hypothesis.
T-test: One sided alternatives
• Besides the null, H0: βj = 0, we need an alternative hypothesis, H1, and a significance level.
• H1 may be one-sided or two-sided.
• H1: βj > 0 and H1: βj < 0 are one-sided.
• H1: βj ≠ 0 is two-sided.
• If we want only a 5% probability of rejecting H0 when it is true, then we say our significance level is 5%.
T-test: One sided alternatives
• Having picked a significance level α, we look up the (1 − α)th percentile of a t distribution with n − k − 1 df and call this c, the critical value.
• For H1: βj > 0, we reject the null hypothesis if the t statistic is greater than the critical value.
• If the t statistic is less than the critical value, we fail to reject the null.
One-Sided Alternatives (cont)
One sided vs two-sided
• Because the t distribution is symmetric, testing H1: βj < 0 is straightforward: the critical value is just the negative of the one before.
• We reject the null if the t statistic < −c; if the t statistic > −c, we fail to reject the null.
• For a two-sided test, we set the critical value based on α/2 and reject H0 if the absolute value of the t statistic > c.
Two-Sided Alternatives
Summary for H0: βj = 0
• Unless otherwise stated, the alternative is assumed to be two-sided.
• If we reject the null, we typically say "xj is statistically significant at the α% level."
• If we fail to reject the null, we typically say "xj is statistically insignificant at the α% level."
Computing p-values for t tests
• An alternative to the classical approach is to ask, “what is the smallest
significance level at which the null would be rejected?”
• So, compute the t statistic, and then look up what percentile it is in
the appropriate t distribution – this is the p-value.
• The p-value is the probability of observing a t statistic at least as extreme as the one we did, if the null were true.
Confidence Interval
• Another way to use classical statistical testing is to construct a confidence interval, using the same critical value as for a two-sided test.
• A (1 − α) confidence interval is defined as β̂j ± c · se(β̂j), where c is the (1 − α/2)th percentile of a t(n − k − 1) distribution.
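A small Python sketch of constructing such an interval (scipy assumed available; the coefficient, standard error, and degrees of freedom are made-up numbers for illustration):

```python
from scipy import stats

# Hypothetical regression output (illustrative values only).
b_hat = 0.42      # estimated coefficient
se_b = 0.15       # its standard error
df = 97           # n - k - 1 degrees of freedom
alpha = 0.05      # for a 95% confidence interval

# c is the (1 - alpha/2) percentile of the t distribution with df degrees of freedom.
c = stats.t.ppf(1 - alpha / 2, df)

lower, upper = b_hat - c * se_b, b_hat + c * se_b
print(f"95% CI: [{lower:.3f}, {upper:.3f}]")
```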
Testing Other Hypotheses
• A more general form of the t statistic recognizes that we may want to test something like H0: βj = aj.
• In this case, the appropriate t statistic is t = (β̂j − aj) / se(β̂j), where aj = 0 for the standard test.
Stata and p-values, t tests, etc.
• Most computer packages will compute the p-value for you, assuming
a two-sided test.
• If you really want a one-sided alternative, just divide the two-sided p-value by 2 (provided the estimate has the sign specified by the alternative).
• Stata provides the t statistic, p-value, and 95% confidence interval for
for you, in columns labeled “t”, “P > |t|” and “[95% Conf. Interval]”,
respectively.
Regression with Stata
The F-stat
• In a regression model with k independent variables, consider:
H0: β1 = β2 = … = βk = 0
H1: H0 is not true (at least one of the βj's is different from zero)
• How to proceed?
– A t statistic tests a hypothesis that puts no restrictions on the other parameters.
– Further, we would have three t statistics (one per regressor). What would constitute a rejection at the 5% level?
The F-stat
• Run the restricted and the unrestricted models and find the SSR of each.
• Ask how much the SSR increases when we drop q variables from the model (the restricted model).
• Ask whether the increase in SSR is large enough relative to the SSR in the model with all of the variables (the unrestricted model).
• The F statistic measures the relative increase in the SSR when moving from the unrestricted to the restricted model:
F = [(SSRr − SSRur) / q] / [SSRur / (n − k − 1)]
F-stat from R-squared
• Sometimes it is more convenient to compute the F statistic using the R-squareds rather than the SSRs.
• Use SSR = SST(1 − R²) for each model, as worked out below.
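A worked substitution showing how the R-squared form follows from the SSR form of the F statistic (standard algebra; q restrictions, n − k − 1 df as above):

```latex
% Substitute SSR_r = SST(1 - R^2_r) and SSR_ur = SST(1 - R^2_ur) into the
% SSR form of the F statistic; the common factor SST cancels.
\begin{align*}
  F &= \frac{(\text{SSR}_r - \text{SSR}_{ur})/q}{\text{SSR}_{ur}/(n - k - 1)}
     = \frac{\bigl[(1 - R^2_r) - (1 - R^2_{ur})\bigr]/q}{(1 - R^2_{ur})/(n - k - 1)} \\
    &= \frac{(R^2_{ur} - R^2_r)/q}{(1 - R^2_{ur})/(n - k - 1)} .
\end{align*}
```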
F-stat (cont’d)
• Just as with t statistics, p-values can be calculated by looking up the percentile in the appropriate F distribution.
• If only one exclusion is being tested, then F = t², and the p-values will be the same.
• If we fail to reject H0, this suggests that we should look for other variables to explain y.
The F-statistic for Overall Significance of a Regression
• To test whether the regression is significant overall, we use the F statistic for H0: β1 = β2 = … = βk = 0.
• A small R-squared can still produce a highly significant F statistic.
• That is why we must look at the F statistic for joint significance, and not just at the R-squared.