
Uploaded by Jheena Yousafzai


Simple Linear Regression Model: Chapter 2
Multiple Regression Analysis: Chapters 3 and 4
Advanced Regression Topics: Chapter 6
Dummy Variables: Chapter 7

Note: Appendices A, B, and C are additional review if needed.

2.1 Definition of the Simple Regression Model
2.2 Deriving the Ordinary Least Squares Estimates
2.3 Properties of OLS on Any Sample of Data
2.4 Units of Measurement and Functional Form
2.5 Expected Values and Variances of the OLS Estimators
2.6 Regression through the Origin

Economics is built upon assumptions:
-assume perfect information
-assume we have a can opener

The Simple Regression Model is based on assumptions:
-simplifying assumptions allow for more analysis
-disproving assumptions leads to more complicated models

Recall the SIMPLE LINEAR REGRESSION MODEL:

y = β₀ + β₁x + u    (2.1)

-also called the two-variable linear regression model or bivariate linear regression model
-y is the DEPENDENT or EXPLAINED variable
-x is the INDEPENDENT or EXPLANATORY variable
-y is a function of x

Recall the SIMPLE LINEAR REGRESSION MODEL:

y = β₀ + β₁x + u    (2.1)

-u is the ERROR TERM or DISTURBANCE variable
-u takes into account all factors other than x that affect y
-u accounts for all unobserved impacts on y

Example of the SIMPLE LINEAR REGRESSION MODEL:

taste = β₀ + β₁cookingtime + u    (ie)

-taste is explained by cooking time
-taste is a function of cooking time
-u accounts for other factors affecting taste (cooking skill, ingredients available, random luck, differing taste buds, etc.)

The SRM shows how y changes:

Δy = β₁Δx if Δu = 0    (2.2)

-ie: if β₁ = 3, a 2 unit change in x would cause a 6 unit change in y (2 x 3 = 6)
-β₁ is the SLOPE PARAMETER
-β₀ is the INTERCEPT PARAMETER or CONSTANT TERM
-not always useful in analysis
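The ceteris paribus thought experiment in (2.2) can be sketched in code; the parameter values below are made up for illustration:

```python
# Sketch of equation (2.2): with u held fixed, a change in x moves y
# by beta1 times that change. Parameter values here are hypothetical.
beta0, beta1 = 1.0, 3.0  # made-up intercept and slope

def y(x, u):
    # The simple regression model (2.1): y = beta0 + beta1*x + u
    return beta0 + beta1 * x + u

u = 0.5                      # any fixed value of the error term
dy = y(4.0, u) - y(2.0, u)   # change x by 2 units, hold u constant
print(dy)                    # beta1 * 2 = 6.0
```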

y = β₀ + β₁x + u    (2.1)

-the linearity of (2.1) implies that x has CONSTANT returns
-the first x has the same impact on y as the 100th x
-to avoid this we can include powers of x or change functional forms

-in order to achieve a ceteris paribus analysis of x's effect on y, we need assumptions about u's relationship with x
-in order to simplify our assumptions, we first assume that the average of u in the population is zero:

E(u) = 0    (2.5)

-this is not a strong assumption, as β₀ can always be modified to make (2.5) true
-ie: if E(u) > 0, simply increase β₀

-we now need to assume that x and u are unrelated
-if x and u are uncorrelated, u may still be correlated to functions such as x²
-we therefore need a stronger assumption:

E(u | x) = E(u) = 0    (2.6)

-ie: the average of u does not depend on x
-the second equality comes from (2.5)
-called the ZERO CONDITIONAL MEAN assumption

2.1 Example

Papermark = β₀ + β₁Paperquality + u    (ie)

-u includes other factors affecting the mark, such as the length of the applied paper, in particular length exceeding 10 pages
-assumption (2.6) requires that a paper's length does not depend on how good it is

Conditional expectations of (2.1) and (2.6) give us:

E(y | x) = β₀ + β₁x    (2.8)

-this is called the POPULATION REGRESSION FUNCTION (PRF)
-a one unit increase in x increases the expected value of y by β₁
-β₀ + β₁x is the systematic (explained) part of y
-u is the unsystematic (unexplained) part of y

In order to estimate β₀ and β₁, we need sample data:
-let {(xi, yi): i = 1, …, n} be a random sample of n observations from the population

yi = β₀ + β₁xi + ui    (2.9)

-ui is the error term of observation i
-y₅ indicates the observation of y from the 5th data point
-this regression plots a best fit line through our data points:

These OLS estimates create a straight line going through the middle of the data points:

Estimates

In order to derive OLS, we first need assumptions. We must first assume that u has zero expected value:

E(u) = 0    (2.10)

We must next assume that the covariance between x and u is zero:

Cov(x, u) = E(xu) = 0    (2.11)

Using (2.1), (2.10) and (2.11) can be rewritten in terms of x and y as:

E(y − β₀ − β₁x) = 0    (2.12)

E[x(y − β₀ − β₁x)] = 0    (2.13)

Estimates

-(2.12) and (2.13) imply restrictions on the joint distribution of x and y in the population
-given SAMPLE data, these equations become:

(1/n) Σ (yi − β̂₀ − β̂₁xi) = 0    (2.14)

(1/n) Σ xi(yi − β̂₀ − β̂₁xi) = 0    (2.15)

-notice that the hats above β₀ and β₁ indicate we are now dealing with estimates
-this is an example of method of moments estimation (see Section C for a discussion)

Estimates

Using summation properties, (2.14) simplifies to:

ȳ = β̂₀ + β̂₁x̄    (2.16)

which rearranges to:

β̂₀ = ȳ − β̂₁x̄    (2.17)

-therefore, given data and an estimate of the slope, the estimated intercept can be determined

Estimates

By cancelling out 1/n and combining (2.17) and (2.15) we get:

Σ xi[yi − (ȳ − β̂₁x̄) − β̂₁xi] = 0

which rearranges to:

Σ xi(yi − ȳ) = β̂₁ Σ xi(xi − x̄)

Estimates

Recall the algebraic properties:

Σ xi(xi − x̄) = Σ (xi − x̄)²

and

Σ xi(yi − ȳ) = Σ (xi − x̄)(yi − ȳ)

Estimates

We can make the simple assumption that x is not constant in the sample:

Σ (xi − x̄)² > 0    (2.18)

-ie: you didn't do a survey where one question is "are you alive?"
-this is essentially the key assumption needed to estimate β̂₁

Estimates

All this gives us the OLS estimate for β₁:

β̂₁ = Σ (xi − x̄)(yi − ȳ) / Σ (xi − x̄)²    (2.19)

-note that assumption (2.18) basically ensures the denominator is not zero
-also note that if x and y are positively (negatively) correlated, β̂₁ will be positive (negative)
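As a sanity check on (2.17) and (2.19), a minimal sketch; the data below are made up for illustration:

```python
# Compute the OLS estimates from equations (2.19) and (2.17)
# on a small, made-up sample.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.0]
n = len(x)

xbar = sum(x) / n
ybar = sum(y) / n

# (2.19): slope = sum of cross-deviations over sum of squared x-deviations
b1 = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / \
     sum((xi - xbar) ** 2 for xi in x)

# (2.17): intercept from the sample means and the slope estimate
b0 = ybar - b1 * xbar

print(b0, b1)  # for this sample: b1 = 1.97, b0 = 0.09
```

Since x and y rise together here, β̂₁ comes out positive, as the note above predicts.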

Given the OLS estimates, we can calculate the FITTED value for y when x = xi:

ŷi = β̂₀ + β̂₁xi    (2.20)

-the predicted y's can be greater than, less than or (rarely) equal to the actual y's

2.2 Residuals

The difference between the actual y values and the estimates is the ESTIMATED error, or residual:

ûi = yi − ŷi = yi − β̂₀ − β̂₁xi    (2.21)

-these residuals ARE NOT the same as the actual error terms

2.2 Residuals

The sum of squared residuals can be expressed as:

Σ ûi² = Σ (yi − β̂₀ − β̂₁xi)²    (2.22)

-if β̂₀ and β̂₁ are chosen to minimize (2.22), then (2.14) and (2.15) are our FIRST ORDER CONDITIONS (FOCs) and we are able to derive the same OLS estimates as above, (2.17) and (2.19)
-the term OLS comes from the fact that the sum of the squared residuals is minimized
-Why not minimize the residuals themselves?
-Why not minimize the cube of the residuals?
-not all minimization techniques can be expressed as formulas
-OLS has the advantage of deriving unbiasedness, consistency, and other important statistical properties
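The claim that the closed-form estimates minimize (2.22) can be checked numerically; a sketch on made-up data, verifying that the FOCs hold at the estimates and that perturbing either one raises the sum of squared residuals:

```python
# Verify that the closed-form OLS estimates satisfy the first order
# conditions (2.14)/(2.15) and minimize the SSR (2.22). Data are made up.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.0]
n = len(x)
xbar, ybar = sum(x) / n, sum(y) / n

b1 = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / \
     sum((xi - xbar) ** 2 for xi in x)
b0 = ybar - b1 * xbar

def ssr(a, b):
    # Sum of squared residuals (2.22) for a candidate intercept a, slope b.
    return sum((yi - a - b * xi) ** 2 for xi, yi in zip(x, y))

# FOCs (2.14) and (2.15) hold at the OLS estimates (up to rounding):
foc1 = sum(yi - b0 - b1 * xi for xi, yi in zip(x, y))
foc2 = sum(xi * (yi - b0 - b1 * xi) for xi, yi in zip(x, y))
print(abs(foc1) < 1e-9, abs(foc2) < 1e-9)   # True True

# Moving either estimate away from its OLS value increases the SSR:
print(ssr(b0, b1) < ssr(b0 + 0.1, b1))      # True
print(ssr(b0, b1) < ssr(b0, b1 + 0.1))      # True
```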

The OLS estimates give us the OLS REGRESSION LINE:

ŷ = β̂₀ + β̂₁x    (2.23)

-β̂₀ is the predicted value of y when x = 0
-not always a valid value
-(2.23) is also called the SAMPLE REGRESSION FUNCTION (SRF)
-different data sets will estimate different β̂'s

Estimates

The slope estimate can be written as:

β̂₁ = Δŷ/Δx    (2.24)

or alternatively,

Δŷ = β̂₁Δx    (2.25)

-therefore, given a change in x, we can estimate the change in y

Notes:

1) Since estimating OLS by hand is difficult with more than a few data points, econometrics software (like Shazam) must be used.
2) A successful regression cannot conclude on causality, only comment on positive or negative relations between x and y.
3) We often use the terminology "regress y on x" to estimate y = f(x).

Data Review

-certain algebraic properties are needed in order to build OLS's foundation
-the OLS estimates (β̂₀ and β̂₁) can be used to calculate fitted values (ŷ)
-the residual (û) is the difference between the actual y values and the estimated y values (ŷ):

û = y − ŷ

[Figure: a data point above the fitted line; here ŷ underpredicts y, and û is the vertical gap between y and ŷ]

The sum of all residuals is zero:

Σ ûi = 0    (2.30)

The sample covariance between the regressors and the OLS residuals is zero:

Σ xiûi = 0    (2.31)

-this follows from (2.15), since Σ xiûi is proportional to the required sample covariance

The point (x̄, ȳ) is always on the OLS regression line (from 2.16):

ȳ = β̂₀ + β̂₁x̄    (2.16)

1) From (2.30) we know that the sample average of the fitted y values equals the sample average of the actual y values:

(1/n) Σ ŷi = ȳ

Further Algebraic Gymnastics:

2) (2.30) and (2.31) combine to prove that the covariance between ŷ and û is zero.

Therefore OLS breaks down yi into two uncorrelated parts, a fitted value and a residual:

yi = ŷi + ûi    (2.32)
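These algebraic properties are easy to verify numerically; a sketch on made-up data checking (2.30), (2.31), and the equality of the mean of the fitted values with ȳ:

```python
# Check the algebraic properties of OLS: residuals sum to zero (2.30),
# are orthogonal to the regressor (2.31), and fitted values have the
# same mean as y. Data are made up for illustration.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.0]
n = len(x)
xbar, ybar = sum(x) / n, sum(y) / n

b1 = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / \
     sum((xi - xbar) ** 2 for xi in x)
b0 = ybar - b1 * xbar

yhat = [b0 + b1 * xi for xi in x]               # fitted values (2.20)
uhat = [yi - fi for yi, fi in zip(y, yhat)]     # residuals (2.21)

print(abs(sum(uhat)) < 1e-9)                                # (2.30) True
print(abs(sum(xi * ui for xi, ui in zip(x, uhat))) < 1e-9)  # (2.31) True
print(abs(sum(yhat) / n - ybar) < 1e-9)                     # mean(yhat) = ybar, True
```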

From the idea of fitted and residual components, we can calculate the TOTAL SUM OF SQUARES (SST), the EXPLAINED SUM OF SQUARES (SSE) and the RESIDUAL SUM OF SQUARES (SSR):

SST = Σ (yi − ȳ)²    (2.33)

SSE = Σ (ŷi − ȳ)²    (2.34)

SSR = Σ (yi − ŷi)² = Σ ûi²    (2.35)

SST measures the sample variation in y.
SSE measures the sample variation in ŷ (the fitted component).
SSR measures the sample variation in û (the residual component).

These relate to each other as follows:

SST = SSE + SSR    (2.36)

The proof of (2.36) is as follows:

Σ (yi − ȳ)² = Σ [(yi − ŷi) + (ŷi − ȳ)]²
            = Σ [ûi + (ŷi − ȳ)]²
            = Σ ûi² + 2Σ ûi(ŷi − ȳ) + Σ (ŷi − ȳ)²
            = SSR + 2Σ ûi(ŷi − ȳ) + SSE

Since we have already shown that the covariance between residuals and fitted values is zero,

2Σ ûi(ŷi − ȳ) = 0    (2.37)

and therefore SST = SSE + SSR.
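The decomposition can also be confirmed numerically; a sketch on made-up data:

```python
# Verify SST = SSE + SSR, equation (2.36), on a small made-up sample.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.0]
n = len(x)
xbar, ybar = sum(x) / n, sum(y) / n

b1 = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / \
     sum((xi - xbar) ** 2 for xi in x)
b0 = ybar - b1 * xbar
yhat = [b0 + b1 * xi for xi in x]

sst = sum((yi - ybar) ** 2 for yi in y)               # (2.33) total variation
sse = sum((fi - ybar) ** 2 for fi in yhat)            # (2.34) explained variation
ssr = sum((yi - fi) ** 2 for yi, fi in zip(y, yhat))  # (2.35) residual variation

print(abs(sst - (sse + ssr)) < 1e-9)  # True
```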

Data Notes

-further discussion of variance and inter-variable covariance is available in section C for individual study
-SST, SSE and SSR have differing interpretations and labels in different econometric software; as such, it is always important to look up the base formula

-Once we've run a regression, the question is begged: "How well does x explain y?"
-We can't answer that yet, but we can ask, "How well does the OLS regression line fit the data?"
-To measure this, we use R², the COEFFICIENT OF DETERMINATION:

R² = SSE/SST = 1 − SSR/SST    (2.38)

-R² is the ratio of the explained variation compared to the total variation
-the fraction of the sample variation in y that is explained by x
-R² always lies between zero and 1
-if R² = 1, all actual points lie on the regression line (usually an error)
-if R² ≈ 0, the regression explains very little; OLS is a poor fit
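A sketch on made-up data computing R² both ways from (2.38); both forms give the same number because SST = SSE + SSR:

```python
# Compute the coefficient of determination (2.38) as SSE/SST and as
# 1 - SSR/SST, and confirm the two forms agree. Data are made up.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.0]
n = len(x)
xbar, ybar = sum(x) / n, sum(y) / n

b1 = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / \
     sum((xi - xbar) ** 2 for xi in x)
b0 = ybar - b1 * xbar
yhat = [b0 + b1 * xi for xi in x]

sst = sum((yi - ybar) ** 2 for yi in y)
sse = sum((fi - ybar) ** 2 for fi in yhat)
ssr = sum((yi - fi) ** 2 for yi, fi in zip(y, yhat))

r2 = sse / sst
print(round(r2, 4))                      # close to 1: a tight linear fit
print(abs(r2 - (1 - ssr / sst)) < 1e-9)  # True
```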

Data Notes

-low R² values are common in the social sciences, especially in cross-sectional analysis
-econometric regressions should not be heavily judged due to a low R²
-for example, if R² = 0.12, that means 12% of the variation is explained, which is better than the 0% before the regression
