Regression analysis

5.1 Correlation 5.2 Simple linear regression 5.3 Multiple regression

All these notions can be extended to the case with multiple predictors...


Veronika Czellar HEC Paris

Statistics

1. Descriptive statistics 2. Foundations of inferential statistics 3. Estimation and confidence intervals 4. Testing statistical hypotheses 5. Regression analysis


Example: We can use two predictors for Intel: S&P500 and inflation.

[Figure: Intel plotted against SP500 and Inflation]

5.3.1 Multiple regression equation

We extend the regression theory to k explanatory variables.

Definition: multiple linear regression equation

$$y_i = \beta_0 + \beta_1 x_{i1} + \beta_2 x_{i2} + \cdots + \beta_k x_{ik} + \varepsilon_i, \qquad i = 1, \ldots, n,$$

where $x_{i1}, \ldots, x_{ik}$ are observable variables, $\beta_0, \beta_1, \ldots, \beta_k$ are fixed and unknown parameters, $\varepsilon_1, \ldots, \varepsilon_n \sim$ i.i.d. $N(0, \sigma^2)$, and $\sigma > 0$ is a fixed and unknown parameter.

Definitions: The least squares (LS) estimators of $\beta_0, \ldots, \beta_k$ are

$$(\hat{\beta}_0, \ldots, \hat{\beta}_k) = \arg\min_{\beta_0, \ldots, \beta_k} \sum_{i=1}^{n} \bigl( y_i - (\beta_0 + \beta_1 x_{i1} + \cdots + \beta_k x_{ik}) \bigr)^2.$$

Remark: explicit formulas for these estimators are available ...

... but require a matrix form of the regression model:

$$Y = X\beta + \varepsilon,$$

with

$$Y = \begin{pmatrix} y_1 \\ \vdots \\ y_n \end{pmatrix}, \quad X = \begin{pmatrix} 1 & x_{11} & \cdots & x_{1k} \\ \vdots & \vdots & & \vdots \\ 1 & x_{n1} & \cdots & x_{nk} \end{pmatrix}, \quad \beta = \begin{pmatrix} \beta_0 \\ \vdots \\ \beta_k \end{pmatrix}, \quad \varepsilon = \begin{pmatrix} \varepsilon_1 \\ \vdots \\ \varepsilon_n \end{pmatrix}.$$

The LS estimator of $\beta$ minimizes $(Y - X\beta)^T (Y - X\beta)$ and is

$$\hat{\beta} = (X^T X)^{-1} X^T Y.$$

No need to learn this slide by heart; we will use Excel to estimate the parameters.
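For readers who prefer code to Excel, the matrix formula can be sketched in a few lines of Python with NumPy. This is only an illustration: the function name `least_squares` and the data are made up, and the data are noise-free so that the known coefficients are recovered.

```python
import numpy as np

def least_squares(x, y):
    """Return beta_hat = (X^T X)^{-1} X^T Y for predictors x (n x k) and response y (n,)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Prepend the column of ones that carries the intercept beta_0.
    X = np.column_stack([np.ones(len(y)), x])
    # Solve the normal equations (X^T X) beta = X^T Y instead of forming the inverse.
    return np.linalg.solve(X.T @ X, X.T @ y)

# Made-up, noise-free data generated from y = 1 + 2*x1 - 3*x2.
x = np.array([[0.0, 1.0], [1.0, 0.0], [2.0, 1.0], [3.0, 5.0], [4.0, 2.0]])
y = 1.0 + 2.0 * x[:, 0] - 3.0 * x[:, 1]
beta_hat = least_squares(x, y)
print(np.round(beta_hat, 6))
```

Solving the normal equations gives the same estimator as the explicit inverse in the formula and is usually numerically safer.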

Regressing Intel on S&P500 and inflation:

[Figure: Excel regression output]

The R Square in the Excel output is higher than in the case of one predictor (S&P500 only).

Question: What does R Square mean in the multiple regression?

5.3.2 Evaluating a multiple regression equation

Definition: The coefficient of multiple determination $R^2$ is defined by

$$R^2 = \frac{\sum_{i=1}^{n} (\hat{y}_i - \bar{y})^2}{S_{yy}},$$

where $S_{yy} = \sum_{i=1}^{n} (y_i - \bar{y})^2$ and $\hat{y}_i$ are the fitted values

$$\hat{y}_i = \hat{\beta}_0 + \hat{\beta}_1 x_{i1} + \cdots + \hat{\beta}_k x_{ik}.$$

Proposition:

$$R^2 = 1 - \frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{S_{yy}}.$$
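The definition and the proposition can be checked numerically. The sketch below uses simulated data (not the Intel data) and a hypothetical helper `r_squared_both_ways`. The two expressions agree because the model is fitted by least squares with an intercept.

```python
import numpy as np

def r_squared_both_ways(X, y):
    """Fit by least squares and return (explained SS / Syy, 1 - residual SS / Syy)."""
    beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
    y_hat = X @ beta_hat                      # fitted values
    y_bar = y.mean()
    s_yy = np.sum((y - y_bar) ** 2)
    explained = np.sum((y_hat - y_bar) ** 2) / s_yy
    one_minus_residual = 1.0 - np.sum((y - y_hat) ** 2) / s_yy
    return explained, one_minus_residual

# Simulated data with two predictors; all numbers are made up.
rng = np.random.default_rng(0)
n = 30
x = rng.normal(size=(n, 2))
y = 0.5 + 1.5 * x[:, 0] - 0.7 * x[:, 1] + rng.normal(scale=0.5, size=n)
X = np.column_stack([np.ones(n), x])          # intercept column first
r2_a, r2_b = r_squared_both_ways(X, y)
print(r2_a, r2_b)
```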

Properties of the coefficient of multiple determination

1. It can range from 0 to 1. A value near 0 indicates little linear association between the set of independent variables and the dependent variable. A value near 1 means a strong association.
2. $R^2$ cannot go down when an extra predictor is added to the model and it will generally increase.
3. $R^2$ can almost always be made very close to 1 by using a model with k quite close to n, even if many of the predictors would contribute only marginally to variation in y.

Definition: The adjusted $R^2$ is defined by

$$\text{Adjusted } R^2 = 1 - \frac{n-1}{n-k-1} \cdot \frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{S_{yy}}.$$

Properties of the adjusted $R^2$

1. Adjusted $R^2$ is smaller than $R^2$.
2. Adjusted $R^2$ penalizes the addition of extraneous predictors to the model.
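Since the ratio of the residual sum of squares to $S_{yy}$ equals $1 - R^2$, the adjusted $R^2$ can be computed from $R^2$, n and k alone. A minimal sketch with made-up numbers (the function name is hypothetical):

```python
def adjusted_r_squared(r2, n, k):
    """Adjusted R^2 = 1 - (n - 1) / (n - k - 1) * (1 - R^2)."""
    if n - k - 1 <= 0:
        raise ValueError("need n > k + 1 observations")
    return 1.0 - (n - 1) / (n - k - 1) * (1.0 - r2)

# Same R^2 = 0.5 with n = 21 observations, but different numbers of predictors.
adj_k2 = adjusted_r_squared(0.5, 21, 2)    # 1 - (20/18) * 0.5
adj_k10 = adjusted_r_squared(0.5, 21, 10)  # 1 - (20/10) * 0.5
print(adj_k2, adj_k10)
```

With the same $R^2$, the model with more predictors gets the lower adjusted value, which is the penalty described above.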

Question: High values of $R^2$ suggest that the model fit is a useful one. But how large should this value be before we draw this conclusion?

5.3.3 Testing the global utility of the multiple regression

$$H_0: \beta_1 = \beta_2 = \cdots = \beta_k = 0$$

$$H_a: \text{at least one among } \beta_1, \ldots, \beta_k \text{ is not zero}$$

Model utility F test:

$$F = \frac{R^2 / k}{(1 - R^2)/(n - k - 1)} \overset{H_0}{\sim} F_{(k,\, n-k-1)}.$$
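The F statistic depends only on $R^2$, n and k, so it is easy to compute by hand or in code. The sketch below uses made-up inputs and a hypothetical function name.

```python
def model_utility_f(r2, n, k):
    """F = (R^2 / k) / ((1 - R^2) / (n - k - 1)), with (k, n - k - 1) degrees of freedom."""
    if not 0.0 <= r2 < 1.0:
        raise ValueError("R^2 must lie in [0, 1)")
    if k < 1 or n - k - 1 <= 0:
        raise ValueError("need k >= 1 and n > k + 1")
    return (r2 / k) / ((1.0 - r2) / (n - k - 1))

# Made-up example: R^2 = 0.5 with n = 21 observations and k = 2 predictors.
f_stat = model_utility_f(0.5, 21, 2)   # (0.5 / 2) / (0.5 / 18)
print(f_stat)
```

The value is then compared with the critical value of the $F_{(k,\, n-k-1)}$ distribution from a table. If SciPy is installed, `scipy.stats.f.sf(f_stat, k, n - k - 1)` should give the corresponding p-value.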

Model utility F test for the Intel example with two predictors:

[Figure: Excel regression output]

Warning: If the F test results in the rejection of $H_0$, it does not mean that all predictors are useful.

5.3.4 Evaluating individual regression coefficients

The t tests can be extended to the multivariate case. For any given $j \in \{0, \ldots, k\}$, we can test

$$H_0: \beta_j = 0, \qquad H_a: \beta_j \neq 0$$

using a t test:

$$T_{\beta_j} = \frac{\hat{\beta}_j}{SE(\hat{\beta}_j)} \overset{H_0}{\sim} t_{n-k-1},$$

where

$SE(\hat{\beta}_j)$ is the standard error of the coefficient $j$ (and has the matrix form $\sqrt{\hat{\sigma}^2 \,[(X^T X)^{-1}]_{jj}}$), and

$$\hat{\sigma} = \sqrt{\frac{1}{n-k-1} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2}$$

is called the multiple standard error of estimate.

Example: Do an individual test of each independent variable for the Intel regression with two predictors. Which variable would you consider eliminating? Use the 0.05 significance level.
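The standard errors and t statistics follow directly from the matrix form. The sketch below runs on simulated data (not the Intel data), in which the response depends on the first predictor only. The function name is an assumption for illustration.

```python
import numpy as np

def coefficient_t_stats(X, y):
    """Return (beta_hat, standard errors, t statistics) for a design matrix X
    whose first column is the column of ones."""
    n, p = X.shape                                   # p = k + 1 columns
    xtx_inv = np.linalg.inv(X.T @ X)
    beta_hat = xtx_inv @ X.T @ y
    residuals = y - X @ beta_hat
    sigma2_hat = residuals @ residuals / (n - p)     # n - k - 1 degrees of freedom
    se = np.sqrt(sigma2_hat * np.diag(xtx_inv))      # sqrt(sigma^2 [(X^T X)^-1]_jj)
    return beta_hat, se, beta_hat / se

# Simulated data: y depends on the first predictor but not on the second.
rng = np.random.default_rng(42)
n = 40
x = rng.normal(size=(n, 2))
y = 1.0 + 2.0 * x[:, 0] + rng.normal(scale=0.5, size=n)
X = np.column_stack([np.ones(n), x])
beta_hat, se, t_stats = coefficient_t_stats(X, y)
print(np.round(t_stats, 2))
```

Each t statistic is compared with the critical value of the $t_{n-k-1}$ distribution at the chosen significance level.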


Remark: if there is more than one nonsignificant variable, we should delete only one variable at a time. Each time we delete a variable, we need to rerun the regression equation and check the remaining variables. This method is called the backward stepwise regression method.
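The delete-one-and-refit loop can be sketched as follows. This is a simplified illustration, not the course's procedure. It drops the predictor with the smallest |t| and uses a fixed cutoff `t_crit = 2.0` as a rough stand-in for the 0.05 two-sided critical value (an assumption; the exact value depends on n - k - 1). The data are simulated.

```python
import numpy as np

def t_statistics(X, y):
    """t statistics of all coefficients for a design matrix X (ones column first)."""
    n, p = X.shape
    xtx_inv = np.linalg.inv(X.T @ X)
    beta_hat = xtx_inv @ X.T @ y
    resid = y - X @ beta_hat
    sigma2_hat = resid @ resid / (n - p)
    return beta_hat / np.sqrt(sigma2_hat * np.diag(xtx_inv))

def backward_elimination(x, y, names, t_crit=2.0):
    """Drop one predictor at a time until every remaining |t| is at least t_crit."""
    keep = list(range(x.shape[1]))
    while keep:
        X = np.column_stack([np.ones(len(y)), x[:, keep]])
        t_abs = np.abs(t_statistics(X, y))[1:]   # skip the intercept
        weakest = int(np.argmin(t_abs))
        if t_abs[weakest] >= t_crit:
            break                                # all remaining predictors pass
        del keep[weakest]                        # remove only one variable, then refit
    return [names[j] for j in keep]

# Simulated data: only x0 actually drives y.
rng = np.random.default_rng(7)
n = 60
x = rng.normal(size=(n, 3))
y = 1.0 + 2.0 * x[:, 0] + rng.normal(scale=0.5, size=n)
names = ["x0", "x1", "x2"]
selected = backward_elimination(x, y, names)
print(selected)
```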

5.3.5 Transformed variables

We can also include transformed variables or mixtures of variables in a multiple regression model.

Example: Global warming is the increase in the average temperature of Earth's near-surface air and oceans since the mid-20th century and its projected continuation. It is well known that climate change is influenced by human CO2 emissions. Global CO2 emissions totalled 31.1 billion tons in 2009, 37 percent above those in 1990. Global data (GlobalAirpollution.txt) for more than 65 countries were released in August 2010 and are available on the CERINA Plan website (and on the course website as well).

Example continued: We would like to investigate the impact of GDP per capita and population growth on the increase of CO2 emissions.

- Year2008: emissions of CO2 in 2008 (in million tons, $y_{i,2008}$)
- Year2009: emissions of CO2 in 2009 (in million tons, $y_{i,2009}$)
- GDP2009realgrowth: GDP real growth rate (in %, $x_{i,1}$)
- PopGrowth2009: population growth rate (in %, $x_{i,2}$)
- SquareGDP2009: squared GDP2009realgrowth ($x_{i,1}^2$)
- SquarePopGrowth2009: squared PopGrowth2009 ($x_{i,2}^2$)

Fit the following model:

$$\frac{y_{i,2009}}{y_{i,2008}} = \beta_0 + \beta_1 x_{i,1} + \beta_2 x_{i,2} + \beta_3 x_{i,1}^2 + \beta_4 x_{i,2}^2 + \varepsilon_i, \qquad i = 1, \ldots, 65.$$
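A model with squared terms is still linear in the parameters, so it is fitted like any other multiple regression once the squared columns are added to the design matrix. The sketch below shows the construction with simulated values. It does not read GlobalAirpollution.txt, and every number in it is made up.

```python
import numpy as np

# Simulated stand-ins for the 65 countries (NOT the real data set).
rng = np.random.default_rng(3)
n = 65
gdp_growth = rng.normal(loc=1.0, scale=3.0, size=n)   # plays the role of x_{i,1}
pop_growth = rng.uniform(0.0, 3.0, size=n)            # plays the role of x_{i,2}
ratio = (1.0 + 0.01 * gdp_growth + 0.005 * pop_growth
         + rng.normal(scale=0.02, size=n))            # plays the role of y_2009 / y_2008

# Design matrix: intercept, the two predictors, and their squares.
X = np.column_stack([
    np.ones(n),
    gdp_growth,
    pop_growth,
    gdp_growth ** 2,
    pop_growth ** 2,
])
beta_hat, *_ = np.linalg.lstsq(X, ratio, rcond=None)
print(np.round(beta_hat, 4))   # estimates of beta_0, ..., beta_4
```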



5.3.6 Dummy variables

We can also include a dummy variable as a predictor, which takes the values 0 or 1 to indicate the absence or presence of some categorical effect.

Example: CEO salaries (see NorthwestCEOsalaries.txt on course website)
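In the fitted equation, the coefficient of a dummy variable is a shift of the intercept for the group coded 1. The sketch below uses invented, noise-free numbers (not the NorthwestCEOsalaries.txt data) and hypothetical variable names.

```python
import numpy as np

# Invented, noise-free data: salary = 10 + 2 * sales + 3 * dummy.
sales = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
dummy = np.array([0.0, 1.0, 0.0, 1.0, 0.0, 1.0])   # 1 = category present, 0 = absent
salary = 10.0 + 2.0 * sales + 3.0 * dummy

X = np.column_stack([np.ones(len(sales)), sales, dummy])
b0, b1, b2 = np.linalg.lstsq(X, salary, rcond=None)[0]

# Group coded 0: salary = b0 + b1 * sales
# Group coded 1: salary = (b0 + b2) + b1 * sales   (same slope, shifted intercept)
print(round(b0, 6), round(b1, 6), round(b2, 6))
```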

5.3.7 Qualitative variables

A categorical (or qualitative) variable is a predictor that takes a finite number d of possible values. Only d − 1 categories are added to the regression model.

Example: prices of LCD televisions (see LCD.txt on course website, and exercise 5.12)
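Only d − 1 categories are added because, with an intercept in the model, d dummy columns would sum to the column of ones and $X^T X$ would not be invertible. The omitted category serves as the baseline. A minimal sketch of the encoding, with hypothetical category labels (the contents of LCD.txt are not assumed):

```python
def dummy_columns(values, baseline=None):
    """Encode a categorical variable with d levels as d - 1 zero/one columns.

    Returns the kept levels and one row of 0/1 indicators per observation.
    """
    levels = sorted(set(values))
    if baseline is None:
        baseline = levels[0]          # first level in sorted order is the reference
    if baseline not in levels:
        raise ValueError("baseline must be one of the observed levels")
    kept = [lev for lev in levels if lev != baseline]
    rows = [[1 if v == lev else 0 for lev in kept] for v in values]
    return kept, rows

# Hypothetical brands "A", "B", "C": d = 3 levels give d - 1 = 2 dummy columns.
brands = ["A", "B", "C", "A", "C"]
kept, rows = dummy_columns(brands)
print(kept)
print(rows)
```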

5.3.8 Interaction variables

In some cases, it can be useful to add interaction terms, which are products of at least two variables.

Example: CEO salaries. The product between the woman dummy and sales is an interaction term.
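An interaction between a dummy and a quantitative predictor lets the slope differ between the two groups. The sketch below uses invented, noise-free numbers (not the course data) to show how the interaction column is built and how the coefficients are read.

```python
import numpy as np

# Invented, noise-free data: salary = 5 + 2*sales + 1*woman + 0.5*(woman*sales).
sales = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0])
woman = np.array([0.0, 1.0, 0.0, 1.0, 0.0, 1.0, 0.0, 1.0])
interaction = woman * sales                  # product of the two variables
salary = 5.0 + 2.0 * sales + 1.0 * woman + 0.5 * interaction

X = np.column_stack([np.ones(len(sales)), sales, woman, interaction])
coef = np.linalg.lstsq(X, salary, rcond=None)[0]

slope_dummy0 = coef[1]             # slope of sales when the dummy is 0
slope_dummy1 = coef[1] + coef[3]   # slope of sales when the dummy is 1
print(np.round(coef, 6), round(slope_dummy0, 6), round(slope_dummy1, 6))
```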

Model assumptions in multiple regression can be verified in the same way as in simple linear regression (see 5.2.8).

However, there is an additional requirement in multiple regression: predictors should not be correlated.

5.3.9 Multicollinearity

Multicollinearity exists when independent variables are correlated. Several clues that indicate problems with multicollinearity:

- An independent variable known to be an important predictor ends up being not significant.
- A regression coefficient that should have a positive sign turns out to be negative, or vice versa.
- When an independent variable is added or removed, there is a drastic change in the values of the remaining coefficients.
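A simple first check, in addition to the clues listed above, is to look at the correlation matrix of the predictors. The sketch below uses simulated predictors in which `x2` is deliberately built as a near copy of `x1`. It is an illustration only, not a complete diagnostic.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200
x1 = rng.normal(size=n)
x2 = x1 + rng.normal(scale=0.05, size=n)   # nearly a copy of x1: strongly correlated
x3 = rng.normal(size=n)                    # generated independently of x1 and x2

predictors = np.column_stack([x1, x2, x3])
corr = np.corrcoef(predictors, rowvar=False)   # 3 x 3 correlation matrix of the columns
print(np.round(corr, 3))
```

An off-diagonal entry close to 1 or −1, such as the one between `x1` and `x2` here, signals that the two predictors carry almost the same information.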

For further details about linear regression, see Kutner, Nachtsheim, Neter and Li (2005), Applied Linear Statistical Models, 5th ed., McGraw-Hill; Fox (2008), Applied Regression Analysis and Generalized Linear Models, 2nd ed., Sage Publications.

Thank you. Merci Danke Grazie Gracias Spasibo Köszönöm
