
Chapter 6

Correlation vs Causality in Linear Regression Analysis


© 2019 McGraw-Hill Education. All rights reserved. Authorized only for instructor use in the classroom. No reproduction or distribution without the prior written consent of McGraw-Hill Education
Learning Objectives

1. Differentiate between correlation and causality, in general and in the regression environment
2. Calculate partial and semi-partial correlation
3. Execute inference for correlational regression analysis
4. Execute passive prediction using regression analysis
5. Execute inference for determining functions
6. Execute active prediction using regression analysis
7. Distinguish the relevance of model fit between active and passive prediction
The Difference Between Correlation and Causality

Yi = fi(X1i, X2i, …, XKi) + Ui


• We define fi(X1i, X2i, …, XKi) as the determining function, since it comprises the part of the outcome that we can explicitly determine
• Ui can only be inferred by computing Yi – fi(X1i, X2i, …, XKi)
• The data-generating process serves as a framework for modeling causality:
1. The reasoning established to measure an average treatment effect using sample means maps easily to this framework
2. It extends easily to modeling causality for multi-level treatments and multiple treatments
The Difference Between Correlation and Causality

• A causal relationship between two variables clearly implies co-movement
• If X causally impacts Y, then when X changes, we expect a change in Y
• However, variables often move together even when there is no causal relationship between them
• For example, consider the heights of two different children, ages 5 and 10. Since both children are growing at these ages, their heights will generally move together. This co-movement is not due to causality – an increase in one child's height does not cause a change in the other's



The Difference Between Correlation and Causality

• The co-movement between two variables in a dataset is measured through the sample covariance or correlation:

Covariance: sCov(X,Y) = Σi (Xi – X̄)(Yi – Ȳ) / (n – 1)

Correlation: sCorr(X,Y) = sCov(X,Y) / (sX × sY)
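
A minimal sketch of these two formulas in Python, using made-up numbers (the values below are illustrative assumptions, not data from the chapter):

```python
import numpy as np

# Hypothetical paired observations (illustrative values, not chapter data).
x = np.array([1.2, 0.8, 1.5, 1.1, 0.9])
y = np.array([210.0, 260.0, 180.0, 220.0, 250.0])

n = len(x)
s_cov = np.sum((x - x.mean()) * (y - y.mean())) / (n - 1)  # sample covariance
s_corr = s_cov / (x.std(ddof=1) * y.std(ddof=1))           # sample correlation

# Cross-check against NumPy's built-ins.
print(s_cov, np.cov(x, y)[0, 1])
print(s_corr, np.corrcoef(x, y)[0, 1])
```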



The Difference Between Correlation and Causality

• When there are more than two variables, e.g., Y, X1, X2, we can
also measure partial correlation between two of the variables
• Partial correlation between two variables is their correlation
after holding one or more other variables fixed
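
One standard way to compute a partial correlation is to residualize both variables on the control variable and correlate the residuals. Below is a sketch under that approach, on simulated stand-in data (the variable names and coefficients are assumptions for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200
x2 = rng.normal(size=n)
x1 = 0.5 * x2 + rng.normal(size=n)
y = 2.0 * x1 + 1.0 * x2 + rng.normal(size=n)

def residuals(v, w):
    """Residuals from an OLS regression of v on w (with an intercept)."""
    W = np.column_stack([np.ones_like(w), w])
    beta = np.linalg.lstsq(W, v, rcond=None)[0]
    return v - W @ beta

# Partial correlation of y and x1, holding x2 fixed.
r_partial = np.corrcoef(residuals(y, x2), residuals(x1, x2))[0, 1]
print(r_partial)
```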



The Difference Between Correlation and Causality

• Causality implies that a change in one variable or variables causes a change in another
• Data analysis attempting to measure causality generally involves an attempt to measure the determining function within the data-generating process
• Correlation implies that variables move together
• Data analysis attempting to measure correlation is not concerned with the data-generating process and determining function; it uses standard statistical formulas (sample correlation, partial correlation) to assess how variables move together
Regression Analysis for Correlation

• The dataset is a cross-section of 230 grocery stores

AvgPrice = Average price at that grocery store
AvgHHSize = Average size of households of customers at that grocery store



Regression Analysis for Correlation

Sales = b + m1AvgPrice + m2AvgHHSize

Solving for b, m1, m2:
Sales = 1591.54 – 181.66 × AvgPrice + 128.09 × AvgHHSize

• This equation provides information about how the variables in the equation are correlated within our sample
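
A sketch of how these coefficients could be solved by least squares in Python; the file name grocery.csv is hypothetical, since the chapter's dataset is not reproduced here:

```python
import numpy as np
import pandas as pd

# Hypothetical file with columns Sales, AvgPrice, AvgHHSize.
df = pd.read_csv("grocery.csv")

# Design matrix with an intercept column; solve the least-squares problem.
X = np.column_stack([np.ones(len(df)), df["AvgPrice"], df["AvgHHSize"]])
b, m1, m2 = np.linalg.lstsq(X, df["Sales"].to_numpy(), rcond=None)[0]

print(b, m1, m2)  # the chapter reports 1591.54, -181.66, 128.09
```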



Different Ways to Measure Correlation Between Two Variables
• Unconditional correlation is the standard measure of correlation between two variables X and Y:

Corr(X,Y) = Cov(X,Y) / (SX × SY)

SX = sample standard deviation of X
SY = sample standard deviation of Y

• Partial correlation between X and Y is a measure of the relationship between these two variables, holding at least one other variable fixed
• Semi-partial correlation between X and Y is a measure of the relationship between these two variables, holding at least one other variable fixed for only X or only Y
Regression Analysis for Correlation

• For the general regression equation Y = b + m1X1 + … + mKXK, the solutions for m1 through mK when solving the sample moment equations are proportional to the partial and semi-partial correlations between Y and the respective Xs
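
A simulation sketch of this proportionality (variable names and coefficients are assumptions): by the Frisch-Waugh-Lovell logic, the multiple-regression slope on X1 equals the semi-partial correlation between Y and the residualized X1, rescaled by the ratio of their standard deviations.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 500
x1, x2 = rng.normal(size=n), rng.normal(size=n)
y = 1.0 + 2.0 * x1 - 1.5 * x2 + rng.normal(size=n)

# Multiple-regression slope on x1.
X = np.column_stack([np.ones(n), x1, x2])
_, m1, _ = np.linalg.lstsq(X, y, rcond=None)[0]

# Residualize x1 on x2 (and an intercept).
Z = np.column_stack([np.ones(n), x2])
x1_tilde = x1 - Z @ np.linalg.lstsq(Z, x1, rcond=None)[0]

# Semi-partial correlation between y and the residualized x1.
sr = np.corrcoef(y, x1_tilde)[0, 1]
print(m1, sr * np.std(y, ddof=1) / np.std(x1_tilde, ddof=1))  # equal up to rounding
```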



Regression and Population Correlation

• Suppose we have the data for the entire population underlying our grocery store sample; then we have:
Sales = B + M1AvgPrice + M2AvgHHSize
• Capital letters are used to indicate that these are the intercept
and slopes for the population, rather than the sample
• Solve for B, M1, and M2 by solving the sample moment
equations using the entire population of data



Regression and Population Correlation

• We do not have data for the entire population, only a sample dataset from that population, whose regression line is:
Sales = b + m1AvgPrice + m2AvgHHSize
• Solve for b, m1, and m2
• The intercept and slope(s) of the regression equation
describing a sample are estimators for the intercept and
slope(s) of the corresponding regression equation describing
the population.



Regression and Population Correlation

• A consistent estimator is an estimator whose realized value gets close to its corresponding population parameter as the sample size gets large
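
A simulation sketch of consistency, assuming a population slope of 2.0 (an illustrative value): the slope estimate from larger and larger random samples settles near the population value.

```python
import numpy as np

rng = np.random.default_rng(2)

def slope_estimate(n):
    """OLS slope from one random sample of size n drawn from the population model."""
    x = rng.normal(size=n)
    y = 5.0 + 2.0 * x + rng.normal(scale=3.0, size=n)
    X = np.column_stack([np.ones(n), x])
    return np.linalg.lstsq(X, y, rcond=None)[0][1]

for n in (10, 100, 10_000):
    print(n, slope_estimate(n))  # estimates settle near the population slope 2.0
```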



Regression Line for Full Population



Regression Lines for Three Samples of Size 10



Regression Lines for Three Samples of Size 30



Confidence Interval and Hypothesis Testing for the Population Parameters

• In order to conduct hypothesis tests or build confidence intervals for the population parameters of a regression equation, we need to know the distribution of the estimators
• Each estimator becomes very close to its corresponding population parameter for a large sample
• For a large sample, these estimators are normally distributed



Confidence Interval and Hypothesis Testing for the Population Parameters

• A large random sample implies that:
b ~ N(B, σb)
m1 ~ N(M1, σm1)
…
mK ~ N(MK, σmK)
• If we write each element in the population as:
Yi = B + M1X1i + … + MKXKi + Ei
where Ei is the residual, then Var(Y|X) is equal to Var(E|X)
• A common assumption is that this variance is constant across all values of X, so
Var(Y|X) = Var(E|X) = Var(E) = σ²
• This constancy of variance is called homoscedasticity
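
A sketch of this inference using statsmodels on simulated stand-in data (the coefficients and noise level are assumptions roughly mimicking the grocery example):

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(3)
n = 230  # matches the chapter's sample of 230 stores
avg_price = rng.uniform(1.0, 3.0, size=n)
avg_hh = rng.uniform(1.5, 4.5, size=n)
sales = 1600 - 180 * avg_price + 130 * avg_hh + rng.normal(scale=50, size=n)

X = sm.add_constant(np.column_stack([avg_price, avg_hh]))
fit = sm.OLS(sales, X).fit()

print(fit.summary())       # t-statistics test H0: Mk = 0 for each coefficient
print(fit.conf_int(0.05))  # 95% confidence intervals for B, M1, M2
```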
Prediction Using Regression

• Sales = 1591.54 – 181.66 × AvgPrice + 128.09 × AvgHHSize
• If Store A has an average price $0.50 higher than Store B, and Store A has an average household size that is 0.40 less than Store B, then:
Predicted difference = –181.66 × 0.50 + 128.09 × (–0.40) = –142.07
• We predict Store A has about 142 fewer sales than Store B
• When using correlational regression analysis to make predictions, we must be considering a population that spans time, and we assume that the population regression equation best describes the future population



Regression and Causation

• The data-generating process of an outcome Y can be written as:
Yi = fi(X1i, X2i, …, XKi) + Ui
• We assume the determining function can be written as:
fi(X1i, X2i, …, XKi) = α + β1X1i + β2X2i + … + βKXKi
• Combining these assumptions, the data-generating process can be written as:
Yi = α + β1X1i + β2X2i + … + βKXKi + Ui
• The error term represents unobserved factors that determine the outcome



Regression and Causation

• Yi = B + M1X1i + … + MKXKi + Ei (correlation model)
• Yi = α + β1X1i + … + βKXKi + Ui (causality model)
• The correlation model's residuals (Ei) have a mean of zero and are uncorrelated with each of the Xs. For this model, we simply plot all the data points in the population and write each observation in terms of the equation that best describes these points
• For the causality model, the data-generating process is the process that actually generates the data we observe, and the determining function need not be the equation that best describes the data
The Difference Between the Correlation Model and the Causality Model: An Example

Consider these data for Y, X, and U as the entire population. These data were generated using the data-generating process Yi = 5 + 3.2Xi + Ui, meaning we have the determining function f(X) = 5 + 3.2X.
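
A simulation sketch of how the regression line and the determining function can differ; here U is deliberately made correlated with X (an assumption for illustration, since the chapter's data table is not reproduced), so the population regression slope departs from the causal 3.2.

```python
import numpy as np

rng = np.random.default_rng(4)
n = 100_000  # large n as a stand-in for "the entire population"
x = rng.normal(size=n)
u = 1.5 * x + rng.normal(size=n)  # U correlated with X (assumed for illustration)
y = 5.0 + 3.2 * x + u             # the determining function plus the error

X = np.column_stack([np.ones(n), x])
b, m = np.linalg.lstsq(X, y, rcond=None)[0]
print(b, m)  # roughly 5.0 and 4.7: the best-fit slope exceeds the causal 3.2
```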


Scatterplot, Regression Line, and Determining Function of X and Y

In this figure, we plot Y and X along with the determining function (blue line) and the population regression equation (red line).



Regression and Causation

• The correlation model describes the data best but need not coincide with the causal mechanism generating the data
• The causality model provides the causal mechanism but need not describe the data best



The Relevance of Model Fit for Passive and Active Prediction

• Total sum of squares (TSS): the sum of the squared differences between each observation of Y and the average value of Y
TSS = Σi (Yi – Ȳ)²
• Sum of squared residuals (SSRes): the sum of the squared residuals
SSRes = Σi (Yi – Ŷi)²
• R-squared: the fraction of the total variance in Y that can be attributed to variation in the Xs
R² = 1 – SSRes/TSS
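
A sketch of these three quantities computed by hand on simulated data (all names and values are illustrative):

```python
import numpy as np

rng = np.random.default_rng(5)
n = 100
x = rng.normal(size=n)
y = 1.0 + 2.0 * x + rng.normal(size=n)

# Fit the regression and form predicted values.
X = np.column_stack([np.ones(n), x])
y_hat = X @ np.linalg.lstsq(X, y, rcond=None)[0]

tss = np.sum((y - y.mean()) ** 2)   # total sum of squares
ss_res = np.sum((y - y_hat) ** 2)   # sum of squared residuals
print(1 - ss_res / tss)             # R-squared
```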
The Relevance of Model Fit for Passive and Active Prediction

• A high R-squared implies a good fit, meaning the points on the regression equation tend to be close to the actual Y values
• R-squared for passive prediction (correlation): finding a high R-squared implies the prediction is close to reality
• R-squared for active prediction (causality): R-squared is not a primary consideration when evaluating predictions

