
Debre Markos University
College of Post Graduate Studies
Department of Economics
MSc in Project Planning and Management

Econometrics for Project Managers
Course Code: PPM 551

By Aynalem Shita (PhD candidate)
March, 2021

CHAPTER ONE: LINEAR REGRESSION ANALYSIS
1.1 Introduction
1.2 Ordinary Least Square Methods of Estimation
1.3 Properties of Least Square
1.4 Assumptions of the Classical Linear Regression Model
1.5 Interval estimation and hypothesis testing
1.6 Violation of classical assumptions

1.1 INTRODUCTION

• The term econometrics originates from two words: economics and metric (measurement).
• Econometrics can be narrowly defined as economic measurement.
• More broadly, econometrics is the science that deals with the measurement of economic relationships.
• Econometrics is a separate discipline from mathematical statistics because of its focus on collecting and analyzing non-experimental economic data.
• It is a combination of economic theory, mathematical economics, and statistics.
• An economic model consists of mathematical equations that describe various relationships:
  y = f(x1, x2, ...)
• Formal economic modeling is the starting point for empirical analysis.
• We then need to turn the economic model into an econometric one by adding a disturbance term:
  yi = β0 + β1xi1 + β2xi2 + ... + εi
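As a minimal sketch of this step, the snippet below simulates data from a linear econometric model; the variable names and parameter values are illustrative assumptions, not taken from the slides.

```python
import numpy as np

rng = np.random.default_rng(0)

# Economic model (deterministic): y = f(x1, x2) = b0 + b1*x1 + b2*x2
# Econometric model (stochastic):  y_i = b0 + b1*x_i1 + b2*x_i2 + e_i
b0, b1, b2 = 10.0, 0.6, -1.5      # illustrative parameter values
n = 100
x1 = rng.uniform(50, 250, n)      # e.g., income
x2 = rng.uniform(0, 10, n)        # e.g., a price index
eps = rng.normal(0, 5, n)         # disturbance: all omitted influences

y = b0 + b1 * x1 + b2 * x2 + eps  # observed data mix signal and noise
```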

What are the uses of Econometrics?
• Testing economic theories: e.g., testing the Keynesian hypothesis that consumption increases with income.
• Estimation of economic relationships: demand and supply equations; production functions.
• Forecasting: using current and past economic data to predict future values of variables such as inflation, GDP, stock prices, etc.
• Evaluating government policies: the employment effects of an increase in the minimum wage; the effects of tuition fees on the sustainability of water schemes.

Methodology of Econometrics
• Broadly speaking, traditional econometric methodology proceeds along the following lines:
1. Statement of theory or hypothesis
2. Specification of the mathematical model of the theory
3. Specification of the econometric model of the theory
4. Obtaining the data
5. Estimation of the parameters of the econometric model
6. Hypothesis testing
7. Forecasting or prediction
8. Using the model for control or policy purposes

Types of Data
• Three types of data may be available for empirical analysis: time series, cross-section, and pooled (i.e., a combination of time series and cross-section) data.
• Time Series Data: a time series is a set of observations on the values that a variable takes at different times. Such data may be collected at regular time intervals, such as daily (e.g., stock prices, weather reports), weekly (e.g., money supply figures), monthly (e.g., the unemployment rate, the Consumer Price Index (CPI)), quarterly (e.g., GDP), or annually (e.g., government budgets).
• Cross-Section Data: data on one or more variables collected at the same point in time, such as the census of population conducted by the CSA every 10 years.
• Pooled/Panel Data: pooled, or combined, data contain elements of both time series and cross-section data. Panel data are the case in which the same cross-sectional unit (say, a family or a firm) is surveyed over time.

Simple Linear Regression
• Regression analysis is concerned with the study of the dependence of one variable, the dependent variable, on one or more other variables, the explanatory variables, with a view to estimating and/or predicting the (population) mean or average value of the former in terms of the known or fixed (in repeated sampling) values of the latter.
• The term 'simple' refers to the fact that we use only two variables (one dependent and one independent variable):
  Y = f(X)
• If the number of independent or explanatory variables is greater than one, we say the regression is 'multiple':
  Y = f(X1, X2, X3, ...)
  Dependent variable = f(explanatory variables)

Different Names of Dependent and Independent Variables
  Dependent variable      Independent variable
  Explained variable      Explanatory variable
  Predicted               Predictor
  Regressand              Regressor
  Response                Stimulus
  Endogenous              Exogenous
  Outcome                 Covariate

Simple Linear Regression
• The major objectives and uses of a regression function are:
1. To estimate the mean value of the dependent variable, given the value of the independent variable(s);
2. To test hypotheses about the sign and magnitude of the relationship between the dependent and independent variable(s);
3. To predict or forecast future value(s) of the dependent variable for use in policy formulation;
4. A combination of any two or more of the above objectives.

Simple Linear Regression
• Look at Table 2.1, which refers to a total population of 60 families and their weekly income (X) and weekly consumption expenditure (Y). The 60 families are divided into 10 income groups.
• There is considerable variation in weekly consumption expenditure within each income group.
• But the general picture is that, despite the variability of weekly consumption expenditure within each income bracket, on average, weekly consumption expenditure increases as income increases.

Simple Linear Regression
• The dark circled points in Figure 2.1 show the conditional mean values of Y against the various X values.
• If we join these conditional mean values, we obtain what is known as the population regression line (PRL), or more generally, the population regression curve.
• More simply, it is the regression of Y on X. The adjective "population" comes from the fact that we are dealing in this example with the entire population of 60 families.
• Of course, in reality a population may have many families.

The Concept of Population Regression Function (PRF)
• From the preceding discussion and Figure 2.1, it is clear that each conditional mean E(Y | Xi) is a function of Xi. Symbolically,
  E(Y | Xi) = f(Xi)   (2.2.1)
• Equation (2.2.1) is known as the conditional expectation function (CEF), the population regression function (PRF), or the population regression (PR) for short.
• The functional form of the PRF is an empirical question. For example, we may assume that the PRF E(Y | Xi) is a linear function of Xi, say, of the type
  E(Y | Xi) = β1 + β2Xi   (2.2.2)

The Meaning of the Term Linear
• Linearity in the Variables
• The first meaning of linearity is that the conditional expectation of Y is a linear function of Xi; the regression curve in this case is a straight line. By this definition,
  E(Y | Xi) = β1 + β2Xi²
is not a linear function, because the variable X enters with a power of two.
• Linearity in the Parameters
• The second interpretation of linearity is that the conditional expectation of Y, E(Y | Xi), is a linear function of the parameters, the β's; it may or may not be linear in the variable X. By this definition,
  E(Y | Xi) = β1 + β2Xi²
is a linear (in the parameters) regression model.
• Now consider the model
  E(Y | Xi) = β1 + β2²Xi
This model is an example of a nonlinear (in the parameters) regression model.
• From now on the term "linear" regression will always mean a regression that is linear in the parameters, the β's (that is, the parameters are raised to the first power only).
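Because "linear" means linear in the β's, a model like E(Y | Xi) = β1 + β2Xi² can still be fitted by ordinary least squares after squaring the regressor. A minimal sketch with simulated, purely illustrative data:

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.uniform(1, 10, 50)
y = 2.0 + 0.5 * x**2 + rng.normal(0, 1, 50)   # E(Y|X) = b1 + b2*X^2

# Linear in the parameters: regress Y on X^2 with ordinary least squares.
X = np.column_stack([np.ones_like(x), x**2])  # design matrix [1, X^2]
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
print(beta_hat)                               # approximately [2.0, 0.5]
```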

The Concept of Population Regression Function (PRF)
• The PRF can be written as
  Yi = β1 + β2Xi + ui   (2.4.2)
• Specification (2.4.2) has the advantage that it clearly shows that there are other variables besides income that affect consumption expenditure, and that an individual family's consumption expenditure cannot be fully explained only by the variable(s) included in the regression model.
• The disturbance term ui is a surrogate/substitute for all those variables that are omitted from the model but that collectively affect Y.

The Sample Regression Function (SRF)
• The data of Table 2.1 represent the population, not a sample. In most practical situations what we have is a sample of Y values corresponding to some fixed X's.
• Pretend that the population of Table 2.1 was not known to us and the only information we had was a randomly selected sample of Y values for the fixed X's, as given in Table 2.4. Each Y (given Xi) in Table 2.4 is chosen randomly from the Y's corresponding to the same Xi in the population of Table 2.1.
• Can we estimate the PRF from the sample data?
• We may not be able to estimate the PRF "accurately" because of sampling fluctuations.
• To see this, suppose we draw another random sample from the population of Table 2.1, as presented in Table 2.5. Plotting the data of Tables 2.4 and 2.5, we obtain the scattergram given in Figure 2.4, in which two sample regression lines have been drawn.

The Sample Regression Function (SRF)
• Which of the two regression lines represents the "true" population regression line?
• There is no way we can be absolutely sure that either of the regression lines shown in Figure 2.4 represents the true population regression line (or curve).
• Supposedly they represent the population regression line, but because of sampling fluctuations they are at best an approximation of the true PR.
• In general, we would get N different SRFs for N different samples, and these SRFs are not likely to be the same.

The Sample Regression Function (SRF)
• Now, just as we expressed the PRF in two equivalent forms, (2.2.2) and (2.4.2), we can express the SRF (2.6.1) in its stochastic form as follows:
  Yi = β̂1 + β̂2Xi + ûi   (2.6.2)
• ûi denotes the (sample) residual term. Conceptually ûi is analogous to ui and can be regarded as an estimate of ui.
• It is introduced in the SRF for the same reasons as ui was introduced in the PRF.
• To sum up, our primary objective in regression analysis is to estimate the PRF
  Yi = β1 + β2Xi + ui   (2.4.2)
on the basis of the SRF
  Yi = β̂1 + β̂2Xi + ûi   (2.6.2)
because more often than not our analysis is based upon a single sample from some population.

The Sample Regression Function (SRF)
• The PRF recovered from the SRF is at best an approximate one. This approximation is shown diagrammatically in Figure 2.5. For X = Xi, we have one (sample) observation, Y = Yi. In terms of the SRF, the observed Yi can be expressed as
  Yi = Ŷi + ûi   (2.6.3)
and in terms of the PRF, it can be expressed as
  Yi = E(Y | Xi) + ui   (2.6.4)
• Now obviously in Figure 2.5, Ŷi overestimates the true E(Y | Xi) for the Xi shown therein. By the same token, for any Xi to the left of the point A, the SRF will underestimate the true PRF.
• The critical question now is: granted that the SRF is but an approximation of the PRF, can we devise a rule or a method that will make this approximation as "close" as possible?
• In other words, how should the SRF be constructed so that β̂1 is as "close" as possible to the true β1 and β̂2 is as "close" as possible to the true β2, even though we will never know the true β1 and β2?
• The answer to this question will occupy much of our attention in the next section: the method of OLS.

1.2 Ordinary Least Square Methods of Estimation
• Consider Table 3.1 and conduct two experiments.
• Since the β̂ values in the two experiments are different, we get different values for the estimated residuals.
• Now which set of β̂ values should we choose? Obviously the β̂'s of the first experiment are the "best" values. But we could make endless experiments and then choose the set of β̂ values that gives us the least possible value of Σûi².
• Since time, and patience, are generally in short supply, we need a shortcut to this trial-and-error process. Fortunately, the method of least squares provides us with unique estimates of β1 and β2 that give the smallest possible value of Σûi².

1.2 Ordinary Least Square Methods of Estimation
• In deviation form, the OLS estimators are
  β̂2 = Σxiyi / Σxi²   and   β̂1 = Ȳ − β̂2X̄
where X̄ and Ȳ are the sample means of X and Y, and where we define xi = (Xi − X̄) and yi = (Yi − Ȳ). Henceforth we adopt the convention of letting lowercase letters denote deviations from mean values.
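A sketch of these deviation-form formulas in Python. The income–consumption pairs below are the widely used textbook figures (Gujarati) that reproduce the estimates quoted in the numerical example that follows; they are an assumption here, since the slides' own table is not included.

```python
import numpy as np

# Weekly income X and consumption expenditure Y (assumed textbook data)
X = np.array([80, 100, 120, 140, 160, 180, 200, 220, 240, 260], dtype=float)
Y = np.array([70,  65,  90,  95, 110, 115, 120, 140, 155, 150], dtype=float)

x = X - X.mean()                          # deviations from the mean
y = Y - Y.mean()

beta2 = (x * y).sum() / (x ** 2).sum()    # slope: sum(x*y) / sum(x^2)
beta1 = Y.mean() - beta2 * X.mean()       # intercept: Ybar - beta2 * Xbar

print(f"beta1 = {beta1:.4f}, beta2 = {beta2:.4f}")   # 24.4545, 0.5091
```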

A NUMERICAL EXAMPLE

The estimated regression line therefore is
  Ŷi = 24.4545 + 0.5091Xi   (3.6.2)
• Each point on the regression line gives an estimate of the expected or mean value of Y corresponding to the chosen X value; that is, Ŷi is an estimate of E(Y | Xi).
• The value of β̂2 = 0.5091, which measures the slope of the line, shows that, within the sample range of X between $80 and $260 per week, as X increases, say, by $1, the estimated increase in the mean or average weekly consumption expenditure amounts to about 51 cents.
• The value of β̂1 = 24.4545, which is the intercept of the line, indicates the average level of weekly consumption expenditure when weekly income is zero.

1.3 Properties of Least Square: The Gauss–Markov Theorem
• To understand this theorem, we need to consider the best linear unbiasedness property of an estimator.
• An estimator, say the OLS estimator β̂2, is said to be a best linear unbiased estimator (BLUE) of β2 if the following hold:
1. It is linear, that is, a linear function of a random variable, such as the dependent variable Y in the regression model.
2. It is unbiased, that is, its average or expected value, E(β̂2), is equal to the true value, β2.
3. It has minimum variance in the class of all such linear unbiased estimators; an unbiased estimator with the least variance is known as an efficient estimator.
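A small Monte Carlo sketch of the unbiasedness property: drawing many samples from a known PRF and checking that the OLS slope averages out to the true β2. The parameter values are illustrative.

```python
import numpy as np

rng = np.random.default_rng(42)
beta1_true, beta2_true, sigma = 24.0, 0.5, 10.0   # illustrative true values
X = np.linspace(80, 260, 10)                      # fixed in repeated sampling

slopes = []
for _ in range(10_000):                           # repeated sampling
    u = rng.normal(0, sigma, X.size)              # fresh disturbances each draw
    Y = beta1_true + beta2_true * X + u
    x, y = X - X.mean(), Y - Y.mean()
    slopes.append((x * y).sum() / (x ** 2).sum()) # OLS slope for this sample

print(np.mean(slopes))   # close to 0.5: E(beta2_hat) = beta2
```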

1.4 Assumptions of the Classical Linear Regression Model
• The classical linear regression model consists of a set of assumptions about how a data set will be produced by an underlying "data-generating process."
• Look at the PRF: Yi = β1 + β2Xi + ui. It shows that Yi depends on both Xi and ui. The assumptions made about the Xi variable(s) and the error term are extremely critical to the valid interpretation of the regression estimates.
• The Gaussian, standard, or classical linear regression model (CLRM) makes 10 assumptions.
• The first assumption is that the regression model is linear in the parameters. Keep in mind that the regressand Y and the regressor X themselves may be nonlinear.

1.4 Assumptions of the Classical Linear Regression Model

• The X values are fixed in repeated sampling. Look at Table 2.1. Keeping the value of income X fixed, say, at $80, we draw at random a family and observe its weekly family consumption expenditure Y as, say, $60. Still keeping X at $80, we draw at random another family and observe its Y value as $75. In each of these drawings (i.e., repeated sampling), the value of X is fixed at $80. We can repeat this process for all the X values shown in Table 2.1.
• This means that our regression analysis is conditional regression analysis, that is, conditional on the given values of the regressor(s) X.

1.4 Assumptions of the Classical Linear Regression Model
• Homoscedasticity, or equal variance of ui:
  var(ui | Xi) = σ²   (3.2.2)
• Technically, (3.2.2) represents the assumption of homoscedasticity, or equal (homo) spread (scedasticity), or equal variance. Stated differently, (3.2.2) means that the Y populations corresponding to various X values have the same variance.
• Put simply, the variation around the regression line (which is the line of average relationship between Y and X) is the same across the X values; it neither increases nor decreases as X varies.
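To make the assumption concrete, the sketch below generates one homoscedastic and one heteroscedastic sample from the same regression line; only the error spread differs. All values are illustrative.

```python
import numpy as np

rng = np.random.default_rng(7)
X = np.linspace(80, 260, 200)

u_homo = rng.normal(0, 10, X.size)    # var(u|X) = 100 for every X
u_hetero = rng.normal(0, 0.1 * X)     # spread grows with X: violates (3.2.2)

Y_homo = 24 + 0.5 * X + u_homo
Y_hetero = 24 + 0.5 * X + u_hetero

# Compare the error spread in the low-X and high-X halves of each sample:
for name, u in [("homoscedastic", u_homo), ("heteroscedastic", u_hetero)]:
    print(name, u[:100].std().round(1), u[100:].std().round(1))
```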

1.4 Assumptions of the Classical Linear Regression Model
• No autocorrelation between the disturbances: the disturbances ui and uj are uncorrelated, i.e., there is no serial correlation. This means that, given Xi, the deviations of any two Y values from their mean value do not exhibit patterns.
• In Figure 3.6a, the u's are positively correlated: a positive u is followed by a positive u, or a negative u by a negative u. In Figure 3.6b, the u's are negatively correlated: a positive u is followed by a negative u, and vice versa. If the disturbances follow such systematic patterns, as in Figures 3.6a and b, there is auto- or serial correlation. Figure 3.6c shows that there is no systematic pattern to the u's, indicating zero correlation.
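A sketch of the patterns just described: serially independent disturbances versus AR(1) disturbances whose sign tends to persist. The parameters are illustrative.

```python
import numpy as np

rng = np.random.default_rng(3)
n, rho = 200, 0.8                          # rho = 0.8: positive autocorrelation

e = rng.normal(0, 1, n)
u_ar1 = np.zeros(n)
for t in range(1, n):
    u_ar1[t] = rho * u_ar1[t - 1] + e[t]   # u_t depends on u_{t-1}

u_iid = rng.normal(0, 1, n)                # no serial correlation

# Lag-1 sample correlation: near 0.8 for AR(1), near 0 for iid
print(np.corrcoef(u_ar1[:-1], u_ar1[1:])[0, 1])
print(np.corrcoef(u_iid[:-1], u_iid[1:])[0, 1])
```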

1.4 Assumptions of the Classical Linear Regression Model
• The number of observations must be greater than the number of parameters to be estimated. In the hypothetical example of Table 3.1, imagine that we had only the first pair of observations on Y and X (4 and 1). From this single observation there is no way to estimate the two unknowns, β1 and β2. We need at least two pairs of observations to estimate the two unknowns.
• The disturbance u and the explanatory variable X are uncorrelated. The PRF assumes that X and u (which may represent the influence of all the omitted variables) have separate (and additive) influences on Y. But if X and u are correlated, it is not possible to assess their individual effects on Y. Thus, if X and u are positively correlated, X increases when u increases and decreases when u decreases. Similarly, if X and u are negatively correlated, X increases when u decreases and decreases when u increases. In either case, it is difficult to isolate the influence of X and of u on Y.

1.4 Assumptions of the Classical Linear Regression Model
• Variability in X values: the X values in a given sample must not all be the same. This assumption too is not as innocuous as it looks. Look at Eq. (3.1.6): if all the X values are identical, then Xi = X̄ and the denominator of that equation will be zero, making it impossible to estimate β2 and therefore β1. Looking at our family consumption expenditure example in Chapter 2, if there is very little variation in family income, we will not be able to explain much of the variation in consumption expenditure.
• The regression model must be correctly specified. An econometric investigation begins with the specification of the econometric model underlying the phenomenon of interest. Some important questions that arise in the specification of the model include the following:
(1) What variables should be included in the model?
(2) What is the functional form of the model? Is it linear in the parameters, the variables, or both?

1.4 Assumptions of the Classical Linear Regression Model
• Suppose we choose the following two models to depict the underlying relationship between the rate of change of money wages and the unemployment rate:
  Yi = α1 + α2Xi + ui   (3.2.7)
  Yi = β1 + β2(1/Xi) + ui   (3.2.8)
where Yi = the rate of change of money wages and Xi = the unemployment rate.
• The regression model (3.2.7) is linear both in the parameters and the variables, whereas (3.2.8) is linear in the parameters (hence a linear regression model by our definition) but nonlinear in the variable X. Now consider Figure 3.7.
• If model (3.2.8) is the "correct" or "true" model, fitting model (3.2.7) to the scatterpoints shown in Figure 3.7 will give us wrong predictions.
• Unfortunately, in practice one rarely knows the correct variables to include in the model, the correct functional form of the model, or the correct probabilistic assumptions about the variables entering the model, for the theory underlying the particular investigation may not be strong or robust enough to answer all these questions.

1.5 Interval Estimation and Hypothesis Testing
• Look at the estimated MPC in Ŷi = 24.4545 + 0.5091Xi, which is a single (point) estimate of the unknown population MPC β2. How reliable is this estimate? A single estimate is likely to differ from the true value, although in repeated sampling its mean value is expected to be equal to the true value.
• In statistics, the reliability of a point estimator is measured by its standard error. Therefore, we may construct an interval around the point estimator, say within two or three standard errors on either side of it, such that this interval has, say, a 95 percent probability of including the true parameter value.
• Assume that we want to find out how "close" β̂2 is to β2. We try to find two positive numbers δ and α, the latter lying between 0 and 1, such that the probability that the random interval (β̂2 − δ, β̂2 + δ) contains the true β2 is 1 − α. Symbolically,
  Pr(β̂2 − δ ≤ β2 ≤ β̂2 + δ) = 1 − α   (5.2.1)
• Such an interval is known as a confidence interval.

1.5 Interval Estimation and Hypothesis Testing
• 1 − α is known as the confidence coefficient, and α (0 < α < 1) is known as the level of significance.
• The endpoints of the confidence interval are known as the confidence limits (also known as critical values), β̂2 − δ being the lower confidence limit and β̂2 + δ the upper confidence limit.
• If α = 0.05, or 5 percent, (5.2.1) would read: the probability that the (random) interval shown there includes the true β2 is 0.95, or 95 percent. The interval estimator thus gives a range of values within which the true β2 may lie.
• But σ² is rarely known; in practice it is determined by the unbiased estimator σ̂². If we replace σ by σ̂, (5.3.1) may be written as
  t = (β̂2 − β2) / se(β̂2)   (5.3.2)
where se(β̂2) now refers to the estimated standard error.
• Therefore, we can use the t distribution to establish a confidence interval for β2 as follows:
  Pr(−tα/2 ≤ t ≤ tα/2) = 1 − α   (5.3.3)
where tα/2 is the value of the t variable obtained from the t distribution for the α/2 level of significance and n − 2 df; it is often called the critical t value at the α/2 level of significance.
• The confidence interval for β2 can then be written as
  Pr[β̂2 − tα/2 se(β̂2) ≤ β2 ≤ β̂2 + tα/2 se(β̂2)] = 1 − α   (5.3.5)
or, more compactly, the 100(1 − α)% confidence interval for β2:
  β̂2 ± tα/2 se(β̂2)   (5.3.6)
• An analogous interval holds for the intercept:
  Pr[β̂1 − tα/2 se(β̂1) ≤ β1 ≤ β̂1 + tα/2 se(β̂1)] = 1 − α   (5.3.7)
or, the 100(1 − α)% confidence interval for β1:
  β̂1 ± tα/2 se(β̂1)   (5.3.8)

1.5 Interval Estimation and Hypothesis Testing
• Notice an important feature of the confidence intervals given in (5.3.6) and (5.3.8): in both cases the width of the confidence interval is proportional to the standard error of the estimator.
• That is, the larger the standard error, the larger the width of the confidence interval. Put differently, the larger the standard error of the estimator, the greater the uncertainty in estimating the true value of the unknown parameter.
• We found that β̂2 = 0.5091, se(β̂2) = 0.0357, and df = 8. If we assume α = 5%, that is, a 95% confidence coefficient, then the t table shows that for 8 df the critical tα/2 = t0.025 = 2.306. Substituting these values in (5.3.5), the 95% confidence interval for β2 is
  0.4268 ≤ β2 ≤ 0.5914   (5.3.9)
• Or, using (5.3.6), it is 0.5091 ± 2.306(0.0357), that is,
  0.5091 ± 0.0823   (5.3.10)
• The interpretation of this confidence interval is: given the confidence coefficient of 95%, in 95 out of 100 cases intervals like (0.4268, 0.5914) will contain the true β2.
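A sketch reproducing interval (5.3.9) with scipy, using the estimates quoted above:

```python
from scipy import stats

beta2_hat, se_beta2, df = 0.5091, 0.0357, 8
t_crit = stats.t.ppf(0.975, df)       # two-tailed 5%: t_{0.025, 8} = 2.306

lower = beta2_hat - t_crit * se_beta2
upper = beta2_hat + t_crit * se_beta2
print(f"t_crit = {t_crit:.3f}, CI = ({lower:.4f}, {upper:.4f})")
# prints (0.4268, 0.5914), matching (5.3.9)
```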

1.5 Interval Estimation and Hypothesis Testing
• Confidence Interval for β1
• Following (5.3.7), we can verify that the 95% confidence interval for β1 of the consumption–income example is
  9.6643 ≤ β1 ≤ 39.2448   (5.3.11)
• Or, using (5.3.8), it is 24.4545 ± 2.306(6.4138), that is,
  24.4545 ± 14.7902   (5.3.12)
• In the long run, in 95 out of 100 cases intervals like (5.3.11) will contain the true β1; again, the probability that this particular fixed interval includes the true β1 is either 1 or 0.

HYPOTHESIS TESTING: THE CONFIDENCE-INTERVAL APPROACH
• Two-Sided or Two-Tail Test
• To illustrate the confidence-interval approach, look at the consumption–income example: the estimated MPC, β̂2, is 0.5091. Suppose we postulate that
  H0: β2 = 0.3 and H1: β2 ≠ 0.3
that is, the true MPC is 0.3 under the null hypothesis, but it is less than or greater than 0.3 under the alternative hypothesis. The alternative hypothesis is a two-sided hypothesis. It reflects the fact that we do not have a strong expectation about the direction in which the alternative hypothesis should move from the null hypothesis.
• Is the observed β̂2 compatible with H0? To answer this question, let us refer to the confidence interval (5.3.9). We know that in the long run intervals like (0.4268, 0.5914) will contain the true β2 with 95 percent probability.

HYPOTHESIS TESTING: THE CONFIDENCE-INTERVAL APPROACH
• Consequently, such intervals provide a range, or limits, within which the true β2 may lie with a confidence coefficient of, say, 95%.
• Therefore, if β2 under H0 falls within the 100(1 − α)% confidence interval, we do not reject the null hypothesis; if it lies outside the interval, we may reject it. This range is illustrated schematically in Figure 5.2.
• Decision Rule: Construct a 100(1 − α)% confidence interval for β2. If the β2 under H0 falls within this confidence interval, do not reject H0; but if it falls outside this interval, reject H0.
• Following this rule, H0: β2 = 0.3 clearly lies outside the 95% confidence interval given in (5.3.9). Therefore, we can reject the hypothesis that the true MPC is 0.3, with 95% confidence.

HYPOTHESIS TESTING: THE CONFIDENCE-INTERVAL APPROACH
• In statistics, when we reject the null hypothesis, we say that our finding is statistically significant.
• On the other hand, when we do not reject the null hypothesis, we say that our finding is not statistically significant.
• One-Sided or One-Tail Test
• Sometimes we have a strong theoretical expectation that the alternative hypothesis is one-sided or unidirectional rather than two-sided. Thus, for our consumption–income example, one could postulate that
  H0: β2 ≤ 0.3 and H1: β2 > 0.3
• Perhaps economic theory or prior empirical work suggests that the marginal propensity to consume is greater than 0.3. Although the procedure to test this hypothesis can be easily derived from (5.3.5), the actual mechanics are better explained in terms of the test-of-significance approach discussed next.

HYPOTHESIS TESTING: THE TEST-OF-SIGNIFICANCE APPROACH
• Testing the Significance of Regression Coefficients: The t Test
• An alternative to the confidence-interval method is the test-of-significance approach. It is a procedure by which sample results are used to verify the truth or falsity of a null hypothesis. The decision to accept or reject H0 is made on the basis of the value of the test statistic obtained from the data at hand. Recall the t statistic
  t = (β̂2 − β2*) / se(β̂2)
where β2* is the value of β2 under H0.
• A statistic is said to be statistically significant if the value of the test statistic lies in the critical region. By the same token, a test is said to be statistically insignificant if the value of the test statistic lies in the acceptance region.

HYPOTHESIS TESTING: THE TEST-OF-SIGNIFICANCE APPROACH
• The "Zero" Null Hypothesis
• A null hypothesis that is commonly tested in empirical work is H0: β2 = 0, that is, the slope coefficient is zero.
• This null hypothesis can be easily tested by the confidence-interval or the t-test approach discussed in the preceding sections.
• For large samples, if α, the level of significance, is set at 0.05, then the null hypothesis β2 = 0 can be rejected if the t value exceeds 1.96 in absolute value.
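A sketch of the t test for both nulls discussed here, H0: β2 = 0.3 and the "zero" null H0: β2 = 0, using the estimates from the consumption–income example:

```python
from scipy import stats

beta2_hat, se_beta2, df = 0.5091, 0.0357, 8
t_crit = stats.t.ppf(0.975, df)                    # 2.306 for alpha = 0.05

for beta2_null in (0.3, 0.0):
    t_stat = (beta2_hat - beta2_null) / se_beta2   # t = (b2_hat - b2*) / se
    decision = "reject H0" if abs(t_stat) > t_crit else "do not reject H0"
    print(f"H0: beta2 = {beta2_null}: t = {t_stat:.2f} -> {decision}")
# t = 5.86 for H0: beta2 = 0.3, and t = 14.26 for H0: beta2 = 0
```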

HYPOTHESIS TESTING: THE TEST-OF-SIGNIFICANCE APPROACH
• The Exact Level of Significance: The p Value
• Once a test statistic is obtained in a given example, why not simply go to the appropriate statistical table and find out the actual probability of obtaining a value of the test statistic as large as or larger than that obtained in the example? This probability is called the p value (i.e., probability value).
• To illustrate, given H0 that the true MPC is 0.3, we obtained a t value of 5.86 in (5.7.4).
• What is the p value of obtaining a t value as large as or larger than 5.86?
• Looking up the t table, for 8 df the probability of obtaining such a t value must be much smaller than 0.001 (one-tail) or 0.002 (two-tail). This observed, or exact, level of significance of the t statistic is much smaller than the conventionally, and arbitrarily, fixed levels of significance such as 1, 5, or 10 percent.
• If we were to use the p value just computed and reject the null hypothesis, the probability of our committing an error would be only about 0.02 percent, that is, only about 2 in 10,000!
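The exact p value can be computed directly rather than bracketed from the table; a sketch:

```python
from scipy import stats

t_stat, df = 5.86, 8
p_one_tail = stats.t.sf(t_stat, df)   # P(T >= 5.86) under H0
p_two_tail = 2 * p_one_tail

print(f"one-tail p = {p_one_tail:.6f}, two-tail p = {p_two_tail:.6f}")
# roughly 0.0002 and 0.0004: about 2 in 10,000, as stated above
```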

R2 AND THE ADJUSTED R2
• We now consider the goodness of fit of the fitted regression line to a set of data; that is, we shall find out how "well" the sample regression line fits the data. The coefficient of determination r2 (two-variable case) or R2 (multiple regression) is a summary measure that tells how well the sample regression line fits the data.
• For example, a coefficient of determination of 0.6 shows that 60% of the variation in the dependent variable is explained by the regression model. Generally, a higher coefficient indicates a better fit of the model.
  R2 = ESS/TSS = 1 − (RSS/TSS)   (7.8.1)
• An important property of R2 is that as the number of regressors increases, R2 almost invariably increases and never decreases. Stated differently, an additional X variable will not decrease R2.
• To compare two R2 terms, one must take into account the number of X variables present in the model. This can be done readily if we consider an alternative coefficient of determination, which is as follows:
  R̄2 = 1 − (RSS/(n − k)) / (TSS/(n − 1))   (7.8.2)
where k = the number of parameters in the model including the intercept term. (In the three-variable regression, k = 3.) The R2 thus defined is known as the adjusted R2, denoted by R̄2.

REPORTING THE RESULTS OF REGRESSION ANALYSIS
• Employing the consumption–income example as an illustration:
  Ŷi = 24.4545 + 0.5091Xi
  se = (6.4138) (0.0357)   r2 = 0.9621   (5.11.1)
  t  = (3.8128) (14.2605)
  p  = (0.002571) (0.000000289)
• Under the null hypothesis that the true population intercept value is zero, the p value is only about 0.0026. We can therefore reject this null hypothesis and conclude that the true population intercept is different from zero.
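A sketch that recomputes the reported summary from raw data, again using the assumed textbook income–consumption figures; small differences from (5.11.1) in the second t and p values reflect rounding of the published standard errors.

```python
import numpy as np
from scipy import stats

X = np.array([80, 100, 120, 140, 160, 180, 200, 220, 240, 260], dtype=float)
Y = np.array([70,  65,  90,  95, 110, 115, 120, 140, 155, 150], dtype=float)
n, k = Y.size, 2                                # k parameters incl. intercept

x, y = X - X.mean(), Y - Y.mean()
b2 = (x * y).sum() / (x ** 2).sum()
b1 = Y.mean() - b2 * X.mean()

resid = Y - (b1 + b2 * X)
RSS = (resid ** 2).sum()
TSS = (y ** 2).sum()
r2 = 1 - RSS / TSS                              # (7.8.1)
adj_r2 = 1 - (RSS / (n - k)) / (TSS / (n - 1))  # (7.8.2)

sigma2 = RSS / (n - k)                          # estimated error variance
se_b2 = np.sqrt(sigma2 / (x ** 2).sum())
se_b1 = np.sqrt(sigma2 * (X ** 2).sum() / (n * (x ** 2).sum()))

for name, b, se in [("b1", b1, se_b1), ("b2", b2, se_b2)]:
    t = b / se                                  # test of H0: coefficient = 0
    p = stats.t.sf(abs(t), n - k)               # one-tail p, as in (5.11.1)
    print(f"{name}: coef={b:.4f} se={se:.4f} t={t:.4f} p={p:.6g}")
print(f"r2={r2:.4f} adj_r2={adj_r2:.4f}")       # r2 = 0.9621 as in (5.11.1)
```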

A HYPOTHETICAL EXAMPLE
• MS (mean sum of squares) is obtained by dividing each SS by its df.
• F is computed by dividing the model MS by the residual MS.
• R-squared = ESS/TSS
• Adj R-squared = 1 − (Residual MS / Total MS)
• t = coef. / std. err.
• 95% confidence interval = [β̂1 − tα/2 se(β̂1), β̂1 + tα/2 se(β̂1)] = [β̂1 − 2.306 se(β̂1), β̂1 + 2.306 se(β̂1)]
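Continuing with the same assumed data, a sketch of how these ANOVA-style quantities (SS, MS, F) fit together:

```python
import numpy as np

X = np.array([80, 100, 120, 140, 160, 180, 200, 220, 240, 260], dtype=float)
Y = np.array([70,  65,  90,  95, 110, 115, 120, 140, 155, 150], dtype=float)
n, k = Y.size, 2

x, y = X - X.mean(), Y - Y.mean()
b2 = (x * y).sum() / (x ** 2).sum()

ESS = b2 ** 2 * (x ** 2).sum()     # model (explained) sum of squares
TSS = (y ** 2).sum()               # total sum of squares
RSS = TSS - ESS                    # residual sum of squares

MS_model = ESS / (k - 1)           # each MS = SS / df
MS_resid = RSS / (n - k)
MS_total = TSS / (n - 1)

F = MS_model / MS_resid            # model MS over residual MS
r2 = ESS / TSS
adj_r2 = 1 - MS_resid / MS_total

print(f"F = {F:.2f}, R-squared = {r2:.4f}, Adj R-squared = {adj_r2:.4f}")
```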

