You are on page 1of 21


Gujarati PART SINGLE-EQUATION REGRESSION MODELS 15 1 The Nature of Regression Analysis 2 Two-Variable Regression Analysis: Some Basic Ideas 3 Two-Variable Regression Model: The Problem of Estimation 4 Classical Normal Linear Regression Model (CNLRM) 5 Two-Variable Regression: Interval Estimation and Hypothesis Testing 6 Extensions of the Two-Variable Linear Regression Model 7 Multiple Regression Analysis: The Problem of Estimation 8 Multiple Regression Analysis: The Problem of Inference 9 Dummy Variable Regression Models PART II RELAXING THE ASSUMPTIONS OF THE CLASSICAL MODEL 10 Multicollinearity: What Happens if the Regressors Are Correlated 11 Heteroscedasticity: What Happens if the Error Variance Is Nonconstant? 12 Autocorrelation: What Happens if the Error Terms Are Correlated 13 Econometric Modeling: Model Specification and Diagnostic Testing vi BRIEF CONTENTS PART III TOPICS IN ECONOMETRICS 14 Nonlinear Regression Models 15 Qualitative Response Regression Models

16 Panel Data Regression Models 17 Dynamic Econometric Models: Autoregressive and Distributed-Lag Models PART IV SIMULTANEOUS-EQUATION MODELS 18 Simultaneous-Equation Models 19 The Identification Problem 20 Simultaneous-Equation Methods 21 Time Series Econometrics: Some Basic Concepts 22 Time Series Econometrics: Forecasting

INTRODUCTION I.1 WHAT IS ECONOMETRICS? Literally interpreted, econometrics means economic measurement. Al-though measurement is an important part of econometrics, the scope of econometrics is much broader, as can be seen from the following quotations: Econometrics, the result of a certain outlook on the role of economics, consists of the application of mathematical statistics to economic data to lend empirical sup-port to the models constructed by mathematical economics and to obtain numerical results. . . . econometrics may be dened as the quantitative analysis of actual economic phenomena based on the concurrent development of theory and observation, related by appropriate methods of inference. Econometrics may be dened as the social science in which the tools of economic theory, mathematics, and statistical inference are applied to the analysis of economic phenomena. Econometrics is concerned with the empirical determination of economic laws.

METHODOLOGY OF ECONOMETRICS How do econometricians proceed in their analysis of an economic problem? That is, what is their methodology? Although there are several schools of thought on econometric methodology, we present here the traditional or classical methodology, which still dominates empirical research in economics and other social and behavioral sciences. Broadly speaking, traditional econometric methodology proceeds along the following lines: 1. Statement of theory or hypothesis.

2. Specication of the mathematical model of the theory 3. Specication of the statistical, or econometric, model 4. Obtaining the data 5. Estimation of the parameters of the econometric model 6. Hypothesis testing 7. Forecasting or prediction 8. Using the model for control or policy purposes. FIGURE I.4 Anatomy of econometric modeling. Please see Page 10.

CHAPTER ONE: The Nature of Regression Analysis

1.1 Historical origin of the term regression 1.2 The modern interpretation of regression 1.3 STATISTICAL VERSUS DETERMINISTIC RELATIONSHIPS 1.4 REGRESSION VERSUS CAUSATION 1.5 REGRESSION VERSUS CORRELATION 1.6 TERMINOLOGY AND NOTATION Before we proceed to a formal analysis of regression theory, let us dwell briey on the matter of terminology and notation. In the literature the terms dependent variable and explanatory variable are described variously. A representative list is: Dependent variable Explained variable Predictand Regressand Response Endogenous Outcome Explanatory variable Independent variable Predictor Regressor Stimulus Exogenous Covariate

Controlled variable

Control variable

1.7 THE NATURE AND SOURCES OF DATA FOR ECONOMIC ANALYSIS10 The success of any econometric analysis ultimately depends on the avail-ability of the appropriate data. It is therefore essential that we spend some time discussing the nature, sources, and limitations of the data that one may encounter in empirical analysis. Types of Data Three types of data may be available for empirical analysis: time series, cross-section, and pooled (i.e., combination of time series and cross-section) data. Time Series Data Table I.2: Please see Page 32. Cross-section Data Table 1.1 Pages 27

CHAPTER TWO: TWO-VARIABLE REGRESSION ANALYSIS: SOME BASIC IDEAS 41 2.1 A hypothetical Example 2.2 THE CONCEPT OF POPULATION REGRESSION FUNCTION (PRF) 2.3 THE MEANING OF THE TERM LINEAR Since this text is concerned primarily with linear models, it is essential to know what the term linear really means, for it can be interpreted in two different ways. Linearity in the Variables

Linearity in the Parameters 2.4 STOCHASTIC SPECIFICATION OF PRF 2.5 THE SIGNIFICANCE OF THE STOCHASTIC DISTURBANCE TERM As noted in Section 2.4, the disturbance term ui is a surrogate for all those variables that are omitted from the model but that collectively affect Y. The obvious question is: Why not introduce these variables into the model explicitly? Stated otherwise, why not develop a multiple regression model with as many variables as possible? The reasons are many. 1. Vagueness of theory: The theory, if any, determining the behavior of Y may be, and often is, incomplete. We might know for certain that weekly income X inuences weekly consumption expenditure Y, but we might be ignorant or unsure about the other variables affecting Y. Therefore, ui maybe used as a substitute for all the excluded or omitted variables from the model. 2. Unavailability of data: Even if we know what some of the excluded variables are and therefore consider a multiple regression rather than a simple regression, we may not have quantitative information about these. 3. Core variables vs peripheral variables 4. Intrinsic randomness in human behavior 5. Poor proxy variables 6. Principles of parsimony 7. Wrong functional form 2.7 AN ILLUSTRATIVE EXAMPLE Please see page 51.

EXERCISES Questions 2.1. What is the conditional expectation function or the population regression function? 2.2. What is the difference between the population and sample regression functions? Is this a distinction without difference? 2.3. What is the role of the stochastic error term ui in regression analysis? What is the difference between the stochastic error term and the residual, ui ? 2.4. Why do we need regression analysis? Why not simply use the mean value of the regressand as its best value? 2.5. What do we mean by a linear regression model? 2.6. Determine whether the following models are linear in the parameters, or the variables, or both. Which of these models are linear regression models? Exercise 2.16: Student Exercise Page 57.

CHAPTER THREE: TWO-VARIABLE REGRESSION MODEL 3.1 THE METHOD OF ORDINARY LEAST SQUARES 3.2 THE CLASSICAL LINEAR REGRESSION MODEL: THE ASSUMPTIONS UNDERLYING THE METHOD OF LEAST SQUARES Assumption 1: Linear regression model. The regression model is linear in the parameters, as shown in Yi = 1 + 2Xi + ui This assumption means that the dependent variable can be calculated as a linear function of a set of independent variables, plus a disturbance term. This is shown in the above equation that states that: the regression model is linear in the unknown coefficients 1, 2 so that Yi = 1 + 2Xi + ui , for i = 1,2,3.,n. Assumption 2: X values are xed in repeated sampling. Values taken by the regressor X are considered xed in repeated samples. More technically, X is assumed to be nonstochastic. the manner in which Yi are generated. To see why this requirement is needed, look at the PRF: Yi = 1 + 2 Xi + ui . It shows that Yi depends on both Xi and ui . Therefore, unless we are specic about how Xi and ui are created or generated, there is no way we can make any statistical inference about the Yi and also, as we shall see, about 1 and 2. Thus, the assumptions made about the Xi variable(s) and the error term are extremely critical to the valid interpretation of the regression estimates. Assumption 3 states that the mean value of ui , conditional upon the given Xi , is zero. Geometrically, this assumption can be pictured as in Figure 3.3, which shows a few values of the variable X and the Y populations associated with each of them. As shown, each Y population corresponding to a given X is distributed around its mean value (shown by the circled points on the PRF) with some Y values above the mean and some below it. The distances above and below the mean values are nothing but the ui , and what (3.2.1) requires is that the average or mean value of these deviations corresponding to any given X should be zero. This assumption should not be difcult to comprehend in view of the discussion in Section 2.4 .All that this assumption says is that the factors not explicitly included in the model, and therefore subsumed in ui , do not systematically affect the mean value of Y; so to speak, the positive ui 9For illustration, we are assuming merely that the us are distributed symmetrically as shown in Figure 3.3. But shortly we will assume that the us are distributed normally. Assumption 4: Homoscedasticity or equal variance of ui. Given the value of X, the variance of ui is the same for all observations. That is, the conditional variances of ui are identical. Symbolically, we have var (ui | Xi) = E *ui E(ui | Xi)+2 = E(ui2 | Xi ) because of Assumption 3

= 2 (3.2.2) where var stands for variance. In passing, note that the assumption E(ui | Xi ) = 0 implies that E(Yi | Xi ) = i + 2 Xi . (Why?) Therefore, the two assumptions are equivalent. Assumption 5: No autocorrelation between the disturbances. Given any two X values, Xi and Xj (i = j ), the correlation between any two ui and uj (i = j ) is zero. Symbolically, cov (ui,uj | Xi,Xj) = E ,*ui E (ui)+ | Xi -,*uj E(uj)+ | Xj = E(ui | Xi)(uj | Xj) (why?) = 0 (3.2.5) Assumption 6: Zero covariance between ui and Xi , or E(ui/Xi) = 0. Formally, cov (ui, Xi) = E *ui E(ui)+*Xi E(Xi)+ = E *ui (Xi E(Xi))+ since E(ui) = 0 = E (uiXi) E(Xi)E(ui) since E(Xi) is non-stochastic (3.2.6) = E(uiXi) since E(ui) = 0 = 0 by assumption Assumption 6 states that the disturbance u and explanatory variable X are uncorrelated. The rationale for this assumption is as follows: When we expressed the PRF as in (2.4.2), we assumed that X and u (which may represent the inuence of all the omitted variables) have separate (and additive) inuence on Y. But if X and u are correlated, it is not possible to assess their individual effects on Y. Thus, if X and u are positively correlated, X increases. Assumption 7: The number of observations n must be greater than the number of parameters to be estimated. Alternatively, the number of observations n must be greater than the number of explanatory variables. Assumption 8: Variability in X values. The X values in a given sample must not all be the same. Technically, var (X ) must be a nite positive number.13 Assumption 9: The regression model is correctly specied. Alternatively, there is no specication bias or error in the model used in empirical analysis. Assumption 10: There is no perfect multicollinearity. That is, there are no perfect linear relationships among the explanatory variables.

3.3 PRECISION OR STANDARD ERRORS OF LEAST-SQUARES ESTIMATES Meaning of Degrees of Freedom: Please see page 77 3.4 PROPERTIES OF LEAST-SQUARES ESTIMATORS: THE GAUSSMARKOV THEOREM As noted earlier, given the assumptions of the classical linear regression model, the least-squares estimates possess some ideal or optimum proper-ties. These properties are contained in the well-known GaussMarkov theorem. To understand this theorem, we need to consider the best linear unbiasedness property of an estimator. For this the following conditions must hold: 1. It is linear, that is, a linear function of a random variable, such as the dependent variable Y in the regression model. 2. It is unbiased, that is, its average or expected value, E( 2), is equal to the true value, 2. 3. It has minimum variance in the class of all such linear unbiased estimators; an unbiased estimator with the least variance is known as an efcient estimator. In the regression context it can be proved that the OLS estimators are BLUE. This is the gist of the famous GaussMarkov theorem, which can be stated as follows: GaussMarkov Theorem: Given the assumptions of the classical linear regression model, the leastsquares estimators, in the class of unbiased linear estimators, have minimum variance, that is, they are BLUE. The proof of this theorem is sketched in Appendix 3A, Section 3A.6.

Question: Prove that OLS coefficient for the slope parameter in the simple linear regression model is unbiased and efficient. Answer: See Appendix A. 3.5 THE COEFFICIENT OF DETERMINATION r 2 : A MEASURE OF GOODNESS OF FIT: The regular coefficient of determination, r 2 is a measure of the closeness of fit in the multiple regression. However,

r 2 , cannot be used as a means of comparing two different equations containing different numbers of
explanatory variables. This is because when additional explanatory variables are added, the proportion
2 of variation in Y (Dependent variable) explained by the Xs (Independent variable), r , will always be

increased. Therefore, we will always obtain a higher r regardless of the importance or not of the additional regressor. For this reason we need a different measure that will take into account the number
2 of explanatory variables included in each model. This measure is called the adjusted r , because it is adjusted for the number of regressors (or adjusted for the degrees of freedom)

Please see page 81-87 for details.

Adjusted r 2 : Adjusted r 2 = 1

RSS /(n k ) RSS (n 1) 1 TSS /(n 1) TSS (n k )

RSS =Residual sum of squares; TSS =Total sum of squares. TSS = RSS + ESS ; ESS = explained sum of squares.

General criteria for model selection: As we know that increasing the number of explanatory variables in multiple regression model will
2 decrease the RSS, and r will therefore increase. However, the cost of that is a loss in terms of degrees

of freedom. A different method-apart from adjusted r - of allowing for number of Xs to change when assessing goodness of fit is to use different criteria for model comparison such as: Akaike Information Criterion (AIC); Schwartz Information Criterion (SIC); Schwartz Bayesian Criterion (SBC), Hannan and Quin Critirion (HQC). The general guideline to select a model: In comparing two or more models, the model with the lowest AIC is preferred when used AIC criterion. Similarly the model with the lowest SIC is preferred when used SIC criterion.

For details please see page: 530-539. 3.6 A NUMERICAL EXAMPLE: Please see this example at page 89. Illustrative Example: Please see 90.

Some examples of Regression

Data Source: wage.wf1 SUMMARY OUTPUT


Regression Statistics Multiple R 0.385548014 R Square 0.148647271 Adjusted R Square 0.14579676 Standard Error 0.388465077 Observations 900 ANOVA df Regression Residual Total 3 896 899 SS MS 23.60801004 7.869336682 135.2109838 0.150905116 158.8189938 F 52.14758055 Significance F 4.53E-31


Coefficients 5.528329355 0.07311662 0.015357835 0.012964063

Standard Error 0.112794574 0.006635679 0.003425312 0.002630727

t Stat 49.01236961 11.01871044 4.483630929 4.927939405

P-value 8.6993E-256 1.44273E-26 8.2905E-06 9.89741E-07

Upper Lower 95% 95% 5.306957 5.749702 0.060093 0.08614 0.008635 0.02208 0.007801 0.018127

5 0 0 0

ESS / df ESS /(k 1) 7.869336682 52 .14758055 RSS / df RSS /(n k ) 0.150905116


R 2 /(k 1) (1 R 2 ) /(n k )

The p-value of obtaining an F value of as much as 52.14758055 or greater is zero, leading to the rejection of the hypothesis that together educ, exper, and tenure have no effect on lnwage. If you were to use the conventional 5% level-of-significance value, the critical F value for 3 df in the numenator and 896 df in the denominator is about 3.84. Obviously the observed F of 52.14758055 far exceeds the critical F values of 3.84 and thereby we reject the null hypothesis of H 0 : 1 2 3 0 .



Series: RESID Sample 1 900 Observations 900 Mean Median Maximum Minimum Std. Dev. Skewness Kurtosis 1.16e-15 0.023735 1.332397 -1.837909 0.387816 -0.236952 3.841932





Jarque-Bera 35.00382 Probability 0.000000

-1.5 -1.0 -0.5 0.0 0.5 1.0

Note that t-testing procedure is based on the assumption that the error term u i follows the normal distribution. Although we cannot directly observe u i , we can observe their proxy, the
i , that is, the residuals. For our lnwage regression, the histogram of the residuals is as shown u

in the above figure. From the histogarmit seems that that the residuals are not normally distributed. In our case, the JB value is 35 with a p- value of 0. It seems that the error term is not normally distributed. For our example, the skewness value is -0.23 and the kutosis value is 3.84. Recall that for a normally distributed variablr the skewness and kurtosis values are, respectively, 0 and 3.

400 350 300 250 200 150 100 50 0 9 10 11 12 13 14 15 16 17 18

Series: EDUC Sample 1 900 Observations 900 Mean Median Maximum Minimum Std. Dev. Skewness Kurtosis Jarque-Bera Probability 13.48000 12.00000 18.00000 9.000000 2.200374 0.541434 2.261056 64.44902 0.000000

Some examples of Regression

Data Source: wage.wf1

ln wage 0 1educ 2 exp er 3tenure i

As may be seen from Table 1.1, all three variables have positive coefficients. These are all above the rule of thumb critical t-value of 2, hence all are significant. So, it may be said that wages will increase as education, experience and tenure increases. Despite the significance of these variables, the adjusted r 2 is quite low (0.145) as there are probably other variables that affect wage.
For example, education increases by 1 year, on average, wage would increase by Taka 0.07. If education was zero, the average wage would be Taka 5.52 (which is the constant). The r value of about 0.15 means that 15% of the variation in lnwage is explained by educ, exper, and tenure. Testing hypothesis: Suppose we want to test null hypothesis that there is no relationship between lnwage and education , that is, the true slope coefficient 1 0 . The estimated value of 1 is 0.073117. If the null hypothesis were true, what is the probability of obtaining a value of 0.073117? Under the null hypothesis, we observe that the t- value is 11.01871 and the p value of obtaining such a t- value is practically zero. In other words, we can reject the null hypothesis resoundingly.
Dependent Variable: LNWAGE Method: Least Squares Date: 11/05/12 Time: 18:49 Sample: 1 900 Included observations: 900

Table 1.1: Results from the wage equation Variable C EDUC EXPER TENURE R-squared Adjusted R-squared S.E. of regression Sum squared resid Log likelihood F-statistic Prob(F-statistic) Coefficient 5.528329 0.073117 0.015358 0.012964 0.148647 0.145797 0.388465 135.2110 -424.0434 52.14758 0.000000 Std. Error 0.112795 0.006636 0.003425 0.002631 t-Statistic 49.01237 11.01871 4.483631 4.927939 Prob. 0.0000 0.0000 0.0000 0.0000 6.786164 0.420312 0.951208 0.972552 0.959361 1.750376

Mean dependent var S.D. dependent var Akaike info criterion Schwarz criterion Hannan-Quinn criter. Durbin-Watson stat

Descriptive statistics: Table 2

EDUC Mean Median Maximum Minimum Std. Dev. Skewness Kurtosis 13.48000 12.00000 18.00000 9.000000 2.200374 0.541434 2.261056

EXPER 11.59222 11.00000 23.00000 1.000000 4.379564 0.072955 2.437964

WAGE 964.2644 912.5000 3078.000 115.0000 405.1624 1.203925 5.746906

TENURE 7.265556 7.000000 22.00000 0.000000 5.080611 0.422778 2.199197

Jarque-Bera Probability

64.44902 0.000000

12.64404 0.001796

500.3712 0.000000

50.85935 0.000000

Sum Sum Sq. Dev.

12132.00 4352.640

10433.00 17243.35

867838.0 1.48E+08

6539.000 23205.53






Equation: Untitled Test Statistic t-statistic F-statistic Chi-square Value 0.498655 0.248656 0.248656 df 896 (1, 896) 1 Probability 0.6181 0.6181 0.6180

Null Hypothesis: C(3)=C(4) Null Hypothesis Summary:

Normalized Restriction (= 0) C(3) - C(4) Restrictions are linear in coefficients.

Value 0.002394

Std. Err. 0.004800

Redundant Variables Test Equation: UNTITLED Specification: LNWAGE C EDUC EXPER TENURE Redundant Variables: TENURE Value 4.927939 24.28459 24.06829 df 896 (1, 896) 1 Probability 0.0000 0.0000 0.0000

t-statistic F-statistic Likelihood ratio F-test summary:

Test SSR Restricted SSR Unrestricted SSR Unrestricted SSR LR test summary: Restricted LogL Unrestricted LogL

Sum of Sq. 3.664668 138.8757 135.2110 135.2110

df 1 897 896 896

Mean Squares 3.664668 0.154822 0.150905 0.150905

Value -436.0776 -424.0434

df 897 896

Restricted Test Equation: Dependent Variable: LNWAGE Method: Least Squares Date: 11/05/12 Time: 18:53 Sample: 1 900 Included observations: 900 Variable C EDUC EXPER R-squared Adjusted R-squared S.E. of regression Sum squared resid Log likelihood F-statistic Prob(F-statistic) Coefficient 5.537798 0.075865 0.019470 0.125573 0.123623 0.393475 138.8757 -436.0776 64.40718 0.000000 Std. Error 0.114233 0.006697 0.003365 t-Statistic 48.47827 11.32741 5.786278 Prob. 0.0000 0.0000 0.0000 6.786164 0.420312 0.975728 0.991736 0.981843 1.770020

Mean dependent var S.D. dependent var Akaike info criterion Schwarz criterion Hannan-Quinn criter. Durbin-Watson stat

Dependent Variable: LNWAGE Method: Least Squares Date: 11/05/12 Time: 18:55 Sample: 1 900 Included observations: 900 Variable C EXPER TENURE R-squared Adjusted R-squared S.E. of regression Sum squared resid Log likelihood F-statistic Prob(F-statistic) Coefficient 6.697589 -0.002011 0.015400 0.033285 0.031130 0.413718 153.5327 -481.2281 15.44241 0.000000 Std. Error 0.040722 0.003239 0.002792 t-Statistic 164.4699 -0.621069 5.516228 Prob. 0.0000 0.5347 0.0000 6.786164 0.420312 1.076062 1.092070 1.082178 1.662338

Mean dependent var S.D. dependent var Akaike info criterion Schwarz criterion Hannan-Quinn criter. Durbin-Watson stat

Omitted Variables Test Equation: UNTITLED Specification: LNWAGE C EXPER TENURE Omitted Variables: EDUC Value 11.01871 121.4120 114.3693 df 896 (1, 896) 1 Probability 0.0000 0.0000 0.0000

t-statistic F-statistic Likelihood ratio F-test summary:

Test SSR Restricted SSR Unrestricted SSR Unrestricted SSR LR test summary: Restricted LogL Unrestricted LogL

Sum of Sq. 18.32169 153.5327 135.2110 135.2110

df 1 897 896 896

Mean Squares 18.32169 0.171162 0.150905 0.150905

Value -481.2281 -424.0434

df 897 896

Unrestricted Test Equation: Dependent Variable: LNWAGE Method: Least Squares Date: 11/05/12 Time: 18:56 Sample: 1 900 Included observations: 900 Variable C EXPER TENURE Coefficient 5.528329 0.015358 0.012964 Std. Error 0.112795 0.003425 0.002631 t-Statistic 49.01237 4.483631 4.927939 Prob. 0.0000 0.0000 0.0000

EDUC R-squared Adjusted R-squared S.E. of regression Sum squared resid Log likelihood F-statistic

0.073117 0.148647 0.145797 0.388465 135.2110 -424.0434 52.14758



0.0000 6.786164 0.420312 0.951208 0.972552 0.959361 1.750376

Mean dependent var S.D. dependent var Akaike info criterion Schwarz criterion Hannan-Quinn criter. Durbin-Watson stat

CLASSICAL NORMAL LINEAR REGRESSION MODEL (CNLRM): CHAPTER 4 4.2 THE NORMALITY ASSUMPTION FOR ui The classical normal linear regression model assumes that each ui is distributed normally with Mean: E(ui ) = 0 (4.2.1) Variance: E*ui E(ui )+2 = E u2 = 2 (4.2.2) cov (ui, uj): E,*(ui E(ui)+*uj E(uj )+-= E(ui uj ) = 0 i j (4.2.3) The assumptions given above can be more compactly stated as ui N(0, 2) (4.2.4) Why the normality assumption is required? Please see page 109

4.3: Properties of the OLS estimators under the normality assumption: See page: 110-112.

CHAPTER FIVE: TWO VARIABLE REGRESSION: INTERVAL ESTIMATION AND HYPOTHESIS TESTING 5.1: Statistical prerequisites 5.2 Interval Estimation: Some basic ideas 5.3 Confidence Intervals for regression coefficients 5.4 CONFIDENCE INTERVAL FOR 2 5.5 HYPOTHESIS TESTING: GENERAL COMMENTS Having discussed the problem of point and interval estimation, we shall now consider the topic of hypothesis testing. In this section we discuss briey some general aspects of this topic; Appendix A gives some additional details. The problem of statistical hypothesis testing may be stated simply as follows: Is a given observation or nding compatible with some stated hypothesis or not? The word compatible, as used here, means sufciently close to the hypothesized value so that we do not reject the stated hypothesis. 5.6 HYPOTHESIS TESTING: THE CONFIDENCE-INTERVAL APPROACH Two-Sided or Two-Tail Test To illustrate the condence-interval approach, once again we revert to the Consumptionincome example. As we know, the estimated marginal propen-sity to consume (MPC), 2, is 0.5091. Suppose we postulate that H0: 2 = 0.3 H1: 2 = 0.3 that is, the true MPC is 0.3 under the null hypothesis but it is less than or greater than 0.3 under the alternative hypothesis. The null hypothesis is a simple hypothesis, whereas the alternative hypothesis is composite; actually it is what is known as a two-sided hypothesis. Very often such a two-sided alternative hypothesis reects the fact that we do not have a strong a priori or theoretical expectation about the direction in which the alternative hypothesis should move from the null hypothesis. Is the observed 2 compatible with H0? To answer this question, let us refer to the condence interval (5.3.9). We know that in the long run intervals like (0.4268, 0.5914) will contain the true 2 with 95 percent probability. Consequently, in the long run (i.e., repeated sampling) such intervals provide a range or limits within which the true 2 may lie with a condence coefcient of, say, 95%. Thus, the condence interval provides a set of plausible null hypotheses. Therefore, if 2 under H0 falls within the 100(1 )% condence interval, we do not reject the null hypothesis; if it lies outside the interval, we may reject it.7 This range is illustrated schematically in Figure 5.2. Always bear in mind that there is a 100 percent chance that the condence interval does not contain 2 under H0 even though the

hypothesis is correct. In short, there is a 100 percent chance of committing a Type I error. Thus, if = 0.05, there is a 5 percent chance that we could reject the null hypothesis even though it is true. Following this rule, for our hypothetical example, H0: 2 = 0.3 clearly lies outside the 95% condence interval given in (5.3.9). Therefore, we can reject the hypothesis that MPC is 0.3, with 95% confidence. One-Sided or One-Tail Test Sometimes we have a strong a priori or theoretical expectation (or expectations based on some previous empirical work) that the alternative hypothe-sis is one-sided or unidirectional rather than two-sided, as just discussed. Thus, for our consumptionincome example, one could postulate that H0: 2 0.3 and H1: 2 > 0.3 Perhaps economic theory or prior empirical work suggests that the mar-ginal propensity to consume is greater than 0.3. Although the procedure to test this hypothesis can be easily derived from (5.3.5), the actual mechanics are better explained in terms of the test-of-signicance approach. 5.7 HYPOTHESIS TESTING: THE TEST-OF-SIGNIFICANCE APPROACH Testing the Signicance of Regression Coefcients: The t Test An alternative but complementary approach to the condence-interval method of testing statistical hypotheses is the test-of-signicance approach developed along independent lines by R. A. Fisher and jointly by Neyman and Pearson. Broadly speaking, a test of signicance is a procedure by which sample results are used to verify the truth or falsity of a null hypothesis. The key idea behind tests of signicance is that of a test statistic (estimator) and the sampling distribution of such a statistic under the null hypothesis. The decision to accept or reject H0 is made on the basis of the value of the test statistic obtained from the data at hand. As an illustration, recall that under the normality assumption the variable t = 2 2/se ( 2)

) x 2 )/ = ( 2 2 i

please see page 129-132

5.8 HYPOTHESIS TESTING: SOME PRACTICAL ASPECTS The Meaning of Accepting or Rejecting a Hypothesis If on the basis of a test of signicance, say, the t test, we decide to accept the null hypothesis, all we are saying is that on the basis of the sample evidence we have no reason to reject it; we are not saying that the null hypothesis is true beyond any doubt. Why? To answer this, let us revert to our consumptionincome example and assume that H0: 2 (MPC) = 0.50. Now the estimated value of the MPC is 2 = 0.5091 with a se ( 2) = 0.0357. Then

on the basis of the t test we nd that t = (0.5091 0.50)/0.0357 = 0.25, which is insignicant, say, at = 5%. Therefore, we say accept H0. But now let us assume H0: 2 = 0.48. Applying the t test, we obtain t = (0.5091 0.48)/0.0357 = 0.82, which too is statistically insignicant. So now we say accept this H0.

Please read page 139: The choice between confidence interval and test of significance approaches to hypothesis testing 5.9 REGRESSION ANALYSIS AND ANALYSIS OF VARIANCE In this section we study regression analysis from the point of view of the analysis of variance and introduce the reader to an illuminating and complementary way of looking at the statistical inference problem. 5.12 EVALUATING THE RESULTS OF REGRESSION ANALYSIS

Please read page 147: Normality tests that include (1) Histogram of residuals; (2) Normal probability plot (NPP), a graphical device; and (3) Jarque-Bera test.