
How to read a regression table?

A regression is a statistical model of the relationship between a dependent variable y and one or more independent variables x1, x2, x3, …, xn. Regression tables are a common feature of empirical articles in OB, psychology, and economics journals, where hypothesized relationships are tested by interpreting the regression output. A good way to learn to read regression tables is to read the article text and relate what it says to the numbers in the table.

1. What is the dependent variable (DV)? This is the variable that we are trying to predict or explain. What is desirable in real terms: an increasing DV or a decreasing DV?
2. What are the independent variables (IVs)? How are they measured? What does a positive or negative sign on a coefficient represent? [A +ve sign means that an increase in the IV leads to an increase in the DV.]
3. How many observations have been included in the sample?
4. A regression is an optimization procedure that derives an equation predicting or explaining the dependent variable in terms of the independent variables. Its general form is DV = f(IV1, …, IVn) + c, where c is a constant.
5. One regression table often contains the output of several regressions. The output of each regression is listed in its own column, and the columns are often numbered 1, 2, 3, …. The first thing to find out is how the columns differ. If the column headings are different, each column may be predicting a different DV. If the columns are simply numbered, each column may represent a more complete explanation of the same DV, usually by adding more IVs in a stepwise fashion. If you are confused or short of time, a safe option is to look at the rightmost column, which often contains the largest number of IVs and is often the most complete description of the DV.
6. Under each column, the numbers in the regression table represent the following:
   a. Coefficients of IVs: The IVs are usually listed at the extreme left of the table, and the value listed against each IV is its coefficient in the regression equation. The coefficients may be either standardized or non-standardized. The values reported in economics journals are usually non-standardized, whereas those reported in OB/psychology journals are usually standardized. In case you are not sure, this is mentioned in the table or in the text.
      i. Non-standardized coefficients: These coefficients are obtained when the analysis variables have retained their units (hence they remain sensitive to units of measurement). They can be used directly to plug in values of the IVs and predict the value of the DV.
      ii. Standardized coefficients: These coefficients are obtained when the analysis variables are standardized (i.e. the mean is subtracted from each value and the difference is divided by the standard deviation). They are independent of units and cannot be used to predict the value of the DV in meaningful terms. However, they can be compared with each other: the larger the coefficient, the stronger the relationship of the IV with the DV.
   b. Statistical significance of coefficients of IVs: Statistical significance is usually displayed on the coefficients in the form of one, two, or three "*", whose explanation is provided at the bottom of the table. The greater the number of *'s, the more statistically significant the coefficient is. Statistical significance represents the confidence with which we can say that a particular coefficient is non-zero. A coefficient of zero means that the independent variable is not related to the dependent variable. If a coefficient is not statistically significant, then irrespective of its numerical value we cannot distinguish it from zero and can treat it as zero. Sometimes the *'s are not present and one of three other indications of statistical significance is used: (i) a footnote in the table mentions that all reported values (above a particular value) are significant; (ii) the p value is given in parentheses, where p values less than .10 are considered significant; (iii) the t statistic is given in parentheses (a larger absolute t statistic implies higher statistical significance).
   c. Adjusted R2: The adjusted R square is a value, normally between zero and one, which represents the proportion of the variance in the dependent variable that can be explained by the set of independent variables in the column, after a penalty for the number of IVs. (It can fall slightly below zero for a very poorly fitting model.) Often, in a series of regressions displayed in successive columns, you will find the adjusted R square increasing, which shows that as variables are added to the regression the explanatory power of the model gets better.
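The difference between the two kinds of coefficient, and the way the adjusted R square is computed, can be illustrated with a minimal Python sketch. The data below (years of experience and salary) are invented for illustration, and only one IV is used so that the standard library is enough; a real analysis would normally rely on a statistics package.

```python
from statistics import mean, stdev

# Hypothetical data: x = years of experience (IV), y = salary in $1000s (DV).
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [2.1, 2.9, 3.2, 4.8, 5.1, 5.9, 7.2, 7.8]


def simple_ols(xs, ys):
    """Least-squares slope and intercept for a regression with a single IV."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    sxx = sum((a - mx) ** 2 for a in xs)
    slope = sxy / sxx
    return slope, my - slope * mx


def zscores(values):
    """Standardize: subtract the mean, then divide by the standard deviation."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]


# Non-standardized coefficient: expressed in the units of the data.
slope, intercept = simple_ols(x, y)

# Standardized coefficient: the same regression run on z-scored variables.
beta, beta_intercept = simple_ols(zscores(x), zscores(y))

# R square and adjusted R square (n observations, k independent variables).
predicted = [intercept + slope * a for a in x]
ss_res = sum((b - p) ** 2 for b, p in zip(y, predicted))
ss_tot = sum((b - mean(y)) ** 2 for b in y)
r2 = 1 - ss_res / ss_tot
n, k = len(y), 1
adj_r2 = 1 - (1 - r2) * (n - 1) / (n - k - 1)

print(f"non-standardized coefficient: {slope:.3f} ($1000s per year)")
print(f"standardized coefficient:     {beta:.3f} (unit-free)")
print(f"R2 = {r2:.3f}, adjusted R2 = {adj_r2:.3f}")
```

Only the non-standardized pair (slope and intercept) can be used to predict a salary from a given number of years. The standardized coefficient equals the non-standardized one multiplied by sd(x)/sd(y), which is why it no longer depends on the units of measurement.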

7. Log terms and Power terms: If the independent variable is a logarithm or a power term (i.e. a square or a cube), the relationship between the IV and the DV has to be interpreted accordingly, as y = k*ln(x) + constant or y = k*x^2 + constant.
8. Moderation: Moderation is tested in the regression by entering the cross product of two variables as an independent variable. Suppose A is related to C, the interaction term A*B is also entered into the regression equation, and we find that the interaction term A*B is significant. This means that B moderates the relationship between A and C. If the coefficient of the interaction term is +ve, it means that as the value of B increases, the strength of the relationship between A and C increases. If the coefficient of the interaction term is negative, it means that as the value of B increases, the strength of the relationship between A and C reduces.
9. Mediation: Regression tries to fit the most likely relationship between two or more variables. Suppose a variable A has an impact on B, which in turn leads to C (i.e. B is a mediator), and suppose initially we include only A and C in the regression. At this time, the regression algorithm is likely to show a strong relationship between A and C. However, once B is entered into the regression, the algorithm identifies that B is more strongly related to C than A is, and the earlier coefficient of A reduces in size or significance. This shows that B mediates (partially or fully) the relationship between A and C.
10. Residuals: Residuals are the differences between the actual and the estimated values of the dependent variable (after the regression). A residual is also known as the "error", and it represents the unexplained portion of the regression. There is one residual value corresponding to each data point.
11. Statistical and Practical significance: The statistical significance of a relationship can be established by the statistical significance of the coefficients of the independent variables and the value of the adjusted R square. Statistical significance simply means that the independent variable(s) have some relationship with the DV and are able to predict the DV with some certainty. However, this does not necessarily mean that the relationship is practically meaningful. The practical significance of a relationship can be tested by plugging high and low values of the IV into the (non-standardized) regression equation and seeing the impact on the values of the DV. If the change in DV values is meaningful enough, we can conclude that the relationship is practically significant.
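The moderation and practical-significance checks described above can be tried out with a short Python sketch. The coefficients and the "low" and "high" values below are hypothetical, chosen only for illustration as if they had been read off a table of non-standardized coefficients; they do not come from any real study.

```python
# Hypothetical non-standardized coefficients for the equation
#   C = INTERCEPT + COEF_A*A + COEF_B*B + COEF_AB*(A*B)
INTERCEPT = 2.0
COEF_A = 0.50
COEF_B = 0.30
COEF_AB = 0.20  # positive interaction term: B strengthens the A-C relationship


def predict(a, b):
    """Plug values of A and B into the regression equation to estimate C."""
    return INTERCEPT + COEF_A * a + COEF_B * b + COEF_AB * a * b


def slope_of_a(b):
    """Change in C per one-unit change in A, at a given value of B."""
    return COEF_A + COEF_AB * b


# Assumed low and high values of each variable (e.g. mean -/+ one SD).
LOW_A, HIGH_A = 2.0, 8.0
LOW_B, HIGH_B = 1.0, 5.0

for b in (LOW_B, HIGH_B):
    change = predict(HIGH_A, b) - predict(LOW_A, b)
    print(f"B = {b}: slope of A = {slope_of_a(b):.2f}, "
          f"change in C from low A to high A = {change:.2f}")
```

Whether the resulting change in C counts as practically significant is a judgment about the DV's real-world scale, not something the code can decide. The sketch only shows how the size of the effect of A depends on the level of B when the interaction coefficient is positive.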