# Interpreting Regression Output

Suppose I am investigating the relationship between types of cars and their miles per gallon. My hypothesis is that luxury models are gas guzzlers, and I am testing it using 1978 auto data. I use weight as a proxy for luxury, since I expect luxury cars to be heavier. It also seems to make sense that heavier cars would use more gas.

At the command window, type `sysuse auto`. This loads our sample data into Stata. Next, try `regress mpg weight`.
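The two commands above can be collected into a short do-file. This is a minimal sketch: the `clear` option and the `describe` line are additions that do not appear in the original text.

```stata
* Load the 1978 automobile data that ships with Stata.
* The clear option (an addition here) discards any dataset already in memory.
sysuse auto, clear

* Optional check (not in the original text): confirm the two variables exist.
describe mpg weight

* Simple regression of miles per gallon on weight (in pounds).
regress mpg weight
```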

Stata outputs analysis of variance (anova) results along with the regression results. The anova table is at the top left, and the regression results are at the bottom. The dependent variable here is miles per gallon (mpg); its name is shown at the top left of the regression results table. Weight is measured in pounds. The coefficients for weight and the constant are shown in the Coef. column. Std. Err. is the standard error, t is the t test statistic, P>|t| is the p value, and the last two columns give the 95% confidence interval.

The results can be written in regression equation form as:

predicted MPG = 39.44 - 0.006WEIGHT

For each one-pound increase in auto weight, predicted miles per gallon decrease by 0.006, and the coefficient is statistically significant at the 99% level or better (a p value shown as 0.000 is less than 0.0005). The standard error is very small relative to the coefficient, so the absolute value of the t test statistic is large. You can tell the statistical significance from the p value: when it is less than 0.05, the coefficient is significant at the 95% level, and when it is less than 0.01, it is significant at the 99% level.

The constant (_cons) is the intercept of the regression line, or the starting point: predicted mpg would be about 39 for a car with zero weight. That may not make sense as such, because no car weighs zero pounds. The intercept is an extrapolation that anchors the line, not a typical mpg.
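The fitted equation can be used to compute a prediction by hand. With the rounded values from the text, a car weighing 3,000 pounds would have a predicted MPG of 39.44 - 0.006 × 3000 = 21.44. The sketch below does the same calculation from the coefficients Stata stores after `regress`. Stata uses the unrounded coefficients, so its answer will differ slightly. The 3,000-pound weight is a hypothetical value chosen for illustration.

```stata
* Refit the simple model so the stored results are available.
sysuse auto, clear
regress mpg weight

* Stored coefficients: _b[_cons] is the intercept, _b[weight] is the slope.
display "intercept = " _b[_cons]
display "slope     = " _b[weight]

* Predicted mpg for a hypothetical 3,000-pound car.
display "predicted mpg at 3000 lb = " _b[_cons] + _b[weight]*3000

* Rebuild the 95% confidence interval for the slope from the
* coefficient, its standard error, and the residual degrees of freedom.
display "lower bound = " _b[weight] - invttail(e(df_r), 0.025)*_se[weight]
display "upper bound = " _b[weight] + invttail(e(df_r), 0.025)*_se[weight]
```

The last two lines should reproduce the interval printed in the regression table, which is a quick way to see how the Coef., Std. Err., and confidence interval columns relate.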

The top right corner lists information associated with the anova and the regression output. The total number of observations used for the analysis is 74. The F test statistic with 1 numerator degree of freedom and 72 denominator degrees of freedom is about 134, and it is statistically significant at the 99% level. In this regression I got an R-squared of 0.65, meaning about 65% of the variance in mpg is explained by the model. Adjusted R-squared adjusts the value of R-squared for the number of variables relative to the sample size. I will come back to R-squared and adjusted R-squared in the next model. Root MSE is the square root of the mean squared error (MS Residual in the anova table), and is the estimated standard deviation of the error term, that is, of what is not explained by the model.

Now, I want to control for whether the cars are U.S. models or non-U.S. models, in addition to weight, in predicting miles per gallon. What I did earlier is a simple regression with just one predictor variable. With two predictors, it is an example of a multiple regression.

Variables that have a binary outcome, like U.S. vs non-U.S. models, are called dummy variables. The interpretation of such a variable is easier if you code it as 0 or 1. Here, the variable foreign is coded 0 for U.S. (domestic) cars and 1 for non-U.S. (foreign) cars.

predicted MPG = 41.68 - 0.0066WEIGHT - 1.65FOREIGN

You can plug 0 into foreign to estimate the MPG for domestic cars, and 1 for foreign cars: MPG is 1.65 less for foreign cars than for domestic cars of the same weight. Controlling for foreign cars, heavier models still use more gas: each one-pound increase in weight results in 0.0066 less mpg, and it is statistically significant at the 99% level, because the p value is 0.000.

Notice that foreign is not statistically significant at any conventional level of significance in this model. So can we say foreign, after all, is not important in estimating mpg? Here, it is very important that you distinguish statistical and substantive significance. Statistical significance is based on the p value: the probability of obtaining a sample estimate at least as far from zero as the one observed, assuming the null hypothesis of no relationship is true. In addition, statistical significance can change by getting more observations, or by fitting the regression line better. Later you can see the change in the statistical significance of foreign by making an adjustment to the model.

In this model, R-squared is 0.6627. In the earlier model, R-squared was 0.65, so by adding one variable I am explaining about one percentage point more of the variance in mpg. Naturally, R-squared will be larger if you have more variables, but the adjusted R-squared takes the number of variables into account. The formula to get the adjusted R-squared is 1 - (1 - R-squared) * ((n - 1)/(n - k - 1)), where n is the number of observations and k is the number of predictor variables. It can be useful when you have many variables and a small sample size. In the earlier model, adjusted R-squared was 0.6467, and in the current model it is 0.6532. So this model still explains the mpg better.

We jumped right in to regression, but there is a whole series of assumptions we are making in running regression analyses. In your study, you need to check the data to see if the regression assumptions are met. UCLA has very good sites where they discuss regression diagnostics.
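The two-predictor model and the adjusted R-squared formula discussed above can be checked with a few lines of Stata. This is a sketch, and it assumes the second model was fit with `regress mpg weight foreign`, a command the text itself does not show. Plugging the reported values into the formula (R-squared 0.6627, n = 74, k = 2) gives 1 - 0.3373 × 73/71 ≈ 0.6532, which matches the adjusted R-squared quoted in the text. The 3,000-pound weight is again a hypothetical value chosen for illustration.

```stata
* Fit the multiple regression with weight and the foreign dummy.
* Assumption: this is the command behind the second model in the text.
sysuse auto, clear
regress mpg weight foreign

* Predicted mpg for a hypothetical 3,000-pound car:
* foreign = 0 for a domestic car, foreign = 1 for a foreign car.
display "domestic: " _b[_cons] + _b[weight]*3000 + _b[foreign]*0
display "foreign:  " _b[_cons] + _b[weight]*3000 + _b[foreign]*1

* Adjusted R-squared by hand, with n = e(N) observations
* and k = e(df_m) predictor variables.
display "by hand: " 1 - (1 - e(r2)) * (e(N) - 1) / (e(N) - e(df_m) - 1)

* Stata's own stored value, for comparison.
display "stored:  " e(r2_a)
```

The difference between the two predictions equals the coefficient on foreign, which is the sense in which a 0/1 dummy shifts the intercept without changing the slope on weight.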