Introduction
• In multiple regression and correlation we use more than one independent
variable to investigate the dependent variable.
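As a concrete sketch (the data here are made up for illustration), a multiple regression with two independent variables X1 and X2 can be fit by ordinary least squares, e.g. with NumPy:

```python
import numpy as np

# Hypothetical data: 10 observations, two independent variables
# (X1, X2) and one dependent variable Y.
rng = np.random.default_rng(0)
X1 = rng.uniform(0, 10, size=10)
X2 = rng.uniform(0, 10, size=10)
Y = 2.0 + 0.5 * X1 + 1.1 * X2 + rng.normal(0, 1, size=10)

# Design matrix with a leading column of ones for the intercept.
X = np.column_stack([np.ones_like(X1), X1, X2])

# Ordinary least squares: find b minimizing ||Y - X @ b||².
b, *_ = np.linalg.lstsq(X, Y, rcond=None)
intercept, b1, b2 = b
print(f"intercept={intercept:.3f}, b1={b1:.3f}, b2={b2:.3f}")
```

The same fit (with standard errors, t statistics, and p-values like those in the output below) is what a spreadsheet regression tool produces.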
ANOVA
              df        SS         MS          F         Significance F
Regression     2    21.57613   10.78806    9.411471    0.010371
Residual       7     8.023873   1.146268
Total          9    29.6
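The arithmetic in the ANOVA table can be checked directly: each mean square is MS = SS / df, and the F statistic is the ratio of the regression mean square to the residual mean square.

```python
# Values taken from the ANOVA table above.
ss_reg, df_reg = 21.57613, 2
ss_res, df_res = 8.023873, 7

ms_reg = ss_reg / df_reg   # mean square, regression
ms_res = ss_res / df_res   # mean square, residual
f_stat = ms_reg / ms_res   # F statistic

print(ms_reg, ms_res, f_stat)
```

The computed F matches the tabled 9.411471, and its significance (p = 0.010371) would come from the F distribution with (2, 7) degrees of freedom.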
               Coefficients   Standard Error    t Stat      P-value
Intercept      -13.81963       13.32330         -1.03725    0.33411
X1               0.56366        0.30327          1.85859    0.10543
X2               1.09947        0.31314          3.51112    0.00984
Multiple R
• The "R" column represents the value of R, the
multiple correlation coefficient. R can be
considered to be one measure of the quality
of the prediction of the dependent variable.
R Square
• The "R Square" column represents the R2 value
(also called the coefficient of determination),
which is the proportion of variance in the
dependent variable that can be explained by the
independent variables (technically, it is the
proportion of variation accounted for by the
regression model above and beyond the mean
model).
• You can see from our value of 0.728 that our
independent variables explain 72.8% of the
variability of our dependent variable.
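The 0.728 figure follows directly from the ANOVA table above, since R² = SS_regression / SS_total; the table values give approximately 0.7289, which matches the slide's 0.728 up to rounding.

```python
# R² = SS_regression / SS_total, using the ANOVA table values.
ss_reg = 21.57613
ss_total = 29.6

r_squared = ss_reg / ss_total
print(round(r_squared, 4))  # → 0.7289
```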
Interpretation of R squared
• After fitting a linear regression model, you need to
determine how well the model fits the data. Does it do
a good job of explaining changes in the dependent
variable?
• R-squared is a goodness-of-fit measure for linear
regression models.
• Because it measures only overall fit, R2 by itself
cannot be used to identify which predictors should be
included in a model and which should be excluded.
R2 ranges from 0 to 1, where 0 indicates that the
outcome cannot be predicted at all by the independent
variables and 1 indicates that the outcome can be
predicted without error from the independent variables.
Interpretation of R squared
• R-squared evaluates the scatter of the data points
around the fitted regression line. It is also called
the coefficient of determination, or the
coefficient of multiple determination for multiple
regression.
• For the same data set, higher R-squared values
represent smaller differences between the
observed data and the fitted values.
• R-squared is the percentage of the dependent
variable variation that a linear model explains.
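A small sketch (with made-up observed and fitted values) showing how small scatter around the fitted line translates into a high R², computed as 1 − SS_residual / SS_total:

```python
import numpy as np

# Hypothetical observed values Y and fitted values ŷ from a regression.
y = np.array([3.1, 4.0, 5.2, 6.1, 6.8, 8.2])
y_hat = np.array([3.0, 4.1, 5.0, 6.0, 7.1, 8.0])

ss_res = np.sum((y - y_hat) ** 2)     # scatter around the fitted line
ss_tot = np.sum((y - y.mean()) ** 2)  # total variation about the mean

r_squared = 1 - ss_res / ss_tot
print(round(r_squared, 4))  # → 0.9885
```

Here the fitted values track the observations closely, so the residual scatter is a tiny fraction of the total variation and R² is near 1.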
Are Low R-squared Values Always a Problem?
Standard Error of Estimates
• The standard error of estimate is

  Se = √( Σ(Y − ŷ)² / (N − K − 1) )

• Where
• Y = sample values of dependent variable
• ŷ = corresponding estimated values from the
regression equation
• N = number of data points
• K = number of independent variables
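Using the quantities Y, ŷ, N, and K defined above, a minimal sketch (with hypothetical data) of computing the standard error of estimate, Se = √( Σ(Y − ŷ)² / (N − K − 1) ):

```python
import numpy as np

# Hypothetical observed values Y and regression estimates ŷ.
Y = np.array([5.0, 7.1, 6.3, 8.8, 9.5, 11.0, 10.2, 12.4, 13.1, 14.0])
y_hat = np.array([5.4, 6.8, 6.6, 8.5, 9.9, 10.6, 10.5, 12.0, 13.3, 14.2])

N = len(Y)  # number of data points
K = 2       # number of independent variables, as in the output above

# Se = sqrt( sum((Y - ŷ)²) / (N - K - 1) )
s_e = np.sqrt(np.sum((Y - y_hat) ** 2) / (N - K - 1))
print(round(s_e, 4))  # → 0.3928
```

The denominator N − K − 1 is the residual degrees of freedom, matching the df = 7 on the Residual row of the ANOVA table for N = 10 and K = 2.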