REGRESSION ANALYSIS CHEAT SHEET
GRADUATE RESOURCE CENTER, UNIVERSITY OF NEW MEXICO

1 REGRESSION BASICS

Definitions and Terms

Variable: A measurable characteristic that varies (by group, individual, or time)

Dependent/Outcome Variable (DV): The presumed effect in an analysis

Independent/Explanatory Variable (IV): The presumed cause in an analysis

Control Variable/Covariate: A variable that is not itself studied but is included in the model/analysis

Parameter: An unknown population characteristic

Best Fitting Line: When plotting the data, the most appropriate line showing the relationship between the dependent and independent variables

Residual: The deviation of an observed value (data point) from the fitted line (estimated value)

Error: The difference between the observed value and the true value (often unobserved)

Regression Coefficient: Describes the relationship between a DV and an IV

Understanding Regression

Can we explain how much, on average, the DV changes because of the IV?

Can we predict, on average, what the DV value might be for a given value of the IV?

Can we determine whether the amount of change in one variable is related, on average, to the amount of change in another?

Model Building

A model is a simplified representation of reality. When building a model, think:
1. What are the variables that interact?
2. How do those variables interact?
3. Apply underlying theories

2 REGRESSION

Graphical Representation (univariate linear model)
[Figure: best fitting line through the observed data points]

Mathematical Representation

1. Linear regression (DV is a continuous variable)

   yi = β0 + β1 xi + εi

   where yi is the DV, β0 the constant, β1 the coefficient, xi the IV, and εi the error.

2. Logistic regression (DV is a binary variable)

   Pr(yi = 1 | xi) = 1 / (1 + exp(-(β0 + β1 xi)))
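For example, both models can be fit in Python with the statsmodels package. This is a minimal sketch: the simulated data and the names x, y, and y_binary are made up for illustration and are not part of this sheet.

import numpy as np
import statsmodels.api as sm

# Simulated example data (hypothetical; substitute your own DV and IV).
rng = np.random.default_rng(0)
x = rng.normal(size=200)                                  # IV
y = 2.0 + 1.5 * x + rng.normal(size=200)                  # continuous DV
p = 1 / (1 + np.exp(-(0.5 + 1.0 * x)))
y_binary = (rng.uniform(size=200) < p).astype(int)        # binary DV

X = sm.add_constant(x)                                    # adds the constant term β0

# 1. Linear regression: yi = β0 + β1 xi + εi
linear_model = sm.OLS(y, X).fit()
print(linear_model.summary())

# 2. Logistic regression: Pr(yi = 1 | xi) = 1 / (1 + exp(-(β0 + β1 xi)))
logit_model = sm.Logit(y_binary, X).fit()
print(logit_model.summary())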
Steps to Regression Analysis
1. Choose a suitable model
2. Choose suitable predictors
3. Select a technique for fitting the model
4. Fit the model
5. Assess the goodness of fit of the model
6. Interpret and conclude - make causal inferences, issue predictions

3 REGRESSION RESULT

After Regression, what to look for

1. R-square (R2): The amount of variance of the DV explained by the IV (often described as a percentage)

2. p-value of the model: Tests whether R2 is different from 0. A value less than 0.05 indicates a statistically significant relationship between the IV and the DV.

3. p-value of the coefficient: The p-value of the test of the null hypothesis (H0) that the coefficient equals 0; a small value suggests the coefficient is different from 0.

4. Coefficient (βi): For each one-unit increase in the IV, the DV is expected to increase by βi (or decrease if βi is negative), holding all the other independent variables constant.
   Note: The coefficient of a logistic regression is interpreted differently.

5. Regression results also provide:
   t-statistic: The coefficient divided by its standard error
   Standard error: The standard deviation of the coefficient
   Confidence interval of the coefficient - usually 95%
   Root mean squared error: The standard deviation of the regression
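As a sketch of where these quantities appear in software, continuing with the linear_model object fitted in the earlier Python example (the attribute names are statsmodels conventions, not terms from this sheet):

# Quantities 1-5 above, read off the fitted OLS result `linear_model`.
print("R-square:             ", linear_model.rsquared)          # 1. variance of DV explained by IV
print("p-value of the model: ", linear_model.f_pvalue)          # 2. F-test that R-square differs from 0
print("coefficient p-values: ", linear_model.pvalues)           # 3. tests of H0: coefficient = 0
print("coefficients:         ", linear_model.params)            # 4. constant and βi
print("t-statistics:         ", linear_model.tvalues)           # 5. coefficient / standard error
print("standard errors:      ", linear_model.bse)
print("95% conf. intervals:  ", linear_model.conf_int(alpha=0.05))
print("root mean squared err:", linear_model.mse_resid ** 0.5)  # std. deviation of the regression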
Check for Key Assumptions

1. Linearity: The relationship between the IV and the mean of the DV is linear.

2. Homoscedasticity: The variance of the residuals is the same for any value of the IV.

3. Normality: For any fixed value of the IV, the DV is normally distributed.

4. Multicollinearity: The independent variables are not perfectly multicollinear (one IV should not be a linear function of another).

Note: Assumptions 1, 2, and 3 do not apply to logistic regression.
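One way to eyeball these assumptions, sketched in Python with matplotlib and statsmodels, continuing with linear_model and X from the earlier example (the VIF rule of thumb in the comment is a common convention, not part of this sheet):

import matplotlib.pyplot as plt
import statsmodels.api as sm
from statsmodels.stats.outliers_influence import variance_inflation_factor

# Linearity (1) and homoscedasticity (2): residuals vs. fitted values should
# show no pattern and a roughly constant spread around zero.
plt.scatter(linear_model.fittedvalues, linear_model.resid)
plt.axhline(0, color="grey")
plt.xlabel("Fitted values")
plt.ylabel("Residuals")
plt.show()

# Normality (3): a Q-Q plot of the residuals should follow the 45-degree line.
sm.qqplot(linear_model.resid, line="45")
plt.show()

# Multicollinearity (4): variance inflation factors for each column of X;
# values far above roughly 5-10 are commonly read as a warning sign.
print([variance_inflation_factor(X, i) for i in range(X.shape[1])])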
Selecting the "BEST" Model

The goal is to minimize the residual mean square (which maximizes R2) by comparing regression models.

Use information criterion statistics:
Akaike's Information Criterion (AIC)
Bayesian Information Criterion (BIC)
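A short sketch of such a comparison in Python with statsmodels, reusing rng, x, and y from the first example; the second predictor x2 is hypothetical and is added only to give two candidate models (lower AIC/BIC is preferred):

import numpy as np
import statsmodels.api as sm

x2 = rng.normal(size=200)          # hypothetical extra predictor, for illustration only

m1 = sm.OLS(y, sm.add_constant(x)).fit()                         # model 1: y ~ x
m2 = sm.OLS(y, sm.add_constant(np.column_stack([x, x2]))).fit()  # model 2: y ~ x + x2

for name, m in [("y ~ x", m1), ("y ~ x + x2", m2)]:
    print(f"{name}: AIC={m.aic:.1f}  BIC={m.bic:.1f}  R2={m.rsquared:.3f}")

AIC and BIC can disagree, since BIC penalizes additional parameters more heavily.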
NEED HELP? Contact Us
Graduate Resource Center
Mesa Vista Hall, Suite 1057
Phone: 505-277-1407
Email: unmgrc@unm.edu
Website: https://unmgrc.unm.edu/