
Regression Assumptions

1. Linearity
ways to determine:
• scatter plot: graph the relationship of Y and X
  o the graph must possess an upward, downward, or elastic curve
• Pearson Correlation

2. Normality
-the DV must be normal
-whether the IVs are normal is not important
ways to determine:
• Shapiro-Wilk test: tells whether the data are normal or not, using the P-value that SPSS gives
-HO: data is normally distributed
-if P-value > .05, do not reject HO (the data are normal)

3. Multicollinearity
-for multiple regression only
-in simple regression, the VIF is always 1 because there is no other IV, so no multicollinearity
-an IV predicting another IV: this should not be the case
ways to determine:
• Pearson R: the value must be very low, R < .30
• VIF: must be less than 3 (VIF < 3); perfect = 1

4. Homoskedasticity
-purpose: check whether the errors in the plots (your expected values) are well scattered
-no upward or downward pattern; they must be scattered
ways to determine:
• look at P-Plots: the normality P-plots in SPSS
• Breusch-Pagan test: gives a probability value
-HO: data is homoskedastic
-if P-value > .05, do not reject HO

5. Autocorrelation
-the residuals must have no pattern at all
-if the residuals show a downward or upward curve, the regression will be biased; regression must be unbiased
ways to determine:
• Durbin-Watson test: value between 1.5 and 2.5

6. Exogeneity
-there are variables that you omitted from the model but should have been included instead of other IVs
-no statistics involved; use observation skills

Other tests to check assumptions
1. Multicollinearity: Pearson < .30
2. Normality of error: scatterplot of the standardized residuals within +/- 3
3. Cook's Distance: flags influential observations; values must be < 1

Remedies (if any of the assumptions are violated)
• Check the model: do the variables in the model actually have a relationship?
  ex: the price of a hammer against the price of a pandesal (unrelated)
• Transform the data: if the data are transformed, the results must be interpreted on the transformed scale
-natural log: most commonly used
-log
-inverse
-square
• Exogeneity: omit, add, or delete variables
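The transform-the-data remedy can be sketched in Python (a hypothetical illustration; the variable is simulated, not from the notes). As noted above, the natural log is the most common choice:

```python
import numpy as np

# Simulated right-skewed variable (hypothetical, not the notes' data)
rng = np.random.default_rng(0)
x = rng.lognormal(mean=0.0, sigma=1.0, size=200)

# Candidate transforms from the Remedies list; results must then be
# interpreted on the transformed scale, not the original one
x_log = np.log(x)      # natural log (most commonly used)
x_inv = 1.0 / x        # inverse
x_sq = np.square(x)    # square

# Simple skewness check: closer to 0 means more symmetric
def skew(a):
    a = np.asarray(a, dtype=float)
    return float(np.mean((a - a.mean()) ** 3) / a.std() ** 3)

print(skew(x), skew(x_log))  # the log-transformed series is far less skewed
```

After transforming, any coefficient is a per-unit effect on the transformed variable, which is why interpretation has to stay on the transformed scale.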
SPSS INTERPRETATION
the main objective is to know the magnitude of the impact of the IVs on the DV

1. Significance of the Model
ways to determine:
• ANOVA P-Value: < .05
-HO: model is insignificant (reject HO if P < .05)

2. Significance of each IV
ways to determine:
• t-test P-value: < .05
-HO: the IV is insignificant (reject HO if P < .05)

3. Pearson R: measures the relationship of the IVs to the DV
-if R is almost 1, the variables are almost perfectly related
ways to determine:
• R: .50-.75 / 50%-75%

4. Coefficient of determination
-the main purpose is to know what percent of the variance of the DV is attributed to, or can be explained by, the IVs
-the value must not be perfect, because there is always an error

5. Interpretation of coefficients
• standardized: puts the IVs on an equal footing, for the purpose of comparing the effect of one against the others
• unstandardized: no comparison; use when you want the original units of measurement

6. Partial correlation: how much is the effect of each IV on the DV (semi-partial)
-if you want to omit variables from the model: the IV with the lowest semi-partial (partial R) value contributes the least; you may remove the lowest to improve the model

MLR BASKETBALL

Data: in a basketball setting, we want to know the efficiency rating of a player

does the relationship truly follow the supposed relationship? yes, but not 100% sure

assume that the rating depends on 3 factors: Points, Assists, Rebounds

there will always be an error term: you could also have input fouls, time played, or number of violations
CHECK ASSUMPTIONS

1. Linearity
ways to determine: scatter plot
steps: Graphs > Chart Builder > Scatter Plot > drag the scatter plot > drag a variable onto each axis > OK

rating: Y axis, points: X axis
interpretation: upward (linear)

rating: Y axis, assists: X axis
interpretation: downward (we are not sure if it is linear)

rating: Y axis, rebounds: X axis
interpretation: downward (we are not sure if it is linear)

OVERALL: we are not sure, but set it aside first

YT source (linearity test):
steps: Analyze > Compare Means > Means > select the DV and IVs > Options > Test for linearity > OK
-there must be significant linearity between the IV and DV
-the deviation from linearity must be greater than .05
interpretation: there is no significant deviation from linearity
conclusion: they have a linear relationship

2. Normality
ways to determine:
• Shapiro-Wilk test: tells whether the data are normal or not, using the P-value that SPSS gives (P-value > .05)
steps: Analyze > Descriptive Statistics > Explore > rating: DV > go to Plots > check Normality plots with tests > uncheck Stem-and-leaf > Continue > OK > check the P-value under "Sig." in Tests of Normality

interpretation: .45 > .05
conclusion: the data are normal; if the P-value is higher than .05, we accept the HO
OVERALL: passed the assumption

3. Homoskedasticity
ways to determine: Breusch-Pagan test
-HO: data is homoskedastic (P-value > .05)
steps: Analyze > General Linear Model > Univariate > rating: DV, the rest as Covariates > Options > check Modified Breusch-Pagan test and Breusch-Pagan test > Continue > OK

-Modified Breusch-Pagan test: gives a lower value, more sensitive
-Breusch-Pagan test: use if you are not so strict

interpretation: .277 > .05
conclusion: the data are homoskedastic
OVERALL: passed the assumption

NOTES: every assumption passed except linearity
CHECK Regression

steps: Analyze > Regression > Linear > rating: DV, the others as IVs > Statistics: check all except Confidence intervals > Continue > Plots: ZRESID on Y, ZPRED on X > check Normal probability plot > Continue > Save: check Cook's > Continue > OK

4. Multicollinearity
ways to determine: VIF must be less than 3 (VIF < 3)
interpretation:
1.764 < 3
1.959 < 3
1.175 < 3
conclusion: no IV is multicollinear; they are not related with each other
OVERALL: passed the assumption

5. Autocorrelation: check the model summary
ways to determine: Durbin-Watson test: 1.5 to 2.5
interpretation: 1.5 < 2.392 < 2.5
OVERALL: passed the assumption

Cook's Distance: no value above 1 (in the saved variable)
OVERALL: passed the assumption

SPSS INTERPRETATION

1. Significance of the Model
ANOVA P-Value: < .05
interpretation: .09 > .05
conclusion: do not reject HO at the .05 level, so the model is not significant at .05 error; at .10, however, the model is significant

2. Significance of each IV
t-test P-value < .05; HO: the IV is insignificant (reject HO if P < .05)
interpretation:
points: .035 - significant
assists: .546 - not significant
rebounds: .633 - not significant

3. Pearson R
R: .50-.75 / 50%-75%
interpretation:
R = .789: given the 3 variables (rebounds, assists, points), their relationship to the DV is .789 (highly correlated)
R^2 = .623: 62.3% of the value of the efficiency rating is explained by points, rebounds, and assists
4. Partial R
points: .683 (biggest effect on the DV)
assists: .160
rebounds: -.126
interpretation: points has the biggest effect on the efficiency rating; this is the reason why the other two IVs (assists and rebounds) are not significant in the t-test

5. Interpretation of coefficients

Standardized: an equal comparison, ex. comparing the 3 IVs while disregarding their units
points: .907
assists: .225
rebounds: -.137 (insignificant)
interpretation: if you compare points, assists, and rebounds on the efficiency rating:
-points: when your points increase, expect your efficiency rating to increase by almost 1
-assists: when your assists increase, expect your efficiency rating to increase by almost .22
-rebounds: when your rebounds increase, expect your efficiency rating to decrease by about .14

Unstandardized: in the original units of measurement
points: 1.119
assists: .883
rebounds: -.428
alpha (constant): 62.472
interpretation: impact only, there is no comparison
-points: when your points increase by 1, your efficiency rating increases by 1.12
-assists: when your assists increase by 1, your efficiency rating increases by .883
-rebounds: when your rebounds increase by 1, your efficiency rating decreases by .428

INTERPRET CORRELATION

check the IV-to-DV relationships:
points to rating: .769
assists to rating: -.393
rebounds to rating: -.093

check the IV-to-IV relationships:
points to assists: -.633 (suspect of multicollinearity because it is high)
points to rebounds: -.3 (no relationship)

ANALYSIS:
in terms of the model: passed
in terms of individual significance: points passed
Partial R: points has the biggest effect on the DV
find a nicer model: you may drop the IVs assists and rebounds so the regression improves; assists and points are suspected to have multicollinearity

why remove rebounds? besides being insignificant, its coefficient is also wrong, since it is negative

create another regression

assignment: jaybob data

give a refined regression model using 3 variables:
-odometer
-price of car
-age

submit on or before the next meeting
