Professional Documents
Culture Documents
Korelasi Dan Regresi
Korelasi Dan Regresi
Regression
dr. Zulvikar Syambani Ulhaq, M.Biomed, Ph.D
FKIK UIN MALANG
Correlation and Regression
The test you choose depends on level of measurement:
Independent Dependent Statistical Test
-
Family Income Number of Children
Correlation and Regression
n For example:
¨ Null Hypothesis: There is no relationship between
education and self-esteem.
¨ Null Hypothesis: There is no relationship between
family income and the number of children in families.
¨ Ho: b = 0 “b” is a symbol for a statistic
¨ Ha: b ≠ 0 that describes the relationship
Independent Variables Dependent Variables
Education Self-Esteem
Case: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25
Children (Y): 2 5 1 9 6 3 1 0 3 7 7 2 4 2 1 0 1 2 4 3 0 1 2 5 7
Income 1=1jt (X): 3 4 9 5 4 12 14 10 1 4 3 11 4 9 13 10 7 5 2 5 15 11 8 3 2
Correlation and Regression
Y Plotted coordinates for income and children
10
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 X
Case: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25
Children (Y): 2 5 1 9 6 3 1 0 3 7 7 2 4 2 1 0 1 2 4 3 0 1 2 5 7
Income 1=1jt (X): 3 4 9 5 4 12 14 10 1 4 3 11 4 9 13 10 7 5 2 5 15 11 8 3 2
Correlation and Regression
Y Plotted coordinates for income and children
10
9
8
Is it positive?
7
6
5
4
3
2
1
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 X
Case: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25
Children (Y): 2 5 1 9 6 3 1 0 3 7 7 2 4 2 1 0 1 2 4 3 0 1 2 5 7
Income 1=1jt (X): 3 4 9 5 4 12 14 10 1 4 3 11 4 9 13 10 7 5 2 5 15 11 8 3 2
Correlation and Regression
Y Plotted coordinates for income and children
10
9
8
Is it negative?
7
6
5
4
3
2
1
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 X
Case: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25
Children (Y): 2 5 1 9 6 3 1 0 3 7 7 2 4 2 1 0 1 2 4 3 0 1 2 5 7
Income 1=1jt (X): 3 4 9 5 4 12 14 10 1 4 3 11 4 9 13 10 7 5 2 5 15 11 8 3 2
Correlation and Regression
Y Plotted coordinates for income and children
10
9
8
Is there no
7
6
relationship?
5
4
3
2
1
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 X
Case: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25
Children (Y): 2 5 1 9 6 3 1 0 3 7 7 2 4 2 1 0 1 2 4 3 0 1 2 5 7
Income 1=1jt (X): 3 4 9 5 4 12 14 10 1 4 3 11 4 9 13 10 7 5 2 5 15 11 8 3 2
Correlation and Regression
What is the slope of a “fitted line?”
10
Slope = 0, No relationship!
4
Y
3
2
(# Children)
1
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
X
(Income)
Correlation and Regression
Y What is the slope of a “fitted line?”
10
9
8
7
6
5
0.5
4
(# Children)
1
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
X
(Income)
The slope is the change in Y along the line
as you go up one on X while following the
line (rise over run).
Correlation and Regression
Y What is the slope of a “fitted line?”
10
9
0.5
5
4
1
3
2
1
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
ANOVAb
Sum of Mean
Model Squares df Square F Sig.
1 Regression 71.194 1 71.194 19.623 .000a
Residual 83.446 23 3.628
Total 154.640 24
Y = b + mX a. Predictors: (Constant), INCOME
Coefficients a
b. Dependent Variable: CHILD
Standardi
zed
Unstandardized Coefficien
Coefficients ts
Model B Std. Error Beta t Sig.
1 (Constant) 6.003 .754 7.960 .000
INCOME -.414 .094 -.679 -4.430 .000
a. Dependent Variable: CHILD
Correlation and Regression
Y Plotted coordinates for income and children
10
Y = 6 - .4X
6
5
4
3
2
1
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 X
Case: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25
Children (Y): 2 5 1 9 6 3 1 0 3 7 7 2 4 2 1 0 1 2 4 3 0 1 2 5 7
Income 1=1jt (X): 3 4 9 5 4 12 14 10 1 4 3 11 4 9 13 10 7 5 2 5 15 11 8 3 2
Correlation and Regression
r2 = E1 - E2 r2 is called the “coefficient of
determination”…
E1 It is also the square of the
Pearson correlation
TSS
∧ Error 2
= Σ (Y – Y)2 - Σ (Y – Y)2 Y
Σ (Y – Y)2 X
Correlation and Regression
n R2
¨ Is the improvement obtained by using X (and drawing a line
through the conditional means) in getting as near as possible to
everybody’s value for Y over just using the mean for Y alone.
¨ Falls between 0 and 1
n 1 means an exact fit (and there is no variation of scores around the
regression line)
n 0 means no relationship (and as much scatter around the line as in
the original Y variable and a flat regression line (slope = 0) through
the mean of Y)
¨ Would be the same for X regressed on Y as for Y regressed on
X
¨ Can be interpreted as the percentage of variability in Y that is
explained by X.
Correlation and Regression
Back to the SPSS output:
Model Summary
r2
Std. Error
Adjusted R of the ∧
Model R R Square Square Estimate Σ (Y – Y)2 - Σ (Y – Y)2
1 .679a .460 .437 1.9048 Σ (Y – Y)2
a. Predictors: (Constant), INCOME
ANOVAb
Sum of Mean
Model Squares df Square F Sig.
1 Regression 71.194 1 71.194 19.623 .000a
Residual 83.446 23 3.628
Total 154.640 24
a. Predictors: (Constant), INCOME
b.
CoefficientsaDependent Variable: CHILD
Standardi
zed
71.194 ÷ 154.64 = .460
Unstandardized Coefficien
Coefficients ts
Model B Std. Error Beta t Sig.
1 (Constant) 6.003 .754 7.960 .000
INCOME -.414 .094 -.679 -4.430 .000
a. Dependent Variable: CHILD
Correlation and Regression
Back to the SPSS output:
Model Summary
Std. Error
Adjusted R of the
Model R R Square Square Estimate
1 .679a .460 .437 1.9048
a. Predictors: (Constant), INCOME
ANOVAb
Sum of Mean
Model Squares df Square F Sig.
1 Regression 71.194 1 71.194 19.623 .000a
Residual 83.446 23 3.628
Total 154.640 24
a. Predictors: (Constant), INCOME
b.
CoefficientsaDependent Variable: CHILD
Standardi
zed Of course, you get the
Unstandardized Coefficien
Coefficients ts
standard error and
Model B Std. Error Beta t Sig.
1 (Constant) 6.003 .754 7.960 .000
t (z) on your output,
INCOME -.414 .094 -.679 -4.430 .000
…and the p-value too!
a. Dependent Variable: CHILD
Correlation and Regression
Plotted coordinates for income and children
Y
Y = 6 - .4X
10
So in our example,
9
the slope is
8
significant, there is a
7
relationship in the
6
of the variation in
4
number of children is
3
explained by income.
2
1
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 X
Case: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25
Children (Y): 2 5 1 9 6 3 1 0 3 7 7 2 4 2 1 0 1 2 4 3 0 1 2 5 7
Income 1=1jt (X): 3 4 9 5 4 12 14 10 1 4 3 11 4 9 13 10 7 5 2 5 15 11 8 3 2
Correlation and Regression
Y Plotted coordinates for income and children
10
Y = 6 - .4X
7
6
direction of
4
association… How
3
strong is that?
2
1
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 X
Case: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25
Children (Y): 2 5 1 9 6 3 1 0 3 7 7 2 4 2 1 0 1 2 4 3 0 1 2 5 7
Income 1=1jt (X): 3 4 9 5 4 12 14 10 1 4 3 11 4 9 13 10 7 5 2 5 15 11 8 3 2
Correlation and Regression
n To find the strength of the relationship
between two variables, we need
correlation.
n The correlation is the standardized
slope… it refers to the standard deviation
change in Y when you go up a standard
deviation in X.
Correlation and Regression
10
9
8
7
6
5
4
3
2
1
Std. Error
Adjusted R of the
Model R R Square Square Estimate
1 .679a .460 .437 1.9048
a. Predictors: (Constant), INCOME
ANOVAb
Sum of Mean
Model Squares df Square F Sig.
1 Regression 71.194 1 71.194 19.623 .000a
Residual 83.446 23 3.628
Total 154.640 24
a. Predictors: (Constant), INCOME
b.
CoefficientsaDependent Variable: CHILD
Standardi
zed
Unstandardized Coefficien Correlation, r,
Coefficients ts is significant.
Model B Std. Error Beta t Sig.
1 (Constant) 6.003 .754 7.960 .000
INCOME -.414 .094 -.679 -4.430 .000
a. Dependent Variable: CHILD
Correlation and Regression
If you were to use the “correlate, bivariate” command, you’d get this
ouput…
Correlation, r, is significant.