2 Correlation & Regression
Correlation
linear pattern of relationship between one variable (x) and another variable (y)
an association between two variables
relative position of one variable correlates with relative distribution of another variable
graphical representation of the relationship between two variables
Warning:
No proof of causality
Cannot assume x causes y
Scatterplot!
No Correlation
Random or circular assortment of dots
Positive Correlation
ellipse leaning to right
GPA and SAT
Smoking and Lung Damage
Negative Correlation
ellipse leaning to left
Depression & Self-esteem
Studying & test errors
r = 0.0: No relationship
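A minimal sketch of how the sign of r tracks the direction of the ellipse. The data are hypothetical, made up only to echo the "studying & test errors" example, and `pearson_r` is an illustrative helper name, not something from the slides.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson r: sum of cross-products over the root of the product of sums of squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sp = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return sp / sqrt(ss_x * ss_y)

# Hypothetical data, for illustration only
hours_studied = [1, 2, 3, 4, 5, 6]
test_errors = [12, 11, 9, 8, 5, 4]      # falls as studying rises
test_score = [55, 60, 68, 70, 82, 85]   # rises as studying rises

r_negative = pearson_r(hours_studied, test_errors)
r_positive = pearson_r(hours_studied, test_score)
print(round(r_negative, 3), round(r_positive, 3))
```

The first value is negative (ellipse leaning left) and the second is positive (ellipse leaning right); neither says anything about which variable causes the other.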
Go to website!
playing with scatterplots
[Four scatterplots, each with a blank to fill in: r = .__ __]
Correlation Guestimation
Correlations
Pearson Correlation, with Sig. (2-tailed) in parentheses; N = 12 for every pair

                        Miles walked per day   Weight           Depression       Anxiety
Miles walked per day    1                      -.797** (.002)   -.800** (.002)   -.774** (.003)
Weight                  -.797** (.002)         1                .648* (.023)     .780** (.003)
Depression              -.800** (.002)         .648* (.023)     1                .753** (.005)
Anxiety                 -.774** (.003)         .780** (.003)    .753** (.005)    1

**. Correlation is significant at the 0.01 level (2-tailed).
*. Correlation is significant at the 0.05 level (2-tailed).
r: correlation for a sample
based on the limited observations we have
ρ: actual correlation in population
the true correlation
Case #2
Correlation between aiming and points, r = .628
Sample small (n=6), and r is only moderate in size
We guess ρ = 0 (we guess there is NO correlation in pop.)
Bottom-line
We can only guess about ρ
We can be wrong in two ways
Correlations (a)
Pearson Correlation (r), with Sig. (2-tailed) on the line below; N = 6 for every pair
Columns (1) to (7) follow the same order as the rows.

                                        (1)      (2)      (3)      (4)      (5)      (6)      (7)
(1) Total ball toss points      r       1        -.904*   -.582    .628     .821*    -.037    -.502
                                Sig.    .        .013     .226     .181     .045     .945     .310
(2) Distance from target        r       -.904*   1        .279     -.653    -.883*   .228     .522
                                Sig.    .013     .        .592     .159     .020     .664     .288
(3) Time spun before throwing   r       -.582    .279     1        -.390    -.248    -.087    .267
                                Sig.    .226     .592     .        .445     .635     .869     .609
(4) Aiming accuracy             r       .628     -.653    -.390    1        .758     -.546    -.250
                                Sig.    .181     .159     .445     .        .081     .262     .633
(5) Manual dexterity            r       .821*    -.883*   -.248    .758     1        -.553    -.101
                                Sig.    .045     .020     .635     .081     .        .255     .848
(6) College grade point avg     r       -.037    .228     -.087    -.546    -.553    1        -.524
                                Sig.    .945     .664     .869     .262     .255     .        .286
(7) Confidence for task         r       -.502    .522     .267     -.250    -.101    -.524    1
                                Sig.    .310     .288     .609     .633     .848     .286     .

*. Correlation is significant at the 0.05 level (2-tailed).
a. Day sample collected = Tuesday

Reading the table (total ball toss points and distance from target):
r = -.904
p = .013: probability of getting a correlation this size by sheer chance. Reject Ho if p < .05.
N = 6: sample size
r(4) = -.904, p < .05
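The step from r and sample size to "reject Ho" can be sketched with the usual textbook t test for a correlation, with df = n - 2. This is an illustration under that assumption, not necessarily how the slides or SPSS present it: SPSS prints an exact p, whereas the sketch compares t against the tabled two-tailed critical value for df = 4. `t_for_r` is an illustrative helper name.

```python
from math import sqrt

def t_for_r(r, n):
    """t statistic for testing Ho: rho = 0, with df = n - 2."""
    df = n - 2
    t = r * sqrt(df) / sqrt(1 - r ** 2)
    return t, df

# Two-tailed critical t for df = 4 at alpha = .05 (standard t-table value)
T_CRIT_DF4 = 2.776

t_distance, df = t_for_r(-0.904, 6)   # points vs. distance from target
t_aiming, _ = t_for_r(0.628, 6)       # points vs. aiming accuracy

print(df, round(t_distance, 2), abs(t_distance) > T_CRIT_DF4)
print(df, round(t_aiming, 2), abs(t_aiming) > T_CRIT_DF4)
```

With only six cases, r = -.904 clears the critical value but r = .628 does not, which agrees with the Sig. values of .013 and .181 in the output.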
Predictive Potential
Coefficient of Determination
r²
Amount of variance accounted for in y by x
Percentage increase in accuracy you gain by using the regression line to make predictions
Without correlation, you can only guess the mean of y
[Used with regression]
[Figure: scale from 0% to 100% in steps of 20%]
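As a quick worked example, squaring a few correlations that appear in these slides gives the proportion of variance accounted for. `variance_accounted_for` is an illustrative helper name, and the labels are paraphrases of the SPSS output.

```python
# Correlations taken from the SPSS outputs in these slides
correlations = {
    "points vs. distance from target (n = 6)": -0.904,
    "points vs. aiming accuracy (n = 6)": 0.628,
    "regression model summary (R)": 0.777,
}

def variance_accounted_for(r):
    """Coefficient of determination: the squared correlation."""
    return r ** 2

for label, r in correlations.items():
    print(f"{label}: r = {r:+.3f}, r^2 = {variance_accounted_for(r):.3f}")
```

The sign of r drops out when squaring, so a strong negative correlation accounts for as much variance as a positive one of the same size. Squaring the rounded R of .777 agrees with the reported R Square of .603 to within rounding.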
Limitations of Correlation
linearity:
can't describe non-linear relationships
e.g., relation between anxiety & performance
truncation of range:
underestimate strength of relationship if you can't see full range of x values
no proof of causation
third variable problem: could be a 3rd variable causing change in both variables
directionality: can't be sure which way causality flows
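The linearity limitation can be illustrated with a small sketch. The inverted-U data are hypothetical, constructed so that performance peaks at moderate anxiety, and `pearson_r` is an illustrative helper name.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson r computed from deviations around the means."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sp = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return sp / sqrt(ss_x * ss_y)

# Hypothetical inverted-U: performance is best at anxiety = 5
anxiety = list(range(11))                           # 0, 1, ..., 10
performance = [25 - (a - 5) ** 2 for a in anxiety]  # perfectly determined by anxiety

r = pearson_r(anxiety, performance)
print(round(r, 3))
```

Here performance is completely determined by anxiety, yet r is zero, because the rising and falling halves of the curve cancel out in a straight-line summary.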
Regression
Regression: Correlation + Prediction
predicting y based on x
e.g., predicting throwing points (y) based on distance from target (x)
Regression equation
formula that specifies a line
y' = bx + a
plug in an x value (distance from target) and predict y (points)
note: y = actual value of a score, y' = predicted value
Go to website!
Regression Playground
[Scatterplot: Total ball toss points (y) against Distance from target (x), with regression line and Rsq]
if x=18 then y'=47
if x=24 then y'=20
Regression Equation
y' = bx + a
y' = predicted value of y
b = slope of the line
x = value of x that you plug in
a = y-intercept (where line crosses y-axis)
See correlation & regression worksheet
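A minimal sketch of where b and a come from, assuming the ordinary least-squares formulas (slope = sum of cross-products over the sum of squares of x, intercept = mean of y minus slope times mean of x). The data are hypothetical, not the class data set, and `fit_line` is an illustrative helper name.

```python
def fit_line(xs, ys):
    """Least-squares slope b and intercept a for y' = bx + a."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sp = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    b = sp / ss_x              # slope
    a = mean_y - b * mean_x    # y-intercept
    return b, a

# Hypothetical data, for illustration only
distance = [5, 10, 15, 20, 25]
points = [100, 85, 60, 45, 10]

b, a = fit_line(distance, points)
print(f"y' = {b:.2f}x + {a:.2f}")
```

A line fitted this way always passes through the point (mean of x, mean of y), which is a handy check on hand calculations.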
In this case:
y' = b(x) + a
y' = -4.263(x) + 125.401
y' = -4.263(20) + 125.401
y' = 40.141

Model Summary
R       R Square   Adjusted R Square   Std. Error of the Estimate
.777    .603       .581                18.476

Coefficients (a)
                          Unstandardized Coefficients   Standardized Coefficients
Model                     B          Std. Error         Beta                        t
1  (Constant)             125.401    14.265                                         8.791
   Distance from target   -4.263     .815               -.777                       -5.230
a. Dependent Variable: Total ball toss points
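The worked prediction can be checked with a short sketch that uses the unstandardized coefficients from the output. The constant and function names are illustrative, and the t check assumes the usual definition of t as B divided by its standard error.

```python
# Unstandardized coefficients from the SPSS Coefficients table
B_SLOPE = -4.263        # B for distance from target
A_INTERCEPT = 125.401   # B for the constant
SE_SLOPE = 0.815        # Std. Error for distance from target
SE_INTERCEPT = 14.265   # Std. Error for the constant

def predict_points(distance):
    """Predicted total ball toss points, y' = bx + a."""
    return B_SLOPE * distance + A_INTERCEPT

y_pred = predict_points(20)
print(round(y_pred, 3))

# Each t in the table is B divided by its standard error (rounded inputs)
t_constant = A_INTERCEPT / SE_INTERCEPT
t_slope = B_SLOPE / SE_SLOPE
print(round(t_constant, 2), round(t_slope, 2))
```

The prediction for x = 20 reproduces the 40.141 on the slide. The t ratios agree with the table to about two decimals, and the small difference comes from working with rounded coefficients.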
Predictive Ability
Mantra!!
As variability decreases, prediction accuracy ___
if we can account for variance, we can make better predictions
As r increases:
r² increases
variance accounted for increases
the prediction accuracy increases
prediction error decreases (distance between y and y')
Sy' decreases
the standard error of the residual/predictor
measures overall amount of prediction error
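A minimal sketch of how prediction error and variance accounted for are linked. It assumes the usual definitions (standard error of the estimate = square root of the residual sum of squares over n - 2, and r² = 1 minus residual over total sum of squares) and uses hypothetical data, not the class data set.

```python
from math import sqrt

# Hypothetical data, for illustration only
distance = [5, 10, 15, 20, 25]
points = [100, 85, 60, 45, 10]
n = len(distance)

mean_x = sum(distance) / n
mean_y = sum(points) / n
sp = sum((x - mean_x) * (y - mean_y) for x, y in zip(distance, points))
ss_x = sum((x - mean_x) ** 2 for x in distance)
b = sp / ss_x              # slope
a = mean_y - b * mean_x    # intercept

predicted = [b * x + a for x in distance]                            # y'
ss_residual = sum((y - yp) ** 2 for y, yp in zip(points, predicted)) # error left over
ss_total = sum((y - mean_y) ** 2 for y in points)                    # error when guessing the mean

r_squared = 1 - ss_residual / ss_total
std_error = sqrt(ss_residual / (n - 2))
print(round(r_squared, 3), round(std_error, 3))
```

The smaller the residual sum of squares is relative to the total, the closer r² gets to 1 and the smaller the standard error of the estimate becomes.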
2. Plug in a large value for x (just so it falls on the right end of the graph), then plot the resulting point
3. Connect the two points with a straight line!