Independent Variable Categorical

Continuous

Continuous

This Week

Categorical Categorical or Continuous

Categorical Categorical or Continuous

Next Week

**Association Between Two Variables**

• No association
• Linear association
  – Positive association
  – Negative association
• Curvilinear association

[Figure: four scatterplot panels, each on axes from 0 to 1.]

**Strength of Linear Association**

[Figure: three slides, each showing a pair of scatterplots on axes from 0 to 1.]

**Quantifying the Strength of Linear Correlation**

• What does a positive linear correlation mean?
  – Large numbers on one variable go with large numbers on the other variable.
• How do we decide what counts as a large or small number?
  – Relative to the means.

| Student # | 1 | 2 | 3 | 4 | 5 | µ | σ |
|---|---|---|---|---|---|---|---|
| SAT (X) | 450 | 520 | 600 | 470 | 460 | 500 | 55.5 |
| GPA (Y) | 2.7 | 3.1 | 3.5 | 2.6 | 3.1 | 3.0 | 0.32 |

[Figure: scatterplot of GPA (Y, 2.4 to 3.6) against SAT (X, 350 to 650).]

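| Student # | SAT (X) | GPA (Y) | X – µX | Y – µY | (X – µX)(Y – µY) |
|---|---|---|---|---|---|
| 1 | 450 | 2.7 | -50 | -0.3 | 15 |
| 2 | 520 | 3.1 | 20 | 0.1 | 2 |
| 3 | 600 | 3.5 | 100 | 0.5 | 50 |
| 4 | 470 | 2.6 | -30 | -0.4 | 12 |
| 5 | 460 | 3.1 | -40 | 0.1 | -4 |
| Sum | 2500 | 15.0 | 0 | 0 | 75 (Cross Product) |
| µ | 500 | 3.0 | 0 | 0 | 15 (Covariance) |
| σ | 55.5 | 0.32 | | | |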
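The covariance column can be reproduced with a few lines of Python. This is a minimal sketch using the five SAT/GPA pairs from the table; the variable names are illustrative, and the covariance is the population version (divide by N), which is what the table reports.

```python
# SAT and GPA values for the five students in the table.
sat = [450, 520, 600, 470, 460]
gpa = [2.7, 3.1, 3.5, 2.6, 3.1]

n = len(sat)
mean_x = sum(sat) / n
mean_y = sum(gpa) / n

# Cross product of each student's deviations from the two means.
cross_products = [(x - mean_x) * (y - mean_y) for x, y in zip(sat, gpa)]

# Population covariance: the mean cross product (divide by N, not N - 1).
covariance = sum(cross_products) / n

print([round(cp, 1) for cp in cross_products])  # [15.0, 2.0, 50.0, 12.0, -4.0]
print(round(covariance, 1))                     # 15.0
```

A positive cross product means the student is on the same side of both means; the single negative value (student 5) is a below-average SAT paired with an above-average GPA.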

**Quantifying the Strength of Linear Correlation**

• Is 15 a large or small number?
• At least we know it is positive.
• Judge its magnitude relative to the variance (or standard deviation) of X and Y.

$$r = \frac{\text{Covariance}}{\sigma_X \cdot \sigma_Y} = \frac{\sigma_{XY}}{\sigma_X \cdot \sigma_Y}$$

• r = 15 / (55.5 × 0.32) = 0.84
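As a rough check on the arithmetic, the same ratio can be computed directly from the raw scores. The function name `pearson_r` is an illustrative choice, and the sketch assumes population standard deviations (divide by N), matching the σ values on the slide.

```python
import math

sat = [450, 520, 600, 470, 460]
gpa = [2.7, 3.1, 3.5, 2.6, 3.1]


def pearson_r(xs, ys):
    """Population covariance divided by the product of the population SDs."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs) / n)
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys) / n)
    return cov / (sd_x * sd_y)


r = pearson_r(sat, gpa)
print(round(r, 2))  # 0.84
```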

Alternative Approach

• Standardize X and Y first (z-scores), then calculate the covariance between the z-scores.

$$r = \frac{\sum z_X \cdot z_Y}{N}$$

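| Student # | SAT (X) | GPA (Y) | zX | zY | zX·zY |
|---|---|---|---|---|---|
| 1 | 450 | 2.7 | -0.90 | -0.93 | 0.84 |
| 2 | 520 | 3.1 | 0.36 | 0.31 | 0.11 |
| 3 | 600 | 3.5 | 1.80 | 1.55 | 2.79 |
| 4 | 470 | 2.6 | -0.54 | -1.24 | 0.67 |
| 5 | 460 | 3.1 | -0.72 | 0.31 | -0.22 |
| Sum | 2500 | 15.0 | 0 | 0 | 4.19 |
| µ | 500 | 3.0 | 0 | 0 | 0.84 (r) |
| σ | 55.5 | 0.32 | | | |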
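The z-score route can be sketched the same way. The helper `z_scores` is a hypothetical name, and the sketch assumes the population standard deviation (`statistics.pstdev`), as in the table. Because it does not round the z-scores before multiplying, intermediate values may differ from the table in the last digit, but r still comes out at about .84.

```python
import statistics

sat = [450, 520, 600, 470, 460]
gpa = [2.7, 3.1, 3.5, 2.6, 3.1]


def z_scores(values):
    """Standardize values using the population mean and SD."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return [(v - mean) / sd for v in values]


z_x = z_scores(sat)
z_y = z_scores(gpa)

# r is the mean of the products of paired z-scores.
r = sum(zx * zy for zx, zy in zip(z_x, z_y)) / len(z_x)

print([round(z, 2) for z in z_x])
print([round(z, 2) for z in z_y])
print(round(r, 2))  # 0.84
```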

**Interpreting the Magnitude of Correlations**

• Always between -1 and +1
• Proportion of variance explained by the other variable: r²
• r = .84, r² = .71 = 71%
• A correlation of .8 is NOT two times stronger than a correlation of .4.
  – How much stronger?
  – 4 times. (.8)² = .64; (.4)² = .16

Significance Testing

• The following has a t distribution:

$$t = \frac{r\sqrt{N-2}}{\sqrt{1-r^2}}$$

df = N – 2

r = .84, t = 2.68, df = 3, p = .075. Not significant at the .05 level. Small sample size.
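The numbers on this slide can be checked with a short sketch. The function names are illustrative. The p-value helper is an assumption of this example rather than part of the slides: it uses the closed-form CDF of the t distribution that holds only for exactly 3 degrees of freedom, so it should not be reused for other sample sizes.

```python
import math


def t_from_r(r, n):
    """t statistic for testing a correlation r from n pairs (df = n - 2)."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)


def two_tailed_p_df3(t):
    """Two-tailed p-value for a t statistic with exactly 3 degrees of freedom."""
    u = abs(t) / math.sqrt(3)
    cdf = 0.5 + (math.atan(u) + u / (1 + u * u)) / math.pi
    return 2 * (1 - cdf)


t = t_from_r(0.84, 5)
p = two_tailed_p_df3(t)

print(round(t, 2))  # 2.68
print(round(p, 3))  # 0.075
```

With only five students, even r = .84 falls short of the two-tailed .05 critical value of 3.182 for df = 3.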

**When There’s a Significant Correlation**

• Correlation and Causation
• X causes Y
• Y causes X
• Z causes both X and Y

**When There’s No Significant Correlation**

• Small sample
• Other Noise
• Attenuation due to unreliability of measurement
• Outliers
• Restriction in range
• Curvilinearity

**From Correlation to Regression**

• Correlation: to describe the relationship between two variables
• Regression: to use one variable to predict another variable
• The accuracy of prediction depends on the strength of correlation

**Strength of Linear Association**

[Figure: a pair of scatterplots on axes from 0 to 1.]

An Example

• Research Question: Does eating spinach increase strength?
• Randomly sampled 20 individuals.
• IV: How many cans of spinach one consumed in the past week.
• DV: How many push-ups one can do in a minute.

[Figure: scatterplot of Pushup (0 to 70) against Spinach (0 to 25), r = .86.]

Coefficients(a)

| Model 1 | Unstandardized B | Std. Error | Standardized Beta | t | Sig. |
|---|---|---|---|---|---|
| (Constant) | 19.443 | 3.494 | | 5.565 | .000 |
| spinach | 1.550 | .220 | .856 | 7.031 | .000 |

a. Dependent Variable: pushup

$$\hat{Y} = 19.44 + 1.55X$$

$$\hat{z}_Y = (.856)\, z_X$$
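To see the regression equation in use, the following sketch plugs the unstandardized coefficients from the SPSS table (constant 19.443, slope 1.550) and the standardized Beta (.856) into two small functions. The function names and the example inputs are illustrative, not part of the original output.

```python
# Coefficients copied from the SPSS table.
INTERCEPT = 19.443  # (Constant)
SLOPE = 1.550       # unstandardized B for spinach
BETA = 0.856        # standardized coefficient


def predict_pushups(cans_of_spinach):
    """Predicted push-ups per minute from cans of spinach eaten last week."""
    return INTERCEPT + SLOPE * cans_of_spinach


def predict_z_pushups(z_spinach):
    """Predicted z-score on push-ups from a z-score on spinach."""
    return BETA * z_spinach


print(round(predict_pushups(10), 2))   # 34.94
print(round(predict_z_pushups(1), 3))  # 0.856
```

Each additional can adds 1.55 predicted push-ups, while someone one standard deviation above the mean on spinach is predicted to be .856 standard deviations above the mean on push-ups.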

**Understanding R²: Proportion of Variance Explained, or Proportion Reduction in Error**

[Figure: scatterplot of Pushup (0 to 70) against Spinach (0 to 25), repeated across several slides with the captions below.]

When you don’t know X, you can only use the mean of Y to predict the Y score of any individual.

**Errors (or variance) are relatively high when you use the mean of Y as your prediction.**

**When you know X, and use X to predict Y, the errors become smaller.**

$$R^2 = \frac{\sum (\hat{Y} - \bar{Y})^2}{\sum (Y - \bar{Y})^2}$$

Numerator: Green. Denominator: Green and Red.

[Figure: diagram labeled "# of push-ups" and "Spinach consumption".]
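The idea of proportion reduction in error can be sketched numerically. The raw spinach data are not given in these slides, so this example reuses the five SAT/GPA pairs from earlier. It fits a least-squares line and compares the squared errors around the mean of Y with the squared errors around the line. All names are illustrative.

```python
sat = [450, 520, 600, 470, 460]
gpa = [2.7, 3.1, 3.5, 2.6, 3.1]

n = len(sat)
mean_x = sum(sat) / n
mean_y = sum(gpa) / n

# Least-squares slope and intercept for predicting Y from X.
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(sat, gpa))
sxx = sum((x - mean_x) ** 2 for x in sat)
slope = sxy / sxx
intercept = mean_y - slope * mean_x
predicted = [intercept + slope * x for x in sat]

# Error when predicting everyone with the mean of Y.
ss_total = sum((y - mean_y) ** 2 for y in gpa)
# Error left over after predicting from X.
ss_error = sum((y - p) ** 2 for y, p in zip(gpa, predicted))
# Variation captured by the regression line.
ss_explained = sum((p - mean_y) ** 2 for p in predicted)

r_squared = ss_explained / ss_total

print(round(ss_total, 3), round(ss_error, 3))
print(round(r_squared, 2))  # 0.7
```

The same value falls out of 1 - ss_error / ss_total, which is the "proportion reduction in error" reading. It is slightly below the .71 quoted earlier because that figure squares the already rounded r = .84.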

**Association Between Two Categorical Variables**

• Angelina Jolie or Jennifer Aniston?

**Test for Independence**

• Null Hypothesis: There is no relationship between JA/AJ preference and which side of the classroom you are sitting on.
• To rephrase: JA/AJ preference does not depend on which side of the classroom you are sitting on.
• Another version: People sitting on the right and people sitting on the left do not have different JA/AJ preferences.

| | JA | AJ | Total |
|---|---|---|---|
| Left | | | |
| Right | | | |
| Total | | | |

Each cell holds an Observed and an Expected count.

Expected Frequency

• Expected assuming the null hypothesis is true, i.e., no association between the two variables.

$$\text{Expected} = \frac{C \cdot R}{N}$$

• C: Column total, R: Row total, N: Grand total
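A minimal sketch of the expected-frequency rule follows. The observed counts are hypothetical, made up purely for illustration (the class data are not recorded in these slides): rows are Left and Right, columns are JA and AJ.

```python
def expected_frequencies(observed):
    """Expected count per cell: column total * row total / grand total."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


# Hypothetical counts: rows = Left, Right; columns = JA, AJ.
observed = [[12, 8],
            [6, 14]]

expected = expected_frequencies(observed)
print(expected)  # [[9.0, 11.0], [9.0, 11.0]]
```

Because both rows have the same total in this made-up table, the expected counts are identical across rows, which is exactly what "no association" looks like.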

Chi-Square

$$\chi^2 = \sum \frac{(\text{Observed} - \text{Expected})^2}{\text{Expected}}$$

• Degrees of freedom: df = (# of Columns – 1)(# of Rows – 1)
• What is the df for a 2 x 2 table?
• The shape of the chi-square distribution depends on the degrees of freedom.
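The statistic and its degrees of freedom can be sketched as below. The 2 x 2 counts are hypothetical (rows Left/Right, columns JA/AJ), and the function name is an illustrative choice.

```python
def chi_square_independence(observed):
    """Return (chi-square statistic, df) for a table of observed counts."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(observed):
        for j, obs in enumerate(row):
            exp = row_totals[i] * col_totals[j] / n  # expected under the null
            chi2 += (obs - exp) ** 2 / exp
    df = (len(col_totals) - 1) * (len(row_totals) - 1)
    return chi2, df


# Hypothetical counts: rows = Left, Right; columns = JA, AJ.
chi2, df = chi_square_independence([[12, 8], [6, 14]])
print(round(chi2, 2), df)  # 3.64 1
```

For these made-up counts the statistic is just under the df = 1 cutoff of 3.84 at alpha = .05, so the null hypothesis of independence would not be rejected.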

[Figure: chi-square distribution with the critical region marked.]

Chi-Square

• The chi-square statistic is always positive. Why?
• When df = 1, the chi-square distribution is the distribution of z².
• Without looking it up in a reference, what is the alpha = .05 cutoff value for the chi-square distribution (df = 1)?
  – (1.96)² = 3.84
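The cutoff can be confirmed by squaring the two-tailed normal critical value. This short sketch uses `statistics.NormalDist` from the Python standard library (Python 3.8 or later); the variable names are illustrative.

```python
from statistics import NormalDist

alpha = 0.05

# Two-tailed z cutoff: alpha / 2 in each tail of the standard normal.
z_cutoff = NormalDist().inv_cdf(1 - alpha / 2)

# With df = 1, chi-square is z squared, so the cutoff is the squared z cutoff.
chi2_cutoff = z_cutoff ** 2

print(round(z_cutoff, 2), round(chi2_cutoff, 2))  # 1.96 3.84
```

Both tails of z map onto the single upper tail of chi-square, which is why the two-tailed z value is the one to square.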

**Back to Angelina and Jennifer**

• In SPSS.

If We Still Have Time…

**Chi-Square Test for Goodness of Fit**

To test whether a distribution is the same as a predetermined or theoretical distribution.
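A goodness-of-fit test can be sketched with the same chi-square formula, comparing observed counts with the counts a theoretical distribution predicts. The example below is hypothetical: 60 made-up rolls of a die tested against a uniform distribution. It assumes the usual df of (number of categories – 1), which is not stated on the slide.

```python
def chi_square_goodness_of_fit(observed, expected_proportions):
    """Return (chi-square statistic, df) against a theoretical distribution."""
    n = sum(observed)
    expected = [p * n for p in expected_proportions]
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    df = len(observed) - 1
    return chi2, df


# Hypothetical data: counts of faces 1 to 6 in 60 rolls of a die.
observed = [8, 12, 9, 11, 6, 14]
fair_die = [1 / 6] * 6

chi2, df = chi_square_goodness_of_fit(observed, fair_die)
print(round(chi2, 2), df)  # 4.2 5
```

With df = 5 the alpha = .05 cutoff is 11.07, so these made-up rolls would be consistent with a fair die.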

Next Week

• Integrating t-test, correlation, regression, and chi-square test for independence
• They are all special cases of the general linear model
• Effect size and power for the above tests
