
STATISTICAL ANALYSIS FOR EXPERIMENTAL DESIGN AND DATA PROCESSING

Prof. Mustafa Sait YAZGAN


Assoc. Prof. Alpaslan EKDAL

Regression Analysis

Simple Linear Regression Analysis

Common Misconceptions About Correlation

Correlation and causality

• The conventional dictum that "correlation does not imply causation"
means that a correlation cannot, by itself, be validly used to infer a
causal relationship between the variables.
• This dictum should not be taken to mean that correlations cannot
indicate causal relations. However, the causes underlying the
correlation, if any, may be indirect and unknown.
• Consequently, establishing a correlation between two variables is
not a sufficient condition to establish a causal relationship (in
either direction).

• Here is a simple example: hot weather may cause both a reduction
in purchases of warm clothing and an increase in ice cream
purchases. Warm clothing purchases are therefore (negatively)
correlated with ice cream purchases. But a reduction in warm clothing
purchases does not cause ice cream purchases, and ice cream purchases
do not cause a reduction in warm clothing purchases.
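
The hot-weather example can be sketched as a small simulation. Everything in it is an assumption made for illustration only: the temperature range, the effect sizes, the noise levels and the variable names are hypothetical and do not come from real sales data. Temperature drives both purchase series, and the two series never influence each other.

```python
import random


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


rng = random.Random(0)  # fixed seed so the run is repeatable

# Hypothetical model: temperature is the common cause of both series.
temperature = [rng.uniform(0, 35) for _ in range(2000)]
ice_cream = [2.0 * t + rng.gauss(0, 5) for t in temperature]
warm_clothing = [100.0 - 2.0 * t + rng.gauss(0, 5) for t in temperature]

r_raw = pearson_r(ice_cream, warm_clothing)

# Subtract the (known, simulated) temperature effect from each series.
ice_resid = [y - 2.0 * t for y, t in zip(ice_cream, temperature)]
warm_resid = [y - (100.0 - 2.0 * t) for y, t in zip(warm_clothing, temperature)]
r_adjusted = pearson_r(ice_resid, warm_resid)

print(f"correlation of the raw series:           {r_raw:.2f}")
print(f"correlation after removing temperature:  {r_adjusted:.2f}")
```

In this sketch the raw series are strongly negatively correlated, yet the correlation all but disappears once the temperature effect is removed, because what remains is independent noise. With real data the common cause is usually not known in advance, which is exactly why a correlation alone cannot establish causation.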

• A correlation between age and height in children is fairly causally
transparent, but a correlation between mood and health in people
is less so. Does improved mood lead to improved health? Or does
good health lead to good mood? Or does some other factor
underlie both? Or is it pure coincidence? In other words, a
correlation can be taken as evidence for a possible causal
relationship, but cannot indicate what the causal relationship, if
there is any, might be.

Correlation and linearity

• Anscombe's quartet comprises four datasets that have nearly identical
simple statistical properties, yet appear very different when
graphed. Each dataset (Table 9.1) consists of eleven (x, y) points.
They were constructed in 1973 by the statistician F. J. Anscombe
to demonstrate both the importance of graphing data before
analyzing it and the effect of outliers on statistical properties.

Table 9.1 Anscombe's Quartet

  Dataset 1       Dataset 2       Dataset 3       Dataset 4
  X1     Y1       X2     Y2       X3     Y3       X4     Y4
  10    8.04      10    9.14      10    7.46       8    6.58
   8    6.95       8    8.14       8    6.77       8    5.76
  13    7.58      13    8.74      13   12.74       8    7.71
   9    8.81       9    8.77       9    7.11       8    8.84
  11    8.33      11    9.26      11    7.81       8    8.47
  14    9.96      14    8.10      14    8.84       8    7.04
   6    7.24       6    6.13       6    6.08       8    5.25
   4    4.26       4    3.10       4    5.39      19   12.50
  12   10.84      12    9.13      12    8.15       8    5.56
   7    4.82       7    7.26       7    6.42       8    7.91
   5    5.68       5    4.74       5    5.73       8    6.89

Table 9.2 Statistical Properties of Anscombe's Quartet

Property                                    Value
Mean of x in each case                      9
Variance of x in each case                  11
Mean of y in each case                      7.5
Variance of y in each case                  4.12
Correlation between x and y in each case    0.816
Linear regression equation in each case     y = 3 + 0.5x
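
The tabulated properties can be checked with a few lines of Python. This is a minimal sketch, not part of the original slides: it uses Anscombe's published two-decimal values, sample variances (divisor n - 1) and an ordinary least-squares line; the names `ANSCOMBE` and `summarize` are chosen here for illustration.

```python
# Anscombe's quartet: (x values, y values) for each of the four datasets.
ANSCOMBE = {
    "1": ([10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5],
          [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "2": ([10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5],
          [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    "3": ([10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5],
          [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "4": ([8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8],
          [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}


def summarize(xs, ys):
    """Return means, sample variances, Pearson r and the least-squares line."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return {
        "mean_x": mean_x,
        "var_x": sxx / (n - 1),
        "mean_y": mean_y,
        "var_y": syy / (n - 1),
        "r": sxy / (sxx * syy) ** 0.5,
        "slope": slope,
        "intercept": mean_y - slope * mean_x,
    }


summaries = {name: summarize(xs, ys) for name, (xs, ys) in ANSCOMBE.items()}

for name, s in summaries.items():
    print(f"dataset {name}: mean_x={s['mean_x']:.2f} var_x={s['var_x']:.2f} "
          f"mean_y={s['mean_y']:.2f} var_y={s['var_y']:.2f} r={s['r']:.3f} "
          f"y = {s['intercept']:.2f} + {s['slope']:.3f}x")
```

The four datasets agree to within rounding on every statistic, which is why the summary numbers alone cannot tell them apart.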

Figure 9.1 Four sets of data with the same correlation of 0.816

The images in Figure 9.1 show plots of Anscombe's quartet, with the linear
regression line obtained for each dataset.

• As can be seen from Table 9.2, the correlation coefficient for each set is
the same (0.816). However, the distributions of the variables are very
different.
• The first one (top left) appears to be normally distributed and corresponds
to what one would expect of two variables that are correlated and follow
the assumption of normality.
• The second one (top right) is not normally distributed; although an obvious
relationship between the two variables can be observed, it is not linear,
so the Pearson correlation coefficient, which measures only linear
association, is not an appropriate summary.

• In the third case (bottom left), the linear relationship is perfect, except
for one outlier, which exerts enough influence to lower the correlation
coefficient from 1 to 0.816.
• Finally, the fourth case (bottom right) shows that a single outlier is
enough to produce a high correlation coefficient, even though the
relationship between the two variables is not linear.
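
The influence of the single outlier in the third and fourth datasets can be illustrated by recomputing the correlation with that point left out. This is a sketch for illustration, using Anscombe's published two-decimal values; the helper `pearson_r` and the "trimmed" lists are names introduced here, and dropping a point is done only to show its influence, not as a recommended way to treat outliers.

```python
def pearson_r(points):
    """Pearson r for a list of (x, y) pairs; None if x or y has no variation."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxx = sum((x - mx) ** 2 for x, _ in points)
    syy = sum((y - my) ** 2 for _, y in points)
    sxy = sum((x - mx) * (y - my) for x, y in points)
    if sxx == 0 or syy == 0:
        return None  # correlation is undefined when a variable is constant
    return sxy / (sxx * syy) ** 0.5


set3 = list(zip(
    [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5],
    [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]))
set4 = list(zip(
    [8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8],
    [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]))

# Drop the single outlying point from each dataset.
set3_trimmed = [p for p in set3 if p != (13, 12.74)]
set4_trimmed = [p for p in set4 if p[0] != 19]

r3_all, r3_trimmed = pearson_r(set3), pearson_r(set3_trimmed)
r4_all, r4_trimmed = pearson_r(set4), pearson_r(set4_trimmed)

print(f"dataset 3: r = {r3_all:.3f} with the outlier, {r3_trimmed:.3f} without it")
print(f"dataset 4: r = {r4_all:.3f} with the outlier, {r4_trimmed} without it")
```

Without its outlier, the third dataset is almost perfectly linear. In the fourth dataset every remaining x value equals 8, so the correlation is not even defined: the whole reported coefficient rests on one point.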

These examples indicate that the correlation coefficient, as a summary
statistic, cannot replace the individual examination of the data.

They also show that the correlation coefficient or regression equation
should be calculated only after visual inspection has convinced the analyst
that there is an actual linear relationship between the x and y variables.
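
A listing of residuals can support the visual check. The sketch below is an illustration rather than a substitute for plotting: it fits a least-squares line to the second dataset (Anscombe's published values) and prints the residuals in order of x; all variable names are chosen here.

```python
# Second dataset of Anscombe's quartet (curved relationship).
xs = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
ys = [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]

n = len(xs)
mean_x, mean_y = sum(xs) / n, sum(ys) / n

# Ordinary least-squares slope and intercept.
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
sxx = sum((x - mean_x) ** 2 for x in xs)
slope = sxy / sxx
intercept = mean_y - slope * mean_x

# Residual (observed minus fitted) for each point, ordered by x.
residuals = {x: y - (intercept + slope * x) for x, y in sorted(zip(xs, ys))}

for x, res in residuals.items():
    print(f"x = {x:2d}   residual = {res:+.2f}")
```

The residuals are negative at both ends of the x range and positive in the middle. A straight line that fitted well would leave residuals scattered around zero with no such pattern, so this systematic arc signals that the relationship is curved.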
