
HS16101: Principles of Management

PoM-4.5: The Role of Statistics for Industrial Management - Correlation for Data Analysis

Faculty
Dr. MARXIA OLI. SIGO
Department of HSS
NIT SIKKIM
March 2022
CORRELATION

• It is a measure of association.
• It measures the association between two continuous variables.
• It assumes that the association is linear.
• Linear association between two variables means that one variable increases or decreases by a fixed amount for a unit increase or decrease in the other.
CORRELATION COEFFICIENT

• It measures the degree of association.


• It measures linear association.
• It is sometimes called Pearson’s correlation
coefficient.
STRENGTH OF ASSOCIATION

• The correlation coefficient is measured on a
scale that varies from +1 through 0 to -1.
• Complete correlation between two variables is
expressed by either +1 or -1.
• When one variable increases as the other
increases the correlation is positive.
• When one decreases as the other increases it
is negative.
• The complete absence of
correlation is represented by 0.
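
The sign and the two ends of the scale can be checked on small made-up series. The sketch below is illustrative only: the helper name `pearson_r` and the three series are assumptions and are not data from these slides.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson's correlation coefficient of two equally long series."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations from the two means (numerator).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations (under the square root in the denominator).
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Hypothetical series, chosen only to illustrate the scale.
x = [-2, -1, 0, 1, 2]
rising = [2 * v + 1 for v in x]   # increases as x increases
falling = [-3 * v for v in x]     # decreases as x increases
curved = [v * v for v in x]       # an exact relationship, but not a linear one

print(round(pearson_r(x, rising), 6))   # complete positive correlation
print(round(pearson_r(x, falling), 6))  # complete negative correlation
print(round(pearson_r(x, curved), 6))   # no linear association
```

The third series is a reminder that a coefficient of 0 rules out a linear association only; the two variables may still be related in some other way.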
POSITIVE RELATIONSHIP
NEGATIVE RELATIONSHIP
[Figure: scatter diagram of reliability against age of car]
NO RELATION
SCATTER DIAGRAMS

• When an investigator has collected two series of observations
and wishes to see whether there is a relationship between
them, he should first construct a scatter diagram.
• The vertical scale represents one set of measurements and
the horizontal scale the other.
• Usually, we put the independent variable on the
horizontal axis and the dependent variable on the vertical
axis.
• Sometimes it is not easy to know which variable is dependent
and which is independent.
• In such cases common-sense reasoning helps: it is logical to say that
the height of a person depends on his age, and not the
converse.
CALCULATION –
Correlation Coefficient

• A pediatric registrar has measured the
pulmonary anatomical dead space (in ml) and
height (in cm) of 15 children.
• The data are given in the following table.
• The first step is to inspect the scatter diagram to
see if the area covered by the dots centers on a
straight line or whether a curved line is needed.
• The next step is to calculate the correlation
coefficient.
CHILD NUMBER   HEIGHT (cm) = X   DEAD SPACE (ml) = Y   X × Y
1              110               44                    4840
2              116               31                    3596
3              124               43                    5332
4              129               45                    5805
5              131               56                    7336
6              138               79                    10902
7              142               57                    8094
8              150               56                    8400
9              153               58                    8874
10             155               92                    14260
11             156               78                    12168
12             159               64                    10176
13             164               88                    14432
14             168               112                   18816
15             174               101                   17574
TOTAL          2169              1004                  150605
MEAN           144.6             66.93333333
SD             19.36786735       23.64761138
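
The totals, means and standard deviations in the table can be reproduced with a few lines of Python. This is a minimal sketch using only the standard library; the variable names are assumptions, and `statistics.stdev` is used because it divides by n - 1, which matches the SD values shown.

```python
from statistics import mean, stdev

# Height (cm) and anatomical dead space (ml) of the 15 children, from the table.
heights = [110, 116, 124, 129, 131, 138, 142, 150, 153, 155, 156, 159, 164, 168, 174]
dead_space = [44, 31, 43, 45, 56, 79, 57, 56, 58, 92, 78, 64, 88, 112, 101]

# The X × Y column: one product per child.
products = [x * y for x, y in zip(heights, dead_space)]

print("Total X, Y, XY:", sum(heights), sum(dead_space), sum(products))
print("Mean  X, Y    :", round(mean(heights), 2), round(mean(dead_space), 2))
# Sample standard deviation (divides by n - 1).
print("SD    X, Y    :", round(stdev(heights), 2), round(stdev(dead_space), 2))
```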
[Figure: scatter graph of height and anatomical dead space for the 15 children; horizontal axis: height (0 to 200), vertical axis: dead space (0 to 120)]
THE FORMULA TO BE USED
With x representing the value of an independent variable (in this
case the height) and y representing the dependent variable (in
this case the anatomical dead space):
$$r = \frac{\sum (x - \bar{x})(y - \bar{y})}{\sqrt{\sum (x - \bar{x})^2 \times \sum (y - \bar{y})^2}}$$
Which can be shown as equal to:
$$r = \frac{\sum xy - n\,\bar{x}\,\bar{y}}{(n - 1)\, S_x S_y}$$
Where: x = height in cm
y = anatomical dead space in ml
$\bar{x}$ = mean of height, $\bar{y}$ = mean of anatomical dead space
$S_x$ = standard deviation for height, $S_y$ = standard deviation for anatomical dead space
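
That the two forms of the formula give the same value can be checked numerically on the data of the 15 children. The sketch below is one possible way to write it; the function names `r_from_deviations` and `r_from_totals` are assumptions made for this illustration.

```python
from math import sqrt
from statistics import mean, stdev

heights = [110, 116, 124, 129, 131, 138, 142, 150, 153, 155, 156, 159, 164, 168, 174]
dead_space = [44, 31, 43, 45, 56, 79, 57, 56, 58, 92, 78, 64, 88, 112, 101]

def r_from_deviations(xs, ys):
    """First form: sums of deviations from the means."""
    mx, my = mean(xs), mean(ys)
    numerator = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    denominator = sqrt(sum((x - mx) ** 2 for x in xs) * sum((y - my) ** 2 for y in ys))
    return numerator / denominator

def r_from_totals(xs, ys):
    """Second form: sum of products, means and sample standard deviations."""
    n = len(xs)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    return (sum_xy - n * mean(xs) * mean(ys)) / ((n - 1) * stdev(xs) * stdev(ys))

print(round(r_from_deviations(heights, dead_space), 3))
print(round(r_from_totals(heights, dead_space), 3))
```

The two printed values agree up to floating-point rounding. The second form is convenient for hand calculation because it needs only the column totals, the means and the standard deviations.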
CALCULATION

$$r = \frac{150605 - 15 \times 144.6 \times 66.93}{14 \times 19.37 \times 23.65}$$

$$r = \frac{150605 - 145171.17}{6412.06} = \frac{5433.83}{6412.06} = 0.847$$

$$R^2 = 0.847^2 = 0.717$$
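
The same arithmetic can be repeated from the summary figures alone. The sketch below is only a check, with variable names chosen for this illustration; it compares the rounded inputs used in the hand calculation with the unrounded means and standard deviations from the table. The third decimal place of r and of R² depends on how the intermediate values are rounded.

```python
n = 15
sum_xy = 150605                       # total of the X × Y column
sum_x, sum_y = 2169, 1004             # totals of X and Y
s_x, s_y = 19.36786735, 23.64761138   # standard deviations from the table

# Hand calculation: mean of Y rounded to 66.93, SDs rounded to two decimals.
r_rounded = (sum_xy - n * 144.6 * 66.93) / ((n - 1) * 19.37 * 23.65)

# Same formula with the unrounded means and standard deviations.
r_full = (sum_xy - n * (sum_x / n) * (sum_y / n)) / ((n - 1) * s_x * s_y)

print("rounded inputs  : r =", round(r_rounded, 4), " R^2 =", round(r_rounded ** 2, 4))
print("unrounded inputs: r =", round(r_full, 4), " R^2 =", round(r_full ** 2, 4))
```

Either way r is about 0.85 and R² about 0.72, so the interpretation does not change.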
COMMENTS ON THE RESULTS
• The correlation coefficient of 0.847 indicates a strong positive
correlation between the size of the pulmonary
anatomical dead space and the height of the child.
• But in the interpretation of correlation, it is important
to remember that correlation is not causation.
• A part of the variation in one of the variables (as measured
by its variance) can be thought of as being due to the
relationship with the other variable, and another part as
due to undetermined, often random, causes.
• The part due to the dependence of one variable on the
other can be measured by R², and it is equal to 0.717 in
our example.
• So we can say that 72% of the variation between children
in the size of anatomical dead space is accounted for by the height
of the child.
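
One way to make the "part of the variation" concrete is to fit a straight line of dead space on height and compare the variation left over with the total variation. The slides do not show a fitted line, so the least-squares fit below is an assumption of this sketch, as are the variable names.

```python
from statistics import mean

heights = [110, 116, 124, 129, 131, 138, 142, 150, 153, 155, 156, 159, 164, 168, 174]
dead_space = [44, 31, 43, 45, 56, 79, 57, 56, 58, 92, 78, 64, 88, 112, 101]

mx, my = mean(heights), mean(dead_space)
sxy = sum((x - mx) * (y - my) for x, y in zip(heights, dead_space))
sxx = sum((x - mx) ** 2 for x in heights)
syy = sum((y - my) ** 2 for y in dead_space)   # total variation in dead space

# Least-squares straight line: dead space ≈ intercept + slope × height.
slope = sxy / sxx
intercept = my - slope * mx

# Variation in dead space left over once the line has been fitted.
residual = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(heights, dead_space))

explained_share = 1 - residual / syy      # share of variation linked to height
r_squared = sxy ** 2 / (sxx * syy)        # square of the correlation coefficient

print(round(explained_share, 2), round(r_squared, 2))
```

The two quantities coincide for a straight-line fit, which is why R² can be read as the share of the variation in one variable that is associated with the other. It remains a statement about association, not about cause.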


• The value of r ranges between -1 and +1.
• The value of r denotes the strength of the
association, as illustrated by the following scale.

Value of r        Strength of association
-1                perfect indirect correlation
-1 to -0.75       strong (indirect)
-0.75 to -0.25    intermediate (indirect)
-0.25 to 0        weak (indirect)
0                 no relation
0 to 0.25         weak (direct)
0.25 to 0.75      intermediate (direct)
0.75 to 1         strong (direct)
+1                perfect direct correlation
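
The scale above can be turned into a small lookup. The function name `describe_strength` is hypothetical, and the slides do not say which band the cut-off values 0.25 and 0.75 themselves belong to, so assigning each cut-off to the stronger band is an assumption of this sketch.

```python
def describe_strength(r):
    """Label a correlation coefficient using the cut-offs 0.25 and 0.75."""
    if not -1 <= r <= 1:
        raise ValueError("r must lie between -1 and +1")
    if r == 0:
        return "no relation"
    direction = "direct" if r > 0 else "indirect"
    size = abs(r)
    if size == 1:
        return f"perfect {direction} correlation"
    if size < 0.25:
        strength = "weak"
    elif size < 0.75:
        strength = "intermediate"   # assumption: 0.25 itself counts as intermediate
    else:
        strength = "strong"         # assumption: 0.75 itself counts as strong
    return f"{strength} {direction}"

for value in (0.847, -0.5, 0.1, 0, -1):
    print(value, "->", describe_strength(value))
```

For the worked example, r = 0.847 falls in the strong, direct band.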
