Bivariate Analysis
• It is concerned with the relationship between pairs of variables (X, Y) in a data set.
• It is the simultaneous analysis of two variables.
• It is usually undertaken to see whether one variable is related to another.
• Both variables are numerical
• Both variables are categorical
• One variable is numerical and the other is categorical.
Bivariate Statistical Techniques
• Correlation Analysis
• Linear Regression Analysis
• Association of Attributes
• Two-way ANOVA
Cross Tabulation
• A cross tabulation is a technique that describes two or more variables simultaneously.
• It results in a table that displays the joint distribution of those variables, each of which has a limited number of categories or distinct values.
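As a rough illustration of a cross tabulation, the sketch below counts how often each pair of categories occurs and prints the joint distribution with row totals. The survey records and the names `records` and `cross_tabulate` are invented for this example and do not come from the slides.

```python
from collections import Counter

# Hypothetical survey records: (gender, preferred brand). Invented for illustration.
records = [
    ("F", "A"), ("F", "B"), ("M", "A"), ("M", "A"),
    ("F", "A"), ("M", "B"), ("F", "B"), ("M", "A"),
]

def cross_tabulate(pairs):
    """Count how often each (row category, column category) pair occurs."""
    counts = Counter(pairs)
    rows = sorted({r for r, _ in pairs})
    cols = sorted({c for _, c in pairs})
    table = {r: {c: counts[(r, c)] for c in cols} for r in rows}
    return rows, cols, table

rows, cols, table = cross_tabulate(records)

# Print the joint distribution with a total for each row.
print("      " + "  ".join(f"{c:>5}" for c in cols) + "  Total")
for r in rows:
    cells = "  ".join(f"{table[r][c]:>5}" for c in cols)
    print(f"{r:<6}{cells}  {sum(table[r].values()):>5}")
```

Each cell holds the number of records that fall into that combination of categories, which is what "joint distribution" means here.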
Correlation Analysis
• The study of the relationship between two variables.
• If a change in the value of one variable tends to be accompanied by a change in the value of the other, the two variables are said to be correlated.
Types of correlation
• On the Basis of Direction
– Positive Correlation
– Negative Correlation
• On the basis of Number of Variables
– Simple Correlation
– Multiple Correlation
– Partial Correlation
• On the Basis of Ratio of Change
– Linear Correlation
– Non Linear Correlation
Correlation:-On the Basis of Direction
• Positive Correlation
– Correlation is positive when the two variables vary in the same direction.
– For example, the correlation between sales and advertising expenses:
Firm             A   B   C   D   E   F
Advt. expenses   12  14  17  18  20  23
Sales            20  30  40  50  60  70
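• Negative Correlation
– The two variables vary in opposite directions.
– When one variable increases, the other decreases, and vice versa.
– For example, the correlation between the production and the price of a crop.
– In the figures below, however, both series rise together, so they show a positive rather than a negative relationship:
Use of fertilizer (in Kg)    1  2  3  4   5   6
Production of rice (in Kg)   4  6  9  16  25  28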
• Simple Correlation
– When we measure the linear relationship between just two variables, the relationship is known as simple correlation.
– For example, the relationship between sales and expenses.
• Partial Correlation
– When several related variables are present and we study the relationship between two of them while holding the effect of the others constant, it is known as partial correlation.
– For example, the correlation between height and weight after removing the effect of a third variable, age.
• Multiple Correlation
– Measures the combined effect of several variables on one variable.
– For example, the relationship of rainfall and temperature with the yield of wheat.
Degree of correlation
• Perfect Correlation
– Perfect positive correlation
• The two variables change in the same direction and in the same proportion.
• The coefficient of correlation in this case is +1.
– Perfect negative correlation
• The two variables change in opposite directions and in the same proportion.
• The coefficient of correlation is -1.
• Absence of Correlation
– If series of two variables show no relation between them
or change in one variable does not lead to change in the
other variable then it means there is no relationship
between variables.
– Coefficient of correlation is zero.
• Limited degree of correlation
– If two variables are correlated but not perfectly, the correlation is referred to as limited correlation.
– It may be positive, negative or zero.
– The coefficient lies within the limits +/- 1.
• High degree (+/- 0.75 to +/- 1)
• Moderate degree (+/- 0.25 to +/- 0.75)
• Low degree (0 to +/- 0.25)
[Figure: strength of correlation along the scale of r, labelled strong, intermediate, weak, weak, intermediate, strong]
If r = +/- 1, the correlation is perfect.
Degree       Positive correlation coefficient   Negative correlation coefficient
Perfect      +1                                 -1
Limited
  High       Between +0.75 and +1               Between -0.75 and -1
  Moderate   Between +0.25 and +0.75            Between -0.25 and -0.75
  Low        Between 0 and +0.25                Between 0 and -0.25
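The degree bands above can be turned into a small lookup. This is only a sketch: the function name `degree_of_correlation` is made up for illustration, and because the bands share their end points (0.25 and 0.75), the code assumes a boundary value belongs to the higher band.

```python
def degree_of_correlation(r):
    """Return (degree, direction) for a correlation coefficient r."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must lie between -1 and +1")
    size = abs(r)
    if size == 1.0:
        degree = "perfect"
    elif size == 0.0:
        degree = "absent"
    elif size >= 0.75:      # assumption: 0.75 itself counts as high
        degree = "high"
    elif size >= 0.25:      # assumption: 0.25 itself counts as moderate
        degree = "moderate"
    else:
        degree = "low"
    if r > 0:
        direction = "positive"
    elif r < 0:
        direction = "negative"
    else:
        direction = "none"
    return degree, direction

for r in (1.0, 0.8, -0.5, 0.1, 0.0):
    print(r, degree_of_correlation(r))
```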
[Figure: scatter plots of SBP (mmHg) against weight (kg)]
Scatter Diagram:-It is a dot chart specially used to show the correlation between two variables.
• Merits of Scatter Plot
– It is a very simple, non-mathematical technique.
– It is not influenced by the size of extreme items.
– It is the basic first step in finding out the relationship between two variables.
• Demerits of Scatter Plot
– It cannot give the exact degree of correlation between two variables.
– It shows only the visual form and direction of the correlation on the chart.
Karl Pearson’s Coefficient of Correlation
• Calculate Karl Pearson's coefficient of correlation for the following data:
X   42  52  55  60  66  68  65  60  58  34
Y   11  13  18  22  26  40  31  27  24  18
Working table (x and y are deviations from the means X̄ and Ȳ):
X    Y    x = X - X̄   x²   y = Y - Ȳ   y²   xy
42   11
52   13
55   18
60   22
66   26
68   40
65   31
60   27
58   24
34   18
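The column headings of the working table suggest the deviation form of the coefficient, r = ∑xy / √(∑x² · ∑y²), where x and y are deviations from the means. The sketch below fills in those columns for the data given and computes r. The variable names are chosen here for illustration only.

```python
import math

X = [42, 52, 55, 60, 66, 68, 65, 60, 58, 34]
Y = [11, 13, 18, 22, 26, 40, 31, 27, 24, 18]

n = len(X)
mean_x = sum(X) / n          # 56.0
mean_y = sum(Y) / n          # 23.0

# Deviations from the means (the x and y columns of the working table).
x = [xi - mean_x for xi in X]
y = [yi - mean_y for yi in Y]

sum_x2 = sum(d * d for d in x)              # total of the x² column
sum_y2 = sum(d * d for d in y)              # total of the y² column
sum_xy = sum(a * b for a, b in zip(x, y))   # total of the xy column

r = sum_xy / math.sqrt(sum_x2 * sum_y2)

print(f"sum x^2 = {sum_x2}, sum y^2 = {sum_y2}, sum xy = {sum_xy}")
print(f"r = {r:.3f}")   # about 0.761
```

With ∑x² = 1058, ∑y² = 674 and ∑xy = 643, the coefficient comes to about 0.76, a high degree of positive correlation on the scale used in these slides.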
Merits:
3. It is based on many assumptions, such as a linear relationship and a cause-and-effect relationship between the variables.
Assumptions:
(B) A cause-and-effect relationship exists between the two variables.
The coefficient of correlation is denoted by r.
Spearman’s Rank Correlation
• A technique for finding the correlation between the ranks of two series.
• It is used when the values of the variable cannot be measured quantitatively.
• Professor Charles Spearman worked out a method for determining correlation in which all the values of a series are assigned ranks in decreasing or increasing order.
• In this ranking process, the highest value is given rank 1, the next highest value is given rank 2, and so on.
• In some series two or more values are equal. In that case those values share the mean of the ranks they would otherwise occupy. For example, if a series contains two observations of 67 each, one at S. No. 3 and the other at S. No. 10, then instead of being ranked 6 and 7 respectively, both are ranked 6.5 (the mean of ranks 6 and 7).
a) Problems where actual ranks are given
1) Calculate the difference between the two ranks, D = R1 - R2.
2) Square each difference and calculate the sum of the squared differences, i.e. ∑D².
b) Problems where ranks are not given
1. If the ranks are not given, we need to assign ranks to the data series.
2. Either the lowest value or the highest value in the series can be assigned rank 1.
3. The same ranking scheme must be followed for the other series.
4. Then calculate the rank correlation coefficient in the same way as when the ranks are given.
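c) When the ranks are repeated
Tied values share the mean of the ranks they would otherwise occupy.
The basic rank correlation formula is
r = 1 - 6∑D² / (n(n² - 1))
Where
r = Rank coefficient of correlation
D = Difference between the two ranks (R1 - R2)
n = Number of pairs of observations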
English   1  2  3  4  5  6   7  8  9  10
History   2  4  1  5  3  10  9  6  7  8

Rank in English (R1)   Rank in History (R2)   D = R1 - R2   D²
1                      2                      -1            1
2                      4                      -2            4
3                      1                      +2            4
4                      5                      -1            1
5                      3                      +2            4
6                      10                     -4            16
7                      9                      -2            4
8                      6                      +2            4
9                      7                      +2            4
10                     8                      +2            4
                                              ∑D² = 46
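Assuming the standard Spearman formula r = 1 - 6∑D² / (n(n² - 1)), the English and History ranks above can be checked with a few lines. The function name `spearman_from_ranks` is introduced here for illustration.

```python
english = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
history = [2, 4, 1, 5, 3, 10, 9, 6, 7, 8]

def spearman_from_ranks(r1, r2):
    """Rank correlation for two lists of ranks with no ties."""
    n = len(r1)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(r1, r2))
    return sum_d2, 1 - 6 * sum_d2 / (n * (n ** 2 - 1))

sum_d2, r = spearman_from_ranks(english, history)
print(f"sum D^2 = {sum_d2}")   # 46
print(f"r = {r:.4f}")          # about 0.7212
```

The result, about 0.72, indicates a fairly strong positive agreement between the two rankings.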
X   68  64  74  50  64  80  74  40  55  64
Y   62  58  67  45  81  60  67  48  50  70

The three values of 64 in X would occupy ranks 5, 6 and 7, so each receives the mean rank 6.

X    Rank (R1)   Y    Rank (R2)   D = R1 - R2   D²
68   4           62   5           -1            1
64   6           58   7           -1            1
74   2.5         67   3.5         -1            1
50   9           45   10          -1            1
64   6           81   1           5             25
80   1           60   6           -5            25
74   2.5         67   3.5         -1            1
40   10          48   9           1             1
55   8           50   8           0             0
64   6           70   2           4             16
                                  ∑D² = 72
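The ranking of tied values can be sketched as follows. The helper names are invented for illustration. The last step applies a tie correction that is commonly taught (adding m(m² - 1)/12 to ∑D² for every group of m tied values); that correction is an assumption here, since it does not appear in the surviving slide text.

```python
from collections import Counter

X = [68, 64, 74, 50, 64, 80, 74, 40, 55, 64]
Y = [62, 58, 67, 45, 81, 60, 67, 48, 50, 70]

def average_ranks(values):
    """Rank with the highest value as rank 1; tied values share the mean rank."""
    positions = {}
    for pos, v in enumerate(sorted(values, reverse=True), start=1):
        positions.setdefault(v, []).append(pos)
    return [sum(positions[v]) / len(positions[v]) for v in values]

def tie_correction(values):
    """Sum of m(m^2 - 1)/12 over every group of m tied values (assumed formula)."""
    return sum(m * (m ** 2 - 1) / 12 for m in Counter(values).values() if m > 1)

rx, ry = average_ranks(X), average_ranks(Y)
n = len(X)
sum_d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))

r_plain = 1 - 6 * sum_d2 / (n * (n ** 2 - 1))
correction = tie_correction(X) + tie_correction(Y)
r_corrected = 1 - 6 * (sum_d2 + correction) / (n * (n ** 2 - 1))

print("R1 =", rx)
print("R2 =", ry)
print(f"sum D^2 = {sum_d2}")                                 # 72.0
print(f"r without tie correction = {r_plain:.4f}")           # about 0.5636
print(f"r with assumed tie correction = {r_corrected:.4f}")  # about 0.5455
```

A quick check on any ranking is that the ranks of n items add up to n(n + 1)/2, which is 55 for ten observations.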
• Compute the coefficient of rank correlation between Economics marks and Statistics marks as given below:
Solution:
This is a case of tied ranks, as more than one student shares the same mark in both Economics and Statistics.
For Economics, the student receiving 80 marks gets rank 1, the one with 62 marks gets rank 2, the student with 60 marks gets rank 3, and the student with 56 marks gets rank 4. Since two students each get 50 marks, each receives a common rank, the average of the next two ranks 5 and 6, i.e. (5 + 6) / 2 = 5.5. Lastly, rank 7 goes to the student with the lowest Economics marks.
In a similar manner, we award ranks to the students for their Statistics marks.
Computation of Rank Correlation Between Economics Marks and Statistics Marks with Tied Marks
2. This method is useful where we can give only the ranks and not the actual data (qualitative terms).
3. This method should be used where the initial data are in the form of ranks.
4. The dependent variable takes any random value, but the values of the independent variables are fixed.