07/14/2020
DNA/BRM/Session 26 (a). Correlation & Regression
BRM
MODULE 5 SESSION 26 (a)
2. Preference for Quality, price, availability in order is a type
of………….scale.
3. Container’s code is an example of……………………………………..
4. Cronbach alpha is a measure of …………………………………………
5. For ordinal data, measure of central tendency used is………………
6. Simplest measure of dispersion is………………………………………………
7. COV is calculated as………………………………………
8. BCG matrix is an example of ……………………
Quantitative Techniques
PURPOSE: TO PROVIDE A RATIONAL BASIS FOR MAKING
DECISIONS IN THE ABSENCE OF COMPLETE
INFORMATION.
IT DEALS WITH THREE CLASSICAL ASPECTS OF SCIENCE:
- DESCRIBING THE BEHAVIOUR OF SYSTEMS
- ANALYZING THE BEHAVIOUR BY CONSTRUCTING
APPROPRIATE MODELS.
- APPLYING THESE MODELS TO PREDICT FUTURE
BEHAVIOUR OR DESCRIBING RELATIONSHIP AMONG VARIOUS
VARIABLES.
O.R. FUNCTION IS A STAFF FUNCTION.
CORRELATION
Definitions
Correlation
A method used to determine if a relationship between variables exists
Correlation Coefficient
Wt. (kg)      67   69   85   83   74   81   97   92  114   85
SBP (mmHg)   120  125  140  160  130  180  150  140  200  130
[Figure: scatter plots of SBP (mmHg) against weight (kg) for the data above]
[Figure: scatter plot of height (cm) against age (weeks)]
[Figure: negative relationship, reliability against age of car]
[Figure: no relation]
Correlation Coefficient
If r = 1 or r = -1, the correlation is perfect.
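How to compute the simple correlation coefficient (r)
$$ r = \frac{\sum xy - \frac{\sum x \sum y}{n}}{\sqrt{\left(\sum x^2 - \frac{(\sum x)^2}{n}\right)\left(\sum y^2 - \frac{(\sum y)^2}{n}\right)}} $$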
Example:
A sample of 6 children was selected, and their ages in years
and weights in kilograms were recorded as shown in the
following table. Find the correlation between age and weight.
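$$ r = \frac{\sum xy - \frac{\sum x \sum y}{n}}{\sqrt{\left(\sum x^2 - \frac{(\sum x)^2}{n}\right)\left(\sum y^2 - \frac{(\sum y)^2}{n}\right)}} $$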
Serial no.   Age in years (x)   Weight in kg (y)   xy    x²    y²
1            7                  12                 84    49    144
2            6                  8                  48    36    64
3            8                  12                 96    64    144
4            5                  10                 50    25    100
5            6                  11                 66    36    121
6            9                  13                 117   81    169
r = 0.759
strong direct correlation
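The calculation for this example can be checked with a short script. This is a minimal sketch of the computational (sums) formula applied to the six age and weight pairs from the table; the function name `pearson_r` is an illustrative choice, not something from the slides.

```python
# Age (x, years) and weight (y, kg) of the six children in the table.
ages = [7, 6, 8, 5, 6, 9]
weights = [12, 8, 12, 10, 11, 13]


def pearson_r(x, y):
    """Pearson r via the computational formula built from column sums."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(a * b for a, b in zip(x, y))
    sum_x2 = sum(a * a for a in x)
    sum_y2 = sum(b * b for b in y)
    numerator = sum_xy - sum_x * sum_y / n
    denominator = ((sum_x2 - sum_x ** 2 / n) * (sum_y2 - sum_y ** 2 / n)) ** 0.5
    return numerator / denominator


r = pearson_r(ages, weights)
print(f"r = {r:.4f}")  # prints r = 0.7596
```

The result agrees with the slide's 0.759 up to rounding in the last digit.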
EXAMPLE: Relationship between Anxiety and Test Scores
Anxiety (X)   Test score (Y)   X²    Y²    XY
10            2                100   4     20
8             3                64    9     24
2             9                4     81    18
1             7                1     49    7
5             6                25    36    30
6             5                36    25    30
∑X = 32   ∑Y = 32   ∑X² = 230   ∑Y² = 204   ∑XY = 129
Calculating Correlation Coefficient
r = - 0.94
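Substituting the column totals from the anxiety table (n = 6) into the computational formula for r reproduces this value. The intermediate figures are rounded to two decimals.

$$ r = \frac{129 - \frac{32 \times 32}{6}}{\sqrt{\left(230 - \frac{32^2}{6}\right)\left(204 - \frac{32^2}{6}\right)}} = \frac{-41.67}{\sqrt{59.33 \times 33.33}} \approx \frac{-41.67}{44.47} \approx -0.94 $$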
$$ r_{x,y} = \frac{\operatorname{Cov}(x, y)}{s_x s_y} $$
The test scores and work experience of 5 executives are
given below. Determine the coefficient of correlation.
X1 = test score, X2 = experience (years)
X1    X2
50    2
80    8
20    6
90    5
60    4
Using Excel: r = 0.204124
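The same figure can be reproduced outside Excel. This is a minimal sketch of the covariance form r = Cov(x, y) / (s_x s_y). It assumes sample statistics with an n - 1 divisor (the divisor cancels in the ratio, so a population version gives the same r), and the helper names are illustrative.

```python
from math import sqrt

test_score = [50, 80, 20, 90, 60]  # X1
experience = [2, 8, 6, 5, 4]       # X2, years


def mean(values):
    return sum(values) / len(values)


def sample_cov(x, y):
    """Sample covariance with the n - 1 divisor."""
    mx, my = mean(x), mean(y)
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / (len(x) - 1)


def sample_sd(x):
    """Sample standard deviation, the square root of Cov(x, x)."""
    return sqrt(sample_cov(x, x))


r = sample_cov(test_score, experience) / (
    sample_sd(test_score) * sample_sd(experience)
)
print(f"r = {r:.6f}")  # prints r = 0.204124
```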
Properties of “r”
- r is always between -1 and +1 inclusive. A value of -1 means
perfect negative linear correlation and +1 means
perfect positive linear correlation.
- r measures only the strength of a linear relationship.
There are other kinds of relationships besides linear.
- r has the same sign as the slope of the regression
(best fit) line.
- r does not change if the independent (x) and
dependent (y) variables are interchanged.
- r does not change if the scale on either variable is
changed. You may add or subtract a constant, or multiply
or divide by a positive constant, for all the x-values or
y-values without changing the value of r. Multiplying or
dividing by a negative constant reverses the sign of r but
not its magnitude.
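These properties are easy to check numerically. The sketch below reuses the age and weight data from the earlier example. The rescaling to months and grams and the added offset are arbitrary illustrative choices. It also shows one caveat: a negative multiplier flips the sign of r.

```python
from math import isclose, sqrt


def pearson_r(x, y):
    """Pearson r from deviations about the means."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


ages = [7, 6, 8, 5, 6, 9]            # years
weights = [12, 8, 12, 10, 11, 13]    # kg

r = pearson_r(ages, weights)

# Interchanging x and y leaves r unchanged.
r_swapped = pearson_r(weights, ages)

# Changing units (years -> months, kg -> grams) and adding a constant
# leaves r unchanged.
r_rescaled = pearson_r([12 * a for a in ages], [1000 * w + 5 for w in weights])

# Multiplying one variable by a negative constant reverses the sign of r.
r_negated = pearson_r([-a for a in ages], weights)

print(isclose(r, r_swapped), isclose(r, r_rescaled), isclose(r, -r_negated))
```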
Spearman’s Rank Correlation Coefficient
Nonparametric correlation between two ordinal
variables.
Rank correlation coefficient:
$$ r_s = 1 - \frac{6 \sum D^2}{N^3 - N} $$
where D (difference of ranks) = R1 - R2
N = total number of observations
Procedure
1. Rank the values of X from 1 to n, where n is the number of pairs
of values of X and Y in the sample.
2. Rank the values of Y from 1 to n.
3. Compute d_i for each pair of observations by subtracting
the rank of Y_i from the rank of X_i.
4. Square each d_i and compute ∑d_i², the sum of the squared
values.
5. Apply the following formula:
$$ r_s = 1 - \frac{6 \sum d_i^2}{n(n^2 - 1)} $$
∑d_i² = 64
$$ r_s = 1 - \frac{6 \times 64}{7(48)} \approx -0.14 $$
Comment:
There is an indirect weak correlation between level of education and
income.
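The ranking procedure and the formula can be sketched in a few lines. The slides give only the totals for the education and income example (∑d² = 64, n = 7), so the paired values used to exercise the full procedure below are hypothetical. The sketch also assumes there are no tied values, since the simple formula does not handle ties.

```python
def ranks(values):
    """Rank from 1 (smallest) to n; assumes no tied values."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result


def spearman_from_sum_d2(sum_d2, n):
    """Step 5: r_s = 1 - 6 * sum(d^2) / (n * (n^2 - 1))."""
    return 1 - 6 * sum_d2 / (n * (n ** 2 - 1))


def spearman_rs(x, y):
    """Steps 1 to 5: rank both variables, difference, square, sum, apply."""
    rank_x, rank_y = ranks(x), ranks(y)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(rank_x, rank_y))
    return spearman_from_sum_d2(sum_d2, len(x))


# Totals from the slide: sum of squared rank differences 64, n = 7.
print(round(spearman_from_sum_d2(64, 7), 3))  # prints -0.143

# Hypothetical data with perfectly reversed ranks.
print(spearman_rs([10, 20, 30, 40], [4, 3, 2, 1]))  # prints -1.0
```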
Causation
Regression
The regression equation should be used for estimation only when:
- There is significant linear correlation (that is, when we reject the null
hypothesis that rho = 0 in a correlation hypothesis test).
- The value of the independent variable used in the estimation is close
to the original values, not far beyond their range (that is, we
should not use a regression equation obtained using x's between 10 and
20 to estimate y when x is 350).
Regression (contd..)
- The regression equation should not be used with different populations
(that is, if x is the height of a male and y is the weight of a male,
you should not use the regression equation to estimate the weight of a
female).
- The regression equation should not be used to forecast values outside the
time frame of the data (that is, if the data are from the 1970s, the equation
probably does not give valid estimates for the 2000s).
Regression Equation
The regression equation is:
y' = a + bx
where y' is the predicted value of y, ‘b’ is the slope of the
regression line and ‘a’ is the y-intercept of the regression
line. Each observed value is y = a + bx + e, where e is the
error term. The regression line is sometimes called the
"line of best fit" or the "best fit line".
Since it "best fits" the data, it makes sense that the
line passes through the point of means (x̄, ȳ).
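As an illustration, the line can be fitted to the age and weight data from the correlation example. The slides do not state how a and b are obtained. The sketch below assumes the usual least-squares estimates, b = Sxy / Sxx and a = ȳ - b x̄, and then checks that the fitted line passes through the point of means.

```python
ages = [7, 6, 8, 5, 6, 9]            # x, years
weights = [12, 8, 12, 10, 11, 13]    # y, kg


def fit_line(x, y):
    """Least-squares intercept a and slope b for y' = a + b*x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((p - mean_x) * (q - mean_y) for p, q in zip(x, y))
    sxx = sum((p - mean_x) ** 2 for p in x)
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


a, b = fit_line(ages, weights)
mean_x = sum(ages) / len(ages)
mean_y = sum(weights) / len(weights)

print(f"y' = {a:.3f} + {b:.3f}x")
# The prediction at the mean age equals the mean weight.
print(abs((a + b * mean_x) - mean_y) < 1e-9)  # prints True
```

The slope is positive, matching the sign of r = 0.759 for the same data.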
Coefficient of Determination
The coefficient of determination is:
- the percent of the variation that can be explained by the
regression equation
- the explained variation divided by the total variation
- the square of r
Every sample has some variation in it. The total variation
is made up of two parts, the part that can be explained
by the regression equation and the part that can't be
explained by the regression equation.
The ratio of the explained variation to the total variation
is a measure of how good the regression line is. If the
regression line passed through every point on the scatter
plot exactly, it would be able to explain all of the
variation. The further the line is from the points, the less
it is able to explain.
Total variation = unexplained variation (UV) + explained variation (EV)
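The decomposition can be verified on the age and weight data used earlier. This sketch assumes a least-squares line and measures each variation as a sum of squared deviations. It then checks that the explained share equals r² (0.7596² is roughly 0.577).

```python
ages = [7, 6, 8, 5, 6, 9]            # x, years
weights = [12, 8, 12, 10, 11, 13]    # y, kg

n = len(ages)
mean_x = sum(ages) / n
mean_y = sum(weights) / n

# Least-squares slope and intercept.
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(ages, weights))
sxx = sum((x - mean_x) ** 2 for x in ages)
b = sxy / sxx
a = mean_y - b * mean_x
predicted = [a + b * x for x in ages]

total = sum((y - mean_y) ** 2 for y in weights)                       # total variation
explained = sum((p - mean_y) ** 2 for p in predicted)                 # EV
unexplained = sum((y - p) ** 2 for y, p in zip(weights, predicted))   # UV

r_squared = explained / total
print(f"total = {total:.3f}, EV = {explained:.3f}, UV = {unexplained:.3f}")
print(f"coefficient of determination = {r_squared:.4f}")
```

For these data about 58% of the variation in weight is explained by age, and the remainder is unexplained.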
Applications
Finance:
Profit and sales revenue
Return on stock and return of BSE/NIFTY
Capital reserve and return on stock
Marketing:
Sales of any product and Advt. budget
Sales revenue and salary of executive
Economics:
Inflation and GDP
Demand of any product and Temperature
A Case on Multiple Regression
Himalayan Plastics Ltd. manufactures different types of plastic
products. The company has decided to train its supervisors and
conducted an aptitude test for their selection. A random sample
of five supervisors with at least two years of experience was
selected. They were trained for two weeks as Quality Control
Inspectors, and after the training their proficiency was measured
by their output per shift. The output per shift (number of units)
recorded for the five supervisors was 20, 60, 30, 50 and 70
respectively.
Contd.
The values of test score (X1) and experience in years (X2) are as follows:
X1 X2
50 2
80 8
20 3
90 5
60 4
i) Determine the relationship between the performance of the Inspectors and their test scores as well as experience
ii) Determine the correlation between test score and experience of Inspectors.
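Hint:
$$
\begin{aligned}
Y_c &= a + b_1 X_1 + b_2 X_2 \\
\sum Y &= N a + b_1 \sum X_1 + b_2 \sum X_2 \\
\sum X_1 Y &= a \sum X_1 + b_1 \sum X_1^2 + b_2 \sum X_1 X_2 \\
\sum X_2 Y &= a \sum X_2 + b_1 \sum X_1 X_2 + b_2 \sum X_2^2
\end{aligned}
$$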
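One way to work part (i) is to build the three normal equations from the case data and solve them for a, b1 and b2. The sketch below solves them with Cramer's rule in plain Python. That solution method and the helper names are choices made for illustration, not something the case prescribes.

```python
output = [20, 60, 30, 50, 70]        # Y, units per shift
test_score = [50, 80, 20, 90, 60]    # X1
experience = [2, 8, 3, 5, 4]         # X2, years


def dot(u, v):
    return sum(p * q for p, q in zip(u, v))


n = len(output)

# Coefficient matrix and right-hand side of the normal equations.
matrix = [
    [n, sum(test_score), sum(experience)],
    [sum(test_score), dot(test_score, test_score), dot(test_score, experience)],
    [sum(experience), dot(test_score, experience), dot(experience, experience)],
]
rhs = [sum(output), dot(test_score, output), dot(experience, output)]


def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    return (
        m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
        - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
        + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0])
    )


def replace_column(m, col, values):
    """Copy of m with column `col` replaced by `values`."""
    return [[values[i] if j == col else m[i][j] for j in range(3)] for i in range(3)]


d = det3(matrix)
a, b1, b2 = (det3(replace_column(matrix, k, rhs)) / d for k in range(3))

print(f"Yc = {a:.4f} + {b1:.4f} X1 + {b2:.4f} X2")
```

Substituting a, b1 and b2 back into each normal equation should reproduce the left-hand sums, which is a quick check on the arithmetic.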
FORECAST USING EXCEL
X Y
40 2
45 1.8
50 2.3
52 2.5
60 3
65 3.4
68 3.1
72 4.2
80 4.8
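For readers without Excel, the same kind of forecast can be sketched in Python. The function below fits a least-squares line to the X and Y columns and evaluates it at a new x. Its argument order mirrors Excel's FORECAST(x, known_y's, known_x's) worksheet function, on the assumption that this is the function the slide refers to. The slide gives no target x, so the value 70 is a hypothetical choice inside the observed range.

```python
x_values = [40, 45, 50, 52, 60, 65, 68, 72, 80]
y_values = [2, 1.8, 2.3, 2.5, 3, 3.4, 3.1, 4.2, 4.8]


def forecast(new_x, known_y, known_x):
    """Predict y at new_x from the least-squares line through the known points."""
    n = len(known_x)
    mean_x = sum(known_x) / n
    mean_y = sum(known_y) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(known_x, known_y))
    sxx = sum((x - mean_x) ** 2 for x in known_x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept + slope * new_x


new_x = 70  # hypothetical value within the range of the data (40 to 80)
print(f"forecast at x = {new_x}: {forecast(new_x, y_values, x_values):.3f}")
```

Keeping the new x within 40 to 80 follows the earlier caution against estimating far beyond the range of the original values.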
THANK YOU