
mhkorai.blogspot.com

Question No. 1 Spring 2011

a) The average runs scored by seven leading test cricketers during the year 2010 are given
below:

Average runs scored in 1st innings (x) 46 73 68 79 49 43 81


Average runs scored in 2nd innings (y) 66 31 45 26 58 63 35

Find the Spearman's rank correlation coefficient for the runs scored in first and second
innings and interpret your result.

b) Find the coefficient of correlation between x and y if:


Regression line of x on y is: 5x – 4y + 2 = 0
Regression line of y on x is: x – 5y + 3 = 0

Answer No. 1(a) Spring 2011


Scores          Ranks
1st    2nd      x    y    d = x – y    d²
46     66       2    7    –5           25
73     31       5    2     3            9
68     45       4    4     0            0
79     26       6    1     5           25
49     58       3    5    –2            4
43     63       1    6    –5           25
81     35       7    3     4           16
Total                                 104

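$$ r = 1 - \frac{6\sum d^2}{n(n^2-1)} = 1 - \frac{6(104)}{7(49-1)} = 1 - \frac{624}{336} $$

r = 1 – 1.8571
r = – 0.8571

There is a high negative correlation between the scores in the two innings, i.e. players who averaged more in the first innings tended to average less in the second.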
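The ranking and the formula above can be checked with a short script. This is only a sketch: the helper names `ranks` and `spearman` are made up for illustration, and the ranking assumes there are no tied values, which holds for this data.

```python
def ranks(values):
    """Rank values from smallest (1) to largest (n); assumes no ties."""
    order = sorted(values)
    return [order.index(v) + 1 for v in values]


def spearman(x, y):
    """Spearman's coefficient from the sum of squared rank differences."""
    n = len(x)
    d2 = sum((rx - ry) ** 2 for rx, ry in zip(ranks(x), ranks(y)))
    return 1 - 6 * d2 / (n * (n ** 2 - 1))


first = [46, 73, 68, 79, 49, 43, 81]    # 1st innings averages (x)
second = [66, 31, 45, 26, 58, 63, 35]   # 2nd innings averages (y)
rho = spearman(first, second)
print(round(rho, 4))  # -0.8571
```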

Answer No. 1(b) Spring 2011


Rearranging regression lines

Line x on y
5x – 4y + 2 = 0
5x = 4y – 2

Prepared By: Dawood Shahid (CPA, CA(f), M.Phil, MBA,)



x = (4/5)y – 2/5

Here bxy = 4/5

Line y on x
x – 5y + 3 = 0
5y = x + 3
y = (1/5)x + 3/5
Here byx = 1/5

We know that r = √(bxy × byx) = √(4/5 × 1/5) = √(4/25) = 2/5 = 0.4

r takes the sign of the two regression coefficients, which are both positive here.
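The relation r = ±√(bxy × byx) can be sketched in code as follows. The function name is hypothetical; the sign rule (r carries the common sign of the two regression coefficients) is the standard convention and is stated here as an assumption of the sketch.

```python
import math


def r_from_regression_coefficients(bxy, byx):
    """Return r = +/- sqrt(bxy * byx), using the common sign of the two slopes."""
    product = bxy * byx
    if product < 0:
        raise ValueError("regression coefficients must share the same sign")
    magnitude = math.sqrt(product)
    return -magnitude if bxy < 0 else magnitude


r = r_from_regression_coefficients(4 / 5, 1 / 5)
print(round(r, 4))  # 0.4
```

When both coefficients are negative the same function returns a negative r.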

Question No. 2 Autumn 2010

The quantities sold by T&P Limited during the past seven months are as follows:

Product x 11 20 4 0 18 7 16
Product y 15 2 32 35 5 28 10

i) Determine the regression equation for product y on x.


ii) Calculate the coefficient of correlation and determination and interpret the results.

Answer No. 2 Autumn 2010

x y xy x2 y2
11 15 165 121 225
20 2 40 400 4
4 32 128 16 1,024
0 35 0 0 1,225
18 5 90 324 25
7 28 196 49 784
16 10 160 256 100
76 127 779 1,166 3,387

i)
$$ b_{yx} = \frac{n\sum xy - \sum x \sum y}{n\sum x^2 - (\sum x)^2} = \frac{7(779) - (76)(127)}{7(1166) - (76)^2} = \frac{5453 - 9652}{8162 - 5776} = \frac{-4199}{2386} = -1.7598 $$

Since a = ȳ – b x̄, therefore

$$ a = \frac{127}{7} - (-1.7598)\,\frac{76}{7} $$

a = 18.1429 + 1.7598(10.8571) = 18.1429 + 19.1069 = 37.2498


We know that the regression equation for y on x is determined by the relation:

y = a + bx

∴ y = 37.2498 – 1.7598x

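ii)
$$ r = \frac{n\sum xy - \sum x \sum y}{\sqrt{[n\sum x^2 - (\sum x)^2][n\sum y^2 - (\sum y)^2]}} = \frac{7(779) - 76 \times 127}{\sqrt{[7(1166) - (76)^2][7(3387) - (127)^2]}} $$

$$ = \frac{5453 - 9652}{\sqrt{[8162 - 5776][23709 - 16129]}} = \frac{-4199}{\sqrt{(2386)(7580)}} = \frac{-4199}{4252.7497} = -0.9874 $$

Co-efficient of determination = r² = (– 0.9874)² = 0.9749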

Results Interpretation

The value of r shows a high negative correlation between the quantities sold of the two
products. r² = 0.9749 signifies that 97.5% of the variation in sales of Product y is explained by
variation in sales of Product x. The other 2.5% of the variation is due to other factors.
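The whole calculation (slope, intercept, r and r²) follows mechanically from the five column totals, so it can be reproduced with a few lines of code. The sketch below uses a hypothetical helper `fit_line_and_r` and the data of this question; it is an illustration of the formulas used above, not a general-purpose routine.

```python
import math


def fit_line_and_r(x, y):
    """Least squares line y = a + b*x and Pearson's r from the raw sums."""
    n = len(x)
    sx, sy = sum(x), sum(y)
    sxy = sum(p * q for p, q in zip(x, y))
    sxx = sum(p * p for p in x)
    syy = sum(q * q for q in y)
    num = n * sxy - sx * sy
    b = num / (n * sxx - sx ** 2)
    a = sy / n - b * sx / n
    r = num / math.sqrt((n * sxx - sx ** 2) * (n * syy - sy ** 2))
    return a, b, r


product_x = [11, 20, 4, 0, 18, 7, 16]
product_y = [15, 2, 32, 35, 5, 28, 10]
a, b, r = fit_line_and_r(product_x, product_y)
print(round(a, 2), round(b, 4), round(r, 4), round(r * r, 4))
```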

Question No. 3 Spring 2010


The following data shows the price and demand of a product at different points in time:

Price (Rs.) 33 55 50 42 48 61 53 33
Demand (1000 kg) 91 60 59 65 61 49 42 91

a) Determine the regression equation for the demand on price.


b) Find the coefficient of correlation and coefficient of determination.
c) Interpret the results obtained in (b) above.

Answer No. 3 Spring 2010


Price (x) Demand (y) xy x2 y2
33 91 3003 1089 8281
55 60 3300 3025 3600
50 59 2950 2500 3481
42 65 2730 1764 4225
48 61 2928 2304 3721
61 49 2989 3721 2401
53 42 2226 2809 1764
33 91 3003 1089 8281


375 518 23129 18301 35754

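a) Regression equation of y upon x: y = a + bx

where a = ȳ – b x̄ and

$$ b = \frac{n\sum xy - \sum x \sum y}{n\sum x^2 - (\sum x)^2} = \frac{8(23129) - 375(518)}{8(18301) - (375)^2} = \frac{185032 - 194250}{146408 - 140625} = \frac{-9218}{5783} = -1.59 $$

$$ \bar{y} = \frac{\sum y}{n} = \frac{518}{8} = 64.75, \qquad \bar{x} = \frac{\sum x}{n} = \frac{375}{8} = 46.875 $$

a = 64.75 – (– 1.59) (46.875) = 64.75 + 74.53 = 139.28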

Regression line

y = 139.28 – 1.59x

b) Correlation coefficient

$$ r = \frac{n\sum xy - \sum x \sum y}{\sqrt{[n\sum x^2 - (\sum x)^2][n\sum y^2 - (\sum y)^2]}} = \frac{8(23129) - 375(518)}{\sqrt{[8(18301) - (375)^2][8(35754) - (518)^2]}} $$

$$ = \frac{185032 - 194250}{\sqrt{[146408 - 140625][286032 - 268324]}} = \frac{-9218}{\sqrt{(5783)(17708)}} = \frac{-9218}{10119.55} = -0.91 $$

Coefficient of determination = r² = (– 0.91)² = 0.8281

c) There is a strong inverse correlation between the demand for the product and its price,
i.e. demand decreases when the price increases. The coefficient of determination 0.8281
shows that 82.81% of the variation in demand is explained by the price. The remaining
17.19% is due to other factors which are not explained by the regression.

Question No. 4 Autumn 2009


a) Following data shows the marks obtained by 11 students in Mathematics and Physics:

Mathematics 27 73 34 25 64 91 70 62 55 48 59
Physics 67 62 41 21 74 85 66 49 55 44 68

Find the Spearman's rank correlation coefficient for the above data and interpret your
result.

b) In order to determine the relationship between experience of its employees and their
respective output, a company has gathered the following data:

Experience in years 2 4 6 8 10 12 14 16 18 20
Output in % 30 35 44 43 46 50 45 48 39 34

i) Determine the regression equation for output on experience.


ii) Describe the apparent relationship between experience and output.
iii) Predict the output of a 13 year experienced employee, using the regression
equation.

Answer No. 4(a) Autumn 2009


Mathematics Physics Rank Math Rank Physics d d2
27 67 10 4 6 36
73 62 2 6 -4 16
34 41 9 10 -1 1
25 21 11 11 0 0
64 74 4 2 2 4
91 85 1 1 0 0
70 66 3 5 -2 4
62 49 5 8 -3 9
55 55 7 7 0 0
48 44 8 9 -1 1
59 68 6 3 3 9
80

$$ \rho = 1 - \frac{6\sum d^2}{n(n^2-1)} = 1 - \frac{6(80)}{11(121-1)} = 1 - \frac{480}{1320} = 1 - 0.36 = 0.64 $$

The rank correlation coefficient of 0.64 shows a moderately strong positive relationship, i.e. students
who are good in mathematics are generally good in physics as well.

Answer No. 4(b) Autumn 2009


Experience in years output
x y xy x2
2 30 60 4
4 35 140 16
6 44 264 36
8 43 344 64
10 46 460 100
12 50 600 144
14 45 630 196
16 48 768 256
18 39 702 324
20 34 680 400
110 414 4648 1540

i) Regression line y = a + bx and for a and b


∑y = na + b∑x
∑xy = a∑x + b∑x2

Or

414 = 10a + 110 b____________(i)


4648 = 110a + 1540 b _________(ii)

Multiplying (i) by 11 and subtracting from (ii)

4648 = 110a + 1540b
4554 = 110a + 1210b
  94 = 330b

b = 94/330 = 0.285

Substituting the value of b in (i)


414 = 10 a+ 110 (0.285)
414 = 10 a+ 31.35
10 a = 414 – 31.35 = 382.65


a = 38.265
Hence line y = 38.265 + 0.285x

ii) Apparently, more experience brings better output. However, after about 12 years of
experience employees appear to become complacent and, as a result, show lower output.

iii) y = 38.265 + 0.285(13)

= 38.265 + 3.705 = 41.97, i.e. 42% (rounded)
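The two normal equations used in part (i) can also be solved directly by Cramer's rule instead of elimination. The sketch below assumes only the column totals from the table; the function name `solve_normal_equations` is illustrative.

```python
def solve_normal_equations(n, sum_x, sum_y, sum_xy, sum_x2):
    """Solve  sum_y = n*a + b*sum_x  and  sum_xy = a*sum_x + b*sum_x2."""
    det = n * sum_x2 - sum_x ** 2
    a = (sum_y * sum_x2 - sum_x * sum_xy) / det
    b = (n * sum_xy - sum_x * sum_y) / det
    return a, b


a, b = solve_normal_equations(n=10, sum_x=110, sum_y=414, sum_xy=4648, sum_x2=1540)
predicted = a + b * 13   # output for 13 years of experience
print(round(a, 3), round(b, 3), round(predicted, 1))
```

Because b is not rounded before it is substituted, the intercept comes out marginally different from the hand calculation, but the prediction still rounds to 42%.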

Question No. 5 Spring 2009


A company wants to assess the impact of advertising expenditures on its annual profit. The
following table presents the information for eight years:

Rupees in million
Year Advertising expenditure Annual profit
2001 90 45
2002 100 42
2003 95 44
2004 110 60
2005 130 30
2006 145 34
2007 150 35
2008 140 30

a) Construct the least square regression equation and predict the annual profit for the
year 2009 if the advertising expenditure is budgeted at Rs. 160 million.
b) Determine the coefficient of correlation and interpret your result.

Answer No. 5(a) Spring 2009


Advertising Annual
Year expenditure Profit
x y xy x2 y2
2001 90 45 4,050 8,100 2,025
2002 100 42 4,200 10,000 1,764
2003 95 44 4,180 9,025 1,936
2004 110 60 6,600 12,100 3,600
2005 130 30 3,900 16,900 900
2006 145 34 4,930 21,025 1,156
2007 150 35 5,250 22,500 1,225
2008 140 30 4,200 19,600 900
Total 960 320 37,310 119,250 13,506

In least square regression line y = a + bx


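a = ȳ – b x̄

$$ b = \frac{n\sum xy - \sum x \sum y}{n\sum x^2 - (\sum x)^2} = \frac{8(37310) - 960(320)}{8(119250) - (960)^2} = \frac{298480 - 307200}{954000 - 921600} = \frac{-8720}{32400} = -0.27 $$

$$ \bar{y} = \frac{\sum y}{n} = \frac{320}{8} = 40, \qquad \bar{x} = \frac{\sum x}{n} = \frac{960}{8} = 120 $$

a = 40 – (– 0.27) (120) = 40 + 32.4 = 72.4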


y = 72.4 – 0.27x

At x = 160 million
y = 72.4 – 0.27 (160) = 72.4 – 43.2 = 29.4 million

Answer No. 5(b) Spring 2009


$$ r = \frac{n\sum xy - \sum x \sum y}{\sqrt{[n\sum x^2 - (\sum x)^2][n\sum y^2 - (\sum y)^2]}} = \frac{8(37310) - 960(320)}{\sqrt{[8(119250) - (960)^2][8(13506) - (320)^2]}} $$

$$ = \frac{298480 - 307200}{\sqrt{[954000 - 921600][108048 - 102400]}} = \frac{-8720}{\sqrt{(32400)(5648)}} = \frac{-8720}{13527.57} = -0.64 $$

The coefficient of correlation indicates a moderate negative relationship: in this data, annual
profit tends to decrease as advertising expenditure increases.

Question No. 6 Autumn 2008


The data in the following table shows the monthly maintenance cost and the ages of nine similar
machines operating in a factory:

Machine 1 2 3 4 5 6 7 8 9
Age (in months) 5 10 15 20 30 30 30 50 50
Cost (Rs. In 000) 19 24 25 30 31 32 30 30 35

a) Find the least square regression line of maintenance cost on age.


b) Describe the apparent relationship between maintenance cost and age.
c) Find the coefficient of correlation and interpret your result.

Age(x) Cost(y) xy x2 y2
5 19 95 25 361
10 24 240 100 576
15 25 375 225 625
20 30 600 400 900


30 31 930 900 961


30 32 960 900 1,024
30 30 900 900 900
50 30 1,500 2,500 900
50 35 1,750 2,500 1,225
240 256 7,350 8,450 7,472

Answer No. 6(a) Autumn 2008


Least square line of regression
y = a + bx
To find out a, b

∑y = na + b∑x
∑xy = a∑x + b∑x2

256 = 9a + 240b _______(i)


7,350 = 240a + 8,450b ___(ii)

Multiplying (i) by 80 and (ii) by 3, then subtracting (i) from (ii)

22,050 = 720a + 25,350b
20,480 = 720a + 19,200b
 1,570 = 6,150b

b = 1570/6150 = 0.255

Putting this value in (i)

256 = 9a + 240 (0.255)
256 = 9a + 61.2
9a = 256 – 61.2 = 194.8
a = 21.644

Hence the regression line is y = 21.644 + 0.255x

Answer No. 6(b) Autumn 2008


Maintenance cost rises relatively quickly with age in the early months. Costs continue to increase
with age, but only very slowly after about 20 months of age.

Answer No. 6(c) Autumn 2008


$$ r = \frac{n\sum xy - \sum x \sum y}{\sqrt{[n\sum x^2 - (\sum x)^2][n\sum y^2 - (\sum y)^2]}} = \frac{9(7350) - 240(256)}{\sqrt{[9(8450) - (240)^2][9(7472) - (256)^2]}} $$

$$ = \frac{66150 - 61440}{\sqrt{[76050 - 57600][67248 - 65536]}} = \frac{4710}{\sqrt{[18450][1712]}} = \frac{4710}{5620} = 0.84 $$

Coefficient of correlation suggests that there is a relatively strong positive relationship between
age of a machine and its maintenance cost.

Question No. 7 Spring 2008


a) The two regression lines obtained in a correlation analysis are:
5x = 6y + 24 and 1000y = 768x – 3708
Determine bxy; byx and the correlation coefficient 'r'

b) Students who finish the examinations more quickly than the rest are often thought to be
smarter. The following set of data shows the score of 12 students and the order in which
they finished their examination:

Order of finish 1 2 3 4 5 6 7 8 9 10 11 12
Exam score 90 78 76 60 92 86 74 60 60 78 68 64

Find the Spearman's rank correlation co-efficient for the above data.

Answer No. 7(a) Spring 2008


Regression lines
5x = 6y + 24
x = (6/5)y + 24/5 ______(i) (x on y)
bxy = 6/5 = 1.2
and
1000y = 768x – 3708
y = (768/1000)x – 3708/1000 ____(ii) (y on x)
byx = 768/1000 = 0.768

r = √(bxy × byx) = √((1.2)(0.768)) = √0.9216 = 0.96

Answer No. 7(b) Spring 2008


Order of finish (x)   Score   Rank for scores (y)   d = x – y   d²
 1                    90       2                    –1.0         1.00
 2                    78       4.5                  –2.5         6.25
 3                    76       6                    –3.0         9.00
 4                    60      11                    –7.0        49.00
 5                    92       1                     4.0        16.00
 6                    86       3                     3.0         9.00
 7                    74       7                     0.0         0.00
 8                    60      11                    –3.0         9.00
 9                    60      11                    –2.0         4.00
10                    78       4.5                   5.5        30.25
11                    68       8                     3.0         9.00
12                    64       9                     3.0         9.00
Total                                                          151.50

$$ \rho = 1 - \frac{6\sum d^2}{n(n^2-1)} = 1 - \frac{6(151.5)}{12(12^2-1)} = 1 - \frac{909}{1716} = 1 - 0.53 = 0.47 $$
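Tied scores share the average of the ranks they would otherwise occupy (the two scores of 78 get rank 4.5 and the three scores of 60 get rank 11). The sketch below reproduces that ranking and the coefficient; the helper `average_ranks` is hypothetical. Note that with ties the 6∑d² formula is only an approximation to the exact rank correlation, but it is the formula used in this answer.

```python
def average_ranks(values, descending=True):
    """Rank values (1 = highest); tied values share the mean of their ranks."""
    ordered = sorted(values, reverse=descending)
    result = []
    for v in values:
        first = ordered.index(v) + 1             # first rank held by this value
        count = ordered.count(v)                 # number of tied observations
        result.append(first + (count - 1) / 2)   # mean of first..first+count-1
    return result


order_of_finish = list(range(1, 13))
scores = [90, 78, 76, 60, 92, 86, 74, 60, 60, 78, 68, 64]
score_ranks = average_ranks(scores)
sum_d2 = sum((x - y) ** 2 for x, y in zip(order_of_finish, score_ranks))
n = len(scores)
rho = 1 - 6 * sum_d2 / (n * (n ** 2 - 1))
print(sum_d2, round(rho, 2))  # 151.5 0.47
```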

Question No. 8 Autumn 2007


For the following set of data:

X 13 16 14 11 17 9 13 17 18 12
Y 6.2 8.6 7.2 4.5 9.0 3.5 6.5 9.3 9.5 5.7

a) Develop the estimation equation that best describes the data.


b) Predict y for x = 10

Answer No. 8(a) Autumn 2007


x y xy x2
13 6.2 80.6 169
16 8.6 137.6 256
14 7.2 100.8 196
11 4.5 49.5 121
17 9.0 153.0 289
9 3.5 31.5 81
13 6.5 84.5 169
17 9.3 158.1 289


18 9.5 171.0 324


12 5.7 68.4 144
140 70 1035 2038

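$$ \bar{x} = \frac{\sum x}{n} = \frac{140}{10} = 14, \qquad \bar{y} = \frac{\sum y}{n} = \frac{70}{10} = 7 $$

y – ȳ = byx (x – x̄)

$$ b_{yx} = \frac{n\sum xy - \sum x \sum y}{n\sum x^2 - (\sum x)^2} = \frac{10(1035) - (140)(70)}{10(2038) - (140)^2} = \frac{10350 - 9800}{20380 - 19600} = \frac{550}{780} = 0.71 $$

y – 7 = 0.71 (x – 14)
y – 7 = 0.71x – 9.94, y = 0.71x – 9.94 + 7
y = 0.71x – 2.94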

Answer No. 8(b) Autumn 2007


x =10, then y
y =(0.71) (10) – 2.94
= 7.1 – 2.94 = 4.16

Question No. 9 Spring 2007


a) Construct the scatter diagram for the following data:

X 9 2 12 7 16 5 8
Y 12 18 11 16 9 16 14

b) Determine the coefficient of correlation between x and y.


c) Calculate the coefficient of determination.
d) Interpret your results in (a), (b) and (c).

Answer No. 9 Spring 2007


a)
x y xy x2 y2
9 12 108 81 144
2 18 36 4 324
12 11 132 144 121
7 16 112 49 256
16 9 144 256 81
5 16 80 25 256
8 14 112 64 196
59 96 724 623 1,378


b) Coefficient of correlation

$$ r = \frac{n\sum xy - \sum x \sum y}{\sqrt{[n\sum x^2 - (\sum x)^2][n\sum y^2 - (\sum y)^2]}} = \frac{7(724) - 59(96)}{\sqrt{[7(623) - (59)^2][7(1378) - (96)^2]}} = \frac{5068 - 5664}{\sqrt{[4361 - 3481][9646 - 9216]}} $$

$$ = \frac{-596}{\sqrt{(880)(430)}} = \frac{-596}{615.14} = -0.969 $$

c) Coefficient of determination = r² = (– 0.969)² = 0.939

d) (a) and (b) show that there is a very strong negative correlation between the x and y variables.
(c) shows that 0.939 (100) = 93.9% of the variation in y is explained by its linear
relationship with x, whereas the remaining 6.1% cannot be explained.

Question No. 10 Autumn 2006


a) The heights and weights of six men are given below:

Height (meters) 2.00 1.80 1.85 1.72 1.75 1.79


Weight (kgs) 85.0 78.0 80.0 74.0 75.0 76.0

i) Determine both the lines of regression. Interpret the results as well.


ii) Estimate weight when height is 1.70 meters.
iii) Estimate height when weight is 70.0 kg.
b) Different values of co-efficient of correlation r are given below:

(i) r = 1 (ii)r = – 1 (iii)r = 0


(iv) r = 0.90 (v)r = 0.10 (vi) r = – 0.88

Explain the type of relationship you would expect between x and y in each of the above
cases.

Answer No. 10(a) Autumn 2006


Height Meters Weight Kgs xy x2 y2
x y
2.00 85.0 170.00 4.0000 7225
1.80 78.0 140.40 3.2400 6084
1.85 80.0 148.00 3.4225 6400
1.72 74.0 127.28 2.9584 5476
1.75 75.0 131.25 3.0625 5625
1.79 76.0 136.04 3.2041 5776
10.91 468.0 852.97 19.8875 36586

i) Line of regression of y upon x

y – ȳ = byx (x – x̄)

$$ \text{where } \bar{y} = \frac{\sum y}{n} = \frac{468}{6} = 78 \quad \text{and} \quad \bar{x} = \frac{\sum x}{n} = \frac{10.91}{6} = 1.82 $$

$$ b_{yx} = \frac{n\sum xy - \sum x \sum y}{n\sum x^2 - (\sum x)^2} = \frac{6(852.97) - (10.91)(468)}{6(19.8875) - (10.91)^2} = \frac{5117.82 - 5105.88}{119.325 - 119.0281} = \frac{11.94}{0.2969} = 40.22 $$

y – 78 = 40.22 (x – 1.82)
y = 40.22x – 73.2 + 78 = 4.8 + 40.22x

To estimate the weight of a person (in kg), multiply the height (in meters) by 40.22
and add 4.8.

Line of regression of x upon y


x – x̄ = bxy (y – ȳ)

$$ \text{where } b_{xy} = \frac{n\sum xy - \sum x \sum y}{n\sum y^2 - (\sum y)^2} = \frac{6(852.97) - (10.91)(468)}{6(36586) - (468)^2} = \frac{11.94}{492} = 0.024 $$

x – 1.82 = 0.024 (y – 78)
x = 0.024y – 1.872 + 1.82 = 0.024y – 0.052

To estimate the height of a person (in meters), multiply the weight (in kg) by 0.024
and subtract 0.052.

ii) Weight when height is 1.7 meters

y = 4.8 + 40.22(1.7) = 73 kg (rounded)

iii) Height when weight is 70 kg

x = 0.024(70) – 0.052 = 1.63 meters (rounded)
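Both regression lines come from the same formula with the roles of the two variables swapped. The sketch below illustrates this with a hypothetical helper `slope_and_intercept`; because it carries full precision instead of the rounded values 1.82, 40.22 and 0.024, its estimates may differ from the hand calculation in the second decimal place.

```python
heights = [2.00, 1.80, 1.85, 1.72, 1.75, 1.79]   # x, meters
weights = [85.0, 78.0, 80.0, 74.0, 75.0, 76.0]   # y, kg


def slope_and_intercept(u, v):
    """Least squares line v = a + b*u (regression of v on u)."""
    n = len(u)
    su, sv = sum(u), sum(v)
    suv = sum(p * q for p, q in zip(u, v))
    suu = sum(p * p for p in u)
    b = (n * suv - su * sv) / (n * suu - su ** 2)
    a = sv / n - b * su / n
    return a, b


a_yx, b_yx = slope_and_intercept(heights, weights)   # weight on height
a_xy, b_xy = slope_and_intercept(weights, heights)   # height on weight

weight_at_170 = a_yx + b_yx * 1.70   # estimated weight at 1.70 m
height_at_70 = a_xy + b_xy * 70.0    # estimated height at 70 kg
print(round(weight_at_170), round(height_at_70, 2))
```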

Answer No. 10(b) Autumn 2006


The coefficient of correlation measures the strength of the linear association between two variables:

i) r = 1 indicates perfect positive correlation between two variables i.e. a change in one
variable is accompanied by a change in the other variable in a definite proportion and in
the same direction.
ii) r = –1 indicates perfect inverse correlation between two variables i.e. a change in
one variable is accompanied by a change in the other variable in a definite proportion but in
the opposite direction.
iii) r = 0 indicates no linear association between the two variables.
iv) r = 0.90 indicates very strong positive correlation between two variables i.e. they tend to
change in the same direction. However, the correlation is not perfect.
v) r = 0.10 shows that the relationship between two variables is quite weak. However, this
relationship is positive.
vi) r = – 0.88 indicates very strong inverse correlation between two variables. But the
correlation is not perfect.

Question No. 11 Spring 2006


a) For the following data set, construct a scatter diagram and comment on the relationship
between x and y:

X –3 –1 0 1 3
Y 12 7 6 4 1

b) Calculate the co-efficient of correlation 'r' for the above data set. Does it seem consistent
with the above scatter diagram?

Answer No. 11(a) Spring 2006


There is a negative relationship between x and y i.e. an increase in any variable will bring
decrease in the other variable.

Answer No. 11(b) Spring 2006


x y xy x2 y2
–3 12 – 36 9 144
–1 7 –7 1 49


0 6 0 0 36
1 4 4 1 16
3 1 3 9 1
0 30 – 36 20 246

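$$ r = \frac{n\sum xy - \sum x \sum y}{\sqrt{[n\sum x^2 - (\sum x)^2][n\sum y^2 - (\sum y)^2]}} = \frac{5(-36) - 0(30)}{\sqrt{[5(20) - (0)^2][5(246) - (30)^2]}} = \frac{-180}{\sqrt{(100)(330)}} = \frac{-180}{181.66} = -0.99 $$

The coefficient of correlation is negative and close to –1, which is consistent with the strong negative relationship shown by the scatter diagram.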

Question No. 12 Spring 2006


Data regarding age and price for a sample of 11 cars of a particular type are given below:

Age (in years) Price (million rupees)


(x) (y)
5 0.85
4 1.03
6 0.70
5 0.82
5 0.89
5 0.98
6 0.66
6 0.95
2 1.69
7 0.70
7 0.48

a) Determine the least square regression equation of y on x.


b) Describe the apparent relationship between age and price.
c) Use the above regression equation to predict the price of a 3-year old car.

Answer No. 12 Spring 2006


x y xy x2
5 0.85 4.25 25
4 1.03 4.12 16
6 0.70 4.20 36
5 0.82 4.10 25
5 0.89 4.45 25
5 0.98 4.90 25
6 0.66 3.96 36
6 0.95 5.70 36
2 1.69 3.38 4
7 0.70 4.90 49
7 0.48 3.36 49


58 9.75 47.32 326

a) Regression line is y = a + bx
For a & b
∑y = na + b∑x
∑xy = a∑x + b∑x2
9.75 = 11a + 58b____(i)
47.32 = 58a + 326b___(ii)

Multiplying (i) by 58 and (ii) by 11, then subtracting (ii) from (i)

565.50 = 638a + 3364b
520.52 = 638a + 3586b
 44.98 = – 222b

b = – 44.98/222 = – 0.2
Putting this value in (i)
9.75 = 11a+ (58) (–0.2)
11a = 9.75 + 11.6 = 21.35
a = 21.35/11 =1.94

Hence the regression line is


y = 1.94 – 0.2x

b) The slope is negative, so price falls as age increases: each additional year of age reduces the estimated price by about Rs. 0.2 million.
c) y = 1.94 – 0.2 (3) = 1.94 – 0.6 = 1.34 million

Question No. 13 Autumn 2005


A study by the transportation department on the effect of bus-ticket prices upon the number of
passengers, produced the following data:

Ticket price in Rs. (x) 5 6 7 8 9 10 11 12


No. of Passengers (y) 800 780 780 660 640 600 620 620

a) Develop the least square regression line of y on x


b) How do you interpret the slope of the above regression line?
c) Estimate the number of passengers if the ticket price is 8.5 rupees.

Answer No. 13 Autumn 2005


x y xy x2


5 800 4000 25
6 780 4680 36
7 780 5460 49
8 660 5280 64
9 640 5760 81
10 600 6000 100
11 620 6820 121
12 620 7440 144
68 5500 45440 620

a) For least square regression line y = a + bx


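For a and b
a = ȳ – b x̄

$$ b_{yx} = \frac{n\sum xy - \sum x \sum y}{n\sum x^2 - (\sum x)^2} = \frac{8(45440) - 68(5500)}{8(620) - (68)^2} = \frac{363520 - 374000}{4960 - 4624} = \frac{-10480}{336} $$

b = – 31.19

$$ \bar{y} = \frac{\sum y}{n} = \frac{5500}{8} = 687.5, \qquad \bar{x} = \frac{\sum x}{n} = \frac{68}{8} = 8.5 $$

a = 687.5 – (– 31.19) (8.5)
= 687.5 + 265.1 = 952.6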
Hence y = 952.6 – 31.2 x
b) With every increase of one rupee in x (ticket price), there is a decrease in y (no. of
passengers) of about 31.2.
c) y = 952.6 – 31.2x
y = 952.6 – 31.2 (8.5)
= 952.6 – 265.2 = 687.4
or say 687

Question No. 14 Autumn 2005


The following table contains the data on the heights and weights of a group of women:

Height (x in inches) Weight (y in pounds)


65 140
69 150


62 110
71 170
66 120
68 150
70 150
67 130
63 120
65 100

Compute the co-efficient of correlation and co-efficient of determination and interpret your
results.

Answer No. 14 Autumn 2005


Height Weight
In inches in pounds
x y xy x2 y2
65 140 9100 4225 19600
69 150 10350 4761 22500
62 110 6820 3844 12100
71 170 12070 5041 28900
66 120 7920 4356 14400
68 150 10200 4624 22500
70 150 10500 4900 22500
67 130 8710 4489 16900
63 120 7560 3969 14400
65 100 6500 4225 10000
666 1340 89730 44434 183800

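$$ \text{Coefficient of correlation } r = \frac{n\sum xy - \sum x \sum y}{\sqrt{[n\sum x^2 - (\sum x)^2][n\sum y^2 - (\sum y)^2]}} $$

$$ r = \frac{10(89730) - 666(1340)}{\sqrt{[10(44434) - (666)^2][10(183800) - (1340)^2]}} = \frac{897300 - 892440}{\sqrt{(444340 - 443556)(1838000 - 1795600)}} = \frac{4860}{\sqrt{(784)(42400)}} $$

$$ = \frac{4860}{5765.55} = 0.84 $$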

Coefficient of Determination is square of coefficient of correlation


r2 = (0.84)2 = 0.7056

The interpretation of the result is that 0.7056 or 70.56% of the variation in weight is explained
by its linear relationship with height. The remaining 100 – 70.56 = 29.44% is due to other
factors, which might include diet, heredity, environment etc.

Question No. 15 Spring 2005


a) The age and price data for a sample of 11 Nissan Sunny Cars are presented in the
following table:

Age (years) Price (in Rs. '000')


X y
5 85
4 103
6 70
5 82
5 89
5 98
6 66
6 95
2 169
7 70
7 48

i) Calculate the linear correlation coefficient, r, of the data.


ii) Interpret the value of r obtained in part (i) in terms of the linear relationship between the
two variables.

b) If 8 members of a tennis club are classified A players, 6 are classified B players and 10
are classified C players, in how many different ways can 2 players from each group be
chosen to represent the club.

Answer No. 15 Spring 2005


a)

Age (x) Price (000) (y) xy x2 y2


X y
5 85 425 25 7225
4 103 412 16 10609
6 70 420 36 4900
5 82 410 25 6724
5 89 445 25 7921
5 98 490 25 9604
6 66 396 36 4356
6 95 570 36 9025
2 169 338 4 28561
7 70 490 49 4900
7 48 336 49 2304
58 975 4732 326 96129


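i)
$$ \text{Coefficient of correlation } r = \frac{n\sum xy - \sum x \sum y}{\sqrt{[n\sum x^2 - (\sum x)^2][n\sum y^2 - (\sum y)^2]}} $$

$$ r = \frac{11(4732) - 58(975)}{\sqrt{[11(326) - (58)^2][11(96129) - (975)^2]}} = \frac{52052 - 56550}{\sqrt{(3586 - 3364)(1057419 - 950625)}} = \frac{-4498}{4869.11} = -0.92 $$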

ii) There is a strong negative linear correlation between the ages of cars and their prices i.e. the
older the car, the lower its price.

b) The required number is:


$$ {}^{8}C_{2} \times {}^{6}C_{2} \times {}^{10}C_{2} = 28 \times 15 \times 45 = 18900 $$
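The count is a product of three independent "choose 2" selections, one per group. As a quick check, Python's standard `math.comb` evaluates each binomial coefficient:

```python
from math import comb

# Choose 2 of the 8 A players, 2 of the 6 B players and 2 of the 10 C players.
ways = comb(8, 2) * comb(6, 2) * comb(10, 2)
print(ways)  # 18900
```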

Question No. 16 Autumn 2004


The sales manager of a firm randomly selected 10 sales representatives. He gathered data on the
number of calls and the number of units sold by each representative in one month, which is as
follows:

Sales Representative Number of calls made Number of units sold


1 14 28
2 35 66
3 22 38
4 29 70
5 6 22
6 15 27
7 17 28
8 20 47
9 12 14
10 29 68

Plot the data in a scatter diagram. Based on the scatter diagram, what observations can you
make?

Answer No. 16 Autumn 2004


The scatter diagram shows that units sold have positive strong relationship with number of calls
made i.e. the more the number of calls the more the units sold.


Question No. 17 Autumn 2004


A farmer wishes to predict the number of tons per acre of crop which will result from a given
number of applications of fertilizer. Data collected is given below:

Fertilizer applications 1 2 4 5 6 8 10
Tons of crop per acre 2 3 4 7 12 10 7

Find a suitable linear regression relationship to help the farmer in making the required prediction
and from your result predict the number of tons per acre of crop from 7 applications of fertilizer.

Answer No. 17 Autumn 2004

Fertilizer applications (x)   Tons per acre (y)   xy   x2
1 2 2 1
2 3 6 4
4 4 16 16
5 7 35 25
6 12 72 36
8 10 80 64
10 7 70 100
36 45 281 246

Regression line of y upon x


y = a + bx


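$$ \text{where } b = \frac{n\sum xy - \sum x \sum y}{n\sum x^2 - (\sum x)^2} = \frac{7(281) - 36(45)}{7(246) - (36)^2} = \frac{1967 - 1620}{1722 - 1296} = \frac{347}{426} = 0.81 $$

$$ a = \bar{y} - b\bar{x} = \frac{45}{7} - 0.81\left(\frac{36}{7}\right) = 6.43 - 4.17 = 2.26 $$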

y = 2.26 + 0.81x

At x = 7, y = 2.26 + 0.81 (7) = 2.26 + 5.67 = 7.93 say 8

Question No. 18 Autumn 2004


The sales manager of a firm randomly selected 10 sales representatives. He gathered data on the
number of calls and the number of units sold by each representative in one month, which is as
follows:

Sales Representative Number of calls made Number of units sold


1 14 28
2 35 66
3 22 38
4 29 70
5 6 22
6 15 27
7 17 28
8 20 47
9 12 14
10 29 68

Plot the data in a scatter diagram. Based on the scatter diagram, what observations can you
make?

Answer No. 18 Autumn 2004


The scatter diagram shows that units sold have positive strong relationship with number of calls
made i.e. the more the number of calls the more the units sold.


Question No. 19 Spring 2004


a) The following table shows the data on shelf space allotment (x) and sales (y):

Space allotted Sales


(sq. feet-x) (Number of boxes-y)
2 20
4 36
6 38
8 38
10 52
12 54

i) Determine the equation of least square regression line of y on x.


ii) Using the above equation, estimate sales for a space allotment of 7 square feet.

b) A computer, while calculating the correlation co-efficient between two variables X and Y
from 25 pairs of observations, obtained the following sums:

∑x = 125
∑x2 = 650
∑y = 100
∑y2 = 460
∑xy = 508

The following mistakes were discovered at the time of checking:

Wrong values recorded     Correct values to be recorded
X      Y                  X      Y
6      14                 8      12
8      6                  6      8

Find out the correct value of the coefficient of correlation.

Answer No. 19 Spring 2004


a) Space allotted sales

x y xy x2
2 20 40 4
4 36 144 16
6 38 228 36
8 38 304 64
10 52 520 100
12 54 648 144
Total 42 238 1884 364

Regression line of y upon x


y = a + bx

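$$ \text{where } b = \frac{n\sum xy - \sum x \sum y}{n\sum x^2 - (\sum x)^2} = \frac{6(1884) - 42(238)}{6(364) - (42)^2} = \frac{11304 - 9996}{2184 - 1764} = \frac{1308}{420} = 3.1 $$

$$ a = \frac{\sum y - b\sum x}{n} = \bar{y} - b\bar{x} = \frac{238}{6} - \frac{3.1(42)}{6} $$

= 39.67 – 21.70 = 18 (rounded)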

Required line y = 18 + 3.1x


when x = 7
y = 18 + 3.1 (7) = 18 + 21.7 = 39.7

b) ∑x = 125, ∑x2 = 650, ∑y = 100, ∑y2 = 460, ∑xy = 508

Corrected sums

∑x =125 – 6 – 8 + 8 + 6 = 125
∑y =100 – 14 – 6 + 12 + 8 =100
∑x2 = 650 – 36 – 64 + 64 + 36 = 650
∑y2 =460 – 196 – 36 + 144 + 64 = 436
∑xy = 508 – 84 – 48 + 96 + 48 = 520

$$ r = \frac{n\sum xy - \sum x \sum y}{\sqrt{[n\sum x^2 - (\sum x)^2][n\sum y^2 - (\sum y)^2]}} = \frac{25(520) - 125(100)}{\sqrt{[25(650) - (125)^2][25(436) - (100)^2]}} $$

$$ = \frac{13000 - 12500}{\sqrt{(16250 - 15625)(10900 - 10000)}} = \frac{500}{\sqrt{(625)(900)}} = \frac{500}{750} = 0.67 $$
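The correction procedure (remove the contribution of each wrongly recorded pair from every sum, add the contribution of each correct pair, then recompute r) can be sketched as below. The dictionary layout and the helper `adjust` are illustrative choices, not part of the original solution.

```python
import math

n = 25
sums = {"x": 125, "y": 100, "x2": 650, "y2": 460, "xy": 508}
wrong_pairs = [(6, 14), (8, 6)]      # pairs as wrongly recorded
correct_pairs = [(8, 12), (6, 8)]    # pairs that should have been recorded


def adjust(totals, pairs, sign):
    """Add (sign=+1) or remove (sign=-1) the contribution of each (x, y) pair."""
    for x, y in pairs:
        totals["x"] += sign * x
        totals["y"] += sign * y
        totals["x2"] += sign * x * x
        totals["y2"] += sign * y * y
        totals["xy"] += sign * x * y


adjust(sums, wrong_pairs, -1)
adjust(sums, correct_pairs, +1)

numerator = n * sums["xy"] - sums["x"] * sums["y"]
denominator = math.sqrt((n * sums["x2"] - sums["x"] ** 2) * (n * sums["y2"] - sums["y"] ** 2))
r = numerator / denominator
print(sums, round(r, 2))
```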

Question No. 20 Autumn 2003


a) The following data is available for sales volume and total costs during a period of four
quarters:

Quarter Sale Volume in million Total costs in million (Rs.)


1 5 142
2 6 137
3 7 149
4 5 129

Required:

By using regression analysis, find out the following:


i) What is the best estimate of the variable cost per unit (to the nearest Rupee)?
ii) What is the best estimate of the fixed cost per quarter (to the nearest Rupees in
million).

b) For the following two sets of bivariate data, the regression lines for each set are,
respectively:

i) y = 1.94x + 10.83 (y on x) and


x = 0.15y + 6.18 (x on y)

ii) y = – 1.96x + 15 (y on x) and


x = – 0.45y + 7.16 (x on y)

Required:
Find the product moment coefficient of correlation in each case.

Answer No. 20 Autumn 2003


a)
Quarter Sales Volume Total costs
Million in million Rs.
(x) (y) xy x2
1 6 142 852 36


2 5 137 685 25
3 7 149 1043 49
4 5 129 645 25
Total 23 557 3225 135

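The regression line is y = a + bx, where b is the variable cost per unit and a is the fixed cost per quarter.

$$ b_{yx} = \frac{n\sum xy - \sum x \sum y}{n\sum x^2 - (\sum x)^2} = \frac{4(3225) - 23(557)}{4(135) - (23)^2} = \frac{12900 - 12811}{540 - 529} = \frac{89}{11} = 8 \text{ (rounded)} $$

a = ȳ – b x̄ where ȳ = 557/4 = 139.25 and x̄ = 23/4 = 5.75
a = 139.25 – 8 (5.75) = 139.25 – 46.00 = 93.25 = 93 (rounded)

Hence the variable cost is about Rs. 8 per unit and the fixed cost is about Rs. 93 million per quarter.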

b) (i) Regression coefficient in y on x (byx) = 1.94
Regression coefficient in x on y (bxy) = 0.15
Product moment coefficient of correlation
r = √(byx × bxy) = √((1.94) × (0.15)) = 0.54

(ii) Regression coefficient in y on x (byx) = – 1.96
Regression coefficient in x on y (bxy) = – 0.45
Product moment coefficient of correlation
r = – √(byx × bxy) = – √((–1.96) × (–0.45)) = – 0.94
(r is negative because both regression coefficients are negative.)

Question No. 21 Spring 2003


a) A firm trains employees to use a statistical software package. A random sample of trainees
turned in the following performance:

Trainee Hours of Training Number of Errors


(x) (y)
A 1 6
B 4 3
C 6 2
D 8 1
E 2 5
F 3 4
G 1 7
i) Determine the least square regression line of y on x.
ii) Interpret the coefficient of regression.
iii) Predict the number of errors for a person with 5 hours of training.

b) The research director of a bank collected 24 observations of mortgage interest rates (x)
and number of house sale (y) at each interest rate. The director computed.


∑x = 276, ∑y = 768, ∑xy = 8690


∑x2 = 3,300 and ∑y2 = 25,000

Compute the correlation coefficient between x and y.

Answer No. 21 Spring 2003


a)

Trainee Hours of Training Number of Errors


(x) (y) xy x2
A 1 6 6 1
B 4 3 12 16
C 6 2 12 36
D 8 1 8 64
E 2 5 10 4
F 3 4 12 9
G 1 7 7 1
25 28 67 131

i) Regression line of y upon x


y = a + bx

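$$ \text{where } b = \frac{n\sum xy - \sum x \sum y}{n\sum x^2 - (\sum x)^2} = \frac{7(67) - 25(28)}{7(131) - (25)^2} = \frac{469 - 700}{917 - 625} = \frac{-231}{292} = -0.79 $$

$$ a = \bar{y} - b\bar{x} = \frac{28}{7} - \frac{(-0.79)(25)}{7} = 4 + 2.83 = 6.83 $$

regression line y = 6.83 – 0.79x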

ii) Regression coefficient byx = – 0.79


This means that with every additional hour of training, the expected number of errors decreases by about 0.79

iii) Number of errors with 5 hours of training


y = 6.83 – 0.79 (5) = 6.83 – 3.95 = 2.88 say 3

b) Given ∑x = 276, ∑y = 768, ∑xy = 8690


∑x2 = 3300, ∑y2 = 25000, n = 24

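$$ \text{Correlation coefficient } r = \frac{n\sum xy - \sum x \sum y}{\sqrt{[n\sum x^2 - (\sum x)^2][n\sum y^2 - (\sum y)^2]}} = \frac{24(8690) - 276(768)}{\sqrt{[24(3300) - (276)^2][24(25000) - (768)^2]}} $$

$$ = \frac{208560 - 211968}{\sqrt{(79200 - 76176)(600000 - 589824)}} = \frac{-3408}{\sqrt{(3024)(10176)}} = \frac{-3408}{5547.27} = -0.61 $$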

Question No. 22 Autumn 2002


a) Calculate the equation of the least squares regression line of y on x from the following
data:

X 1 3 3 4 5 5
Y 5 3 2 2 0 1
b) Five students were given following marks in a general knowledge competition by two
different judges:

Student Name Ali Adil Asif Ahmed Ayub


Marks given by Judge A 70 92 80 65 70
Marks given by Judge B 54 43 43 67 64

Required: Calculate Spearman's Rank Correlation Coefficient.

Answer No. 22 Autumn 2002


a)
x y xy x2
1 5 5 1
3 3 9 9
3 2 6 9
4 2 8 16
5 0 0 25
5 1 5 25
21 13 33 85

In least square regression


y = a + bx

For the values of a & b


∑y = na + b∑x
∑xy = a∑x + b∑x2

putting the values of x & y

13 = 6a + 21b__(i)
33 = 21 a + 85b_(ii)

Multiplying (i) by 7 and (ii) by 2, then subtracting (i) from (ii)

 66 = 42a + 170b
 91 = 42a + 147b
–25 = 23b

b = – 25/23 = – 1.09

Putting value of b in (i)

13 = 6a + 21 (–1.09) = 6a – 22.89
6a = 13 + 22.89 = 35.89
a = 5.98 say 6.00
Line of regression y = 6 – 1.09x

b)
Student Marks Marks Rank Rank
Name A B A B d d2
Ali 70 54 3.5 3 0.5 0.25
Adil 92 43 1 4.5 – 3.5 12.25
Asif 80 43 2 4.5 – 2.5 6.25
Ahmed 65 67 5 1 4.0 16.00
Ayub 70 64 3.5 2 1.5 2.25
37.00

$$ \text{Coefficient of rank correlation } \rho = 1 - \frac{6\sum d^2}{n(n^2-1)} = 1 - \frac{6(37)}{5(25-1)} = 1 - \frac{222}{120} = 1 - 1.85 = -0.85 $$

Question No. 23 Spring 2002


a) Compute two lines of regression from the following data:

r = 0.68, x̄ = 68, ȳ = 52, Sx = 5.12, Sy = 5.6

b) An equal number of families from eight different cities of various sizes were asked
how much money they spend on food, clothing and housing per year. The data on city
sizes and average family expenditures are given below:

City size (000) 30 50 75 100 150 200 175 120


Expenditure (Rs. 000) 65 77 79 80 82 90 84 81

i) Compute the correlation coefficient r.


ii) Compute the coefficient of determination and interpret the result.

Answer No. 23 Spring 2002

a) Given r = 0.68, x̄ = 68, ȳ = 52, Sx = 5.12, Sy = 5.6

Regression line of y upon x


$$(y - \bar{y}) = \frac{r S_y}{S_x}(x - \bar{x})$$

$$(y - 52) = \frac{(0.68)(5.6)}{5.12}(x - 68)$$
y – 52 = 0.74(x – 68)
y – 52 = 0.74x – 50.32

y = 0.74x – 50.32 + 52 = 0.74x + 1.68

Regression line of x upon y

$$(x - \bar{x}) = \frac{r S_x}{S_y}(y - \bar{y})$$

$$(x - 68) = \frac{(0.68)(5.12)}{5.6}(y - 52)$$
x – 68 = 0.62 (y – 52)
x = 0.62y – 32.24 + 68
x = 0.62y + 35.76
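Both lines can be reproduced with a short Python sketch (the function name is illustrative). Without rounding the slopes to two decimals first, the intercepts come out as about 1.43 and 35.67 rather than 1.68 and 35.76; the difference is purely due to rounding.

```python
def regression_lines(r, mean_x, mean_y, sd_x, sd_y):
    """Return (intercept, slope) for y on x and for x on y."""
    b_yx = r * sd_y / sd_x
    b_xy = r * sd_x / sd_y
    a_yx = mean_y - b_yx * mean_x
    a_xy = mean_x - b_xy * mean_y
    return (a_yx, b_yx), (a_xy, b_xy)

(a_yx, b_yx), (a_xy, b_xy) = regression_lines(0.68, 68, 52, 5.12, 5.6)
print(round(b_yx, 2), round(a_yx, 2))  # 0.74 1.43
print(round(b_xy, 2), round(a_xy, 2))  # 0.62 35.67
```

The product of the two slopes equals r², which is a quick consistency check on any pair of regression lines.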

b) City Size Expenditure

x y xy x2 y2
30 65 1950 900 4225
50 77 3850 2500 5929
75 79 5925 5625 6241
100 80 8000 10000 6400
150 82 12300 22500 6724
200 90 18000 40000 8100
175 84 14700 30625 7056
120 81 9720 14400 6561
900 638 74445 126550 51236

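i)

$$r = \frac{n\sum xy - \sum x\sum y}{\sqrt{[n\sum x^2 - (\sum x)^2][n\sum y^2 - (\sum y)^2]}} = \frac{8(74445) - 900(638)}{\sqrt{[8(126550) - (900)^2][8(51236) - (638)^2]}}$$

$$= \frac{595560 - 574200}{\sqrt{(1012400 - 810000)(409888 - 407044)}} = \frac{21360}{\sqrt{(202400)(2844)}} = \frac{21360}{23992.20} = 0.89$$
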
ii) The coefficient of determination is the square of the coefficient of correlation:

r² = (0.89)² = 0.7921

This means that about 79.21% of the variation in family expenditure is explained by its
linear relationship with city size. The remaining 100 – 79.21 = 20.79% is due to factors
other than city size, which might include inflation.


Question No. 24 Spring 2001


a) What will be the value of the standard error if all the observed values fall on the regression line?
b) While estimating the value of Y through a regression line of Y on X, we find that

at x = 0; y = 8.25
and at x = 3; y = 12
Find the value of,
i) a (y intercept)
ii) byx (Regression Coefficient)
iii) What does byx represent

c) In (b), find the value of ∑y if ∑x = 80 and n = 10
d) Compute the coefficient of correlation if byx = 9.5 and bxy = 0.09
e) What is the relationship between ryx and rxy?
f) Draw a scatter diagram showing a negative linear relationship

Answer No. 24 Spring 2001


a) If all the observed values fall on the regression line, the standard error is zero.
b) In the regression line y = a + bx
If x = 0, y = a = 8.25_________(i)
If x = 3, y = a + 3b = 12______(ii)

Subtracting (i) from (ii)

3b = 12 – 8.25 = 3.75
b = 1.25
i) a (y intercept) = 8.25
ii) byx (Regression coefficient) = 1.25
iii) byx represents the change in variable y for a one-unit change in variable x
c) ∑y if ∑x = 80 & n = 10
∑y = na + b∑x
= 10 (8.25) + 1.25 (80) = 82.5 + 100 = 182.5
d) r = √(byx × bxy) = √(9.5 × 0.09) = √0.855 = 0.92
e) The correlation coefficient is symmetric in the two variables, i.e. rxy = ryx
f) A scatter diagram of a negative linear relationship shows the points lying close to a
straight line that slopes downward from left to right.
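The arithmetic in parts (b) to (d) can be checked with a few lines of Python. This is only a sketch; the variable names are illustrative and not part of the original answer.

```python
from math import sqrt

# Part (b): the line passes through (0, 8.25) and (3, 12).
x1, y1 = 0, 8.25
x2, y2 = 3, 12
b = (y2 - y1) / (x2 - x1)   # regression coefficient byx
a = y1 - b * x1             # y intercept

# Part (c): the first normal equation, sum y = n*a + b*sum x.
n, sum_x = 10, 80
sum_y = n * a + b * sum_x

# Part (d): r is the geometric mean of the two regression coefficients,
# taking the positive root because both coefficients are positive.
r = sqrt(9.5 * 0.09)

print(a, b, sum_y, round(r, 2))  # 8.25 1.25 182.5 0.92
```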


Question No. 25 Autumn 2000


Obtain the least square regression line, y = a + bx. Given
n = 7, ∑x = 77, ∑y = 225
∑xy = 2506, ∑x2 = 863

Answer No. 25 Autumn 2000


y = a + bx
Regression equations
∑y = na + b∑x
∑xy = a∑x + b∑x2

225 = 7a + 77b_____________(i)
2506 = 77a + 863b__________(ii)

Multiplying (i) by 11 and subtracting from (ii)

2506 = 77a + 863b
– 2475 = – 77a – 847b
31 = 16b
b = 31/16 = 1.94

Putting this value in (i)

225 = 7a + 77(1.94) or 225 = 149.38 + 7a

7a = 225 – 149.38 = 75.62, a = 10.80
Hence, line of regression
y = 10.80 + 1.94x
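When only the totals are given, the two normal equations can be solved in closed form. The sketch below uses an illustrative helper name. Without intermediate rounding it gives b = 1.9375 and a ≈ 10.83; the value 10.80 above comes from substituting the rounded b = 1.94.

```python
def solve_normal_equations(n, sum_x, sum_y, sum_xy, sum_x2):
    """Solve sum y = n*a + b*sum x and sum xy = a*sum x + b*sum x^2 for (a, b)."""
    det = n * sum_x2 - sum_x ** 2
    a = (sum_y * sum_x2 - sum_x * sum_xy) / det
    b = (n * sum_xy - sum_x * sum_y) / det
    return a, b

a, b = solve_normal_equations(7, 77, 225, 2506, 863)
print(round(a, 2), b)  # 10.83 1.9375
```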

Question No. 26 Autumn 2000


A researcher compiled the following information to investigate the relationship between smoking
and lung cancer:

Country        Per capita cigarette     Deaths per 100,000
               consumption              from lung cancer
USA 1300 20
UK 1100 46
Finland 1100 35
Switzerland 510 25
Canada 500 15
Holland 490 24
Australia 480 18
Denmark 380 17
Sweden 300 11
Norway 250 9
Iceland 230 6

Compute r and r2 and describe what they mean.

Answer No. 26 Autumn 2000

Sr.   Country   Per capita cigarette   Deaths per 100,000      xy   y2   x2
No.             consumption (y)        from lung cancer (x)
1 U.S.A 1300 20 26000 1690000 400
2 UK 1100 46 50600 1210000 2116
3 Finland 1100 35 38500 1210000 1225
4 Switzerland 510 25 12750 260100 625
5 Canada 500 15 7500 250000 225
6 Holland 490 24 11760 240100 576
7 Australia 480 18 8640 230400 324
8 Denmark 380 17 6460 144400 289
9 Sweden 300 11 3300 90000 121
10 Norway 250 9 2250 62500 81
11 Iceland 230 6 1380 52900 36
Total 6640 226 169140 5440400 6018

$$r = \frac{n\sum xy - \sum x\sum y}{\sqrt{[n\sum x^2 - (\sum x)^2][n\sum y^2 - (\sum y)^2]}} = \frac{(11)(169140) - (226)(6640)}{\sqrt{[(11)(6018) - (226)^2][(11)(5440400) - (6640)^2]}}$$

$$= \frac{1860540 - 1500640}{\sqrt{(66198 - 51076)(59844400 - 44089600)}} = \frac{359900}{\sqrt{(15122)(15754800)}} = \frac{359900}{488102.5359}$$

r = 0.7373, r² = 0.5437


The value r = 0.7373 shows a fairly strong positive relationship between cigarette consumption
and deaths from lung cancer. The value r² = 0.5437 means that about 54.37% of the variation in
lung cancer death rates across these countries is explained by the variation in cigarette
consumption.
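The figures can be reproduced from the raw country data with a short Python sketch (the function name `pearson_r` is illustrative):

```python
from math import sqrt

def pearson_r(xs, ys):
    """Product-moment correlation coefficient from paired observations."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    return numerator / sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))

# Data in the order listed: USA, UK, Finland, ..., Iceland.
deaths = [20, 46, 35, 25, 15, 24, 18, 17, 11, 9, 6]
consumption = [1300, 1100, 1100, 510, 500, 490, 480, 380, 300, 250, 230]

r = pearson_r(deaths, consumption)
print(round(r, 4), round(r ** 2, 4))  # 0.7373 0.5437
```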

Question No. 27 Spring 2000


An organization while determining the relationship between number of responses to an
advertisement (y); size of advertisement in column inches (x1) and newspaper circulation
in '000' (x2) obtained the following data:

x1 x2 y
1 2 1
8 8 4
3 1 1
5 7 3
6 4 2
10 6 4

i) Derive the multiple regression equation.


ii) Calculate the number of responses to a 2 column inch advertisement in a newspaper
with a circulation of 15,000.

Answer No. 27 Spring 2000

x1 x2 y x1x2 x1y x2y x1² x2²

1 2 1 2 1 2 1 4
8 8 4 64 32 32 64 64
3 1 1 3 3 1 9 1
5 7 3 35 15 21 25 49
6 4 2 24 12 8 36 16
10 6 4 60 40 24 100 36
33 28 15 188 103 88 235 170

Multiple regression equations.

i) y = a + b1x1 + b2 x2
To solve for a, b1, b2 simultaneous equations are
∑y = na + b1∑x1 + b2∑x2
∑x1y = a∑x1 + b1∑x1² + b2∑x1x2
∑x2y = a∑x2 + b1∑x1x2 + b2∑x2²


putting the values

15 = 6a + 33b1 + 28b2_______________(i)
103 = 33a + 235b1 + 188b2___________(ii)
88 = 28a + 188b1 + 170b2____________(iii)

Multiplying (i) by 11 and (ii) by 2, then subtracting (i) from (ii)

206 = 66a + 470b1 + 376b2
– 165 = – 66a – 363b1 – 308b2
41 = 107b1 + 68b2__________(iv)

Multiplying (i) by 14 and (iii) by 3, then subtracting (i) from (iii)

264 = 84a + 564b1 + 510b2
– 210 = – 84a – 462b1 – 392b2
54 = 102b1 + 118b2 ____________(v)

Multiplying (iv) by 102 and (v) by 107, then subtracting (iv) from (v)

5778 = 10914b1 + 12626b2
– 4182 = – 10914b1 – 6936b2
1596 = 5690b2 _________________(vi)

b2 = 0.28

Putting this value in (iv)

41 = 107b1 +68(0.28)
107b1 = 41 – 19.04 = 21.96 or b1 = 0.21
Putting values of b1 & b2 in (i)
15 = 6a+ 33 (0.21) + 28 (0.28)
15 = 6a+ 6.93+ 7.84
15 – 14.77 = 6a = 0.23, a = 0.04
y = 0.04 + 0.21x1 + 0.28x2

ii) Number of responses at x1 = 2 and x2 = 15


y = 0.04 + 0.21 (2) + 0.28 (15)
= 0.04 + 0.42 + 4.2 = 4.66 or say 5
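The three normal equations can also be solved mechanically. The following Python sketch builds them from the data and solves the system by Gauss-Jordan elimination; the helper names are illustrative, not part of the original answer. Without intermediate rounding the coefficients are roughly a = 0.06, b1 = 0.20 and b2 = 0.28, and the estimate is about 4.68, which still rounds to 5 responses.

```python
def solve_linear_system(matrix, rhs):
    """Solve a small square system by Gauss-Jordan elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(map(float, row)) + [float(value)] for row, value in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda row: abs(aug[row][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(n):
            if row != col:
                factor = aug[row][col] / aug[col][col]
                aug[row] = [v - factor * p for v, p in zip(aug[row], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]

def dot(u, v):
    return sum(p * q for p, q in zip(u, v))

x1 = [1, 8, 3, 5, 6, 10]
x2 = [2, 8, 1, 7, 4, 6]
y = [1, 4, 1, 3, 2, 4]
n = len(y)

normal_matrix = [
    [n,       sum(x1),     sum(x2)],
    [sum(x1), dot(x1, x1), dot(x1, x2)],
    [sum(x2), dot(x1, x2), dot(x2, x2)],
]
normal_rhs = [sum(y), dot(x1, y), dot(x2, y)]

a, b1, b2 = solve_linear_system(normal_matrix, normal_rhs)
prediction = a + b1 * 2 + b2 * 15   # 2 column inches, circulation of 15 thousand
print(round(a, 2), round(b1, 2), round(b2, 2), round(prediction, 2))
```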

Question No. 28 Autumn 1999


a) if ∑x = 19, ∑y = 29, ∑xy = 126


∑x2 = 87, ∑y2 = 189, n = 5

Find the product moment correlation coefficient, r

b) Find the equation of the least square regression of y on x for the following data.

X 1 2 4 6 7 8 10
Y 10 14 12 13 15 12 13

Answer No. 28 Autumn 1999


a) Given ∑x = 19, ∑y = 29, ∑xy = 126
∑x2 = 87, ∑y2 = 189. n = 5
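$$r = \frac{n\sum xy - \sum x\sum y}{\sqrt{[n\sum x^2 - (\sum x)^2][n\sum y^2 - (\sum y)^2]}} = \frac{5(126) - 19(29)}{\sqrt{[(5)(87) - (19)^2][(5)(189) - (29)^2]}} = \frac{630 - 551}{\sqrt{(435 - 361)(945 - 841)}}$$

$$= \frac{79}{\sqrt{(74)(104)}} = \frac{79}{87.73} = 0.9$$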

b)
x y xy x2
1 10 10 1
2 14 28 4
4 12 48 16
6 13 78 36
7 15 105 49
8 12 96 64
10 13 130 100
38 89 495 270

The equations for least square regression line

∑y = na + b∑x
∑xy = a∑x + b∑x2
89 = 7a + 38b______________(i)
495 = 38a + 270b____________(ii)

Multiplying (i) by 38 and (ii) by 7, then subtracting (i) from (ii)


3465 = 266a + 1890b


3382 = 266a +1444b
83 = 446b

b = 0.186
Putting it in (i)
89 = 7a + 38 (0.186)
89 = 7a + 7.07 or 7a = 89 – 7.07 = 81.93, a = 11.704

Equation of Regression line


y= 11.704 + 0.186x

Question No. 29 March 1999


i) Calculate the coefficient of correlation for the following data:

Annual percentage increase in     Annual percentage increase in
advertising expenditure (X)       sales revenue (Y)
1 1
3 2
4 4
6 4
8 5
9 7
11 8
14 9

ii) What is the purpose of finding the correlation coefficient and what does its value
indicate in respect of the above data on advertising expenditure and sales revenue?

Answer No. 29 March 1999


i)

x y xy x2 y2
1 1 1 1 1
3 2 6 9 4
4 4 16 16 16
6 4 24 36 16
8 5 40 64 25
9 7 63 81 49
11 8 88 121 64
14 9 126 196 81
Total 56 40 364 524 256

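Coefficient of correlation

$$r = \frac{n\sum xy - \sum x\sum y}{\sqrt{[n\sum x^2 - (\sum x)^2][n\sum y^2 - (\sum y)^2]}} = \frac{8 \times 364 - 56 \times 40}{\sqrt{[8 \times 524 - (56)^2][8 \times 256 - (40)^2]}}$$

$$= \frac{2912 - 2240}{\sqrt{[4192 - 3136][2048 - 1600]}} = \frac{672}{\sqrt{(1056)(448)}} = \frac{672}{687.81} = 0.98$$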

ii) The coefficient of correlation measures the degree of linear interdependence between
two variables, i.e. how closely a change in one variable is accompanied by a change in
the other.

In this particular question r = 0.98, which indicates a very strong positive relationship.
Since (0.98)² = 0.9604, about 96% of the variation in sales revenue is associated with the
variation in advertising expenditure, or vice versa.

Question No. 30 Section II September 1998


x = output in thousands of unit,
y = profit per unit of output (in Rs.)

X 5 7 9 11 13 15
Y 1.7 2.4 2.8 3.4 3.7 4.4

i) Calculate the values of 'a' and 'b' for the equation


y = a + bx to show the regression of y on x.

ii) Estimate from this equation the profit per unit on an output of 10500 units.

Answer No. 30 Section II September 1998

y x xy x2
1.7 5 8.5 25
2.4 7 16.8 49
2.8 9 25.2 81
3.4 11 37.4 121
3.7 13 48.1 169
4.4 15 66.0 225
Total 18.4 60 202.0 670
i) y = a + bx
∑y = na + b∑x


∑xy = a∑x + b∑x2


18.4 = 6a + 60b ___________(i)
202.0 = 60a + 670 b_________(ii)

Multiplying (i) by 10 and subtracting from (ii)

202.0 = 60a + 670b
– 184.0 = – 60a – 600b
18 = 70b

b = 18/70 = 0.26

Putting value of b in (i)


18.4 = 6a + 60 (0.26) = 6a + 15.6
6a = 2.8, a = 0.47
y = 0.47 + 0.26 x

ii) Profit at 10500 units


y =0.47 + 0.26(10.5)
= 0.47 + 2.73 = 3.20

Question No. 31 Section II April 1998


The following data give the earning per share of a company.

Year                1991 1992 1993 1994 1995 1996 1997
Earning per share   2.15 2.39 2.74 2.75 3.00 3.35 3.66

Find the least squares equation of straight line and estimate the earning per share for the year
2000.

Answer No. 31 Section II April 1998


Year Earning per share x (year – 1994) xy x2
1991 2.15 -3 -6.45 9
1992 2.39 -2 -4.78 4
1993 2.74 -1 -2.74 1
1994 2.75 0 0 0
1995 3.00 1 3.00 1
1996 3.35 2 6.70 4
1997 3.66 3 10.98 9
Total 20.04 0 6.71 28
The line by least square
y = a + bx and


The simultaneous equations are

∑y = na + b∑x
∑xy = a∑x + b∑x2
20.04 = 7a, So a = 2.86
6.71 = 28b, b = 0.24

Line y = 2.86 + 0.24x


Earning per share for year 2000
y = 2.86 + 0.24 (6) = 2.86 + 1.44 = Rs. 4.3
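Because the years are coded about the middle year so that ∑x = 0, the normal equations decouple into a = ∑y/n and b = ∑xy/∑x². A minimal Python sketch of this shortcut (variable names are illustrative):

```python
years = [1991, 1992, 1993, 1994, 1995, 1996, 1997]
eps = [2.15, 2.39, 2.74, 2.75, 3.00, 3.35, 3.66]

origin = 1994                          # middle year, so the coded x values sum to zero
x = [year - origin for year in years]

a = sum(eps) / len(eps)                                       # a = sum y / n
b = sum(xi * yi for xi, yi in zip(x, eps)) / sum(xi * xi for xi in x)  # b = sum xy / sum x^2

forecast_2000 = a + b * (2000 - origin)
print(round(a, 2), round(b, 2), round(forecast_2000, 2))  # 2.86 0.24 4.3
```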

Question No. 32 October 1997


The following results were obtained for a random sample of ten pairs of values,

x̄ = 5.0    s²x = 8.41    sxy = 3.96
ȳ = 3.5    s²y = 2.25

i) Calculate the Coefficient of correlation.


ii) Find the Regression equation of y on x and estimate y for x = 10

Answer No. 32 October 1997


Given x̄ = 5.0, s²x = 8.41, sxy = 3.96
ȳ = 3.5, s²y = 2.25

i) Coefficient of correlation

$$r = \frac{S_{xy}}{S_x S_y} = \frac{S_{xy}}{\sqrt{S_x^2 S_y^2}} = \frac{3.96}{\sqrt{(8.41)(2.25)}} = \frac{3.96}{4.35} = 0.91$$

ii) Regression equation of y upon x

$$(y - \bar{y}) = \frac{r S_y}{S_x}(x - \bar{x})$$

sx = √s²x = √8.41 = 2.9

sy = √s²y = √2.25 = 1.5

$$y - 3.5 = \frac{(0.91)(1.5)}{2.9}(x - 5)$$

y = 0.47 (x – 5) + 3.5
y = 0.47x – 2.35 + 3.5 = 0.47x + 1.15

For x = 10, y = 0.47(10) + 1.15 = 5.85
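Since r·sy/sx is algebraically the same as sxy/s²x, the slope of y on x can be cross-checked directly from the given covariance and the variance of x. A short Python sketch, taking s²x = 8.41 and s²y = 2.25 as stated in the question (variable names are illustrative):

```python
from math import sqrt

mean_x, mean_y = 5.0, 3.5
var_x, var_y, cov_xy = 8.41, 2.25, 3.96

r = cov_xy / sqrt(var_x * var_y)   # coefficient of correlation
b_yx = cov_xy / var_x              # slope of y on x, equal to r * sy / sx
a = mean_y - b_yx * mean_x         # intercept
y_at_10 = a + b_yx * 10

print(round(r, 2), round(b_yx, 2), round(y_at_10, 2))  # 0.91 0.47 5.85
```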


Question No. 33 Section II April 1997


For six paired observation (x,y), where
x = Expenditures for research,
y = Annual profit.
n=6

The following computations were made.

∑x =30, ∑y =180
∑x2 = 200, ∑xy=1000

i) Determine regression equation of y on x.


ii) Estimate y when x = 8

Answer No. 33 Section II April 1997


Given ∑x = 30, ∑y = 180, ∑x2 = 200, ∑xy = 1000 and n = 6

Regression of y upon x;

i) y – ȳ = byx (x – x̄)

where ȳ = ∑y/n = 180/6 = 30

and x̄ = ∑x/n = 30/6 = 5

$$b_{yx} = \frac{n\sum xy - \sum x\sum y}{n\sum x^2 - (\sum x)^2} = \frac{6(1000) - 30(180)}{6(200) - (30)^2} = \frac{6000 - 5400}{1200 - 900} = \frac{600}{300} = 2$$

y – 30 = 2 (x – 5)
y = 2x – 10 + 30 = 20 + 2x

ii) When x = 8:
y = 20 + 2(8) = 20 + 16 = 36

x = 1430, 1190, 1280, 1270, 1310, 1380

∑x = 7860, x̄ = ∑x/n = 7860/6 = 1310


Question No. 34 Section II November 1996


The number of units of an item produced by a manufacturer and the cost per unit are recorded
below:

No. of units (000) x 1 3 5 10 12 15 24


Cost per unit (Rs.) y 55 52 48 36 32 30 25

i) Find regression equation of y on x.


ii) Estimate per unit cost when 20,000 units are produced.

Answer No. 34 Section II November 1996

No. of units     Cost per unit     xy     x2
"000" x          y
1 55 55 1
3 52 156 9
5 48 240 25
10 36 360 100
12 32 384 144
15 30 450 225
24 25 600 576
70 278 2245 1080

i) y = a + bx, Simultaneous equations will be


∑y = na + b∑x
∑xy = a∑x + b∑x2

Putting the values

278 = 7a + 70b_______(i)
2245 = 70a + 1080b___(ii)

Multiplying (i) by 10 and subtracting from (ii)

2245 = 70a + 1080b
– 2780 = – 70a – 700b
– 535 = 380b

b = – 535/380 = – 1.4, Putting this value in (i)


278 = 7a+ 70 (–1.4)
7a = 278 + 98 = 376, a = 53.7
Regression of y upon x is y = 53.7 – 1.4x

ii) Estimated per unit cost at 20000 units


y = 53.7 – 1.4 (20) = 53.7 – 28


y = Rs. 25.7 or say Rs. 26/-

Question No. 35 Section II (Part 1) May 1996


a) What is a Time Series? Describe its components.
b) For the data of heights and weights of 5 men:

Height (x). 64 68 70 72 74
Weight (y). 160 170 180 190 195

i) Establish a least squares equation of Regression of y on x.


ii) Calculate Coefficient of Correlation.

Answer No. 35 Section II (Part 1) May 1996

a) Time series:

The arrangement of data according to the time of occurrence at regular intervals of time
like hours, days, months or years is called time series. Examples of time series are hourly
temperature recorded at a locality for a specific period, the production of fertilizer of
certain kind at Pak-Saudi Fertilizer Plant at Multan, the enrolment of students appearing
in C.A final examination over past few years etc.

Component of a Time Series:

A time series might be composed of four basic types of movements which are called its
components. These are;

i) Secular trend, which is the long-term movement and persists for a long period,
normally not less than a decade.
ii) Seasonal variations, which are mainly due to the change of seasons and are short-term
movements; the fluctuations repeat within a year or a shorter period.
iii) Cyclic variations. These tend to occur in a more or less regular pattern over a
period of a certain number of years, fluctuating from a peak to a lowest point and
then back to a peak. It is like a business cycle with a duration of
3, 5, 7 etc. years.
iv) Irregular variations, which are also called random or accidental fluctuations, are
unsystematic in nature and arise from events like floods, strikes, earthquakes, wars and
some political events. The study of these variations is somewhat difficult.

The net effect of all the four components is either additive or multiplicative:


y = T (Trend) + S (Seasonal) + C (Cyclic) + I (Irregular)

or y = T × S × C × I
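To illustrate the difference between the two models, the sketch below combines made-up component values for four quarters. All numbers are hypothetical and serve only to show the mechanics: in the additive model the seasonal, cyclic and irregular components are amounts in the units of y, while in the multiplicative model they are ratios around 1.

```python
# Hypothetical component values for four quarters (illustrative only, not from any question).
trend = [100.0, 102.0, 104.0, 106.0]

# Additive model: S, C and I are amounts added to the trend.
seasonal = [5.0, -3.0, -6.0, 4.0]
cyclic = [2.0, 1.0, -1.0, -2.0]
irregular = [-1.0, 0.5, 0.0, 1.5]
additive = [t + s + c + i for t, s, c, i in zip(trend, seasonal, cyclic, irregular)]

# Multiplicative model: S, C and I are index ratios applied to the trend.
seasonal_idx = [1.05, 0.97, 0.94, 1.04]
cyclic_idx = [1.02, 1.01, 0.99, 0.98]
irregular_idx = [0.99, 1.005, 1.0, 1.015]
multiplicative = [
    t * s * c * i
    for t, s, c, i in zip(trend, seasonal_idx, cyclic_idx, irregular_idx)
]

print(additive)  # [106.0, 100.5, 97.0, 109.5]
print([round(v, 2) for v in multiplicative])
```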

b) i)

Height (x) Weight (y) xy x2 y2


64 160 10240 4096 25600
68 170 11560 4624 28900
70 180 12600 4900 32400
72 190 13680 5184 36100
74 195 14430 5476 38025
348 895 62510 24280 161025

y = a + bx

The simultaneous equations are

∑y = na + b∑x or 895 = 5a + 348b______________(i)


∑xy = a∑x + b∑x2 or 62510 = 348a + 24280b_____(ii)

Multiplying (i) by 348 and (ii) by 5, then subtracting (i) from (ii)
312550 = 1740a + 121400b
– 311460 = – 1740a – 121104b
1090 = 296b
b = 3.68
Putting this value in (i)

895 = 5a + 348 (3.68), 895 = 5a + 1280.64


5a = – 1280.64 + 895 = – 385.64, a = – 77.13
y = – 77.13 + 3.68x

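ii) Coefficient of correlation r

$$r = \frac{n\sum xy - \sum x\sum y}{\sqrt{[n\sum x^2 - (\sum x)^2][n\sum y^2 - (\sum y)^2]}} = \frac{5(62510) - (895)(348)}{\sqrt{[5(24280) - (348)^2][5(161025) - (895)^2]}}$$

$$r = \frac{312550 - 311460}{\sqrt{[121400 - 121104][805125 - 801025]}} = \frac{1090}{\sqrt{(296)(4100)}} = \frac{1090}{1101.6} = 0.99$$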


Question No. 36 Section II May 1996


A manufacturing company has 9 years data of its advertisement cost and sales.

Sales y (Rs. 100,000)               45 51 56 52 61 60 51 63 65
Advertisement cost x (Rs. 1000)     33 40 43 47 50 52 56 63 66

On the basis of these data, it wanted to know, how much sales it could do if it spends Rs. 73,000
on advertisement?

Answer No. 36 Section II May 1996


Given

Advertisement cost     Sales
(Rs. 1000) x           (Rs. 100,000) y     xy     x2
33 45 1485 1089
40 51 2040 1600
43 56 2408 1849
47 52 2444 2209
50 61 3050 2500
52 60 3120 2704
56 51 2856 3136
63 63 3969 3969
66 65 4290 4356
450 504 25662 23412

The line is y = a + bx

The linear equations are ∑y = na + b∑x


∑xy = a∑x + b ∑x2

504 = 9 a + 450 b_____________(i)


25662 = 450 a + 23412 b__________(ii)

Multiplying equation (i) by 50 and subtracting from (ii)

25662 = 450 a + 23412 b
– 25200 = – 450 a – 22500 b
462 = 912 b
b = 462/912 = 0.5

Putting this value in (i) 504 = 9a + 450 (0.5)


504 = 9a + 225, 9a = 504 – 225 = 279, a = 31


The line y = 31 + 0.5 x

If Advertising expenses are 73000 the sales would be

y = 31 + (0.5) (73)

y = 31 + 36.50 = 67.50 or 67.50 (100,000) = Rs. 6,750,000

Question No.37 Section II (Part 1) November 1995


The sales and profits of 7 steel companies were as follows:

Company A B C D E F G
Sales 5.7 6.7 0.2 0.6 3.8 12.5 0.5
Profit 0.27 0.12 0.00 0.04 0.05 0.46 0.00

Construct a least squares equation of regression to estimate profit.

Answer No. 37 Section II (Part 1) November 1995


Company Sales (x) Profit (y) xy x2
A 5.7 0.27 1.539 32.49
B 6.7 0.12 0.804 44.89
C 0.2 0.00 0 0.04
D 0.6 0.04 0.024 0.36
E 3.8 0.05 0.190 14.44
F 12.5 0.46 5.750 156.25
G 0.5 0.00 0 0.25
Total 30.00 0.94 8.307 248.72

The least square equation y = a + bx

∑y = na + b∑x
∑xy = a∑x + b∑x2

0.94 = 7a + 30b__________________(i)
8.307 = 30a + 248.72b ____________(ii)

Multiplying (i) by 30 and (ii) by 7 and subtracting (i) from (ii)

58.149 = 210a + 1741.040b


28.200 = 210a + 900b
29.949 = 841.04b


b = 0.036

Putting this value in (i)

0.94 = 7a + 30 (0.036) 0.94 = 7a + 1.08


7a = 0.94 – 1.08 = – 0.14 or a = – 0.02
The profit line y = – 0.02 + 0.036x

Question No. 38 Section II (Part 1) April 1995


a) What does a coefficient of correlation measure? It lies between what two values?
b) A sample of paired observations is given below:

X 2 3 4 5 6 7 8
y 2 8 11 9 19 14 14

i) Determine the regression equation of y on x.


ii) Verify that the sum of deviations of the observed values of y from their estimated
values is zero.

Answer No. 38 Section II (Part 1) April 1995


a) The correlation coefficient measures the degree of linear interdependence between two
variables or quantities. If two quantities x and y are so connected that a change in one
is accompanied by a change in the other, the correlation coefficient expresses the strength
and direction of that relationship. For example, a change in the height of a teenager is
usually accompanied by a change in weight. The relationship may be positive or negative.

The coefficient of correlation varies from – 1 to + 1, i.e. – 1 ≤ r ≤ + 1

b) i)

x y xy x² ŷ y – ŷ
2 2 4 4 5 -3
3 8 24 9 7 +1
4 11 44 16 9 +2
5 9 45 25 11 -2
6 19 114 36 13 +6
7 14 98 49 15 -1
8 14 112 64 17 -3
35 77 441 203 0

x̄ = ∑x/n = 35/7 = 5, ȳ = ∑y/n = 77/7 = 11


The line of regression of y upon x


y – ȳ = byx (x – x̄)

Where byx is the regression coefficient of y upon x

$$b_{yx} = \frac{n\sum xy - \sum x\sum y}{n\sum x^2 - (\sum x)^2} = \frac{7(441) - (35)(77)}{7(203) - (35)^2} = \frac{3087 - 2695}{1421 - 1225} = \frac{392}{196} = 2.0$$

y – 11 = 2(x – 5), y – 11 = 2x – 10
y = 2x – 10 + 11, y = 1 + 2x

ii) From the table above, ∑ (y – ŷ) = 0, i.e. the deviations of the observed values of y from their estimated values sum to zero.
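The verification can be repeated in a few lines of Python, using the fitted line y = 1 + 2x from part (i). The variable names are illustrative.

```python
xs = [2, 3, 4, 5, 6, 7, 8]
ys = [2, 8, 11, 9, 19, 14, 14]

a, b = 1, 2                                   # fitted line y = 1 + 2x from part (i)
fitted = [a + b * x for x in xs]              # estimated values of y
residuals = [y - f for y, f in zip(ys, fitted)]

print(fitted)          # [5, 7, 9, 11, 13, 15, 17]
print(residuals)       # [-3, 1, 2, -2, 6, -1, -3]
print(sum(residuals))  # 0
```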
