
STATISTICS

INTRODUCTION
➢ Statistics is the area of science that deals with the collection,
organization, analysis, and interpretation of data.
➢ It also deals with methods and techniques for drawing conclusions
about the characteristics of a large number of data points, commonly
called a population, by using a smaller subset of the entire data.
A TAXONOMY OF STATISTICS
DATA REPRESENTATION

There are two types of data representation:

1. Graphical representation

2. Numerical representation
NUMERICAL REPRESENTATION
A fundamental concept in summary statistics is that of a central value for a set of
observations and the extent to which the central value characterizes the whole set of data.
Measures of central value such as the mean or median must be coupled with measures of
data dispersion.
MEASURES OF CENTRAL TENDENCY

The measures of central tendency are the arithmetic mean (A.M.), geometric mean (G.M.),
harmonic mean (H.M.), median, and mode.
ARITHMETIC MEAN

Individual data: $\bar{X} = \dfrac{\sum X_i}{n}$

Grouped data: $\bar{X} = \dfrac{\sum f_i X_i}{n}$, where $n = \sum f_i$
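The two formulas above can be sketched in Python as follows. This is only a minimal illustration: the function names `mean_individual` and `mean_grouped` and the sample values are assumptions made for this sketch, not part of the slides.

```python
def mean_individual(xs):
    """Arithmetic mean of individual observations: sum(X_i) / n."""
    return sum(xs) / len(xs)


def mean_grouped(values, freqs):
    """Arithmetic mean of grouped data: sum(f_i * X_i) / sum(f_i)."""
    total_freq = sum(freqs)  # n is the total frequency
    return sum(f * x for x, f in zip(values, freqs)) / total_freq


# Hypothetical sample values, chosen only for illustration.
print(mean_individual([15, 10, 5]))        # 10.0
print(mean_grouped([1, 2, 3], [2, 1, 1]))  # (2 + 2 + 3) / 4 = 1.75
```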
DISPERSION
❖ Variability (or dispersion) measures the amount of scatter in a
dataset.
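❖ Commonly used measures: range, variance, standard deviation,
interquartile range, coefficient of variation, etc.

Variance (individual data): $\sigma^2 = \dfrac{\sum X_i^2}{n} - \left(\dfrac{\sum X_i}{n}\right)^2$

Variance (grouped data): $\sigma^2 = \dfrac{\sum f_i X_i^2}{n} - \left(\dfrac{\sum f_i X_i}{n}\right)^2$

Standard deviation: $\sigma = \sqrt{\sigma^2}$, the positive square root of the variance

Coefficient of variation: $\text{C.V.} = \dfrac{\sigma}{\bar{x}} \times 100$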
EXAMPLES

1.) Standard deviation of the data 15, 10, 5 is

X     15    10    5     ΣX = 30
X²    225   100   25    ΣX² = 350

$\bar{x} = \dfrac{\sum x}{n} = \dfrac{30}{3} = 10$

$\sigma = \sqrt{\dfrac{\sum x^2}{n} - \bar{x}^2} = \sqrt{\dfrac{350}{3} - 100} = 4.082$
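The hand computation above can be checked with a short Python sketch. It assumes the population form of the variance (mean of squares minus square of the mean); the function names are illustrative choices, not from the slides.

```python
import math


def variance(xs):
    """Population variance: sum(x^2)/n - (sum(x)/n)^2."""
    n = len(xs)
    mean = sum(xs) / n
    return sum(x * x for x in xs) / n - mean ** 2


def std_dev(xs):
    """Standard deviation: positive square root of the variance."""
    return math.sqrt(variance(xs))


def coeff_variation(xs):
    """Coefficient of variation: (sigma / mean) * 100."""
    return std_dev(xs) / (sum(xs) / len(xs)) * 100


data = [15, 10, 5]
print(round(std_dev(data), 3))          # 4.082
print(round(coeff_variation(data), 2))  # 40.82
```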
MOMENTS

Moments are of two kinds: central moments and raw moments.
CENTRAL MOMENTS

The r-th moment about the mean (central moment) is denoted $\mu_r$:

$\mu_r = \dfrac{1}{N} \sum f\,(x - \bar{x})^r$

In particular, $\mu_0 = 1$ and $\mu_1 = 0$.
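Relation between central and raw moments

Here $\mu_r'$ denotes the r-th raw moment, taken about an arbitrary value.

$\mu_2 = \mu_2' - (\mu_1')^2$

$\mu_3 = \mu_3' - 3\mu_2'\mu_1' + 2(\mu_1')^3$

$\mu_4 = \mu_4' - 4\mu_3'\mu_1' + 6\mu_2'(\mu_1')^2 - 3(\mu_1')^4$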
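The three relations translate directly into code. The sketch below is a minimal illustration: the function name `central_from_raw` and the input values are assumptions chosen for the example, not data from the slides.

```python
def central_from_raw(m1, m2, m3, m4):
    """Convert raw moments (about any value A) to central moments.

    m1..m4 are the first four raw moments mu_1' .. mu_4'.
    Returns (mu_2, mu_3, mu_4).
    """
    mu2 = m2 - m1 ** 2
    mu3 = m3 - 3 * m2 * m1 + 2 * m1 ** 3
    mu4 = m4 - 4 * m3 * m1 + 6 * m2 * m1 ** 2 - 3 * m1 ** 4
    return mu2, mu3, mu4


# Hypothetical raw moments, used only to show the arithmetic.
print(central_from_raw(1, 4, 10, 46))  # (3, 0, 27)
```

Note that the result does not depend on the value A about which the raw moments were taken; only the raw moments themselves enter the formulas.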
Skewness and Kurtosis

Skewness and kurtosis are used to analyse the shape of the graph of the distribution.

Coefficient of skewness: $\beta_1 = \dfrac{\mu_3^2}{\mu_2^3}$

Coefficient of kurtosis: $\beta_2 = \dfrac{\mu_4}{\mu_2^2}$
1. The first and second moments of the distribution about the value 2 are 1 and 16. Variance of the
distribution is

Here $\mu_2' = 16$, $\mu_1' = 1$

$\mu_2 = \mu_2' - (\mu_1')^2 = 16 - 1 = 15$
MCQ’S
1. Standard deviation of four numbers 9, 11, 13, 15 is
[A] 2 [B] 4 [C] 6 [D] 5

2. From the given information $\sum x = 235$, $\sum x^2 = 6750$, $n = 10$, the standard deviation of x is
[A] 11.08 [B] 13.08 [C] 8.08 [D] 7.6

3. Coefficient of variation of the data 1, 3, 5, 7, 9 is


[A] 54.23 [B] 56.57 [C] 55.41 [D] 60.19
4. The Standard deviation and arithmetic mean of the distribution are 12 and 45.5 respectively.
Coefficient of variation of the distribution is
[A] 26.37 [B] 32.43 [C] 12.11 [D] 22.15

5. The standard deviations and arithmetic means of three distributions x, y, z are as follows:

Distribution    Arithmetic mean    Standard deviation
x               18.0               5.4
y               22.5               4.5
z               24.0               6.0

The more stable distribution is
[A] x [B] y [C] z [D] x and z
6. The first and second moments of the distribution about the value 3 are 2 and 20. Second
moment about the mean is
[A] 12 [B] 14 [C] 16 [D] 20

7. The first three moments of a distribution about the value 5 are 2, 20, and 40. Third moment
about the mean is
[A] -64 [B] 64 [C] 32 [D] -32

8. The first four moments of a distribution about the value 5 are 2, 20, 40 and 50. Fourth
moment about the mean is
[A] 160 [B] 162 [C] 210 [D] 180
9. The first and second moments of the distribution about the value 2 are 1 and 16. Variance of
the distribution is
[A] 12 [B] 3 [C] 15 [D] 17

10. The second and third moments of a distribution about the arithmetic mean are 16 and -64
respectively. Coefficient of skewness $\beta_1$ is given by
[A] -0.25 [B] 1 [C] 4 [D] -1

11. The second and fourth moments of a distribution about the arithmetic mean are 16 and 162
respectively. Coefficient of kurtosis $\beta_2$ is given by
[A] 1 [B] 1.51 [C] 0.63 [D] 1.69
12. The arithmetic mean of four numbers is 16. If one item, 20, is replaced by 24, what is the new
arithmetic mean?
[A] 15 [B] 17 [C] 18 [D] 16

13. The first four moments of a distribution about the value 2 are -2, 12, -20 and 100. Fourth moment
about the mean is
[A] 200 [B] 190 [C] 170 [D] 180

14. The first three moments of a distribution about the value 2 are -2, 12, -20. Third moment
about the mean is
[A] 36 [B] 30 [C] 22 [D] 8
PREVIOUS UNIVERSITY THEORY QUESTIONS

1. The first three moments about the value 2 of a distribution are 1, 16 and -40. Find the
first three central moments, standard deviation and skewness of the distribution. (Dec 2019)

2. First four moments of a distribution about value 5 are 2, 20, 40 and 50. Obtain the
first four central moments, mean, S.D. and coefficients of skewness and kurtosis. (May 2019, May 2015)

3. First four moments of a distribution about value 4 are -1.5, 17, -30 and 108. Obtain
the first four central moments and coefficients of skewness and kurtosis. (Dec 2018, Nov 2016)

4. First four moments of a distribution about value 5 are -4, 22, -117 and 560. Obtain the
first four central moments and coefficients of skewness and kurtosis. (May 2018)

5. Find the first four moments about the mean for the following data. (Dec 2017)

X    0    1    2    3    4    5    6
F    5    15   17   25   19   14   5

6. First four moments of a distribution about value 5 are -1.5, 17, -30 and 108. Obtain
the first four central moments and coefficients of skewness and kurtosis. (May 2017)

7. First four moments of a distribution about value 30.2 are -0.255, 6.222, 30.211 and
400.25. Obtain the first four central moments and coefficients of skewness and
kurtosis. (Nov 2016)
CORRELATION

$r(x, y) = \dfrac{\operatorname{cov}(x, y)}{\sigma_x \sigma_y}$

$\operatorname{cov}(x, y) = \dfrac{1}{n} \sum (x - \bar{x})(y - \bar{y})$

Example 1: Calculate the coefficient of correlation between the marks obtained by 8 students in
Mathematics and Statistics from the following table.

Students           A    B    C    D    E    F    G    H
Mathematics (x)    25   30   32   35   37   40   42   45
Statistics (y)     8    10   15   17   20   22   24   25
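As a cross-check on this kind of problem, the definitions of covariance and correlation can be sketched in Python. The function names are illustrative assumptions; the sketch uses the equivalent shortcut form cov(x, y) = Σxy/n − x̄ȳ and the population standard deviations.

```python
import math


def covariance(xs, ys):
    """Population covariance: sum(x*y)/n - mean(x)*mean(y)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    return sum(x * y for x, y in zip(xs, ys)) / n - mean_x * mean_y


def correlation(xs, ys):
    """Correlation coefficient: cov(x, y) / (sigma_x * sigma_y)."""
    sigma_x = math.sqrt(covariance(xs, xs))  # cov(x, x) is Var(x)
    sigma_y = math.sqrt(covariance(ys, ys))
    return covariance(xs, ys) / (sigma_x * sigma_y)


maths = [25, 30, 32, 35, 37, 40, 42, 45]
stats = [8, 10, 15, 17, 20, 22, 24, 25]
print(round(covariance(maths, stats), 5))   # 35.90625
print(round(correlation(maths, stats), 4))  # 0.9827
```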
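Solution:

X     25    30    32     35     37     40     42     45     ΣX = 286
Y     8     10    15     17     20     22     24     25     ΣY = 141
X²    625   900   1024   1225   1369   1600   1764   2025   ΣX² = 10532
Y²    64    100   225    289    400    484    576    625    ΣY² = 2763
XY    200   300   480    595    740    880    1008   1125   ΣXY = 5328

$\bar{x} = \dfrac{\sum x}{n} = \dfrac{286}{8} = 35.75$, $\bar{y} = \dfrac{\sum y}{n} = \dfrac{141}{8} = 17.625$

$\operatorname{cov}(x, y) = \dfrac{\sum xy}{n} - \bar{x}\,\bar{y} = \dfrac{5328}{8} - (35.75 \times 17.625) = 35.9063$

$\operatorname{Var}(x) = \dfrac{\sum x^2}{n} - \bar{x}^2 = \dfrac{10532}{8} - (35.75)^2 = 38.4375$

$\operatorname{Var}(y) = \dfrac{\sum y^2}{n} - \bar{y}^2 = \dfrac{2763}{8} - (17.625)^2 = 34.7344$

$r(x, y) = \dfrac{\operatorname{cov}(x, y)}{\sigma_x \sigma_y} = \dfrac{35.9063}{\sqrt{38.4375 \times 34.7344}} = 0.9827$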
REGRESSION LINES

Regression line of Y on X: $y - \bar{y} = r\,\dfrac{\sigma_y}{\sigma_x}\,(x - \bar{x})$

Regression line of X on Y: $x - \bar{x} = r\,\dfrac{\sigma_x}{\sigma_y}\,(y - \bar{y})$
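Both lines can be built from the means, standard deviations and correlation coefficient. The Python sketch below is an illustration under that assumption; the function name `regression_lines` and the input values are hypothetical.

```python
def regression_lines(mean_x, mean_y, sd_x, sd_y, r):
    """Return both regression lines as (slope, intercept) pairs.

    Y on X: y = b_yx * x + (mean_y - b_yx * mean_x), b_yx = r * sd_y / sd_x
    X on Y: x = b_xy * y + (mean_x - b_xy * mean_y), b_xy = r * sd_x / sd_y
    """
    b_yx = r * sd_y / sd_x
    b_xy = r * sd_x / sd_y
    y_on_x = (b_yx, mean_y - b_yx * mean_x)
    x_on_y = (b_xy, mean_x - b_xy * mean_y)
    return y_on_x, x_on_y


# Hypothetical summary statistics, chosen only for illustration.
y_on_x, x_on_y = regression_lines(mean_x=5, mean_y=8, sd_x=2, sd_y=4, r=0.5)
print(y_on_x)  # (1.0, 3.0)   i.e. y = 1.0 x + 3.0
print(x_on_y)  # (0.25, 3.0)  i.e. x = 0.25 y + 3.0
```

The two slopes are the regression coefficients, and their product equals $r^2$. Both lines pass through the point $(\bar{x}, \bar{y})$.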
MCQ’S
1. If $\sum xy = 1242$, $\bar{x} = -5.1$, $\bar{y} = -10$, $n = 10$, then cov(x, y) is
[A] 67.4 [B] 83.9 [C] 58.5 [D] 73.2

2. If $\sum x^2 = 2291$, $\sum y^2 = 3056$, $\sum (x + y)^2 = 10623$, $n = 10$, $\bar{x} = 14.7$, $\bar{y} = 17$ then
cov(x, y) is
[A] 1.39 [B] 13.9 [C] 139 [D] -13.9

3. If the two regression coefficients are 0.16 and 4 then the correlation coefficient is
[A] 0.08 [B] -0.8 [C] 0.8 [D] 0.64
4. You are given the following information related to a distribution comprising 10 observations:
$\bar{x} = 5.5$, $\bar{y} = 4$, $\sum x^2 = 385$, $\sum y^2 = 192$, $\sum (x + y)^2 = 947$. The correlation coefficient r(x, y) is
[A] -0.924 [B] -0.681 [C] -0.542 [D] -0.813

5. Given the following data:
$r = 0.022$, $\sum xy = 33799$, $\sigma_x = 4.5$, $\sigma_y = 64.605$, $\bar{x} = 68$, $\bar{y} = 62.125$. The value of n
[number of observations] is
[A] 5 [B] 7 [C] 8 [D] 10

6. Coefficient of correlation between the variables x and y is 0.8 and their covariance is 20, the
variance of x is 16. Standard deviation of y is
[A] 6.75 [B] 6.25 [C] 7.5 [D] 8.25
7. Line of regression y on x is 8x-10y+66=0. Line of regression x on y is
40x-18y-214=0. Mean values of x and y are
[A] $\bar{x} = 12, \bar{y} = 15$ [B] $\bar{x} = 10, \bar{y} = 11$ [C] $\bar{x} = 13, \bar{y} = 17$ [D] $\bar{x} = 9, \bar{y} = 8$

8. If the two lines of regression are $9x + y - \lambda = 0$ and $4x + y = \mu$ and the means of x and y are 2 and -3
respectively then the values of $\lambda$ and $\mu$ are
[A] $\lambda = 15$ and $\mu = 5$ [B] $\lambda = -15$ and $\mu = -5$ [C] $\lambda = 5$ and $\mu = 15$ [D] $\lambda = 15$ and $\mu = -5$

9. Line of regression y on x is 8x-10y+66=0. Line of regression x on y is 40x-18y-214=0.
Correlation coefficient r(x, y) is given by
[A] 0.6 [B] 0.5 [C] 0.75 [D] 0.45
10. The regression lines are 9x+y=15 and 4x+y=5. Correlation r(x, y) is given by
[A] 0.444 [B] -0.11 [C] 0.663 [D] 0.7

11. Line of regression y on x is 8x-10y+66=0. Line of regression x on y is 40x-18y-214=0. The
value of variance of x is 9. The standard deviation of y is equal to
[A] 2 [B] 5 [C] 6 [D] 4

12. Line of regression y on x is 8x-10y+66=0. Line of regression x on y is 40x-18y-214=0. The
value of variance of y is 16. The standard deviation of x is equal to
[A] 3 [B] 2 [C] 6 [D] 7
13. Given $b_{xy} = 0.8411$, $b_{yx} = 0.4821$ and the standard deviation of y is 1.7916 then the values of the
correlation coefficient r(x, y) and the standard deviation of x are
[A] $r = -0.6368$, $\sigma_x = -2.366$ [B] $r = 0.63678$, $\sigma_x = 2.366$
[C] $r = 0.40549$, $\sigma_x = 2.366$ [D] $r = 0.63678$, $\sigma_x = 5.6$

14. The correlation coefficient between two variables x and y is 0.6.
If $\sigma_x = 1.5$, $\sigma_y = 2.00$, $\bar{x} = 10$, $\bar{y} = 20$ then the lines of regression are
[A] x = 0.45y + 12 and y = 0.8x + 1 [B] x = 0.45y + 1 and y = 0.8x + 12
[C] x = 0.65y + 10 and y = 0.4x + 12 [D] x = 0.8y + 1 and y = 0.45x + 12
15. Given the following data: $\bar{x} = 36$, $\bar{y} = 85$, $\sigma_x = 11$, $\sigma_y = 8$, $r = 0.66$. By using the line of regression
x on y, the most probable value of x when y=75 is
[A] 29.143 [B] 24.325 [C] 31.453 [D] 26.925
PREVIOUS UNIVERSITY THEORY QUESTIONS

1. Find the regression equation of Y on X for bivariate data with the following: n = 25,
$\sum x_i = 75$, $\sum y_i = 100$, $\sum x_i^2 = 250$, $\sum y_i^2 = 500$, $\sum x_i y_i = 325$. (Dec 2019)
2. Calculate the coefficient of correlation for the following data. Estimate the value for x = 10. (May 2018)

X    2    4    5    6    8    11
y    18   12   10   8    7    5
3. The regression equations are given by $8x - 10y + 66 = 0$ and $40x - 18y = 214$.
The value of variance of x is 9. Find
i) the mean values of x and y, ii) the correlation coefficient between x and y, iii) the S.D.
of y. (May 2019, Nov 2015)

4. If the two lines of regression are $9x + y - \lambda = 0$ and $4x + y = \mu$ and the means of x and y are
2 and -3 respectively then find the values of $\lambda$ and $\mu$ and the coefficient of
correlation between x and y. (Dec 2018, May 2017)

5. Two lines of regression are given by 3x+2y-26=0 and 6x+y-31=0. Find i) the mean
values of x and y, ii) $\sigma_x^2$, iii) the coefficient of correlation between x and y. (Dec 2017)

6. Find the coefficient of correlation for the following data. (May 2017)

X    10   14   18   22   22   30
y    18   12   24   6    30   36
7. Obtain regression lines for the data
$n = 5$, $\sum x = 30$, $\sum y = 40$, $\sum x^2 = 220$, $\sum y^2 = 340$, $\sum xy = 214$. (Nov 2016)

8. Calculate the correlation coefficient for the following data:
$n = 20$, $\sum x = 40$, $\sum y = 40$, $\sum x^2 = 190$, $\sum y^2 = 200$, $\sum xy = 150$. (May 2016)

9. Find the coefficient of correlation for the following data. (May 2014)

X    1    2    3    4    5    6    7    8    9
Y    9    8    10   12   11   13   14   16   15
CURVE FITTING
