
INTRODUCTION TO STATISTICS

Definitions
• “Statistics is a numerical statement of facts in any department of
enquiry placed in relation to each other”. -Bowley
• “Statistics are the classified facts representing the conditions of the
people in a State specially those facts which can be stated in
numbers or any tabular or classified arrangement”. -Webster
• “Statistics can be defined as the aggregate of facts affected to a
marked extent by multiplicity of causes, numerically expressed,
enumerated or estimated according to a reasonable standard of
accuracy, collected in a systematic manner, for a pre-determined
purpose and placed in relation to each other”. -Secrist
• Statistics is the science of collecting, organizing, analyzing,
interpreting and presenting data.
SCOPE OF STATISTICS
1. Social Sciences
- Man Power Planning
- Crime Rates
- Income & Wealth Analysis of Society
- In studying Pricing, Production, Consumption, Investments & Profits etc.
2. Planning
- Agriculture
- Industry
- Textiles
- Education etc.
For ex. Five Year Plans in India.
3. Mathematics
-Extensive use of Differentiation, Algebra, Trigonometry, Matrices
-Statistics now treated as Applied Mathematics.

4. Economics
- Family Budgeting
-Applied in solving economic problems related to production,
consumption, distribution of products as per income & wealth
related patterns, wages, prices, profits & individual savings,
investments, unemployment & poverty etc.
i) Marketing
- Trend Analysis
- Market Research & Analysis
- Product Life Cycle
Marketing policy decisions depend on forecasting, demand
analysis, time & motion studies, inventory control, investments &
analysis of consumer data for production & sales.
ii) Production
- Designs
- Methods of Production
- Technology Selection
- Quality Control Mechanisms
- Product Mix
- Quantities
- Time Schedules for Manufacturing & Distribution
iii) Finance
-Correlation Analysis of profits & dividends, assets & liabilities
-Analysis of income & expenditure
- Financial forecasts, break-even analysis, investment & risk
analysis
iv) Sales
-Demand Analysis
-Sales Forecasts
v) Personnel
- Wage plans, Incentive plans, Cost of living, Labor turnover ratio,
Employment trends, Accident Rates, Performance Appraisals
etc.
vi) Accounting & Auditing
-Analysis of Income, Expenditure, Investment, Profits and
Optimization of Production etc
- Forecasting costs of production & price
vii) Other Areas
-Insurance, Astronomy, Social Sciences, Medical Sciences,
Psychology, Education etc.
LIMITATIONS OF STATISTICS
• Does not study individual items; deals with aggregates.
• Statistical laws are not exact.
• Not suitable for the study of qualitative phenomena.
• Statistical methods are only a means, not an end, for solving problems.
ROLE OF STATISTICS IN
MANAGEMENT DECISIONS
A. Marketing & Sales
- Product selection & competitive strategies
- Utilization of resources including territory control
- Advertising decisions for cost & time effectiveness
- Forecasting & trend analysis
- Pricing & market research
B. Production Management
- Product mix & product positioning
- Facility & production planning
- Distribution management
- Material handling & facility planning
- Maintenance policies
- Activity planning & resources allocation
- Quality control decisions
C. Materials Management
- Buying policy- Sourcing & Procurement
- Material Planning & Lead Times
D. Finance, Investments & Budgeting
- Profit planning
- Cash Flow Analysis
- Investment decisions
- Dividend policy decisions
- Risk Analysis
- Portfolio Analysis
E. Personnel Management
- Optimum organization level
- Job evaluation & assignment analysis
- Social / habit analysis
- Salary / wage policies
- Recruitment & Training
F. Research & Development

- Area of thrust – Analysis & Planning
- Project Selection Criteria
- Alternatives analysis
- Trade – off analysis - cost & revenue
G. Defense
- Optimization of weapon system
- Force deployment
- Transportation Cost Analysis
- Assignment Suitability
Definitions Continued
• Observations: Numerical quantities that measure specific characteristics.
Examples include height, weight, gross sales, net profit, etc.
Some More Definitions
• Raw Data: Data collected in original form.
• Classes / class intervals: Subgroups within a set of collected data. Ex. 10-20, 20-30, etc.
• Frequency: The number of times a certain value or class of values occurs.
• Frequency Distribution Table: The organization of raw data into table form using classes and frequencies.
More Definitions
• Cumulative Frequency of a class is the sum of the frequency of that class and
the frequencies of all the preceding or succeeding classes, which are listed in
some sensible order (numerical order, alphabetical order, etc.)
Illustration – Individual Series
Marks of ten students of a class in Statistics
15, 35, 55, 67, 78, 84, 79, 90, 89, 94
Illustration – Discrete Frequency Distribution
Height (in inches)    No. of Students
60                    12
62                    18
64                    10
66                    6
68                    4
Illustration – Grouped or Continuous Frequency Distribution
• Exclusive Type Class-Intervals
Class-Intervals    Frequency
20-25              8
25-30              2
30-35              40
35-40              23
40-45              9
• Inclusive Type Class-Intervals
Class-Intervals    Frequency
1-10               2
11-20              6
21-30              10
31-40              15
41-50              12
CONVERSION OF INCLUSIVE TYPE CLASS-INTERVALS TO EXCLUSIVE TYPE CLASS-INTERVALS
1. Calculate the ADJUSTMENT FACTOR (A.F) using the given inclusive type class intervals:
A.F = (Lower Limit of Next C.I – Upper Limit of Previous C.I)/2
2. Obtain new class intervals as follows:
New Lower Limit = Old Lower Limit – A.F
New Upper Limit = Old Upper Limit + A.F
Illustration: for the inclusive class intervals 1-10, 11-20, 21-30, 31-40, 41-50
(frequencies 2, 6, 10, 15, 12):
A.F = (11 – 10)/2 = 0.5
For the 1st C.I, i.e. 1-10:
New L.L = 1 (old L.L) – 0.5 = 0.5
New U.L = 10 (old U.L) + 0.5 = 10.5
And so on.
The converted (exclusive type) class intervals are:
Class-Intervals    Frequency
0.5-10.5           2
10.5-20.5          6
20.5-30.5          10
30.5-40.5          15
40.5-50.5          12
Calculations can now proceed on these class intervals.
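The conversion can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `to_exclusive` is hypothetical, and it assumes the intervals are listed in ascending order with the same gap between every pair of consecutive classes.

```python
def to_exclusive(intervals):
    """Convert inclusive (lower, upper) class intervals to exclusive ones."""
    # A.F = (lower limit of the 2nd class - upper limit of the 1st class) / 2
    af = (intervals[1][0] - intervals[0][1]) / 2
    # Widen every class by A.F on both sides
    return [(lo - af, hi + af) for lo, hi in intervals]

inclusive = [(1, 10), (11, 20), (21, 30), (31, 40), (41, 50)]
exclusive = to_exclusive(inclusive)
print(exclusive)
```

For the table above this gives 0.5-10.5 as the first class and 40.5-50.5 as the last.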
Obtaining Cumulative Frequency Distribution
Class-Intervals   Frequency   Less than type cum. frequency   More than type cum. frequency
20-25             15          15                              60 + 15 = 75
25-30             34          15 + 34 = 49                    26 + 34 = 60
30-35             6           49 + 6 = 55                     20 + 6 = 26
35-40             10          55 + 10 = 65                    10 + 10 = 20
40-45             8           65 + 8 = 73                     2 + 8 = 10
45-50             2           73 + 2 = 75                     2
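Both cumulative columns are running totals, one taken from the top and one from the bottom. A minimal Python sketch using the frequencies from the table (the variable names are illustrative):

```python
from itertools import accumulate

freqs = [15, 34, 6, 10, 8, 2]  # frequencies of classes 20-25 ... 45-50

# "Less than" type: running total from the first class downwards
less_than = list(accumulate(freqs))

# "More than" type: running total from the last class upwards
more_than = list(accumulate(reversed(freqs)))[::-1]

print(less_than)
print(more_than)
```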
Introduction to Measures of Central Tendency
• Also known as averages.
• Values show a distinct tendency to cluster or group around a value.
• This behavior is called the central tendency of the data.
• The value around which the data clusters is the measure of central tendency,
which represents the whole set of data.
Objectives of Averages
• To find one value that represents the whole mass of data.
• To enable comparison.
• To establish relationships.
• To derive inferences about the universe to which a sample belongs.
• To aid decision-making.
Requisites of a Good Average
• Should be rigidly defined.
• Should be mathematically expressed.
• Should be readily comprehensible & easy to calculate.
• Should be calculated on the basis of all the observations.
• Should be least affected by extreme values and sampling fluctuations.
• Should be suitable for further mathematical treatment.
Common Measures of Central Tendency
• Arithmetic Mean (A.M)
• Geometric Mean (G.M)
• Harmonic Mean (H.M)
• Median
• Mode
• Partition Values like Deciles, Quartiles & Percentiles.

Arithmetic Mean
• Individual Series
$\mu = \dfrac{x_1 + x_2 + \dots + x_n}{n}$
For ex. A.M of 3, 6, 24 and 48:
μ = (3 + 6 + 24 + 48)/4 = 81/4 = 20.25 Ans.
• Discrete Frequency Distribution
X      Freq.   fx
x1     f1      f1x1
x2     f2      f2x2
x3     f3      f3x3
x4     f4      f4x4
$\mu = \dfrac{f_1x_1 + f_2x_2 + \dots + f_nx_n}{N} = \dfrac{\sum fx}{\sum f}$
Where N = f1 + f2 + … + fn = Σf (total no. of observations)
n = no. of distinct values of x
Illustration
Height (in inches) X   No. of Students f   fX
60                     12                  60 x 12 = 720
62                     18                  1116
64                     10                  640
66                     6                   396
68                     4                   272
Total                  N = 50              Σfx = 3144

μ = 3144/50 = 62.88 Ans.
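The same calculation can be checked with a short Python sketch; the variable names are illustrative and the data are those of the height table.

```python
heights = [60, 62, 64, 66, 68]   # X
students = [12, 18, 10, 6, 4]    # f

total_fx = sum(f * x for x, f in zip(heights, students))  # sum of f*x
n = sum(students)                                          # N = sum of f
mean = total_fx / n

print(total_fx, n, mean)
```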

• Continuous Frequency Distribution
- Direct Method
- Assumed Mean Method
- Step Deviation Method
Arithmetic Mean Formulae
• Direct Method
$\mu = \dfrac{f_1x_1 + f_2x_2 + \dots + f_nx_n}{N} = \dfrac{\sum fx}{\sum f}$
Where N = f1 + f2 + … + fn
x = mid-value of a C.I = (U.L + L.L)/2
• Assumed Mean Method
$\mu = A + \dfrac{\sum fd}{N}$
Where A = assumed mean
N = Σf
d = x – A
x = mid-value
• Step Deviation Method
$\mu = A + \dfrac{\sum fd}{N} \times i$
where A = assumed mean
N = Σf
d = (x – A)/i
x = mid-value
i = width of C.I = U.L – L.L
Illustration – Direct Method
C.I      Freq. (f)   Mid-Value (X)   fX
4-6      6           5               30
6-8      12          7               84
8-10     17          9               153
10-12    10          11              110
12-14    5           13              65
Total    Σf = 50                     Σfx = 442
μ = Σfx/Σf = 442/50 = 8.84 Ans.
Illustration – Assumed Mean Method
C.I      Freq. (f)   Mid Values (x)   d = (x – A)   fd
10-15    2           12.5             -10           -20
15-20    7           17.5             -5            -35
20-25    9           22.5 = A         0             0
25-30    8           27.5             5             40
30-35    6           32.5             10            60
35-40    4           37.5             15            60
Total    Σf = 36                                    Σfd = 105
μ = A + Σfd/Σf = 22.5 + 105/36 = 22.5 + 2.916 = 25.416 Ans.
Illustration – Step Deviation Method
C.I      Freq. (f)   Mid Values (x)   d = (x – A)/i (i = 5)   fd
10-15    200         12.5             -2                      -400
15-20    700         17.5             -1                      -700
20-25    900         22.5 = A         0                       0
25-30    800         27.5             1                       800
30-35    600         32.5             2                       1200
35-40    400         37.5             3                       1200
Total    Σf = 3600                                            Σfd = 2100
μ = A + (Σfd/Σf) x i = 22.5 + (2100/3600) x 5 = 22.5 + 2.916 = 25.416 Ans.
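The direct, assumed mean and step deviation methods are algebraically equivalent, so they should agree on the same data. The sketch below checks this for the step deviation table; the variable names are illustrative, and A = 22.5 and i = 5 are taken from that table.

```python
classes = [(10, 15), (15, 20), (20, 25), (25, 30), (30, 35), (35, 40)]
freqs = [200, 700, 900, 800, 600, 400]

mids = [(lo + hi) / 2 for lo, hi in classes]  # mid-values x
n = sum(freqs)                                # N = sum of f

# Direct method: sum(f*x) / N
direct = sum(f * x for f, x in zip(freqs, mids)) / n

# Assumed mean method: A + sum(f*d) / N, with d = x - A
A = 22.5
assumed = A + sum(f * (x - A) for f, x in zip(freqs, mids)) / n

# Step deviation method: A + (sum(f*d) / N) * i, with d = (x - A) / i
i = 5
step = A + sum(f * (x - A) / i for f, x in zip(freqs, mids)) / n * i

print(direct, assumed, step)
```

All three values come out at about 25.42, matching the illustration.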
Illustration – Frequencies from "X or more" cumulative frequencies
Marks (X or more)   Cum. Freq.   C.I      Freq.
10                  140          10-20    140 – 133 = 7
20                  133          20-30    133 – 118 = 15
30                  118          30-40    118 – 100 = 18
40                  100          40-50    100 – 75 = 25
50                  75           50-60    75 – 45 = 30
60                  45           60-70    45 – 25 = 20
70                  25           70-80    25 – 9 = 16
80                  9            80-90    9 – 2 = 7
90                  2            90-100   2 – 0 = 2
100                 0
Proceed as usual.
What if…
C.I      Frequency
50-59    1
40-49    3
30-39    8
20-29    10
10-19    15
0-9      3
Total    N = 40
A.F = (L.L of 1st C.I – U.L of 2nd C.I)/2 = (50 – 49)/2 = 0.5
New C.I
L.L of new C.I = L.L of original C.I – A.F
U.L of new C.I = U.L of original C.I + A.F
For ex. for the 1st C.I, new L.L = 50 – 0.5 = 49.5 and new U.L = 59 + 0.5 = 59.5, and so on.
Now continue as usual.
Determining missing frequency when A.M is known – Illustration
Mean = 16.82
Marks    Freq.     M.V (x)     d = (x – A)/i   fd
0-5      10        2.5         -3              -30
5-10     12        7.5         -2              -24
10-15    16        12.5        -1              -16
15-20    ? = f4    17.5 = A    0               0
20-25    14        22.5        1               14
25-30    10        27.5        2               20
30-35    8         32.5        3               24
Total    N = 70 + f4                           Σfd = -12
Soln. $\mu = A + \dfrac{\sum fd}{\sum f} \times i$
μ = 16.82 (given), A = 17.5, i = 5
Hence 16.82 = 17.5 + (-12 x 5)/(70 + f4)
-0.68 = -60/(70 + f4)
-0.68 (70 + f4) = -60
47.6 + 0.68 f4 = 60
f4 = 12.4/0.68 = 18 approx. Ans.
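The algebra can be rearranged to solve for the missing frequency directly. The sketch below uses a hypothetical helper, `missing_frequency`, and relies on one assumption that holds in this illustration: the class with the unknown frequency is the assumed-mean class (d = 0), so it adds nothing to Σfd.

```python
def missing_frequency(mean, assumed_mean, width, known_n, known_sum_fd):
    """Solve mean = A + (sum_fd / N) * i for the one unknown frequency.

    Assumes the class with the unknown frequency has d = 0.
    """
    total_n = known_sum_fd * width / (mean - assumed_mean)  # N = 70 + f4
    return total_n - known_n

f4 = missing_frequency(mean=16.82, assumed_mean=17.5, width=5,
                       known_n=70, known_sum_fd=-12)
print(round(f4))
```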
Some More Applications of A.M
Q1. The avg. marks secured by 50 students was 44. Later on it was
discovered that a score of 36 was misread as 56. Find the correct
average marks secured by the students.
Soln. Given N = 50 and mean μ = 44
μ = ΣX/N, so ΣX = μN
i.e. ΣX = 44 x 50
ΣX = 2200
Since 36 was misread as 56,
correct ΣX = 2200 – 56 + 36 = 2180
Correct mean = 2180/50 = 43.6 Ans.
Combined A.M
• Suppose k different series have n1, n2, …, nk observations and respective
A.Ms μ1, μ2, …, μk. Then the A.M of the new series obtained on combining all
the n1 + n2 + … + nk observations is obtained using the formula:
$\mu = \dfrac{n_1\mu_1 + n_2\mu_2 + \dots + n_k\mu_k}{n_1 + n_2 + \dots + n_k}$
Illustration – Combined A.M
• There are two branches of a Co. employing 100 and 80 employees respectively.
If the A.Ms of the monthly salaries paid by the two branches are Rs. 4570 and
Rs. 6750 respectively, find the A.M of the salaries of the employees of the Co.
as a whole.
Soln. Given No. of employees in 1st branch, n1 = 100
Avg. salary of employees in 1st branch, μ1 = Rs. 4570
No. of employees in 2nd branch, n2 = 80
Avg. salary of employees in 2nd branch, μ2 = Rs. 6750
Avg. salary of the employees of the Co. as a whole
= (100 x 4570 + 80 x 6750)/(100 + 80) = 997000/180 = Rs. 5538.89
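A small Python sketch of the combined mean, using the figures stated in the question (100 and 80 employees, Rs. 4570 and Rs. 6750). The function name `combined_mean` is illustrative.

```python
def combined_mean(sizes, means):
    """Combined A.M: sum(n_k * mu_k) / sum(n_k)."""
    return sum(n * m for n, m in zip(sizes, means)) / sum(sizes)

overall = combined_mean([100, 80], [4570, 6750])
print(round(overall, 2))
```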
Practice Questions – Arithmetic Mean
Q1.
Weekly Income (in Rs.)   20-25   25-30   30-35   35-40   40-45   45-50
No. of workers           200     700     900     800     600     400
Q2.
Weight (in kgs)          30-34   35-39   40-44   45-49   50-54   55-59   60-64
No. of Students          3       5       12      18      14      6       2
Q3.
Wages (in Rs.)           125-175   175-225   225-275   275-325   325-375   375-425   425-475
No. of workers           8         10        25        35        12        10        4
Q4.
Lifetime (in hrs.)   No. of tubes
Less than 300        0
Less than 400        20
Less than 500        60
Less than 600        116
Less than 700        194
Less than 800        265
Less than 900        324
Less than 1000       374
Less than 1200       400

Merits of A.M
• Is rigidly defined and has a definite value.
• Is based on all the observations.
• Is capable of algebraic treatment for further data analysis & interpretation.
• Easy to calculate & simple to understand.
• For a large no. of observations, A.M provides a good basis of comparison.
Drawbacks of A.M
• Being based on all the observations, is considerably affected by abnormal
observations. For ex. A.M of 1000, 25, 35 & 40 will be (1000+25+35+40)/4 = 275,
which is not at all a representative figure.
• Cannot be calculated if even a single observation is missing.
• Cannot be obtained just by inspection, as in the case of median & mode.
• May give absurd results. For ex. if the avg. no. of children per family is
calculated and the result is 3.4 children per family, how would you interpret it?
Weighted Arithmetic Mean
Formula Used
$\mu_w = \dfrac{x_1w_1 + x_2w_2 + \dots + x_nw_n}{w_1 + w_2 + \dots + w_n}$
Illustration – Weighted A.M
Designation          Monthly Salary (in Rs.) (X)   No. of employees (w)   wX
Class I Officers     1500                          10                     15000
Class II Officers    800                           20                     16000
Subordinate Staff    500                           70                     35000
Clerical Staff       250                           100                    25000
Lower Staff          100                           150                    15000
Total                                              Σw = 350               ΣwX = 106000
Weighted A.M = ΣwX/Σw = 106000/350 = Rs. 302.857 Ans.
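As a quick check, the weighted mean of the salary table can be computed as below. This is a minimal sketch; `weighted_mean` is an illustrative name.

```python
def weighted_mean(values, weights):
    """Weighted A.M: sum(w * x) / sum(w)."""
    return sum(w * x for x, w in zip(values, weights)) / sum(weights)

salaries = [1500, 800, 500, 250, 100]   # X, monthly salary in Rs.
employees = [10, 20, 70, 100, 150]      # w, no. of employees

mu_w = weighted_mean(salaries, employees)
print(round(mu_w, 3))
```

The unweighted mean of the five salary levels is Rs. 630, well above the weighted figure, because most employees sit in the lower-paid grades.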
Median – Positional Average
• The value of the middle term of a series arranged in ascending or descending
order of magnitude.
• Its value is the value of the middle item irrespective of all other values.
Calculation of Median
• Individual Series
N = no. of observations or items in the series
- Arrange all the items in ascending or descending order of magnitude.
Case I: N = Odd
Median = Value at the ((N+1)/2)th position in the arranged series.
Case II: N = Even
Median = A.M of the values at the (N/2)th and (N/2 + 1)th positions.
Calculation of Median – Illustration (Individual Series)
Ex.1 Find the median of 5, 7, 9, 12, 10, 8, 7, 15, 21
Solution: Arranging in ascending order we get
5, 7, 7, 8, 9, 10, 12, 15, 21
Here N = 9, i.e. odd
Hence Md = ((N+1)/2)th item in the arranged order
= ((9+1)/2)th item
= 5th item
= 9 Ans.
Ex.2 Find the median of 10, 18, 9, 17, 15, 24, 30, 11
Solution: Arranging in ascending order we get
9, 10, 11, 15, 17, 18, 24, 30
Here N = 8, i.e. even
Hence Md = A.M of the (N/2)th and (N/2 + 1)th items in the arranged order
= A.M of the 4th and 5th items
= (15 + 17)/2
= 16 Ans.
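The two cases translate directly into code. The sketch below is illustrative (Python's standard `statistics.median` gives the same values); note that Python lists are 0-based, so the k-th item sits at index k - 1.

```python
def median(values):
    """Median of an individual series, following Case I / Case II."""
    ordered = sorted(values)
    n = len(ordered)
    if n % 2 == 1:
        # Case I: the ((N+1)/2)-th item
        return ordered[(n + 1) // 2 - 1]
    # Case II: A.M of the (N/2)-th and (N/2 + 1)-th items
    return (ordered[n // 2 - 1] + ordered[n // 2]) / 2

ex1 = median([5, 7, 9, 12, 10, 8, 7, 15, 21])
ex2 = median([10, 18, 9, 17, 15, 24, 30, 11])
print(ex1, ex2)
```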
Calculation of Median
• Discrete Frequency Distribution
(i) Find less than type cum. frequency.
(ii) Find N/2 (N = Σf).
(iii) Find the cum. freq. just greater than N/2. Suppose it is C.
(iv) Find the value of X (the item) corresponding to C. This is the median.
Calculation of Median – Illustration (Discrete Freq. Distribution)
Height (in inches)   No. of students   Cum. Freq.
60                   12                12
62                   18                30
64                   10                40
66                   6                 46
68                   4                 50
Here N = 50
(i) N/2 = 25
(ii) Cum. frequency just greater than N/2 = 30
(iii) Corresponding value of item is 62.
Median = 62 Ans.
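The steps for a discrete frequency distribution can be sketched as follows. `discrete_median` is an illustrative name, and the values are assumed to be listed in ascending order. For the height data the first cumulative frequency above N/2 = 25 is 30, which corresponds to 62 inches.

```python
from itertools import accumulate

def discrete_median(values, freqs):
    """Return the value whose less-than cum. frequency first exceeds N/2."""
    half = sum(freqs) / 2
    for x, cum in zip(values, accumulate(freqs)):
        if cum > half:
            return x

md = discrete_median([60, 62, 64, 66, 68], [12, 18, 10, 6, 4])
print(md)
```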
Calculation of Median
• Grouped Frequency Distribution
(i) Find less than type cum. frequency.
(ii) Find N/2 (N = Σf).
(iii) Find the cum. freq. just greater than N/2. Suppose it is X.
(iv) The class interval corresponding to X is the median class. The cum. freq.
preceding X is C.
Formula Used: $Md = L_1 + \dfrac{N/2 - C}{f}(L_2 - L_1)$
Where L1 = L.L of median class
L2 = U.L of median class
C = cum. freq. of class preceding the median class.
f = frequency of median class.
Calculation of Median – Illustration (Grouped Freq. Distribution)
C.I      Freq.   Cum. Freq.
10-15    200     200
15-20    700     900
20-25    900     1800
25-30    800     2600
30-35    600     3200
35-40    400     3600
Σf = 3600
Here N = 3600, so N/2 = 1800.
Cum. freq. just greater than 1800 is 2600.
Hence median class is 25-30.
Hence L1 = 25, L2 = 30, C = 1800, f = 800
Md = 25 + ((1800 – 1800)/800)(30 – 25) = 25 Ans.
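A minimal Python sketch of the grouped median formula, assuming exclusive class intervals listed in ascending order. The function name `grouped_median` is illustrative.

```python
from itertools import accumulate

def grouped_median(classes, freqs):
    """Md = L1 + ((N/2 - C) / f) * (L2 - L1) for the median class."""
    half = sum(freqs) / 2
    cum = list(accumulate(freqs))
    for k, c in enumerate(cum):
        if c > half:                      # first cum. freq. just greater than N/2
            l1, l2 = classes[k]           # limits of the median class
            prev = cum[k - 1] if k > 0 else 0   # C, cum. freq. of preceding class
            return l1 + (half - prev) / freqs[k] * (l2 - l1)

classes = [(10, 15), (15, 20), (20, 25), (25, 30), (30, 35), (35, 40)]
freqs = [200, 700, 900, 800, 600, 400]
md = grouped_median(classes, freqs)
print(md)
```

Here N/2 coincides exactly with the cumulative frequency of the class 20-25, so the median falls on the boundary 25 between that class and the next.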
Calculation of Missing Frequencies when median is known: Illustration
Median = 50
Expenditure   No. of Families   Cumulative Freq.
0-20          14                14
20-40         ? = f1            14 + f1
40-60         27                41 + f1
60-80         ? = f2            41 + f1 + f2
80-100        15                56 + f1 + f2
Total         N = 100
Soln. Here median = 50, N = 100, N/2 = 50
Hence median class is 40-60, with L1 = 40, L2 = 60, f = 27, C = 14 + f1
$Md = L_1 + \dfrac{N/2 - C}{f}(L_2 - L_1)$
50 = 40 + ((50 – (14 + f1))/27)(60 – 40)
10 = (720 – 20 f1)/27
270 = 720 – 20 f1
f1 = 450/20 = 22.5 = 23 families approx.
N = 56 + f1 + f2
100 = 56 + 23 + f2
f2 = 21
Ans. f1 = 23 and f2 = 21
Practice Numericals - Median
Q1.
Age      No. of Persons
20-25    14
25-30    28
30-35    33
35-40    30
40-45    20
45-50    15
50-55    13
55-60    7
Q2.
Value          Frequency
Less than 10   4
Less than 20   16
Less than 30   40
Less than 40   76
Less than 50   96
Less than 60   112
Less than 70   120
Less than 80   125
Practice Problems - Median
Q3. Determine the missing frequencies. The median is 46. Also determine the A.M.
Class-Intervals   Frequency
10-20             12
20-30             30
30-40             ?
40-50             65
50-60             ?
60-70             25
70-80             18
Total             N = 229
Merits - Median
• Is rigidly defined.
• Can be easily calculated.
• Not affected by extreme values.
• Can be located merely by inspection.
Demerits - Median
• May not represent the entire series in many cases.
• Not suitable for further algebraic treatment.
• More likely than the A.M to be affected by sampling fluctuations.
Mode
• The value occurring the largest no. of times in a series, i.e. the value
having the maximum frequency.
• Is calculated for discrete and continuous frequency distributions only.
For ex. no mode can be obtained for the series 1, 2, 3, 4, 5, as each
observation has frequency 1 and no value occurs more often than the others.
Mode – Discrete Frequency Distribution
• The value corresponding to maximum frequency is the mode.
Wt. in pounds   No. of students
120             1
130             3
132             2
135             2
140             1
141             1
Total           10
For ex. the weight 130 pounds has the maximum frequency 3. Hence 130 pounds is
the mode for this frequency distribution.
Mode – Continuous Frequency Distribution
1. Look for the class-interval with maximum frequency. This is the modal class.
2. Note down the following:
L1 = lower limit of the modal class.
i = width of class-interval
f0 = frequency of class preceding the modal class.
f1 = frequency of modal class.
f2 = frequency of class succeeding the modal class.
Mode: Formula for Continuous Frequency Distribution
$\text{Mode} = L_1 + \dfrac{i(f_1 - f_0)}{2f_1 - f_0 - f_2}$
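The mode formula can be sketched in Python as below, with i denoting the class width. Some assumptions are made here: `grouped_mode` is an illustrative name, a missing neighbouring class is treated as having frequency 0, and the data reuse the exclusive-type table shown earlier (20-25 to 40-45) because the slides give no worked example for this formula. The sketch does not handle the case where the denominator 2f1 - f0 - f2 is zero.

```python
def grouped_mode(classes, freqs):
    """Mode = L1 + i * (f1 - f0) / (2*f1 - f0 - f2) for the modal class."""
    k = max(range(len(freqs)), key=freqs.__getitem__)   # index of modal class
    l1, l2 = classes[k]
    i = l2 - l1                                          # class width
    f1 = freqs[k]
    f0 = freqs[k - 1] if k > 0 else 0                    # assumption: 0 if no preceding class
    f2 = freqs[k + 1] if k + 1 < len(freqs) else 0       # assumption: 0 if no succeeding class
    return l1 + i * (f1 - f0) / (2 * f1 - f0 - f2)

classes = [(20, 25), (25, 30), (30, 35), (35, 40), (40, 45)]
freqs = [8, 2, 40, 23, 9]
mode = grouped_mode(classes, freqs)
print(round(mode, 2))
```

For this table the modal class is 30-35 (f1 = 40, f0 = 2, f2 = 23), giving 30 + 5 x 38/55, about 33.45.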
Empirical Relationship
between Mean, Median & Mode

Mode = 3 Median – 2 Mean

Geometric Mean
• Individual Series
$G = (x_1 \cdot x_2 \cdot x_3 \cdots x_n)^{1/n}$
$\log G = \dfrac{1}{n}(\log x_1 + \log x_2 + \dots + \log x_n)$
$G = \operatorname{antilog}\left(\dfrac{1}{n}\sum \log x\right)$
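Geometric Mean
• Discrete Frequency Distribution
$G = (x_1^{f_1} \cdot x_2^{f_2} \cdots x_n^{f_n})^{1/N}$
$\log G = \dfrac{1}{N}(f_1\log x_1 + f_2\log x_2 + \dots + f_n\log x_n)$
$G = \operatorname{antilog}\left(\dfrac{1}{N}\sum f_i \log x_i\right)$
where N = Σf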
Geometric Mean
 Continuous Frequency Distribution
- Formula same as in case of discrete
frequency distribution with x (as
observations) replaced by x (as mid-values)
Harmonic Mean
• Reciprocal of A.M of reciprocals
- Individual Series
$H = \dfrac{1}{\dfrac{1}{n}\left(\dfrac{1}{x_1} + \dfrac{1}{x_2} + \dots + \dfrac{1}{x_n}\right)} = \dfrac{n}{\sum \dfrac{1}{x}}$
- Discrete Frequency Distribution
$H = \dfrac{1}{\dfrac{1}{N}\left(\dfrac{f_1}{x_1} + \dfrac{f_2}{x_2} + \dots + \dfrac{f_n}{x_n}\right)} = \dfrac{N}{\sum \dfrac{f_i}{x_i}}$
Harmonic Mean
 Continuous Frequency Distribution
- Formula same as that of Discrete Frequency
Distribution with x (as observations) replaced
by x (as mid values).
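The geometric and harmonic mean formulae can be sketched as below. The function names are illustrative, all values are assumed to be positive, and natural logarithms with `exp` stand in for the log/antilog tables. The data are the series 3, 6, 24, 48 used earlier for the A.M.

```python
import math

def geometric_mean(values, freqs=None):
    """G = antilog((1/N) * sum(f * log x)); frequencies default to 1."""
    freqs = freqs or [1] * len(values)
    n = sum(freqs)
    return math.exp(sum(f * math.log(x) for x, f in zip(values, freqs)) / n)

def harmonic_mean(values, freqs=None):
    """H = N / sum(f / x); frequencies default to 1."""
    freqs = freqs or [1] * len(values)
    return sum(freqs) / sum(f / x for x, f in zip(values, freqs))

data = [3, 6, 24, 48]
g = geometric_mean(data)
h = harmonic_mean(data)
print(round(g, 3), round(h, 3))
```

For this series G = 12 and H is about 7.11, against an A.M of 20.25, illustrating the usual ordering A.M ≥ G.M ≥ H.M for positive values. For a continuous frequency distribution, the mid-values are passed as `values`.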