
Statistics for Management

MB0024-Unit-01-Understand the
usefulness of Statistics

With fast-moving technologies, advanced communication networks, rapid changes in
consumer behaviour, varied expectations of a variety of consumers and new market
openings, modern managers have the difficult task of making quick and appropriate
decisions. They therefore need to depend more upon quantitative techniques
like mathematical models, statistics, operations research and econometrics. These
techniques push back the domain of ignorance and rule of thumb and open up
new horizons of thought.

Learning Objective 1

Understand the usefulness of Statistics

Importance of Statistics in Modern Business Environment –

Our day-to-day activities interact with personnel, public, social, political, economic,
business and other environments. Decision making encompasses all these activities.
Suppose we wish to purchase a television. We would like to know the price, quality,
durability, maintainability etc. Therefore there is a need for collecting data and making an
optimum decision. Again, suppose a company wishes to introduce a new product. It has to
collect data on market potential, consumer likings, availability of raw materials,
feasibility of producing the product etc. In other words, data collection is the backbone of
any decision making process. Many organizations find themselves data-rich but poor in
drawing information from the data. Therefore it is important to develop the ability to extract
meaningful information from raw data to make better decisions. Statistics plays an
important role in this respect.

Learning Objective 2


Statistical methods are applied to specific problems in Biology, Medicine, Agriculture,
Commerce, Business, Economics, Industry, Insurance, Sociology, Psychology etc.
In Biology, Medicine and Agriculture, Statistical methods are applied in the study of
growth of plant, movement of fish population in the ocean, migration of birds, effect of
newly invented medicines, theories of heredity, estimation of yield of crop, effect of
fertilizer on yield, birth rate, death rate, population growth, growth of bacteria etc. The
insurance premiums are based on the age composition of the population and the mortality
rates. Actuarial science deals with the calculation of insurance premiums and dividends.
Statistics is a part and parcel of Economics, Commerce and Business. Statistical analysis
of variations of price, demand and production are helpful to businessmen and economists.
Cost of living index numbers help in economic planning and fixation of wages. They are
used to estimate the value of money. Analysis of demand, price, production cost,
inventory costs etc., help in decision making in business activities. Management of
limited resources and labour needs statistical methods to maximize profit. Planned
recruitments and distribution of staff, proper quality control methods, careful study of
demand for goods in the market, balance investment, etc. help the producer to extract
maximum profit out of minimum capital. In industries, statistical quality control
techniques help in increasing and controlling the quality of products at a minimum cost.
A government’s administrative system is fully dependent on production statistics, income
statistics, labour statistics, economic indices of cost, price, etc. Economic planning of any
nation is entirely based on statistical facts. Statistics has become so important today that
hardly any science exists independent of it, and hence the statement ‘Science without
Statistics bears no fruit; Statistics without Science has no root’.

Learning Objective 3


A. L. Boddington defined Statistics as ‘The science of estimates and probabilities’.

According to Croxton and Cowden, ‘Statistics is the science of collection, presentation,
analysis and interpretation of numerical data.’ Thus, Statistics contains the tools and
techniques required for the collection, presentation, analysis and interpretation of data.
This definition is precise and comprehensive.

According to Prof. Horace Secrist, Statistics deals with aggregates of facts, affected to a
marked extent by multiplicity of causes, numerically expressed, enumerated or estimated
according to a reasonable standard of accuracy, collected in a systematic manner for a
predetermined purpose and placed in relation to each other.

Characteristics of Statistics

1. Statistics deal with aggregates of facts: A single figure cannot be analyzed. Thus, the fact
‘Mr Lee is 170 cms. tall’ cannot be statistically analyzed. On the other hand, if we know
the heights of 60 students of a class, we can comment upon the average height, variation, etc.
2. Statistics are affected to a marked extent by multiplicity of causes: The statistics
of yield of paddy are the result of factors such as fertility of soil, amount of rainfall,
quality of seed used, quality and quantity of fertilizer used, etc.
3. Statistics are numerically expressed: Only numerical facts can be statistically
analyzed. Therefore, statements such as ‘price decreases with increasing production’ cannot
be called statistics.
4. Statistics are enumerated or estimated according to reasonable standards of
accuracy: The facts should be enumerated (collected from the field) or estimated
(computed) with the required degree of accuracy. The degree of accuracy differs from
purpose to purpose. In measuring the length of screws, an accuracy up to a
millimeter may be required, whereas, while measuring the heights of students in a
class, accuracy up to a centimeter is enough.
5. Statistics are collected in a systematic manner: The facts should be collected
according to planned and scientific methods. Otherwise, they are likely to be
wrong and misleading.
6. Statistics are collected for a pre-determined purpose: There must be a definite
purpose for collecting facts. E.g. movement of the wholesale price of a commodity.
7. Statistics are placed in relation to each other: The facts must be placed in such a
way that a comparative and analytical study becomes possible. Thus, only related
facts which are arranged in logical order can be called statistics.

Learning Objective 4

Understand the Functions of Statistics

• It simplifies mass data

• It makes comparison easier
• It brings out trends and tendencies in the data
• It brings out hidden relations between variables.
• Decision making process becomes easier.

Learning Objective 5

Know about the Limitations of Statistics

Major limitations of Statistics are:

1. Statistics does not deal with qualitative data. It deals only with quantitative data.
2. Statistics does not deal with individual facts: Statistical methods can be applied
only to aggregates of facts.
3. Statistical inferences (conclusions) are not exact: Statistical inferences are true
only on an average. They are probabilistic statements.
4. Statistics can be misused and misinterpreted: Increasing misuse of Statistics has
led to increasing distrust in statistics.
5. Common men cannot handle Statistics properly: Only statisticians can handle
statistics properly.

Computers and Statistics

With the advent of computers, many statistical programmes are available in the market.
They help us in summarizing, presenting and analyzing mass data in a short time. Some
of them are Minitab, SPSS, Texto & Contexto, Excel, E-View etc.

Decision making becomes more efficient with the help of Statistics. Statistics deals with
aggregates of facts and is applied in all fields of our activities. Its
interpretation requires a skilled and experienced statistician.



Formulating a new theory such as “Tobacco consumption leads to cancer”, framing
policies according to the existing nature of a population, finding the relationship between
characteristics of units in the population etc. require collection and analysis of data in a
systematic manner. In other words, a search for knowledge by analyzing numerical data is
known as a Statistical Survey or Statistical Investigation.

Learning Objective 1

Understand the Mode of Planning and Execution of Statistical Survey

A Statistical Survey is a scientific process of collection and analysis of numerical data. A
statistical survey is divided into two broad categories.

A. Planning B. Execution

A. Planning of a Statistical Survey.

The relevance and accuracy of data obtained in a survey depends upon the care exercised
in planning. A properly planned investigation can lead to best results with least cost and
time. The planning stage consists of the following sequence of activities.

1. Nature of the problem to be investigated should be clearly defined in an
unambiguous manner.
2. Objectives of investigation should be stated at the outset. Objectives could be to
obtain certain estimates, to establish a theory, to verify an existing statement or to
find relationships between characteristics etc.
3. The scope of investigation has to be made clear. It refers to the area to be covered,
identification of units to be studied, nature of characteristics to be observed,
accuracy of measurements, analytical methods, time, cost and other resources.
4. Whether to use data collected from primary or secondary source should be
determined in advance.
5. The organization of investigation is the final step in the process. It encompasses
the determination of number of investigators required, their training, supervision
work needed, funds required etc.

B. Execution of Statistical Survey

Control methods should be adopted at every stage of carrying out the
investigation to check the accuracy, coverage, methods of measurement, analysis
and interpretation.

The collected data should be edited, classified, tabulated, presented in diagrams
and graphs, analyzed and interpreted.

Learning Objective 2

Know the Basic Terms used in Statistical Survey

a. In a statistical survey the objects on which characteristics are measured are
called units or individuals.

b. The totality of all such objects in a survey is called the population or universe.

c. If the number of objects in a population is finite then it is called a finite
population, otherwise it is known as an infinite population.

d. A sample is a part of the population.

e. A characteristic which is numerically measurable is called a Quantitative
characteristic.

f. A characteristic which is not numerically measurable is called a Qualitative
characteristic.

g. For example, consider a survey of the average number of children below 16
years in a ward of a municipality. The number of houses in the ward is finite.
Therefore the population is finite. The objects are households. The
characteristic measured is the number of children below 16 years in a household.
It is measurable and hence quantitative. On the other hand, suppose the survey is to
find the number of blind people in a locality. The population is finite, the objects
are individuals and the characteristic is blindness, which is qualitative.
h. In a population some characteristics remain the same for all units and some
others vary from unit to unit. The quantitative characteristic that varies from
unit to unit is called a Variable. The qualitative characteristic that varies from
unit to unit is called an Attribute. A variable that assumes only some specified
values in a given range is known as Discrete Variable. A variable that assumes
all the values in the range is known as Continuous Variable.


Discrete Variable: i) Number of children per family

ii) Number of petals in a flower

Continuous Variable: i) Height of persons

ii) Weight of persons

Learning Objective 3

Learn about the Methods of Collecting Data

a. Collection of data is the first and most important stage in any statistical
survey.

b. The method for collection of data depends upon various considerations
such as objective, scope, nature of investigation and availability of resources.

c. Data collected for the first time keeping in view the objective of the
survey are known as primary data. They are likely to be more reliable. However, the
cost of collecting such data is much higher.

d. Collection of primary data can be done by any one of the following
methods:
i. Direct personal observation

ii. Indirect oral interview

iii. Information through agencies

iv. Information through mailed questionnaires

v. Information through schedule filled by investigators

e. In Direct personal observation the investigator collects data by having
direct contact with the units of investigation. The accuracy of data depends upon
the ability, training and attitude of the investigator. This method is suitable
where i) the scope of investigation is narrow, ii) investigation requires
personal attention of the investigator, iii) investigation is confidential and iv)
accuracy of data is important.

Advantages are: i) the data are original, ii) the data are more accurate and reliable, iii)
satisfactory information can be extracted by the investigator through indirect
questions, iv) data are homogeneous and comparable, v) additional information
can be gathered and vi) misinterpretation of questions can be avoided.
However, it consumes more time and cost.

f. Indirect oral interview is used when the area to be covered is large. The data
are collected from a third party, a witness or the head of an institution. This method is
generally used by the police department.

Advantages are i) it is economical in terms of time, cost and man power, ii)
confidential information can be collected, iii) information is likely to be
unbiased and reliable. However, the degree of accuracy of information is less.

g. The method of collecting information through local agencies or
correspondents is generally adopted by newspapers and T.V. Local agents are
appointed in different parts of the area under investigation. They send the
desired information at regular intervals.

It is used where the area to be covered is very large and periodic information
is required. The information is likely to be affected by the bias of the
correspondents or agencies.

h. Very often information is collected through Questionnaires. A
questionnaire consists of a list of questions pertaining to the investigation. Questionnaires
are sent to the respondents with a covering letter soliciting cooperation by
giving correct information and mailing it back. The objectives of investigation
are explained in the covering letter together with an assurance that the
information provided will be kept confidential.

This method is generally adopted by research workers and other official and non-
official agencies. It covers large area of investigation. It is more economical and
free from investigator’s bias. However it results in many “non-response”
situations. The respondent may be illiterate. They can provide wrong information
due to wrong interpretation of questions.

i. Success of the Questionnaire method of collection of data depends mainly on
proper drafting of the questionnaire. The following general principles are considered.

i. The number of questions should be less.

ii. Lengthy questions should be avoided.

iii. Answers to them should be short.

iv. Questions regarding personal matters should be avoided.

v. It should be unambiguous.

vi. There should not be any scope for misinterpretation.

vii. Questions should be arranged in a logical sequence.

viii. A covering letter should accompany the questionnaire.

j. Information can be collected through schedules filled by the investigator
through personal contact. In order to get reliable information, the investigator
should be well trained, tactful, unbiased and hard working.

It is suitable for an extensive area of investigation through the investigator’s personal
contact. The problem of non-response is minimized.

k. Information that is used for the investigation of the current problem but was
obtained from data collected and used earlier by some other agency or person
for its own investigation is known as secondary data.

They are available in published or unpublished form. In published form they
are available in research papers, newspapers, magazines, government
publications, international publications, websites etc. They were collected for a
different purpose. Therefore care should be exercised while making use of them.
Their accuracy, reliability, objectives and scope should be examined
thoroughly before use.

l. Primary data are collected by census method. In other words information
with respect to each and every individual of the population is observed.
Whereas secondary data may be collected either by census or sampling
methods.

m. Pilot survey: It is a small trial survey undertaken before the main survey. It
gives a measure of the efficiency of the Questionnaire. It reduces the
inconveniences and loss of information. It helps us to introduce necessary
changes.

Before using the data collected it should be checked for its completeness,
accuracy and reliability. By complete we mean that all the required information
should be available.


A Statistical survey is a search for knowledge. There are two main stages in any
statistical survey, namely, planning and execution. Planning encompasses i)
nature of problem, ii) the objectives, iii) the scope, iv) statistical units, v) degree
of accuracy, vi) period, vii) source of information and viii) organization.


Collected data in the raw form would be voluminous and incomprehensible. Therefore
they should be condensed and simplified for better understanding and usefulness.
Classification is the first stage in simplification.

It can be defined as a systematic grouping of the units according to their common
characteristics.

Each of the groups is called a class. For example, in a survey of industrial workers of a
particular industry, workers can be classified as unskilled, semi-skilled and skilled, each
of which forms a class.

Learning Objective 1

Understand Classification, Tabulation and Presentation of Data

Functions of classification

The functions of classification are:-

a. It reduces the bulk of the data.

b. It simplifies the data and makes the data more comprehensible.

c. It facilitates comparison of characteristics.

d. It renders the data ready for any statistical analysis.

Requisites of a good classification

Requisites of good classification are

i. Unambiguous: It should not lead to any confusion

ii. Exhaustive: every unit should be allotted to one and only one class

iii. Mutually exclusive: There should not be any overlapping.

iv. Flexibility: It should be capable of being adjusted to changing situation.

v. Suitability: It should be suitable to objectives of survey.

vi. Stability: It should remain stable throughout the investigation.

vii. Homogeneity: Similar units are placed in the same class.

viii. Revealing: Should bring out essential features of the collected data.

Types of classification

The very important types are:-

1. Geographical classification: Data are classified according to region.

2. Chronological classification: Data are classified according to the time of
occurrence.
3. Conditional classification: Data are classified according to certain conditions.
4. Qualitative classification: Classification of data that are non- measurable. E.g.
Sex of a person, marital status, colour etc.
5. Quantitative classification: Classification of data that are measurable either in
discrete or continuous form.
6. Statistical Series: Data arranged logically according to size or time of occurrence
or some other measurable or non-measurable characteristics.

Methods of Classification

i. Classification done according to a single attribute or variable is known as one-way
classification.

ii. Classification done according to two attributes or variables is known as two-way
classification.

iii. Classification done according to more than two attributes or variables is known as
manifold classification.

iv. Examples:
1. One-way classification

No. of students who secured more than 60 % in various sections of same course

2. Two – way classification

Classification of students according to sex who secured more than 60 %

3. Manifold classification.
Classification of employees according to skill, sex and education.

Note: G: Graduate

NG: Non-Graduate
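
The three methods of classification can be sketched in a few lines of Python. This is only an illustration: the employee records below are hypothetical and are not the figures of the original example tables. Each record holds skill, sex and education (G or NG), and the same records are counted by one, two and three attributes.

```python
from collections import Counter

# Hypothetical records (skill, sex, education); not taken from the text.
employees = [
    ("skilled", "M", "G"),
    ("skilled", "F", "G"),
    ("skilled", "M", "NG"),
    ("skilled", "F", "NG"),
    ("semi-skilled", "F", "NG"),
    ("semi-skilled", "M", "NG"),
    ("unskilled", "F", "NG"),
    ("unskilled", "M", "G"),
]

# One-way classification: a single attribute (skill).
one_way = Counter(skill for skill, _, _ in employees)

# Two-way classification: two attributes (skill and sex).
two_way = Counter((skill, sex) for skill, sex, _ in employees)

# Manifold classification: more than two attributes.
manifold = Counter(employees)

for label, counts in [("one-way", one_way), ("two-way", two_way), ("manifold", manifold)]:
    print(label, dict(counts))
```

Every record falls into exactly one class at each level, so each set of counts adds up to the number of employees.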


a. Tabulation follows classification. It is a logical listing of related data in rows and
columns.
b. Objectives of tabulation are:-

i. To simplify complex data

ii. To highlight important characteristics

iii. To present data in minimum space

iv. To facilitate comparison

v. To bring out trends and tendencies

vi. To facilitate further analysis

Basic differences between classification and tabulation.

In spite of the fact that they are closely related, the differences are as follows.

Learning Objective 2

Know about the Usefulness of Table as a Mode of Data Presentation

Parts of a Table.

i. Table number: Identifies the table for reference.

ii. Title: It indicates the scope and the nature of contents in concise form.

iii. Captions: They are the headings and subheading of columns.

iv. Stubs: They are the headings and subheadings of rows.

v. Body of the table: It contains numerical information

vi. Ruling and Spacing: They separate columns and rows. However totals are
separated from main body by thick lines.

vii. Head Note: It is given below the title of the table to indicate the units of
measurement of the data and enclosed in brackets.

viii. Source Note: It indicates the source from which data is taken.

Types of Table
Tables are classified on the basis of

a. Purpose of investigation: Consists of two types.

i. General purpose table, also known as reference table. It facilitates easy
reference to the collected data. Such tables are formed without a specific objective, but
can be used for any specific purpose. They contain a large mass of data. Example:

ii. Specific purpose table, also called text table or summary table, deals with specific
problems. Such tables are smaller in size and they highlight relationships between
characteristics. Example: Cost of living indices.

b. The nature of presented figures: Consists of two types:

i. Primary Table: It contains data in the form in which they were originally collected.
Ref table No.1.

ii. Derived Table: It presents figures like totals, averages, ratios etc. derived
from original data. Ref : table – 2

Table – 1

Distribution of Employees according to Age and Educational Level in various Departments

                        Age 20 – 40                      Age 40 and Above
Departments    Under      Graduate   Post        Under      Graduate   Post        Total
               graduate              graduate    graduate              graduate
Accounts       10         40         10          10         15         5           90
Finance        10         30         10          12         14         7           83
Personal       15         25         10          10         14         5           79
Production     10         30         10          8          12         6           76
Marketing      5          25         10          0          15         7           62
Total          50         150        50          40         70         30          390

Table – 2

% of P.G. Employees in Age group and Department-wise

Departments    20 – 40    40 & above
Accounts       2.564      1.282
Finance        2.564      1.795
Personal       3.846      1.282
Production     2.564      2.051
Marketing      1.282      1.795
Total          12.820     8.205
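
As a further illustration of a derived table, the sketch below computes each department's percentage share of all employees from the row totals of Table 1. The choice of this particular ratio and the variable names are ours, made only for illustration.

```python
# Row totals per department, copied from the last column of Table 1.
department_totals = {
    "Accounts": 90,
    "Finance": 83,
    "Personal": 79,
    "Production": 76,
    "Marketing": 62,
}

grand_total = sum(department_totals.values())  # 390, as in Table 1

# Derived figures: percentage share of each department, to 3 decimals.
share_percent = {
    dept: round(100 * total / grand_total, 3)
    for dept, total in department_totals.items()
}

for dept, pct in share_percent.items():
    print(f"{dept:<12}{pct:>8.3f}")
```

The primary table holds the counts as collected; the derived table holds only figures computed from them, here ratios expressed as percentages.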

c. Construction: Different types are

i. Simple table: Presents only one characteristic. Ref table – 3

ii. Complex table: Presents Two or more characteristics. Ref table 4

iii. The cross – classified Table: entries are classified in both directions. Ref table 5

The following are examples for the above

i. Simple Table

Table No.3

Defectives produced by Batches

Batches No. of defectives

1 15
2 20
3 40
4 50

ii. Complex Table

Table No.4
Distribution of Defectives according to Batch and Nature of defective

Batch    Major    Minor
I        8        7
II       15       5
III      25       15
Total    48       27

iii. The cross – classified Table

Table No.5

Population of a city according to age, sex and education during 2003 to 2005

Years   Sex   Educated                                     Not educated
              Below 20 yrs   20 – 40   Above 40   Total    Below 20 yrs   20 – 40   Above 40   Total

Learning Objective 3

Learn about Frequency and Frequency Distribution

Frequency and Frequency Distribution

a. The number of units associated with each value of the variable is called
frequency of that value. Suppose the variable takes the value 15 and the value 15
occurs 3 times then 3 is called the frequency of the value 15.

b. A systematic presentation of the values taken by a variable together with the
corresponding frequencies is called a Frequency Distribution of the variable. It is
presented in tabular form, called a Frequency Table. If class intervals are not
present, then it is called a discrete frequency distribution. Ref table – 6. A frequency
distribution formed with class-intervals is called a continuous frequency
distribution. Ref table – 7

c. A continuous frequency distribution is divided into mutually exclusive sub-ranges
called class-intervals. Class intervals have lower and upper limits known as the
lower class limit and the upper class limit. The difference between the upper class limit
and the lower class limit is termed the class width. The middle value of a class interval is
called the mid-value of the class. It is the average of the class limits.

Table – 6

Eg. i. Discrete frequency distribution

Frequency distribution of numbers of children

Number of Children No. of families

0 15
1 20
2 22
3 16
4 7
Total 80

Table – 7

Eg. ii) Continuous Frequency Distribution

Frequency Distribution of Marks

Marks No. of Students

0 – 20 15
20 – 40 20
40 – 60 28
60 – 80 22
80 – 100 15
Total 100

Note: For the class 10 – 20

10 is the lower class limit

20 is the upper class limit

20 – 10 = 10 is the width of the class

d. There are Two types of class – intervals

The class interval that does not include upper class limit is called Exclusive type of class
interval. The class-interval that includes the upper class limits is called Inclusive – type
of class interval.

Inclusive Type

0 – 9 15
10 –

The class 0 – 9 includes the value “9”.

Exclusive Type

The class 0 – 10 does not include the value 10. If the value of 10 occurs, it is included in
the class 10 – 20.
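
The difference between the two types can be sketched in Python. This is a minimal illustration assuming classes of width 10 starting at 0 and, for the inclusive type, whole-number data; the function names are ours.

```python
def exclusive_class(value, width=10):
    """Exclusive type: lower <= value < upper, e.g. 0-10, 10-20, ..."""
    lower = (value // width) * width
    return (lower, lower + width)


def inclusive_class(value, width=10):
    """Inclusive type for whole numbers: 0-9, 10-19, ... (upper limit included)."""
    lower = (value // width) * width
    return (lower, lower + width - 1)


# The value 10 is excluded from 0-10 and counted in 10-20.
print(exclusive_class(10))
# The value 9 is included in the class 0-9.
print(inclusive_class(9))
```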

e. From a given frequency distribution we can form five derived frequency distributions.
They are i) Relative frequency distribution, ii) Percentage frequency distribution, iii)
Frequency density distribution, iv) Less – than cumulative frequency distribution, v)
More – than cumulative frequency distribution.

If “f” is the class frequency and “N” is the total frequency, the relative frequency
distribution is formed by calculating f/N. Total will always be one.

The percentage frequency distribution is formed by multiplying the ratio f/N by 100.

If “c” is the width of the class-interval and “f” is the frequency of the class, then
frequency density distribution is formed by calculating f/c.

The less than cumulative frequency distribution is formed with number of observations
which are less than a given value.

The more – than cumulative distribution is formed with number of observations which
are more than a given value.

Example. Consider the frequency distribution of marks given in Table 7 of 3.4.2.
The derived frequency distributions are as follows.

Table – 8

Marks       Relative freq    Density         Percentage
            distribution     distribution    distribution
0 – 20      0.15             0.75            15
20 – 40     0.20             1.00            20
40 – 60     0.28             1.40            28
60 – 80     0.22             1.10            22
80 – 100    0.15             0.75            15
Total       1.00             -               100 %

Table 8 (a)

Less – than cumulative frequency distribution

Marks less than Less than cumulative frequency

0 0
20 15
40 35
60 63
80 85
100 100

Table 8 (b)

More – than cumulative frequency distribution

Marks more than More than cumulative frequency

0 100
20 85
40 65
60 37
80 15
100 0
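
The five derived distributions can be reproduced with a short Python sketch. It assumes the marks distribution of Table 7, with the last class taken as 80 – 100 and frequency 15, which is what the total of 100 implies. The variable names are ours.

```python
from itertools import accumulate

# Class limits and frequencies of the marks distribution (Table 7).
classes = [(0, 20), (20, 40), (40, 60), (60, 80), (80, 100)]
frequencies = [15, 20, 28, 22, 15]
N = sum(frequencies)  # total frequency

relative = [f / N for f in frequencies]          # f / N
percentage = [100 * f / N for f in frequencies]  # (f / N) x 100
density = [f / (upper - lower)                   # f / c, c = class width
           for f, (lower, upper) in zip(frequencies, classes)]

# Less-than cumulative frequencies at the points 0, 20, 40, 60, 80, 100.
less_than = [0] + list(accumulate(frequencies))
# More-than cumulative frequencies at the same points.
more_than = [N - c for c in less_than]

print(relative)
print(percentage)
print(density)
print(less_than)
print(more_than)
```

At every point the less-than and more-than cumulative frequencies add up to N, which is a quick check on both tables.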

i. Bivariate and Multi-Variate frequency distribution.

Frequency distribution of more than two variables is known as multi – variate frequency
distribution. If the number of variables is only two then it is called bivariate frequency
distribution. A bivariate frequency distribution will have two marginal distributions and
m + n conditional distributions, where m and n are the numbers of classes of the two variables.

Table 9

Eg: Distribution of Age and Salary

Age in years Salary / Month (Rs.)

9,000 – 12,000 12,000 – 15,000 15,000 – 18,000 Total
20 – 30 10 3 - 13
30 – 40 8 12 3 23
40 – 50 6 15 10 31
50 – 60 - 3 18 21
Total 24 33 31 88

The numbers in the last column (row totals) represent the marginal distribution of Age, and
the numbers in the last row (column totals) represent the marginal distribution of Salary.
Each column in the body of the table represents the conditional distribution of Age for a given salary.

Similarly, each row represents the conditional distribution of Salary for a given age.

There are 4 (m) rows and 3 (n) columns.

We have 4+3=7 conditional distributions.
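
The marginal and conditional distributions of Table 9 can be read off mechanically, as in the sketch below. It assumes that the dashes in Table 9 stand for zero frequencies; the variable names are ours.

```python
age_classes = ["20-30", "30-40", "40-50", "50-60"]
salary_classes = ["9,000-12,000", "12,000-15,000", "15,000-18,000"]

# Body of Table 9: rows are age classes, columns are salary classes.
# Dashes in the table are taken as 0 (an assumption).
table = [
    [10, 3, 0],
    [8, 12, 3],
    [6, 15, 10],
    [0, 3, 18],
]

# Marginal distributions: row totals (Age) and column totals (Salary).
marginal_age = [sum(row) for row in table]
marginal_salary = [sum(col) for col in zip(*table)]

# Conditional distributions: one per row and one per column.
salary_given_age = dict(zip(age_classes, table))
age_given_salary = {s: list(col) for s, col in zip(salary_classes, zip(*table))}

n_conditional = len(salary_given_age) + len(age_given_salary)  # m + n

print(marginal_age, marginal_salary, n_conditional)
```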

f. Construction of Frequency Distribution. The steps followed are:-

i. Determine the range = Highest value – Lowest value

ii. The number of class intervals is given by Sturges' Rule, viz. K = 1 + 3.322 log N, where N is
the total number of observations and the logarithm is taken to base 10.

iii. The width of the class-interval is given by Range / K.

Note: In Practice divide the range either by 2 or 5 or 10 or multiples of 10 such that the
number of class intervals will be between 7 and 15. Avoid open-end class interval. Make
sure that class-intervals do not overlap. Tally marks are used to construct frequency
Table. Tally Mark is a small vertical line drawn against a class as soon as we observe a
value belonging to the class. The fifth tally mark is crossed for easy counting purposes.

A class interval that does not prescribe a lower limit for the first class or an upper limit
for the last class is known as an open-end class interval.
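
The construction steps can be sketched in Python. This is an illustration under stated assumptions: the 16 marks are hypothetical, Sturges' rule is used with its usual constant 3.322 and base-10 logarithm, the width is taken as range divided by the number of classes and then rounded to a convenient value, and the classes are of the exclusive type.

```python
import math


def sturges_classes(n):
    """Number of class intervals suggested by Sturges' rule."""
    return round(1 + 3.322 * math.log10(n))


def frequency_table(values, width, start=0):
    """Tally values into exclusive classes [lower, lower + width)."""
    counts = {}
    for v in values:
        lower = start + ((v - start) // width) * width
        key = (lower, lower + width)
        counts[key] = counts.get(key, 0) + 1  # one tally mark per value
    return dict(sorted(counts.items()))


# Hypothetical marks of 16 students.
marks = [12, 25, 37, 41, 48, 52, 55, 63, 67, 71, 78, 84, 90, 33, 46, 59]

k = sturges_classes(len(marks))          # suggested number of classes
value_range = max(marks) - min(marks)    # highest value - lowest value
raw_width = value_range / k              # range / K

# In practice the width is rounded to a convenient figure; 20 is used here.
table = frequency_table(marks, width=20)

print(k, value_range, raw_width)
for (lower, upper), f in table.items():
    print(f"{lower} - {upper}: {f}")
```

Rounding the width up to 20 gives five non-overlapping classes from 0 to 100 with no open-end class.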

Learning Objective 4

Know Different Ways to Present Data

a. Top management and the common man do not have time to go through mass data and
understand its nature. For them diagrammatic and graphical presentations are more
intelligible, attractive and appealing. They give a bird’s eye-view of the data. They
facilitate comparison of various aspects of data. They create lasting impressions.
However, they cannot be considered as alternatives to numerical data: mathematical
calculations are not possible on them and they do not give accurate values.

b. Diagrams – Diagrams may be one-dimensional or two-dimensional. In one
dimension we have Bar Diagrams. In two dimensions we have the pie diagram. The different
bar diagrams are the simple bar diagram, component (sub-divided) bar diagram,
percentage bar diagram etc.

Example: Represent the following data by a suitable Diagram

Table 10

Composition of MBA Students according to their Graduation course

Solution – Component bar diagram showing students’ composition.
Note: It is easier to draw the diagram if we first find the cumulative total for each section.

a. Multiple Bar Diagram

They are drawn when we have two or more sets of comparable values.


Simple Bar Diagram: It is drawn when items are to be compared with respect to a single
characteristic. A rectangular bar is constructed with height proportional to the magnitude
of the items.


Represent the following data regarding the yield / acre of paddy in Karnataka over the
last five years.

Year 2001 2002 2003 2004 2005

Yield 20 22 25 27 30

Solution: Simple Bar Diagram showing Yield of Paddy in Karnataka

Component (sub-divided) Bar Diagram: They are used when two or more
characteristics are observed on a unit. Each Bar is proportionally subdivided.


Represent the following data by Multiple Bars

Product A

Year      Cost of Manufacturing / Unit    Revenue / Unit

2002 –    40                              70
2003 –    45                              85
2004 –    55                              90

Multiple Bars showing Cost & Revenue per Unit

a. Component Pie Diagram: It is drawn when data have magnitudes for two or more
components. Circles with area proportional to magnitudes are drawn to represent the total
magnitude. Then the circles are divided sector-wise according to the magnitudes of the
components.

If T is the total magnitude and R is the magnitude of a component, then the angle at the
centre is given by (R / T) × 360°.


Draw pie – diagram for the following data regarding expenses of two families.

Monthly Expenses of
Family A Family B
Food 2000 4000
Rent 1000 1500
Fuel 500 1000
Misc 500 1500
Total 4000 8000

We draw two circles with radii 1.3 cm and 1.8 cm (approximately the square roots of the
totals on a scale of 1 cm = 50 units, so that the areas are proportional to the totals). The
angles at the centre are determined as follows.

Items    Angle at the centre (in degrees)

         Family A    Family B
Food     180         180
Rent     90          67.5
Fuel     45          45
Misc     45          67.5
Total    360         360
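
The angles and radii for the two families can be checked with a short Python sketch. The angle of each sector is (R / T) × 360 degrees. The radius is taken as the square root of the total divided by 50, which is our reading of the scale "1 cm = 50 units" used in the example.

```python
import math

# Monthly expenses of the two families, from the example.
expenses = {
    "Family A": {"Food": 2000, "Rent": 1000, "Fuel": 500, "Misc": 500},
    "Family B": {"Food": 4000, "Rent": 1500, "Fuel": 1000, "Misc": 1500},
}


def sector_angles(components):
    """Angle at the centre for each component: (R / T) * 360 degrees."""
    total = sum(components.values())
    return {name: 360 * value / total for name, value in components.items()}


def radius_cm(total, units_per_cm=50):
    """Radius with area proportional to the total (assumed scale)."""
    return math.sqrt(total) / units_per_cm


for family, components in expenses.items():
    total = sum(components.values())
    print(family, sector_angles(components), round(radius_cm(total), 1))
```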

Graphical Presentation

Most often used graphs for frequency distribution are.

i. Histogram

ii. Frequency polygon

iii. Frequency curve

iv. Ogives [cumulative frequency curves]

Histogram: The frequency distribution is represented by a set of rectangular bars with
area proportional to class frequency.

If the class intervals have equal width then the variable is taken along X-axis and
frequency along Y-axis and a rectangle is constructed.
Example: For the following distribution of Age the histogram is drawn as follows.

We join the upper left corner of highest rectangle to the right adjacent rectangle’s left
corner and right upper corner of highest rectangle to left adjacent rectangle’s right corner.
From the intersecting point of these lines we draw a perpendicular to the X-axis. The X-
reading at that point gives the mode of the distribution.

If the widths of the rectangles are not equal then we make areas of rectangles
proportional and draw the histogram as follows

Example: ii. Suppose we have the following frequency distribution.

Frequency Polygon: The mid values of class-intervals are plotted against frequency of
the class interval. These points are joined by straight lines.

Example: Consider example (i) under 3.6.1

Frequency Curve: First we draw histogram
for the given data. Then join the mid points of the rectangles by a smooth curve. Total
area under frequency curve represents total frequency. They are the most useful form of
frequency distribution.

Example: Consider example i under 3.6.1


1. Less than-ogive: Variables are taken along X-axis and less than cumulative
frequencies are taken along Y-axis. Less than cumulative frequencies are plotted
against upper limit of class interval and joined by a smooth-curve.
2. More than Ogive: More than cumulative frequencies are plotted against lower
limit of the class-interval and joined by a smooth-curve.

From the meeting point of these two ogives if we draw a perpendicular to X-axis,
the point where it meets X-axis gives Median of the distribution.

Example: Draw ogive for the following data and hence determine Median

Wage Distribution of Workers

Wage / day   No. of workers   Less than   Cumulative frequency   Greater than   Cumulative frequency
0 – 10       5                10          5                      0              50
10 – 20      10               20          15                     10             45
20 – 30      20               30          35                     20             35
30 – 40      12               40          47                     30             15
40 – 50      3                50          50                     40             3
Total        50                                                  50             0

Note: With the help of ogives we can find all positional values of a distribution. They
give at a glance the percentage of readings that lie below or above a specified value.

For better understanding and usefulness, the collected data is classified in a
systematic manner according to common characteristics. Classification simplifies
the data, makes it more comprehensible and renders it ready for statistical analysis.

Classified data is tabulated in rows and columns for presentation, using various
types of classification. Frequency distribution is a special type of tabulation. It
brings out the salient features of the distribution in a more concise form.

Data presented in diagram or graph form is more appealing and gives busy executives
a rough idea of the situation.


Mass data, which is collected, classified, tabulated and presented systematically, is
analyzed further to reduce it to a single representative figure.

The tendency of data to cluster around a figure is known as central tendency. Measures of
central tendency, or averages of the first order, describe the concentration of large numbers
of values around a single value. Such a measure is a single value which represents all units.

Learning Objective 1
Understand the concepts of Central tendency

Objectives of Statistical Averages

i. To present mass data in a concise form

ii. To facilitate comparison

iii. To establish relationship between sets

iv. To provide basis for decision-making

Requisites of a Good Average

i. It should be simple to calculate and easy to understand

ii. It should be based on all values.

iii. It should not be affected by extreme values

iv. It should not be affected by sampling fluctuation

v. It should be rigidly defined

vi. It should be capable of further algebraic treatment

Statistical Averages

The commonly used statistical averages are:-

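a. Arithmetic Mean is defined as the sum of all values divided by the number of
values and is represented by X̄.

i. For discrete data

X̄ = ΣX / n

Example: Arithmetic mean of 15, 17, 22, 21, 19, 26, 20 is given by

X̄ = (15 + 17 + 22 + 21 + 19 + 26 + 20) / 7 = 140 / 7 = 20

ii. For discrete data with frequency it is given by

X̄ = Σfx / Σf

Example:

Students Age: xi: 20 23 25 28 30

No. of students: fi: 3 5 10 6 1

X̄ = (60 + 115 + 250 + 168 + 30) / 25 = 623 / 25 = 24.92

iii. For continuous distribution X̄ is given by

X̄ = A + (Σfd / Σf) × C.I

Where A = Assumed Mean

d = (X – Assumed Mean) / width of class interval

C.I – width of class-interval

X – Mid value of the class

Example:

Height in cms X: 140-150 150-160 160-170 170-180

No. of students: 50 65 80 55

Solution: The mid values are 145, 155, 165, 175. Taking A = 155 and C.I = 10, the values
of d are –1, 0, 1, 2, so Σf = 250 and Σfd = –50 + 0 + 80 + 110 = 140.

X̄ = 155 + (140 / 250) × 10

= 155 + 5.6 = 160.6 cm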
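
The three ways of computing the arithmetic mean can be sketched in Python as follows. The function names are illustrative only; the data are the three examples of this section, and the assumed mean 155 is the one used in the worked solution.

```python
def mean_ungrouped(values):
    """Arithmetic mean of raw values: sum of values / number of values."""
    return sum(values) / len(values)


def mean_frequency(xs, fs):
    """Arithmetic mean of a discrete series with frequencies: sum(f*x) / sum(f)."""
    return sum(f * x for x, f in zip(xs, fs)) / sum(fs)


def mean_step_deviation(mids, fs, assumed_mean, width):
    """Mean of a continuous series by the assumed-mean (step deviation) method."""
    n = sum(fs)
    # d = (X - A) / C.I for each class mid value X
    sum_fd = sum(f * (x - assumed_mean) / width for x, f in zip(mids, fs))
    return assumed_mean + sum_fd / n * width


print(mean_ungrouped([15, 17, 22, 21, 19, 26, 20]))
print(mean_frequency([20, 23, 25, 28, 30], [3, 5, 10, 6, 1]))
print(mean_step_deviation([145, 155, 165, 175], [50, 65, 80, 55], 155, 10))
```

Any class mid value can serve as the assumed mean; the result does not depend on that choice.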

Properties of Arithmetic Mean

i. Algebraic sum of deviations of a set of values taken from their Mean is always Zero,

i.e. Σ(X – X̄) = 0

ii. Sum of squares of deviations of a set of values from their mean is always minimum,

i.e. Σ(X – X̄)² is always minimum.

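iii. It is capable of further algebraic treatment. If X̄1, X̄2, ….. X̄k are the means
of k sets of values of sizes n1, n2, ……. nk, then their combined arithmetic mean is given by

X̄ = (n1X̄1 + n2X̄2 + ……. + nkX̄k) / (n1 + n2 + ……. + nk)

Example: The average height of 30 men is 158 cm and the average height of another group of
40 men is 162 cm. Find the average height of the combined group.

Given n1 = 30 X̄1 = 158

n2 = 40 X̄2 = 162

X̄ = (30 × 158 + 40 × 162) / (30 + 40) = 11220 / 70 = 160.28 (approximately)

Note: In the above example, given any 4 of the 5 values n1, n2, X̄1, X̄2 and X̄, we can
find the fifth value.

Example: Suppose n1 = 30, X̄1 = ?

n2 = 40, X̄2 = 162

X̄ = 160.28

Then 160.28 = (30 X̄1 + 40 × 162) / 70

30 X̄1 = 70 × 160.28 – 6480 = 4739.6

X̄1 = 4739.6 / 30 = 157.98, i.e. approximately 158 cm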

Example: In an office there are 84 employees. The distribution of their salaries is as follows.

Salary Rs          2430 2590 2870 3390 4720 5160

No. of employees   4    28   31   16   3    2

i. Find the mean salary of the employees

ii. Find the total salary paid by the office.

Solution:

Salary (Rs)   Employees
(X)           (f)          fX
2430          4            9720
2590          28           72520
2870          31           88970
3390          16           54240
4720          3            14160
5160          2            10320
Total         84           249930

i. Mean = Σfx / Σf = 249930 / 84 = 2975.36

ii. Total salary paid by the office = ∑ fx = Rs.2,49,930

Example: The following data is related to the marks scored by students of a class in an
examination. Calculate the mean.

Marks (less than)   10   20   30   40   50   60   70
No. of students     4    16   20   65   85   97   100

Solution: Since we have cumulative frequency distribution, we convert it to frequency

distribution as follows.
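
The conversion from cumulative frequencies to class frequencies, and the mean that follows from it, can be sketched in Python. This sketch assumes classes of width 10 starting at 0 (0-10, 10-20, and so on), which is how the "less than" limits read; the variable names are illustrative.

```python
upper_limits = [10, 20, 30, 40, 50, 60, 70]
less_than_cf = [4, 16, 20, 65, 85, 97, 100]
width = 10  # assumed class width

# Each class frequency is the difference between successive cumulative frequencies.
freqs = [cf - prev for cf, prev in zip(less_than_cf, [0] + less_than_cf[:-1])]

# Mid value of each class = upper limit - half the class width.
mids = [upper - width / 2 for upper in upper_limits]

mean_marks = sum(f * x for x, f in zip(mids, freqs)) / sum(freqs)
print(freqs)
print(mean_marks)
```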

Example 8: Average weight of 100 screws in box “A” is 10.4 gms. It is mixed with 150
screws of box “B”. Average weight of mixed screws is 10.9 gms. Find the average weight
of screws of box “B”

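Given X̄1 = 10.4, n1 = 100, X̄ = 10.9

X̄2 = ?, n2 = 150

We know

X̄ = (n1X̄1 + n2X̄2) / (n1 + n2)

10.9 = (100 × 10.4 + 150 X̄2) / 250

Solving we get,

X̄2 = (2725 – 1040) / 150 = 1685 / 150 = 11.23 gms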

Example: Find the missing frequency for the following distribution given the mean value
as 129.

Class interval   80 – 100   100 – 120   120 – 140   140 – 160   160 – 180   Total
Frequency        8          -           26          14          10          80

Solution: Let the missing frequency be f then

Mid X    f        fx
90       8        720
110      f        110f
130      26       3380
150      14       2100
170      10       1700
Total    58+f     7900+110f

Mean = (7900 + 110f) / (58 + f) = 129

7900 + 110f = 7482 + 129f

19f = 418, so f = 22

Missing frequency is 22

Merits of Arithmetic Mean

i. It is simple to calculate and easy to understand

ii. It is based on all values

iii. It is rigidly defined

iv. It is capable of further algebraic treatment.

v. It is more stable.

Demerits of Arithmetic Mean

1. It is affected by extreme values

2. It can not be determined for distributions with open-end class intervals.
3. It can not be graphically located.
4. Sometimes it is a value which is not in the series.

Learning Objective 2

Know the Different Measures Available for Computation


Median of a set of values is the middle-most value when they are arranged in
ascending order of magnitude and is denoted by M.
In case of discrete series with or without frequency it is given by M = ((N + 1)/2)th value

Note: To solve problems on Median, (1) arrange the data in ascending or descending order
and (2) make the class intervals of exclusive type.

Example: Find the median value of the following

Set values 45, 32, 31, 46, 40, 28, 27, 37, 36, 41, 47, 50

Solution: Arranging in ascending order we have

27, 28, 31, 32, 36, 37, 40, 41, 45, 46, 47, 50

n = 12 ∴ Median = ((12 + 1)/2)th value = 6.5th value

M = 37 + 0.5 (40-37) = 37 + 1.5 = 38.5

Example: Find the Median value of x series

X: 12, 16, 10, 14, 17, 20, 15

f: 4, 9, 3, 5, 4, 2, 10


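n = 37, so the Median is the ((37 + 1)/2)th value = 19th value.

Arranging X in ascending order (10, 12, 14, 15, 16, 17, 20), the cumulative frequencies
are 3, 7, 12, 22, 31, 35, 37. The 19th value therefore falls at X = 15.

M = 15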
In case of continuous series, it is given by

M = Lower limit of Median class + {(N/2 – Cfp) × C.I.}/fc

Cfp = Cumulative frequency upto previous class

fc = frequency of class

C.I = Width of class interval

Example: Find the median weight of following distribution.

Weight kg: 30-35 35-40 40-45 45-50 50-55

Frequency: 10 15 40 27 8

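Solution: Note it is an exclusive type of interval. N/2 = 100/2 = 50

The cumulative frequencies are 10, 25, 65, 92, 100, so the median class is 40-45,
with Cfp = 25, fc = 40 and C.I = 5.

M = 40 + {(50 – 25) × 5}/40 = 40 + 3.125

∴ M = 43.125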
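
The median formula for a continuous series can be sketched in Python as below. The function name grouped_median is illustrative, and the sketch assumes exclusive class intervals of equal width.

```python
def grouped_median(lower_limits, freqs, width):
    """Median = L.L. + (N/2 - Cfp) * C.I. / fc for exclusive, equal-width classes."""
    n = sum(freqs)
    cum = 0  # cumulative frequency up to the previous class (Cfp)
    for lower, f in zip(lower_limits, freqs):
        if cum + f >= n / 2:  # first class whose cumulative frequency reaches N/2
            return lower + (n / 2 - cum) * width / f
        cum += f
    raise ValueError("frequency distribution is empty")


# Weight distribution from the example above
print(grouped_median([30, 35, 40, 45, 50], [10, 15, 40, 27, 8], 5))
```

Applied to the weight distribution, the function locates 40-45 as the median class before interpolating within it.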

Example: Find the median height of the following distribution

Solution: Making the class intervals as exclusive type we have

Class interval    Frequency   Cumulative frequency

144.5 – 149.5     15          15
149.5 – 154.5     22          37
154.5 – 159.5     38          75
159.5 – 164.5     17          92
164.5 – 169.5     8           100

N/2 = 50 ∴ Median class is 154.5 – 159.5

M = 154.5 + {(50 – 37) × 5}/38 = 154.5 + 1.71 = 156.21

Example: Find the missing frequency for the following data given that its median is 34

Solution: Since Median is 34, it falls in the class-interval 30 – 40. Let “f” be the missing
frequency. Therefore we have

Class interval Frequency Cumulative frequency

0 – 10 4 4
10 – 20 9 13
20 – 30 - 13 + f
30 – 40 20 33 + f
40 – 50 18 51 + f
50 – 60 7 58 + f
60 – 70 3 61 + f

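N = 61 + f, and for the median class 30 – 40 we have Cfp = 13 + f, fc = 20 and C.I = 10.

34 = 30 + {((61 + f)/2 – (13 + f)) × 10}/20

8 = (61 + f)/2 – 13 – f

16 = 35 – f

Or, f = 35 – 16 = 19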

∴ Missing frequency is 19

Merits of median

1. It can be easily understood and computed.

2. It is not affected by extreme values.
3. It can be determined Graphically (ogives).
4. It can be used for qualitative data.
5. It can be calculated for distributions with open-end classes.

Demerits of Median

1. It is not based on all values.

2. It is not capable of further algebraic treatment.


Mode is the value which has the highest frequency and is denoted by Z.

Modal value is most useful for business people. For example, shoe and ready-made
garment manufacturers will like to know the modal size of the people to plan their
production.

For discrete data with or without frequency it is the value corresponding to the highest
frequency.

Example: The following data relate to size of shoes. Find the mode.

6, 7, 6, 8, 9, 9, 9, 10, 8, 7, 7, 9, 10, 9, 9, 9, 8, 8, 11

Solution: Arranging the data in ascending order

6, 6, 7, 7, 7, 8, 8, 8, 8, 9, 9, 9, 9, 9, 9, 9, 10, 10, 11

∴ Modal value = 9, which corresponds to the highest frequency 7.

For continuous frequency distribution it is given by

Z = L.L. + {(fm – fp) × C.I.}/(2fm – fp – fs)

where


L.L. = Lower limit of Modal class

fm = frequency of modal class

fp = frequency of previous class

fs = frequency of succeeding class

C.I = Width of class interval

Example: Praveen apartment builders found the number of customers who wish to
have various plinth areas for their apartments to be as follows.
Find the modal plinth area.

Solution: we note that the intervals are exclusive type. Highest frequency is 25.
Therefore corresponding interval is 1200 – 1400, which is called Modal class.


Merits of Mode

1. In many cases it can be found by inspection.

2. It is not affected by extreme values.
3. It can be calculated for distributions with open end classes.
4. It can be located graphically.
5. It can be used for qualitative data.

Demerits of Mode

1. It is not based on all values.

2. It is not capable of further mathematical treatment.
3. It is much affected by sampling fluctuations.

Empirical Relationship between Mean, Median and Mode.

Mean – Mode = 3 (Mean – Median), which is the same as Mode = 3 Median – 2 Mean.

Geometric Mean
The geometric mean of a series of “n” positive numbers is given by

i. In case of Discrete series without frequency

GM = (x1 × x2 × ……. × xn)^(1/n)

ii. In case of Discrete series with frequency

GM = (x1^f1 × x2^f2 × ……. × xn^fn)^(1/n)

Where n = f1 + f2 + ………….. + fn

iii. In case of continuous series

GM = (x1^f1 × x2^f2 × ……. × xn^fn)^(1/n)

Where n = f1 + f2 + …………. + fn and x1, x2, ……. xn are the mid points of class intervals.

It is also given by G.M = Antilog (Σ f log x / n)

Example: The growth in bad-debt expense for Das office supply company over the last
few years is as follows. Calculate the average percentage increase in bad-debt expense
over this time period.


G.M =

= 1.09675

Therefore the average increase is 1.09675 – 1 = 0.09675, i.e. 9.675 %

Whenever data deal with rates, ratios, growth rates, etc., the Geometric mean is the most
appropriate average.

Note: Geometric mean is not defined if even one of the values is zero or negative.

Harmonic Mean

If x1, x2, …………xn are “n” values for discrete series without frequency then their
Harmonic Mean is

H.M = n / Σ(1/xi)

For Discrete series with frequency

H.M = N / Σ(fi/xi), where N = Σfi

and fi are the corresponding frequencies

Example: calculate harmonic mean of 9.7, 9.8, 9.5, 9.4, 9.7

X        1/X
9.7      0.1031
9.8      0.1020
9.5      0.1053
9.4      0.1064
9.7      0.1031
Total    0.5199

∴ HM = 5 /(0.5199) = 9.6172

Example: Find the harmonic mean of the following distribution.

X 121 122 123 124 125

f 5 25 36 37 20


X f f/X
121 5 0.04132
122 25 0.20492
123 36 0.29268
124 37 0.29839
125 20 0.16000
Total 123 0.99731

H.M = 123 / (0.99731) = 123.33

Example : In a locality the distribution of average speed of birds in the evening were
observed to be as follows. Find average speed of birds using harmonic mean.

Class interval   80 – 82   82 – 84   84 – 86   86 – 88
Frequency        5         7         3         2

Solution:

Mid x    f     f/x
81       5     0.06173
83       7     0.08434
85       3     0.03529
87       2     0.02299
Total    17    0.20435

The harmonic mean = 17/(0.20435) = 83.191
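
The harmonic mean calculations can be reproduced with a short Python sketch. The function name harmonic_mean and its optional frequency argument are illustrative; when no frequencies are supplied every value is given frequency 1.

```python
def harmonic_mean(xs, fs=None):
    """Harmonic mean = N / sum(f / x), with N = sum(f)."""
    if fs is None:
        fs = [1] * len(xs)  # discrete series without frequency
    return sum(fs) / sum(f / x for x, f in zip(xs, fs))


hm_plain = harmonic_mean([9.7, 9.8, 9.5, 9.4, 9.7])
hm_freq = harmonic_mean([121, 122, 123, 124, 125], [5, 25, 36, 37, 20])
hm_birds = harmonic_mean([81, 83, 85, 87], [5, 7, 3, 2])  # class mid values
print(round(hm_plain, 2), round(hm_freq, 2), round(hm_birds, 2))
```

Because the code keeps full precision in the reciprocals, its results may differ from the hand-worked figures in the third or fourth decimal place.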

Positional Averages (Quartiles)

Median is the mid value of a series of data. It divides the distribution into two equal
portions. Similarly we can divide a given distribution into four, ten or hundred or any
other number of equal portions.

Quartiles: When distribution is divided into four equal portions, then we get first quartile
(Q1), second Quartile (Q2 = Median) and third quartile (Q3) as the positional averages.

For discrete series with or without frequency Q1 is given by the ((N + 1)/4)th value and
Q3 is given by the (3(N + 1)/4)th value.

For continuous distribution Q1 and Q3 are given by

Q1 = L.L. + {(N/4 – Cfp) × C.I.}/fc

Q3 = L.L. + {(3N/4 – Cfp) × C.I.}/fc

Example : Weekly sales of a product in 8 different shops are as follows. Calculate the
quartiles.

Sales in units: 309, 312, 305, 307, 310, 308, 308, 306

Solution:

Arranging the data in ascending order, we have

305, 306, 307, 308, 308, 309, 310, 312

N = 8, so (N + 1)/4 = 2.25

Q1 = 2.25th value = 2nd value + 0.25 (3rd value – 2nd value) = 306 + 0.25 (307 – 306)

Q1 = 306.25

Q2 = 2.25 x 2 = 4.5th value

= 4th value + 0.5 (5th value – 4th value)

= 308 + 0.5 (308 – 308) = 308

Q3 = 2.25 x 3 = 6.75th value

= 6th value + 0.75 (7th value – 6th value)

= 309 + 0.75 (310 – 309)

= 309 + 0.75 = 309.75
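
The positional rule used above, the k(N + 1)/4th value with linear interpolation between neighbouring values, can be sketched in Python. The function names are illustrative, and the data are the eight sorted sales figures that the working implies (two shops with sales of 308). The sketch assumes the required position lies between 1 and N.

```python
def positional_value(sorted_values, position):
    """Value at a 1-based, possibly fractional, position in sorted data."""
    whole = int(position)          # e.g. 2 for the 2.25th value
    fraction = position - whole    # e.g. 0.25
    lower = sorted_values[whole - 1]
    if fraction == 0:
        return lower
    upper = sorted_values[whole]
    return lower + fraction * (upper - lower)


def quartiles(values):
    """Return (Q1, Q2, Q3) using the k(N + 1)/4th value rule."""
    data = sorted(values)
    n = len(data)
    return tuple(positional_value(data, k * (n + 1) / 4) for k in (1, 2, 3))


sales = [309, 312, 305, 307, 310, 308, 308, 306]
q1, q2, q3 = quartiles(sales)
print(q1, q2, q3)
```

Library routines for quantiles often use different interpolation conventions, so their answers may not match this rule exactly.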

Deciles: For deciles we divide N by 10 and multiply by the number of the required decile;
for example, the third decile D3 is the (3N/10)th value.

Weighted Averages: Suppose the values x1, x2, ……. xn are assigned the weights w1,
w2………wn then their weighted average is given by

X̄w = Σwx / Σw

and their weighted Geometric Mean is given by

G.M.w = Antilog (Σ w log x / Σw)

Note: W acts as frequency

Learning Objective 3

Understand the Concept of Dispersion and Its Measures


Dispersion describes another characteristic of a distribution. Consider the following two
distributions of weights of a product produced by two machines.

Machine A B
Sample size 1000 1000
Average wt 80 80
Minimum wt 20 40
Maximum wt 140 100

Machine B produces products with weights much closer to the average than Machine A.
As a manufacturer or customer we would choose Machine B. In other words we choose
that machine whose spread is smaller.

The property of deviations of values from the average is called Dispersion or Variations.
The degree of variations is found by the measures of variations.

Various measures used to find the degree of variations are

1. Range (R)
2. Quartile Deviations ( Q.D)
3. Mean Deviations (M.D)
4. Standard Deviations (S.D)

They have units of measurement attached to them. Therefore they are known as absolute
measures of variations. However we may want to compare two different distributions
whose measurements are one in terms of Kg and another in terms of cm. Then we use the
following relative measures that do not have any units attached to them. They are

1. Coefficient of Range
2. Coefficient of Quartile Deviations
3. Coefficient of Mean Deviations
4. Coefficient of Variations.
They are known as Relative measures.

In this unit we study both simultaneously.

Prerequisite of a good measure of variations are

1. It should be easy to understand and simple to calculate.

2. It should be based on all values.
3. It should be rigidly defined.
4. It should not be affected by extreme values.
5. It should not be affected by sampling fluctuations.
6. It should be capable of further algebraic treatment.


Range is the difference between highest and lowest value of the data.

R = H-L H: Highest value

L: Lowest value

Coefficient of Range = (H-L)/(H+L)


Merits

i. It is easily understood and simple to calculate

ii. It is rigidly defined


Demerits

i. It is affected by extreme values

ii. It is not based on all values. It uses extreme values only.


Uses

1. It is used in Statistical Quality control

2. When the study does not require deep analysis
3. When data has no abnormal values.
Example 30: Find the Range of the following discrete series 26, 28, 28, 26, 28, 30, 27,
29, 26, 24

Solution: R = 30-24 = 6

Example 31: Find the Range of the following continuous series

Class Interval 0-5 5-10 10-15 15-20 20-25

Frequency 10 15 25 12 8

Solution: R = 25-0 = 25

Note: If the class intervals are open-ended then Range is not defined.

Quartile Deviations

Unlike range, it does not involve the extreme values. It is defined as

Q.D.=(Q3-Q1)/2 (Absolute Measure)

Coefficient of Q.D. = (Q3-Q1)/(Q3+Q1) (Relative Measures)

Note: 1. Q3-Q1 is called inter quartile range

2. Q3-Q1 gives the middle 50% of reading. Q3 and Q1 are also known as upper and lower
limit of middle 50% of readings.

3. It is not capable of further algebraic treatment.

Example: Find the inter quartile Range, Q.D and coefficient of Q.D for the demand
distribution of toothpaste packs for various price categories.

Price category Rs/unit 5-9 10-14 15-19 20-24 25-29

Frequency 15 25 38 14 8

Solution: First we make the class interval as exclusive type

Price Rs/Unit Frequency Cumulative Frequency

4.5-9.5 15 15
9.5-14.5 25 40 Q1 class
14.5-19.5 38 78
19.5-24.5 14 92
24.5-29.5 8 100
Total 100

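Q1 = (100/4)th value = 25th value, which lies in the class 9.5-14.5

Q1 = 9.5 + {(25 – 15) × 5}/25 = 9.5 + 2 = 11.5 Rs

Q3 = (3 × 100/4)th value = 75th value, which lies in the class 14.5-19.5

Q3 = 14.5 + {(75 – 40) × 5}/38 = 19.11 Rs

Inter quartile range = Q3 – Q1 = 19.11 – 11.5 = 7.61 Rs

Q.D. = (19.11 – 11.5)/2 = 7.61/2 = 3.805 Rs

Coefficient of Q.D. = (19.11 – 11.5)/(19.11 + 11.5) = (7.61)/(30.61) = 0.2486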
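
The quartile formulas for a continuous distribution can be checked with a small Python sketch. The function name grouped_quantile is illustrative; the class limits are the exclusive limits of the toothpaste example, and equal class widths are assumed.

```python
def grouped_quantile(lower_limits, freqs, width, fraction):
    """L.L. + (fraction*N - Cfp) * C.I. / fc; fraction = 0.25 for Q1, 0.75 for Q3."""
    target = fraction * sum(freqs)
    cum = 0  # cumulative frequency up to the previous class (Cfp)
    for lower, f in zip(lower_limits, freqs):
        if cum + f >= target:
            return lower + (target - cum) * width / f
        cum += f
    raise ValueError("frequency distribution is empty")


lower_limits = [4.5, 9.5, 14.5, 19.5, 24.5]
freqs = [15, 25, 38, 14, 8]

q1 = grouped_quantile(lower_limits, freqs, 5, 0.25)
q3 = grouped_quantile(lower_limits, freqs, 5, 0.75)
quartile_deviation = (q3 - q1) / 2
coefficient_qd = (q3 - q1) / (q3 + q1)
print(q1, round(q3, 2), round(quartile_deviation, 3), round(coefficient_qd, 4))
```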


Merits

1. It is easy to understand and compute

2. It is rigidly defined
3. It is not affected by extreme values.


Demerits

1. It is not based on all values.

2. It is affected by sampling fluctuations.
3. It is not capable of further algebraic treatment.

Mean Deviation:

It is defined as the mean of absolute deviations of the values from central value.

The Mean deviation from Mean for discrete series without frequency is given by

M.D.(X̄) = Σ|X – X̄| / N

For data with frequency it is given by

M.D.(X̄) = Σ f |X – X̄| / Σf

In case of continuous series “X” represents Mid value of class-interval.

Similarly we can have Mean Deviation from Median or Mode. X̄ is replaced by Median
or Mode in the above formulae.

However Mean Deviation from Median is the least. It is known as Minimal property of
Mean Deviation.

The corresponding Relative measures are coefficients of Mean Deviation.

Coefficient of M.D.(X̄) = M.D.(X̄)/X̄

Coefficient of M.D.(Median) = M.D.(Median)/Median

Example: Calculate mean deviation and also coefficient of Mean deviation using i) Mean
and ii) Median. Compare the results.

Heights of plants (cms) 140, 147, 143, 145, 144, 158, 142, 141

Solution:

X        From Mean     From Median
         |X – 145|     |X – 143.5|
140      5             3.5
141      4             2.5
142      3             1.5
143      2             0.5
144      1             0.5
145      0             1.5
147      2             3.5
158      13            14.5
Total    30            28.0

∴ Mean = 1160 /8 = 145

∴ Mean Deviation from Mean = 30 /8 = 3.75

Median is the ((8 + 1)/2)th value = 4.5th value

∴ Median = 143 + 0.5 (144 – 143) = 143.5

Mean Deviation from Median = 28 /8 = 3.5

Coefficient of M.D.(X̄) = 3.75/145 = 0.0259

Coefficient of M.D. (Median) = 3.5/143.5 = 0.0244

Mean Deviation from Median is less than M.D from Mean
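
A minimal Python sketch of the mean deviation calculation follows. The function name mean_deviation is illustrative; the heights are the eight values listed in the solution table, whose largest value is 158 and whose sum is 1160.

```python
import statistics


def mean_deviation(values, centre):
    """Mean of the absolute deviations of the values from a central value."""
    return sum(abs(x - centre) for x in values) / len(values)


heights = [140, 141, 142, 143, 144, 145, 147, 158]
mean = sum(heights) / len(heights)
median = statistics.median(heights)  # average of the two middle values for even N

md_mean = mean_deviation(heights, mean)
md_median = mean_deviation(heights, median)
print(mean, median, md_mean, md_median)
print(round(md_mean / mean, 4), round(md_median / median, 4))
```

The comparison of md_mean and md_median illustrates the minimal property: deviations taken from the median give the smaller figure.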

Merits and Demerits of Mean Deviation


Merits

1. It is based on all values.

2. It is less affected by extreme values.
3. It is not affected much by sampling fluctuations.


Demerits

1. It is not capable of further algebraic treatment.

2. It does not take into account negative signs.

Uses of M.D

It is used when sample size is small. It is preferred in Statistical analysis of certain

economic, business and social phenomena.

Standard deviation

Measures of dispersion Range and Q.D are not based on all values. Mean deviation based
on all values does not take into consideration the sign. Therefore a measure that removes
both drawbacks is given by standard Deviation (S.D).
The standard deviation of a set of values is the positive square root of mean of the
squared deviations of the values from their arithmetic mean. It is denoted by σ (sigma).

(A) For discrete series without frequency it is given by

σ = √[Σ(X – X̄)² / N]

(B) For discrete series with frequency and continuous series it is given by

σ = √[Σf(X – X̄)² / Σf]

Where X is the mid value of class interval for continuous series

Alternative forms of S.D for (A) & (B) are

For (A)

Variance = Σd² / N – (Σd / N)²

σ = √Variance

For (B)

Variance = Σfd² / Σf – (Σfd / Σf)²

σ = √Variance

Where d = X – A: A is assumed mean

Note: The square of the standard deviation is called Variance. It is denoted by σ².
Example: Calculate the S.D for Variation in temperature observed during two months at

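Temp °C     18 19 20 21 22 23 24 25 Total
Frequency   3  5  8  16 12 8  5  3  60

Solution:

X        f     d = X – 21    fd     fd²
18       3     -3            -9     27
19       5     -2            -10    20
20       8     -1            -8     8
21       16    0             0      0
22       12    1             12     12
23       8     2             16     32
24       5     3             15     45
25       3     4             12     48
Total    60                  28     192

Variance = 192/60 – (28/60)² = 3.2 – 0.2178 = 2.9822

σ = √2.9822 = 1.727 °C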

Example 37: The diastolic blood pressure of men is distributed as follows. Find the
standard deviation and variance.
Pressure(men) 78-80 80-82 82-84 84-86 86-88 88-90
No of Men 3 15 26 23 9 4

Solution:

Class Interval   Mid value X   Frequency ‘f’   d = (X – 83)/2   fd     fd²
78-80            79            3               -2               -6     12
80-82            81            15              -1               -15    15
82-84            83            26              0                0      0
84-86            85            23              1                23     23
86-88            87            9               2                18     36
88-90            89            4               3                12     36
Total                          80                               32     122

Since d is measured in units of the class width (C.I = 2),

Variance = [122/80 – (32/80)²] × 2² = (1.525 – 0.16) × 4 = 5.46

σ = √5.46 = 2.337
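
The assumed-mean (step deviation) calculation of the standard deviation used in the last two examples can be sketched in Python. The function name and parameter names are illustrative; width is the class width by which the deviations are divided, and is 1 for the temperature data.

```python
import math


def sd_step_deviation(xs, fs, assumed_mean, width=1):
    """Standard deviation from d = (X - A) / width, using fd and fd^2 totals."""
    n = sum(fs)
    ds = [(x - assumed_mean) / width for x in xs]
    sum_fd = sum(f * d for f, d in zip(fs, ds))
    sum_fd2 = sum(f * d * d for f, d in zip(fs, ds))
    # Multiply by width^2 to return to the original units of X.
    variance = (sum_fd2 / n - (sum_fd / n) ** 2) * width ** 2
    return math.sqrt(variance)


sd_temperature = sd_step_deviation(
    [18, 19, 20, 21, 22, 23, 24, 25], [3, 5, 8, 16, 12, 8, 5, 3], 21)
sd_pressure = sd_step_deviation(
    [79, 81, 83, 85, 87, 89], [3, 15, 26, 23, 9, 4], 83, width=2)
print(round(sd_temperature, 3), round(sd_pressure, 3))
```

The result does not depend on the assumed mean chosen, which reflects the property that the standard deviation is independent of origin.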

Properties of standard deviation

1. It is independent of origin but not independent of scale.

2. Standard deviation is always greater than or equal to zero.
3. It is the least of all root – mean – square deviations.

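4. Suppose the mean of n1 values is X̄1 and that of n2 values is X̄2, with standard deviations
σ1 and σ2. Then the combined standard deviation of both sets of values is given by

σ = √[(n1σ1² + n2σ2² + n1d1² + n2d2²) / (n1 + n2)]

Where d1 = X̄1 – X̄ and d2 = X̄2 – X̄,

X̄ being the combined mean of the n1 and n2 values.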

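Example: The average weight of 100 apples from area “A” is 150gms with standard
deviation of 10gms. Similarly the average weight of 200 apples from area “B” is 200gms
with standard deviation of 15gms. Find the combined standard deviation.

Solution:

Given n1 = 100 n2 = 200

X̄1 = 150 X̄2 = 200

σ1 = 10 σ2 = 15

Combined mean X̄ = (100 × 150 + 200 × 200)/300 = 55000/300 = 183.33

d1² = (150 – 183.33)² = (33.33)² = 1110.889

d2² = (200 – 183.33)² = (16.66)² = 277.5556

σ = √[(100 × 100 + 200 × 225 + 100 × 1110.889 + 200 × 277.5556)/300]

= √(221600/300) = √738.67 = 27.18 gms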
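
The combined standard deviation of two groups can be sketched in Python as follows. The function name combined_sd is illustrative; it uses the standard combination rule in which each group contributes its own variance plus the squared distance of its mean from the combined mean.

```python
import math


def combined_sd(n1, mean1, sd1, n2, mean2, sd2):
    """Return (combined mean, combined standard deviation) of two groups."""
    combined_mean = (n1 * mean1 + n2 * mean2) / (n1 + n2)
    d1 = mean1 - combined_mean
    d2 = mean2 - combined_mean
    variance = (n1 * (sd1 ** 2 + d1 ** 2) + n2 * (sd2 ** 2 + d2 ** 2)) / (n1 + n2)
    return combined_mean, math.sqrt(variance)


apples_mean, apples_sd = combined_sd(100, 150, 10, 200, 200, 15)
print(round(apples_mean, 2), round(apples_sd, 2))
```

The combined standard deviation is larger than either group's own standard deviation here because the two group means are far apart.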


Merits and demerits of standard deviation.


Merits

1. It is rigidly defined.
2. It is based on all values.
3. It is capable of further algebraic treatment.
4. It is not very much affected by sampling fluctuations.


Demerits

1. It is difficult to understand.
2. It gives undue weightage to extreme values.
3. It cannot be calculated for classes with open end interval.

Coefficient of variation: When we want to compare two different sets of values
pertaining to different characteristics or to the same characteristic, then we
use the coefficient of variation. It is a relative measure expressed in percentage and is
defined as

C.V = (σ / X̄) × 100


It is used to compare the homogeneity or stability or uniformity or consistency of

two or more sets of data.

A low value of coefficient of variation indicates a low degree of variation.

Example: Find Standard deviations of the following two series and state which is
more stable.

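Series A: 192, 288, 236, 229, 184, 260, 384, 291, 330, 243

Series B: 31, 48, 13, 51, 38, 43, 50, 36, 47, 82

Solution:

Series A
X        d = X – 260    d²
192      -68            4624
288      28             784
236      -24            576
229      -31            961
184      -76            5776
260      0              0
384      124            15376
291      31             961
330      70             4900
243      -17            289
Total    +37            34247

Mean = 260 + 37/10 = 263.7

σ = √[34247/10 – (37/10)²] = √(3424.7 – 13.69) = √3411.01 = 58.40

C.V = (58.40/263.7) × 100 = 22.15

Series B
X        X²
31       961
48       2304
13       169
51       2601
38       1444
43       1849
50       2500
36       1296
47       2209
82       6724
Total 439 22057

Mean = 439/10 = 43.9

σ = √[22057/10 – (43.9)²] = √(2205.7 – 1927.21) = √278.49 = 16.69

C.V = (16.69/43.9) × 100 = 38.02

Since c.v for series A (22.15) is less than c.v for series B (38.02), series A is more
stable.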
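
The comparison of the two series can be reproduced with a short Python sketch. The function name coefficient_of_variation is illustrative. Series A is entered with 260 as its sixth value, since that is the value implied by the deviation of 0 and by the totals Σd = +37 and Σd² = 34247 in the worked table.

```python
import math


def coefficient_of_variation(values):
    """C.V = (standard deviation / mean) x 100, using the population S.D."""
    n = len(values)
    mean = sum(values) / n
    variance = sum((x - mean) ** 2 for x in values) / n
    return math.sqrt(variance) / mean * 100


series_a = [192, 288, 236, 229, 184, 260, 384, 291, 330, 243]
series_b = [31, 48, 13, 51, 38, 43, 50, 36, 47, 82]

cv_a = coefficient_of_variation(series_a)
cv_b = coefficient_of_variation(series_b)
print(round(cv_a, 2), round(cv_b, 2))
print("Series A is more stable" if cv_a < cv_b else "Series B is more stable")
```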

Example: The bursting and tensile strength of a type of paper showed the
following results

Bursting Strength Tensile Strength

Mean 40 130
S.D 6 15
Which characteristic is more variable?

C.V for Bursting Strength = (6/40) × 100 = 15 %

C.V for Tensile Strength = (15/130) × 100 = 11.54 %

Since the C.V % for Bursting strength is higher, it is more variable.


Measures of Central Tendency and Measures of Dispersion summarise mass data
in terms of its two important features – i) the tendency of the data to cluster
around a central value and ii) the spread of the data about that central value.


Every human activity has an element of uncertainty. Uncertainty affects the decision
making process. We use the word “Probably” very often: probably it may rain
today, probably the share price may go up in the next week. Therefore there is a need to
handle uncertainty systematically and scientifically. Probability theory helps us to make
wiser decisions.

Learning Objective 1

Understand the Basic Concepts of Probability and Bayes’ Theorem


Probability is a numerical measure which indicates the chance of occurrence of an event
“A”. It is denoted by P(A). It is the ratio of the number of outcomes favourable to the
event “A” (m) to the total number of outcomes of the experiment (n). In other words

P(A) = m/n

Basic Terminology used in this theory.

a. Experiment:

An operation that results in a definite outcome is called an experiment.

Tossing a coin is an experiment if it shows Head or tail on falling. If it stands on its edge,
then it is not an experiment.

b. Random Experiment:

When the outcome of an experiment cannot be predicted, then it is called Random

experiment or stochastic experiment

c. Sample space or total number of outcomes of an experiment is the set of all possible
outcomes of a random experiment and is denoted by S.

In tossing two coins S = {HH, HT, TH, TT}. The number of outcomes is denoted by n(S)
= 4.

If the number of outcomes is finite then it is called Finite Sample Space otherwise it is
called Infinite Sample Space.

d. Event:

Events may be a single outcome or combination of outcomes.

In tossing a coin getting a head is (event A) a single outcome. Therefore

P (A) = ½

In tossing two coins, getting exactly one head (event A) is a combination of the outcomes
HT and TH, therefore P(A) = 2/4 = 1/2. An event is a subset of the sample space.

e. Equally likely Events (Equi probable events)

Two or more events are said to be equally likely if they have equal chance of occurrence.

In tossing an unbiased coin getting head and tail are equally likely.

f. Mutually Exclusive Events:

Two or more events are said to be mutually exclusive if the occurrence of one prevents
the occurrence of other events.

In tossing a coin if head falls, it prevents the occurrence of tail and vice versa.

g. Exhaustive set of events:

A set of events is exhaustive if one or other of the events in the set occurs whenever the
experiment is conducted. It can be defined also as the set whose totality of sample points
form the total sample points of the experiment.

h. Complementation of an Event is given by P(Ac) = 1 – P(A)

i. Independent Events: Two events are said to be independent of each other if the occurrence
of one is not affected by the occurrence of the other and does not affect the occurrence of the
other.

Consider tossing of three fair coins. Then

S = {HHH, HHT, HTH, THH, HTT, THT, TTH, TTT}

Let A be the event of getting three heads

Let B be the event of getting two heads

Let C be the event of getting one head

Let D be the event of getting No head

Then A = {HHH}; B = {HHT, HTH, THH}

C = {HTT, THT, TTH}; D = {TTT}

Events A, B, C and D are mutually exclusive and exhaustive but not equally likely.
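
The sample space and the four events can be enumerated with a short Python sketch. The variable names are illustrative; each event is identified by the number of heads it contains, and probabilities are computed as favourable outcomes divided by total outcomes.

```python
from fractions import Fraction
from itertools import product

# All outcomes of tossing three coins: HHH, HHT, ..., TTT
sample_space = ["".join(outcome) for outcome in product("HT", repeat=3)]

# Events A, B, C, D correspond to 3, 2, 1 and 0 heads respectively.
events = {heads: [o for o in sample_space if o.count("H") == heads]
          for heads in (3, 2, 1, 0)}

probabilities = {heads: Fraction(len(event), len(sample_space))
                 for heads, event in events.items()}

print(len(sample_space))
for heads, event in events.items():
    print(heads, event, probabilities[heads])
```

The probabilities add up to 1 because the events are mutually exclusive and exhaustive, and they differ from one another because the events are not equally likely.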

Approaches to probability:

There are four approaches to probability. They are i) Classical / Mathematical / Priori
approach ii) Statistical / Relative frequency / Empirical / Posteriori approach iii)
Subjective approach and iv) Axiomatic approach.

i. Classical / Mathematical / Priori approach

Under this approach the probability of an event is known before conducting the
experiment. The following are some examples.


a) Getting a head in tossing a coin

b) Drawing a king from well shuffled pack

c) Getting a “6″ in throwing a die.

The probability of an event “A” is defined as P (A) = m/n (m – number of favourable
outcomes, n – total number of outcomes of the experiment).

However it is not possible to give probability to all events of our life. We cannot attach a
definite probability to the event “that it will rain today”.

ii. Statistical / Relative frequency / empirical / posteriori approach.

Under this approach the probability of an event is arrived at after conducting an

experiment. If we want to know the probability that a particular household in an area will
have two earning members, then we have to gather data on all household in that area and
arrive at the probability. The greater number of households surveyed, the more accurate
will be the probability arrived.

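The probability of an event ‘A’ in this case is defined as

P(A) = lim (n → ∞) m/n

where m is the number of times the event A occurs in n trials.

In real life it is not always possible to conduct experiments, because of the high cost, the
destructive nature of the experiment or the vast area to be covered.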

iii. Subjective approach

Under this approach the investigator or Researcher assigns probability to the events either
from his experience or from past records. It is more suitable when the sample size is ten
or less than ten. The investigator has full knowledge about the characteristics of each and
every individual. However there is a chance of personal bias being introduced in such

iv. Axiomatic Approach

This approach is based on set theory. The probability of an event is defined as

such that

a. 0 ≤ P (Ai) ≤ 1
b. ∑ P (Ai) = 1 for i = 1 to n

Where Ai are “n” mutually exclusive and exhaustive events.

Rules of Probability

a. Addition Rule

i. If A and B are any two events then the probability of the occurrence of either A or B is
given by

P (A U B) = P (A) + P (B) – P (A ∩ B)

ii. If A and B are two mutually exclusive

events then the probability of occurrence of either A or B is given by

P (A U B) = P (A) + P (B)

iii. If A, B and C are any three events then the probability of occurrence of either A or B
or C is given by

P (A U B U C) = P(A) + P(B) + P(C) – P(A ∩ B) – P(B ∩ C) – P(A ∩ C) + P(A ∩ B ∩ C)


iv. If A1, A2, A3………An are “n” mutually exclusive and exhaustive events then the
probability of occurrence of at least one of them is given by

P(A1 U A2 U……..U An) = P(A1) + P(A2) +…….+ P(An).

Managers very often come across situations where they have to take a decision
about implementing either course of action A or course of action B or course of action
C. Sometimes they have to take decisions regarding the implementation of both A and B.

For example a sales manager may like to know the probability that he will exceed the
target for product A or product B. Sometimes he would like to know the probability that
the sales of both product A and product B will exceed the target. The first type of
probability is answered by the addition rule. The second type is answered by the
multiplication rule.

b. Multiplication Rule

i. If A and B are two independent events then the probability of occurrence of A and B is
given by

P (A ∩ B) = P (A) × P (B)

Conditional Probability:

Sometimes we wish to know the probability that the price of a particular petroleum
product will rise, given that the finance minister has raised the petrol price. Such
probabilities are known as conditional probabilities.

Thus the conditional probability of occurrence of an event “A” given that the event “B”
has already occurred is denoted by P (A / B). Here A and B are dependent events.
Therefore we have the following rule.

If A and B are dependent events then the probability of occurrence of A and B is given by

P (A ∩ B) = P (A) × P (B / A) = P (B) × P (A / B)

For any Bivariate distribution there exists two marginal distributions and “m + n”
conditional distributions, where m and n are the number of classifications / characteristics
studied on two variables. Consider the following example.

A librarian analyzed the type of visitors and their choice of library section as follows:

Type of visitors
(level of education)   News Paper   Magazine   Novel (story)   Subject   Total
Under Graduates        50           100        120             50        320
Graduates              70           90         50              100       310
Post Graduates         100          60         30              150       340
Total                  220          250        200             300       970

We can get the following distributions


Type of Visitors Frequency

Undergraduates 320
Graduates 310
Post graduates 340
Total 970

This represents the distribution of level of education irrespective of their sections.

Therefore it is called Marginal Distribution


News paper   Magazine   Novels   Subjects   Total
220          250        200      300        970

This represents the distribution of people in sections irrespective of their educational
levels. It is another Marginal Distribution. Thus there are two marginal distributions for
Bivariate data, the variables being sections and level of education.


Level of education   Newspaper   Magazine   Novels   Subjects   Total
Under graduate           50         100       120        50      320

This represents the distribution of people in sections given that they are under graduates.
Therefore it is a conditional distribution. Thus for any Bivariate distribution having m
and n classifications there exist two marginal distributions and m + n conditional
distributions. In this case there are 3 + 4 = 7 conditional distributions.

To solve any problem on probability the steps involved are:

i. Define the events

ii. Find the total outcome of the experiment

iii. Find the probability of each event

iv. If the words “either, or” are used, check whether the events are mutually exclusive or
not, to apply the proper addition rule.

v. If the words “both” or “and” are used, check whether the events are independent or
dependent, to apply the proper multiplication rule.

vi. To find the total outcome of the experiment use 2^n or 6^n in the case of a coin or a die
respectively, where “n” is the number of coins or dice thrown at a time, or the number of times a coin or die
is thrown. In all other cases use nCr.

Note: i. Calculation of nCr: nCr = n! / [r! (n – r)!]

Example 1:

ii. nCr = nC(n–r)

Example 2: 16C13 = 16C(16–13) = 16C3 = 560

iii. nC0 = nCn = 1

iv. 0! = 1
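The relations in the note above can be checked numerically. The following is a minimal Python sketch; the helper name n_c_r is introduced here purely for illustration, and math.comb (Python 3.8 or later) is the standard-library equivalent.

```python
from math import comb, factorial


def n_c_r(n: int, r: int) -> int:
    """Number of ways of choosing r items out of n: n! / (r! (n - r)!)."""
    return factorial(n) // (factorial(r) * factorial(n - r))


# Symmetry rule nCr = nC(n-r), as in Example 2 of the note.
print(n_c_r(16, 13))  # 560
print(n_c_r(16, 3))   # 560
print(comb(16, 3))    # 560, using the standard library directly

# Boundary cases nC0 = nCn = 1, which rely on 0! = 1.
print(n_c_r(16, 0), n_c_r(16, 16))  # 1 1
```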

Illustrations using the above concepts

Example 3: Find the probability of getting a head when a coin is tossed.


Solution: Let “A” be the event of getting a head.

S = {H, T}
∴ n(S) = 2

n(A) = 1

∴ P(A) = n(A) / n(S) = 1/2

Example 4: What is the probability of getting two heads when 3 coins are tossed, and what is
the probability of getting at least two heads?

Solution: i) Let “A” be the event of getting exactly two heads.

S = {HHH, HHT, HTH, THH, HTT, THT, TTH, TTT}

⇒ n(S) = 8

Note: n(S) = 2^3 = 8

A = {HHT, HTH, THH}

⇒ n(A) = 3

∴ P(A) = 3/8

ii) Let “A” be the event of getting at least two heads.

A = {HHH, HHT, HTH, THH}

⇒ n(A) = 4

∴ P(A) = 4/8 = 1/2


Example 5: What is the probability of getting a sum “Nine” when two dice are thrown?

Solution: Let “A” be the event of getting a sum “Nine”

n(S) = 6^2 = 36

A = {(6,3), (3,6), (4,5), (5,4)}

∴ n(A) = 4

∴ P(A) = 4/36 = 1/9


Example 6: A number is selected at random from the numbers 1 to 30. What is the
probability that

i) It is divisible by either 3 or 7 ii) It is divisible by 5 or 13?

Solution: i) Let “A” be the event of selecting a number divisible by 3

Let “B” be the event of selecting a number divisible by 7

n(S) = 30C1 = 30

A = {3, 6, 9, 12, 15, 18, 21, 24, 27, 30}

n(A) =10

B = {7, 14, 21, 28}

n(B) = 4

A and B are not mutually exclusive, since A ∩ B = {21} and n(A ∩ B) = 1

∴ P(A U B) = P(A) + P(B) – P(A ∩ B) = 10/30 + 4/30 – 1/30 = 13/30

ii) Let “A” be the event of selecting a number divisible by 5

Let “B” be the event of selecting a number divisible by 13

n(S) = 30 C1 = 30

A = {5, 10, 15, 20, 25, 30} ⇒ n(A) = 6

B = {13, 26} ⇒ n(B) = 2

A and B are mutually exclusive, since no number from 1 to 30 is divisible by both 5 and 13

∴ P(A U B) = P(A) + P(B) = 6/30 + 2/30 = 8/30 = 4/15
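The addition rule used in Example 6 can be verified by listing the favourable numbers. This is a small illustrative Python sketch; the function name prob_divisible_by_either is an assumption made here and is not part of the original text.

```python
from fractions import Fraction


def prob_divisible_by_either(a: int, b: int, upper: int = 30) -> Fraction:
    """P(A U B) for a number drawn at random from 1..upper,
    where A = 'divisible by a' and B = 'divisible by b'."""
    numbers = range(1, upper + 1)
    event_a = {k for k in numbers if k % a == 0}
    event_b = {k for k in numbers if k % b == 0}
    # Addition rule: P(A) + P(B) - P(A n B); the last term is 0
    # when the events are mutually exclusive.
    p = (Fraction(len(event_a), upper)
         + Fraction(len(event_b), upper)
         - Fraction(len(event_a & event_b), upper))
    # Cross-check against a direct count of the union.
    assert p == Fraction(len(event_a | event_b), upper)
    return p


print(prob_divisible_by_either(3, 7))   # 13/30 (not mutually exclusive: 21 is shared)
print(prob_divisible_by_either(5, 13))  # 4/15  (mutually exclusive)
```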

Example 8: The Board of Directors of a company wants to form a quality management
committee of 5 members to monitor the quality of its products. The company has 5 Scientists, 4
Engineers and 6 Accountants. Find the probability that the committee will contain 2
Scientists, 1 Engineer and 2 Accountants.

Solution: Let “A” be the event of selecting 2 Scientists, 1 Engineer and 2 Accountants.

n(S) = 15C5 = 3003

n(A) = 5C2 x 4C1 x 6C2 = 10 x 4 x 15 = 600

Therefore, P(A) = 600/3003 = 200/1001

Example 9: The odds in favour of one person hitting a target are 3 to 5. The odds
against another person hitting the target are 3 to 2. If each of them fires once
at the target, find the probability that i) both of them hit it ii) at least one of them hits it.

Solution: i) Let “A” be the event of the first person hitting the target. Odds of 3 to 5 in favour mean

P(A) = 3 / (3 + 5) = 3/8

Let “B” be the event of the second person hitting the target. Odds of 3 to 2 against mean

P(B) = 2 / (3 + 2) = 2/5

Both hitting the target means A ∩ B

A and B are independent

∴ P(A ∩ B) = P(A) . P(B) = 3/8 x 2/5 = 3/20

ii) At least one of them hitting the target means A U B

P(A U B) = P(A) + P(B) – P(A ∩ B) = 3/8 + 2/5 – 3/20 = 5/8

Example 10: The probabilities that drivers A, B and C will drive home safely after
consuming liquor are 2/5, 3/7 and 3/4, respectively. What is the probability that all of them will
drive home safely after consuming liquor?

Solution: Let “A” be the event of A driving safely after consuming liquor.

Let “B” be the event of B driving safely after consuming liquor.

Let “C” be the event of C driving safely after consuming liquor.

∴ P(A ∩ B ∩ C) = P(A) . P(B) . P(C) [since they are independent] = 2/5 x 3/7 x 3/4 = 18/140 = 9/70


Example 11: The probabilities that “A” and “B” will tell truth are 2/3 and 4/5
respectively. What is the probability that i) they agree with each other ii) they contradict
each other while giving a witness in the court.

Solution: Let “A” be the event of A telling truth

Let “B” be the event of B telling truth

Given P(A) = 2/3, implying that P(Ac) = 1 – P(A) = 1/3

And P(B) = 4/5, implying that P(Bc) = 1/5

i. They will agree if both tell the truth or both lie, i.e. A ∩ B or Ac ∩ Bc. These are mutually exclusive.

∴ Required probability = P(A ∩ B) + P(Ac ∩ Bc)

= P(A) P(B) + P(Ac) P(Bc) (since A and B are independent) = (2/3)(4/5) + (1/3)(1/5) = 9/15 = 3/5

ii) They will contradict each other if A tells the truth and B lies, or B tells the truth and A lies.

∴ Required probability = P(A) P(Bc) + P(Ac) P(B) = (2/3)(1/5) + (1/3)(4/5) = 6/15 = 2/5

Example 12: A bag contains 5 red and 4 blue similar balls. Two balls are drawn at
random from the bag. Find the probability that both of them are red if i) the balls are
drawn together ii) the balls are drawn one after the other, with replacement iii) the balls
are drawn one after the other, without replacement.

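Solution: i) Let “A” be the event of drawing two red balls together.

n(S) = 9C2 = 36

n(A) = 5C2 = 10

Therefore, P(A) = 10/36 = 5/18

ii) Let “A” be the event of drawing a red ball in the first draw

Let “B” be the event of drawing a red ball in the second draw

Since the first ball is replaced, A and B are independent. The required probability is

P(A ∩ B) = P(A) . P(B) = (5/9) x (5/9) = 25/81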

iii) Let “A” be the event of drawing red ball in the first draw

Let “B” be the event of drawing red ball in the second draw

Since the first ball is not replaced the sample space changes for second draw

∴ The required probability

P(A ∩ B) = P(A) . P(B/A)=(5/9)X(4/8)=5/18
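The three drawing schemes of Example 12 can be compared side by side with exact fractions. The sketch below is illustrative only; the variable names are assumptions introduced here.

```python
from fractions import Fraction
from math import comb

RED, BLUE = 5, 4
TOTAL = RED + BLUE

# i) Two balls drawn together: favourable pairs / all pairs.
together = Fraction(comb(RED, 2), comb(TOTAL, 2))

# ii) One after the other WITH replacement: the draws are independent,
#     so P(A n B) = P(A) . P(B).
with_replacement = Fraction(RED, TOTAL) * Fraction(RED, TOTAL)

# iii) One after the other WITHOUT replacement: the draws are dependent,
#      so P(A n B) = P(A) . P(B/A).
without_replacement = Fraction(RED, TOTAL) * Fraction(RED - 1, TOTAL - 1)

print(together)             # 5/18
print(with_replacement)     # 25/81
print(without_replacement)  # 5/18
```

Drawing together and drawing one by one without replacement give the same answer, because both describe an unordered pair taken from the bag.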

Example 13: Box I contains 5 Red and 6 Blue balls. Box II contains 6 Red and 4 Blue
balls. A ball is drawn at random from box I and is transferred to box II. Now from Box II
a ball is drawn at random. What is the probability that it is red?

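Solution: A ball drawn from Box I and transferred to Box II could be either red or blue.
Let “A” be the event of drawing a red ball from Box I.

Let “B” be the event of drawing a blue ball from Box I.

Let “C” be the event of drawing a red ball from Box II.

P(A) = 5/11 and P(B) = 6/11

If a red ball is transferred, Box II contains 7 red and 4 blue balls, so P(C / A) = 7/11

If a blue ball is transferred, Box II contains 6 red and 5 blue balls, so P(C / B) = 6/11

∴ P(C) = P(A) . P(C / A) + P(B) . P(C / B) = (5/11)(7/11) + (6/11)(6/11) = 71/121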

Example 14: The probabilities that component A and component B of a machine will fail
are 0.09 and 0.06 respectively. The machine will fail if any one of them fails. Find the
probability that it will fail?

Solution: Given P(A) = 0.09

P(B) = 0.06

Assuming that the two components fail independently,

P(A ∩ B) = P(A) . P(B)

= 0.09 x 0.06 = 0.0054

∴ P(A U B) = P(A) + P(B) – P(A ∩ B)

= 0.09 + 0.06 – 0.0054

= 0.1446

Example 15: What is the probability of getting 53 Mondays in a leap year

Solution: There are 366 days in a leap year = 52 weeks + 2 days

It has 52 Mondays. For one more Monday we select from the following combination of
the remaining 2 days.

1. Sunday and Monday 5. Thursday and Friday

2. Monday and Tuesday 6. Friday and Saturday

3. Tuesday and Wednesday 7. Saturday and Sunday

4. Wednesday and Thursday

∴ n(S) = 7 and n(A) = 2

P(A) = 2/7, where A is the event of getting 53 Mondays

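Bayes’ Probability

Let A1, A2, A3, A4 be mutually exclusive and exhaustive events of a random
experiment. Let “B” be an event common to all of them. [Venn diagram: the sample space is divided into A1, A2, A3 and A4, and B overlaps each of them.]

The event “B” is made up of 4 mutually exclusive parts, A1 ∩ B, A2 ∩ B, A3 ∩ B and A4 ∩ B.

∴ P(B) = ∑ P(Ai ∩ B) ……(1)

By the definition of conditional probability,

P(Ai / B) = P(Ai ∩ B) / P(B) ……(2)

P(Ai ∩ B) = P(Ai) . P(B / Ai) ……(3)

Substituting in (2),

P(Ai / B) = P(Ai) . P(B / Ai) / ∑ P(Ai ∩ B) [Numerator from (3) and Denominator from (1)]

= P(Ai) . P(B / Ai) / ∑ P(Ai) . P(B / Ai) [From (3)]

In general, Bayes’ Theorem states that if A1, A2, ……, An are “n” mutually exclusive
and exhaustive events and B is an event common to all of them, then the probability of
occurrence of A1 given that “B” has already occurred is given by

P(A1 / B) = P(A1) . P(B / A1) / ∑ P(Ai) . P(B / Ai)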

Bayes’ Probability is also a type of conditional probability. The difference between
conditional probability and Bayes’ probability is as follows:

Bayes’ Probability                      General Conditional Probability

1. Finds the probability of the         Finds the probability of getting a sample value given the
   population value, given the          population value.
   sample value.
2. It is possible to incorporate        It is not possible to do so.
3. It is possible to incorporate        It is not possible in this case.

Whenever there are two probabilities connected with an event then we have to apply
Baye’s approach to solve it.

Example 16: The probabilities that Mr. Aravind, Mr. Anand and Mr. Akil will become
vice-president of a company are 0.40, 0.35 and 0.25 respectively. The probabilities that
they will introduce a new product are 0.10, 0.15 and 0.20 respectively. Given that a new
product was introduced, what is the probability that Mr. Anand became vice-president?


Let “A1” be the event that Mr.Aravind became vice-president

Let “A2” be the event that Mr.Anand became vice-president

Let “A3” be the event that Mr.Akil became vice-president

Let “B” be the event that a new product was introduced

We are given that P(A1) = 0.4, P(A2) = 0.35, P(A3) = 0.25, P(B/A1) = 0.10, P(B/A2) =
0.15, P(B/A3) = 0.20.

The given information can be put in the following form.

We note that P(A ∩ B) = P(A) . P(B/A)

And P(B) = ∑ P(Ai) P(B/Ai)

Event   Prior Probability   Conditional Probability   Joint Probability   Posterior Probability
        P(Ai)               P(B/Ai)                   P(Ai ∩ B)           P(Ai/B)
A1      0.40                0.10                      0.0400              0.0400/0.1425 = 0.2807
A2      0.35                0.15                      0.0525              0.0525/0.1425 = 0.3684
A3      0.25                0.20                      0.0500              0.0500/0.1425 = 0.3509
Total   1.00                                          P(B) = 0.1425       1.0000

Therefore required Probability P(A2 / B) = 0.3684
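The tabular method of Example 16 (prior, conditional, joint, posterior) translates directly into a few lines of code. The following Python sketch is one possible way to write it; the function name bayes_table is hypothetical and chosen here for illustration.

```python
def bayes_table(priors, conditionals):
    """Return (joint probabilities, P(B), posterior probabilities).

    priors[i]       = P(Ai)
    conditionals[i] = P(B/Ai)
    """
    # Joint probability: P(Ai n B) = P(Ai) . P(B/Ai)
    joints = [p * c for p, c in zip(priors, conditionals)]
    # Total probability: P(B) = sum of the joint probabilities
    p_b = sum(joints)
    # Posterior probability: P(Ai/B) = P(Ai n B) / P(B)
    posteriors = [j / p_b for j in joints]
    return joints, p_b, posteriors


# Data of Example 16: Aravind, Anand, Akil.
joints, p_b, posteriors = bayes_table([0.40, 0.35, 0.25], [0.10, 0.15, 0.20])
print(round(p_b, 4))                       # 0.1425
print([round(x, 4) for x in posteriors])   # [0.2807, 0.3684, 0.3509]
```

The same function applies to Example 17 by passing the machine shares as priors and the defective rates as conditionals.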

Example 17: A factory has three machines M1, M2 and M3. They produce 4,000, 10,000
and 6,000 products per day respectively. From past records it is known that M1, M2 and M3 produce
5%, 4% and 8% defectives. A product is selected at random from the day’s production and is found to be defective.
What is the probability that it was not produced by machine M3?


Let “A1” be the event that the product was produced by M1

Let “A2” be the event that the product was produced by M2

Let “A3” be the event that the product was produced by M3

Let “B” be the event that the product is defective. Then we are given

P(A1) = 4000/20000 = 0.2 P(A2) = 10000/20000 = 0.5 P(A3) = 6000/20000 = 0.3

P(B/A1) = 0.05 P(B/A2) = 0.04 P(B/A3) = 0.08

The above information can be put in the following form.

Event   Prior Probability   Conditional Probability   Joint Probability   Posterior Probability
        P(Ai)               P(B/Ai)                   P(Ai ∩ B)           P(Ai/B)
A1      0.2                 0.05                      0.010               0.010/0.054 = 0.1852
A2      0.5                 0.04                      0.020               0.020/0.054 = 0.3704
A3      0.3                 0.08                      0.024               0.024/0.054 = 0.4444
Total   1.00                                          P(B) = 0.054        1.0000

∴ Required Probability = 1- [P(A3/B)]

= 1 – 0.4444 = 0.5556

Learning Objective 2

Understand the Concepts of Random variable and Its Applications

Random Variable

If to every value Xi of the variable in the sample space we can assign a real-valued function P(Xi) such that

i. P(Xi) = P[X = Xi] for all i

ii. P(Xi) ≥ 0 for all i

iii. ∑ P(Xi) = 1

then X is called a Random Variable.

If X is a discrete random variable then P(X) is known as the probability mass function of X.

If X is a continuous random variable then P(X) is called the probability density function and
is denoted by f(X).

For example, let us consider the tossing of three coins. The resulting events are:

No. of Heads (Xi)   Probability P(Xi)
3                   ⅛
2                   ⅜
1                   ⅜
0                   ⅛
Total               1

For every Xi we are able to assign a P(Xi) such that ∑ P(Xi) = 1. The number of heads and the corresponding probabilities
form a probability distribution.

A systematic presentation of random variable with its value and probabilities is called a
probability distribution of that random variable. The distribution will have its mean and
standard deviation.

Mathematical expectation and variance of a Random Variable.

Mathematical expectation of a random variable is denoted by E(X) and is defined as

E(X) = ∑ Xi P(Xi)

Its variance is given by

Var(X) = E(X²) – [E(X)]²

where E(X²) = ∑ Xi² P(Xi)

Its standard deviation is

S.D(X) = √Var(X) = √(E(X²) – [E(X)]²)


Example 18:

A random variable takes the values -3, -2, 1, 0, 4, 6 with probabilities 1/12, 2/12, 3/12,
4/12, 1/12, 1/12 respectively. Find its mean (expected value) and variance.

Solution: Given

Xi      P(Xi)   Xi P(Xi)   Xi² P(Xi)

-3      1/12    -3/12      9/12
-2      2/12    -4/12      8/12
1       3/12    3/12       3/12
0       4/12    0          0
4       1/12    4/12       16/12
6       1/12    6/12       36/12
Total   1       6/12       72/12 = 6

∴ E(X) = ∑ Xi P(Xi) = 6/12 = ½

E(X²) = ∑ Xi² P(Xi) = 6

Var(X) = E(X²) – [E(X)]²

= 6 – ¼ = 23/4

S.D(X) = √23 / 2 ≈ 2.40
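The table method of Example 18 can be reproduced with exact fractions, which avoids rounding in the intermediate columns. This is a minimal sketch; the function name expectation_and_variance is an assumption made for illustration.

```python
from fractions import Fraction
from math import sqrt


def expectation_and_variance(values, probabilities):
    """E(X) = sum Xi P(Xi) and Var(X) = E(X^2) - [E(X)]^2 for a discrete variable."""
    assert sum(probabilities) == 1, "probabilities must add up to 1"
    e_x = sum(x * p for x, p in zip(values, probabilities))
    e_x2 = sum(x * x * p for x, p in zip(values, probabilities))
    return e_x, e_x2 - e_x ** 2


values = [-3, -2, 1, 0, 4, 6]
probabilities = [Fraction(k, 12) for k in (1, 2, 3, 4, 1, 1)]

mean, variance = expectation_and_variance(values, probabilities)
print(mean)                      # 1/2
print(variance)                  # 23/4
print(round(sqrt(variance), 4))  # 2.3979, i.e. sqrt(23)/2
```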

Example 19: Mr. A and Mr. B play a game. If “A” picks an even number from 1 to 6, B
will pay him an amount equal to double the number picked. If “A” picks an odd
number, he has to pay B an amount equal to double the number picked. What is A’s
expectation?

Solution: Let Xi be A’s gain and P(Xi) be its probability. Then we have

No.     Xi      P(Xi)   Xi P(Xi)

1       -2      1/6     -2/6
2       4       1/6     4/6
3       -6      1/6     -6/6
4       8       1/6     8/6
5       -10     1/6     -10/6
6       12      1/6     12/6
Total           1       6/6 = 1

∴ Expectation of “A” is E(X) = 1

Example 20: If Xi is a random variable with the following distribution, find i) P(Xi ≥ 3)
ii) P(Xi = 0) iii) P(1 ≤ Xi ≤ 3) iv) P(Xi ≥ 4)

Xi      -3   -2   0    1    2    3    4   5
P(Xi)   K    2K   2K   3K   3K   2K   K   K

Solution: Since Xi is a random variable, ∑ P(Xi) = 1

∴ K + 2K + 2K + 3K + 3K + 2K + K + K = 15K = 1, so K = 1/15

i) P(Xi ≥ 3) = P(Xi = 3) + P(Xi = 4) + P(Xi = 5)

= 2K + K + K = 4K = 4/15

ii) P(Xi = 0) = 2K = 2/15

iii) P(1 ≤ Xi ≤ 3)

= P(Xi = 1) + P(Xi = 2) + P(Xi = 3)

= 3K + 3K + 2K = 8K = 8/15

iv) P(Xi ≥ 4) = P(Xi = 4) + P(Xi = 5)

= K + K = 2K = 2/15


Probability plays an important role in the decision making process. The basic definitions and
approaches were explained with examples. The situations in which the different
rules are to be used were also explained with examples.


Individuals and corporations generate a great deal of data that resemble certain theoretical
distributions. Since many characteristics of the theoretical distributions have been
derived mathematically, we can make use of them for a quick analysis of the observed
distributions. Examples of observed distributions are:
i. Number of male children in a family.

ii. Number of defectives produced per production run

iii. Number of employees drawing salary in some brackets.

These theoretical distributions are formed under certain assumptions and are divided into two groups:

a) Discrete probability distributions and

b) Continuous probability distributions.

Learning Objective 1

Understand the Concept of Bernoulli Distribution

Bernoulli Distributions:

A variable which assumes values 1 and 0 with probabilities p and q=1-p, is called
Bernoulli variable. It has only one parameter p. For different values of p (0≤ p≤ 1), we
get different Bernoulli distributions.

1 represents the occurrence of success

0 represents the occurrence of failure.

In other words, the assumption for the distribution is that the outcome of an experiment is of
dichotomous nature, i.e. success / failure, present / absent, defective / non-defective, yes /
no, etc.

Example: When a fair coin is tossed the outcome is either head or tail. The variable “X”
assumes 1 or 0.

Repetition of Bernoulli experiment

An experiment which results in two mutually exclusive and exhaustive outcomes is called
a Bernoulli experiment. Let a Bernoulli experiment be repeated “n” times under identical
conditions. Let Xi, for i = 1 to n, assume the value 1 or 0. Then each Xi is a Bernoulli variate
with probability of success p; its mean is p and its variance is pq. Let X = X1 + X2 + …… + Xn denote the number of successes in the “n”
repetitions. Then X has mean np and variance npq, and it follows the Binomial Distribution.

Learning Objective 2

Understand the Concept of Binomial Distribution

1. Binomial Distribution:

It is a discrete probability distribution. Its probability mass function is given by

P(X = x) = nCx q^(n–x) p^x, x = 0, 1, 2, ……, n. The Binomial Distribution is given by the expansion

(q + p)^n = q^n + nC1 q^(n–1) p + nC2 q^(n–2) p^2 + …… + p^n

The successive terms of the expansion give the probabilities of 0, 1, 2, ……, n
successes. The mean and variance of the distribution are np and npq respectively. “n” and “p”
are its parameters. It is a unimodal distribution. For fixed n, as p increases (or for fixed p, as n
increases), the distribution shifts from left to right.

a. Assumption under which Binomial Distribution can be applied.

i. The experiment should be of dichotomous nature.

ii. The probability of success should remain the same from experiment to experiment.

iii. Experiments should be conducted under identical conditions.

iv. Experiments should be statistically independent.

b. Examples of Binomial Variate

1. Number of defectives in a random sample of 6 articles drawn from a manufactured lot.
2. Number of seeds germinating among 10 seeds sown.
3. Number of heads turned up in tossing 8 coins.

c. The recurrence relation between successive terms of the Binomial expansion is
given by

Tx = [(n – x + 1) / x] . (p / q) . Tx–1

where Tx–1 = N . P(X = x – 1) and N is the total frequency. This recurrence formula helps us
to construct the theoretical distribution for a given observed distribution.

There are 3 types of problems in Binomial Distribution: i) to find the probability of events, ii)
to find expected values, iii) given the parameters, to find the distribution.
Type i)

Example 1: An unbiased coin is tossed 6 times. What is the probability that the
tosses will result in i) exactly two heads ii) at least 5 heads iii) at most two heads
iv) not more than one head v) not less than five heads vi) at least one head?

Solution: Let “A” be the event of getting a head

Given p = 1/2, q = 1/2, n = 6

∴ Binomial Distribution is (1/2 + 1/2)^6

i. P(X = 2) = 6C2 (1/2)^(6–2) . (1/2)^2 = 15/64

ii. P(X ≥ 5) = P(X = 5) + P(X = 6)

= 6C5 (1/2)^(6–5) . (1/2)^5 + 6C6 (1/2)^(6–6) . (1/2)^6 = 6/64 + 1/64 = 7/64

iii. P(X ≤ 2) = P(X = 0) + P(X = 1) + P(X = 2)

= (1/2)^6 + 6C1 (1/2)^(6–1) . (1/2)^1 + 6C2 (1/2)^(6–2) . (1/2)^2 = 1/64 + 6/64 + 15/64 = 11/32

iv. P(X ≤ 1) = P(X = 0) + P(X = 1) = 1/64 + 6/64 = 7/64

v. P(X ≥ 5) = P(X = 5) + P(X = 6) = 7/64

vi. P(X ≥ 1) = 1 – P(X = 0) = 63/64
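The answers to Example 1 can be reproduced from the probability mass function. The sketch below is illustrative; the function name binomial_pmf is introduced here, and exact fractions are used so that the results match the worked values term by term.

```python
from fractions import Fraction
from math import comb


def binomial_pmf(x: int, n: int, p):
    """P(X = x) = nCx . q^(n-x) . p^x with q = 1 - p."""
    q = 1 - p
    return comb(n, x) * q ** (n - x) * p ** x


n, p = 6, Fraction(1, 2)

exactly_two = binomial_pmf(2, n, p)
at_least_five = sum(binomial_pmf(x, n, p) for x in range(5, n + 1))
at_most_two = sum(binomial_pmf(x, n, p) for x in range(0, 3))
at_least_one = 1 - binomial_pmf(0, n, p)

print(exactly_two)    # 15/64
print(at_least_five)  # 7/64
print(at_most_two)    # 11/32
print(at_least_one)   # 63/64
```

Passing a float such as 0.2 for p works in the same way when decimal answers are wanted.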

Example 2: The probability of an employee getting an occupational disease is 20%.
In a firm having 5 employees, what is the probability that i) none ii) exactly
two iii) more than 4 will contract the disease?

Solution: Let “A” be the event of an employee contracting the disease.

Given P(A) = 0.2 = p

∴ q = 1 – 0.2 = 0.8, n = 5

∴ Binomial Distribution is (0.8 + 0.2)^5

i. P(X = 0) = (0.8)^5 = 0.32768

ii. P(X = 2) = 5C2 (0.8)^3 (0.2)^2

= 10 x 0.512 x 0.04

= 0.2048

iii. P(X > 4) = P(X = 5) = (0.2)^5

= 0.00032

Example 3: The probability that a bomb dropped on a bridge hits it is 0.5. Eight
bombs are dropped on the bridge. The bridge will be destroyed if any two bombs
fall on it. i) Find the probability that all bombs hit it ii) the bridge is destroyed.


Solution: Let the probability that a bomb hits the bridge be p.

Given p = 0.5, q = 1 – p = 0.5, n = 8

∴ Binomial Distribution is (q + p)^n = (0.5 + 0.5)^8

i. P(X = 8) = p^8 = (1/2)^8 = 1/256

ii. The bridge is destroyed if 2 or more bombs fall on it

∴ P(X ≥ 2) = 1 – [P(X = 0) + P(X = 1)] = 1 – [1/256 + 8/256] = 247/256

Type ii)

Example 4: A random sample of 5 sachets of coconut oil was examined and two
were found to be leaking. A wholesaler receives six hundred and twenty five
packets, each containing 5 sachets. Find the expected number of packets that
contain exactly one leaking sachet.

Solution: Given n = 5

Probability of leaking p = 2/5 ∴ q = 3/5, N = 625

Binomial Distribution = (3/5 + 2/5)^5

∴ P(X = 1) = 5C1 (3/5)^(5–1) . (2/5)^1 = 162/625

∴ Expected no. of packets containing exactly one leaking sachet = N . P(X = 1) = 625 x 162/625 = 162


Type iii) Examples

Example 5: Bring out the fallacy, if any, in the following statement on Binominal
Distribution. “The mean of a B.D is 4 and variance is 5″.


Solution: Given np = 4 (Mean)……………..(1)

npq = 5 (Variance)………….(2)

Therefore, q = npq / np = 5/4

Since q > 1, which is impossible for a probability, the statement is wrong.

Example 6: Find the probability that X = 3 for a Binomial Distribution whose

mean is 3 and variance is 2.


Solution: Given np = 3 and npq = 2, q = npq / np = 2/3 and p = 1 – q = 1/3

Since np = 3,

n . 1/3 = 3 or n = 9

∴ B.D is (2/3 + 1/3)^9

∴ P(X = 3) = 9C3 (2/3)^6 . (1/3)^3 = 1792 / 6561

Learning Objective 3

Understand the Concept of Poisson Distribution

Poisson Distribution:

It is a discrete probability distribution. Its probability mass function is given by

P(X = x) = e^(–m) m^x / x!

X varies from 0 to infinity. The mean and the variance of the distribution are both equal to m. Its
standard deviation is √m. “m” is called the parameter of the distribution. “m” is
also given by np, i.e. m = np, where p is the probability of success and n is the
number of trials. It is a unimodal distribution. It is also known as the distribution
of “RARE EVENTS”. It is the limiting form of the Binomial Distribution as n tends
to infinity and p tends to zero, with np remaining constant.


Assumption under which the distribution can be applied are

a. The outcome of trial / experiment must be of dichotomous nature.

b. The probability of success must remain the same for trials.

c. The trials should be conducted under identical conditions.

d. The trials should be statistically independent.

e. The probability of success should be very small and n should be large,
such that np is a constant m.

[Generally p < 0.1 and n > 10]

Some real-life Poisson variates:

1. Number of accidents in any traffic circle.
2. Number of incoming telephone calls at an exchange per minute.
3. Number of radio-active particles emitted by substances.
4. Number of defects in a product.
5. Number of micro-organisms developed during a period.

Recurrence Relation

The recurrence relation between successive terms is given by

P(X = x + 1) = [m / (x + 1)] . P(X = x)

Example 7: Suppose 2 houses in a thousand catch fire in a year and there are
2000 houses in a village. What is the probability that

i) none ii) at least one iii) not more than 2 houses catch fire?

Solution: Given the probability of a house catching fire

p = 2/1000 = 0.002, n = 2000

∴ m = np = 2000 x 0.002 = 4

Therefore the required probabilities are

i. P(X = 0) = e^(–4) = 0.01832

ii. P(X ≥ 1) = 1 – P(X = 0) = 1 – 0.01832 = 0.98168

iii. P(X ≤ 2) = P(X = 0) + P(X = 1) + P(X = 2)

= e^(–4) [1 + 4 + 4^2 / 2!] = 13 e^(–4) = 0.2381
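The Poisson probabilities in Example 7 can be checked with a short script. This is a minimal sketch using only the standard library; the function name poisson_pmf is an assumption introduced for illustration.

```python
from math import exp, factorial


def poisson_pmf(x: int, m: float) -> float:
    """P(X = x) = e^(-m) . m^x / x! for a Poisson variate with parameter m."""
    return exp(-m) * m ** x / factorial(x)


n, p = 2000, 0.002
m = n * p  # parameter m = np = 4

none = poisson_pmf(0, m)
at_least_one = 1 - none
not_more_than_two = sum(poisson_pmf(x, m) for x in range(3))

print(round(none, 5))               # 0.01832
print(round(at_least_one, 5))       # 0.98168
print(round(not_more_than_two, 4))  # 0.2381
```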

Example 8: 1 percent of the bulbs manufactured by a firm are expected to be
defective. A carton contains 200 bulbs. Find the probability that the carton
contains 3 or more defective bulbs.

Solution: Given p = 0.01, the probability that a bulb is defective

n = 200

∴ m = np = 200 x 0.01 = 2

P(X ≥ 3) = 1 – [P(X = 0) + P(X = 1) + P(X = 2)] = 1 – e^(–2) [1 + 2 + 2^2 / 2!] = 1 – 5 e^(–2)

= 1 – 0.6767 = 0.3233

Example 9: On an average there are 3 mistakes on a page of a book. The book
contains 200 pages. What is the probability that a randomly selected page has
exactly one mistake?

Solution: Given m = 3

P(X = 1) = e^(–3) . 3^1 / 1! = 3 x 0.04979 = 0.1494


Example 10:

In example 9, how many pages would you expect to be free from mistakes.


Solution: Given m = 3, n = 200

P(X = 0) = e^(–3) = 0.04979

∴ Expected number of pages to be free from mistakes is given by

n . P(X = 0) = 200 x 0.04979

= 9.958 ≈ 10 pages.

Type iii)

Example 11: X is a Poisson Variate such that P(X = 1) = P(X = 2). Find P(X = 0)


Solution: Let “m” be the parameter of the distribution.

P(X = 1) = P(X = 2)

or e^(–m) m / 1! = e^(–m) m^2 / 2!

or m / 1 = m^2 / 2, so m = 2

∴ P(X = 0) = e^(–2) = 0.13534

Learning Objective 4

Understand the Concept of Normal Distribution

Normal Distribution

1. It is a continuous probability distribution.
2. Its probability density function is given by
   f(X) = [1 / (σ √(2π))] e^(–(X – µ)^2 / (2σ^2))
3. X varies from – ∞ to + ∞.
4. Its mean is µ and its standard deviation is σ.
5. µ and σ are the parameters of the distribution.
6. It is a bell-shaped curve.
7. It is symmetric about its mean.
8. The mean divides the curve into 2 equal portions.
9. Its Quartile Deviation, Q.D ≈ 2/3 σ
10. Its Mean Deviation, M.D ≈ 4/5 σ
11. The X-axis is an asymptote to the curve [an asymptote is a straight line that
the curve approaches and touches only at infinity].
12. The points of inflexion occur at µ ± σ.
13. It is a unimodal distribution.
14. Mean, Median and Mode coincide.
15. The areas under the normal curve within certain limits are as follows: about 68.3% of the
area lies within µ ± σ, about 95.4% within µ ± 2σ and about 99.7% within µ ± 3σ.

Standard Normal Distribution

Any Normal Distribution can be converted into a Standard Normal Distribution by
the transformation Z = (X – µ) / σ. Z is called the Standard Normal Variate. Its
distribution is the standard normal distribution, whose probability density
function is given by

φ(Z) = [1 / √(2π)] e^(–Z^2 / 2)

Z varies from – ∞ to + ∞. The mean of its distribution is zero and its standard
deviation is 1. Statisticians have developed the standard normal table. The table
gives the probability that the variate will lie between 0 and Z. Therefore, to solve any
problem in Normal Distribution, we convert it to the standard normal distribution,
calculate Z and then refer to the table.

Normal distribution is the limiting form of Binomial Distribution.


Example 11:

The weights of Bournvita packs filled by a filling machine follow a normal
distribution with a mean weight of 500 gms and a standard deviation of 10 gms. A
pack is selected at random. What is the probability that i) its weight will exceed
515 gms ii) its weight lies within 480 to 520 gms? iii) What proportion of packs
will weigh less than 480 gms or more than 520 gms? If 10,000 packs are supplied,
how many will be rejected, if 480 gms and 520 gms are the lower and upper limits for
acceptance?

Solution: To solve such problems we draw the normal curve and represent the
information given in the problem on it.

i. P(X ≥ 515) = 0.5 – P(500 ≤ X ≤ 515)

= 0.5 – P[0 ≤ Z ≤ 1.5], since Z = (515 – 500) / 10 = 1.5

= 0.5 – 0.4332

= 0.0668

ii. P[480 ≤ X ≤ 520]

= P[480 ≤ X ≤ 500] + P[500 ≤ X ≤ 520]

= P[–2 ≤ Z ≤ 0] + P[0 ≤ Z ≤ 2]

= 0.4772 + 0.4772

= 0.9544

iii. The probability of acceptance is as found in (ii),

P(480 ≤ X ≤ 520) = 0.9544

If the weight lies outside these values then the pack will be rejected

∴ Probability of Rejection = 1 – 0.9544 = 0.0456

∴ Number of packs that will be rejected = NP = 10000 x 0.0456 = 456
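The table look-ups in this example can be cross-checked by computing the normal distribution function directly. The sketch below uses the error function from the standard library; the helper name normal_cdf is an assumption made here. Because it does not round to four-place table values, the results may differ from the worked figures in the last digit.

```python
from math import erf, sqrt


def normal_cdf(x: float, mean: float, sd: float) -> float:
    """P(X <= x) for a normal variate, via Z = (x - mean) / sd."""
    z = (x - mean) / sd
    return 0.5 * (1 + erf(z / sqrt(2)))


MEAN, SD = 500, 10

p_exceeds_515 = 1 - normal_cdf(515, MEAN, SD)
p_within_limits = normal_cdf(520, MEAN, SD) - normal_cdf(480, MEAN, SD)
p_rejected = 1 - p_within_limits

print(round(p_exceeds_515, 4))    # 0.0668
print(round(p_within_limits, 4))  # 0.9545 (tables give 0.4772 + 0.4772 = 0.9544)
print(round(10000 * p_rejected))  # 455 (456 with the table value)
```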

Type iii)

Example 12:

The sales volumes of 1000 retail outlets of a soap company follow a Normal
Distribution. 20% of the retail outlets sell less than 50 units per day and 15% of
them sell 200 units and above. a) Find the mean and standard deviation of the sales
volume. b) Find the expected number of retail outlets that sell between 50
and 118 units.

Solution: Let µ and σ be the mean and the standard deviation of the sales volume.

The given information can be represented as above
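One way to carry this example through is to turn the two percentages into Z values and solve the resulting pair of equations, (50 – µ) / σ = z1 and (200 – µ) / σ = z2. The Python sketch below does this with statistics.NormalDist (Python 3.8 or later); it is an illustration under that approach, not the original worked solution, and the variable names are assumptions. Values read from a printed table (about –0.84 and 1.04) give slightly different figures.

```python
from statistics import NormalDist

standard = NormalDist()  # standard normal: mean 0, standard deviation 1

# 20% of outlets sell less than 50 units: P(Z < z1) = 0.20
z1 = standard.inv_cdf(0.20)
# 15% of outlets sell 200 units and above: P(Z < z2) = 0.85
z2 = standard.inv_cdf(0.85)

# Subtracting the two equations eliminates the mean:
# (200 - 50) / sigma = z2 - z1
sigma = (200 - 50) / (z2 - z1)
mu = 50 - z1 * sigma

# Expected number of outlets (out of 1000) selling between 50 and 118 units.
sales = NormalDist(mu, sigma)
expected_outlets = 1000 * (sales.cdf(118) - sales.cdf(50))

print(round(mu, 1), round(sigma, 1))  # roughly 117 and 80
print(round(expected_outlets))        # roughly 300 outlets
```

Since the mean comes out close to 118, the interval from 50 to 118 covers approximately the 30% of outlets lying between the 20th percentile and the mean.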


Quick analysis of observed data can be done if it is identified with the theoretical
distribution. The probabilities associated with random Variate of the distribution
help us to know the chances of occurrence of several events within specified
values. We can extend the solution to the cost aspects also.


In different fields of human activity, and in the ordinary actions of our daily life, the
decision making process is based on the observation of a few units which form a portion
of the total population. This process of studying only a portion of the population and
making decisions involves risk, the risk of making wrong decisions. Evaluation of risk
will be discussed in the chapter on Testing of Hypothesis. This unit deals with the various
techniques of drawing samples from the population.

Learning Objective 1

Understand Sampling Technique and the Concepts of Population and Sample

Population and Sample

a. Universe or Population:

Statistical surveys or enquiries deal with studying various characteristics of units belonging
to a group. The group consisting of all the units is called the Universe or Population.

Example: In a statistical survey aimed at determining the average per capita income of the
people in a city, all earning individuals in the city form the population.

If the population consists of a finite number of individuals, then it is called a Finite
Population. A population consisting of an infinite number of units, or of units such that it is
practically impossible to observe all of them, is called an Infinite Population.

Although many populations appear to be exceedingly large, no truly infinite population
of physical objects actually exists. Given limited resources and time, it is practically not
possible to count the number of grains of sand on a beach. Such populations are treated
as infinite populations for our study.

Populations may further be classified as Existent or Hypothetical. A population
consisting of concrete objects, like the books in a library, is known as an existent population.
Throwing a coin an infinite number of times produces a Hypothetical Population.

b. Sample: A sample is a finite subset of a population, drawn from it to estimate the characteristics of
the population. Sampling is a tool which enables us to draw conclusions about the
characteristics of the population.

The advantages of sampling are:

• In short time we get maximum information about the population.

• It results in considerable amount of saving of time and labour.
• The organization and administration of a sample survey is relatively much less.
• The results obtained are reliable and always possible to attach degree of
• There is a possibility of obtaining detailed information. In other words there is a
greater scope.
• In case of infinite population, it is the only available method.
• If the units are destroyed or affected adversely in the course of investigation, then
the only method is sampling.

Sampling Theory

The sampling theory is based on the following important laws.

a. Law of Statistical Regularity:

The law lays down that a group of units chosen at random from a large group tends to
possess the characteristics of that large group.

For example, if a particular characteristic of the population follows a certain shape of distribution, then the
same characteristic will tend to follow the same shape in the sample.

b. Principle of Inertia of large numbers:

This principle states that “other things being equal, as the sample size increases, the
results tend to be more reliable and accurate”.

Suppose the population mean is 25 units. If a sample of size 50 results in an average of 24.5
units, then a larger sample of size 100 may be expected to give an average closer to 25, say 24.8 units. In other words, the larger the
sample size, the more accurate the result tends to be.

c. Principle of persistence of small numbers:

If some of the units in a population possess markedly distinct characteristics, then it will
be reflected in the sample values also.

For example, if there are 300 Blind persons in a population of 10,000 persons, then a
sample of hundred will have more or less same proportion of Blind persons in it.

d. Principle of Validity:

A sampling design is said to be valid if it enables us to obtain valid tests and estimates of
population parameters.

e. Principle of Optimization:

This principle aims at i) obtaining a desired level of efficiency at minimum cost, or
ii) obtaining the maximum possible efficiency at a given level of cost.

Terms used in sampling theory are

a. Parameter: Any statistical measure, like the mean, median etc., calculated from population
values is known as a parameter of the population and is denoted by a Greek letter (µ,
σ etc.).

b. Statistic: Any statistical measure calculated from the sample is known as a statistic and is
denoted by an English letter (X, S etc.).

c. Sampling Distribution: Consider the selection of 2 numbers from 1, 2, 3, 4, 5. The
possible combinations and their means are tabulated below.

Combination No Numbers Selected Average

1 1,2 1.5
2 1,3 2
3 1,4 2.5
4 1,5 3
5 2,3 2.5
6 2,4 3
7 2,5 3.5
8 3,4 3.5
9 3,5 4
10 4,5 4.5

This gives the means of samples of size 2. We form a distribution of the sample means.

Mean (x)   Frequency (f)   fx     fx²
1.5        1               1.5    2.25
2          1               2.0    4.00
2.5        2               5.0    12.50
3          2               6.0    18.00
3.5        2               7.0    24.50
4          1               4.0    16.00
4.5        1               4.5    20.25
Total      N = 10          30     97.50

∴ Mean of the distribution = Σ fx / N = 30 / 10 = 3

Mean of the population is (1 + 2 + 3 + 4 + 5) / 5 = 3

The above table represents the sampling distribution of means. We observe that the mean of
the sample means is equal to the population mean.

d. Standard Error

The standard deviation of the sample means,

√[Σ fx² / N – (Σ fx / N)²] = √(97.5 / 10 – 3²) = √0.75 = 0.866,

is called the Standard Error of the Mean.

In other words the standard deviation of sampling distribution of any statistic is called
standard error of that statistic. Standard error helps us in

i) Testing of hypothesis.

ii) Constructing confidence interval for the statistics.

iii) Giving reliability measure for the statistic by its reciprocal value.
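The sampling distribution of means for samples of size 2 drawn from 1, 2, 3, 4, 5, together with its standard error, can be generated mechanically. The following Python sketch enumerates every possible sample; the variable names are assumptions introduced for illustration.

```python
from itertools import combinations
from statistics import mean, pstdev

population = [1, 2, 3, 4, 5]

# Every possible sample of size 2 (without replacement) and its mean.
samples = list(combinations(population, 2))
sample_means = [sum(pair) / 2 for pair in samples]

mean_of_sample_means = mean(sample_means)
# The standard error is the standard deviation of the sampling distribution,
# so the population form of the standard deviation (pstdev) is used.
standard_error = pstdev(sample_means)

print(len(samples))              # 10
print(mean_of_sample_means)      # 3.0
print(mean(population))          # 3
print(round(standard_error, 3))  # 0.866
```

The mean of the sample means equals the population mean, and the standard error equals √0.75, the value obtained from the fx and fx² columns of the table.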

Errors in Statistics

The term error denotes the difference between the population value and its estimate provided
by the sampling technique. The term is therefore not used in its ordinary sense in statistics.

There are four types of errors

a) Sampling error

b) Non – Sampling errors

c) Biased errors

d) Unbiased errors

a. Sampling error: The sample results are bound to differ from the population results, since
a sample is only a small portion of the population. This error is also known as inherent error and
cannot be avoided. It is not worthwhile to try to eliminate it completely.

These errors may be due to the following factors.

i. faulty selection of sample.

ii. Substitution of units to be studied.

iii. Faulty demarcation of sampling units

iv. Error due to bias in estimation.

However they follow random or chance variations and tend to cancel out each other on

b. Non-Sampling Errors

They are attributed to factors that can be controlled and eliminated by suitable actions. It
is worthwhile to eliminate these errors. They are due to the following factors.

i. Faulty planning, faulty definitions.

ii. Defective methods of interviewing.

iii. Personal bias of investigator.

iv. Lack of trained and qualified investigators.

v. Respondents' failure to answer.

vi. Improper coverage.

vii. Compiling errors.

viii. Publication errors.

c. Biased errors

They arise in both census and sampling methods. They are due to the personal bias of the
investigator and the instruments used for measuring. They are also due to faulty
collection of data, respondents' bias and bias due to non-response.

Biased errors have a tendency to grow with sample size. Therefore they are also known
as cumulative errors. The magnitude of biased errors is directly proportional to sample
size.

d. Unbiased errors

Errors in which over-estimates and under-estimates are equal, so that they offset each
other, are known as unbiased errors. They are also known as compensatory errors. They do
not increase with sample size.

Measures of Statistical Errors

i. Absolute Error: is the difference between the true value (t) and the observed value (a).
Symbolically, AE = t – a. It is independent of the magnitude of the actual value.

ii. Relative Error: is the ratio of the Absolute Error to the actual value. Symbolically,
RE = AE / t = (t – a) / t.

It provides a degree of error for comparison purposes between different sets of data.
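As a small illustration of the two measures, the sketch below computes the absolute and relative error for a hypothetical true value of 200 and observed value of 190. The figures and function names are assumptions made for the example, and the relative error is taken as the absolute error divided by the true value.

```python
def absolute_error(true_value, observed_value):
    """Absolute error: true value minus observed value (AE = t - a)."""
    return true_value - observed_value


def relative_error(true_value, observed_value):
    """Relative error: absolute error divided by the true value."""
    return absolute_error(true_value, observed_value) / true_value


# Hypothetical figures, for illustration only
t, a = 200, 190
ae = absolute_error(t, a)
re = relative_error(t, a)
print(ae, re)
```

Because the relative error is a ratio, it can be compared across data sets measured on different scales, whereas the absolute error cannot.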

Learning Objective 2

Know about the Different Types of Sampling

The sampling techniques may be broadly classified as follows:

i. Probability Sampling

ii. Non-Probability Sampling

Probability Sampling.

It provides a scientific technique of drawing samples from the population according to
a law in which each unit has a predetermined probability of being included in the
sample. Different ways of assigning probability are

i. Each unit has the same chance of being selected.

ii. Sampling units have varying probabilities.

iii. Units have probability proportional to their size.

Some of the important sampling designs are:

i. Simple Random Sampling

Under this technique sample units are drawn in such a way that each and every unit in the
population has an equal and independent chance of being included in the sample. If the
sample unit is replaced before drawing the next unit, then it is known as Simple Random
Sampling with replacement [SRSWR]. If the sample unit is not replaced before drawing the
next unit, then it is called Simple Random Sampling without replacement [SRSWOR]. In the
first case the probability of drawing a unit is 1/N, where N is the population size. In the
second case the probability of drawing a unit is 1/Nn.

Selection of Simple Random Sampling can be done by a) Lottery Method b) the use of
table of random numbers.

a) In the lottery method we identify each and every unit with a distinct number by
allotting an identical card. The cards are put in a drum and thoroughly shuffled before
each unit is drawn.

b) There are several random number tables. They include Tippett's Random Number Table,
Fisher and Yates' Tables, Kendall and Babington Smith's random tables, the Rand
Corporation random numbers etc. A specimen of Tippett's random numbers is used in the
example that follows.

Suppose we want to select 10 units from a population of size 100. We number the
population units from 00 to 99. Then we start taking 2 digits at a time. Suppose we start
with 41 (second row); then the other numbers selected will be 67, 95, 24, 15, 45, 13, 96, 72, 03.
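In practice a pseudo-random number generator can stand in for the lottery or the printed table. The following is a minimal sketch using Python's standard `random` module; the seed value 42 is an arbitrary assumption made only so that the run is repeatable.

```python
import random

rng = random.Random(42)  # fixed seed (an assumption) so the run is repeatable

population = list(range(100))  # units numbered 00 to 99

# SRSWOR: without replacement, no unit can appear twice
srswor = rng.sample(population, 10)

# SRSWR: with replacement, the same unit may be drawn again
srswr = rng.choices(population, k=10)

print(srswor)
print(srswr)
```

`sample` never repeats a unit, while `choices` may, which mirrors the distinction between the two schemes described above.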

ii. Stratified Random Sampling

This sampling design is most appropriate if the population is heterogeneous with

respect to characteristic under study or the population distribution is highly

We subdivide the population into several groups or strata such that i) units within
each stratum are more homogeneous, ii) units in different strata are heterogeneous
and iii) strata do not overlap; in other words, every unit of the population belongs to
one and only one stratum.
The criteria used for stratification are geographical, sociological, age, sex,
income etc. The population of size N is divided into ‘K’ relatively homogeneous
strata of sizes N1, N2, …, Nk such that N1 + N2 + … + Nk = N.
Then we draw a simple random sample from each stratum, either proportional to the
size of the stratum or with an equal number of units from each stratum.

Merits of the Sampling Technique are

a. Sample is more representative.

b. Provides more efficient estimate

c. Administratively more convenient

d. Can be applied in situations where different degrees of accuracy are desired for
different segments of the population

Demerits of the Sampling Technique are

a. Many times the stratification is not effective.

b. Appropriate sample sizes are not drawn from each of the strata.

Suppose 200, 300 and 500 items are produced by factories located at three cities
X, Y and Z. We wish to draw a sample of 20 items under proportional stratified
sampling. We number the units from 000 to 999, so that units 000 to 199 belong to
Factory X, 200 to 499 to Factory Y and 500 to 999 to Factory Z. Then we refer to a
random number table and select the numbers for each factory.

The numbers of sample units to be selected are

For Factory X, it is 20 x (200/1000) = 4

For Factory Y, it is 20 x (300/1000) = 6

For Factory Z, it is 20 x (500/1000) = 10

Hence the total = 20

For first factory sample units selected are

174, 192, 069, 156

For second factory sample units selected are

287, 432, 444, 482, 302, 254

For third factory sample units selected are

854, 772, 733, 741, 822, 853, 570, 802, 629, 525
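The proportional allocation in this example can be sketched in Python as follows. The unit ranges assigned to each factory and the seed are assumptions made for illustration, and a random number generator replaces the random number table, so the units drawn will differ from those listed above.

```python
import random

rng = random.Random(1)  # arbitrary seed (an assumption) for repeatability

# Assumed numbering: X gets 000-199, Y gets 200-499, Z gets 500-999
strata = {
    "X": range(0, 200),
    "Y": range(200, 500),
    "Z": range(500, 1000),
}
total_sample = 20
population_size = sum(len(units) for units in strata.values())

# Proportional allocation: n_i = n * N_i / N.
# Integer division is exact here; in general the shares need rounding.
allocation = {
    name: total_sample * len(units) // population_size
    for name, units in strata.items()
}

# Simple random sample (without replacement) inside each stratum
selected = {
    name: sorted(rng.sample(units, allocation[name]))
    for name, units in strata.items()
}

print(allocation)
print(selected)
```

When the proportional shares are not whole numbers, some rounding rule is needed so that the stratum sample sizes still add up to the required total.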

Systematic Sampling

This design is recommended if we have a complete list of sampling units arranged
in some systematic order such as geographical, chronological or alphabetical
order.

Suppose the population size is “N”. The population units are serially numbered 1
to N in some systematic order and we wish to draw a sample of “n” units. We then
divide the units from 1 to N into “n” groups such that each group has K units. This
implies nK = N or K = N/n. From the first group we select a unit at random.
Suppose the unit selected is the 6th unit; thereafter we select every Kth unit, that is,
units 6 + K, 6 + 2K and so on. If K = 20, n = 5 and N = 100 then the units selected
are 6, 26, 46, 66, 86.
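The selection rule can be written as a short Python function. This is a sketch that assumes N is an exact multiple of n; the function name and the optional `start` argument are illustrative.

```python
import random


def systematic_sample(N, n, start=None, rng=None):
    """Select n units from 1..N: a random start, then every K-th unit.

    Assumes N is an exact multiple of n, so the interval K = N / n is whole.
    """
    K = N // n  # sampling interval
    if start is None:
        # Random start within the first group of K units
        start = (rng or random).randint(1, K)
    return [start + i * K for i in range(n)]


# Reproduces the example in the text: N = 100, n = 5, first unit 6
units = systematic_sample(100, 5, start=6)
print(units)
```

Only the starting unit is random; every later unit is fixed by it, which is why a periodic pattern in the list can bias the result.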

Merits of Systematic Sampling are:-

a. Very easy to operate and easy to check

b. It saves time and labour

c. More efficient than Simple Random Sampling if we have up-to-date frame

Demerits of Systematic Sampling are:-

a. In many cases we do not get an up-to-date list.

b. It gives biased results if periodic features exist in the data.

Cluster Sampling
The total population is divided into recognizable sub-divisions, known as clusters
such that within each cluster units are more heterogeneous and between clusters
they are homogenous. The units are selected from each cluster by suitable
sampling techniques.

Multi-stage Sampling

The total population is divided in several stages, and the sampling process is
carried out stage by stage. For example, suppose we want to select 1000 colleges
from the southern states. In the first stage we may select any three states. In the
second stage we may select some districts in each selected state. In the 3rd stage, we
may select the colleges in each selected district. We may adopt any sampling technique
at each stage.
Merits of the method are:-

a. Greater flexibility in Sampling method

b. Existing division can be used.

Demerits are

a. Estimates are less accurate

b. Investigator should have knowledge of the entire population that will be



Non-Probability Sampling

Depending upon the object of enquiry and other considerations a predetermined

number of sample units is selected purposely so that they represent the true
characteristics of the population.

A serious drawback of this sampling design is that it is highly subjective in

nature. The selection of sample units depends entirely upon the personal
convenience, biases, prejudices and beliefs of the investigator. This method will
be more successful if the investigator is thoroughly skilled and experienced.

Judgment Sampling

The choice of sample items depends exclusively on the judgment of the

investigator. The investigator’s experience and knowledge about the population
will help to select the sample units. It is the most suitable method if the population
size is small.
Merits of this method are:-

a. Most useful for small population

b. Most useful to study some unknown traits of a population some of

whose characteristics are known.

c. To solve day-to-day problem

Demerits of this method are

a. It is not a scientific method

b. It has a risk of investigator’s bias being introduced.

Convenience Sampling

The sample units are selected according to convenience of the investigator. It is

also called “chunk” which refers to the fraction of the population being
investigated which is selected neither by probability nor by judgment. Further a
list or frame work should be available for the selection of the sample. There is
high chance of bias being introduced. It is used to make pilot studies.

Quota Sampling

It is a type of judgment sampling. Under this design Quotas are set up according
to some specified characteristic such as age group, income groups etc. From each
group a specified number of units are sampled according to the Quota allotted to
the group. Within the group the selection of sample units depends on personal
judgment. It has a risk of personal prejudice and bias entering the process. This
method is often used in public opinion studies.

Learning Objective 3

Learn the Concept of Determination of Sample Size

Sample size depends upon the size of the population, the resources available, the
degree of accuracy desired, homogeneity of the population, nature of study,
methods of sampling used and nature of respondents.

Formulae available to determine sample size are

(For infinite population)

Where Z is the value according to the degree of accuracy desired, P is the population
value, Ps is the sample value, which implies P – Ps is the error desired in the result,
Q = 1 – P and “n” is the sample size.

(For finite population)

N is population size.


(For finite population)


(For infinite population)


Mean of sample means

σ Population Standard Deviation

n Sample Size

Central Limit Theorem

If X1, X2, …, Xn is a random sample of size “n” from any population with mean µ and
variance σ², then the sample mean (X) is approximately normally distributed with mean µ
and variance σ²/n, provided “n” is sufficiently large.

From the theorem we infer i) the mean of the sampling distribution of mean will
be equal to the population mean ii) the sampling distribution of the mean
approaches normal distribution as the sample size increases iii) it permits us to
use sample statistics to make inference about population parameters irrespective
of the shape of frequency distribution of the population.
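A small simulation can make these inferences concrete. The sketch below draws repeated samples from a strongly skewed (exponential) population with mean 1 and standard deviation 1. The choice of population, the sample size of 50, the 2000 repetitions and the seed are all assumptions made for illustration.

```python
import random
from math import sqrt
from statistics import mean, pstdev

rng = random.Random(0)  # arbitrary seed (an assumption) for repeatability

n = 50              # size of each sample
num_samples = 2000  # number of samples drawn

# Exponential population with rate 1: mean 1, standard deviation 1, skewed
sample_means = [
    sum(rng.expovariate(1.0) for _ in range(n)) / n
    for _ in range(num_samples)
]

mean_of_means = mean(sample_means)  # should be close to the population mean
sd_of_means = pstdev(sample_means)  # should be close to sigma / sqrt(n)
theoretical_se = 1.0 / sqrt(n)

print(mean_of_means, sd_of_means, theoretical_se)
```

Although the population is far from normal, the sample means cluster around the population mean with a spread close to σ/√n.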


There are two methods of studying the characteristics of population, census and
sampling. The various advantages of sampling and the various errors that could
crop up in using these methods were explained. Mainly there are two methods of
sampling namely i) Probability Sampling ii) non-probability sampling. The merits
and demerits of each sampling method were explained. We discussed the
procedure for determining sample size. We concluded the chapter with the
importance of central limit theorem.


Everyone makes estimates. When you are ready to cross a street, you estimate the speed
of any car that is approaching, the distance between you and that car, and your own
speed. Having made these quick estimates, you decide whether to wait, walk, or run.

Learning Objective 1

Know the Estimation Procedure and Different Estimates

Reasons why estimates have to be made

All managers must make quick estimates too. The outcome of these estimates can affect
their organizations as seriously as the outcome of your decision as to whether to cross the
street. Credit managers estimate whether a purchaser will eventually pay his bills.
Prospective home buyers make estimates concerning the behaviour of interest rates in the
mortgage market. All these people make estimates without worry about whether they are
scientific but with the hope that the estimates bear a reasonable resemblance to the

Managers use estimates because in all but the most trivial decisions, they must make
rational decisions without complete information and with a great deal of uncertainty
about what the future will bring. As educated citizens and professionals, you will be able
to make more useful estimates by applying the techniques described in this and
subsequent chapters.

Making statistical inference

Statistical inference is based on estimation, and hypothesis testing. In both estimation and
hypothesis testing, we shall be making inferences about characteristics of populations
from information contained in samples. Here we infer something about a population from
information taken from a sample.

Here we try to estimate with reasonable accuracy the population proportion (the
proportion of the population that possesses a given characteristic) and the population
mean. To calculate the exact proportion or the exact mean would be an impossible goal.
Even so, we will be able to make an estimate, and implement some controls to avoid as
much of the error as possible.

Types of estimates

There are two types of estimates about a population;

1. A point estimate and

2. an interval estimate

A point estimate: is a single number that is used to estimate an unknown
population parameter. A point estimate is often insufficient because it is either
right or wrong, and we do not know how wrong it is. Therefore, a point estimate is
much more useful if it is accompanied by an estimate of the possible error.

An interval estimate: is a range of values used to estimate a population parameter.

It indicates the error in two ways: by the extent of its range and by the probability
of the true population parameter lying within that range.

Learning Objective 2

Understand the Criteria of a Good Estimator

Unbiasedness: This is a desirable property for a good estimator to have. The term
unbiasedness refers to the fact that a sample mean is an unbiased estimator of a
population mean because the mean of the sampling distribution of sample means
taken from the same population is equal to the population mean itself. We can say
that a statistic is an unbiased estimator if, on average, it tends to assume values
that are above the population parameter being estimated as frequently and to the
same extent as it tends to assume values that are below the population parameter
being estimated.

Efficiency: Another desirable property of a good estimator is that it be efficient.

Efficiency refers to the size of the standard error of the statistic. If we compare
two statistics from a sample of the same size and try to decide which one is the
more efficient estimator, we would pick the statistic that has the smaller standard
error. Suppose we choose a sample of a given size and must decide whether to use
the sample mean or the sample median to estimate the population mean. If we
calculate the standard error of the sample mean and find it to be 1.05 and then
calculate the standard error of the sample median and find it to be 1.6, we would
say that the sample mean is a more efficient estimator of the population mean
because its standard error is smaller. It makes sense that an estimator with a
smaller standard error (with less variation) will have more chance of producing an
estimate nearer to the population parameter under consideration.

Consistency: A statistic is a consistent estimator of a population parameter if as

the sample size increases, it becomes almost certain that the value of the statistic
comes very close to the value of the population parameter. If an estimator is
consistent, it becomes more reliable with large samples.

Sufficiency: An estimator is sufficient if it makes so much use of the information

in the sample that no other estimator could extract from the sample additional
information about the population parameter being estimated.

Learning Objective 3

Know the Concepts of Point Estimate, Interval Estimate and Confidence


Point estimates:
Results of a sample of 35 boxes of bolts (bolts per box)
101 103 112 102 98 97 93
105 100 97 107 93 94 97
97 100 110 106 110 103 99
93 98 106 100 112 105 100
114 97 110 102 98 112 99

Consider the table above. We have taken a sample of 35 boxes of bolts from a
manufacturing line and have counted the bolts per box. We can estimate the
population mean, i.e. the mean number of bolts per box, by taking the mean of the 35
boxes we have sampled, i.e. adding all the bolts and dividing by the number of boxes.

Thus using the sample mean x̄ as the estimator we have a point estimate of the
population mean µ.

Similarly we can use the sample variance s² to estimate the population variance,
where the sample variance s² is given by the formula s² = Σ(x – x̄)² / (n – 1).
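For the 35 boxes tabulated above, the point estimates can be computed with Python's `statistics` module. This sketch assumes the usual sample variance with divisor n – 1, which is what `statistics.variance` computes.

```python
from statistics import mean, variance

# Bolts per box for the 35 sampled boxes, row by row from the table
bolts = [
    101, 103, 112, 102, 98, 97, 93,
    105, 100, 97, 107, 93, 94, 97,
    97, 100, 110, 106, 110, 103, 99,
    93, 98, 106, 100, 112, 105, 100,
    114, 97, 110, 102, 98, 112, 99,
]

x_bar = mean(bolts)         # point estimate of the population mean
s_squared = variance(bolts)  # sample variance, divisor n - 1
s = s_squared ** 0.5         # sample standard deviation

print(x_bar, round(s_squared, 2), round(s, 2))
```

The sample mean works out to 102 bolts per box and the sample variance to about 36.12.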

Interval Estimates

The purpose of gathering samples is to learn more about a population. We can

compute this information from the sample data as either point estimates, or as
interval estimates. An interval estimate describes a range of values within
which a population parameter is likely to lie.

The marketing research director needs an estimate of the average life in months of
car batteries his company manufactures. We select a random sample of 200
batteries with a mean life of 36 months. If we use the point estimate of the sample
mean x as the best estimator of the population mean µ, we would report that the
mean life of the company’s batteries is 36 months.

The director also asks for a statement about the uncertainty that will be likely to
accompany this estimate, that is, a statement about the range within which the
unknown population mean is likely to lie. To provide such a statement, we need to
find the standard error of the mean.
If we select and plot a large number of sample means from a population, the
distribution of these means will approximate a normal curve. Furthermore, the
mean of the sample means will be the same as the population mean. Our sample
size of 200 is large enough that we can apply the central limit theorem. Suppose
we have already estimated the standard deviation of the population of the batteries
and reported that it is 10 months. Using this standard deviation we can calculate
the standard error of the mean using the formula S.E. = σ / √n.

We find the standard error S.E. = 10 / √200 = 0.707

Making the interval estimate:

We can tell the director that our estimate of the life of the company’s batteries
is 36 months, and the standard error that accompanies this estimate is 0.707. In
other words, the actual mean life for all the batteries may lie somewhere in the
interval estimate of 35.293 to 36.707 months. This is helpful but insufficient
information for the director. Next, we need to calculate the chance that the actual
life will lie in this interval or in other intervals of different widths that we might
choose, ± 2σ (2 x 0.707), ± 3σ (3 x 0.707), and so on.

The probability is 0.955 that the mean of a sample of size 200 will be within ±2
standard errors of the population mean. Stated differently, 95.5 percent of all the
sample means are within ±2 standard errors from µ. “The population mean µ will
be located within ±2 standard errors from the sample mean 95.5 percent of the
time.”
Hence from the above example we can now report to the director that the best
estimate of the life of the company’s batteries is 36 months, and we are 68.3
percent confident that the life lies in the interval from 35.293 to 36.707 months
(36 ± 1σx̄). Similarly, we are 95.5 percent confident that the life falls within the
interval of 34.586 to 37.414 months (36 ± 2σx̄), and we are 99.7 percent
confident that battery life falls within the interval of 33.879 to 38.121 months
(36 ± 3σx̄).
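The three intervals reported to the director follow directly from the standard error. A minimal Python sketch of the arithmetic, using the figures from the battery example (variable names are illustrative):

```python
from math import sqrt

sigma = 10  # population standard deviation, in months
n = 200     # sample size
x_bar = 36  # sample mean, in months

# Standard error of the mean
se = sigma / sqrt(n)

# Intervals of 1, 2 and 3 standard errors on either side of the sample mean
intervals = {
    k: (round(x_bar - k * se, 3), round(x_bar + k * se, 3))
    for k in (1, 2, 3)
}

print(round(se, 3))
for k, (low, high) in intervals.items():
    print(k, low, high)
```

Widening the interval from 1 to 3 standard errors raises the confidence attached to it, at the cost of a less precise statement.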

Interval Estimates and confidence intervals

In using interval estimates, we are not confined to ±1,2 and 3 standard errors; for
example, ± 1.64 standard errors includes about 90 percent of the area under the
curve; it includes 0.4495 of the area on either side of the mean in a normal
distribution. Similarly, ±2.58 standard error includes about 99 percent of the area,
or 49.51 percent on each side of the mean.
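The areas quoted here come from the standard normal table. They can also be obtained from Python's `statistics.NormalDist`, as in this sketch; the helper names `area_within` and `z_for_confidence` are illustrative.

```python
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, standard deviation 1


def area_within(k):
    """Area under the standard normal curve within +/- k standard errors."""
    return std_normal.cdf(k) - std_normal.cdf(-k)


def z_for_confidence(level):
    """z value whose +/- range covers the given central area."""
    return std_normal.inv_cdf(0.5 + level / 2)


# Area on one side of the mean, as read from a z table
one_side_164 = std_normal.cdf(1.64) - 0.5
one_side_258 = std_normal.cdf(2.58) - 0.5

print(round(one_side_164, 4), round(area_within(1.64), 3))
print(round(one_side_258, 4), round(area_within(2.58), 3))
print(round(z_for_confidence(0.95), 2))
```

The second helper works in the opposite direction: given a confidence level, it returns the number of standard errors to mark off on either side of the mean.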

The probability that we associate with an interval estimate is called the

confidence level. This probability indicates how confident we are that the interval
estimate will include the population parameter. A higher probability means more
confidence. In estimation, the most commonly used confidence levels are 90
percent, 95 percent, and 99 percent, but we are free to apply any confidence level.

The confidence interval is the range of the estimate we are making. If we report
that we are 90 percent confident that the mean of the population of incomes of
people in a certain community will lie between Rs. 8,000 and Rs. 24,000, then the
range Rs. 8,000-Rs. 24,000 is our confidence interval. Often, however, we will
express the confidence interval in standard errors rather than in numerical values.

Thus, we will often express confidence intervals like this: x̄ ± 1.64σx̄, where

x̄ + 1.64σx̄ = upper limit of the confidence interval

x̄ – 1.64σx̄ = lower limit of the confidence interval

Thus, confidence limits are the upper and lower limits of the confidence interval.

In this case, x̄ + 1.64σx̄ is called the upper confidence limit (UCL) and x̄ – 1.64σx̄
is the lower confidence limit (LCL).

Calculating interval Estimates of the Mean from Large Samples

If the samples are large then we use the finite population multiplier to calculate
the standard error. This is given from the previous unit as
σx̄ = (σ / √n) x √((N – n) / (N – 1)), where N is the population size and n is the
sample size.

Calculating interval Estimates of the Proportion from Large Samples

Statisticians often use a sample to estimate a proportion of occurrences in a
population. For example, the government estimates by a sampling procedure the
unemployment rate, or the proportion of unemployed people, in the country.

For a binomial distribution, the mean and the standard deviation are

Mean µ = np

And standard deviation σ = √(npq) where q = 1-p

Here n = number of trials

p = probability of success and

q = probability of failure = 1- p

Since we are taking the mean of the sample to be the mean of the population, we
actually mean that µp̄ = p, where µp̄ is the mean of the sampling distribution of the
proportion.

Similarly, we can modify the formula for the standard deviation of the binomial
distribution, √(npq), which measures the standard deviation in the number of
successes. To change the number of successes to the proportion of successes, we
divide √(npq) by n and get √(pq)/√(n).

Therefore the standard error of the proportion is Sp = √(pq)/√(n).

Example: In a very large organization the director wanted to find out what
proportion of the employees prefer to provide their own retirement benefits in
lieu of a company-sponsored plan. A simple random sample of 75 employees
was taken and it was found that 40%, i.e. 0.4, of them are interested in providing their
own retirement plans. The management requests that we use this sample to find an
interval about which they can be 99 percent confident that it contains the true
population proportion.

Here n = 75, p = 0.4; q = 1 – p = 1 – 0.4 = 0.6

Therefore standard error of the proportion = √(pq)/√(n) = √(0.4 x 0.6)/√(75) = 0.057

Therefore the interval estimate for the 99% level of confidence is 0.4 ± 2.58 (0.057),
i.e. from 0.253 to 0.547.

Therefore the proportion of the total population of employees who wish to
establish their own retirement plans lies between 0.253 and 0.547.
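The retirement-plan example can be reproduced in a few lines of Python. The sketch rounds the standard error to three decimal places before forming the limits, which is an assumption made to match the rounding used in the worked example; without that rounding the limits differ in the third decimal place.

```python
from math import sqrt

n = 75    # sample size
p = 0.4   # sample proportion preferring their own plan
q = 1 - p
z = 2.58  # z value for the 99 percent confidence level

# Standard error of the proportion, sqrt(pq / n)
se = sqrt(p * q / n)

# Rounded to three decimals, as in the worked example
se_rounded = round(se, 3)

lower = round(p - z * se_rounded, 3)
upper = round(p + z * se_rounded, 3)

print(se_rounded, lower, upper)
```

The interval is fairly wide because a sample of 75 gives a standard error of nearly 6 percentage points.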

Interval Estimates using the student’s t Distribution

So far, the sample sizes we were examining were all larger than 30. This is not
always the case. How can we handle estimates where the normal
distribution is not the appropriate sampling distribution, that is, when we are
estimating the population standard deviation and the sample size is 30 or less?
Suppose, for example, we have data from only 10 weeks. Fortunately, another
distribution exists that is appropriate in these cases. It is
called the t distribution.

Early theoretical work on t distributions was done by a man named W. S. Gosset
in the early 1900s. Gosset was employed by the Guinness Brewery in Dublin,
Ireland, which did not permit employees to publish research findings under their
own names. So Gosset adopted the pen name Student and published under that
name. Consequently, the t distribution is commonly called Student’s t
distribution, or simply Student’s distribution.

Conditions for usage:

Because it is used when the sample size is 30 or less, statisticians often associate
the t distribution with small sample statistics. This is misleading because the size
of the sample is only one of the conditions that lead us to use the t distribution.
The second condition is that the population standard deviation must be unknown.
Use of the t distribution for estimating is required whenever the sample size is 30
or less and the population standard deviation is not known. Furthermore, in using
the t distribution, we assume that the population is normal or approximately
normal.
Degrees of freedom

“There is a different t distribution for each of the possible degrees of freedom.”

What are degrees of freedom? We can define them as the number of values we
can choose freely.

We will use degrees of freedom when we select a t distribution to estimate a
population mean, and we will use n – 1 degrees of freedom, where n is the sample
size. For example, if we use a sample of 20 to estimate a population mean, we
will use 19 degrees of freedom in order to select the appropriate t distribution.

With two sample values, we have one degree of freedom (2-1 = 1), and with
seven sample values, we have six degrees of freedom (7-1 = 6). In each of these
two examples, then, we had n-1 degrees of freedom, assuming n is the sample
size. Similarly, a sample of 23 would give us 22 degrees of freedom.

Using the t – Distribution Table

Comparison between t and z tables

The table of t distribution values differs in construction from the z table or normal
distribution table used previously. The t table is more compact and shows areas
and t values for only a few percentages (10, 5, 2, and 1 Percent). Because there is
a different t distribution for each number of degrees of freedom, a more complete
table would be quite lengthy. Although we can conceive of the need for a more
complete table

A second difference in the t table is that it does not focus on the chance that the
population parameter being estimated will fall within our confidence interval.
Instead, it measures the chance that the population parameter we are estimating
will not be within our confidence interval (that is, that it will lie outside it). If we
are making an estimate at the 90 percent confidence level, we would look in the t
table under the 0.10 column (100 percent – 90 percent = 10 percent). This 0.10
chance of error is symbolized by the Greek letter alpha, α. We would find the
appropriate t values for confidence intervals of 95 percent, 98 percent, and 99
percent under the columns headed 0.05, 0.02, and 0.01, respectively.

A third difference in using the t table is that we must specify the degrees of
freedom with which we are dealing. Suppose we make an estimate at the 90
percent confidence level with a sample size of 14, which is 13 degrees of
freedom. Look under the 0.10 column until you encounter the row labelled 13.
Like a z value, the t value there of 1.771 shows that if we mark off plus and minus
1.771σ̂x̄ (estimated standard errors of x̄) on either side of the mean, the area
under the curve between these two limits will be 90 percent, and the area outside
these limits (the chance of error) will be 10 percent.

Remember that in any estimation problem in which the sample size is 30 or less
and the standard deviation of the population is unknown and the underlying
population can be assumed to be normal or approximately normal, we use the t
distribution.
Determining the Sample size in Estimation

In all the examples above, the sample size was known. Now we are
trying to determine the sample size n. If it is too small we may fail to achieve the
objective; if it is too large we will be wasting resources. Let us
examine a method that is useful in determining what sample size is
necessary for any specified level of precision.

Comparison of two ways of expressing the same confidence limits

IIM wants to conduct a survey of the annual earning of its graduates in
international placements. It knows from the past experience that the standard
deviation of its population of students is $1500. How large a sample should
be taken in order to estimate the mean annual earnings of last year's class within
$500 at the 95% level of confidence?

The problem specifies a variation of $500 on either side of the population mean.

That means zσx̄ = 500

At the 95% level of confidence we know from the z table that z = 1.96

Therefore 1.96σx̄ = 500, and that means σx̄ = 500 / 1.96 = 255

Now if the standard error of the mean is 255, that leads us to

σx̄ = σ / √n = 255. Since σ = 1500 we can find n, that is

1500 / √n = 255, therefore n = (1500 / 255)² = 34.6

This means n should be at least 35 if the institute wants to estimate the mean with
the precision it has specified.
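The two steps of this calculation combine into the single expression n = (zσ / E)², where E is the permitted error on either side of the mean. The sketch below applies it to the figures in the example; the function name and the symbol E are assumptions introduced for illustration.

```python
from math import ceil


def required_sample_size(sigma, margin, z):
    """Sample size needed so that z standard errors equal the margin.

    Solves z * sigma / sqrt(n) = margin for n.
    """
    return (z * sigma / margin) ** 2


sigma = 1500  # population standard deviation, in dollars
margin = 500  # permitted error on either side of the mean, in dollars
z = 1.96      # z value for the 95 percent confidence level

n_exact = required_sample_size(sigma, margin, z)
n_needed = ceil(n_exact)  # round up: a fraction of a unit cannot be sampled

print(round(n_exact, 2), n_needed)
```

The unrounded value differs slightly from 34.6 because the worked example rounds the standard error to 255 first; both lead to a sample of 35 after rounding up.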


In this chapter we have seen point estimates and interval estimates. These are the
foundation for inferential statistics in estimation and hypothesis testing, which we
will be discussing in the next unit. We have also seen the concept of confidence
levels and how to make estimates when the sample sizes are small and large. We
have also worked in reverse to determine a sample size, provided we know the level of
accuracy we want in the estimate. Finally, we have seen that if the sample
size is 30 or less and the population standard deviation is not known, we use
the Student’s t distribution for estimation.


Hypothesis testing begins with an assumption, called a hypothesis, that we make about a
population parameter. We assume a certain value for a population mean. To test the
validity of our assumption, we gather sample data and determine the difference between
the hypothesized value and the actual value of the sample mean. Then we judge whether
the difference is significant. The smaller the difference, the greater the likelihood that our
hypothesized value for the mean is correct. The larger the difference, the smaller the
likelihood.
Unfortunately, the difference between the hypothesized population parameter and the
actual statistic is more often neither so large that we automatically reject our hypothesis
nor so small that we just as quickly accept it. So in hypothesis testing, as in most
significant real-life decisions, clear-cut solutions are the exception, not the rule.

Learning Objective 1

Understand the Basic Concepts of Testing of Hypotheses

Assumptions: Although hypothesis testing sounds like some formal statistical term
completely unrelated to business decision making, in fact managers propose and test
hypotheses all the time. “If we drop the price of this car model by Rs.1,500, we’ll sell
50,000 cars this year” is a hypothesis. To test this hypothesis, we have to wait until the
end of the year and count sales. Managerial hypotheses are based on intuition; the
marketplace decides whether the manager’s intuitions were correct. Hint: Hypothesis
testing is about making inferences about a population from only a small sample. The
bottom line in hypothesis testing is when we ask ourselves (and then decide) whether a
population like we think this one is would be likely to produce a sample like the one we
are looking at.

1. Testing Hypothesis
2. Null and Alternate hypothesis
In hypothesis testing, we must state the assumed or hypothesized value of the
population parameter before we begin sampling. The assumption we wish to test
is called the null hypothesis and is symbolized H0.

Suppose we want to test the hypothesis that the population mean is equal to 500.
We would symbolize it as H0: µ = 500 and read it, “The null hypothesis is that the
population mean is equal to 500.” The term null hypothesis arises
from earlier agricultural and medical applications of statistics. In order to test the
effectiveness of a new fertilizer or drug, the tested hypothesis (the null
hypothesis) was that it had no effect, that is, there was no difference between
treated and untreated samples.

If we use a hypothesized value of a population mean in a problem, we would
represent it symbolically as µ_H0. This is read, “the hypothesized value of the
population mean.”

If our sample results fail to support the null hypothesis, we must conclude that
something else is true. Whenever we reject the hypothesis, the conclusion we do
accept is called the alternative hypothesis and is symbolized H1 (”H sub-one”).
For the null hypothesis H0: µ = 200

We will consider three alternative hypotheses:

H1: µ ≠ 200 (population mean is not equal to 200)

H1: µ > 200 (population mean greater than 200)

H1: µ < 200 (population mean less than 200)

Interpreting the level of significance

The purpose of hypothesis testing is not to question the computed value of the
sample statistic but to make a judgment about the difference between that sample
statistic and a hypothesized population parameter. The next step after stating the
null and alternative hypotheses, then, is to decide what criterion to use for
deciding whether to accept or reject the null hypothesis. If we assume the
hypothesis is correct, then the significance level will indicate the percentage of
sample means that is outside certain limits. (In estimation, please remember, the
confidence level indicated the percentage of sample means that fell within the
defined confidence limits.)

Hypotheses are accepted and not proved.

Even if our sample statistic does fall in the non-shaded region (the region that
makes up 95 percent of the area under the curve), this does not prove that our null
hypothesis (H0) is true; it simply does not provide statistical evidence to reject it.
Why? Because the only way in which the hypothesis can be accepted with
certainty is for us to know the population parameter; unfortunately, this is not
possible. Therefore, whenever we say that we accept the null hypothesis, we
actually mean that there is not sufficient statistical evidence to reject it. Use of the
term accept, instead of do not reject, has become standard. It means simply that
when sample data do not cause us to reject a null hypothesis, we behave as if that
hypothesis is true.

Selecting a significance level

There is no single standard or universal level of significance for testing
hypotheses. In some instances, a 5 % level of significance is used. Published
research results often test hypotheses at the 1 percent level of significance. It is
possible to test a hypothesis at any level of significance. But remember that our
choice of the minimum standard for an acceptable probability, or the significance
level, is also the risk we assume of rejecting a null hypothesis when it is true. The
higher the significance level we use for testing a hypothesis, the higher the
probability of rejecting a null hypothesis when it is true. 5% level of significance
implies we are ready to reject a true hypothesis in 5% of cases.

If the significance level is high then we would rarely accept the null hypothesis
when it is not true but, at the same time, often reject it when it is true.

When testing a hypothesis we come across four possible situations, depicted
as follows.

The combinations are

1. Hypothesis is true, test result accepts it – we have made a right decision.

2. Hypothesis is true, test result rejects it – we have made a wrong decision
(Type I error). It is also known as Producer’s Risk, denoted by α.
3. Hypothesis is false, test result accepts it – we have made a wrong decision
(Type II error). It is known as Consumer’s Risk, denoted by β. 1 – β is
called the power of the test.
4. Hypothesis is false, test result rejects it – we have made a right decision.
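
The trade-off between the two kinds of error can be illustrated numerically. The sketch below is only an illustration under assumed figures: the hypothesized mean (1,000), the true mean (990), the population S.D. (50) and the sample size (100) are hypothetical, and the function name is ours. It computes β for a lower-tailed z test at several values of α.

```python
from statistics import NormalDist

def type_ii_error(mu0, mu_true, sigma, n, alpha):
    """Beta for a lower-tailed z test of Ho: mu = mu0 when the true mean is mu_true."""
    se = sigma / n ** 0.5
    # Ho is rejected when the sample mean falls below this cutoff.
    cutoff = mu0 + NormalDist().inv_cdf(alpha) * se
    # Beta = P(sample mean >= cutoff | true mean), i.e. accepting a false Ho.
    return 1 - NormalDist(mu_true, se).cdf(cutoff)

# Hypothetical figures, chosen only for illustration.
for alpha in (0.01, 0.05, 0.10):
    beta = type_ii_error(mu0=1000, mu_true=990, sigma=50, n=100, alpha=alpha)
    print(f"alpha={alpha:.2f}  beta={beta:.3f}  power={1 - beta:.3f}")
```

In this setting a higher significance level gives a lower β, which is the relationship relied on when one error type is preferred over the other.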

When Type I error is preferred

Suppose that making a Type I error (rejecting a null hypothesis when it is true)
involves the time and trouble of reworking a batch of chemicals that should have
been accepted. At the same time, making a Type II error (accepting a null
hypothesis when it is false) means taking a chance that an entire group of users of
this chemical compound will be poisoned. Obviously, the management of this
company will prefer a Type I error to a Type II error and, as a result, will set very
high levels of significance in its testing to get low β s.

When Type II error is preferred

Suppose, on the other hand, that making a Type I error involves disassembling an
entire engine at the factory, but making a Type II error involves relatively
inexpensive warranty repairs by the dealers. Then the manufacturer is more likely
to prefer a Type II error and will set lower significance levels in its testing.

Decide on which distribution to use in Hypothesis testing

After deciding what level of significance to use, our next task in hypothesis
testing is to determine the appropriate probability distribution. We have a choice
between the normal distribution, and the t distribution. The rules for choosing the
appropriate distribution are similar to those we encountered in the unit on
estimation. The Table below summarizes when to use the normal and t
distributions in making tests of means. Later in this unit, we shall examine the
distributions appropriate for testing hypotheses about proportions.

Remember one more rule when testing the hypothesized values of a mean. As in
estimation, use the finite population multiplier whenever the population is finite in
size, sampling is done without replacement, and the sample is more than 5 percent
of the population.

Conditions for using the Normal and t distributions in Testing Hypothesis about means

                                   When the Population        When the Population
                                   Standard Deviation is      Standard Deviation is
                                   known                      not known
Sample size n is larger than 30    Normal distribution,       Normal distribution,
                                   z – table                  z – table
Sample size n is 30 or less and    Normal distribution,       t Distribution,
we assume the population is        z – table                  t – table
normal or approximately so
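
The table and the finite population rule can be written as two small helper functions. This is a minimal sketch: the function names are ours, and the multiplier is assumed to take the usual form √((N – n)/(N – 1)), which this unit does not restate.

```python
from math import sqrt

def choose_distribution(n, sigma_known):
    """Return 'z' or 't' following the table above (population assumed normal when n <= 30)."""
    if n > 30 or sigma_known:
        return "z"
    return "t"

def needs_finite_multiplier(N, n):
    """True when the sample is more than 5 percent of a finite population of size N."""
    return n / N > 0.05

def finite_population_multiplier(N, n):
    """Assumed standard form of the multiplier: sqrt((N - n) / (N - 1))."""
    return sqrt((N - n) / (N - 1))

print(choose_distribution(n=25, sigma_known=True))    # small sample, S.D. known
print(choose_distribution(n=25, sigma_known=False))   # small sample, S.D. not known
print(needs_finite_multiplier(N=10000, n=3000))
print(round(finite_population_multiplier(N=10000, n=3000), 4))
```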

Learning Objective 2

Know about the Different Tests and Different Test Statistics

Two – tailed tests and One – Tailed tests

Two Tailed Tests:

A two-tailed test of a hypothesis will reject the null hypothesis if the sample mean
is significantly higher than or lower than the hypothesized population mean. Thus,
in a two-tailed test, there are two rejection regions. This is shown in figure 1 of

A two-tailed test is appropriate when the null hypothesis is µ = µ_H0 (where µ_H0
is some specified value) and the alternative hypothesis is µ ≠ µ_H0.

Example 1: Assume that a manufacturer of light bulbs wants to produce bulbs
with a mean life of µ = µ_H0 = 1,000 hours. If the lifetime is shorter, he will lose
customers to his competitors; if the lifetime is longer, he will have a very high
production cost because the filaments will be excessively thick. In order to see
whether his production process is working properly, he takes a sample of the
output to test the hypothesis Ho: µ = 1,000. Because he does not want to deviate
significantly from 1,000 hours in either direction, the appropriate alternative
hypothesis is H1: µ ≠ 1,000, and he uses a two-tailed test. That is, he rejects the
null hypothesis if the mean life of bulbs in the sample is either too far above
1,000 hours or too far below 1,000 hours.

However, there are situations in which a two-tailed test is not appropriate, and we
must use a one-tailed test.

Example 2: Consider the case of a wholesaler that buys light bulbs from the
manufacturer discussed earlier. The wholesaler buys bulbs in large lots and does
not want to accept a lot of bulbs unless their mean life is at least 1,000 hours or a
minimum of 1,000 hours. As each shipment arrives, the wholesaler tests a sample
to decide whether it should accept the shipment. The company will reject the
shipment only if it feels that the mean life is below 1,000 hours. If it feels that the
bulbs are better than expected (with a mean life above 1,000 hours), it certainly
will not reject the shipment because the longer life comes at no extra cost. So the
wholesaler’s hypotheses are Ho: µ = 1,000 and H1: µ < 1,000 hours. It rejects
Ho only if the mean life of the sampled bulbs is significantly below 1,000 hours.
This situation is illustrated in the figure below. From this figure, we can see why
this test is called a left-tailed test (or a lower-tailed test).
In general, a left-tailed (lower-tailed) test is used if the hypotheses are Ho: µ = µ_H0
and H1: µ < µ_H0. In such a situation, it is sample evidence with the sample mean significantly
below the hypothesized population mean that leads us to reject the null hypothesis
in favor of the alternative hypothesis. Stated differently, the rejection region is in
the lower tail (left tail) of the distribution of the sample mean, and that is why we
call this a lower-tailed test.

A left-tailed test is one of two kinds of one-tailed tests. As you have probably
guessed by now, the other kind of one-tailed test is a right-tailed test (or an
upper-tailed test). An upper-tailed test is used when the hypotheses are Ho: µ = µ_H0
and H1: µ > µ_H0.
Only values of the sample mean that are significantly above the hypothesized
population mean will cause us to reject the null hypothesis in favor of the
alternative hypothesis. This is called an upper-tailed test because the rejection
region is in the upper tail of the distribution of the sample mean.
This is to remind you again that, in each example of hypothesis testing, when we
accept a null hypothesis on the basis of sample information, we are really saying
that there is no statistical evidence to reject it. We are not saying that the null
hypothesis is true. The only way to prove a null hypothesis is to know the
population parameter, and that is not possible with sampling. Thus, we accept the
null hypothesis and behave as if it is true simply because we can find no evidence
to reject it.
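
The three rejection regions described above can be sketched as a single decision function. This is an illustration only: the function name and the tail labels are ours, and critical values are taken from Python's standard normal distribution rather than from a printed z table.

```python
from statistics import NormalDist

def rejects_h0(z_cal, alpha, tail):
    """Decide whether a calculated z value falls in the rejection region.

    tail: 'two' (H1: mu != mu_H0), 'left' (H1: mu < mu_H0) or 'right' (H1: mu > mu_H0).
    """
    z = NormalDist()
    if tail == "two":
        # alpha is split equally between the two tails.
        return abs(z_cal) > z.inv_cdf(1 - alpha / 2)
    if tail == "left":
        return z_cal < z.inv_cdf(alpha)
    if tail == "right":
        return z_cal > z.inv_cdf(1 - alpha)
    raise ValueError("tail must be 'two', 'left' or 'right'")

# The same calculated value can lead to different decisions depending on the tail.
print(rejects_h0(1.8, 0.05, "two"))
print(rejects_h0(1.8, 0.05, "right"))
print(rejects_h0(1.8, 0.05, "left"))
```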

Classification of Test Statistics

Statistics used for Testing of Hypothesis can be classified as follows

A Large Samples (n > 30) – Attributes (proportions)

Test  Description of Test                 Notes

1     Test for specified proportion –     P = Population proportion
      infinite population                 Ps = Sample proportion
                                          Q = 1 – P, n = Sample size

2     Test for specified proportion –     P = Population proportion
      finite population                   Ps = Sample proportion
                                          Q = 1 – P, n = Sample size
                                          N = Population size

3     Test between proportions –          P1 = First sample proportion
      different populations               P2 = Second sample proportion
                                          Q1 = 1 – P1, Q2 = 1 – P2
                                          n1 = First sample size
                                          n2 = Second sample size

4     Test between proportions –          P1 = First sample proportion
      same population                     P2 = Second sample proportion
                                          Q1 = 1 – P1, Q2 = 1 – P2
                                          n1 = First sample size
                                          n2 = Second sample size

B Large Samples (n > 30) – Variable

Test  Description of Test                 Notes

5     Test for specified mean –           µ = Population mean
      infinite population                 µs = Sample mean
                                          σ = Population S.D
                                          We can use Sample S.D (s) also in
                                          case population S.D. is not given

6     Test for specified mean –           µ = Population mean
      finite population                   µs = Sample mean
                                          σ = Population S.D
                                          We can use Sample S.D (s) also in
                                          case population S.D. is not given

7     Test between means –                µs1 = First sample mean
      different populations               µs2 = Second sample mean
                                          σ1, σ2 = S.D. of the two populations
                                          n1 = First sample size
                                          n2 = Second sample size

8     Test between means –
      same population
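
The tables above list the tests and their notation, but the statistics themselves are not shown. The sketch below assumes the standard large-sample z forms for each case; the function names are ours and the inputs in the usage lines are hypothetical.

```python
from math import sqrt

def _fpm(N, n):
    """Finite population multiplier; 1.0 when the population is treated as infinite."""
    return 1.0 if N is None else sqrt((N - n) / (N - 1))

def z_proportion(ps, p, n, N=None):
    """Tests 1 and 2: specified proportion (pass N for a finite population)."""
    return (ps - p) / (sqrt(p * (1 - p) / n) * _fpm(N, n))

def z_mean(x_bar, mu, sigma, n, N=None):
    """Tests 5 and 6: specified mean (pass N for a finite population)."""
    return (x_bar - mu) / (sigma / sqrt(n) * _fpm(N, n))

def z_two_proportions(p1, n1, p2, n2, same_population=False):
    """Tests 3 and 4: a pooled proportion is used when both samples share a population."""
    if same_population:
        p = (n1 * p1 + n2 * p2) / (n1 + n2)
        se = sqrt(p * (1 - p) * (1 / n1 + 1 / n2))
    else:
        se = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    return (p1 - p2) / se

def z_two_means(x1, s1, n1, x2, s2, n2):
    """Tests 7 and 8: pass s1 == s2 when both samples share a population."""
    return (x1 - x2) / sqrt(s1 ** 2 / n1 + s2 ** 2 / n2)

# Hypothetical inputs, for illustration only.
print(round(z_mean(x_bar=52, mu=50, sigma=8, n=64), 3))
print(round(z_proportion(ps=0.56, p=0.50, n=100), 3))
```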

Test Procedure

Step 1: State Null hypothesis (Ho) and alternate hypothesis (H1)

Step 2: State the level of significance. This gives you the tabulated normal / t – value

Step 3: Select the appropriate test from the list given in 9.2 and next chapter 10

Step 4: Calculate the required values for the test

Step 5: Conduct the test

Step 6: Draw conclusion

If the calculated value (ignoring its sign) is < Tabulated Value, accept Ho

If the calculated value (ignoring its sign) is > Tabulated Value, reject Ho

Learning Objective 3

Learn about Identifying the Right Test Procedure

How to identify the right statistics for the test.

Step 1: Check the sample size. If n > 30 it is a large sample test. If n ≤ 30 it is a
small sample test (will be discussed in unit 10).

Step 2: Check whether the data is attribute or variable. If the words mean and S.D
are used, then it is a test for variable, otherwise it is a test for attribute.

Step 3: Check whether it is a test for a specified value or between values. If two
sample sizes are given, then it is between values, otherwise it is for a specified
value.

Step 4: If it is a specified value test, check whether the sample belongs to an infinite or
finite population. If it is a between values test, check whether the samples are from
different populations or the same population.

Step 5: Select the appropriate test statistic.

Step 6: If the words improved, more, higher, less, lower, effective, efficient,
superior, inferior etc used then it is one-tailed test, otherwise it is two tailed test.
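
The six steps can be read as a simple decision rule. The sketch below encodes them literally; the function name, the parameter names and the wording of the returned label are ours, not part of the unit.

```python
def identify_test(n, variable_data, two_samples,
                  finite_population=False, same_population=False,
                  directional_wording=False):
    """Follow Steps 1 to 6 above and return a short description of the test."""
    size = "large-sample" if n > 30 else "small-sample"            # Step 1
    kind = "mean" if variable_data else "proportion"               # Step 2
    if two_samples:                                                # Steps 3 and 4
        where = "same population" if same_population else "different populations"
        scope = f"between {kind}s, {where}"
    else:
        where = "finite population" if finite_population else "infinite population"
        scope = f"specified {kind}, {where}"
    tails = "one-tailed" if directional_wording else "two-tailed"  # Step 6
    return f"{size} test, {scope}, {tails}"

# A problem giving a mean and S.D. for one sample of 75, with no directional wording:
print(identify_test(n=75, variable_data=True, two_samples=False))
```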


Example 1: Thompson Press hypothesizes that the average life of its latest
web-offset press is 14,500 hours. They know the SD of the press life is 2,100 hours.
From a sample of 25 presses, the company finds a sample mean of 13,000 hours.
At 0.01 significance level, should the company conclude that the average life of
the presses is less than the hypothesized 14,500 hours?

1. Null hypothesis Ho: µ = 14,500

Alternate hypothesis HA: µ < 14,500 (one-tailed test)

2. Level of significance α = 0.01 ⇒ Ztab = 2.33

3. Test Statistics: $Z = \dfrac{\mu_s - \mu}{\sigma / \sqrt{n}}$

Given µ = 14,500, µs = 13,000, σ = 2,100, n = 25

Note: Although n < 30, population S.D is given, therefore it becomes a Z test.

4. Test: $Z_{cal} = \dfrac{13{,}000 - 14{,}500}{2{,}100 / \sqrt{25}} = \dfrac{-1{,}500}{420} = -3.57$

5. Conclusion

Since |Zcal| (3.57) > Ztab (2.33) Ho is rejected

Example 2: Theater owners in India know that a hit movie ran for an average of
84 days with a standard deviation of 10 days in each city the movie was screened.
A particular movie distributor was interested in comparing the popularity of
movies in his region with that of the population. He chose 75 theatres at
random in the region and found a popular movie ran for 81.5 days.

a. State appropriate hypotheses for testing whether there was a significant
difference between theatres in the distributor’s region and the population.
b. At a 1% significance level, test these hypotheses.

1. Null hypothesis Ho: µ = 84

Alternate hypothesis HA: µ ≠ 84 (two-tailed test)

2. Level of significance 1% ⇒ Ztab = 2.58

3. Test Statistic: $Z = \dfrac{\mu_s - \mu}{\sigma / \sqrt{n}}$

Given µ = 84, µs = 81.5, σ = 10, n = 75

4. Test: $Z_{cal} = \dfrac{81.5 - 84}{10 / \sqrt{75}} = \dfrac{-2.5}{1.1547} = -2.165$

5. Conclusion

Since |Zcal| (2.165) < Ztab (2.58) Ho is accepted

Example 3: A ketchup manufacturer is in the process of deciding whether to
produce a new extra spicy brand of ketchup. The company’s market research team
found in a survey of 6000 households that 355 households would buy the extra
spicy brand. An earlier, more extensive study carried out 2 years ago showed
that 5% of the households would buy the brand then. At 2 % level of
significance, should the company conclude that there is an increased interest in
the extra spicy flavour?

1. Null hypothesis Ho: P = 0.05

Alternate hypothesis HA: P > 0.05 (one-tailed test)

2. Level of significance 2 % ⇒ Ztab = 2.05

3. Test Statistics: $Z = \dfrac{P_s - P}{\sqrt{PQ/n}}$

Given P = 0.05, Ps = 355 / 6000 = 0.05917, n = 6000, Q = 1 – P = 0.95

∴ $\sqrt{PQ/n} = \sqrt{(0.05 \times 0.95)/6000} = 0.002814$

4. Test: $Z_{cal} = \dfrac{0.05917 - 0.05}{0.002814} = 3.26$

5. Conclusion

Since Zcal (3.26) > Ztab (2.05) Ho is rejected

Example 4: Microsoft estimated that 35% of 10,000 potential software buyers were
planning to wait to purchase the new OS Windows Vista until an upgrade had been
released. After an advertising campaign to reassure the public, Microsoft surveyed
3000 buyers and found 950 who were still skeptical. At 5% level of significance
can the company conclude that the proportion of skeptical people had decreased?
(Null hypothesis is rejected. Use z distribution).

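1. Null hypothesis Ho: P = 0.35

Alternate hypothesis HA: P < 0.35 (one-tailed test)

2. Level of significance 5% ⇒ Ztab = 1.645

3. Test Statistics: $Z = \dfrac{P_s - P}{\sqrt{PQ/n}\,\sqrt{(N-n)/(N-1)}}$

Given Ps = 950 / 3000 = 19 / 60 = 0.317, P = 0.35, Q = 0.65, N = 10,000, n = 3000

4. Test: $\sqrt{(0.35 \times 0.65)/3000} \times \sqrt{7000/9999} = 0.0087 \times 0.8367 = 0.0073$, so
$Z_{cal} = \dfrac{0.317 - 0.35}{0.0073} = -4.52$

5. Conclusion

Since |Zcal| (4.52) > Ztab (1.645) Ho is rejected

⇒ Proportion of sceptical people has significantly decreased.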

Example 5: A machine is designed so as to pack 200 ml of a medicine with a
standard deviation of 5 ml. A sample of 100 bottles, when measured, had a mean
content of 201.3 ml. Test whether the machine is functioning properly. Use 5% level
of significance.

1. Null hypothesis Ho: µ = 200

Alternate hypothesis HA: µ ≠ 200 (two-tailed test)

2. Level of significance 5% implies Ztab = 1.96

3. Test Statistics: $Z = \dfrac{\mu_s - \mu}{\sigma / \sqrt{n}}$

Given µ = 200, µs = 201.3, σ = 5, n = 100

4. Test: $Z_{cal} = \dfrac{201.3 - 200}{5 / \sqrt{100}} = \dfrac{1.3}{0.5} = 2.60$

5. Conclusion

Since Zcal (2.60) > Ztab (1.96) Ho is rejected ⇒ The machine is not
functioning properly.
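
An alternative to comparing Zcal with Ztab is to compute the probability of a result at least as extreme as the one observed (the P value) and compare it with the level of significance. The sketch below applies this to the three tests of means above (Examples 1, 2 and 5), using the figures stated in each problem; the variable names are ours and the normal probabilities come from Python's standard library rather than a printed table.

```python
from math import sqrt
from statistics import NormalDist

std_normal = NormalDist()

# (sample mean, hypothesized mean, population S.D., n, tail, alpha)
examples = {
    "Example 1": (13000, 14500, 2100, 25, "left", 0.01),
    "Example 2": (81.5, 84, 10, 75, "two", 0.01),
    "Example 5": (201.3, 200, 5, 100, "two", 0.05),
}

decisions = {}
for name, (x_bar, mu0, sigma, n, tail, alpha) in examples.items():
    z = (x_bar - mu0) / (sigma / sqrt(n))
    # Left-tailed: area below z. Two-tailed: area in both tails beyond |z|.
    p_value = std_normal.cdf(z) if tail == "left" else 2 * std_normal.cdf(-abs(z))
    decisions[name] = "reject Ho" if p_value < alpha else "accept Ho"
    print(f"{name}: z = {z:.3f}, P = {p_value:.4f}, {decisions[name]}")
```

The decisions agree with the conclusions reached by the tabulated-value method.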

In this unit we discussed the four tests available for small samples. These tests can
be used for sample size (n ≤ 30) and samples whose population S.D are not
known. The different tests are illustrated with examples.


In the previous units we learned how to test hypotheses using data from either one or two
samples. We used one-sample tests to determine whether a mean or a proportion was
significantly different from a hypothesized value. In the two-sample tests, we examined
the difference between either two means or two proportions, and we tried to learn
whether this difference was significant.

Suppose we have proportions from five populations instead of only two. In this case, the
methods for comparing proportions described for testing hypotheses for two samples
do not apply; we must use the chi-square (χ²) test. Chi-square tests enable us to test
whether more than two population proportions can be considered equal.

Actually, chi-square tests allow us to do a lot more than just test for the equality of
several proportions. If we classify a population into several categories with respect to two
attributes (such as age and job performance), we can then use a chi-square test to
determine whether the two attributes are independent of each other.

Learning Objective 1

Understand the concept of Chi-square (X2) as a test of independence

Chi-square X2 as a test of independence

Characteristics of X2 test

• The χ² test is based on frequencies and not on parameters.
• It is a non-parametric test: no rigid assumptions about the parameters of the
population or populations are required.
• The additive property is also found in the χ² test.
• The χ² test is useful to test the hypothesis about the independence of attributes.
• The χ² test can be used in complex contingency tables.
• The χ² test is very widely used for research purposes in behavioural and social
sciences including business research.
• It is defined as χ² = ∑ (O – E)² / E, where O is the observed and E the expected frequency.
Degrees of Freedom:

The number of degrees of freedom for n observations is n – k and is usually denoted

by ν where k is the number of independent linear constraints imposed upon them.
Suppose we are asked to write any four numbers then we will have all the numbers of
our choice. If a restriction is applied or imposed to the choice that the sum of these
numbers should be 50; then the freedom of choice would be reduced to three only and
so the degrees of freedom would now be 3.

If χ² is defined as the sum of the squares of n independent standardized normal
variates, and k linear relations are imposed upon them (such as the estimation of
some population parameter), then the n degrees of freedom are replaced by n – k.
For example, if the sum of squares is taken about the sample mean instead of the
population mean, then n is replaced by n – 1 = ν, since one linear constraint has
been imposed.

Restrictions in Applying X2 test

The sample observations should be independently and normally distributed. For this
either the parent population should be infinitely large (say, greater than 50) or
sampling should be done with replacement.

Constraints imposed upon the observations must be of a linear character,
e.g., ∑Oi = ∑Ei.

The X2 distribution is essentially a continuous distribution but its character of

continuity is maintained only when the individual frequencies of the Variate values
remain ≥ 5. So in applying X2 test in the testing of the goodness of fit or in a
contingency table, the cell frequency should not be less than 5. In practical problems
we can combine a few values of small frequencies into one to get the pooled
frequency greater than 5.

Application of X2 test

X2 is used in testing: (i) the significance of sample variances, (ii) the goodness of fit
of a theoretical distribution, (iii) the independence in a contingency table and (iv)
whether the observed results are consistent with the expected segregations in breeding
experiments of Genetics.
Levels of significance

Tables have been prepared for the values of P, the probability of getting a value of
χ² ≥ χ0², where χ0² is an observed value. From these tables, we can find the value of P
corresponding to an observed value of χ² and then proceed to test whether the
difference between observed and theoretical frequencies is significant or not. The smaller
the value of P, the greater the divergence between fact and theory, so that small values
lead us to suspect the hypothesis. Not only do small values of P lead us to suspect the
hypothesis, but a value of P very near to unity may also lead to a similar result. Thus if
P = 1, χ² = 0, showing that there is perfect agreement between fact and theory, which
is a very improbable event. There are two conventional levels of significance.

1. If P < 0.05, we say that the observed value of χ² is significant at 5 percent
level of significance.
2. Similarly, if P < 0.01, the value is significant at 1 % level.

The formula for calculating χ² is χ² = Σ (fo – fe)² / fe,

where fo is the observed frequency and

fe is the expected frequency.

Steps in solving X2 problems

1. Calculate the expected frequencies. In general the expected frequency for any cell
can be calculated from the following expression:
E = (Row total × Column total) / Grand total
2. Take the difference between observed and expected frequencies and obtain the
squares of these differences, (O – E)².
3. Divide the values obtained in step 2 by the respective expected frequency and add
all the values to get the value according to the formula Σ (fo – fe)² / fe.


After ascertaining the χ² value, refer to the χ² table. The table comprises columns
headed with symbols such as χ²0.05 for 5% level of significance, χ²0.01
for 1% level of significance and so on. The left hand side indicates the degrees of
freedom. If the calculated value of χ² falls in the acceptance region (that is, it is
less than the tabulated value), the null hypothesis Ho is accepted, and vice-versa.
Learning Objective 2

Know about the concept of Chi-square (X2) Distribution

Chi – Square Distribution

If X1, X2, ..., Xn are n independent standard normal variates, then the sum of their
squares, χ² = X1² + X2² + ... + Xn², follows the χ² distribution with n degrees of freedom.

Properties of X2 distribution

1. Mean of χ² distribution = Degrees of freedom = ν
2. S.D. of χ² distribution = √(2ν)
3. Median of χ² distribution divides the area of the curve into two equal parts, each
part being 0.5.
4. Mode of χ² distribution is equal to degrees of freedom less 2, i.e., ν – 2.
5. The χ² distribution is always positively skewed.
6. χ² values increase with the increase in the degrees of freedom; there is a new χ²
distribution with every increase in the number of degrees of freedom.
7. The lowest value of χ² is zero and the highest is infinity, i.e., 0 ≤ χ² < ∞.
8. When two chi-squares χ1² and χ2² are independent, following χ² distributions with
n1 and n2 degrees of freedom, their sum χ1² + χ2² will follow a χ² distribution with
n1 + n2 degrees of freedom.
9. When n > 30, √(2χ²) – √(2ν – 1) approximately follows the standard normal
distribution.
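
The first two properties can be checked by simulation. The sketch below is only an illustration: it builds χ² values as sums of squared standard normal draws, with an arbitrarily chosen ν = 4, 20,000 repetitions and a fixed random seed, and compares the sample mean and S.D. with ν and √(2ν).

```python
import random
from statistics import mean, stdev

def chi_square_draw(nu, rng):
    """One chi-square value: the sum of squares of nu independent standard normal variates."""
    return sum(rng.gauss(0, 1) ** 2 for _ in range(nu))

rng = random.Random(42)   # fixed seed so the run is repeatable
nu = 4                    # degrees of freedom, chosen for illustration
draws = [chi_square_draw(nu, rng) for _ in range(20000)]

print("sample mean:", round(mean(draws), 3), " theory:", nu)
print("sample S.D.:", round(stdev(draws), 3), " theory:", round((2 * nu) ** 0.5, 3))
print("smallest value:", round(min(draws), 4))   # never negative
```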


1. The frequencies used in chi-square test must be absolute and not in relative terms.
2. The total no. of observations collected for this test must be large.
3. Each of the observations which make up the sample of this test must be
independent of each other.
4. As X2 test is based wholly on sample data, no assumption is made concerning the
population distribution. In other words it is a non parametric-test.
5. X2 test is wholly dependent on degrees of freedom.
6. The expected frequency of any item or cell must not be less than 5; otherwise the
frequencies of adjacent items or cells should be pooled together in order to make it
more than 5.
7. The data should be expressed in original units for convenience of comparison and
the given distribution should not be replaced by relative frequencies or
8. This test is used only for drawing inferences through test of the hypothesis, so it
cannot be used for estimation of parameter value.

Uses of X2 test

The X2 test is used broadly to:

• Test goodness of fit for one way classification or for one variable only.
• Test of independence or interaction for more than one row or column in the form
of a contingency table concerning several attributes
• Test of population Variance σ 2 through confidence intervals suggested by X2

Application of X2 – test

Tests for independence of attributes.

The number of degrees of freedom is given by (No. of rows – 1) x (No. of columns – 1).

The expected value of each cell is given by E = (Row total × Column total) / Grand total.

Example 1: The following table gives the production in three shifts and the number of
defective goods that turned out in three weeks. Test at 5% level of significance whether
weeks and shifts are independent.

Shift     Week 1    Week 2    Week 3    Total

I           15         5        20        40
II          20        10        20        50
III         25        15        20        60
Total       60        30        60       150


Observed Value (O)    Expected Value (E)      (O – E)²    (O – E)²/E

15                    40 x 60/150 = 16            1         0.0625
20                    50 x 60/150 = 20            0         0.0000
25                    60 x 60/150 = 24            1         0.0417
5                     40 x 30/150 = 8             9         1.1250
10                    50 x 30/150 = 10            0         0.0000
15                    60 x 30/150 = 12            9         0.7500
20                    40 x 60/150 = 16           16         1.0000
20                    50 x 60/150 = 20            0         0.0000
20                    60 x 60/150 = 24           16         0.6667
                                              χ²cal         3.6459

1. Null hypothesis Ho: The week and shifts are independent

Alternate hypothesis HA: They are dependent

2. Level of Significance 5% and D.O.F (3 – 1) (3 – 1) = 4 ⇒ X2tab = 9.49

3. Test Statistics
4. Test X2cal = 3.6459
5. Conclusion: Since X2cal (3.6459) < X2tab (9.49) Ho is accepted.

The attributes are independent.
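
The calculation in this example can be reproduced with a short script. This is a sketch, not a general-purpose routine: the function name is ours, and the tabulated value 9.49 is taken from the example rather than computed.

```python
def chi_square_independence(observed):
    """Return (chi-square, degrees of freedom) for a table of observed frequencies."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(observed):
        for j, o in enumerate(row):
            e = row_totals[i] * col_totals[j] / grand_total   # expected frequency
            chi2 += (o - e) ** 2 / e
    dof = (len(observed) - 1) * (len(observed[0]) - 1)
    return chi2, dof

# Rows are shifts I, II, III; columns are weeks 1, 2, 3 (data from the example).
defectives = [[15, 5, 20],
              [20, 10, 20],
              [25, 15, 20]]
chi2_cal, dof = chi_square_independence(defectives)
chi2_tab = 9.49   # 5% level, 4 degrees of freedom, as quoted in the example
print(round(chi2_cal, 4), dof, "accept Ho" if chi2_cal < chi2_tab else "reject Ho")
```

The unrounded total differs from 3.6459 only in the fourth decimal place, because the table rounds each term before adding.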

Example 2: Out of 1000 people surveyed 600 belonged to urban area and rest to rural
area. Among 500 who visited other states 400 belonged to urban area. Test at 5% level of
significance whether area and visiting other states are dependent.

Solution: The given information can be tabulated as follows

Other States Urban Rural Total

Visited 400 100 500
Not Visited 200 300 500
Total 600 400 1000

Observed Value (O)    Expected Value (E)    (O – E)²    (O – E)²/E
400                   300                   10000        33.33
200                   300                   10000        33.33
100                   200                   10000        50.00
300                   200                   10000        50.00
                                            χ²cal       166.66

1. Null hypothesis Ho: Area and visit are independent

Alternate hypothesis HA: They are dependent

2. Level of Significance 5% and D.O.F (2 – 1) (2 – 1) = 1 ⇒ χ²tab = 3.84

3. Test Statistics: χ² = Σ (O – E)² / E
4. Test χ²cal = 166.66
5. Conclusion: Since χ²cal (166.66) > χ²tab (3.84) Ho is rejected.

Test Of Goodness Of Fit

Degrees of freedom is n-1

Expected value = Average of the observed values.

Example 3: A personnel manager is interested in trying to determine whether
absenteeism is greater on one day of the week than on another day of the week. He has
the following record for the past years.

Test whether absenteeism is uniformly distributed over the week.

Solution: If the absenteeism is uniformly distributed over the week, then the expected
number of absences per day should be

E = (66 + 57 + 54 + 48 + 75) / 5 = 60

Observed Value (O)    Expected Value (E)    (O – E)²    (O – E)²/E

66                    60                     36          0.6000
57                    60                      9          0.1500
54                    60                     36          0.6000
48                    60                    144          2.4000
75                    60                    225          3.7500
                                            χ²cal        7.5000

1. Null hypothesis Ho: Absenteeism is uniformly distributed over the week

Alternate hypothesis HA: Absenteeism is not uniformly distributed over the week

2. Level of significance 5% and D.O.F 5 – 1 = 4 ⇒ χ²tab = 9.49
3. Test Statistics: χ² = Σ (O – E)² / E

4. Test χ²cal = 7.50

5. Conclusion: Since χ²cal (7.5) < χ²tab (9.49) Ho is accepted.

Absenteeism is uniformly distributed over the week.

Example 4: According to theory in Genetics the proportion of beans of A, B, C and D
types in a generation should be 9:3:3:1. In an experiment with 1600 beans the frequency
of beans of A, B, C and D type was observed to be 882, 313, 287 and 118 respectively.
Does the result support the theory?


1. By the null hypothesis (the theory holds), E = Total No. x Corresponding ratio

Observed Value (O)    Expected Value (E)       (O – E)²    (O – E)²/E

882                   1600 x 9/16 = 900          324        0.36
313                   1600 x 3/16 = 300          169        0.56
287                   1600 x 3/16 = 300          169        0.56
118                   1600 x 1/16 = 100          324        3.24
                                               χ²cal        4.72

2. Test χ²cal = 4.72

3. Conclusion: Since χ²cal (4.72) < χ²tab (7.81) Ho is accepted.

The result supports the theory.
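
Both goodness-of-fit examples above follow the same pattern: expected frequencies are obtained by sharing the total among the classes in the hypothesized ratio, and χ² is then summed over the classes. A minimal sketch (the function name is ours; tabulated values are those quoted in the examples):

```python
def chi_square_goodness_of_fit(observed, ratios):
    """Return (chi-square, degrees of freedom) for observed counts against a theoretical ratio."""
    total = sum(observed)
    expected = [total * r / sum(ratios) for r in ratios]
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return chi2, len(observed) - 1

# Example 3: absenteeism spread equally over five days (ratio 1:1:1:1:1).
chi2_days, dof_days = chi_square_goodness_of_fit([66, 57, 54, 48, 75], [1, 1, 1, 1, 1])
# Example 4: beans in the ratio 9:3:3:1.
chi2_beans, dof_beans = chi_square_goodness_of_fit([882, 313, 287, 118], [9, 3, 3, 1])

print(round(chi2_days, 2), dof_days, chi2_days < 9.49)     # tabulated value for 4 d.o.f.
print(round(chi2_beans, 2), dof_beans, chi2_beans < 7.81)  # tabulated value for 3 d.o.f.
```

In both cases the calculated value is below the tabulated value, so the hypothesized distribution is not rejected.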

Test for Specified Variance:

Suppose we want to test whether the population has a given variance σ 02, then


If the calculated value lies between K1 and K2, then Ho is accepted. The K1 and K2 values are
read from the table.

Example 5: The standard deviation of the heights of plants is known to be 2 cm. Eight
randomly selected plants have heights 172, 156, 154, 163, 170, 169, 170 and 164 cm.
Test whether the sample standard deviation differs significantly.


X        d = X – 160     d²
172          12         144
156          -4          16
154          -6          36
163           3           9
170          10         100
169           9          81
170          10         100
164           4          16
Total        38         502


∴ nS² = Σd² – (Σd)²/n = 502 – (38)²/8 = 321.5
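
The quantity nS² can be checked directly from the eight heights. The sketch below also carries the calculation one step further, assuming the test statistic has the usual form χ² = nS²/σ0² with n – 1 degrees of freedom; that form is an assumption, since the formula is not shown in this unit, and the variable names are ours.

```python
heights = [172, 156, 154, 163, 170, 169, 170, 164]   # data from Example 5
sigma_0 = 2                                          # hypothesized population S.D. in cm

n = len(heights)
sample_mean = sum(heights) / n
# nS^2: sum of squared deviations about the sample mean.
n_s2 = sum((x - sample_mean) ** 2 for x in heights)

# Assumed form of the statistic: chi-square = nS^2 / sigma_0^2, with n - 1 degrees of freedom.
chi2_cal = n_s2 / sigma_0 ** 2
dof = n - 1

print(n_s2, chi2_cal, dof)
```

The calculated value would then be compared with K1 and K2 read from the χ² table for 7 degrees of freedom.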


X2 – Test is a non – parametric test. It is used to test the independence of

attributes, goodness of fit and specified variance. It assumes that samples are
drawn at random and external forces, if any, act on them in equal magnitude. The
sample size should be very large. None of the theoretical expected values
calculated should be less than five.


are given by these formulae

It is also known as variance Ratio test. It has two degrees of freedom, one for numerator
and another for denominator of the ratio. They are represented by

ν 1 = n1 – 1 and ν 2 = n2 – 1.

Learning Objective 1

Understand the F-Distribution and its Uses

Assumptions for F – test

• The samples are simple random samples.

• They are independent of each other.

• The parent populations from which they are drawn are normal.

Note: 1. If F ∼ F(ν1, ν2), then 1/F ∼ F(ν2, ν1).

2. For an F distribution with (n1, n2) degrees of freedom, n1 F tends to χ² with n1
degrees of freedom as n2 becomes very large.


Time taken to do a job by Method I and Method II by workers is given below.

Method I 27, 23, 16, 20, 26, 22

Method II 33, 35, 34, 27, 42, 32, 38

Can we conclude that the variances of the time distributions for Method I and Method II
are the same?

Method I
X        d = X – 22     d²
27           5          25
23           1           1
16          -6          36
20          -2           4
26           4          16
22           0           0
Total        2          82

Method II
X        d = X – 35     d²
33          -2           4
35           0           0
34          -1           1
27          -8          64
42           7          49
32          -3           9
38           3           9
Total       -4         136

S1² = [82 – (2)²/6] / 5 = 16.267

S2² = [136 – (–4)²/7] / 6 = 22.286

1. Null hypothesis Ho: σ1² = σ2²

Alternate hypothesis H1: σ1² ≠ σ2²

2. Level of significance 5 % and D.O.F (6, 5) ⇒ Ftab = 4.95

3. Test Statistics: F = larger sample variance / smaller sample variance

4. Test: Fcal = 22.286 / 16.267 = 1.37

5. Conclusion

Since Fcal (1.37) < Ftab (4.95) Ho is accepted.

There is no significant difference.
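
The variance ratio in this example can be verified with Python's standard library. This is a sketch: `statistics.variance` uses the n – 1 divisor, which matches the degrees of freedom ν1 = n1 – 1 and ν2 = n2 – 1 used here, and the tabulated value 4.95 is taken from the example.

```python
from statistics import variance

method_1 = [27, 23, 16, 20, 26, 22]
method_2 = [33, 35, 34, 27, 42, 32, 38]

s1_sq = variance(method_1)   # sample variance with divisor n1 - 1
s2_sq = variance(method_2)   # sample variance with divisor n2 - 1

# The larger variance goes in the numerator, so F is at least 1.
f_cal = max(s1_sq, s2_sq) / min(s1_sq, s2_sq)
f_tab = 4.95                 # 5% level, (6, 5) degrees of freedom, as quoted above

print(round(s1_sq, 3), round(s2_sq, 3), round(f_cal, 2))
print("accept Ho" if f_cal < f_tab else "reject Ho")
```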

Learning Objective 2

Learn the Concept of Analysis of Variance

ANOVA: Analysis of variance (ANOVA) will enable us to test for the significance of the
differences among more than two sample means. Using analysis of variance we will be
able to make inferences about whether our samples are drawn from populations having
the same mean.

Analysis of variance is useful in such situations as comparing the mileage achieved by

five different brands of gasoline, testing which of four different training methods produce
the fastest learning record, or comparing the first-year earnings of the graduates of half a
dozen different business schools. In each of these cases, we would compare the means of
more than two samples.

In statistical terms the difference between two statistical data is known as variance. When
two data are compared for any practical purpose, their difference is studied through the
techniques of Analysis of Variance. Initially the technique was applied in the field of
Zoology and Agriculture but in a later stage it was applied in other fields also. In analysis
of variance the degree of variance between two or more data as well as the factor
contributing towards the variance is studied.

In fact, analysis of variance is the classification and cross-classification of statistical data

with the view of testing whether the means of specific classification differ significantly or
they are homogeneous.

The analysis of variance is a method of splitting the total variation of data into constituent
parts which measure different sources of variation. The total variation is split up into the
following two components.

a. Variation within the subgroups of samples.

b. Variation between the subgroups of the samples.

Total variation = Variation between the samples + Variation within the samples.

After obtaining the above two variations, they are tested for their significance by the
F-test, which is also known as the Variance Ratio Test.

Objectives of Analysis of Variance (ANOVA)

The objectives are:

i. To obtain a measure of the total variation between or among the components,

ii. To find a measure of variation between or among the components. Then the
significance of difference between the variations in two series or more may be

In other words, with the help of the technique of analysis of variance we can test the
hypothesis that the means of all the components constituting a population are equal to the
mean of the population or that the samples have come from the same population.


The technique of analysis of variance is referred to as ANOVA. A table showing the

source of variance, the sum of squares, degrees of freedom, mean square (variance) and
the formula for the F-ratio is known as ANOVA table.

Computing of Test Statistics

The actual analysis of variance is carried out on the basis of the ratio between the variances.
The variance ratio is obtained by dividing the variance between the samples by the
variance within the samples. The ratio forms the test statistic known as the F-Statistic, i.e.,

F-Statistic = (Variance between the samples / Variance within the samples)


The underlying assumptions for the study of analysis of variance are:

i. Each of the samples is a simple random sample.

ii. Population from which the samples are selected are normally distributed.

iii. Each of the samples is independent of the other samples.

iv. Each of the populations has the same variance (and, under the null hypothesis, identical means).

v. The effects of the various components are additive.

Classification of Analysis of Variance

The analysis of variance is mainly carried on under the following two classifications:

i. One way analysis of variance or one way classification.

ii. Two way analysis of variance or two way classified data or manifold classification.

ANOVA Table in One Way Analysis of Variance

‘ANOVA’ table presents the various results obtained while carrying out the analysis of
variance. A specimen of ANOVA table is given below.

Source of Variation    Sum of Squares    Degrees of Freedom    Mean Square

Between Samples        SSC               K – 1                 MSC = SSC / (K – 1)

Within Samples         SSE               N – K                 MSE = SSE / (N – K)

Total                  SST               N – 1

The F-ratio is F = MSC / MSE.

Example 1: Below are given the yield (in Kg) per acre for 5 trial plots of varieties of

Plot No.    Treatment
            1     2     3     4
1.          42    48    68    80
2.          50    66    52    94
3.          62    68    76    78
4.          34    78    64    82
5.          52    70    70    66

Carry out an analysis of variance and state conclusion.

Plot No.    (X1)    (X2)    (X3)    (X4)
1.          42      48      68      80
2.          50      66      52      94
3.          62      68      76      78
4.          34      78      64      82
5.          52      70      70      66
Total       240     330     330     400

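T = Sum of all the observations = 42 + 50 + … + 66 = 1300

Correction Factor = T² / N = 1300² / 20 = 84500

SST = Sum of squares of all observations – T² / N

= (Crude sum of squares of all observations) – (Correction Factor)

= (42² + 50² + 62² + 34² + … + 66²) – 84500 = 88736 – 84500 = 4236

SSC = (∑X1)²/n1 + (∑X2)²/n2 + (∑X3)²/n3 + (∑X4)²/n4 – T² / N

= (240² + 330² + 330² + 400²) / 5 – 84500 = 87080 – 84500 = 2580

SSE = SST – SSC = 4236 – 2580 = 1656

MSC = SSC / (K – 1) = 2580 / 3 = 860

MSE = SSE / (N – K) = 1656 / 16 = 103.5

F = MSC / MSE = 860 / 103.5 = 8.31

The degrees of freedom = (K – 1, N – K) = (3, 16), where K is the number of columns and N is
the total number of observations.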

The Analysis of Variance Table will become as shown:

Source of Variation    Sum of Squares    Degrees of Freedom    Mean Square

Between Samples        SSC = 2580        K – 1 = 3             MSC = 860

Within Samples         SSE = 1656        N – K = 16            MSE = 103.5

Total                  SST = 4236        N – 1 = 19


The table value of F at 5% level of significance for (3, 16) df is 3.24, which is less than the
calculated value of F (860 / 103.5 = 8.31). Therefore the null hypothesis is rejected. The
treatments do not have the same effect.
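
The one way computation can be checked with a short Python sketch. The function name `one_way_anova` and the dictionary keys are illustrative choices, not part of the text; the method is the correction-factor approach used in the example (SST and SSC from crude sums of squares, SSE by subtraction).

```python
def one_way_anova(groups):
    """Return sums of squares, mean squares and F for a list of samples."""
    observations = [x for g in groups for x in g]
    n_total, k = len(observations), len(groups)
    correction = sum(observations) ** 2 / n_total            # T^2 / N
    sst = sum(x * x for x in observations) - correction      # total
    ssc = sum(sum(g) ** 2 / len(g) for g in groups) - correction  # between samples
    sse = sst - ssc                                          # within samples
    msc, mse = ssc / (k - 1), sse / (n_total - k)
    return {"SSC": ssc, "SSE": sse, "SST": sst,
            "MSC": msc, "MSE": mse, "F": msc / mse,
            "df": (k - 1, n_total - k)}

# Yields per treatment (columns of the table in Example 1)
treatments = [
    [42, 50, 62, 34, 52],
    [48, 66, 68, 78, 70],
    [68, 52, 76, 64, 70],
    [80, 94, 78, 82, 66],
]

result = one_way_anova(treatments)
print(result)
```

The calculated F is then compared with the table value for the degrees of freedom returned in `df`.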

Two Way classifications

In the two way classification, observations are classified into groups on the basis of two

Procedure for carrying out the two way analysis of variance


1. (a) Assume the means of all columns are equal. That is, the effects of all factors in
one kind of treatment are equal,

i.e., α1 = α2 = α3 = … = αc

(b) Assume the means of all rows are equal. That is, the effects of all factors in
the second kind of treatment are equal,

i.e., β1 = β2 = β3 = … = βr

2. Compute T = Sum of all values.

3. Find SST = Sum of squares of all observations – T² / N

4. Find SSC = [(∑X1)² + (∑X2)² + (∑X3)² + …] / r – T² / N

Where ∑X1, ∑X2, ∑X3 … are column totals (each column holds r observations)

5. Find SSR = [(∑Xj1)² + (∑Xj2)² + …] / c – T² / N

Where ∑Xj1, ∑Xj2 … are row totals (each row holds c observations)

6. SSE = SST – SSC – SSR

7. MSC = SSC/(c-1) ; MSR = SSR/(r-1) ; MSE = SSE/{(c-1)(r-1)}

Where c is the no. of columns and r is no. of rows

8. Fc = MSC/MSE ; Fr = MSR/MSE

Degrees of freedom for Fc = {c-1, (c-1)(r-1)}

Degrees of freedom for Fr = {r-1, (c-1)(r-1)}

Fc is for column wise comparison and

Fr is for row wise comparison

If Fc < table value of F then α1 = α2 = α3 = …

If Fr < table value of F then β1 = β2 = β3 = …

Anova Table for two way Analysis of Variance

Source of Variation    Sum of Squares    d.f.               Mean Square    F. Ratio

Between Columns        SSC               c – 1              MSC            Fc = MSC/MSE
Between Rows           SSR               r – 1              MSR            Fr = MSR/MSE
Residual               SSE               (c-1) x (r-1)      MSE
Total                  SST               N – 1


Three varieties of crops A, B, C are tested in a randomized block design with four
replications – The yields are given below:

Variety    Replications
           1     2     3     4
A          6     4     8     6
B          7     6     6     9
C          8     5     10    9

Test whether there is difference between replications. Test also whether varieties differ.

Variety    Replications             Total
           1     2     3     4
A          6     4     8     6      24
B          7     6     6     9      28
C          8     5     10    9      32
Total      21    15    24    24     84

N = 12, T = sum of all values = 6 + 7 + 8 + 4 + 6 + 5 + 8 + 6 + 10 + 6 + 9 + 9 = 84

T² / N = 84² / 12 = 588

SST = sum of squares of all values – T² / N

= 6² + 7² + 8² + 4² + 6² + 5² + 8² + 6² + 10² + 6² + 9² + 9² – 588 = 624 – 588 = 36

SST = 36

For columns, SSC = (21² + 15² + 24² + 24²) / 3 – 588 = 606 – 588 = 18

SSC = 18

For rows, SSR = (24² + 28² + 32²) / 4 – 588 = 596 – 588 = 8

SSR = 8

SSE = SST – SSC – SSR = 36 – 18 – 8 = 10


Source of Variation    Sum of Squares    d.f.                   Mean Square    F. Ratio
Between Columns        SSC = 18          c – 1 = 3              MSC = 6        Fc = 6/1.667 = 3.6
Between Rows           SSR = 8           r – 1 = 2              MSR = 4        Fr = 4/1.667 = 2.4
Residual               SSE = 10          (c-1) x (r-1) = 6      MSE = 1.667
Total                  SST = 36          N – 1 = 11

Between columns:

DF (3,6), Table value of F = 4.757 @ α = 0.05

Calculated value of F = 3.6 < Table value

Therefore we accept the hypothesis that there is no significant difference between
replications.

Between rows:

DF (2,6), Table value of F = 5.143

Calculated F value is 2.4 < Table value

Therefore we accept the hypothesis that there is no significant difference between the
varieties.
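
The two way procedure, with one observation per cell, can be sketched in Python as follows. The function name `two_way_anova` and the layout of the input (a list of rows) are assumptions made for this illustration; the data are the yields of the randomized block example, with varieties as rows and replications as columns.

```python
def two_way_anova(table):
    """table[i][j] is the single observation in row i, column j."""
    r, c = len(table), len(table[0])
    values = [x for row in table for x in row]
    correction = sum(values) ** 2 / (r * c)                  # T^2 / N
    sst = sum(x * x for x in values) - correction
    col_totals = [sum(row[j] for row in table) for j in range(c)]
    row_totals = [sum(row) for row in table]
    ssc = sum(t * t for t in col_totals) / r - correction    # between columns
    ssr = sum(t * t for t in row_totals) / c - correction    # between rows
    sse = sst - ssc - ssr                                    # residual
    msc, msr = ssc / (c - 1), ssr / (r - 1)
    mse = sse / ((c - 1) * (r - 1))
    return {"SST": sst, "SSC": ssc, "SSR": ssr, "SSE": sse,
            "Fc": msc / mse, "Fr": msr / mse}

# Rows are varieties A, B, C; columns are replications 1 to 4
yields = [
    [6, 4, 8, 6],
    [7, 6, 6, 9],
    [8, 5, 10, 9],
]

anova = two_way_anova(yields)
print(anova)
```

Fc is compared with the table value for {c-1, (c-1)(r-1)} degrees of freedom and Fr with the value for {r-1, (c-1)(r-1)}.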


The F-test is used to test the equality of two variances. ANOVA is used to test the
equality of several means using the relation σx̄ = σ / √n.


Both correlation and regression are used to measure the strength of relationships between
variables.

The following statistical tools measure the relationship between the variables analyzed in
social science research.

1. Correlation

a. Simple correlation – Here the relationship between two variables is studied.

b. Partial correlation – Here the relationship between any two variables is studied,
keeping all others constant.
c. Multiple correlation – Here the relationship between three or more variables is
studied simultaneously.

2. Regression

a. Simple regression

b. Multiple regression

3. Association of Attributes

Correlation measures the relationship (positive or negative, perfect or otherwise) between
two variables. Regression analysis considers the relationship between variables and
estimates the value of one variable from the known value of another. Association of
Attributes attempts to ascertain the extent of association between two attributes.

Learning Objective 1

Understand the Concept of Correlation


When two or more variables move in sympathy with each other, they are said to be
correlated. If both variables move in the same direction then they are said to be positively
correlated. If the variables move in opposite directions then they are said to be negatively
correlated. If they move haphazardly then there is no correlation between them.

Correlation analysis deals with

1. Measuring the relationship between variables.

2. Testing the relationship for its significance.
3. Giving confidence interval for population correlation measure.

Causation and Correlation

A correlation between two variables does not by itself show that one causes the other. It
may be due to the following causes,

i) Due to small sample sizes. Correlation may be present in the sample and not in the
population.

ii) Due to a third factor. Correlation between yield of rice and tea may be due to a
third factor “rain”.

Types of Correlation
Types of correlation are given below

a. Positive or Negative

b. Simple, Partial and Multiple

c. Linear and Non-linear

Positive correlation: Both the variables (X and Y) will vary in the same direction. If
variable X increases, variable Y also will increase; if variable X decreases, variable Y
also will decrease. Negative Correlation: The given variables will vary in opposite
direction. If one variable increases, other variable will decrease.

Simple, Partial and Multiple correlations: In simple correlation, the relationship between
two variables is studied. In partial and multiple correlations three or more variables are
studied. In multiple correlation three or more variables are studied simultaneously. In
partial correlation more than two variables are involved, but the effect of one variable is
kept constant and the relationship between the other two variables is studied.

Linear and Non-Linear correlation: It depends upon the constancy of the ratio of change
between the variables. In linear correlation the amount of change in one variable bears a
constant ratio to the amount of change in the other variable. It is not so in non-linear
correlation.

Measures of correlation

i) Scatter Diagram.

ii) Karl Pearson’s correlation coefficient.

iii) Spearman’s Rank correlation coefficient.

Scatter Diagram

The ordered pair of observed values are plotted on x y plane as dots. Therefore it is also
known as Dot Diagram. It is diagrammatic representation of relationship.

If the dots lie exactly on a straight line that runs from left bottom to right top, then the
variables are said to be perfectly positively correlated (fig.i).

If the dots lie close to a straight line that runs from left bottom to right top, then the
variables are said to be positively correlated (fig.ii).
If the dots lie exactly on a straight line that runs from left top to right bottom then the
variables are said to be perfectly negatively correlated (fig iii).

If the dots lie very close to a straight line that runs from left top to right bottom then the
variables are said to be negatively correlated (fig iv).

If the dots lie all over the graph paper then the variables have zero correlation (fig v).

A scatter diagram tells us the direction in which the variables are related, but it does not
give any quantitative measure for comparison between sets of data.

Karl Pearson’s Correlation Coefficient

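It is defined as

i. r = ∑xy / (N σx σy) …………………………….(A)

where x = X – X̄ and y = Y – Ȳ are deviations from the respective means, σx and σy are the
standard deviations of X and Y, and

N – number of paired observations

∑xy / N is called covariance of x and y. The other forms of this formula are

ii. r = Cov(x, y) / (σx σy) …………………………….(B)

iii. r = ∑xy / √(∑x² ∑y²) …………………………….(C)

iv. r = [N∑XY – ∑X ∑Y] / √{[N∑X² – (∑X)²] [N∑Y² – (∑Y)²]} …………………………….(D)

For all practical purposes we can conveniently use form D. Whenever summary
information is given choose the proper form from A to C.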

Properties Of Karl Pearson’s Correlation Coefficient.

• Its value always lies between – 1 and 1.

• It is not affected by change of origin or change of scale.
• It is a relative measure (does not have any unit attached to it)

Factors influencing the size of Correlation Coefficient

The size of r is very much dependent upon the variability of measured values in the
correlation sample. The greater the variability, the higher will be the correlation,
everything else being equal.

The size of r is altered when researchers select extreme groups of subjects in order to
compare these groups with respect to certain behaviors. Selecting extreme groups on one
variable increases the size of r over what would be obtained with more random sampling.

Combining two groups which differ in their mean values on one of the variables is not
likely to faithfully represent the true situation as far as the correlation is concerned.
Addition of an extreme case (and conversely dropping of an extreme case) can lead to
changes in the amount of correlation. Dropping of such a case leads to reduction in the
correlation while the converse is also true. (Source: Aggarwal.Y.P, Statistical Methods,
Sterling Publishers Pvt Ltd., New Delhi, 1998, p.131).


Example 1: Find Karl Pearson’s Correlation Coefficient, given

X 20 16 12 8 4
Y 22 14 4 12 8

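X          Y          X²           Y²           XY
20         22         400          484          440
16         14         256          196          224
12         4          144          16           48
8          12         64           144          96
4          8          16           64           32
∑X = 60    ∑Y = 60    ∑X² = 880    ∑Y² = 904    ∑XY = 840

Applying the formula for r and substituting the respective values from the above
table we get r as:

r = [N∑XY – ∑X ∑Y] / √{[N∑X² – (∑X)²] [N∑Y² – (∑Y)²]}

= (5 × 840 – 60 × 60) / √{(5 × 880 – 60²)(5 × 904 – 60²)} = 600 / √(800 × 920) = 0.70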
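
As a cross-check, the raw-score form of the formula (the one built from ∑X, ∑Y, ∑X², ∑Y² and ∑XY) can be sketched in Python. The function name `pearson_r` is an illustrative choice; the first data set is the one tabulated above and the second is the data of Example 3.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Karl Pearson's coefficient from raw sums of the paired observations."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator

r_example_1 = pearson_r([20, 16, 12, 8, 4], [22, 14, 4, 12, 8])
r_example_3 = pearson_r([50, 60, 58, 47, 49, 33, 65, 43, 46, 68],
                        [48, 65, 50, 48, 55, 58, 63, 48, 50, 70])
print(round(r_example_1, 2), round(r_example_3, 3))
```

Because r is unaffected by a change of origin or scale, the same function gives the same answer if deviations from an assumed mean are passed in instead of the raw values.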

Example 2: Calculate Karl Pearson Coefficient of Correlation from the following data:

Year 1985 1986 1987 1988 1989 1990 1991 1992

Index of Production 100 102 104 107 105 112 103 99
Number of unemployed 15 12 13 11 12 12 19 26


Year     Index of          x = X – X̄    x²           No. of            y = Y – Ȳ    y²           xy
         Production X                                unemployed Y
1985     100               -4           16           15                0            0            0
1986     102               -2           4            12                -3           9            +6
1987     104               0            0            13                -2           4            0
1988     107               +3           9            11                -4           16           -12
1989     105               +1           1            12                -3           9            -3
1990     112               +8           64           12                -3           9            -24
1991     103               -1           1            19                +4           16           -4
1992     99                -5           25           26                +11          121          -55
Total    ∑X = 832          ∑x = 0       ∑x² = 120    ∑Y = 120          ∑y = 0       ∑y² = 184    ∑xy = -92

X̄ = 832 / 8 = 104 ; Ȳ = 120 / 8 = 15

r = ∑xy / √(∑x² ∑y²) = -92 / √(120 × 184) = -0.619

Therefore the correlation between production and the number of unemployed is negative.

Example 3: Calculate Correlation Coefficient from the following data:

X 50 60 58 47 49 33 65 43 46 68
Y 48 65 50 48 55 58 63 48 50 70

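Using the formula for calculating r as

r = [N∑XY – ∑X ∑Y] / √{[N∑X² – (∑X)²] [N∑Y² – (∑Y)²]}

and substituting the values N = 10, ∑X = 519, ∑Y = 555, ∑X² = 27977, ∑Y² = 31395 and
∑XY = 29284, we get r = 0.611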

Example 4: In a bivariate data on x and y, variance of x = 49, variance of y = 9 and
covariance (x, y) = -17.5. Find the coefficient of correlation between x and y.

Solution: we know r = ∑xy / (N σx σy) = Cov(x, y) / (σx σy)

r = -17.5 / (√49 × √9) = -17.5 / 21 = -0.833

There is a high negative correlation.

Example 5: Ten observation in Weight (x) and Height (y) of a particular age group gave
the following data.

Find “r”

Solution: we know

Correlation is practically nil.

Probable Error

It measures the extent to which the correlation coefficient is dependable. It is an old
measure of testing the reliability of “r”. It is given by

P.E = 0.6745 [1 – r²] / √n

Where “r” is measured from a sample of size n.

It is used to

i) interpret the value of r

a) If r < P.E, then it is not at all significant.

b) If r > 6 P.E, then “r” is highly significant.

c) If P.E < r < 6 P.E, we cannot say anything about the significance of “r”.

ii) Construct confidence limits within which the population “ρ” is expected to lie.
Conditions under which P.E can be used.

1. Samples should be drawn from a normal population.

2. The value of “r” must be determined from sample
3. Samples must have been selected at random

Example 6

If r = 0.6 and N = 64, a) interpret ‘r’ b) find the limits within which ‘ρ’ is supposed to lie.

P.E = 0.6745 × (1 – 0.6²) / √64 = 0.6745 × 0.64 / 8 = 0.054

a) 6 P.E = 6 x 0.054 = 0.324

Since r (0.6) > 6 P.E

It is highly significant

b) Limits for population “ρ”

= r ± P.E = 0.6 ± 0.054

= 0.546 to 0.654
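
The probable error rules can be sketched in Python as below. The function names are illustrative, the constant used is the conventional 0.6745, and comparing the absolute value of r (so that negative coefficients are handled too) is an assumption that goes slightly beyond the wording of the rules.

```python
from math import sqrt

def probable_error(r, n):
    """Probable error of a correlation coefficient r from a sample of size n."""
    return 0.6745 * (1 - r * r) / sqrt(n)

def interpret(r, n):
    """Apply the P.E rules of thumb to |r|."""
    pe = probable_error(r, n)
    if abs(r) < pe:
        return "not significant"
    if abs(r) > 6 * pe:
        return "highly significant"
    return "inconclusive"

pe = probable_error(0.6, 64)
limits = (0.6 - pe, 0.6 + pe)
print(round(pe, 3), interpret(0.6, 64), [round(v, 3) for v in limits])
```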

Spearman’s Rank Correlation Coefficient

Karl Pearson’s correlation coefficient assumes that

i) Samples are drawn from a normal population.

ii) The variables under study are affected by a large number of independent
causes so as to form a normal distribution.

When we do not know the shape of the population distribution, or when the data is of
qualitative type, Spearman’s rank correlation coefficient is used to measure the
relationship.

It is defined as

ρ = 1 – 6∑D² / [N(N² – 1)]

Where D is the difference between ranks assigned to the variables and N is the number of
pairs. The value of ρ lies between –1 and +1 and its interpretation is the same as that of
Karl Pearson’s correlation coefficient.
There are 3 types of problems

i. Ranks are assigned.

ii. Ranks are to be assigned and there is no tie between ranks.

iii. When there is tie between ranks.

i. When ranks are assigned already

Example 7: In a singing competition, two judges assigned the following ranks for 7
candidates. Find Spearman’s rank correlation coefficient.

Competitor 1 2 3 4 5 6 7
Judge I 5 6 4 3 2 7 1
Judge II 6 4 5 1 2 7 3


Competitor    R1 (Judge 1)    R2 (Judge 2)    D = R1 – R2    D²

1             5               6               -1             1
2             6               4               2              4
3             4               5               -1             1
4             3               1               2              4
5             2               2               0              0
6             7               7               0              0
7             1               3               -2             4
Total                                         0              14

ρ = 1 – 6∑D² / [N(N² – 1)] = 1 – (6 × 14) / (7 × 48) = 1 – 0.25 = 0.75

ii. When ranks are to be assigned and there is no tie

Example 8: Rank Difference Coefficient of Correlation (Case of No Ties)

Student    Score on    Score on    Rank on    Rank on    Difference        Difference
           Test I      Test II     Test I     Test II    between ranks     squared
           X           Y           R1         R2         D                 D²
A          16          8           2          5          -3                9
B          14          14          3          3          0                 0
C          18          12          1          4          -3                9
D          10          16          4          2          2                 4
E          2           20          5          1          4                 16
N = 5                                                                      ∑D² = 38

Applying the formula we get

ρ = 1 – 6∑D² / [N(N² – 1)] = 1 – (6 × 38) / (5 × 24) = 1 – 1.9 = –0.9

The relationship between the scores on Test I and Test II is very high and inverse.


Example 9: The sales statistics of 6 sales representatives in two different localities are
given below. Find whether there is a relationship between buying habits of the people in
the localities.

Representative    1     2     3     4      5     6
Locality I        70    40    65    110    60    20
Locality II       70    30    80    100    90    20

Representative    Sales in Locality I    R1    Sales in Locality II    R2    D = R1 – R2    D²

1                 70                     2     70                      4     -2             4
2                 40                     5     30                      5     0              0
3                 65                     3     80                      3     0              0
4                 110                    1     100                     1     0              0
5                 60                     4     90                      2     2              4
6                 20                     6     20                      6     0              0
Total                                                                        0              8

ρ = 1 – 6∑D² / [N(N² – 1)] = 1 – (6 × 8) / (6 × 35) = 1 – 0.229 = 0.771

There is high positive correlation between the buying habits of the people in the two localities.
iii. When ranks are repeated

Example 10

Find rank correlation coefficient for the following data.

Student             A     B     C     D     E     F     G     H     I     J
Score on Test I     20    30    22    28    32    40    20    16    14    18
Score on Test II    32    32    48    36    44    48    28    20    24    28

Student    Score on    Score on    Rank on    Rank on    Difference        Difference
           Test I      Test II     Test I     Test II    between ranks     squared
           X           Y           R1         R2         D                 D²
A          20          32          6.5        5.5        1.0               1.00
B          30          32          3          5.5        -2.5              6.25
C          22          48          5          1.5        3.5               12.25
D          28          36          4          4          0                 0
E          32          44          2          3          -1.0              1.00
F          40          48          1          1.5        -0.5              0.25
G          20          28          6.5        7.5        -1.0              1.00
H          16          20          9          10         -1.0              1.00
I          14          24          10         9          1.0               1.00
J          18          28          8          7.5        0.5               0.25
N = 10                                                                     ∑D² = 24

ρ = 1 – 6[∑D² + ∑(mi³ – mi)/12] / [N(N² – 1)]

mi represents the number of times a rank is repeated. Here four ranks are each repeated
twice (6.5 on Test I; 1.5, 5.5 and 7.5 on Test II), so ∑(mi³ – mi)/12 = 4 × (2³ – 2)/12 = 2 and

ρ = 1 – 6(24 + 2) / (10 × 99) = 1 – 0.158 = 0.842
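
A minimal Python sketch of the rank correlation calculation follows, assuming the usual tie correction of (m³ – m)/12 added to ∑D² for every group of m tied values. The helper names `average_ranks` and `spearman_rho` are illustrative; ranks are assigned in descending order of score and tied scores share the average of the positions they occupy.

```python
from collections import Counter

def average_ranks(scores):
    """Rank scores in descending order; ties get the mean of their positions."""
    positions = {}
    for pos, score in enumerate(sorted(scores, reverse=True), start=1):
        positions.setdefault(score, []).append(pos)
    return [sum(positions[s]) / len(positions[s]) for s in scores]

def spearman_rho(xs, ys):
    """Spearman's rank correlation with the tie correction term."""
    n = len(xs)
    rank_x, rank_y = average_ranks(xs), average_ranks(ys)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(rank_x, rank_y))
    # (m^3 - m) / 12 for every group of m tied values in either series
    ties = sum((m ** 3 - m) / 12
               for series in (xs, ys)
               for m in Counter(series).values() if m > 1)
    return 1 - 6 * (sum_d2 + ties) / (n * (n * n - 1))

test_1 = [20, 30, 22, 28, 32, 40, 20, 16, 14, 18]
test_2 = [32, 32, 48, 36, 44, 48, 28, 20, 24, 28]
rho_ties = spearman_rho(test_1, test_2)
rho_no_ties = spearman_rho([16, 14, 18, 10, 2], [8, 14, 12, 16, 20])
print(round(rho_ties, 3), round(rho_no_ties, 3))
```

When there are no ties the correction term is zero and the function reduces to the simple formula 1 – 6∑D² / [N(N² – 1)].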

Testing of Correlation

The “t” test is used to test the significance of a correlation coefficient. Consider the
height and weight of a random sample of six adults:

Height (cm) 170 175 176 178 183 185

Weight (Kg) 57 64 70 76 71 82

It is reasonable to assume that these variables are normally distributed, so the Karl
Pearson correlation coefficient is the appropriate measure of the degree of association
between height and weight. r = 0.875

Hypothesis test for Pearson’s population correlation coefficient

Ho: ρ = 0 This implies no correlation between the variables in the population

H1: ρ > 0 This implies that there is positive correlation in the population (increasing
height is associated with increasing weight). A 5% significance level is taken.

Test statistic: t = r √[(n – 2) / (1 – r²)] ; r = 0.875 ; t = 0.875 × √[(6 – 2) / (1 – 0.875²)] = 3.61

Table value of t at the 5% significance level (one-tailed) and 4 degrees of freedom (6 – 2) = 2.132.

Since the calculated value is more than the table value, the null hypothesis is rejected. There
is significant positive correlation between height and weight.
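
The test statistic can be reproduced with a short Python sketch. It assumes the standard form t = r √[(n – 2)/(1 – r²)]; the function name `t_statistic` is illustrative and the critical value 2.132 is the table value quoted in the text, not something the code computes.

```python
from math import sqrt

def t_statistic(r, n):
    """t statistic for testing Ho: rho = 0, with n - 2 degrees of freedom."""
    return r * sqrt((n - 2) / (1 - r * r))

T_TABLE = 2.132  # one-tailed 5% value for 4 d.f., as quoted in the text

t_cal = t_statistic(0.875, 6)
print(round(t_cal, 2))
print("Ho rejected" if t_cal > T_TABLE else "Ho accepted")
```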

Partial Correlation

Partial correlation is used in a situation where three or more variables are involved, for
example age, height and weight. The correlation between height and weight can be
computed by keeping age constant. Age may be an important factor influencing the
strength of the relationship between height and weight, and partial correlation is used to
keep the effect of age constant. The effect of one variable is partialled out from the
correlation between the other two variables. This statistical technique is known as partial
correlation.

Correlation between variables x and y is denoted as rxy

Partial correlation is denoted by the symbol r12.3. It is the correlation between variables 1
and 2 keeping the 3rd variable constant.

r12.3 = (r12 – r13 r23) / √[(1 – r13²)(1 – r23²)]

r12.3 = Partial correlation between variables 1 and 2 keeping 3rd constant

r12 = correlation between variables 1 and 2

r13 = correlation between variables 1 and 3

r23 = correlation between variables 2 and 3



Multiple Correlation

Three or more variables are involved in multiple correlation. The dependent variable is
denoted by X1 and other variables are denoted by X2, X3 etc. Gupta S.P. has expressed
that “the coefficient of multiple linear correlation is represented by R1 and it is common
to add subscripts designating the variables involved. Thus R1.234 would represent the
coefficient of multiple linear correlation between X1 on the one hand and X2, X3 and X4
on the other. The subscript of the dependent variable is always to the left of the point.”

The coefficient of multiple correlation in terms of r12, r13 and r23 can be expressed as

R1.23 = √[(r12² + r13² – 2 r12 r13 r23) / (1 – r23²)]

The coefficient of multiple correlation R1.23 is the same as R1.32

A coefficient of multiple correlation lies between 0 and 1. If the coefficient of multiple
correlation is 1, it shows that the correlation is perfect. If it is 0, it shows that there is no
linear relationship between the variables. The coefficients of multiple correlation are
always positive in sign and range from 0 to +1.

The coefficient of multiple determination can be obtained by squaring R1.23. An alternative
formula for computing R1.23 is:

R²1.23 = 1 – (1 – r12²)(1 – r13.2²)

Similarly, alternative formulas for R1.24 and R1.34 can be computed.

The following formula can be used to determine a multiple correlation coefficient with
three independent variables.

R²1.234 = 1 – (1 – r12²)(1 – r13.2²)(1 – r14.23²)

Multiple correlation analysis measures the relationship between the given variables. In
this analysis the degree of association between one variable considered as the dependent
variable and a group of other variables considered as the independent variables is measured.

Example 11: The following zero order correlation coefficients are given

r12 = 0.98; r13 = 0.44 r23 = 0.54

Calculate multiple correlation coefficient treating first variable as dependent and second
and third variables as independent. (source: Gupta S.P, Statistical Method)


First variable is dependent.

Second and third variables are independent.

Using the formula for multiple correlation coefficient for R1.23 we get:

R1.23 = √[(0.98² + 0.44² – 2 × 0.98 × 0.44 × 0.54) / (1 – 0.54²)] = √(0.6883 / 0.7084) = 0.986
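
The partial and multiple correlation coefficients can be computed from the three zero order coefficients with a few lines of Python. This is a sketch under the assumption that the standard three-variable formulas apply: r12.3 = (r12 – r13 r23)/√[(1 – r13²)(1 – r23²)] and R1.23 = √[(r12² + r13² – 2 r12 r13 r23)/(1 – r23²)]. The function names are illustrative.

```python
from math import sqrt

def partial_r(r_ab, r_ac, r_bc):
    """Partial correlation of a and b, keeping c constant (r_ab.c)."""
    return (r_ab - r_ac * r_bc) / sqrt((1 - r_ac ** 2) * (1 - r_bc ** 2))

def multiple_r(r12, r13, r23):
    """Multiple correlation R1.23 of variable 1 on variables 2 and 3."""
    return sqrt((r12 ** 2 + r13 ** 2 - 2 * r12 * r13 * r23) / (1 - r23 ** 2))

r12, r13, r23 = 0.98, 0.44, 0.54   # zero order coefficients of the example

R_1_23 = multiple_r(r12, r13, r23)
r13_2 = partial_r(r13, r12, r23)    # variables 1 and 3, keeping 2 constant

# Alternative route: R^2 = 1 - (1 - r12^2)(1 - r13.2^2)
R_squared_alt = 1 - (1 - r12 ** 2) * (1 - r13_2 ** 2)
print(round(R_1_23, 3), round(R_squared_alt, 4))
```

Both routes give the same coefficient of multiple determination, which is a useful arithmetic check when working by hand.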

Learning Objective 2

Know the Concept of Regression Analysis

Regression is defined as, “the measure of the average relationship between two or more
variables in terms of the original units of the data.”

Correlation analysis attempts to study the relationship between the two variables x and y.
Regression analysis attempts to predict the average x for a given y. In Regression it is
attempted to quantify the dependence of one variable on the other. Example: There are
two variables x and y. y depends on x. The dependence is expressed in the form of the

Regression Analysis

Regression analysis is used to estimate the values of the dependent variable from the
values of the independent variables.

Regression analysis is used to get a measure of the error involved while using the
regression line as a basis for estimation.

Regression coefficients are used to calculate the correlation coefficient. The square of the
correlation coefficient measures the degree of correlation that prevails between the given
two variables.

Regression Lines

For a set of paired observations there exist two straight lines. The line drawn such that the
sum of vertical deviations is zero and the sum of their squares is minimum is called the
regression line of y on x. It is used to estimate y – values for given x – values. The line
drawn such that the sum of horizontal deviations is zero and the sum of their squares is
minimum is called the regression line of x on y. It is used to estimate x – values for given
y – values. The smaller the angle between these lines, the higher is the correlation between
the variables.

The regression lines always intersect at (X̄, Ȳ)

The regression lines have equations,

(i) The regression equation of y on x is given by

Y – Ȳ = byx (X – X̄)

(ii) The regression equation of x on y is given by

X – X̄ = bxy (Y – Ȳ)

The regression equations found by the above conditions are said to be fitted by the method
of least squares. byx and bxy are called regression coefficients.

About Regression coefficient

• byx . bxy = r² ≤ 1
• If byx is –ve, then bxy is also –ve and r is –ve.
• They can also be expressed as byx = r σy / σx and bxy = r σx / σy

• It is an absolute measure.

Differences Between Correlation Coefficient And Regression Coefficient

Correlation Coefficient | Regression Coefficient

rxy = ryx | byx ≠ bxy in general
–1 ≤ r ≤ 1 | byx can be greater than one, but then bxy must be less than one such that byx.bxy ≤ 1
It has no units attached to it | It has unit attached to it
There exist nonsense correlation | There is no such nonsense regression
It is not based on cause and effect relationship | It is based on cause and effect relationship
It indirectly helps in estimation | It is meant for estimation

Examples :

Example 11: Find regression equation from the following data

Age of Husband 18 19 20 21 22 23 24 25 26 27
Age of Wife 17 17 18 18 19 19 19 20 21 22

And hence calculate correlation coefficient.


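Age of husband (x)    dx = x – 22    dx²    Age of wife (y)    dy = y – 19    dy²    dx dy
18                    -4             16     17                 -2             4      8
19                    -3             9      17                 -2             4      6
20                    -2             4      18                 -1             1      2
21                    -1             1      18                 -1             1      1
22                    0              0      19                 0              0      0
23                    1              1      19                 0              0      0
24                    2              4      19                 0              0      0
25                    3              9      20                 1              1      3
26                    4              16     21                 2              4      8
27                    5              25     22                 3              9      15
Total 225             5              85     190                0              24     43

X̄ = 225 / 10 = 22.5 ; Ȳ = 190 / 10 = 19

byx = [N∑dxdy – ∑dx ∑dy] / [N∑dx² – (∑dx)²] = (10 × 43 – 5 × 0) / (10 × 85 – 5²) = 430 / 825 = 0.521

bxy = [N∑dxdy – ∑dx ∑dy] / [N∑dy² – (∑dy)²] = (10 × 43 – 5 × 0) / (10 × 24 – 0²) = 430 / 240 = 1.792

Regression equation of Y on X is

Y – Ȳ = byx (X – X̄)

⇒ Y – 19 = 0.521 (X – 22.5)

⇒ Y = 0.521X + 7.2775

Regression equation of X on Y is

X – X̄ = bxy (Y – Ȳ)

⇒ X – 22.5 = 1.792 (Y – 19)

⇒ X = 1.792Y – 11.548

r = √(byx . bxy) = √(0.521 × 1.792) = 0.966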

Example 12: In a correlation study we have the following data.

        Series X    Series Y
Mean    65          67
S.D     2.5         3.5

Find the two regression equations.


Regression equation of x and y

Standard Error of Estimate

The standard error of estimate helps to measure the accuracy of the estimated figures in
regression analysis. If the value of the standard error of estimate is small, it shows that
the estimate provided by the regression equation is better and closer to the observed values.
If the standard error of estimate is zero, it shows that there is no variation about the line
and the correlation will be perfect. The standard error of estimate is used to ascertain how
good and representative the regression line is as a description of the average relationship
between two series.

The standard error of regression of X values from Xc is:

Sxy = √[∑(X – Xc)² / N]

where Xc is the value of X computed from the regression equation.



Example 13

1. The following results were worked out from scores in Statistics and Mathematics
in a certain examination.

                      Scores in Statistics (X)    Scores in Mathematics (Y)
Mean                  40                          48
Standard Deviation    10                          15

Karl Pearson’s correlation coefficient between x and y is = + 0.42

Find the regression lines x on y and y on x. Use the regression lines to find the value of y
when x = 50 and value of x when y = 30.

Given the following data:

X̄ = 40; Ȳ = 48; σx = 10; σy = 15; r = 0.42

The regression line x on y is

(X – X̄) = r (σx / σy) (Y – Ȳ) ………….(1)

The regression line y on x is

(Y – Ȳ) = r (σy / σx) (X – X̄) ………….(2)

Substituting the values, bxy = 0.42 × 10 / 15 = 0.28 and byx = 0.42 × 15 / 10 = 0.63, so we
get the respective equations as:

X = 0.28 Y + 26.56 ……….(3) And

Y = 0.63 X + 22.80 …………(4)

Therefore when y = 30, x = 34.96 using equation (3) and

when x = 50, y = 54.3 by using equation (4)

2. From the following data obtain the two regression equations

X 12 4 20 8 16
Y 18 22 10 16 14

Estimate Y for X = 15 and estimate X for Y = 20


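X̄ = (12 + 4 + 20 + 8 + 16) / 5 = 12 = mean of X

Ȳ = (18 + 22 + 10 + 16 + 14) / 5 = 16 = mean of Y

X        Y        X – X̄       Y – Ȳ       (X – X̄)²    (Y – Ȳ)²    (X – X̄)(Y – Ȳ)
                  = X – 12    = Y – 16
12       18       0           2           0           4           0
4        22       -8          6           64          36          -48
20       10       8           -6          64          36          -48
8        16       -4          0           16          0           0
16       14       4           -2          16          4           -8
Total                                     160         80          -104

bxy = ∑(X – X̄)(Y – Ȳ) / ∑(Y – Ȳ)² = -104 / 80 = -1.3

byx = ∑(X – X̄)(Y – Ȳ) / ∑(X – X̄)² = -104 / 160 = -0.65

Regression equation X on Y is given by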

X – 12 = – 1.3 (Y – 16)

Therefore X = 32.8 – 1.3Y

When Y = 20 X = 32.8 – 1.3 x 20 = 6.8

Regression equation Y on X is given by

Y – 16 = – 0.65 (X – 12)

Therefore Y = 23.8 – 0.65X

When X = 15 Y = 23.8 – 0.65 x 15 = 14.05
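
The two regression lines for this data can be reproduced with a short Python sketch. The function name `regression_lines` and the (slope, intercept) return format are illustrative choices; the coefficients are computed from deviations about the actual means, as in the worked solution.

```python
def regression_lines(xs, ys):
    """Return ((byx, intercept) for Y on X, (bxy, intercept) for X on Y)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    byx, bxy = sxy / sxx, sxy / syy
    # Both lines pass through the point of means (mean_x, mean_y)
    return (byx, mean_y - byx * mean_x), (bxy, mean_x - bxy * mean_y)

X = [12, 4, 20, 8, 16]
Y = [18, 22, 10, 16, 14]

(byx, a_yx), (bxy, a_xy) = regression_lines(X, Y)
y_at_15 = a_yx + byx * 15   # estimate of Y for X = 15
x_at_20 = a_xy + bxy * 20   # estimate of X for Y = 20
print(round(byx, 2), round(a_yx, 2), round(bxy, 2), round(a_xy, 2))
print(round(y_at_15, 2), round(x_at_20, 2))
```

The product byx × bxy equals r², so the correlation coefficient can be recovered from the two slopes, taking the sign that the slopes share.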

Multiple Regression Analysis

Multiple Regression Analysis is an extension of two variable regression analysis. In this
analysis, two or more independent variables are used to estimate the values of a
dependent variable, instead of one independent variable.

The objectives of multiple regression analysis are:

• To derive an equation which provides estimates of the dependent variable from
values of the two or more independent variables.
• To obtain the measure of the error involved in using the regression equation as a
basis of estimation.
• To obtain a measure of the proportion of variance in the dependent variable
accounted for or explained by the independent variables.

The multiple regression equation explains the average relationship between the given
variables, and the relationship is used to estimate the dependent variable. Regression
equation refers to the equation for estimating a dependent variable.

Example 14: Estimating the dependent variable X1 from the independent variables X2,
X3, … is known as the regression equation of X1 on X2, X3, …

Regression equation, when three variables are involved, is given below:

X1.23 = a1.23 + b12.3 X2 + b13.2 X3

Where X1.23 = estimated value of the dependent variable

X2 and X3 = independent variables.

a1.23 = (Constant) the intercept made by the regression plane. It gives the value
of the dependent variable, when all the independent variables assume a
value equal to zero.

b12.3 and b13.2 = Partial regression coefficients or net regression coefficients.

b12.3 = measures the amount by which a unit change in X2 is expected to affect
X1 when X3 is held constant.

b13.2 = measures the amount by which a unit change in X3 is expected to affect
X1 when X2 is held constant.

Deviations Taken From Actual Means

x1.23 = b12.3 x2 + b13.2 x3

where x1 = (X1 – X̄1), x2 = (X2 – X̄2) and x3 = (X3 – X̄3)

b12.3 and b13.2 can be obtained by solving the following equations.

∑x1x2 = b12.3 ∑x2² + b13.2 ∑x2x3

∑x1x3 = b12.3 ∑x2x3 + b13.2 ∑x3²
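
The two normal equations can be solved directly for the partial regression coefficients. The Python sketch below is only an illustration: the function name is hypothetical, and the data set is made up for the purpose (X1 is constructed as exactly 2 + 3 X2 – X3, so the coefficients that come back are known in advance). It is not taken from the text.

```python
def partial_regression_coefficients(x1, x2, x3):
    """Solve the normal equations in deviation form for (a, b12.3, b13.2)."""
    n = len(x1)
    m1, m2, m3 = sum(x1) / n, sum(x2) / n, sum(x3) / n
    d1 = [v - m1 for v in x1]
    d2 = [v - m2 for v in x2]
    d3 = [v - m3 for v in x3]
    s12 = sum(a * b for a, b in zip(d1, d2))
    s13 = sum(a * b for a, b in zip(d1, d3))
    s23 = sum(a * b for a, b in zip(d2, d3))
    s22 = sum(a * a for a in d2)
    s33 = sum(a * a for a in d3)
    det = s22 * s33 - s23 * s23          # zero if X2 and X3 are collinear
    b12_3 = (s12 * s33 - s13 * s23) / det
    b13_2 = (s13 * s22 - s12 * s23) / det
    intercept = m1 - b12_3 * m2 - b13_2 * m3
    return intercept, b12_3, b13_2

# Hypothetical data, built so that X1 = 2 + 3*X2 - X3 holds exactly
X2 = [1, 2, 3, 4, 5]
X3 = [2, 1, 4, 3, 6]
X1 = [2 + 3 * a - b for a, b in zip(X2, X3)]

a_1_23, b12_3, b13_2 = partial_regression_coefficients(X1, X2, X3)
print(round(a_1_23, 6), round(b12_3, 6), round(b13_2, 6))
```

With real data the fit is not exact, and the standard error of estimate measures how far the observed values lie from the fitted plane.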

Regression equation of X3 and X2 and X1 is:

Reliability of Estimates

Reliability of estimates tests whether the estimated value obtained by applying the
regression equation is close to the actual observed value. The standard error is used to
measure the closeness of the estimates derived from the regression equation to the actual
observed values. The measure of reliability is an average of the deviations of the actual
values of the dependent variable from the estimates given by the regression equation.
Determining the accuracy of estimates from the multiple regression is called reliability of
estimates. The measure is also known as the standard error of estimate.

Standard error of estimate of X1 on X2 and X3 is given below:

S1.23 = √[∑(X1 – X1est)² / (N – 3)]

S1.23 = Standard error of estimate X1 on X2 and X3

X1est = Estimated value of X1 as calculated from the regression equations.

Application of Multiple Regression

Multiple regression can be applied to test factors such as export elasticity, import
elasticity and structural change (contribution of manufacturing sector towards GDP)
for their influence on employment.

Here employment is the dependent variable. Similarly, researchers can use multiple
regression in their research work wherever it is appropriate.


In this unit we studied the concept of correlation and regression and the different types of
correlation and regression. We saw how regression helps us to study unknown variables
with the help of known variables. It also establishes a reliability measure for the estimates.

The growing competition, rapidity of change in circumstances and the trend towards
automation demand that decisions in business are based not purely on guesses and
hunches but on a careful analysis of data concerning the future course of events. The
future is unknown to us. Yet every day we are forced to make decisions involving the
future and therefore uncertainty. Great risk is associated with business affairs. All
businessmen are forced to make forecasts regarding business activities.

Success in business depends upon successful forecasts of business events. In business or
trade the importance of forecasting is so great that when a person enters the business
world, he really enters the profession of forecasting. In recent times, considerable
research has been conducted in this field. Attempts are being made to make forecasting as
scientific as possible.

Business forecasting as such is not a new development. Every businessman must
forecast, even if his whole product is sold before production. Forecasting has always been
necessary. What is new is the attempt to put forecasting on a scientific basis, i.e., to
forecast by reference to past history and statistics rather than by pure intuition and
guess-work.

One of the most important tasks before businessmen and economists these days is to
make estimates for the future. For example, a businessman is interested in finding out his
likely sales next year, or for long term planning in the next five or ten years, so that he can
adjust his production accordingly and avoid the possibility of either inadequate
production to meet the demand or unsold stocks. Similarly, an economist is interested in
estimating the likely population in the coming years so that proper planning can be
carried out with regard to jobs for the people, food supply etc. The first step in making
estimates for the future consists of gathering information from the past. In this connection
we usually deal with statistical data which are collected, observed or recorded at
successive intervals of time. Such data are generally referred to as time series. Thus
when we observe numerical data at different points of time the set of observations is
known as a time series.

Learning Objective 1

Understand the Concepts of Business Forecasting

Business Forecasting

Business forecasting refers to the analysis of past and present economic conditions with
the object of drawing inferences about probable future business conditions. The process
of making definite estimates of the future course of events is referred to as forecasting, and
the figure or statement obtained from the process is known as a ‘forecast’. The future course of
events is rarely known. In order to be assured of the coming course of events, help is taken of
an organized system of forecasting. There are two aspects of scientific business forecasting:

i. Analysis of past economic conditions: For this purpose, the components of a time series
are to be studied. The secular trend will show how the series has been moving in the past
and what its future course is likely to be over a long period. The cyclic fluctuations would
reveal whether the business activity is subject to boom or depression. The seasonal
fluctuations would indicate the seasonal changes in the business activity.

ii. Analysis of present economic conditions: The object of analyzing present economic
conditions is to study those factors which affect the sequential changes expected on the
basis of the past conditions. Such factors are new inventions, changes in fashion, changes
in economic and political spheres, economic and monetary policies of the Government,
war etc. These factors may affect and alter the duration of trade cycle. Therefore it is
essential to keep in mind the present economic conditions since they have an important
bearing on the probable future tendency.

Objectives of Forecasting in Business

Forecasting is a part of human conduct. Businessmen also have to look to the future.
Success in business depends on correct predictions. In fact, when a man enters business,
he automatically takes on the responsibility for attempting to forecast the future, and
to a very large extent his success or failure will depend upon his ability to forecast
successfully the future course of events. Without some element of continuity
between past, present and future, there would be little possibility of successful prediction.
History is not likely to repeat itself exactly, and we would hardly expect economic conditions
next year or over the next ten years to follow a clear-cut pattern. Yet, frequently, past
patterns prevail sufficiently to justify using the past as a basis for predicting the future.

A businessman cannot afford to base his decisions on guesses. Forecasting helps a
businessman in reducing the areas of uncertainty that surround management decision
making with respect to costs, sales, production, profits, capital investment, pricing,
expansion of production, extension of credit, development of markets, increase of
inventories and curtailment of loans. These decisions cannot be made off-hand. They are
to be based on present indications of future conditions.

While forecasting, we should know that it is impossible to forecast the future precisely;
there must always be some range of error allowed in the forecast. Statistical
forecasts are those in which we can use the mathematical theory of probability to measure
the risks of errors in predictions.

Prediction, Projection and Forecasting

A great amount of confusion seems to have grown up in the use of the words ‘forecast’,
‘prediction’ and ‘projection’. A prediction is an estimate based solely on past data of the
series under investigation. It is a purely mechanical extrapolation. A projection is a
prediction where the extrapolated values are subject to certain numerical assumptions.
A forecast is an estimate which relates the series in which we are interested to external
factors. Forecasts are made by estimating future values of the external factors by means
of prediction, projection or forecast and from these values calculating the estimate of the
dependent variable.

Characteristics of Business Forecasting

i. Based on past and present conditions: Business forecasting is based on the past and
present economic condition of the business. To forecast the future, various data,
information and facts concerning the economic condition of the business in the past and present
are analyzed.

ii. Based on mathematical and statistical methods: The process of forecasting includes
the use of statistical and mathematical methods. By using these methods the actual trend
which may take place in future can be forecast.

iii. Period: The forecasting can be made for long term, short term, medium term or any
specific term.

iv. Estimation of future: The business forecasting is to forecast the future regarding
probable economic conditions.

v. Scope: The forecasting can be physical as well as financial.

Steps in Forecasting

The forecasting of business fluctuations consists of the following steps:

i. Understanding why changes in the past have occurred: One of the basic principles of
statistical forecasting is that the forecaster should use the data on past performance. The
current rate and changes in the rate constitute the basis of forecasting. Once they are
known, various mathematical techniques can develop projections from them. If an attempt
is made to forecast business fluctuations without understanding why past changes have
taken place, the forecast will be purely mechanical, based solely upon the application of
mathematical formulae, and subject to serious error.

ii. Determining which phases of business activity must be measured: After it is known
why business fluctuations have occurred, it is necessary to measure certain phases of
business activity in order to predict what changes will probably follow the present level
of activity.

iii. Selecting and compiling data to be used as measuring devices: There is an
interdependent relationship between the selection of statistical data and the determination of
why business fluctuations occur. Statistical data cannot be collected and analyzed in an
intelligent manner unless there is a sufficient understanding of business fluctuations. It is
important that the reasons for business fluctuations be stated in such a manner that it is possible
to secure data that are related to the reasons.

iv. Analyzing the data: Lastly, the data are analyzed in the light of the understanding of the
reasons why changes occur. For example, if it is reasoned that a certain combination of
forces will result in a given change, the statistical part of the problem is to measure these
forces and, from the data available, to draw conclusions on the future course of action. The
methods of drawing conclusions may be called forecasting techniques.

Learning Objective 2

Understand the Different Methods Used for Business Forecasting

Methods of Business Forecasting

Almost all businessmen make forecasts about the business conditions related to
their business. In recent years scientific methods of forecasting have been developed. The
base of scientific forecasting is statistics. To handle the increasing variety of managerial
forecasting problems, several forecasting techniques have been developed.
Forecasting techniques vary from simple expert guesses to complex analysis of mass
data. Each technique has its special use, and care must be taken to select the correct
technique for a particular situation. Before applying a method of forecasting, the
following questions should be answered:

i. What is the purpose of the forecast, and how is it to be used?

ii. What are the dynamics and components of the system for which the forecast will be made?

iii. How important is the past in estimating the future?

Following are the main methods of business forecasting:-

1. Business Barometers

Business indices are constructed to study and analyze business activities, on
the basis of which future conditions are predicted. As business indices are
indicators of future conditions, they are also known as ‘Business
Barometers’ or ‘Economic Barometers’. With the help of these business
barometers the trend of fluctuations in business conditions is made known, and
by forecasting a decision can be taken relating to the problem. The construction of a
business barometer draws on gross national product, wholesale prices, consumer
prices, industrial production, stock prices, bank deposits etc. These quantities may
be converted into relatives on a certain base. The relatives so obtained may be
weighted and their average computed. The index thus arrived at is the business barometer.
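
The construction just described (convert each series into a relative on a base period, weight the relatives, and average them) can be sketched in a few lines of Python. This is only an illustration: the component names, the figures and the weights are all hypothetical, and a weighted arithmetic mean is assumed as the averaging rule.

```python
# Hypothetical component series: (base-period value, current value, weight).
# Every figure and weight here is made up purely for illustration.
components = {
    "industrial_production": (200.0, 230.0, 4),
    "wholesale_prices": (150.0, 165.0, 3),
    "bank_deposits": (80.0, 100.0, 2),
    "stock_prices": (50.0, 45.0, 1),
}


def relative(base, current):
    """Express the current value as a percentage of the base-period value."""
    return current * 100.0 / base


def composite_index(series):
    """Weighted arithmetic mean of the relatives (assumed averaging rule)."""
    weighted_sum = sum(relative(b, c) * w for b, c, w in series.values())
    total_weight = sum(w for _, _, w in series.values())
    return weighted_sum / total_weight


barometer = composite_index(components)
print("composite index:", round(barometer, 1))
```

An index above 100 indicates that, on the chosen weights, activity is higher than in the base period; the choice of base and weights is a judgement that strongly affects the result.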

The business barometers are of three types:

i. Barometers relating to general business activities: This is also known as the
general index of business activity, which refers to weighted or composite
indices of individual business activities. With the help of the general index
of business activity, the long-term trend and cyclical fluctuations in the economic
activities of a country are measured, but in some specific cases the long-term
trends can differ from general trends. These types of index help in the
formation of a country’s economic policies.

ii. Business barometers for specific business or industry: These barometers are
used as a supplement to the general index of business activity and are
constructed to measure the future variations in a specific business or industry.

iii. Business barometers concerning an individual business firm: This type of
barometer is constructed to measure the expected variations in a specific
individual firm of an industry.

Advantages

i. The business barometer method is scientific and reliable and is used by
management for the purpose of various business decisions at different levels.

ii. The business barometer method helps in proper forecasting of future trends of a business.

iii. The business barometers are the indicators of future business trends and help
to forecast the speed of fluctuations.

iv. This method helps to find solution of various business problems such as
development of market, capital investment, exploration of new consumer market


Limitations

i. It is very difficult to construct indices of business activities.

ii. In most cases, business barometers provide inaccurate, incomplete
and inconclusive forecasts because the index numbers are prepared on the basis of
incorrect and inadequate data.

iii. The business barometers are the indicators of past conditions and the
forecasting based on these conditions may be erroneous.

iv. Separate indices are calculated for individual industry and firm which are
entirely different from general indices.

2. Time Series Analysis

Time series analysis is also used for the purpose of business forecasting.
Forecasting through time series analysis is possible only when business
data for several years are available and reflect a definite trend and seasonal
variation. By time series analysis the long-term (secular) trend and the seasonal and
cyclical variations are ascertained, analyzed and separated from the data of
various years.


Advantages

i. It is an easy method of forecasting.

ii. By this method a comparative study of variations can be made.

iii. Reliable results of forecasting are obtained as this method is based on
mathematical model.


Limitations

i. This method is expensive, difficult and time-consuming.

ii. This method deals with past data only.

iii. This method can only be used when the data for several years are available.

3. Extrapolation

Extrapolation is the simplest method of business forecasting. By extrapolation, a
businessman finds out the likely trend of demand for his goods and their
future price trends. The accuracy of extrapolation depends on two factors:

i. Knowledge about the fluctuations of the figures,

ii. Knowledge about the course of events relating to the problem under consideration.

Thus there are two assumptions on which extrapolation is based:

i. There are no sudden jumps in figures from one period to another,

ii. There is regularity in fluctuations, and the rise and fall is uniform.

In extrapolation we assume that the variable will follow the established pattern of
growth. For the purpose of business forecasting it is necessary to determine accurately the
appropriate trend curve and the values of its parameters. Some of these curves are:

i. Arithmetic trend: The straight line arithmetic trend assumes that growth will be
a constant amount each year.

ii. Semi log trend: It assumes a constant percentage increase each year. As the
annual increment is constant in logarithm, this line will become a straight line
when drawn on semi log paper.

iii. Modified exponential curve: The curve is given by $y = ab^x$. This relationship is
referred to as an exponential function. It assumes that each increment of growth
will be a constant per cent of the previous one.

iv. Logistic curve: This curve has both an upper asymptote and a lower
asymptote. A curve of this type is well suited to describe the
growth of industries as they pass through early periods of
experimentation, rapid growth as the product is perfected and
economies of scale make possible price reductions.

v. The Gompertz curve: It is given by

$y_c = a b^{c^x}$

In the logarithmic form,

$\log y_c = \log a + (\log b)\, c^x$

To decide which curve should be used, it is helpful to obtain a
scatter diagram of the transformed variable.
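
As a rough illustration of choosing between the arithmetic and the semi-log trend listed above, the sketch below fits both by least squares and compares their squared errors. The sales figures are hypothetical (generated to grow 10 per cent a year), and the helper names `fit_line` and `sum_sq_error` are introduced here only for the example.

```python
import math


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx
    return mean_y - b * mean_x, b


# Hypothetical yearly sales growing 10 per cent a year (x = 0, 1, 2, ...).
years = list(range(6))
sales = [100 * 1.10 ** x for x in years]

# Arithmetic trend: y = a + b*x, a constant amount of growth each year.
a_lin, b_lin = fit_line(years, sales)

# Semi-log trend: log y = log a + x * log b, a constant percentage growth.
# The transformed variable log y is a straight line in x when this curve fits.
log_a, log_b = fit_line(years, [math.log(y) for y in sales])
a_exp, b_exp = math.exp(log_a), math.exp(log_b)


def sum_sq_error(predict):
    """Sum of squared differences between the data and a fitted curve."""
    return sum((y - predict(x)) ** 2 for x, y in zip(years, sales))


sse_lin = sum_sq_error(lambda x: a_lin + b_lin * x)
sse_exp = sum_sq_error(lambda x: a_exp * b_exp ** x)

print("arithmetic trend squared error:", round(sse_lin, 3))
print("semi-log trend squared error:", round(sse_exp, 3))
print("estimated yearly growth factor:", round(b_exp, 3))
```

Because these made-up figures grow by a constant percentage, the semi-log trend fits them almost exactly while the straight line does not; with real data the comparison is rarely this clear-cut.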


Advantages

i. This method is very useful to forecast the future demand and prices.

ii. This method is widely used for the forecasting of business
events because it is a simple method.

iii. We get reliable results by this method because it is a
mathematical method.


Limitations

i. This method can be used only when its assumptions hold.

ii. This method is not simple but technical because of its
mathematical formulation.

iii. The selection of the trend curve is very difficult.

4. Regression Analysis

The regression approach offers many valuable contributions to the solution of the
forecasting problem. It is the means by which we select, from among the many
possible relationships between variables in a complex economy, those which will
be useful for forecasting. A regression relationship may involve one predicted or
dependent variable and one independent variable (simple regression), or it may involve
relationships between the variable to be forecast and several independent
variables (multiple regression). Statistical techniques to estimate the
regression equations are often fairly complex and time-consuming, but there are
many computer programs now available that estimate simple and multiple
regressions quickly.
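
To make the idea of a multiple regression forecast concrete, here is a minimal sketch that estimates the coefficients by solving the normal equations. The data are hypothetical and were built from the relation y = 5 + 2*x1 + 3*x2, so the fit should recover those coefficients; the function names are invented for this example, and in practice a statistical package would normally be used instead.

```python
def solve(matrix, rhs):
    """Solve a small linear system by Gauss-Jordan elimination with pivoting."""
    n = len(rhs)
    aug = [row[:] + [r] for row, r in zip(matrix, rhs)]
    for col in range(n):
        # Bring the row with the largest entry in this column into position.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def fit_multiple_regression(rows, ys):
    """Least-squares coefficients [b0, b1, ...] for y = b0 + b1*x1 + b2*x2 + ..."""
    design = [[1.0] + list(r) for r in rows]  # leading 1.0 gives the intercept
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * y for d, y in zip(design, ys)) for i in range(k)]
    return solve(xtx, xty)


def predict(coefs, row):
    """Forecast the dependent variable for one set of independent values."""
    return coefs[0] + sum(b * x for b, x in zip(coefs[1:], row))


# Hypothetical observations of two independent variables (x1, x2) and y.
observations = [(1, 2), (2, 1), (3, 4), (4, 3), (5, 5), (6, 2)]
y_values = [13, 12, 23, 22, 30, 23]

coefs = fit_multiple_regression(observations, y_values)
forecast = predict(coefs, (7, 4))
print("coefficients:", [round(c, 3) for c in coefs])
print("forecast for x1=7, x2=4:", round(forecast, 3))
```

The forecast is only as good as the assumed values of the independent variables for the future period, which themselves have to be predicted or projected.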

5. Modern Econometric Methods

Econometric techniques, which took shape in the early twentieth century, have recently
gained in popularity for forecasting. The term econometrics refers to the
application of mathematical economic theory and statistical procedures to
economic data in order to verify economic theorems. Models take the form of a
set of simultaneous equations. The values of the constants in such equations are
supplied by a study of statistical time series, and a large number of equations may
be necessary to produce an adequate model.

At the present time, most short-term forecasting uses only statistical methods with
little qualitative information. However, in the years to come when most large
companies develop and refine econometric models of their major business, this
tool of forecasting will become more popular.


Advantages

i. Accurate and reliable results are obtained under this method because it is
a scientific method where computer is used.

ii. This method explains in detail and in quantitative terms the way in
which various aspects of the economy are interrelated.


Limitations

i. This method is difficult and complicated.

ii. This method can be used only when an adequate series of data is available.

iii. It is very difficult to construct a growth model for every business.

6. Exponential Smoothing Method

This method is often regarded as one of the best methods of business forecasting.
Exponential smoothing is a special kind of
weighted average and is found extremely useful in short-term forecasting
of inventories and sales.
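
The text does not give the formula, so the sketch below assumes the common simple form of exponential smoothing: the next forecast is alpha times the latest actual value plus (1 - alpha) times the previous forecast. The smoothing constant alpha = 0.5, the seeding of the first forecast with the first observation, and the sales figures are all assumptions made for illustration.

```python
def exponential_smoothing(series, alpha):
    """One-step-ahead forecasts by simple exponential smoothing.

    forecasts[t] is the forecast for period t; the final element is the
    forecast for the first period after the data end.
    """
    if not 0 < alpha <= 1:
        raise ValueError("alpha must lie in the interval (0, 1]")
    forecasts = [series[0]]  # assumption: seed with the first observation
    for actual in series:
        # New forecast = weighted average of latest actual and last forecast.
        forecasts.append(alpha * actual + (1 - alpha) * forecasts[-1])
    return forecasts


# Hypothetical sales figures (in thousands) for six successive periods.
sales = [12, 14, 16, 12, 10, 18]
forecasts = exponential_smoothing(sales, alpha=0.5)
print("forecast for the next period:", forecasts[-1])
```

Because each forecast carries forward a fraction of the previous one, the weight given to older observations falls off geometrically, which is why the method counts as a special kind of weighted average. A larger alpha makes the forecast react faster to recent changes; a smaller alpha smooths more heavily.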

Choice of a Method of Forecasting

The selection of an appropriate method depends on many factors: the
context of the forecast, the relevance and availability of historical data, the
degree of accuracy desired, the time period for which forecasts are
required, the cost benefit of the forecast to the company, and the time
available for making the analysis.

The forecaster should use a technique that makes the best use of available
data. Furthermore, where a company wishes to forecast with reference to a
particular product, it must consider the stage of the product’s life cycle for
which it is making the forecasts.

Theories of Business Forecasting

There are a few theories that are followed while making business forecasts. Some
of them are:

1. Sequence or Time-lag Theory

This is the most important theory of business forecasting. It is based on the
assumption that most business data have lag and lead relationships, i.e.,
changes in business are successive and not simultaneous. There is a time-lag
between different movements.

For example, when the government makes use of deficit financing, it leads to
inflationary pressure: the purchasing power of people goes up, and wholesale
prices and then retail prices start rising. With the rise in retail prices the cost of living
goes up and with it there is a demand for increased wages. Thus one factor, i.e.,
more money in circulation, has affected various fields of economic activity not
simultaneously but successively.

Advantages

i. This method is largely used for business forecasting because of the


ii. Though this theory is based on statistical techniques, yet it is easy to


iii. Time-interval between two events can be ascertained.

iv. Government can use this technique for the purpose of economic
stability of the economy by exercising control over possible losses.


Limitations

i. This method studies only the action, not the reaction.

ii. This method cannot be regarded as accurate, because statistical
techniques give results that are close to the truth but not exact.

2. Action and Reaction Theory

This theory is based on two assumptions: every action has a reaction, and the
magnitude of the original action influences the reaction. Thus if the price of rice
has gone up above a certain level in a certain period, there is a likelihood that after
some time it will go down below the normal level. Thus, according to this theory,
whether a certain level of business activity is normal or abnormal, conditions cannot
remain so for ever. Thus, we find four phases of a business cycle:

i. Prosperity

ii. Decline

iii. Depression, and

iv. Improvement


Advantages

i. This is better than other theories.

ii. By this theory more reliable results can be obtained because this theory
gives attention to action and reaction of event.

Limitations

i. The determination of the normal level is very difficult.

ii. It is not necessary that reaction is equal to the action.

3. Economic Rhythm Theory

The basic assumption of this theory is that history repeats itself, and hence it assumes
that all economic and business events behave in a rhythmic order. According to
this theory, the speed and time of all business cycles are more or less the same, and by
using statistical and mathematical methods a trend is obtained which will
represent a long-term tendency of growth or decline.

This is done on the assumption that the trend line denotes the normal
growth or decline of business events.


Advantages

i. Forecasting is made on the basis of past conditions, hence more reliable.

ii. This method is helpful in long-term forecasting


Limitations

i. Business events are not strictly periodic, and prediction of the business
cycle on the basis of statistical methods is not satisfactory.

ii. Past conditions are given more weightage than present conditions.

4. Specific Historical Analogy

‘History repeats itself’ is the main foundation of this theory. Whatever happened in
the past under a set of circumstances is likely to happen in future also if
conditions are the same. A time series relating to the data in question is
thoroughly scrutinized, and from it a period is selected in which conditions
were similar to those prevailing at the time of making the forecast. The method is
therefore largely dependent on past data.


Advantages

i. It is an easy method.

ii. As the future is forecast on the basis of past business conditions, the
forecasting will be more reliable.


Limitations

i. In this theory the forecasting is based on guesswork, not on a scientific
method, because the past and present conditions are rarely found to be the same.

ii. It is very difficult to select a past period with the same business
conditions as the present.

5. Cross-Cut Analysis Theory

This theory proceeds on the analysis of the interplay of current economic forces. In
this method, the combined effects of various factors are not studied. The effect of
each factor is studied independently. Under this theory, forecasting is made on the
basis of analysis and interpretation of present conditions, because past events
are taken to have no relevance to present conditions.


Advantages

i. Present conditions are given preference over past conditions.

ii. Facts are analyzed individually, not collectively.

iii. Forecast is nearer to the accuracy as it is based on present conditions.


Limitations

i. Independent analysis of individual facts is very difficult.

ii. Past facts are equally important for the purpose of forecasting, but in this
method no weightage is given to the past.

iii. The forecasting made on the basis of this technique cannot be regarded

Utility of Business Forecasting

Business forecasting has an important place in every field of the economy.
Business forecasting helps businessmen and industrialists to form the policies
and plans related to their activities. On the basis of forecasting, the
businessman can forecast the demand for the product, the price of the product,
the condition of the market etc. Business decisions can also be reviewed on the
basis of business forecasting. The main advantages of business forecasting are:

i. Helpful in increasing profit and reducing losses: Every business is carried
out with the purpose of earning maximum profits, so by forecasting the future
price of the product and its demand the businessman can predetermine the
production cost, the production and the level of stock to be held. Thus, business
forecasting is regarded as the key to the success of a business.

ii. Management decisions: Business forecasting provides a basis for
management decisions, because at the present time management has to take
decisions in an atmosphere of uncertainty. Also, business forecasting
explains the future conditions and enables the management to select the best

iii. Useful to administration: On the basis of forecasting, the government can
control the circulation of money and modify the economic, fiscal and monetary
policies to avoid the adverse effects of trade cycles. So, with the help of
forecasting the government can control the expected fluctuations in future.

iv. Basis for capital market: The business forecasting helps in estimating the
requirement of capital, position of stock exchange and the nature of investors etc.

v. Useful in controlling the business cycles: Trade cycles cause various
disturbances in business, such as sudden changes in the price level, an increase in the
risk of business, an increase in unemployment etc. By adopting systematic business
forecasting, businessmen and the government can handle and control the
depressions of trade cycles.

vi. To achieve the goals: Business forecasting helps to achieve the objectives of
business through proper planning of business activities.

vii. Facilitates control: By business forecasting the tendency towards black marketing,
speculation, uneconomic activities and corruption can be controlled.

viii. Utility to society: With the help of business forecasting the entire society is
also benefited because the adverse effects of fluctuations in the conditions of
business are kept under control.

Limitations of Business Forecasting

Business forecasting cannot be accurate due to various limitations, which are
as follows:

i. Forecasting cannot be accurate because it is largely based on future events
which are not certain to occur.

ii. Business forecasting is generally made by using statistical and
mathematical methods. But the use of these methods cannot claim to
make an uncertain future certain.

iii. The underlying assumptions of business forecasting cannot always be satisfied
simultaneously. In such a case the results of forecasting will be misleading.

iv. The forecasting cannot guarantee the elimination of errors and mistakes. The
managerial decision will be wrong if the forecasting is wrong.

v. Factors responsible for economic changes are often difficult to discover and to
measure. Hence business forecasting becomes an unnecessary exercise.

vi. The business forecasting does not evaluate risks.

vii. Forecasting is made on the basis of past information and data and assumes
that economic events are repeated under the same conditions. But there may be
circumstances where these conditions are not repeated.

viii. Forecasting is often not treated as a continuous process, although to be effective it requires
continuous attention.


In this unit we discussed the theory behind forecasting, the objectives of
forecasting, the steps involved in forecasting and the different methods available. Finally
we concluded with the utility of business forecasting.


A time series is a set of numerical values of a given variable listed at successive intervals
of time. That is, the data regarding the variable is listed in chronological order. Usually
the interval of time is taken as uniform.

Example: Yearly production of wheat in the country, hourly temperature of a city,
bimonthly electricity bills etc. Almost all data, such as industrial production, agricultural
production, exports, imports and dairy products, can be arranged in chronological order.

Learning Objective 1

Know the Concept of Time-series


Time Series Analysis

Given a time series, we wish to

i. Study the forces that influence the variations in time series, and

ii. Study the behaviour of phenomenon over the given period of time.

For example, consider the sales of T.V. sets (in thousands) by a producing company:

Year                         1995  1996  1997  1998  1999  2000
Number sold (in thousands)     12    14    16    12    10    18

We would like to analyse the above data and give some trends about the sales. For
example, the company would like to know as to why the sales dropped in 1998 and 1999,
and then why the sales increased. That is, the company would like to analyse the various
forces that affect the sales.

There can be changes in the values of the variable recorded at different points of time
due to various forces. Analysing the effect of all such forces on the values of the variable
is generally known as the analysis of time series. Broadly there can be four types of
changes in the values of the variable as discussed below:

i. Changes which generally occur due to general tendency of the data to increase
or decrease.

ii. Changes which occur due to change in climate, weather conditions, festivals etc.

iii. Changes which occur due to booms and depressions.

iv. Changes which occur due to some unpredictable forces like floods, famines,
earthquakes etc.

Learning Objective 2

Understand Different Components of a Time-series

Components of Time Series

The behaviour of a time series over periods of time is called the movement of the
time series. The time series is classified into the following four components:

1. Long Term Trend or Secular Trend: This refers to the smooth or
regular long term growth or decline of the series. This movement can be
characterised by a trend curve. If this curve is a straight line, it is called a
trend line. If the variable is increasing over a long period of time, then it is
called an upward trend. If the variable is decreasing over a long period of
time, then it is called a downward trend. If the variable moves upward or
downward along a straight line then the trend is called a linear trend,
otherwise it is called a non-linear trend.

2. Seasonal Variations: Variations in a time series that are periodic in nature
and occur regularly over short periods of time during a year are called
seasonal variations. By definition, these variations are precise and can be forecast.

Examples:

i. The prices of vegetables drop down after rainy season or in winter months
and they go up during summer, every year.

ii. The prices of cooking oils reduce after the harvesting of oil seeds and go up
after some time.

3. Cyclic Variations: The long-term oscillations that represent consistent
rises and declines in the values of the variable are called cyclic variations.
Since these are long-term oscillations in the time series, the period of
oscillation is usually greater than one year. The oscillations are about a trend
curve or a trend line. The period of one cycle is the time-distance between two
successive peaks or two successive troughs.

4. Random Variations: These are also called irregular movements. Movements
that occur usually in brief periods of time, without any pattern, and are
unpredictable in nature are called irregular movements. These movements do
not have any regular period or time of occurrence. Example: The effects of
national strikes, floods, earthquakes etc. It is very difficult to study the
behaviour of such variations.

Learning Objective 3

Know Different Methods for Time-series Analysis

Methods of Measuring Trend:

We shall be studying the following methods of measuring the trend of a time series:

1. Free hand or graphic method: This is the simplest method of drawing a
trend curve. We plot the values of the variable against time on graph paper
and join these points. The trend line is then fitted by inspecting the graph of
the time series. Fitting a trend line by this method is arbitrary. The trend line
is usually drawn such that the numbers of fluctuations on either side are
approximately the same. The trend line should be a smooth curve. The free
hand method has some disadvantages. They are:

i. It depends on individual judgement.

ii. It cannot be used for any predictions of trends, as drawing the trend curve is

Example: Find trend with the help of freehand curve method for the data given below:

Year 1991 1992 1993 1994 1995 1996 1997 1998 1999 2000 2001
n (in lakh 15 18 16 22 19 24 20 28 22 30 26

1. Semi-Average Method:

The steps for fitting a linear trend by the semi-average method are as follows:

i. The number of years is even: The data of the time series are divided into
two equal parts. The items in each part are totalled and the total is
divided by the number of items to obtain the arithmetic mean of each
part. Each average is then centred on the period of time from which it has
been computed and plotted on the graph paper. A straight line is drawn
passing through these two points. This is the required trend line.

ii. The number of years is odd: When the number of years is odd, the value
of the middle year is omitted to divide the time series into two equal parts.
Then procedure (i) is followed.

A trend value for any future year may be predicted by multiplying the periodic increment
(the difference between the two semi-averages divided by the number of years between
their centres) by the number of years into the future that is desired and adding the
result to the last trend value listed in the series.
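The semi-average procedure and the prediction rule can be sketched in Python. This is a minimal illustration that reuses the 1991-2001 series from the freehand example above; the function and variable names are hypothetical and not part of the text.

```python
# Semi-average trend line for the 1991-2001 series (11 years, so the
# middle year 1996 is omitted when the series is split into two halves).
years = list(range(1991, 2002))
values = [15, 18, 16, 22, 19, 24, 20, 28, 22, 30, 26]

def semi_average_trend(years, values):
    """Return (slope, centre_year, centre_value) of the semi-average line."""
    n = len(values)
    half = n // 2
    # First and last `half` items; an odd series drops its middle item.
    first_y, first_v = years[:half], values[:half]
    second_y, second_v = years[n - half:], values[n - half:]
    # Each average is centred on the middle of its own half.
    t1, m1 = sum(first_y) / half, sum(first_v) / half
    t2, m2 = sum(second_y) / half, sum(second_v) / half
    slope = (m2 - m1) / (t2 - t1)  # periodic (yearly) increment
    return slope, t1, m1

slope, t1, m1 = semi_average_trend(years, values)

def trend_value(year):
    """Trend value read off the semi-average line for a given year."""
    return m1 + slope * (year - t1)

# Forecast: last trend value plus increment times years ahead.
last_trend = trend_value(years[-1])
forecast_2003 = last_trend + slope * (2003 - years[-1])
print(round(slope, 2), round(last_trend, 2), round(forecast_2003, 2))
```

Here the two semi-averages are 18 (centred on 1993) and 25.2 (centred on 1999), so the yearly increment is 1.2.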


Merits:

i. The method is simple.

ii. The trend line can be extended on either side in order to obtain past or future
estimates.

iii. This is an objective method, as anyone applying this method gets the same
trend line.


Demerits:

i. The method of semi-average assumes a straight line relationship between the
plotted points, regardless of whether such a relationship exists or not.

ii. This method has the in-built limitations of the arithmetic mean. It is not
suitable in case of very low or very large extreme values.

iii. There is no assurance that the influence of cycle is eliminated.

3. Method of Moving Averages

This method is used for smoothing the time series. That is, it smoothens the
fluctuations of the data by the method of moving averages.

a. When the period of moving average is odd: To determine the trend by this
method, we use the following steps:

i. Obtain the time series.

ii. Select a period of moving average such as 3 years, 5 years etc.

iii. Compute moving totals according to the length of the period of moving
average.

If the length of the period of moving average is 3, i.e., a 3-yearly moving average is
to be calculated, compute moving totals as follows:

a + b + c, b + c + d, c + d + e, d + e + f, …

For a 5-yearly moving average, moving totals are computed as follows:

a + b + c + d + e, b + c + d + e + f, c + d + e + f + g, …

Place the moving totals at the centre of the time span from which they are
computed.

iv. Compute moving averages by dividing the moving totals in step (iii) by the length of the
period of moving average and place them at the centre of the time span from
which the moving totals are computed. These moving averages are also called the
trend values.
By plotting these trend values (if desired) one can obtain the trend curve, which
shows whether the trend is increasing or decreasing.

If needed, one can also compute short-term fluctuations by subtracting the trend values
from the actual values.
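The odd-period procedure can be sketched in Python as follows. The sketch reuses the 1991-2001 series from the freehand example; the function name `moving_average_odd` is a hypothetical helper, not something defined in the text.

```python
def moving_average_odd(values, period):
    """Centred moving totals and moving averages for an odd period."""
    if period % 2 == 0:
        raise ValueError("period must be odd")
    k = period // 2
    totals = [None] * len(values)
    averages = [None] * len(values)
    # The first and last k positions have no moving total.
    for i in range(k, len(values) - k):
        totals[i] = sum(values[i - k:i + k + 1])
        averages[i] = totals[i] / period
    return totals, averages

years = list(range(1991, 2002))
values = [15, 18, 16, 22, 19, 24, 20, 28, 22, 30, 26]

totals, trend = moving_average_odd(values, 3)
# Short-term fluctuations: actual value minus trend value.
fluctuations = [None if t is None else y - t for y, t in zip(values, trend)]

for row in zip(years, values, totals, trend, fluctuations):
    print(row)
```

With a 3-yearly period the first and last years carry no trend value, which is why those positions are left as `None`.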

Illustrative Example:

Year                       1988  1989  1990  1991  1992  1993  1994  1995  1996  1997
Production (in lakh ton)     15    18    16    22    19    24    20    28    22    30

Solution: Calculation of trend values

Year   Production Y        3-yearly        3-yearly moving   Short-term fluctuations
       (Thousand Tonnes)   moving totals   averages Yc       (Y – Yc)
1988   21                  -               -                 -
1989   22                  66              22.00             0.00
1990   23                  70              23.33             -0.33
1991   25                  72              24.00             1.00
1992   24                  71              23.67             0.33
1993   22                  71              23.67             -1.67
1994   25                  74              24.67             0.33
1995   27                  78              26.00             1.00
1996   26                  -               -                 -

b. When the period of moving average is even: When the period of moving average is
even (4 years etc.), we compute the moving averages by using the following steps:

i. Obtain the time series

ii. Obtain the length of the period of moving average. Let the length of the
moving averages period be 4-years.

iii. Compute 4 yearly moving totals and place them at the centre of time span.
The four – yearly moving totals are computed as follows:

a + b + c + d, b + c + d + e, c + d + e + f, …

iv. Compute the 4-yearly moving averages and place them at the centre of the
time span. Note that this placement is inconvenient, because a moving
average so placed falls between two time periods and does not coincide with an
original time period.

v. Take a two-period moving average of the moving averages and place it at
the middle of the two periods. This process is called centring of moving averages.
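Steps (iii) to (v) for an even period can be sketched as below. The example again uses the 1991-2001 series from the freehand example, and `centred_moving_average` is a hypothetical helper name.

```python
def centred_moving_average(values, period=4):
    """Centred moving average for an even period (e.g. 4 years)."""
    if period % 2 != 0:
        raise ValueError("period must be even")
    n = len(values)
    # Step (iv): plain moving averages; each one falls between two periods.
    plain = [sum(values[i:i + period]) / period for i in range(n - period + 1)]
    # Step (v): average each adjacent pair so that the result
    # coincides with an original time period.
    centred = [None] * n
    half = period // 2
    for i in range(len(plain) - 1):
        centred[i + half] = (plain[i] + plain[i + 1]) / 2
    return centred

values = [15, 18, 16, 22, 19, 24, 20, 28, 22, 30, 26]
centred = centred_moving_average(values, 4)
print(centred)
```

For a 4-yearly period the first two and the last two years have no centred value.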

Merits of method of moving averages:

i. This method is simple.

ii. This method is objective in the sense that anybody working on a problem with
this method will get the same results.

iii. This method is used for determining seasonal, cyclic and irregular variations
besides the trend values.

iv. This method is flexible enough to add more figures to the data because the
entire calculations are not changed.

v. If the period of moving averages coincides with the period of cyclic
fluctuations in the data, such fluctuations are automatically eliminated.


Demerits of method of moving averages:

i. There is no functional relationship between the values and the time. Thus, this
method is not helpful in forecasting and predicting the values on the basis of time.

ii. There are no trend values for some years at the beginning and some at the end.
For example, for a 5-yearly moving average there will be no trend values for the
first two years and the last two years.

iii. In case of non – linear trend the values obtained by this method are biased in
one or the other direction.

iv. The selection of the period of moving average is a difficult task. Therefore
great care has to be taken in selecting the period, particularly, when there is no
business cycle during that time.

Method of least squares

Under this method the trend curve is determined by fitting a mathematical equation. This
method is more accurate and precise and can also be used for forecasting. We can fit
either a straight line or a parabolic curve to the given data by this method. Let y be the
actual value and yc be the computed value of y for a given value of x.

Let y = a + bx be the straight line to be fitted for trend. The method of least squares
finds the values of a and b such that the sum of squares of the differences between the
actual and computed values of y is least, i.e. $\sum (y - y_c)^2$ is least, while the
condition $\sum (y - y_c) = 0$ is satisfied. The line obtained by the method is known as
the ‘line of best fit’.

For a given time series, to find a linear trend, the values of a and b are obtained by
solving the normal equations:

$$\sum y = Na + b \sum x$$

$$\sum xy = a \sum x + b \sum x^2$$

where N is the number of pairs for which data are given.

Here a is the intercept of the line on the y-axis and b is the slope of the line. b is
also known as the growth rate (if b > 0) or decline rate (if b < 0); it gives the change in
the value of y per unit change in the value of x.

Direct method

i. Convert the years into natural numbers (1, 2, 3, …), denote them by x and find
∑ x.

ii. Find the squares of the x values and obtain ∑ x².

iii. Multiply the x – values with corresponding y – values and obtain ∑ xy.

iv. Add the values of y to obtain ∑ y.

v. Put these values in the two normal equations and solve for a and b.

vi. Substitute these values of a and b in y = a + bx and then find trend values for
various values of x.
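The direct method can be sketched in Python as follows. The two normal equations for a straight line, ∑y = Na + b∑x and ∑xy = a∑x + b∑x², are solved in closed form. The data are the 1991-2001 series from the freehand example, and `fit_line_direct` is a hypothetical helper name.

```python
def fit_line_direct(y):
    """Fit y = a + b*x with x = 1, 2, 3, ... via the two normal equations."""
    n = len(y)
    x = list(range(1, n + 1))                       # step (i)
    sum_x = sum(x)
    sum_x2 = sum(v * v for v in x)                  # step (ii)
    sum_xy = sum(xi * yi for xi, yi in zip(x, y))   # step (iii)
    sum_y = sum(y)                                  # step (iv)
    # Step (v): solve  sum_y  = n*a     + b*sum_x
    #                  sum_xy = a*sum_x + b*sum_x2
    b = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
    a = (sum_y - b * sum_x) / n
    return a, b

values = [15, 18, 16, 22, 19, 24, 20, 28, 22, 30, 26]
a, b = fit_line_direct(values)
# Step (vi): trend values for x = 1, 2, ..., N
trend = [a + b * x for x in range(1, len(values) + 1)]
print(round(a, 3), round(b, 3))
```

Because the fitted line is a function of x, it can also be evaluated at x = N + 1, N + 2, … to project the trend forward.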

Short – Cut Method

The variable x may be measured from any point of time as origin, such as the first year,
but the calculations are simplified when the mid-point in time is taken as origin so that
∑ x = 0. When ∑ x = 0, the normal equations reduce to:

∑ y = Na, therefore a = ∑ y / N, and

∑ xy = b ∑ x², therefore b = ∑ xy / ∑ x²
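A minimal Python sketch of the short-cut method follows. For an even number of years it measures x in half-year units (…, -3, -1, 1, 3, …) so that ∑ x = 0 still holds; that convention is an assumption made here, not something stated in the text. `fit_line_shortcut` is a hypothetical helper name.

```python
def fit_line_shortcut(y):
    """Fit y = a + b*x with the origin at the middle of the period (sum of x = 0)."""
    n = len(y)
    if n % 2 == 1:
        x = [i - n // 2 for i in range(n)]       # ..., -1, 0, 1, ...
    else:
        # Assumed convention: half-year units so the x values stay integers.
        x = [2 * i - (n - 1) for i in range(n)]  # ..., -3, -1, 1, 3, ...
    a = sum(y) / n
    b = sum(xi * yi for xi, yi in zip(x, y)) / sum(xi * xi for xi in x)
    return a, b, x

values = [15, 18, 16, 22, 19, 24, 20, 28, 22, 30, 26]  # 1991-2001
a, b, x = fit_line_shortcut(values)
print(round(a, 3), round(b, 3))
```

With the origin at 1996 the slope is the same as with x = 1, 2, 3, …; only the intercept changes, because it is now the trend value at the mid-point.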


Merits:

i. This method is a completely objective method.

ii. This method gives the trend values for the entire time period.

iii. This method can be used to forecast future trend because trend line establishes a
functional relationship between the value and the time.


Demerits:

i. It requires many calculations and is tedious and complicated.

ii. If even a single item is added to the series a new equation has to be formed.

iii. Future forecasts made by this method are based only on trend values. Seasonal,
cyclical or irregular variations are ignored.

Non – linear Trend:

Fitting a Parabolic Curve or Non – linear Curve by the Method of Least Squares.

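When the time series data do not conform to a linear trend, we fit a non-linear
trend. For this we use an equation of the form

$$Y = a + bx + cx^2 + dx^3 + \dots + kx^n, \quad k \neq 0,$$

which is known as a polynomial of degree n in x.

Let the parabolic curve be

$$Y = a + bx + cx^2$$

with the usual notation. The values of a, b and c can be determined by
solving the normal equations:

$$\sum y = Na + b \sum x + c \sum x^2$$

$$\sum xy = a \sum x + b \sum x^2 + c \sum x^3$$

$$\sum x^2 y = a \sum x^2 + b \sum x^3 + c \sum x^4$$

If we change the origin to a suitable point such that ∑ x = 0 (for equally spaced
periods ∑ x³ = 0 as well), then the normal equations reduce to:

$$\sum y = Na + c \sum x^2$$

$$\sum xy = b \sum x^2$$

$$\sum x^2 y = a \sum x^2 + c \sum x^4$$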
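The reduced normal equations for the parabola can be solved directly, as in the sketch below. With ∑ x = 0 and ∑ x³ = 0 the system is ∑y = Na + c∑x², ∑xy = b∑x² and ∑x²y = a∑x² + c∑x⁴. The data are hypothetical, chosen to lie exactly on y = 2 + 3x + 0.5x², and `fit_parabola_centred` is an illustrative name.

```python
def fit_parabola_centred(y):
    """Fit y = a + b*x + c*x^2 for an odd number of equally spaced periods,
    with the origin at the middle period so that sum(x) = sum(x^3) = 0."""
    n = len(y)
    if n % 2 == 0:
        raise ValueError("this sketch expects an odd number of observations")
    x = [i - n // 2 for i in range(n)]
    s_x2 = sum(v ** 2 for v in x)
    s_x4 = sum(v ** 4 for v in x)
    s_y = sum(y)
    s_xy = sum(xi * yi for xi, yi in zip(x, y))
    s_x2y = sum(xi ** 2 * yi for xi, yi in zip(x, y))
    b = s_xy / s_x2
    # Solve the remaining 2x2 system for a and c.
    c = (n * s_x2y - s_x2 * s_y) / (n * s_x4 - s_x2 ** 2)
    a = (s_y - c * s_x2) / n
    return a, b, c

# Hypothetical data lying exactly on y = 2 + 3x + 0.5x^2 for x = -2..2.
data = [-2.0, -0.5, 2.0, 5.5, 10.0]
a, b, c = fit_parabola_centred(data)
print(a, b, c)
```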
Mathematical Models for Time series

The following are the two models commonly used for the decomposition of a time series
into its components.

1. Additive Model: This model assumes that the observed value is the sum of
the four components of the time series, i.e.,

Y = T + S + C + I,

where Y = original data, T = trend value, S = seasonal component, C =
cyclical component and I = irregular component.

The additive model for decomposition of time series assumes that all the four
components of the time series operate independently of one another. It also
assumes that the behaviour of components is of an additive character. It is to
be noted that only absolute values are added or deducted from the trend value
to arrive at the observed value.

2. Multiplicative Model: This model assumes that the observed value is
obtained by multiplying the trend (T) by the rates of the three other components, i.e.,

Y = T × S × C × I.

The multiplicative model assumes that the components, although due to
different causes, are not necessarily independent and can affect one
another. It also assumes that the behaviour of the components is of a multiplicative
character. It may be noted that, except for the trend value, all the other values on
the right hand side are rates or index numbers.

Most time series relating to economic and business phenomena
conform to the multiplicative model. In practice, the additive model is rarely used.
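The difference between the two models can be illustrated with a few lines of Python. All component values below are made up for illustration: in the additive case S, C and I are absolute amounts in the units of T, while in the multiplicative case they are rates (index numbers divided by 100).

```python
# Hypothetical components for a single period (illustrative values only).
T = 200.0

# Additive model: S, C and I are absolute amounts added to the trend.
S_add, C_add, I_add = 20.0, -10.0, 4.0
Y_additive = T + S_add + C_add + I_add

# Multiplicative model: S, C and I are rates applied to the trend.
S_mul, C_mul, I_mul = 1.10, 0.95, 1.02
Y_multiplicative = T * S_mul * C_mul * I_mul

# Removing the trend: subtract it (additive) or divide by it (multiplicative).
detrended_additive = Y_additive - T              # equals S + C + I
detrended_multiplicative = Y_multiplicative / T  # equals S x C x I

print(Y_additive, round(Y_multiplicative, 2))
```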

Editing of Time Series:

It is necessary to make certain adjustments in the available data. Some important
adjustments are:

i. Time Variation: When data are available on a monthly basis, the effect of
time variation needs to be adjusted because all months of the year do not
have the same number of days. This adjustment is made by dividing each
monthly total by the number of days in that month to obtain a daily average,
which is then multiplied by 365 / 12, the average number of days in a month.

ii. Population changes: Adjustment for population change becomes

necessary when a variable is affected by change in population. If we are
studying National Income figures such adjustment is necessary. In this case,
adjustment is to divide the income by the number of persons concerned.
Then we can have per capita income figures.

iii. Price changes: Adjustment for price changes becomes necessary
wherever we wish to study changes in real value. Current values are deflated by
the ratio of current prices to base year prices.

iv. Comparability: In order to reach valid conclusions, the data being
analysed should be comparable. The analysis of a time series involves
data relating to the past, which must be homogeneous and comparable.
Therefore, efforts should be made to make the data as
homogeneous and comparable as possible.
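The time-variation and price adjustments can be sketched as below, reading the time-variation rule as: monthly total ÷ days in the month × 365/12. The function names and the sample figures are hypothetical.

```python
import calendar

def adjust_for_month_length(monthly_totals, year):
    """Scale each monthly total to a standard month of 365/12 days."""
    standard_month = 365 / 12
    adjusted = []
    for month, total in enumerate(monthly_totals, start=1):
        days = calendar.monthrange(year, month)[1]  # days in this month
        daily_average = total / days
        adjusted.append(daily_average * standard_month)
    return adjusted

def deflate(current_value, current_price, base_price):
    """Express a current-price value in base-year prices."""
    return current_value / (current_price / base_price)

# Hypothetical sales of exactly 10 units per day throughout 2001.
days_2001 = [calendar.monthrange(2001, m)[1] for m in range(1, 13)]
sales = [10 * d for d in days_2001]
adjusted = adjust_for_month_length(sales, 2001)
print([round(v, 2) for v in adjusted])

# Hypothetical: 1200 at current prices, price level 120 against a base of 100.
print(deflate(1200.0, 120.0, 100.0))
```

After adjustment every month shows the same figure, so differences that remain in real data are not caused by the calendar.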

Measurement of Seasonal Variation

In order to isolate and identify seasonal variations, we first eliminate as far as

possible the effect of trend, cyclical variations and irregular fluctuations on the time
series. The main methods of measuring seasonal variations are:

i. Seasonal Variation Index or Seasonal Average Method

ii. Seasonal Variation through Moving Averages

iii. Chain or Link Relative Method

iv. Ratio to Trend Method

Now we shall discuss them separately:

Seasonal Average Method

In this method we use the following steps:

i. The time series is arranged by years and months or quarters.

ii. Totals of each month or quarter over all the years are obtained.

iii. The average for each month or quarter is obtained. The average may be the mean
or the median. In general, we take the mean if not specified otherwise.

iv. Taking the average of the monthly or quarterly averages as equal to 100, the
seasonal index for each month or quarter is calculated by the following formula:

$$\text{Seasonal Index for a month (or quarter)} = \frac{\text{Monthly (or quarterly) average}}{\text{Average of monthly (or quarterly) averages}} \times 100$$

$$\text{Seasonal Index for first term} = \frac{S_1}{\bar{S}} \times 100$$

where $S_1$ = average of the first term,

$\bar{S}$ = average of all terms = $\sum S_j / k$, j = 1, 2, 3, …, k,

k = 12 for monthly data

= 4 for quarterly data
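The seasonal average method can be sketched in Python as below. The quarterly figures are hypothetical and `seasonal_indices` is an illustrative helper name; the mean is used as the average.

```python
def seasonal_indices(table):
    """Seasonal indices by the seasonal average method.

    table: one row per year, one column per season (month or quarter).
    """
    k = len(table[0])
    years = len(table)
    # Steps (ii)-(iii): average of each season over all the years.
    season_avgs = [sum(row[j] for row in table) / years for j in range(k)]
    # Step (iv): each seasonal average as a percentage of their average.
    grand_avg = sum(season_avgs) / k
    return [s / grand_avg * 100 for s in season_avgs]

# Hypothetical quarterly data for three years (rows = years, columns = Q1..Q4).
quarterly = [
    [30, 40, 36, 34],
    [34, 52, 50, 44],
    [40, 58, 54, 48],
]
indices = seasonal_indices(quarterly)
print([round(v, 2) for v in indices])
```

The indices always add up to 100 × k (400 for quarterly data), which is a quick check on the arithmetic.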


Merits:

i. This method is the simplest one.

ii. This method is useful where no definite trend exists in the time series.

Demerits / Limitations

i. Most economic time – series have trends and therefore, the seasonal index
computed by this method is really an index of trends and seasons.

ii. The simple averages method of isolating seasonal fluctuations in time series is
based on the assumption that the series contains only the seasonal and irregular
fluctuations.

iii. This method does not give a true reflection of the normal seasonal variation
because it is obtained from the original data which are affected by not only seasonal
movements but also by remaining three components.

iv. The effects of cycles in the original data are not eliminated by the process of
averaging.

Seasonal Variation through Moving Averages.

This method is also known as Percentage of Moving Average Method. The steps
involved in the computation of seasonal indices by this method are as follows:

i. The moving averages of the data are computed. If the data are monthly then
12 – monthly moving average, if they are quarterly, then 4 – quarterly moving
averages will be computed. In both the cases time periods of moving averages are
even, hence these moving averages are to be centred.

ii. Under the additive model, the corresponding moving average is deducted
from each original value to find the short-time fluctuations:

Y – T = S + C + I

iii. By preparing a separate table, the monthly (or quarterly) short-time fluctuations
are added for each month (or quarter) over all the years and their average is
obtained. These averages are known as the seasonal variations for each month or
quarter.

iv. If we want to isolate or measure the irregular variations, the mean of the respective
month or quarter is deducted from the short-time fluctuations.
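The steps above can be sketched for quarterly data under the additive model. The series is hypothetical: a straight-line trend plus a fixed seasonal pattern, so the method should return that pattern. The function name is illustrative, and the sketch assumes the series starts with the first quarter of a year.

```python
def seasonal_variations_additive(values, period=4):
    """Seasonal variations from centred moving averages (additive model)."""
    n = len(values)
    half = period // 2
    # Centred moving averages serve as the trend values T.
    plain = [sum(values[i:i + period]) / period for i in range(n - period + 1)]
    trend = [None] * n
    for i in range(len(plain) - 1):
        trend[i + half] = (plain[i] + plain[i + 1]) / 2
    # Short-time fluctuations Y - T, collected season by season.
    buckets = [[] for _ in range(period)]
    for i, (y, t) in enumerate(zip(values, trend)):
        if t is not None:
            buckets[i % period].append(y - t)
    # Average fluctuation of each season = its seasonal variation.
    return [sum(b) / len(b) for b in buckets]

# Hypothetical series: trend 10 + t plus a seasonal pattern that sums to zero.
pattern = [2.0, -1.0, -3.0, 2.0]
series = [10 + t + pattern[t % 4] for t in range(12)]
print(seasonal_variations_additive(series))
```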

Chain or Link Relative Method

This method involves the steps given below:

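i. Each quarterly or monthly value is divided by the preceding quarterly or
monthly value and the result is multiplied by 100. These percentages are known as the
Link Relatives of the seasonal values. Thus:

$$\text{Link Relative of a season} = \frac{\text{Value of the current season}}{\text{Value of the preceding season}} \times 100$$

There is no Link Relative for the first season of the first year, because it has no
preceding value.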

ii. The mean of the Link Relatives for each season is computed over all the
years. Median can also be taken instead of mean of the Link Relatives.

iii. These average link relatives are converted into chain relatives. The chain
relative of the first season is taken as 100.

Chain relative of the current season
= (Average Link Relative of the current season × Chain Relative of the previous season) / 100

iv. A second chain relative of the first season is computed on the basis of the chain
relative of the last season:

Second chain relative of the first season = (Average Link Relative of the first season ×
Chain Relative of the last season) / 100

This chain relative may or may not be 100; when it differs from 100, the difference is due
to secular trend. If it is 100 go to step (vi); if it is not 100 go to step (v) and then step (vi).

v. Compute the difference d between the new chain relative of the first season obtained in
step (iv) and the chain relative assumed as 100. d is divided by the number of seasons
and the resulting figure is multiplied by 1, 2, 3, … and the products are deducted
respectively from the chain relatives of the 2nd, 3rd, 4th, … seasons. These are called
corrected chain relatives.

vi. The chain relatives obtained in step (iii), if no correction is necessary, or the
corrected chain relatives obtained in step (v), are expressed as percentages of their
average to obtain the adjusted chain relatives.

These are the required seasonal indices.
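The six steps can be sketched in Python as below. The table is hypothetical: the same quarterly pattern repeated for three years with no trend, so the second chain relative of the first quarter comes back to 100 and no correction is needed. `link_relative_indices` is an illustrative name and the mean is used for averaging.

```python
def link_relative_indices(table):
    """Seasonal indices by the link relative method.

    table: rows are years, columns are seasons (e.g. four quarters).
    """
    k = len(table[0])
    flat = [v for row in table for v in row]
    # Step (i): link relatives; the very first value has none.
    links = [[] for _ in range(k)]
    for i in range(1, len(flat)):
        links[i % k].append(flat[i] / flat[i - 1] * 100)
    # Step (ii): average link relative of each season.
    avg_link = [sum(l) / len(l) for l in links]
    # Step (iii): chain relatives, with the first season taken as 100.
    chain = [100.0]
    for j in range(1, k):
        chain.append(avg_link[j] * chain[j - 1] / 100)
    # Step (iv): second chain relative of the first season.
    new_first = avg_link[0] * chain[-1] / 100
    # Step (v): deduct 1d, 2d, 3d, ... from the 2nd, 3rd, 4th, ... seasons.
    d = (new_first - 100) / k
    corrected = [chain[j] - j * d for j in range(k)]
    # Step (vi): express as percentages of their average.
    mean = sum(corrected) / k
    return [c / mean * 100 for c in corrected]

# Hypothetical trendless data: the same quarterly pattern every year.
table = [[80, 120, 110, 90]] * 3
print([round(v, 2) for v in link_relative_indices(table)])
```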

Ratio to Trend Method

The steps to determine seasonal indices by this method are as follows:

i. Determine the trend values by the method of least squares.

ii. To find the ratio to trend, divide the original data by the corresponding trend
values and multiply these ratios by 100, i.e.,

$$\text{Ratio to trend} = \frac{\text{Original value}}{\text{Trend value}} \times 100$$

iii. Calculate the arithmetic mean of the trend ratios obtained in step (ii) for each
quarter (or month).

iv. Finally, the average trend ratios are converted into seasonal indices. For this,
add all the averages obtained in (iii) and find their general average. Seasonal indices
are calculated by using the following formula:

Seasonal Index = (Quarterly Average / General Average) × 100
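A minimal Python sketch of the ratio to trend method is given below. As a simplifying assumption it fits the least-squares line directly to the quarterly series (x = 1, 2, 3, …) rather than to yearly figures; the data and the function name are hypothetical.

```python
def ratio_to_trend_indices(values, period=4):
    """Seasonal indices by the ratio to trend method.

    Assumes the series starts with the first season of a year.
    """
    n = len(values)
    x = list(range(1, n + 1))
    # Step (i): least-squares line y = a + b*x (fitted to the seasonal
    # series itself, which is an assumption of this sketch).
    sx, sy = sum(x), sum(values)
    sxx = sum(v * v for v in x)
    sxy = sum(xi * yi for xi, yi in zip(x, values))
    b = (n * sxy - sx * sy) / (n * sxx - sx ** 2)
    a = (sy - b * sx) / n
    # Step (ii): ratio to trend, as a percentage.
    ratios = [y / (a + b * xi) * 100 for xi, y in zip(x, values)]
    # Step (iii): average ratio for each season.
    averages = [sum(ratios[j::period]) / len(ratios[j::period])
                for j in range(period)]
    # Step (iv): each average as a percentage of the general average.
    general = sum(averages) / period
    return [avg / general * 100 for avg in averages]

# Hypothetical quarterly series for three years.
series = [30, 40, 36, 34, 34, 52, 50, 44, 40, 58, 54, 48]
print([round(v, 2) for v in ratio_to_trend_indices(series)])
```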

Learning Objective 4

Learn About Forecasting Methods Using Time-series

Forecasting Methods Using Time series

There are five forecasting methods using time series.

1. Mean Forecast: It is the simplest method of forecasting, in which for the
time period t we forecast the value of the series to be equal to the mean of the
series, i.e.,

$$\hat{y}_t = \bar{y}$$

In this method the trend and cyclic effects are not taken into account.

2. Naïve Forecast: In this method we forecast the value for the time period t
to be equal to the actual value observed in the previous period, i.e., time period
(t – 1). This is given as

$$\hat{y}_t = y_{t-1}$$

3. Linear Trend Forecast: It is given by $\hat{y}_t = a + bx$, where x is found
from the value of t, and a and b are constants. This method is based on the least
squares method, in which a linear relationship between time and
the response value y is obtained by the above formula.

4. Non-Linear Trend Forecast: In this method a non-linear relationship
between time and the response value is found by the method of
least squares. The forecast for the time period t can be
$\hat{y}_t = a + bx + cx^2$,
where the x-value is calculated from the value of t and a, b and c are constants.

5. Forecasting with Exponential Smoothing
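Three of these forecasts are short enough to sketch in Python. The mean and naïve forecasts follow the definitions above. The text only names exponential smoothing, so the recursion shown (new level = α × latest value + (1 − α) × previous level) is the usual simple form and is an assumption here, as are the smoothing constant `alpha` and all function names.

```python
def mean_forecast(history):
    """Forecast for the next period = mean of the observed series."""
    return sum(history) / len(history)

def naive_forecast(history):
    """Forecast for the next period = last observed value."""
    return history[-1]

def exponential_smoothing_forecast(history, alpha=0.3):
    """Simple exponential smoothing (assumed standard form).

    alpha is an assumed smoothing constant between 0 and 1.
    """
    level = history[0]
    for y in history[1:]:
        level = alpha * y + (1 - alpha) * level
    return level

history = [15, 18, 16, 22, 19, 24, 20, 28, 22, 30, 26]  # 1991-2001 series
print(round(mean_forecast(history), 2))
print(naive_forecast(history))
print(round(exponential_smoothing_forecast(history, alpha=0.3), 2))
```

A larger `alpha` gives more weight to recent observations; with `alpha = 1` the smoothed forecast coincides with the naïve forecast.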

Learning Objective 5

Know About Utility of Time-series in Business Applications

Utility of the Time Series

The following are the possible uses of the time series:

i. Comparative study of the behaviour of the variable over different periods of
time can be done. The variable may be export figures, quantity of industrial
production etc:

ii. Forecasting can be done using the time series. By studying the variations and
other behaviour of the variables over a sufficiently long period of time, it may be
possible to forecast the future behaviour of the variables. However, such a
forecast has meaning only if the period of forecast is a normal period. For
example, various five-year plans by the Government of India are formulated by
studying the time series and forecasting.

iii. Study of the time series helps in analysing the past behaviour of the
variables. This helps in identifying the various forces that affect their behaviour.


In this unit we studied business forecasting. The different steps involved in
forecasting were discussed in a simple manner. The concept of time series analysis was
discussed next with examples. Action and reaction theory was explained with its
merits and demerits. Lastly, we discussed the method of least squares together with
its merits and demerits.


We know that most values change, and we may therefore want to know how much change
has taken place over a period of time. For example, we may want to know how much the
prices of different items essential to a household have increased or decreased so that
necessary adjustments can be made in the monthly budget. However, while the prices of a few
items may have increased, others may have decreased over a given period of time.
Consequently, in all such situations, an average measure needs to be defined to compare
such differences over a time period. Index numbers are yardsticks for describing such
differences.

An index number is a statistical measure which is designed to express changes or
differences in a variable or a group of related variables, usually expressed in percentage
form. These differences may have to do with the physical quantities of goods, the
prices of commodities, or such concepts as ‘efficiency’, ‘intelligence’ or ‘beauty’. The
comparison may be between periods of time, between places, between like categories
etc. We may have index numbers comparing the cost of living at different times or in
different localities or countries, the physical volume of production in different years, or
the efficiency of different government offices. However, we confine most of our attention to
the construction of index numbers measuring changes over time.

Learning Objective 1

Understand the Concept of Index Numbers

Index Number – Definition

An index number is a number which is used to measure the level of a certain
phenomenon as compared to the level of the same phenomenon at some standard period.
In other words, an index number is a number which is used as a device for comparison
between the price, quantity or value of a group of articles in different situations, e.g. at a
certain place or period of time and that of another place or period of time. When the
comparison is in respect of prices, it is called an index number of prices; when in respect
of physical quantities, it is called an index number of quantities. Other index numbers are
defined in a similar manner. Index numbers are meant for comparison of variations
arising out of differences in situations, e.g. change of time or change of place.

Learning Objective 2

Understand Different Types of Relative Indices


The value of a variable in a given year (or place) divided by the value of the same
variable in a specified year (or place) is called a relative and is generally expressed as
a percentage.

a.Price Relative: The price of commodity in a given year expressed as a percentage of the
price of the same commodity in a specified year is called price relative.

Suppose the price of a commodity in India in 2001 was Rs.95 per kg and in 2000 it was
Rs.80 per kg

Then the price relative for 2001, (using 2000 as base) is: 95 / 80 x 100 = 118.75%

b. Production Relative: If the wheat production in India in 2002 was 5,82,000 metric tons
and in 2004 it was 6,96,000 metric tons, then taking the production of 2002 as 100, the
production relative for 2004 is equal to (696000 / 582000) x 100 = 119.6%

c. Quantity Relative: The quantity (q1) of a commodity consumed in a given year
expressed as a percentage of the quantity (q0) of the same commodity consumed in a
specified year is called Quantity Relative.

Thus Quantity Relative = q1 / q0 x 100

d. Value Relative: If p1 and q1 are the price and quantity respectively of a commodity in a
given year and p0 and q0 are the price and quantity respectively of the same
commodity in a specified year, then V1 = p1q1 is the value in the given year and V0 = p0q0 is
the value in the specified year.

The ratio V1 / V0 x 100 = p1q1 / p0q0 x 100 is called the value relative of the given year
with respect to the specified year.

The overall change in price, production, quantity or value etc. is represented by these
typical summaries which are known as relatives.
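All four relatives share one formula, which a short Python sketch makes explicit. The price and production figures are those quoted above; the price and quantity pairs used for the value relative are hypothetical, as is the function name `relative`.

```python
def relative(given, specified):
    """Value in the given year as a percentage of the specified (base) year."""
    return given / specified * 100

# Price relative for 2001 with 2000 as base (Rs. 95 against Rs. 80).
price_relative = relative(95, 80)

# Production relative for 2004 with 2002 as base.
production_relative = relative(696000, 582000)

# Value relative from hypothetical prices and quantities.
p0, q0 = 80, 50   # specified (base) year
p1, q1 = 95, 60   # given year
value_relative = relative(p1 * q1, p0 * q0)

print(price_relative, round(production_relative, 1), value_relative)
```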

Classification of Index Numbers

There are various approaches for classification of index numbers:

1. Based on Variables.

a.Price Index: When the variable is price.

b.Quantity index: When the variable is quantity

c.Value index: when the variable is value.

d.Production index: when the variable is production.

2. Based on Retail or Wholesale Prices

a. Cost of living index number: Where we use retail prices.

b. Wholesale price index number: Where we use wholesale prices.

3. Based on Weights

a. Simple (unweighted) index number.

b. Weighted index number.

4. When the number of commodities is more than one, we obtain a single
(combined) index number. This can be done in four ways:

i. Simple average of relatives

ii. Weighted average of relatives

iii. Simple aggregate

iv. Weighted aggregate.

Base Year and Current Year

In the computation of an index number we require two years (or places). The given year
whose values are to be compared is called the current year (or current period) and the
specified year whose values are taken as the standard (say 100) is called the base year (or base
period). For example, if the prices of 2005 are compared with the prices of 2004, then
2005 is the current year and 2004 is the base year. The index number of 2005 based on
2004 is, in general, denoted by I01 or P01, where 0 stands for 2004 and 1 stands for 2005.

Chief Characteristics of Index Numbers

i. Expressed in numbers: Index numbers represent relative changes, such as
production having increased or prices having fallen, in numbers.

ii. Expressed in percentage: Index numbers are expressed in terms of percentages so
as to show the extent of relative change, where the value of the base is assumed to be
100 but the sign of percentage (%) is not used.

iii. Relative measure: Index numbers measure changes which are not capable of
direct measurement.

iv. Specialised averages: An index number represents a special case of an average, in
general a weighted average. It is a special type of average because, whereas in a
simple average the data are homogeneous and have the same unit of measurement,
index numbers average variables having different units of measurement.

v. Basis of comparison: Index numbers by their very nature are comparative. They
compare changes over time or between places or like categories.

Learning Objective 3

Understand Construction of Index Numbers

Main Steps in the Construction of Index Numbers

Many problems are encountered in the construction of index numbers. The main steps,
each of which must be considered carefully, are:

1. Purpose of Index Number: The steps taken in the construction of index
numbers generally depend on the purpose of the index number. Hence the purpose
of an index number must be defined clearly and precisely. For example, the
purpose of the wholesale price index number is to measure
the general price level, while that of the consumer price index number is to give an
idea of the effect of the change in retail prices on the cost of living of classes of
people.

2. Selection of Base Period: The base period of an index number is the period of
time against which the comparisons are made. There are three types of base periods:

i.Fixed base (a single period)

ii.Fixed base (an average of selected periods)

iii.Chain base

While selecting the base, a decision has to be made as to whether we use a fixed
base or a chain base. For a fixed base (a single period):

i. The base period must be a normal period. By normal period we mean a
period which is free from all sorts of abnormalities or random causes such as
financial crises, floods, famines, earthquakes, strikes by labourers, wars etc.

ii. The base period should be a period for which reliable figures are available.

iii. The base period should not be too distant in the past.

When it is difficult to choose just one single period as the normal, then a better
choice will be an average of several periods.

If the comparisons are required from year to year, a system of chain base is used.
In this method, there is no fixed base for comparing the values of subsequent
years, but the value of each year is compared with the value of the preceding year.

3. Selection of Commodities:

a. The first problem is the selection of commodities, because it is not feasible to
include all commodities. The purpose of the index number helps in deciding
the number of commodities.

b.Which commodities are to be included? A careful selection of the commodities

must be made in such a way that:

i.It represents the real tastes, habits and the customs of the people,
ii.It should be of a standard quality and there must be no significant
variation in the quality,

iii.It must be easily recognizable and describable.

iv.It should not be a non-tangible commodity such as personal service etc.

4. Selection of the Representative Prices: In the collection of price quotations
we have to consider the following points:

i. The method of quoting prices of the commodities.

ii. The type of quotations, whether wholesale prices or retail prices.

iii.The place from where the quotations are to be obtained.

5. System of Weighting: The term ‘weight’ refers to the relative importance
of the different commodities included in the construction of index
numbers. There are two methods of assigning weights:

i. Implicit Method: In this method a commodity is given greater importance by
including several of its varieties in the index. Such weights are called implicit weights.

ii. Explicit Method: In this method, the weights are laid down on the basis
of some outward evidence of the importance of the commodities. One of the
problems in the selection of appropriate weights is to decide this evidence.
Another problem with regard to the system of weighting is whether
weights should be fixed or fluctuating.

6. Selection of the Average: To find a composite index number we can use any
average such as the arithmetic mean, geometric mean, harmonic mean, median or
mode. The choice of an average depends on the relative merits and demerits of the
various averages. The average may be weighted or unweighted.

7. Selection of Suitable Formula: There are various formulae for computing index
numbers, so the selection of a suitable formula also poses some problems. A
particular formula is suitable in a particular situation.

Learning Objective 4

Know the Methods of Computation of Index Numbers

The various methods of constructing index numbers can be classified into two
broad categories:

i. Unweighted index numbers.

ii. Weighted index numbers

In unweighted index numbers each item is supposed to have the same weight but
in weighted index numbers the weights are assigned to various items in
accordance with their importance.

Unweighted index numbers can be further divided into two categories:

i. Simple aggregative method

ii. Simple average of relatives method.

Weighted index numbers can be divided into two categories.

i. Weighted aggregative method.

ii. Weighted average of relatives method.

Unweighted Index Numbers

1. Simple aggregative method: To construct a price index by the simple
aggregative method we proceed as follows:

i. Add the prices of all commodities in the current year, i.e., find ∑p1.

ii. Add the prices of all commodities in the base year, i.e., find ∑p0.

iii. Divide the total of current year prices by the total of base year
prices and multiply the quotient by 100, i.e.,

$$I_{01} = \frac{\sum p_1}{\sum p_0} \times 100$$

iv. Here I01 is the simple price index number of the current year (1)
based on the base year (0).

Merits: This is the simplest method of constructing index numbers; it is
easy to understand and requires only simple calculations.

Demerits:

i. This method gives inappropriate results when the prices of different
commodities are quoted in different units.

ii. Since weights are not used, this method does not give any
consideration to the relative importance of commodities.

iii. The index number calculated by this method is unduly affected by high
or low values.

Example 1: Find the simple aggregative price index from the following data:

Commodity   Unit        Price in Rs. per unit
                        2000    2004
A           One kg.     10      15
B           One kg.     40      30
C           One dozen   10      12
D           One litre   5       13

Solution:

The price index number of 2004 based on 2000 is found using the formula

$$I_{01} = \frac{\sum p_1}{\sum p_0} \times 100$$

where ∑p1 = total of prices in 2004 = 70 and

∑p0 = total of prices in 2000 = 65, so that

$$I_{01} = \frac{70}{65} \times 100 = 107.69$$

This implies that prices had increased by about 7.7% in the year 2004 as compared to the
year 2000.
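The calculation in Example 1 can be checked with a few lines of Python. The prices are those of the example; the dictionary names are illustrative.

```python
# Prices in Rs. per unit from Example 1.
p0 = {"A": 10, "B": 40, "C": 10, "D": 5}    # base year 2000
p1 = {"A": 15, "B": 30, "C": 12, "D": 13}   # current year 2004

# Simple aggregative index: total current prices over total base prices.
total_p1 = sum(p1.values())
total_p0 = sum(p0.values())
index = total_p1 / total_p0 * 100

print(total_p1, total_p0, round(index, 2))
```

The commodities are quoted in different units (kg, dozen, litre), so changing a unit, for example quoting C per piece instead of per dozen, would change the index. This is the first demerit listed above.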

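2. Simple average of relatives method: To construct a price index by this method
we proceed as follows:

i. Obtain the price relative for each commodity:

Price relative for the current year = (price in current year / price in base year) × 100

ii. Calculate the arithmetic mean, geometric mean etc. of
the price relatives obtained in (i) and denote it by I01.

a. When the arithmetic mean is used:

$$I_{01} = \frac{1}{N} \sum \left( \frac{p_1}{p_0} \times 100 \right)$$

b. When the geometric mean is used:

$$I_{01} = \operatorname{antilog} \left[ \frac{1}{N} \sum \log \left( \frac{p_1}{p_0} \times 100 \right) \right]$$

where N is the number of commodities.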
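Both averages of price relatives can be sketched in Python using the prices from Example 1. The geometric mean is computed through logarithms, as in the antilog form; the variable names are illustrative.

```python
import math

# Prices in Rs. per unit from Example 1 (base year 2000, current year 2004).
p0 = [10, 40, 10, 5]
p1 = [15, 30, 12, 13]

# Step (i): price relative of each commodity.
relatives = [c / b * 100 for c, b in zip(p1, p0)]
n = len(relatives)

# Step (ii)(a): arithmetic mean of the price relatives.
index_am = sum(relatives) / n

# Step (ii)(b): geometric mean, i.e. antilog of the mean of the logs.
index_gm = math.exp(sum(math.log(r) for r in relatives) / n)

print([round(r, 2) for r in relatives], round(index_am, 2), round(index_gm, 2))
```

The arithmetic mean is pulled upward by the large relative of commodity D (260), while the geometric mean is less affected by it.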

Merits / Advantages:

i. It is not affected by the units in which prices are quoted.

ii. It is not affected by absolute values of prices as prices are converted

into price relative.

iii. It gives equal importance to all items and extreme items do not unduly
affect the index number.

iv. The index number calculated by this method satisfies the unit test.

Demerits / Limitations
i. As it is an unweighted average the importance of all items is assumed to
be the same.

ii. The index number constructed by this method does not satisfy all the
criteria laid down for an ideal index.

iii. The index number is unduly influenced by high or low prices when
arithmetic mean is used.

iv. More labour is involved if geometric mean is used.

Weighted Index Numbers

To overcome the weakness of the simple or unweighted method, we weight the
price of each commodity by a suitable factor, often taken as the quantity or
the volume of the commodity sold during the base year. In other words, in this
method appropriate weights are assigned to the various commodities to reflect
their relative importance in the group. The weights can be production figures,
consumption figures or distribution figures. For the construction of a price
index number, quantity weights are used. If w is the weight attached to a
commodity, then the price index is given by

$$P_{01} = \frac{\sum p_1 w}{\sum p_0 w} \times 100$$

Weighted index numbers can be classified into two broad groups:

i. Weighted Aggregative Index Numbers

ii. Weighted Average of Price Relatives

Weighted Aggregative Index Number: In weighted aggregative index numbers,
weights are assigned to the various items and the weighted aggregate of the
prices is obtained. Weights can be assigned in various ways, and the weighted
aggregates can be used in different ways to construct index numbers. Some of
the important methods of constructing weighted aggregative index numbers are
given below:

Laspeyre’s Price Index: Laspeyre’s method is based on fixed weights of the base
year; base year quantities are used as weights. The formula given by Laspeyre
is given below:

Laspeyre’s Price Index:

$$P_{01} = \frac{\sum P_1 Q_0}{\sum P_0 Q_0} \times 100$$

Where P1 = Current year price

P0 = Base year price

Q0 = Base year quantity, used as the weight.

This index number has an upward bias, i.e., when prices increase, there is a
tendency to reduce the consumption of higher priced goods, but the fixed base
year quantities ignore this shift and so overstate the rise in prices. The
index number is very widely used in practical work.

Quantity Index Number using Laspeyre’s formula is

$$Q_{01} = \frac{\sum Q_1 P_0}{\sum Q_0 P_0} \times 100$$

Where Q1 = Current year quantity.

Paasche’s Method: Paasche’s method is based on current year quantities, which
are used as weights. The formula given by Paasche is given below:

Paasche’s Price Index:

$$P_{01} = \frac{\sum P_1 Q_1}{\sum P_0 Q_1} \times 100$$

Where P1 = Current year price

P0 = Base year price

Q1 = Current year quantity, taken as the weight.

This index number has a downward bias. This formula is not used frequently in
practice where the number of commodities is large.
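
The two price indices differ only in the quantities used as weights, which a
small Python sketch makes concrete. The prices are those of Example 1; the
quantities q0 and q1 are hypothetical values invented for this illustration
and do not come from the text.

```python
# Prices from Example 1; quantities are HYPOTHETICAL, for illustration only.
p0 = [10, 40, 10, 5]    # base year prices
p1 = [15, 30, 12, 13]   # current year prices
q0 = [20, 5, 10, 8]     # base year quantities (assumed)
q1 = [16, 6, 10, 5]     # current year quantities (assumed)


def aggregate(prices, quantities):
    """Sum of price * quantity over all commodities."""
    return sum(p * q for p, q in zip(prices, quantities))


# Laspeyre: both aggregates are weighted by base year quantities.
laspeyres = aggregate(p1, q0) / aggregate(p0, q0) * 100

# Paasche: both aggregates are weighted by current year quantities.
paasche = aggregate(p1, q1) / aggregate(p0, q1) * 100

print(f"Laspeyre's price index: {laspeyres:.2f}")
print(f"Paasche's price index:  {paasche:.2f}")
```

With these assumed quantities, consumption shifts away from the commodities
whose prices rose most, so Laspeyre's index comes out above Paasche's, in line
with the upward and downward biases described above.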

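Quantity index number using Paasche’s formula is given by

$$Q_{01} = \frac{\sum Q_1 P_1}{\sum Q_0 P_1} \times 100$$

where Q1 and Q0 are the current year and base year quantities and P1 is the
current year price.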

Dorbish and Bowley’s Method: This method is a combination of Laspeyre’s and
Paasche’s methods. If we find the arithmetic average of Laspeyre’s index and
Paasche’s index, we get the index suggested by Dorbish and Bowley. This
index number takes into account both the base year and the current year
quantities.

Dorbish and Bowley’s Price Index:

$$P_{01} = \frac{1}{2} \left[ \frac{\sum P_1 Q_0}{\sum P_0 Q_0} + \frac{\sum P_1 Q_1}{\sum P_0 Q_1} \right] \times 100$$


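Fisher’s Ideal Index: Fisher’s ideal index is the geometric mean of Laspeyre’s
and Paasche’s indices:

$$P_{01} = \sqrt{ \frac{\sum P_1 Q_0}{\sum P_0 Q_0} \times \frac{\sum P_1 Q_1}{\sum P_0 Q_1} } \times 100$$

Merits:

i. It is free from bias, upward as well as downward.

ii. This formula takes into account both current year and base year
prices and quantities.

iii. It satisfies both the ‘time reversal test’ and the ‘factor reversal test’.
This is why it is called an ideal index number.

Demerits:

i. This formula is difficult to interpret.

ii. It is not a practical index to compute because it is excessively
laborious.

iii. It requires the prices and quantities for both the base year and the
current year.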
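
The two combined indices can be sketched in Python as follows. Fisher's ideal
index is taken here in its standard form, the geometric mean of Laspeyre's and
Paasche's indices. The prices are those of Example 1, and the quantities are
hypothetical values chosen only for illustration.

```python
import math

# Prices from Example 1; quantities are HYPOTHETICAL, for illustration only.
p0 = [10, 40, 10, 5]    # base year prices
p1 = [15, 30, 12, 13]   # current year prices
q0 = [20, 5, 10, 8]     # base year quantities (assumed)
q1 = [16, 6, 10, 5]     # current year quantities (assumed)


def aggregate(prices, quantities):
    """Sum of price * quantity over all commodities."""
    return sum(p * q for p, q in zip(prices, quantities))


laspeyres = aggregate(p1, q0) / aggregate(p0, q0) * 100
paasche = aggregate(p1, q1) / aggregate(p0, q1) * 100

# Dorbish and Bowley: arithmetic mean of the two indices.
dorbish_bowley = (laspeyres + paasche) / 2

# Fisher: geometric mean of the two indices.
fisher = math.sqrt(laspeyres * paasche)

print(f"Dorbish and Bowley: {dorbish_bowley:.2f}")
print(f"Fisher's ideal:     {fisher:.2f}")
```

Both results lie between Laspeyre's and Paasche's values, which is why these
averages are regarded as offsetting the upward and downward biases.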

Quantity Index Numbers:

The quantity index numbers measure the average change in quantities and enable
us to compare changes in the physical quantity of goods produced or sold. These
index numbers can also be simple or weighted. The weights in quantity index
numbers are prices. Therefore quantity index numbers can be easily obtained
from price index numbers just by interchanging p’s and q’s in the above
formulae.

Value Index Numbers:

The value index numbers are very easy to calculate. Value is the product of
price and quantity. A simple value index number is equal to the value of the
current year divided by the value of the base year. If this ratio is multiplied
by 100 we get the value index number. The required formula is:

Value index number:

$$V_{01} = \frac{\sum p_1 q_1}{\sum p_0 q_0} \times 100$$

Simple value index number:

$$V_{01} = \frac{V_1}{V_0} \times 100$$

Where V1 = value of the current year

V0 = value of the base year

Such index numbers are not weighted, because value already combines price and
quantity and neither is taken into account separately. These index numbers are
not very popular because the situation revealed by prices and quantities is not
fully revealed by the values.
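
A value index needs only the total value in each year, as the following sketch
shows. The prices are those of Example 1; the quantities are hypothetical and
introduced only for this illustration.

```python
# Prices from Example 1; quantities are HYPOTHETICAL, for illustration only.
p0 = [10, 40, 10, 5]    # base year prices
p1 = [15, 30, 12, 13]   # current year prices
q0 = [20, 5, 10, 8]     # base year quantities (assumed)
q1 = [16, 6, 10, 5]     # current year quantities (assumed)

# Value = price * quantity, summed over all commodities.
value_base = sum(p * q for p, q in zip(p0, q0))      # V0
value_current = sum(p * q for p, q in zip(p1, q1))   # V1

value_index = value_current / value_base * 100
print(f"V0 = {value_base}, V1 = {value_current}")
print(f"Value index number: {value_index:.2f}")
```

The single figure does not show how much of the change in value comes from
prices and how much from quantities, which is the limitation noted above.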

Tests for Adequacy of Index Number Formulae

1. Unit Test: This test requires that the formula should be independent of the
units in which prices and quantities are quoted. Except for the simple
aggregative index, all the other formulae satisfy this test.

2. Time Reversal Test: This test requires that the formula for calculating the
index number should be such that it gives the same ratio between one period of
comparison and the other, no matter which of the two is taken as the base.
Symbolically, P01 x P10 = 1. This test is satisfied by Fisher’s Ideal Index,
the simple geometric mean of price relatives, the weighted geometric mean of
price relatives and the Marshall-Edgeworth index number.

3. Factor Reversal Test: The formula should permit the interchange of price
and quantity without giving inconsistent results. This test is satisfied by
Fisher’s Ideal Index.

4. Circular Test: It is an extension of the Time Reversal Test. The test
requires that if an index is constructed for the year ‘a’ on base year ‘b’, and
for the year ‘b’ on base year ‘c’, we should get the same result as if we had
calculated directly for the year ‘a’ on base year ‘c’ without going through
‘b’. Symbolically, P01 x P12 x P20 = 1. It is satisfied by index numbers
constructed by aggregative methods with fixed weights.
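
The time reversal and factor reversal tests can be checked numerically. The
sketch below works with indices as plain ratios (without the factor 100) and
states the factor reversal test in its usual form: price index x quantity
index = ∑p1q1 / ∑p0q0. The prices are those of Example 1; the quantities and
all function names are hypothetical and serve only as an illustration.

```python
import math

# Prices from Example 1; quantities are HYPOTHETICAL, for illustration only.
p0, p1 = [10, 40, 10, 5], [15, 30, 12, 13]
q0, q1 = [20, 5, 10, 8], [16, 6, 10, 5]


def agg(a, b):
    """Sum of pairwise products, e.g. sum of price * quantity."""
    return sum(x * y for x, y in zip(a, b))


def laspeyres(p_base, q_base, p_cur, q_cur):
    """Laspeyre's index as a ratio (base-period weights)."""
    return agg(p_cur, q_base) / agg(p_base, q_base)


def fisher(p_base, q_base, p_cur, q_cur):
    """Fisher's index as a ratio: geometric mean of Laspeyre and Paasche."""
    l = agg(p_cur, q_base) / agg(p_base, q_base)
    p = agg(p_cur, q_cur) / agg(p_base, q_cur)
    return math.sqrt(l * p)


# Time reversal: P01 x P10 should equal 1.
fisher_time = fisher(p0, q0, p1, q1) * fisher(p1, q1, p0, q0)
laspeyres_time = laspeyres(p0, q0, p1, q1) * laspeyres(p1, q1, p0, q0)

# Factor reversal: swapping the roles of p and q gives the quantity index;
# price index x quantity index should equal the value ratio.
value_ratio = agg(p1, q1) / agg(p0, q0)
fisher_factor = fisher(p0, q0, p1, q1) * fisher(q0, p0, q1, p1)
laspeyres_factor = laspeyres(p0, q0, p1, q1) * laspeyres(q0, p0, q1, p1)

print("Time reversal   Fisher:", round(fisher_time, 4),
      " Laspeyre:", round(laspeyres_time, 4))
print("Factor reversal Fisher:", round(fisher_factor, 4),
      " Laspeyre:", round(laspeyres_factor, 4),
      " value ratio:", round(value_ratio, 4))
```

For this data Fisher's index meets both tests, while Laspeyre's index meets
neither. One numerical case does not prove the general result, but it shows
what each test is checking.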

Cost of Living Index Numbers or Consumer Price Index

The ‘cost of living index’, also known as the ‘consumer price index’ or ‘cost
of living price index’, is the country’s principal measure of price change. It
measures the average change over time in the prices paid by consumers for a
specific basket of goods and services.

The consumer price index numbers are designed to measure the average change in
the prices paid by the ultimate consumers for specified quantities of goods and
services over a period of time. Different people consume different kinds of
commodities, and the same commodities in different proportions. The consumer
price index helps us in determining the effect of rise and fall in prices on
different classes of consumers living in different areas.

The consumer price index number is significant because demands for higher
wages are based on the cost of living index, and wages and salaries in most
nations are adjusted according to this index number. The cost of living index
does not measure the actual cost of living, nor the fluctuations in the cost of
living due to causes other than changes in the price level; its object is to
find out how much the consumers of a particular class have to pay for a certain
quantity of goods and services.

Utility of Consumer Price Index Numbers:

i. It is useful for measuring changes in the purchasing power of currency,
real income, etc.

ii. It helps the government in formulating wage policy, price policy,
taxation and general economic policies.

iii. Market prices for particular kinds of goods and services are analysed
with the help of the consumer price index.

iv. Salaries and wages are fixed on the basis of the consumer price index, so
it is very helpful in revising wages or dearness allowance.
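
The first use can be illustrated with the usual deflating relations, which
are assumed here since the text does not state them: purchasing power of money
= 100 / index, and real wage = (money wage / index) x 100. All figures below
are hypothetical.

```python
# HYPOTHETICAL figures, for illustration only.
cpi = 125.0           # consumer price index of the current year (base = 100)
money_wage = 5000.0   # wage in current-year rupees

# A rupee now buys this fraction of what it bought in the base year.
purchasing_power = 100 / cpi

# The money wage expressed in base-year prices (the deflated wage).
real_wage = money_wage / cpi * 100

print(f"Purchasing power of the rupee: {purchasing_power:.2f}")
print(f"Real wage in base-year prices: Rs. {real_wage:.0f}")
```

With an index of 125, a wage of Rs. 5000 buys only what Rs. 4000 bought in the
base year, which is the kind of comparison used when wages or dearness
allowance are revised.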
Assumptions: The cost of living index number is based on some assumptions,
which are as follows:

i. Similar needs: The needs of the people for whom this index number
is constructed are the same.

ii. Same goods: The goods consumed in the base year and the current
year are unchanged.

iii. No change in quantity of goods: It is also assumed that the quantity
of goods consumed remains the same in the base year and the current year.

iv. Price quotations are same: It is also assumed that the prices at
different places are the same and that they do not change frequently.

v. True on the average: Cost of living index numbers are true on the
average.

vi. Representative goods: The commodities included in the cost of
living index number represent the consumption of the class of people.

The steps in the construction of cost of living index numbers:

i. Selection of the class of people

ii. Scope of the index

iii. Conducting family budget inquiry

iv. Obtaining price quotations

v. Preparing a frame or list of persons

Methods of Constructing Consumer Price Index

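There are three methods for constructing the consumer price index number:

i. Aggregate expenditure method: based on Laspeyre's method, i.e., base year
quantities are taken as weights (w = Q0):

$$\text{Consumer Price Index} = \frac{\sum P_1 Q_0}{\sum P_0 Q_0} \times 100$$

ii. Family budget method (or the method of weighted relatives), where the
weights are the values (P0Q0) in the base year, often denoted by V. With
P = (P1 / P0) × 100 as the price relative:

$$\text{Consumer Price Index} = \frac{\sum P V}{\sum V}$$

This gives the same result as (i).

iii. Weighted average of price relatives: Let I = group index and W = group
weight. Then

$$\text{Consumer Price Index} = \frac{\sum I W}{\sum W}$$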
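
The three methods can be compared in a short Python sketch. The prices are
those of Example 1. The base year quantities, the group names, the group
indices and the group weights are all hypothetical, and the formulas are the
standard ones for each method, assumed here for illustration.

```python
import math

# Prices from Example 1; base year quantities are HYPOTHETICAL.
p0 = [10, 40, 10, 5]    # base year prices
p1 = [15, 30, 12, 13]   # current year prices
q0 = [20, 5, 10, 8]     # base year quantities (assumed)

# (i) Aggregate expenditure method: base year quantities as weights.
cpi_aggregate = (sum(p * q for p, q in zip(p1, q0))
                 / sum(p * q for p, q in zip(p0, q0)) * 100)

# (ii) Family budget method: price relatives weighted by base year values.
relatives = [b / a * 100 for a, b in zip(p0, p1)]    # P = p1 / p0 * 100
values = [a * q for a, q in zip(p0, q0)]             # V = p0 * q0
cpi_family = sum(r * v for r, v in zip(relatives, values)) / sum(values)

# (iii) Weighted average of group indices (hypothetical groups and weights).
groups = {"Food": (130, 50), "Clothing": (110, 20),
          "Fuel": (120, 10), "Rent": (105, 20)}      # name: (I, W)
cpi_groups = (sum(i * w for i, w in groups.values())
              / sum(w for _, w in groups.values()))

print(f"Aggregate expenditure method: {cpi_aggregate:.2f}")
print(f"Family budget method:         {cpi_family:.2f}")
print(f"Weighted group indices:       {cpi_groups:.2f}")
```

Methods (i) and (ii) agree because each term P x V equals
(p1 / p0) x 100 x p0 x q0 = 100 x p1 x q0, so the two formulas are
algebraically the same.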



Learning Objective 5

Learn about Limitations of Index Numbers

Limitations of Index Numbers

There is no doubt that the technique of index numbers is a very useful tool.
However, there are certain limitations of index numbers which should be borne
in mind.

The chief limitations are:

• Not perfect – They are approximate values.

• Difficulties in the construction of index numbers – Due to selection of base year,
items, changes in habits and selection of average.
• Sampling errors
• Index numbers can also be manipulated
• Limited application – An index number constructed for one purpose cannot be
used for other purposes.
• Lack of adequate and accurate data

Learning Objective 6

Understand the Utility and Importance of Index Numbers

The primary purpose of index numbers is to measure relative temporal or cross-sectional
changes in a variable or a group of related variables which are not capable of being
directly measured. The greatest purpose of index numbers has been to measure and
compare the changes in prices and purchasing power of money which have received great
attention of economists for many years.

Nowadays, index numbers are not used for measuring price changes alone.
Factors like wages, employment, production, trade, demand, supply, business
conditions, industrial activity, financial problems etc. are also studied
through this statistical device. Just as a barometer measures the pressure of
the atmosphere or of gases, index numbers measure the pressure of economic
behaviour, and thus index numbers are called economic barometers.

In brief following are the main uses of index numbers:

• Comparative Study
• Simplifies data
• Provide guidelines to economic policy and in formulating decisions
• Measures purchasing power of money
• Change in cost of living
• National income
• Used as a control by government
• Reveal trends and tendencies
• Useful in deflating
• Universal utility


In this unit we studied the concept of index numbers and the classification of
index numbers into different types. The different index number formulae
available, and the utility and importance of index numbers, were explained in a
simple way.