

To all PGDM


Ritesh Singhal
{M.Sc.(Maths), MIT, M.Phil.}

 The systematic and scientific treatment of
quantitative measurement is precisely
known as statistics.
 Statistics may be called as science of
 Statistics is concerned with the collection,
classification (or organization),
presentation and analysis of data which
are measurable in numerical terms.

Stages of Statistical Investigation
Collection of Data

Organization of data

Presentation of data

Analysis of data

Interpretation of Results

 Statistics is divided into two major parts: descriptive and
inferential statistics.
 Descriptive statistics is a set of methods used to describe
the data we have collected, i.e. to summarize the data.
 Inferential statistics is a set of methods used to make a
generalization, estimate, prediction or decision. It is used
when we want to draw conclusions about a distribution.

Collection of Data
 Data can be collected in two ways:
>>> Primary Data Collection
This is data collected by a particular person
or organization for their own use.
>>> Secondary Data Collection
This is data collected by some other person
or organization, which the investigator then
obtains for their own use.

Methods of Primary data collection
 Direct personal interview
 Data through questionnaire

 Indirect investigation


Methods of Secondary data collection
 Data collected through newspapers &
 Data collected from research papers.
 Data collected from government
 Data collected from various NGO, UN,
 Other published resources

Classification of data
 Classification is a process of
arranging data into sequences and
groups according to their common
characteristics or separating them
into different but related parts.
 It is a process of arranging data into
various homogeneous classes and
subclasses according to some
common characteristics.

Presentation of Data
 Data should be presented in such a
manner that it is easily understood
and conclusions can be drawn promptly
from it, e.g.
>>> Histogram
>>> Frequency polygon & curve
>>> Pie Chart
>>> Ogives
>>> Pictogram & Cartogram
>>> Bar Chart

 Discrete Variable
e.g. number of books, tables, chairs
 Continuous Variable
e.g. height, weight
 Quantitative Variable
One that can be measured on a scale
 Qualitative Variable
One that cannot be measured on a scale

Frequency Distribution
The observations can be recorded in three ways:
1. Individual Series
Data are recorded for each individual member.
2. Discrete Series
Here the variable takes values separated by
intervals (jumps).
3. Continuous Series
Here the variable may take any value,
integer or fraction.

Functions & Uses of Statistics
 It simplifies complex data
 It provides techniques for comparison
 It studies relationship
 It helps in formulating policies
 It helps in forecasting
 It is helpful for common man
 Statistical methods combined with the speed of
computers can work wonders, e.g. SPSS, STATA

Scope of Statistics
 In Business Decision Making
 In Medical Sciences
 In Actuarial Science
 In Economic Planning
 In Agricultural Sciences
 In Banking & Insurance
 In Politics & Social Science

Distrust & Misuse of Statistics
 Statistics is like clay, of which one
can make a God or a Devil.
 Statistics are liars of the first order.
 Statistics can prove or disprove

Measure of Central Tendency
It is a single value that represents the entire
mass of data. Generally, such values lie in the
central part of the distribution.
It facilitates comparison & decision-making.
There are mainly three types of measure:
1. Arithmetic mean
2. Median
3. Mode

Arithmetic Mean
This single representative value can be
determined by:
A.M. = Sum of observations / No. of observations
1. The sum of the deviations from the AM is always
zero.
2. If every value of the variable is increased or
decreased by a constant, the AM also increases
or decreases by the same constant.

Arithmetic Mean (contd..)

3. If every value of the variable is
multiplied or divided by a constant,
the AM also changes in the same ratio.
4. The sum of squares of deviations
from the AM is minimum.
5. The combined AM of two or more
related groups is defined as
Combined AM = (n1·X̄1 + n2·X̄2) / (n1 + n2)
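
The properties of the arithmetic mean listed above can be checked numerically. The sketch below is only an illustration: the data values are made up, and the combined mean uses the usual size-weighted formula, which is assumed here rather than taken from the slide.

```python
from statistics import mean

data = [4, 8, 15, 16, 23, 42]        # illustrative values, not from the slides
am = mean(data)                       # A.M. = sum / number of observations

# Property 1: the deviations from the AM sum to zero.
dev_sum = sum(x - am for x in data)

# Property 2: adding a constant shifts the AM by that constant.
shifted_am = mean(x + 10 for x in data)

# Property 3: multiplying by a constant scales the AM by that constant.
scaled_am = mean(x * 3 for x in data)

# Property 4: the sum of squared deviations is smallest about the AM.
def sum_sq(center):
    return sum((x - center) ** 2 for x in data)

# Property 5: combined AM of two groups, weighted by group size (assumed formula).
group_a, group_b = [10, 20, 30], [40, 50]
combined_am = (len(group_a) * mean(group_a) + len(group_b) * mean(group_b)) / (
    len(group_a) + len(group_b)
)

print(am, dev_sum, shifted_am, scaled_am, combined_am)
```

The combined value agrees with the mean of the pooled observations, which is the point of property 5.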

 The median is the value of the
variable that divides the group into
two equal parts, one part comprising
all values greater than the median and
the other all values less than it.
 Determination of Median
>>> Arrange the data in ascending or descending order
>>> Find the size of the (N+1)/2 th item.
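
A minimal sketch of the two steps above. The function name `median_by_rank` is hypothetical, and averaging the two middle items when (N+1)/2 is not a whole number is an assumed convention that the slide does not state.

```python
from statistics import median

def median_by_rank(values):
    """Median as the size of the (N+1)/2 th item of the ordered data."""
    ordered = sorted(values)              # step 1: arrange the data
    n = len(ordered)
    position = (n + 1) / 2                # step 2: 1-based position of the median
    lower = ordered[int(position) - 1]
    if position.is_integer():             # odd N: a single middle item
        return lower
    upper = ordered[int(position)]        # even N: average the two middle items
    return (lower + upper) / 2

print(median_by_rank([7, 1, 5, 3, 9]))    # odd number of items
print(median_by_rank([7, 1, 5, 3]))       # even number of items
```

For both inputs the result matches `statistics.median` from the Python standard library.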

 Mode is the value that occurs most
often in the series.
 It is the value around which the
items tend to be most heavily
concentrated.
 It is an important average when we talk
about “most common size of shoe or

Relationship among Mean, Median
& Mode
 For a symmetric distribution:
Mode = Median = Mean
 The empirical relationship between mean,
median and mode for asymmetric
distribution is:
Mode = 3 Median – 2 Mean
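
The symmetric case can be illustrated with a small made-up data set, and the empirical relationship can be written as a one-line helper. The name `empirical_mode` is hypothetical; the relation is only an approximation for moderately asymmetric data.

```python
from statistics import mean, median, mode

# A symmetric, made-up data set: mean, median and mode coincide.
symmetric = [1, 2, 2, 3, 3, 3, 4, 4, 5]
sym_mean, sym_median, sym_mode = mean(symmetric), median(symmetric), mode(symmetric)

def empirical_mode(median_value, mean_value):
    """Estimate the mode as 3 * Median - 2 * Mean (empirical relationship)."""
    return 3 * median_value - 2 * mean_value

print(sym_mean, sym_median, sym_mode)
print(empirical_mode(sym_median, sym_mean))
```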

Mode: Peak of the curve.
Median: Divides the curve into two equal parts.
Mean: Center of gravity of the curve.
 For a positively skewed distribution:
Mean > Median > Mode
 For a negatively skewed distribution:
Mean < Median < Mode


Dispersion or Variation
 An average alone does not give a
full picture of the distribution, so a
further measure is needed to describe
how the data are spread.
 The extent or degree to which data
tend to spread around an average is
called dispersion or variation.

 For judging the reliability of averages.
 Comparison of distributions
 Useful for controlling variability
 Useful in further analysis

Measure of Dispersion
 Range
 Inter quartile Range
 Mean Deviation
 Standard Deviation

 Range is the difference between the
largest and the smallest observation.
Range = L - S
 It is easy to calculate and gives a
quick picture of the variation in the data.
 It is a crude measure & is not based on all
the observations.
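
A short sketch of why the range is a crude measure: the two made-up data sets below have the same range but a different spread. The population standard deviation (`statistics.pstdev`) is used for comparison only; its formula is not given on the slide.

```python
from statistics import pstdev

def value_range(values):
    """Range = L - S, largest minus smallest observation."""
    return max(values) - min(values)

# Illustrative data: identical extremes, different interior values.
clustered = [10, 50, 50, 50, 90]
spread_out = [10, 20, 50, 80, 90]

print(value_range(clustered), value_range(spread_out))   # same range
print(pstdev(clustered), pstdev(spread_out))             # different spread
```

Because the range looks only at the two extreme observations, it cannot tell these two data sets apart, whereas a measure based on all observations can.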

Correlation Analysis
Correlation denotes the degree of
interdependence between variables or
the tendency of simultaneous variation
between variables.
Types of Correlation:
1. Positive & Negative

2. Linear & Non-linear

3. Multiple & Partial

Positive & Negative Correlation
 Positive Correlation
>>> Income vs Expenditure
>>> Agricultural Production vs Rainfall
>>> Sales vs Advertising Expenditure
>>> Cost of raw material vs Cost of industrial production
 Negative Correlation
>>> Price vs Consumption
>>> Day temperature vs Sale of woolen clothes

Measure of Correlation
 Scatter Diagram Method
 Karl Pearson’s Coefficient of Correlation
 Spearman’s Coefficient of Rank Correlation
 Concurrent Deviation Method

Scatter Diagram Method
 It is a graphical method to find the
correlation between variables.
 Here the pairs of observations are
plotted as points on a 2-D plane.
 The pattern formed by these points
gives an idea of the relationship
between the variables.

Karl Pearson’s coefficient of
correlation (r)
 The value of r lies between -1 and
+1, i.e. -1 ≤ r ≤ +1.
 The coefficient of correlation is
independent of change of origin and
scale.
 The coefficient ‘r’ is symmetric: rxy = ryx.
 The probable error of ‘r’ is used to
interpret its estimated value.
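
The listed properties of r can be checked on a small example. This sketch assumes the usual product-moment formula, r = Σ(x−x̄)(y−ȳ) / √[Σ(x−x̄)² · Σ(y−ȳ)²], which is not written on the slide; the data and the name `pearson_r` are illustrative.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Product-moment correlation coefficient (assumed standard formula)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

xs = [1, 2, 3, 4, 5]          # illustrative data, not from the slides
ys = [2, 4, 5, 4, 5]
r = pearson_r(xs, ys)

# Change of origin and (positive) scale: u = (x - 3) / 2, v = 10y + 7.
us = [(x - 3) / 2 for x in xs]
vs = [10 * y + 7 for y in ys]

print(r, pearson_r(ys, xs), pearson_r(us, vs))
```

All three printed values agree up to rounding, illustrating symmetry and independence of origin and scale. A negative scale factor on one variable would flip the sign of r.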
Spearman’s Coefficient of Rank Correlation
 Karl Pearson’s method measures the
relationship between quantitative
variables, whereas Spearman’s
coefficient is suitable for qualitative
variables, such as the ranks given to
participants in a contest by two
judges, where we want to measure the
relationship between the ranks given by
the two judges.
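
A minimal sketch of the two-judges situation described above. It assumes the usual formula for untied ranks, 1 − 6Σd² / [n(n² − 1)], which does not appear on the slide; the ranks and the name `spearman_rho` are made up for illustration.

```python
def spearman_rho(ranks_a, ranks_b):
    """Rank correlation for untied ranks: 1 - 6*sum(d^2) / (n*(n^2 - 1))."""
    n = len(ranks_a)
    d_squared = sum((a - b) ** 2 for a, b in zip(ranks_a, ranks_b))
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))

# Hypothetical ranks given to five contestants by two judges.
judge_1 = [1, 2, 3, 4, 5]
judge_2 = [2, 1, 4, 3, 5]

print(spearman_rho(judge_1, judge_2))          # fairly close agreement
print(spearman_rho(judge_1, judge_1))          # identical rankings
print(spearman_rho(judge_1, judge_1[::-1]))    # completely reversed rankings
```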
Concurrent Deviation Method
 This is the simplest method in which
only the direction of change is taken
into consideration rather than
magnitude of variation.
 It gives a general idea about the
correlation between variables quickly.

Regression Analysis
 It is concerned with the formulation
and determination of an algebraic
expression for the relationship
between variables.
 For this purpose we use regression
lines.
 These regression lines are used for
predicting the value of one variable
from that of the other.

Regression Analysis (contd..)
 Here the variable whose value is to be
predicted is called the dependent
(explained) variable, and the variable
used for prediction is called the independent
(explanatory) variable.
 This method was first introduced by Sir
Francis Galton.
 It helps in prediction & estimation.

Properties of Regression Lines &
Coefficient
 The regression line of Y on X is used to
estimate the best value of Y (dependent)
for a given value of X (independent).
 The regression line of X on Y is used to
estimate the best value of X (dependent)
for a given value of Y (independent).
 Both the regression coefficients are
independent of change of origin but
not of scale.

Properties of Regression Lines &
Coefficient (contd..)

 The relation between r, byx and bxy is
r = ±√(byx · bxy)
 Both regression coefficients must
have the same sign, which is also the sign of r.
 Both regression coefficients cannot
exceed one in magnitude simultaneously.
 A regression coefficient denotes a rate
of change, i.e. byx measures the change
in Y for a unit change in X.

Properties of Regression Lines &
Coefficient (contd..)

 Both lines intersect at the point of means (X̄, Ȳ).
 If r = 0, the two lines are perpendicular to
each other.
 If the regression lines are identical,
the correlation between the variables
is perfect (r = ±1).

Standard Error of Estimate
 It provides a measure of the scatter of
the observations about the regression
line. The standard error of estimate of
Y on X is:
SY.X = √[Σ(Y − Yest)² / N]
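
The regression properties and the standard error of estimate can be worked through on a small made-up data set. The least-squares expressions byx = Σ(x−x̄)(y−ȳ)/Σ(x−x̄)² and bxy = Σ(x−x̄)(y−ȳ)/Σ(y−ȳ)² are the usual ones and are assumed here; they are not written on the slides.

```python
from math import sqrt

xs = [1, 2, 3, 4, 5]          # illustrative data, not from the slides
ys = [2, 4, 5, 4, 5]
n = len(xs)
mean_x, mean_y = sum(xs) / n, sum(ys) / n

sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
sxx = sum((x - mean_x) ** 2 for x in xs)
syy = sum((y - mean_y) ** 2 for y in ys)

byx = sxy / sxx               # regression coefficient of Y on X
bxy = sxy / syy               # regression coefficient of X on Y

# r = ±sqrt(byx * bxy), taking the common sign of the two coefficients.
r = sqrt(byx * bxy) * (1 if byx >= 0 else -1)

def y_on_x(x):
    """Estimate Y from X; the line passes through (mean_x, mean_y)."""
    return mean_y + byx * (x - mean_x)

def x_on_y(y):
    """Estimate X from Y; the line passes through (mean_x, mean_y)."""
    return mean_x + bxy * (y - mean_y)

# Standard error of estimate of Y on X: sqrt(sum((Y - Yest)^2) / N).
se_yx = sqrt(sum((y - y_on_x(x)) ** 2 for x, y in zip(xs, ys)) / n)

print(byx, bxy, r, se_yx)
```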

 Probability is a concept which
numerically measures the degree of
uncertainty or certainty of the
occurrence of any event, i.e. the
chance of occurrence of any event.
 The probability of an event A is
P(A) = No. of favorable cases / Total no. of cases

 If P(A)=0, Impossible Event
 If P(A)=1, Sure Event
 0≤P(A)≤1
 P(A)= Probability of occurrence
 P(Ā)= Probability of Non-occurrence
 P(A) + P(Ā) = 1

Some Keywords
 Equally Likely Events: Events whose
chance of occurrence is the same
in an experiment.
 Mutually Exclusive Events: Events such that the
occurrence of any one of them
prevents the occurrence of the others in
the same experiment.
 Sample Space: The set of all possible
outcomes of an experiment.

Some Keywords (contd..)
 Independent Events: Two or more
events are independent if the
occurrence of one does not affect the
occurrence of the others.
 Dependent Events: Events are dependent
if the occurrence of one event influences the
occurrence of the other.

Classical or A Priori Probability
 If a trial results in ‘n’ exhaustive,
mutually exclusive and equally likely
cases and ‘m’ of them are favorable
to the happening of an event E, then
the probability ‘P’ of the happening of E is
given by:
P(E) = m / n

Empirical or A Posteriori Probability
 The classical definition requires that ‘n’ is
finite and that all cases are equally
likely.
 This condition is very restrictive and
cannot cover all situations.
 Under the empirical definition these
conditions need not hold.

Fundamental rule of counting
 If an event can occur in ‘m’ ways and,
following it, a second event can occur
in ‘n’ ways, then the two events in
succession can occur in m × n ways.
 E.g. a tricolor can be formed out of 6
colors in 6 × 5 × 4 = 120 ways.
 The number of 3-letter words with no repeated
letter from the 26-letter alphabet is 26 × 25 × 24 = 15600.
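
Both counts above can be confirmed by brute-force enumeration. The color names are placeholders chosen for illustration; only the fact that there are six distinct colors matters.

```python
from itertools import permutations
from string import ascii_lowercase

# Six distinct colors (names are hypothetical placeholders).
colors = ["red", "green", "blue", "white", "yellow", "orange"]

# Ordered choices of 3 different colors: 6 x 5 x 4.
tricolors = sum(1 for _ in permutations(colors, 3))

# Ordered choices of 3 different letters: 26 x 25 x 24.
words = sum(1 for _ in permutations(ascii_lowercase, 3))

print(tricolors, words)
```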

 The different arrangements that can be
made out of a given number of things by
taking some or all at a time are called
permutations.
P(n,r) = n! / (n-r)!
 E.g. permutations made with the letters
a, b, c by taking two at a time:
ab, ba, ac, ca, bc, cb

 The combination of ‘n’ different
objects taken ‘r’ at a time is a
selection of ‘r’ out of ‘n’ objects with
no attention given to the order of
arrangement.
C(n,r) = n! / [r! (n-r)!]
 E.g. from 5 boys & 6 girls, a group of 3
having 2 boys & 1 girl can be formed
in C(5,2) × C(6,1) = 60 ways.
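
The permutation and combination formulas can be checked with Python's standard library (`math.perm` and `math.comb` require Python 3.8 or later). This is a small sketch that reproduces the two examples above.

```python
from itertools import permutations
from math import comb, perm

# P(3, 2): arrangements of the letters a, b, c taken two at a time.
arrangements = ["".join(p) for p in permutations("abc", 2)]
print(arrangements, perm(3, 2))

# Group of 3 with 2 boys out of 5 and 1 girl out of 6.
groups = comb(5, 2) * comb(6, 1)
print(groups)
```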

 A coin is tossed three times. Find the
probability of getting:
i) Exactly one head
ii) Exactly two heads
iii) One or two heads
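
One way to work this exercise is to list the sample space and apply the classical definition, favorable cases divided by total cases. The sketch assumes a fair coin, so that all eight outcomes are equally likely.

```python
from fractions import Fraction
from itertools import product

outcomes = list(product("HT", repeat=3))      # 8 equally likely outcomes

def prob(event):
    """Classical probability: favorable cases / total cases."""
    favorable = sum(1 for o in outcomes if event(o))
    return Fraction(favorable, len(outcomes))

p_one_head = prob(lambda o: o.count("H") == 1)
p_two_heads = prob(lambda o: o.count("H") == 2)
p_one_or_two = prob(lambda o: o.count("H") in (1, 2))

print(p_one_head, p_two_heads, p_one_or_two)
```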

 One card is drawn at random from a pack
of 52 cards. Find the probability that the
i) Drawn card is red
ii) Drawn card is an ace
iii) Drawn card is red and a king
iv) Drawn card is red or a king
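
The card exercise can be handled the same way, by building the pack and counting favorable cards. The sketch assumes a standard pack in which hearts and diamonds are the red suits.

```python
from fractions import Fraction
from itertools import product

ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
suits = ["hearts", "diamonds", "clubs", "spades"]
deck = list(product(ranks, suits))            # 52 (rank, suit) pairs
red_suits = {"hearts", "diamonds"}

def prob(event):
    """Classical probability over the 52 equally likely cards."""
    return Fraction(sum(1 for card in deck if event(card)), len(deck))

p_red = prob(lambda c: c[1] in red_suits)
p_ace = prob(lambda c: c[0] == "A")
p_red_and_king = prob(lambda c: c[1] in red_suits and c[0] == "K")
p_red_or_king = prob(lambda c: c[1] in red_suits or c[0] == "K")

print(p_red, p_ace, p_red_and_king, p_red_or_king)
```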

 A bag contains 3 red, 6 white and 7
blue balls. Two balls are drawn at
random. Find the probability that
i) Both balls are white.
ii) Both balls are blue.
iii) One ball is red & the other is white.
iv) One ball is white & the other is blue.
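
A sketch of this exercise using combinations. It assumes the two balls are drawn together, that is, without replacement, which the problem statement does not say explicitly.

```python
from fractions import Fraction
from math import comb

red, white, blue = 3, 6, 7
total_pairs = comb(red + white + blue, 2)     # ways to pick any 2 of 16 balls

p_both_white = Fraction(comb(white, 2), total_pairs)
p_both_blue = Fraction(comb(blue, 2), total_pairs)
p_red_and_white = Fraction(red * white, total_pairs)
p_white_and_blue = Fraction(white * blue, total_pairs)

print(p_both_white, p_both_blue, p_red_and_white, p_white_and_blue)
```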

Addition Theorem
 For any two events A and B, the
probability of the occurrence of A or
B is given by:
P(A∪B) = P(A) + P(B) – P(A∩B)
 If A & B are mutually exclusive, then
P(A∪B) = P(A) + P(B)

Multiplication or Conditional Probability
 The probability of an event B when it
is known that the event A has
already occurred:
P(B/A) = P(A∩B) / P(A), if P(A) > 0
i.e. P(A∩B) = P(A) · P(B/A)
 If A and B are independent events:
P(A∩B) = P(A) · P(B)
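
The addition theorem and the conditional probability rule can be verified by enumeration. The example of a single fair die, and the events chosen on it, are assumptions made purely for illustration.

```python
from fractions import Fraction

sample_space = set(range(1, 7))               # one roll of a fair die (assumed)
A = {2, 4, 6}                                 # even number
B = {4, 5, 6}                                 # number greater than 3
C = {1, 2}                                    # number at most 2

def p(event):
    return Fraction(len(event), len(sample_space))

# Addition theorem: P(A or B) = P(A) + P(B) - P(A and B).
p_union_direct = p(A | B)
p_union_formula = p(A) + p(B) - p(A & B)

# Conditional probability: P(B/A) = P(A and B) / P(A).
p_b_given_a = p(A & B) / p(A)

# A and C are independent here: P(A and C) = P(A) * P(C).
independent = p(A & C) == p(A) * p(C)

print(p_union_direct, p_union_formula, p_b_given_a, independent)
```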

 A bag contains 25 balls numbered from 1
to 25. Two balls are drawn at random from
the bag with replacement. Find the
probability of selecting:
i) Both odd numbers.
ii) One odd & one even.
iii) At least one odd.
iv) No odd numbers.
v) Both even numbers.
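
Because the balls are drawn with replacement, the two draws are independent and the multiplication rule applies directly. The sketch reads "one odd & one even" as allowing either order, which is an interpretation rather than something the exercise states.

```python
from fractions import Fraction

p_odd = Fraction(13, 25)      # 13 odd numbers among 1..25
p_even = Fraction(12, 25)     # 12 even numbers among 1..25

p_both_odd = p_odd * p_odd
p_one_each = 2 * p_odd * p_even       # odd then even, or even then odd
p_no_odd = p_even * p_even            # same event as "both even"
p_at_least_one_odd = 1 - p_no_odd     # complement of "no odd"

print(p_both_odd, p_one_each, p_at_least_one_odd, p_no_odd)
```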

 Five men in a company of 20 are
graduates. If 3 men are picked
at random, what is the probability that
they are all graduates? What is the
probability that at least one is a graduate?

 The probability that A hits a target is
1/3 and the probability that B hits the
target is 2/5. What is the probability
that the target will be hit if both
A and B shoot at the target?
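
Both this exercise and the graduates exercise before it are easiest through the complement, P(at least one) = 1 − P(none). The sketch assumes the three men are picked without replacement and that A's and B's shots are independent; neither assumption is stated in the exercises.

```python
from fractions import Fraction
from math import comb

# Graduates: 5 graduates among 20 men, 3 picked at random.
p_all_graduates = Fraction(comb(5, 3), comb(20, 3))
p_no_graduate = Fraction(comb(15, 3), comb(20, 3))
p_at_least_one_graduate = 1 - p_no_graduate

# Target: hit unless both A and B miss (independence assumed).
p_a, p_b = Fraction(1, 3), Fraction(2, 5)
p_target_hit = 1 - (1 - p_a) * (1 - p_b)

print(p_all_graduates, p_at_least_one_graduate, p_target_hit)
```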

Expected Value of Probability
 Let X be a random variable with the
following distribution:
X    : x1     x2     x3    ...
P(X) : P(x1)  P(x2)  P(x3) ...
 The expected value is given by:
E(X) = Σ xi · P(xi)

 A player tosses two coins. If two
heads show, he wins Rs. 4; if one
head shows, he wins Rs. 2; but if two
tails show, he pays Rs. 3 as a penalty.
Calculate the expected value of the
game to him.
 Solution:
E(X) = (-3)(1/4) + (2)(1/2) + (4)(1/4) = 1.25

 An insurance company sells a
particular life insurance policy with a
face value of Rs. 1000 and a yearly
premium of Rs. 20. If 0.2% of the
policyholders can be expected to die
in the course of a year, what would
be the company’s expected earnings
per policyholder per year?
 Solution:
E(X) = (-980)(0.002) + (20)(0.998) = 18
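
The coin game and the insurance calculation are both instances of E(X) = Σ xi · P(xi). A small sketch with exact fractions reproduces the two results; the helper name `expected_value` is hypothetical.

```python
from fractions import Fraction

def expected_value(distribution):
    """E(X) = sum of x * P(x) over a {value: probability} mapping."""
    return sum(x * p for x, p in distribution.items())

# Coin game: two tails -> -3, one head -> +2, two heads -> +4 (in Rs.).
game = {-3: Fraction(1, 4), 2: Fraction(1, 2), 4: Fraction(1, 4)}

# Insurance: lose 1000 - 20 = 980 with probability 0.2%, else earn 20 (in Rs.).
insurance = {-980: Fraction(2, 1000), 20: Fraction(998, 1000)}

print(float(expected_value(game)), float(expected_value(insurance)))
```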

Theoretical Probability Distribution
