
Facts are stubborn, but statistics are more pliable.

-Mark Twain
IIM TRICHY, BATCH OF 2012-2014
RECAP
We learned
Types of data
Descriptive statistics
Probability
What we will see today
Probability distributions
Discrete
Continuous
Sampling and Estimation




BINOMIAL DISTRIBUTION
Binomial distribution
Experiment has n identical trials
Each trial has only two possibilities denoted as
success or failure
Each trial is independent of the previous one
Probabilities of success and failure remain the same
throughout the experiment
If sampling is done without replacement, sample size n < 5% of the population size N
Examples
Quality control
BINOMIAL DISTRIBUTION
Probability function for binomial distribution

$$P(x) = {}^{n}C_{x}\, p^{x} q^{n-x} = \frac{n!}{x!\,(n-x)!}\, p^{x} q^{n-x}$$

where,
n = number of trials (or sample size)
x = number of successes desired
p = probability of getting a success in a trial
q = 1 - p = probability of getting a failure in a trial
Mean = $\mu = n p$
Variance = $\sigma^{2} = n p q$
POISSON DISTRIBUTION
Poisson distribution
Describes rare events
Each occurrence is independent of the other
occurrences
Describes discrete events over a continuum
No limit to the number of occurrences in an interval: x takes values in [0, ∞)
Expected number of occurrences must hold constant
throughout the experiment. Mean is constant!
Examples
Number of phone calls/min in a business
Number of HIV patients/day in a hospital

POISSON DISTRIBUTION
Poisson formula

$$P(x) = \frac{\lambda^{x} e^{-\lambda}}{x!}$$

where,
x = 0, 1, 2, 3, ...
$\lambda$ = expected value (long-run average), which is a constant!
Interesting fact!
Expected value and variance of the Poisson
distribution are the same: both equal $\lambda$
HYPERGEOMETRIC DISTRIBUTION

Hypergeometric distribution
Each outcome is a success or failure
Sampling is done without replacement
Population, N is finite
Number of successes in the population, A, is known
Sample size n ≥ 5% of N
Example
All binomial experiments without replacement

HYPERGEOMETRIC DISTRIBUTION
Hypergeometric formula

$$P(x) = \frac{{}^{A}C_{x} \cdot {}^{N-A}C_{n-x}}{{}^{N}C_{n}}$$

where,
N = size of the population
n = sample size
A = number of successes in the population
x = number of successes in the sample, sampling
is done without replacement
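The hypergeometric formula can be checked numerically with a minimal Python sketch that uses only the standard library. The numbers below (a lot of N = 20 items containing A = 6 successes, sample of n = 5) are hypothetical and chosen only for illustration, and `hypergeom_pmf` is an illustrative helper name, not something from the slides.

```python
from math import comb

def hypergeom_pmf(x, N, A, n):
    """P(x) = C(A, x) * C(N - A, n - x) / C(N, n)."""
    return comb(A, x) * comb(N - A, n - x) / comb(N, n)

# Hypothetical numbers, for illustration only.
N, A, n = 20, 6, 5

# Probability of exactly 2 successes in the sample.
p_two = hypergeom_pmf(2, N, A, n)

# The probabilities over all feasible x should add up to 1.
total = sum(hypergeom_pmf(x, N, A, n) for x in range(min(A, n) + 1))

print(f"P(x = 2) = {p_two:.4f}")
print(f"Sum over all x = {total:.4f}")
```

Because n = 5 is 25% of N = 20 here, the "n ≥ 5% of N" condition holds and the binomial would be a poor substitute.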
EXAMPLES
An MBA graduate is applying for nine jobs and
believes that in each of the nine cases she has
a constant and independent probability of
0.48 of getting an offer.
What is the probability that she will have at least
three offers?
If she wants to be 95% confident of having at least
three offers, how many more jobs should she
apply for?
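One way to work this example is sketched below in Python, assuming the offers follow a binomial distribution with p = 0.48 as the problem states. The helper names `binom_pmf` and `prob_at_least` are illustrative. The second part is solved by simply increasing n until the target probability is reached.

```python
from math import comb

def binom_pmf(x, n, p):
    """P(x) = C(n, x) * p**x * q**(n - x), with q = 1 - p."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

def prob_at_least(k, n, p):
    """P(X >= k), computed as the complement 1 - P(X <= k - 1)."""
    return 1 - sum(binom_pmf(x, n, p) for x in range(k))

p_offer = 0.48

# Part 1: at least three offers out of nine applications.
p_nine = prob_at_least(3, 9, p_offer)

# Part 2: smallest number of applications with P(X >= 3) >= 0.95.
n_needed = 9
while prob_at_least(3, n_needed, p_offer) < 0.95:
    n_needed += 1

print(f"P(at least 3 offers | n = 9) = {p_nine:.4f}")
print(f"Applications needed: {n_needed} ({n_needed - 9} more)")
```

With nine applications the probability comes out at roughly 0.889, so she falls short of 95% and needs a couple more applications.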
EXAMPLES
The customers at HDFC bank arrive randomly on a
Monday afternoon at an average of 16 customers every 20
minutes. What is the probability of having 3 or fewer
customers in a 4-minute interval on a Monday afternoon?
Probability of having more than seven customers in a
4-minute interval is _____
According to the Census of India, approximately 5% of all
citizens of the state of Andhra Pradesh do not have a
shelter. In conducting a random survey in the state of
Andhra Pradesh, what is the probability of getting two or
fewer people not having shelter in a sample of 20?
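A hedged sketch of these examples in Python follows. It assumes the bank arrivals are Poisson (so the rate of 16 per 20 minutes rescales to 3.2 per 4 minutes) and treats the shelter survey as binomial with n = 20 and p = 0.05. The helper names are illustrative.

```python
from math import comb, exp, factorial

def poisson_pmf(x, lam):
    """P(x) = lam**x * e**(-lam) / x!"""
    return lam**x * exp(-lam) / factorial(x)

def binom_pmf(x, n, p):
    """P(x) = C(n, x) * p**x * (1 - p)**(n - x)."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

# Bank arrivals: rescale the mean to the 4-minute interval of interest.
lam = 16 * 4 / 20  # 3.2 customers per 4 minutes

p_three_or_fewer = sum(poisson_pmf(x, lam) for x in range(4))
p_more_than_seven = 1 - sum(poisson_pmf(x, lam) for x in range(8))

# Shelter survey: two or fewer "successes" in a sample of 20.
p_shelter = sum(binom_pmf(x, 20, 0.05) for x in range(3))

print(f"P(X <= 3 in 4 min) = {p_three_or_fewer:.4f}")
print(f"P(X > 7 in 4 min)  = {p_more_than_seven:.4f}")
print(f"P(X <= 2 of 20)    = {p_shelter:.4f}")
```

The key step in the Poisson part is matching the interval: the mean must be expressed per 4 minutes before the formula is applied.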
CONTINUOUS PROBABILITY DISTRIBUTIONS
Uniform distribution
Relatively simple distribution
Area determines the probability
Example
Variation in weights produced (A to B)
Probability density function

$$f(x) = \frac{1}{b - a}, \qquad a \le x \le b$$

$$\mu = \frac{a + b}{2}, \qquad \sigma = \frac{b - a}{\sqrt{12}}$$
NORMAL DISTRIBUTION
Normal distribution
Uni-modal distribution
Symmetrical about its mean
Asymptotic to the horizontal axis
Area under curve is 1
Probability density function

$$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2} z^{2}}, \qquad z = \frac{x - \mu}{\sigma}$$
NORMAL DISTRIBUTION
Standardized Normal Distribution
Calculating density values for every combination of mean and
standard deviation is tedious
We have the savior, z-score!
Z-score is the number of std. deviations that x is away
from the mean on the bell curve
Observe that the z-distribution is also a normal distribution,
with mean 0 and standard deviation 1!
Normal curve to approximate binomial distribution
problems
When the sample size is large, the binomial distribution
approaches the normal distribution

EXPONENTIAL DISTRIBUTION
Exponential distribution
Skewed to the right
Random value it takes can be from 0 to ∞
Starts at x = 0
Exponentially decreases
Probability density function

$$f(x) = \lambda e^{-\lambda x}, \qquad x \ge 0,\ \lambda > 0$$

Facts about exponential distribution

$$P(x \ge x_0) = e^{-\lambda x_0}, \qquad x_0 \ge 0$$

$$\text{Mean} = \mu = \frac{1}{\lambda}$$
EXAMPLES
The time between customer arrivals at a bank
has an exponential distribution with a mean
time between arrivals of three minutes. If a
customer just arrived, what is the probability
that another customer will not arrive for at
least two minutes?
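A minimal Python sketch of this example, assuming the exponential model stated in the problem. A mean of three minutes between arrivals gives a rate of 1/3 per minute, and the tail probability e^(-rate * x0) answers the question directly.

```python
from math import exp

mean_gap = 3.0        # mean time between arrivals, in minutes
lam = 1 / mean_gap    # rate parameter: mean = 1 / lam
x0 = 2.0              # waiting time of interest, in minutes

# P(x >= x0) = e^(-lam * x0)
p_wait = exp(-lam * x0)

print(f"P(no arrival for at least {x0:g} min) = {p_wait:.4f}")
```

The result is roughly 0.51, so there is about an even chance that the next customer takes two minutes or longer to arrive.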
EXAMPLES
According to a survey done by the R&D wing of IRDA, the average
annual cost for automobile insurance in India is Rs 7200. If the
automobile insurance costs are uniformly distributed and the
minimum cost is Rs 4400, what is the probability that a person's
automobile insurance cost might range between Rs 6000 and Rs
8000? Also, calculate the standard deviation of the uniform
distribution.
The Graduate Record Examination (GRE), produced by the ETS, is
widely used by graduate schools in the United States as an entrance
requirement. As a large number of students take the
examination, the scores are nearly normally distributed. In
2010, the average GRE score was 1050 and the standard
deviation was 100. What is the probability that a randomly
selected score is between the mean and 1250? What is the
probability of getting a score over 1350? What is the probability
of getting a score between 850 and 1250?
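Both examples can be sketched in Python with the standard library. For the uniform case, the upper limit b is not given, so it is recovered from the mean, (a + b) / 2 = 7200. For the GRE case, `statistics.NormalDist` stands in for the z-table. Variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

# Uniform distribution: recover b from mean = (a + b) / 2.
a, mean = 4400.0, 7200.0
b = 2 * mean - a

# Probability is the area of the rectangle between 6000 and 8000.
p_uniform = (8000 - 6000) / (b - a)
sd_uniform = (b - a) / sqrt(12)

# Normal distribution: GRE scores with mean 1050 and sd 100.
gre = NormalDist(mu=1050, sigma=100)
p_mean_to_1250 = gre.cdf(1250) - gre.cdf(1050)
p_over_1350 = 1 - gre.cdf(1350)
p_850_to_1250 = gre.cdf(1250) - gre.cdf(850)

print(f"b = {b:.0f}, P(6000 <= x <= 8000) = {p_uniform:.4f}")
print(f"Uniform standard deviation = {sd_uniform:.2f}")
print(f"P(1050 < x < 1250) = {p_mean_to_1250:.4f}")
print(f"P(x > 1350) = {p_over_1350:.5f}")
print(f"P(850 < x < 1250) = {p_850_to_1250:.4f}")
```

The three GRE cut-offs sit at z = 2, z = 3, and z = -2 to 2, which is why the answers match the familiar two-sigma and three-sigma areas.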
EXAMPLES
A chip manufacturing firm, as a part of its quality control,
established that defective chips occur in a pattern that is
Poisson distributed, with an average of 5 defective units
every hour during production runs. Determine the
probability that less than 15 minutes elapse between any
two defects.
NASA tests the bolts that go into orbit vehicles very
rigorously. The quality control team tests samples, each of
size 160, and the probability of success is 0.3. If a random
sample is chosen, what is the probability that 60 or more
items pass the quality control test?
Correction for continuity!
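A hedged Python sketch of both problems. The first assumes that Poisson-distributed defects imply exponentially distributed gaps between defects, with the same rate. The second uses the normal approximation to the binomial, with the continuity correction turning "60 or more" into "x ≥ 59.5".

```python
from math import exp, sqrt
from statistics import NormalDist

# Chip defects: 5 per hour, so gaps are exponential with lam = 5 per hour.
lam_per_hour = 5
quarter_hour = 15 / 60
p_gap_under_15 = 1 - exp(-lam_per_hour * quarter_hour)

# Bolts: binomial with n = 160, p = 0.3, approximated by a normal curve.
n, p = 160, 0.3
mu = n * p
sigma = sqrt(n * p * (1 - p))

# Continuity correction: P(X >= 60) is approximated by P(x >= 59.5).
z = (59.5 - mu) / sigma
p_60_or_more = 1 - NormalDist().cdf(z)

print(f"P(gap < 15 min) = {p_gap_under_15:.4f}")
print(f"mu = {mu:.1f}, sigma = {sigma:.4f}, z = {z:.3f}")
print(f"P(X >= 60) ~ {p_60_or_more:.4f}")
```

Without the correction the cut-off would be 60 rather than 59.5, which slightly understates the probability of a discrete count reaching 60.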

SAMPLING
Why sampling?
Types of sampling
Random
Simple Random sampling
Stratified Random sampling
Non-Random
Convenience sampling
Judgment sampling
Snowball sampling
SAMPLING
Sampling distribution of $\bar{x}$
Central limit theorem

If samples of size n are drawn randomly from a population, the
sample means are approximately normally distributed for sufficiently
large sample sizes ($n \ge 30$) regardless of the shape of the population
distribution.

$$\mu_{\bar{x}} = \mu, \qquad \sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$$
EXAMPLE
Suppose during any hour in Reliance fresh
store, the average number of customers is
130, with a standard deviation of 10
customers. What is the probability that a
random sample of 64 different shopping hours
will yield a sample mean between 125 and
135 shoppers?
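A minimal Python sketch of this example, assuming the central limit theorem applies (n = 64 is well above 30). The sample mean is treated as normal with the population mean and a standard error of sigma / sqrt(n).

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 130, 10, 64

# Standard error of the sample mean: sigma / sqrt(n).
se = sigma / sqrt(n)

# Sampling distribution of the mean under the central limit theorem.
xbar_dist = NormalDist(mu, se)
p_between = xbar_dist.cdf(135) - xbar_dist.cdf(125)

print(f"Standard error = {se:.2f}")
print(f"P(125 <= sample mean <= 135) = {p_between:.5f}")
```

The limits 125 and 135 are four standard errors either side of 130, so the probability is very close to 1.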

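ESTIMATION
How to estimate the population mean from
the sample when population standard
deviation is known?

$$z = \frac{\bar{x} - \mu}{\left( \dfrac{\sigma}{\sqrt{n}} \right)}$$

Confidence interval

$$\bar{x} - z_{\alpha/2} \frac{\sigma}{\sqrt{n}} \le \mu \le \bar{x} + z_{\alpha/2} \frac{\sigma}{\sqrt{n}}$$

and

$$P\left[ \bar{x} - z_{\alpha/2} \frac{\sigma}{\sqrt{n}} \le \mu \le \bar{x} + z_{\alpha/2} \frac{\sigma}{\sqrt{n}} \right] = 1 - \alpha$$

$\alpha$ = level of significance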
EXAMPLE
A survey was taken of all European firms that do
business with India. A random sample of 50
responses to the question "Approximately how
many years has your company been doing
business with India?" yielded a mean of 7.56
years. Suppose the population standard deviation
for this question is 5 years. Construct a 90%
confidence interval for the average number of
years that an EU company has been doing business
with India for the whole population.
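A sketch of this confidence interval in Python, assuming the z-based interval for a known population standard deviation. `NormalDist().inv_cdf` replaces the z-table lookup. For 90% confidence, alpha = 0.10 and the critical value is taken at 1 - alpha/2 = 0.95.

```python
from math import sqrt
from statistics import NormalDist

n, xbar, sigma = 50, 7.56, 5
confidence = 0.90
alpha = 1 - confidence

# Critical value z_{alpha/2}: upper-tail area alpha/2 under the z curve.
z = NormalDist().inv_cdf(1 - alpha / 2)

# Margin of error: z_{alpha/2} * sigma / sqrt(n).
margin = z * sigma / sqrt(n)
lower, upper = xbar - margin, xbar + margin

print(f"z = {z:.3f}, margin of error = {margin:.3f}")
print(f"90% confidence interval: {lower:.2f} to {upper:.2f} years")
```

The interval works out to roughly 6.40 to 8.72 years.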
ESTIMATION
How to estimate the population mean from
the sample when population standard
deviation is not known?



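t statistic

$$t = \frac{\bar{x} - \mu}{\left( \dfrac{s}{\sqrt{n}} \right)}, \qquad df = n - 1$$

$$P\left[ \bar{x} - t_{\alpha/2,\,n-1} \frac{s}{\sqrt{n}} \le \mu \le \bar{x} + t_{\alpha/2,\,n-1} \frac{s}{\sqrt{n}} \right] = 1 - \alpha$$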
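The slides give no worked example for the t interval, so the sketch below uses a hypothetical sample of 10 values, invented purely for illustration. Python's standard library has no t quantile function, so the critical value t(0.025, 9) = 2.262 is hard-coded from a standard t-table. With SciPy installed, `scipy.stats.t.ppf(0.975, 9)` would be one way to obtain it instead.

```python
from math import sqrt
from statistics import mean, stdev

# Hypothetical sample, for illustration only (not from the slides).
sample = [12, 15, 11, 14, 13, 16, 12, 14, 15, 13]

n = len(sample)
df = n - 1
xbar = mean(sample)
s = stdev(sample)  # sample standard deviation, divisor n - 1

# Table value t_{alpha/2, n-1} for 95% confidence and 9 degrees of freedom.
t_crit = 2.262

margin = t_crit * s / sqrt(n)
lower, upper = xbar - margin, xbar + margin

print(f"n = {n}, df = {df}, mean = {xbar}, s = {s:.4f}")
print(f"95% confidence interval: {lower:.3f} to {upper:.3f}")
```

The only differences from the z interval are the use of s in place of the population standard deviation and a critical value that depends on the degrees of freedom.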
ESTIMATION
Degrees of freedom
Difference between number of independent
observations for a source of variation and number
of independent parameters estimated in
computing variation
Use the t statistic if the sample size is less than 30
If $n \ge 30$, the t distribution is closely approximated by
the normal distribution

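BESSEL'S CORRECTION
Biased estimator
E(estimator) ≠ population parameter
Variance calculated about the sample mean is
never greater than variance calculated about the
population mean, so dividing by n underestimates
the population variance
Bessel's correction divides by n - 1 instead of n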
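The bias can be shown exactly on a small case. The Python sketch below uses a hypothetical four-value population, chosen only for illustration. It enumerates every possible sample of size 2 drawn with replacement and averages the sample variance under both divisors. Exact fractions are used so that no rounding hides the result.

```python
from fractions import Fraction
from itertools import product

# Hypothetical population, for illustration only.
population = [1, 2, 3, 4]
N = len(population)
mu = Fraction(sum(population), N)
pop_var = sum((x - mu) ** 2 for x in population) / N

def sample_var(sample, ddof):
    """Variance about the sample mean, with divisor n - ddof."""
    n = len(sample)
    m = Fraction(sum(sample), n)
    return sum((x - m) ** 2 for x in sample) / (n - ddof)

# Every equally likely sample of size 2, drawn with replacement.
samples = list(product(population, repeat=2))

mean_biased = sum(sample_var(s, 0) for s in samples) / len(samples)
mean_unbiased = sum(sample_var(s, 1) for s in samples) / len(samples)

print(f"Population variance:          {pop_var}")
print(f"Average with divisor n:       {mean_biased}")
print(f"Average with divisor n - 1:   {mean_unbiased}")
```

With divisor n the average sample variance is only half the population variance for n = 2, while divisor n - 1 recovers it exactly.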


THANK YOU
