
SVKM’s Narsee Monjee Institute of Management Studies

Mukesh Patel School of Technology Management & Engineering

Unit I & IV (partial)


Unit I: Basic Probability (12 Hrs.)
Probability spaces, conditional probability, independence; Discrete random variables,
Independent random variables; The multinomial distribution, Poisson approximation to the
binomial distribution, Infinite sequences of Bernoulli trials; sums of independent random
variables, Expectation of Discrete Random Variables, Raw and Central Moments; Variance of a
sum; Correlation coefficient of discrete random variables; Chebyshev's Inequality: Statement and
examples.
Unit-IV: Basic Statistics (05 Hrs.)
Measures of Central tendency: Moments, skewness, Kurtosis. Moments, skewness and
Kurtosis for Binomial distribution & Poisson distribution
Probability spaces
Consider an experiment whose outcome is not predictable with certainty. However, although the
outcome of the experiment will not be known in advance, let us suppose that the set of all possible
outcomes is known. This set of all possible outcomes of an experiment is known as the SAMPLE
SPACE of the experiment and it is denoted by S.
Example: If the experiment consists of flipping two coins, then the sample space consists of the
following four points S = {HH, HT, TH, TT }
Each outcome in a sample space is called a sample point. The number of sample points in a sample space S is n(S) = n^k, where n = number of possible outcomes per object and k = number of objects (e.g. two coins, each with two outcomes, give 2^2 = 4 sample points).
Probability:
If an experiment results in ‘n’ exhaustive, mutually exclusive and equally likely cases and ‘m’ of them are favorable to the happening of an event ‘A’, then the probability of happening of A is

P(A) = m/n

Since the number of cases in which the event A will not happen is n − m, the probability that event A will not happen is

P(A') = (n − m)/n

Therefore P(A) + P(A') = 1
Axioms of probability:
Consider an experiment whose sample space is S. For each event E of the sample space S,

1. 0 ≤ P(E) ≤ 1
2. P(S) = 1
3. For any sequence of mutually exclusive events E1, E2, …: P(E1 ∪ E2 ∪ …) = P(E1) + P(E2) + …
Laws of probability:
1. Addition theorem:
P(A ∪ B) = P(A) + P(B) − P(A ∩ B)
If A and B are mutually exclusive events, i.e. disjoint sets, then: P(A ∪ B) = P(A) + P(B)
2. Addition theorem (for three events): for any three events A, B and C,
P(A ∪ B ∪ C) = P(A) + P(B) + P(C) − P(A ∩ B) − P(A ∩ C) − P(B ∩ C) + P(A ∩ B ∩ C)
Complementary Event: P(A') = 1 − P(A)
Conditional Probability and Independence
If A and B are two events in a sample space S, then the probability of the event A when the event B has already occurred is called the conditional probability of A given B. It is denoted by P(A|B) and defined as

P(A|B) = P(A ∩ B) / P(B), provided P(B) > 0

The probability P(A|B) is an updating of P(A) based on the knowledge that event B has already occurred.
Multiplication law of probability: P(A ∩ B) = P(A|B)·P(B) = P(B|A)·P(A)
Independent events:
A set of events is said to be independent if the occurrence of any one of them does not depend on the occurrence or non-occurrence of the others. If two events A and B are independent, then:
P(A ∩ B) = P(A)·P(B)
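As a minimal sketch of this product rule (the coin-toss events here are our own illustration, not from the notes), independence can be checked by enumerating a small sample space in Python:

```python
from itertools import product

# Sample space of two fair coin tosses, each sample point equally likely
S = list(product("HT", repeat=2))
P = lambda event: sum(1 for s in S if event(s)) / len(S)

A = lambda s: s[0] == "H"          # first toss shows a head
B = lambda s: s[1] == "H"          # second toss shows a head

print(P(lambda s: A(s) and B(s)))  # P(A ∩ B) = 0.25
print(P(A) * P(B))                 # P(A)·P(B) = 0.25, so A and B are independent
```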
Theorem of total probability:
If B1, B2, …, Bn is a set of exhaustive and mutually exclusive events and A is another event associated with the Bi, then

P(A) = Σ (i = 1 to n) P(Bi)·P(A|Bi)
Bayes' theorem:
If E1, E2, E3, …, En are mutually exclusive and exhaustive events with P(Ei) ≠ 0 for i = 1 to n of a random experiment, then for any arbitrary event ‘A’ of the sample space of the above experiment with P(A) > 0, we have

P(Ei|A) = P(Ei)·P(A|Ei) / Σ (j = 1 to n) P(Ej)·P(A|Ej)
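A small numerical sketch of Bayes' theorem (the screening-test numbers below are illustrative assumptions, not from the notes):

```python
# Bayes' theorem: P(D | +) for a hypothetical screening test
p_d = 0.01          # P(D): prevalence of the condition
p_pos_d = 0.95      # P(+ | D): sensitivity
p_pos_nd = 0.02     # P(+ | not D): false-positive rate

# The denominator is the theorem of total probability
p_pos = p_pos_d * p_d + p_pos_nd * (1 - p_d)
print(p_pos_d * p_d / p_pos)   # ≈ 0.324, despite the 95% sensitivity
```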
Random Variable
A random variable is a function that assigns a real number to every element of a sample space. Let S be the sample space of an experiment. Here we assign a specific number to each outcome of the sample space. A random variable X is a function from S to the set of real numbers R, i.e. X: S → R.
Ex. Suppose a coin is tossed twice, S = {HH, HT, TH, TT}. Let X represent the number of heads on the top face. To each sample point we can associate a number: X(HH) = 2, X(HT) = 1, X(TH) = 1, X(TT) = 0. Thus X is a random variable with range space RX = {0, 1, 2}.
Types of random variable:
Discrete Random Variable: A random variable which takes a finite or countably infinite number of values is called a discrete random variable.
Example: The number of alpha particles emitted by a radioactive source.
Continuous Random Variable: A random variable which takes an uncountably infinite number of values is called a continuous random variable. Example: the length of time for which a vacuum tube installed in a circuit functions is a continuous RV.
Discrete Probability Distribution
Suppose a discrete variate X is the outcome of some experiment. If the probability that X takes the value xi is pi, then

P(X = xi) = pi or p(xi) for i = 1, 2, …, n

where
a. p(xi) ≥ 0
b. Σ p(xi) = 1
The set of values xi together with their probabilities pi, i.e. the pairs (xi, pi), constitutes the discrete probability distribution of the discrete variate X. The function p is called the probability mass function (pmf) or, loosely, the probability density function (pdf).
Cumulative Distribution Function (CDF) or distribution function of a discrete random variable X is defined by F(x) = P(X ≤ x), where x is a real number (−∞ < x < ∞):

F(u) = Σ (x ≤ u) p(x)

Expectation
If an experiment is conducted repeatedly a large number of times under essentially homogeneous conditions, then the average of the actual outcomes, i.e. the mean value of the probability distribution of the random variable, is the expected value. Let X be a discrete random variable with PMF p(x) (or PDF f(x) in the continuous case); then its mathematical expectation is denoted by E(X) and is defined as
μ = E(X) = Σ x·p(x)
E(X²) = Σ x²·p(x)

Properties:
1. E(a) = a, if a is a constant
2. E(aX + b) = aE(X) + b
3. E(X + Y) = E(X) + E(Y)
4. E(XY) = E(X)·E(Y), if X and Y are independent r.v.s
Variance: Variance of a r.v. X is defined as

σ² = Σ (xi − μ)²·p(xi)

Also: Var(X) = E(X²) − [E(X)]²


Properties:
1. Var(a) = 0
2. Var(aX + b) = a²·Var(X)
3. Var(X) ≥ 0

Standard Deviation: S.D. = σ = √Var(X)
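A minimal sketch computing these quantities for a small discrete distribution (the pmf shown is the number-of-heads example from earlier; everything else is plain Python):

```python
import math

x = [0, 1, 2]                 # values of X (number of heads in two fair tosses)
p = [0.25, 0.50, 0.25]        # their probabilities; must sum to 1

mean = sum(xi * pi for xi, pi in zip(x, p))        # E(X) = 1.0
ex2 = sum(xi**2 * pi for xi, pi in zip(x, p))      # E(X²) = 1.5
var = ex2 - mean**2                                # Var(X) = 0.5
print(mean, var, math.sqrt(var))                   # S.D. ≈ 0.707
```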


Moments
1. The r-th moment of a r.v. X about any point X = A is given by μr (about X = A) = E[(X − A)^r].
2. Ordinary moments / raw moments (moments about the origin):
The r-th raw moment of a r.v. X (i.e. about A = 0) is given by

μ'r (about origin) = E(X^r)

r = 0 ⇒ μ'0 = 1
r = 1 ⇒ μ'1 = E(X) = Mean

3. Central moments (moments about the mean):
The r-th central moment of a r.v. X (about A = x̄ = E(X)) is given by

μr (about mean) = E[(X − x̄)^r]

r = 0 ⇒ μ0 = E(1) = 1
r = 1 ⇒ μ1 = E(X − x̄) = E(X) − x̄ = 0
r = 2 ⇒ μ2 = E[(X − x̄)²] = Var(X)
4. Central Moments in terms of Raw moments



0 = 1, 1 = 0
Var ( X ) =  2 =  2' − 1'2
3 = 3' − 31'  2' + 21'3
 4 =  4' − 41' 3' + 61'2  2' − 31'4

Moment Generating Function:

Suppose X is a random variable, discrete or continuous. The moment generating function (mgf or MGF) is defined and denoted by:

M_X(t) = E(e^(tX)) = Σ (i = 1 to n) e^(t·xi)·f(xi)  (for a discrete variable)

M_X(t) = ∫ (−∞ to ∞) e^(tx)·f(x) dx  (for a continuous variable)

Also, by Taylor series, M_X(t) = 1 + t·μ'1 + (t²/2!)·μ'2 + … + (t^r/r!)·μ'r + …

Remark: If the mgf exists for a random variable X, we can obtain all the moments of X from it. Put plainly, it is one function that generates all the moments of X.

Result: Suppose X is a random variable (discrete or continuous) with moment generating function M_X(t); then the r-th raw moment is given by

μ'r = [d^r M_X(t) / dt^r] evaluated at t = 0
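As a quick sanity check of this result (a sketch assuming SymPy is available; the Bernoulli(p) choice is ours), differentiating the MGF at t = 0 recovers the raw moments:

```python
import sympy as sp

t, p = sp.symbols('t p')
M = (1 - p) + p * sp.exp(t)                 # MGF of a Bernoulli(p) variable
mu1 = sp.diff(M, t).subs(t, 0)              # first raw moment: p
mu2 = sp.diff(M, t, 2).subs(t, 0)           # second raw moment: p
print(mu1, mu2, sp.simplify(mu2 - mu1**2))  # variance p − p² = p(1 − p)
```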

Skewness
Skewness, which means lack of symmetry, is the property of a random variable or its distribution
by which we get an idea about the shape of the probability curve of the distribution. If the
probability curve is not symmetrical but has a longer tail on one side than on the other, the
distribution is said to be skewed. If a distribution is skewed, then the averages mean, median
and mode will take different values and the quartiles will not be equidistant from the median.
The measure of skewness in common use is the third order central moment (μ3).
The moment coefficient of skewness is defined as γ1 = μ3 / μ2^(3/2).
Kurtosis
Even if we know the measures of central tendency, dispersion and skewness of a random variable (or its distribution), we cannot get a complete idea about the distribution. In order to analyze the distribution completely, another characteristic, kurtosis, is also required. Kurtosis refers to the convexity of the probability curve of the distribution. Using the coefficient of kurtosis, we can get an idea about the flatness or peakedness of the probability curve near its top.
The measure of kurtosis used is based on the fourth order central moment (μ4).
The coefficient of kurtosis is defined as β2 = μ4 / μ2².
Note: 1. A curve which is neither flat nor peaked is called a mesokurtic curve, for which β2 = 3.
2. A curve which is flatter than curve 1 is called a platykurtic curve, for which β2 < 3.
3. A curve which is more peaked than curve 1 is called a leptokurtic curve, for which β2 > 3.
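A numerical sketch of both coefficients (assuming NumPy; the Poisson(4) sample is our own choice, whose theoretical values γ1 = 0.5 and β2 = 3.25 follow from the Poisson results later in these notes):

```python
import numpy as np

# Empirical moment coefficients from a large Poisson(4) sample
x = np.random.default_rng(0).poisson(lam=4, size=200_000)
mu = x.mean()
m2, m3, m4 = (np.mean((x - mu) ** r) for r in (2, 3, 4))

print(m3 / m2 ** 1.5)  # γ1 = μ3 / μ2^(3/2) ≈ 0.5
print(m4 / m2 ** 2)    # β2 = μ4 / μ2²      ≈ 3.25
```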

Covariance:
If X and Y are two random variables, then the covariance between them is defined as
Cov(X, Y) = E[(X − E(X))·(Y − E(Y))] = E(XY) − E(X)·E(Y)
Properties:
1. If X and Y are independent random variables, then Cov(X, Y) = 0 (the converse need not hold).
2. Cov(aX + bY, cX + dY) = ac·σX² + bd·σY² + (ad + bc)·Cov(X, Y)
3. Cov(aX, bY) = ab·Cov(X, Y)
4. Var(aX ± bY) = a²·Var(X) + b²·Var(Y) ± 2ab·Cov(X, Y)
5. Var(aX ± bY) = a²·Var(X) + b²·Var(Y), if X and Y are independent random variables
Correlation Coefficient of X and Y:
The term correlation refers to the degree of relationship between two or more variables. If a
change in one variable effects a change in the other variable, the variables are said to be
correlated. Let X and Y be any two discrete random variables with standard deviations σX and
σY, respectively. The correlation coefficient of X and Y, denoted Corr(X,Y) is defined as:
ρXY = Corr(X, Y) = r(X, Y) = Cov(X, Y) / (σX·σY)

The correlation coefficient is bounded by − 1 ≤ ρ ≤ 1. It will have value ρ = 0 when the covariance
is zero and value ρ = ±1 when X and Y are perfectly correlated or anti-correlated.
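A minimal sketch of the definition (assuming NumPy; the constructed pair with Corr(X, Y) ≈ 0.6 is our own illustration):

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.normal(size=10_000)
y = 0.6 * x + 0.8 * rng.normal(size=10_000)     # built so that Corr(X, Y) ≈ 0.6

cov = np.mean((x - x.mean()) * (y - y.mean()))  # Cov(X, Y)
rho = cov / (x.std() * y.std())                 # the definition above
print(rho, np.corrcoef(x, y)[0, 1])             # both ≈ 0.6
```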
Chebyshev's inequality

If X is a RV with mean μ and variance σ², then for any positive number k,

P(|X − μ| ≥ kσ) ≤ 1/k²

or P(|X − μ| < kσ) ≥ 1 − 1/k²
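A quick empirical check of the inequality (assuming NumPy; the exponential sample is an arbitrary choice):

```python
import numpy as np

rng = np.random.default_rng(2)
x = rng.exponential(scale=1.0, size=100_000)   # mean ≈ 1, s.d. ≈ 1
mu, sigma, k = x.mean(), x.std(), 2.0

tail = np.mean(np.abs(x - mu) >= k * sigma)    # empirical P(|X − μ| ≥ kσ)
print(tail, 1 / k**2)                          # ≈ 0.05 ≤ 0.25, as guaranteed
```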
Bernoulli Experiment
A Bernoulli Experiment is a random experiment, the outcome of which can be classified in
exactly one of two mutually exclusive and exhaustive ways, say, success or failure (e.g. female
or male, non-defective or defective).
Bernoulli Random Variable:
Suppose that a trial, or an experiment, whose outcome can be classified as either a success or a
failure is performed. If we let X = 1 when the outcome is a success and X = 0 when it is a failure,
then the probability mass function of X is given by
p(0) = P{X = 0} = 1 − p
p(1) = P{X = 1} = p
where p, 0 ≤ p ≤ 1, is the probability that the trial is a success.
A random variable X is said to be a Bernoulli random variable (after the Swiss mathematician
James Bernoulli) if its probability mass function is given by above equations for some p ∈(0, 1).
Sequence of Bernoulli trials
A sequence of n trials is said to be a sequence of n Bernoulli trials if:
1. the trials are independent;
2. each trial results in exactly one of the two outcomes, success or failure;
3. the probability of success in each trial is p, 0 < p < 1.
Binomial Distribution
Suppose X denotes the number of successes in a sequence of n Bernoulli trials and let the
probability of success in each trial be p. Then X is said to follow a Binomial distribution with
parameters n and p if the probability distribution of X is given by
P(X = x) = px = nCx·p^x·q^(n−x), x = 0, 1, 2, …, n (where q = 1 − p); we write X ~ B(n, p).

Example:
Suppose there are 2000 computer chips in a batch and there is a 2% probability that any one chip
is faulty. Then the number of faulty computer chips in the batch follows a Binomial distribution
with parameters n=2000 and p=2%.
Mean and variance of the Binomial distribution:
E(X) = np
E(X²) = n(n − 1)p² + np
Var(X) = E(X²) − [E(X)]² = n(n − 1)p² + np − n²p² = npq

Recurrence relation for central moments of the Binomial distribution: μ(r+1) = pq·[ n·r·μ(r−1) + dμr/dp ]

Moment generating function of the Binomial distribution: M_X(t) = (q + p·e^t)^n

Third central moment: μ3 = npq(q − p), so the moment coefficient of skewness is γ1 = (q − p)/√(npq).
Fourth central moment: μ4 = npq[1 + 3pq(n − 2)], so the coefficient of kurtosis is β2 = 3 + (1 − 6pq)/(npq).
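These formulas can be checked numerically (a sketch assuming SciPy is available; n = 20, p = 0.3 are illustrative values):

```python
import math
from scipy.stats import binom

n, p = 20, 0.3
q = 1 - p
mean, var, skew, kurt = binom.stats(n, p, moments='mvsk')
print(mean, var)                          # np = 6.0, npq = 4.2
print(skew, (q - p) / math.sqrt(n*p*q))   # γ1 = μ3 / μ2^(3/2)
# SciPy reports *excess* kurtosis, i.e. β2 − 3
print(kurt + 3, 3 + (1 - 6*p*q) / (n*p*q))
```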

Poisson Distribution
A random variable X that takes on one of the values 0, 1, 2, . . . is said to be a Poisson random
variable with parameter λ if, for some λ > 0,

P(X = x) = e^(−λ)·λ^x / x!,  x = 0, 1, 2, …
Poisson distribution as a limiting case of the Binomial distribution:

The Poisson random variable has a tremendous range of applications in diverse areas because it may be used as an approximation for a binomial random variable with parameters (n, p) when n is large and p is small enough so that np is of moderate size (as the numerical sketch below illustrates).
The Poisson distribution is a limiting case of the binomial distribution under the following conditions:
1. n, the number of trials, is indefinitely large, i.e. n → ∞;
2. p, the constant probability of success in each trial, is indefinitely small, i.e. p → 0;
3. np = λ is finite.
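A numerical sketch of the approximation, reusing the chip-batch example above (assuming SciPy is available):

```python
from scipy.stats import binom, poisson

# The chip batch above: n = 2000, p = 0.02, so λ = np = 40
n, p = 2000, 0.02
lam = n * p
for k in (30, 40, 50):
    print(k, binom.pmf(k, n, p), poisson.pmf(k, lam))  # the two pmfs nearly agree
```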

Moment generating function of the Poisson distribution: M_X(t) = e^(λ(e^t − 1))

Mean of the Poisson distribution: E(X) = λ

Variance of the Poisson distribution: Var(X) = λ

In general, the (k + 1)-th order central moment of the Poisson distribution satisfies the recurrence μ(k+1) = λ·[ k·μ(k−1) + dμk/dλ ]

Third central moment: μ3 = λ, so the moment coefficient of skewness is γ1 = 1/√λ.
Fourth central moment: μ4 = λ(3λ + 1), so the coefficient of kurtosis is β2 = 3 + 1/λ.

Multinomial Distribution
The multinomial distribution is used to find probabilities in experiments where there are more
than two outcomes. The multinomial distribution arises from an extension of the binomial
experiment to situations where each trial has k ≥ 2 possible outcomes.
Suppose E1, E2, …, Ek are k mutually exclusive and exhaustive outcomes of a trial with respective probabilities p1, p2, …, pk. The probability that E1 occurs n1 times, E2 occurs n2 times, …, Ek occurs nk times in n independent observations (n1 + n2 + … + nk = n) is given by:

P(n1, n2, …, nk) = [ n! / (n1!·n2!·…·nk!) ] · p1^(n1) · p2^(n2) · … · pk^(nk)
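A minimal sketch of this pmf (assuming SciPy; the fair-die example is our own illustration):

```python
from scipy.stats import multinomial

# Hypothetical example: 12 rolls of a fair die; probability each face shows twice
p = [1/6] * 6
print(multinomial.pmf([2, 2, 2, 2, 2, 2], n=12, p=p))   # ≈ 0.0034
```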

References for the entire unit

1. T. Veerarajan, Probability, Statistics and Random Processes, Tata McGraw Hill, 3rd edition.
2. S.C. Gupta & V.K. Kapoor, Fundamentals of Mathematical Statistics, Sultan Chand & Sons.
Unit-II
Formula Sheet

CONTINUOUS RANDOM VARIABLES

A random variable which takes an uncountably infinite number of values is called a continuous random variable. Example: the length of time for which a vacuum tube installed in a circuit functions is a continuous RV.
Probability Density Function (PDF)
Suppose X is a continuous RV such that

P(x − dx/2 ≤ X ≤ x + dx/2) = ∫ (x − dx/2 to x + dx/2) f(x) dx

Then f(x) is called a pdf of X, provided f(x) satisfies the following conditions:
i. f(x) ≥ 0
ii. ∫ (−∞ to ∞) f(x) dx = 1

Cumulative Distribution Function (CDF):

The cdf of a continuous random variable X is defined by F(x) = P(X ≤ x), where x is a real number (−∞ < x < ∞), such that

F(x) = ∫ (−∞ to x) f(t) dt

Properties:
i. F(x) is non-decreasing.
ii. F(−∞) = 0
iii. F(∞) = 1
iv. dF(x)/dx = f(x)

Mathematical Expectation:
Using the pdf or the distribution function, we can obtain the average value/mean/expected value of the continuous r.v. X:

E(X) = ∫ (−∞ to ∞) x·f(x) dx

E(X²) = ∫ (−∞ to ∞) x²·f(x) dx

Expectation of a constant is that constant itself, i.e. E(a) = a, if a is a constant.

Variance: Var(X) = E(X²) − [E(X)]²

Standard Deviation: S.D. = σ = √Var(X) (this definition is not required here)


Measure of central tendency
A measure of central tendency is the value of the random variable which is
representative of the entire distribution of the variable.

NORMAL DISTRIBUTION
The normal distribution was introduced by the French mathematician Abraham De Moivre in
1733, who used it to approximate probabilities associated with binomial random variables
when the binomial parameter n is large. It is also known as the Gaussian distribution and the bell curve. Many quantities closely follow a Normal distribution, such as heights of people, sizes of items produced by machines, errors in measurements, blood pressure and marks on a test.
Definition:
A continuous random variable X is said to follow the Normal distribution with parameters mean μ and variance σ² if its probability density function (PDF) is given by:

f(x) = [1/(σ√(2π))]·e^(−(1/2)·((x − μ)/σ)²),  −∞ < x < ∞, −∞ < μ < ∞, σ > 0

If a random variable X follows the Normal distribution with parameters μ and σ², it is written as X ~ N(μ, σ²).
The Normal Distribution has:
1. Mean = median = mode = μ
2. Symmetry about the center
3. 50% of values less than the mean and 50% greater than the mean
4. Var = σ²
5. Moment Generating Function (MGF): M_X(t) = e^(μt + σ²t²/2)
6. Coefficient of skewness γ1 = 0
7. Coefficient of kurtosis β2 = 3 (see the sketch after this list)
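A numerical sketch of these properties (assuming SciPy is available; μ = 10, σ = 2 are illustrative values):

```python
from scipy.stats import norm

mu, sigma = 10, 2
print(norm.cdf(mu, loc=mu, scale=sigma))            # 0.5: half the area lies below μ
print(norm.cdf(mu + sigma, loc=mu, scale=sigma)
      - norm.cdf(mu - sigma, loc=mu, scale=sigma))  # ≈ 0.6827 within one σ of μ
# SciPy reports skewness and *excess* kurtosis (β2 − 3), both 0 here
print(norm.stats(loc=mu, scale=sigma, moments='sk'))
```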

Area property of normal distribution:


The area under the bell-shaped curve of normal distribution denotes probability. The total area
under the curve is equal to one. The normal curve approaches, but never touches, the x-axis.
Standard Normal Variate (SNV):

A standard normal variate is a normal variate with mean μ = 0 and standard deviation σ = 1, with probability density function

φ(z) = [1/√(2π)]·e^(−z²/2),  −∞ < z < ∞

z = (x − μ)/σ is called the Standard Normal Variate.

In general, the central moments of odd order of a normal distribution are zero:
μ(2n+1) = 0,  n = 0, 1, 2, …
The central moments of even order are given by μ(2n) = 1·3·5·…·(2n − 1)·σ^(2n)
Third central moment: μ3 = 0 (skewness γ1 = 0)
Fourth central moment: μ4 = 3σ⁴ (kurtosis β2 = 3)

EXPONENTIAL DISTRIBUTION
The exponential distribution is one of the most widely used continuous distributions. It is often used to model the time elapsed between events. Suppose we are posed with the question: how much time do we need to wait before a given event occurs? The answer to this question can be given in probabilistic terms if we model the given problem using the exponential distribution.
Definition: A continuous random variable X is said to follow an exponential distribution (or negative exponential distribution) with parameter λ > 0 if its pdf is given by:

f(x) = λ·e^(−λx) for x ≥ 0, and f(x) = 0 otherwise
Mean of the exponential distribution: E(X) = 1/λ

Variance of the exponential distribution: Var(X) = 1/λ²

Memoryless property: P(X > s + t | X > s) = P(X > t) for s, t > 0
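A quick check of the memoryless property (assuming SciPy, which parameterises the exponential by scale = 1/λ; the values of λ, s and t are arbitrary):

```python
from scipy.stats import expon

lam, s, t = 0.5, 2.0, 3.0
scale = 1 / lam                       # SciPy uses scale = 1/λ
lhs = expon.sf(s + t, scale=scale) / expon.sf(s, scale=scale)  # P(X > s+t | X > s)
rhs = expon.sf(t, scale=scale)                                 # P(X > t)
print(lhs, rhs)                       # both equal e^(−λt) ≈ 0.2231
```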

GAMMA DISTRIBUTION
A continuous RV X is said to follow the Gamma/Erlang distribution with parameters λ > 0 and k > 0 if its probability density function is given by

f(x) = λ^k·x^(k−1)·e^(−λx) / Γ(k) for x > 0, and f(x) = 0 otherwise

Note: If k =1, the gamma distribution reduces to the exponential distribution.

Mean of the Gamma distribution: E(X) = k/λ

Variance of the Gamma distribution: Var(X) = k/λ²