
Descriptive Statistics

A More Rigorous Discussion


Statistic vs Statistics

• Statistic: any function of sample observations
• Statistics (as a plural noun): numerical data of any kind
• STATISTICS (as a singular noun): a collection of specialized scientific methods for the collection, analysis and interpretation of numerical data

Statistical Structures in Data, PGDBA Programme, ISI, 2022 October 14, 2022
Descriptive Statistics

Descriptive statistics are of two kinds:
• Based on sample moments
• Based on sample quartiles

Sample Moments

• Let X denote the random variable of interest defined over a population of interest
• Suppose X is measured on every member of a random sample of size n drawn from the population
• Let $x_1, x_2, \ldots, x_n$ represent these n observations
• Sample moments are special types of statistics. They are of two kinds:
  • Raw moments
  • Central moments

Raw Moments

• For the given set of observations, the raw moment of order r is defined as
$$m'_r = \frac{1}{n}\sum_{i=1}^{n} x_i^r, \qquad r = 0, 1, 2, \ldots$$
• Note that $m'_0 = 1$ and $m'_1 = \bar{x}$, the sample mean

Central Moments

• For the given set of observations, the central moment of order r is defined as
$$m_r = \frac{1}{n}\sum_{i=1}^{n} (x_i - m'_1)^r, \qquad r = 0, 1, 2, \ldots$$
• Note that $m_1 = 0$ and $m_2 = s^2$, the sample variance

Moment-based Descriptive Measures or Statistics

• Measures of Central Tendency
  • Sample mean: $m'_1 = \bar{x}$
• Measures of Variation or Dispersion
  • Variance: $s^2 = m_2 = \frac{1}{n}\sum_{i=1}^{n} (x_i - m'_1)^2$
  • Standard deviation: $s = \sqrt{m_2}$
  • Coefficient of variation: $cv = \frac{\sqrt{m_2}}{m'_1} = \frac{s}{\bar{x}}$

Moment-based Descriptive Measures or Statistics (contd.)

• Measures of Skewness
  • $b_1 = \frac{m_3^2}{m_2^3}$ and $g_1 = \sqrt{b_1} = \frac{m_3}{m_2^{3/2}}$
• Measures of Kurtosis
  • $b_2 = \frac{m_4}{m_2^2} = \frac{m_4}{s^4}$ and $g_2 = b_2 - 3$

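The moment-based measures can be computed directly from their definitions. The following is a minimal Python sketch, not a reference implementation; the function names and the sample `data` are hypothetical, and the variance uses the divisor n, as in the definition of $m_2$.

```python
import math

def raw_moment(xs, r):
    """Raw sample moment m'_r = (1/n) * sum(x_i ** r)."""
    return sum(x ** r for x in xs) / len(xs)

def central_moment(xs, r):
    """Central sample moment m_r = (1/n) * sum((x_i - mean) ** r)."""
    mean = raw_moment(xs, 1)
    return sum((x - mean) ** r for x in xs) / len(xs)

def moment_summary(xs):
    mean = raw_moment(xs, 1)
    m2 = central_moment(xs, 2)
    m3 = central_moment(xs, 3)
    m4 = central_moment(xs, 4)
    s = math.sqrt(m2)
    return {
        "mean": mean,
        "variance": m2,          # divisor n, as in the definition of m_2
        "sd": s,
        "cv": s / mean,          # undefined when the mean is 0
        "g1": m3 / m2 ** 1.5,    # moment measure of skewness
        "g2": m4 / m2 ** 2 - 3,  # moment measure of (excess) kurtosis
    }

data = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical sample
summary = moment_summary(data)
print(summary)
```

For this sample the mean is 5 and $m_2 = 4$, so $s = 2$ and $cv = 0.4$.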
Quartile-based Descriptive Statistics

• Measure of Central Tendency
  • Median: $Q_2$
• Measures of Dispersion
  • Interquartile Range: $IQR = Q_3 - Q_1$
  • Mean Absolute Deviation about Median: $\frac{1}{n}\sum_{i=1}^{n} |x_i - Q_2|$
• Bowley’s Measure of Skewness
$$Sk = \frac{(Q_3 - Q_2) - (Q_2 - Q_1)}{Q_3 - Q_1}$$

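A short Python sketch of the quartile-based measures follows. It assumes Python 3.8+ for `statistics.quantiles`, and it assumes the "inclusive" quartile convention, which is only one of several in use; other conventions give slightly different quartiles. Bowley's measure is taken here with denominator $Q_3 - Q_1$. The sample is hypothetical.

```python
import statistics

def quartile_summary(xs):
    # With n=4, statistics.quantiles returns the three cut points Q1, Q2, Q3.
    # method="inclusive" is an assumed convention, not the only possible one.
    q1, q2, q3 = statistics.quantiles(xs, n=4, method="inclusive")
    iqr = q3 - q1
    # Mean absolute deviation about the median Q2
    mad = sum(abs(x - q2) for x in xs) / len(xs)
    # Bowley's measure: compares the upper and lower halves of the IQR
    bowley = ((q3 - q2) - (q2 - q1)) / iqr
    return {"Q1": q1, "Q2": q2, "Q3": q3, "IQR": iqr, "MAD": mad, "Bowley": bowley}

data = [1, 2, 2, 3, 4, 6, 9, 12, 20]  # hypothetical, positively skewed sample
result = quartile_summary(data)
print(result)
```

Here $Q_3 - Q_2$ exceeds $Q_2 - Q_1$, so the measure is positive, in line with a positively skewed sample.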
Rationale for Bowley’s Measure

[Figure: relative positions of Q1, Q2 and Q3 for negatively skewed, symmetric (not skewed) and positively skewed distributions]

Other Descriptive Measures

• Central Tendency
  • Mode
• Pearson’s Measures of Skewness
  • First measure: $\frac{\text{Mean} - \text{Mode}}{s}$
  • Second measure: $\frac{3(\text{Mean} - \text{Median})}{s}$

Rationale for Pearson’s Measures

[Figure]

Suggested Book

A. M. Gun, M. K. Gupta and B. Dasgupta, Fundamentals of Statistics (Volume I), World Press (2016).

Standard Probability Distributions

Univariate Discrete Distributions

General Notions

• Let X be a discrete random variable
• Let $x_1, x_2, x_3, \ldots$ be the values that X takes with non-zero probabilities.
• $x_1, x_2, x_3, \ldots$ are called the mass points of X.
• Let $p_i$ be the probability of X taking the value $x_i$, for $i = 1, 2, 3, \ldots$, with $0 < p_i < 1 \ \forall i$ and $\sum_{i=1}^{\infty} p_i = 1$.
• Then the probability mass function (p.m.f.) of X is defined as
$$f(x) = P(X = x) = \begin{cases} p_k & \text{if } x = x_k,\ k = 1, 2, 3, \ldots \\ 0 & \text{otherwise} \end{cases}$$

General Notions (contd.)

• Cumulative Distribution Function (c.d.f.)
$$F(x) = P(X \le x) = \sum_{i:\, x_i \le x} f(x_i) = \sum_{i:\, x_i \le x} p_i$$
• Expectation or Expected Value
$$\mu = E(X) = \sum_{i=1}^{\infty} x_i f(x_i)$$
• Raw moments of order r, $r = 1, 2, 3, \ldots$
$$\mu'_r = E(X^r) = \sum_{i=1}^{\infty} x_i^r f(x_i)$$

General Notions (contd.)

• Central moments of order r, $r = 1, 2, 3, \ldots$
$$\mu_r = E(X - \mu)^r = \sum_{i=1}^{\infty} (x_i - \mu)^r f(x_i)$$
• Variance: $\sigma^2 = \mu_2$
• Median is defined to be that value $\tilde{\mu}$ for which
$$F(\tilde{\mu}) = \frac{1}{2}$$
• Mode $\mu_0$ is defined to be that value of X for which
$$f(\mu_0) = \max_i f(x_i)$$

General Notions (contd.)

• Skewness
$$\beta_1 = \frac{\mu_3^2}{\mu_2^3}, \qquad \gamma_1 = \frac{\mu_3}{\sigma^3}$$
• Kurtosis
$$\beta_2 = \frac{\mu_4}{\mu_2^2}, \qquad \gamma_2 = \beta_2 - 3.$$

The Discrete Uniform Distribution

Let X be a discrete random variable which can take any of the values $a, a+h, a+2h, a+3h, \ldots, a+(n-1)h$ with the same probability, where a is any real number, h is a positive real number and n is a positive integer.

• Then X is said to have a discrete uniform distribution or discrete rectangular distribution with p.m.f.
$$f(x) = \begin{cases} \frac{1}{n} & \text{if } x = a, a+h, a+2h, a+3h, \ldots, b, \\ 0 & \text{otherwise,} \end{cases}$$
where $b = a + (n-1)h$. In most cases $h = 1$.

The Discrete Uniform Distribution (contd.)

• c.d.f.: $F(x) = \frac{x - a + 1}{b - a + 1}$ for $x = a, a+1, \ldots, b$ (with $h = 1$)
• Expectation: $\mu = \frac{a+b}{2}$
• Variance: $\sigma^2 = \frac{(b-a+1)^2 - 1}{12} = \frac{n^2 - 1}{12}$ (with $h = 1$)
• Skewness: $\gamma_1 = 0$
• Kurtosis: $\gamma_2 = -\frac{6(n^2+1)}{5(n^2-1)}$
• Median: $\tilde{\mu} = \frac{a+b}{2}$
• Mode: $\mu_0 = a, a+h, a+2h, a+3h, \ldots, a+(n-1)h$ (every mass point is a mode)

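The general definitions of $\mu$, $\mu_r$, $\gamma_1$ and $\gamma_2$ can be evaluated directly from any finite p.m.f. The Python sketch below does this for a fair die, that is, a discrete uniform distribution with $a = 1$, $h = 1$, $n = 6$, and compares the results with the closed forms $(n^2 - 1)/12$ and $-6(n^2+1)/(5(n^2-1))$. The function name `pmf_summary` is hypothetical.

```python
def pmf_summary(pmf):
    """pmf: dict mapping each mass point x_i to its probability p_i."""
    if abs(sum(pmf.values()) - 1.0) > 1e-12:
        raise ValueError("probabilities must sum to 1")
    mu = sum(x * p for x, p in pmf.items())

    def central(r):
        # Central moment mu_r = sum((x_i - mu)^r * p_i)
        return sum((x - mu) ** r * p for x, p in pmf.items())

    mu2, mu3, mu4 = central(2), central(3), central(4)
    return {
        "mean": mu,
        "variance": mu2,
        "gamma1": mu3 / mu2 ** 1.5,
        "gamma2": mu4 / mu2 ** 2 - 3,
    }

n = 6
die = {x: 1 / n for x in range(1, n + 1)}  # discrete uniform on 1..6
summary = pmf_summary(die)

print(summary["variance"], (n ** 2 - 1) / 12)
print(summary["gamma2"], -6 * (n ** 2 + 1) / (5 * (n ** 2 - 1)))
```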
The Discrete Uniform Distribution (contd.)

[Figure: p.m.f. and c.d.f. of the discrete uniform distribution]

The Bernoulli Distribution

• Let X be a random variable that takes only two possible values, 0 and 1, with
$$P(X = 1) = p, \qquad P(X = 0) = 1 - p.$$
• Then X is said to have a Bernoulli distribution with p.m.f.
$$f(x) = \begin{cases} p & \text{for } x = 1, \\ 1 - p & \text{for } x = 0, \\ 0 & \text{otherwise.} \end{cases}$$

The Bernoulli Distribution (contd.)

• c.d.f.: $F(x) = \begin{cases} 0 & \text{for } x < 0, \\ 1 - p & \text{for } 0 \le x < 1, \\ 1 & \text{for } x \ge 1. \end{cases}$
• Expectation: $\mu = p$
• Variance: $\sigma^2 = p(1-p) = pq$, where $q = 1 - p$.
• Raw moments of order r: $\mu'_r = p$
• Central moments of order r: $\mu_r = p(1-p)^r + (1-p)(-p)^r$
• Skewness: $\gamma_1 = \frac{1-2p}{\sqrt{p(1-p)}}$
• Kurtosis: $\gamma_2 = \frac{1-6pq}{pq}$
• Median: $\tilde{\mu} = \begin{cases} 0 & \text{if } q > p, \\ 0.5 & \text{if } q = p, \\ 1 & \text{if } q < p. \end{cases}$
• Mode: $\mu_0 = \begin{cases} 0 & \text{if } q > p, \\ 0, 1 & \text{if } q = p, \\ 1 & \text{if } q < p. \end{cases}$

The Bernoulli Distribution (contd.)

[Figure: p.m.f. and c.d.f. of the Bernoulli distribution]

Bernoulli Trials

• Consider an experiment which has exactly two random outcomes
  • Success, with probability $p$
  • Failure, with probability $1 - p$
• Any repetition of such an experiment is called a Bernoulli trial

The Binomial Distribution

• Consider $n$ independent Bernoulli trials, each with probability of success $p$.
• Define a random variable $X$ to be the number of successes in the $n$ Bernoulli trials
• Then $X$ is said to have a Binomial distribution with parameters $n$ and $p$, or a Binomial($n, p$) distribution
• Its p.m.f. is
$$f(x) = \begin{cases} \binom{n}{x} p^x q^{n-x}, & x = 0, 1, 2, \ldots, n, \\ 0 & \text{otherwise.} \end{cases}$$

The Binomial Distribution (contd.)

• c.d.f.: $F(x) = I_q(n - x, x + 1)$ for $x = 0, 1, \ldots, n-1$, where
$$I_q(a, b) = \frac{\int_0^q t^{a-1}(1-t)^{b-1}\,dt}{\int_0^1 t^{a-1}(1-t)^{b-1}\,dt}$$
is the Incomplete Beta Function with parameters $a$ and $b$.
• Expectation: $\mu = np$
• Variance: $\sigma^2 = np(1-p) = npq$.
• Factorial moments of order r: $E[X(X-1)\cdots(X-r+1)] = (n)_r\, p^r$, where $(n)_r = n(n-1)\cdots(n-r+1)$
• Skewness: $\gamma_1 = \frac{q-p}{\sqrt{np(1-p)}}$
• Kurtosis: $\gamma_2 = \frac{1-6pq}{npq}$
• Median: $\tilde{\mu} = \lfloor np \rfloor$ or $\lceil np \rceil$
• Mode: $\mu_0 = \lfloor (n+1)p \rfloor$ or $\lceil (n+1)p \rceil - 1$

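As a quick numerical check of the Binomial($n, p$) properties, the Python sketch below builds the p.m.f. from its definition and recovers the mean, variance and mode by direct summation. The parameter values are hypothetical and `math.comb` requires Python 3.8+.

```python
import math

def binomial_pmf(x, n, p):
    """P(X = x) for X ~ Binomial(n, p)."""
    if not 0 <= x <= n:
        return 0.0
    return math.comb(n, x) * p ** x * (1 - p) ** (n - x)

n, p = 10, 0.3  # hypothetical parameter values
pmf = [binomial_pmf(x, n, p) for x in range(n + 1)]

mean = sum(x * f for x, f in enumerate(pmf))
variance = sum((x - mean) ** 2 * f for x, f in enumerate(pmf))
mode = max(range(n + 1), key=lambda x: pmf[x])

print(mean, n * p)                    # both close to 3
print(variance, n * p * (1 - p))      # both close to 2.1
print(mode, math.floor((n + 1) * p))  # both equal to 3
```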
The Binomial(n,p) distribution (contd.)

[Figure: p.m.f. of the Binomial(n,p) distribution]

The Binomial(n,p) distribution (contd.)

[Figure: c.d.f. of the Binomial(n,p) distribution]

The Geometric Distribution

• Consider an infinite sequence of independent Bernoulli trials with probability of success $p$.
• Define a random variable $X$ as the number of trials required to get the first success.
• Then $X$ is said to have a geometric distribution with parameter $p$.
• Its p.m.f. is
$$f(x) = \begin{cases} q^{x-1} p & \text{if } x = 1, 2, 3, \ldots \\ 0 & \text{otherwise} \end{cases}$$

The Geometric Distribution (contd.)

• c.d.f.: $F(x) = 1 - q^x$ for $x = 1, 2, 3, \ldots$
• Expectation: $\mu = \frac{1}{p}$
• Variance: $\sigma^2 = \frac{q}{p^2}$
• Skewness: $\gamma_1 = \frac{2-p}{\sqrt{1-p}}$
• Kurtosis: $\gamma_2 = 6 + \frac{p^2}{1-p}$
• Median: $\tilde{\mu} = \left\lceil \frac{-1}{\log_2(1-p)} \right\rceil$
• Mode: $\mu_0 = 1$

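The geometric properties can be checked numerically. The Python sketch below assumes that X counts the trials up to and including the first success, so that the mass points are 1, 2, 3, ... and the mean is $1/p$. The infinite sums are truncated, and the value of p is hypothetical.

```python
import math

def geometric_pmf(x, p):
    """P(X = x), X = number of trials needed to get the first success."""
    return (1 - p) ** (x - 1) * p if x >= 1 else 0.0

def geometric_cdf(x, p):
    """P(X <= x) for integer x."""
    return 1 - (1 - p) ** x if x >= 1 else 0.0

p = 0.25  # hypothetical success probability
# Truncate the infinite sums; the tail beyond 2000 terms is negligible here.
xs = range(1, 2001)

mean = sum(x * geometric_pmf(x, p) for x in xs)
variance = sum((x - mean) ** 2 * geometric_pmf(x, p) for x in xs)
# Smallest mass point at which the c.d.f. reaches 1/2
median = next(x for x in xs if geometric_cdf(x, p) >= 0.5)

print(mean, 1 / p)                              # both close to 4
print(variance, (1 - p) / p ** 2)               # both close to 12
print(median, math.ceil(-1 / math.log2(1 - p)))  # both equal to 3
```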
The Geometric($p$) distribution (contd.)

[Figure: p.m.f. and c.d.f. of the geometric distribution]

The Negative Binomial Distribution

• Consider an infinite sequence of independent Bernoulli trials with probability of success $p$.
• Suppose the trials are continued till $r$ successes are observed, where $r$ is a prespecified positive integer.
• Define a random variable $X$ to be the number of failures preceding the $r$th success.
• Then X is said to have a negative binomial distribution with parameters $r$ and $p$.
• Its p.m.f. is
$$f(x) = \begin{cases} \binom{r+x-1}{r-1} p^r q^x, & x = 0, 1, 2, \ldots \\ 0 & \text{otherwise.} \end{cases}$$

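The Negative Binomial Distribution (contd.)

With $X$ the number of failures preceding the $r$th success, $p$ the probability of success and $q = 1 - p$:
• c.d.f.: $F(x) = I_p(r, x+1) = 1 - I_q(x+1, r)$
• Expectation: $\mu = \frac{rq}{p}$
• Variance: $\sigma^2 = \frac{rq}{p^2}$
• Skewness: $\gamma_1 = \frac{1+q}{\sqrt{rq}}$
• Kurtosis: $\gamma_2 = \frac{6}{r} + \frac{p^2}{rq}$
• Mode: $\mu_0 = \begin{cases} \left\lfloor \frac{q(r-1)}{p} \right\rfloor & \text{if } r > 1, \\ 0 & \text{if } r \le 1. \end{cases}$
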
Relation with the Geometric($p$) Distribution

• The geometric($p$) distribution is a special case of the negative Binomial($r, p$) distribution when $r = 1$
• With $r = 1$ the negative binomial variable counts the failures before the first success, so the number of trials needed for the first success is that count plus one

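The relation can be checked numerically. The sketch below assumes the negative binomial variable counts failures before the $r$th success, while the geometric variable counts trials up to the first success, so the two p.m.f.s agree after a shift of one. The function names and the value of p are hypothetical.

```python
import math

def neg_binomial_pmf(x, r, p):
    """P(X = x), X = number of failures preceding the r-th success."""
    if x < 0:
        return 0.0
    return math.comb(r + x - 1, r - 1) * p ** r * (1 - p) ** x

def geometric_pmf(y, p):
    """P(Y = y), Y = number of trials needed to get the first success."""
    return (1 - p) ** (y - 1) * p if y >= 1 else 0.0

p = 0.4  # hypothetical success probability
# With r = 1, x failures before the first success means x + 1 trials in all.
max_gap = max(
    abs(neg_binomial_pmf(x, 1, p) - geometric_pmf(x + 1, p))
    for x in range(50)
)
print(max_gap)
```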
The Negative Binomial Distribution (contd.)

[Figure: negative binomial distribution. Orange line: mean. Green line: standard deviation]

The Hypergeometric Distribution

• Consider a finite population of size $N$.
• Suppose it contains a proportion $p$ of successes, on account of possessing certain characteristics
• The remaining proportion $q$ $(= 1 - p)$ are designated failures
• If $n$ objects are drawn randomly without replacement from this population, let $X$ be a random variable denoting the number of successes among these $n$ objects.
• Then X is said to have a hypergeometric distribution with p.m.f.
$$f(x) = \begin{cases} \dfrac{\binom{Np}{x}\binom{Nq}{n-x}}{\binom{N}{n}} & \text{if } x = 0, 1, 2, \ldots, n, \\ 0 & \text{otherwise.} \end{cases}$$

The Hypergeometric Distribution (contd.)

• Expectation: $\mu = np$
• Variance: $\sigma^2 = npq\,\frac{N-n}{N-1}$
• Skewness: $\gamma_1 = (q - p)\,a$, where $a$ is a function of $N, n, p$, so that
$$\gamma_1 \begin{cases} = 0 & \text{if } p = \frac{1}{2}, \\ > 0 & \text{if } p < \frac{1}{2}, \\ < 0 & \text{if } p > \frac{1}{2}. \end{cases}$$
• Mode: $\mu_0 = \left\lfloor \frac{(n+1)(Np+1)}{N+2} \right\rfloor$

The Hypergeometric Distribution (contd.)

[Figure: p.m.f. of the hypergeometric distribution. In the figure's notation, r is the sample size and n = Np]

Relation with the Binomial Distribution

The hypergeometric distribution with parameters $(N, n, p)$ can be approximated by the binomial distribution with parameters $(n, p)$
• if $n$ is very small compared to $N$,
• that is, if $\frac{n}{N}$ is negligibly small.

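The quality of the binomial approximation can be seen by comparing the two p.m.f.s as $n/N$ shrinks. The Python sketch below is illustrative only; the function names and the parameter values are hypothetical, and $Np$ is rounded to the nearest integer.

```python
import math

def hypergeom_pmf(x, N, n, p):
    """P(X = x) successes in n draws without replacement from N objects."""
    K = round(N * p)  # number of successes in the population, Np
    if x < max(0, n - (N - K)) or x > min(n, K):
        return 0.0
    return math.comb(K, x) * math.comb(N - K, n - x) / math.comb(N, n)

def binomial_pmf(x, n, p):
    if not 0 <= x <= n:
        return 0.0
    return math.comb(n, x) * p ** x * (1 - p) ** (n - x)

def max_gap(N, n, p):
    """Largest pointwise difference between the two p.m.f.s."""
    return max(
        abs(hypergeom_pmf(x, N, n, p) - binomial_pmf(x, n, p))
        for x in range(n + 1)
    )

gap_small_population = max_gap(N=50, n=10, p=0.3)    # n/N = 0.2
gap_large_population = max_gap(N=5000, n=10, p=0.3)  # n/N = 0.002
print(gap_small_population, gap_large_population)
```

The gap is markedly smaller in the second case, where the sampling fraction is negligible.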
The Poisson Distribution

• Let X be a discrete random variable with mass points 0, 1, 2, 3, …
• X is said to have a Poisson distribution with parameter $\lambda$ $(> 0)$ if its p.m.f. is
$$f(x) = \begin{cases} e^{-\lambda}\,\dfrac{\lambda^x}{x!} & \text{if } x = 0, 1, 2, 3, \ldots, \\ 0 & \text{otherwise.} \end{cases}$$

The Poisson Distribution (contd.)

• c.d.f.: $F(x) = \frac{1}{x!}\int_{\lambda}^{\infty} e^{-t} t^x\,dt$, for $x = 0, 1, 2, \ldots$
• Expectation: $\mu = \lambda$
• Variance: $\sigma^2 = \lambda$
• Skewness: $\gamma_1 = \frac{1}{\sqrt{\lambda}} > 0$ always.
• Kurtosis: $\gamma_2 = \frac{1}{\lambda} > 0$ always.
• Mode: $\mu_0 = \lfloor \lambda \rfloor$ (both $\lambda$ and $\lambda - 1$ when $\lambda$ is an integer)

The Poisson Distribution (contd.)

[Figure: p.m.f. of the Poisson distribution for λ = 1, λ = 2, λ = 5 and λ = 10]

Relationship with the Binomial Distribution

The Poisson distribution with parameter $\lambda$ is the limiting form of the binomial distribution with parameters $(n, p)$ if the following conditions hold:
• $n \to \infty$
• $p \to 0$
• $np = \lambda$ is finite.

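The limiting behaviour can be illustrated numerically by holding $np = \lambda$ fixed and letting $n$ grow. The Python sketch below compares the two p.m.f.s on the points 0 to 20; the choice $\lambda = 2$ and the values of $n$ are hypothetical.

```python
import math

def binomial_pmf(x, n, p):
    if not 0 <= x <= n:
        return 0.0
    return math.comb(n, x) * p ** x * (1 - p) ** (n - x)

def poisson_pmf(x, lam):
    return math.exp(-lam) * lam ** x / math.factorial(x)

lam = 2.0  # hypothetical value of the fixed product n * p

def max_gap(n):
    """Largest pointwise gap between Binomial(n, lam/n) and Poisson(lam)."""
    p = lam / n
    return max(
        abs(binomial_pmf(x, n, p) - poisson_pmf(x, lam))
        for x in range(21)
    )

gaps = {n: max_gap(n) for n in (20, 200, 2000)}
print(gaps)  # the gap shrinks as n grows with n * p held at lam
```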
Fitting Discrete Distributions to Data

The Objective

• Given a frequency table of the following type,

Value of X            $x_1$   $x_2$   $x_3$   $x_4$   …
Observed Frequency    $f_1$   $f_2$   $f_3$   $f_4$   …

• The objective is to identify a discrete probability distribution $f(x \mid a_1, a_2, \ldots, a_k)$, k being the number of parameters, which fits the data well, that is, provides a very good approximation to the data
• This is useful as a probabilistic model of the data
• There may exist specific theory of statistical inference for f that can be exploited to make inference regarding this data.

Main Steps

• Estimate the $k$ parameters $a_1, a_2, \ldots, a_k$ by, say, the method of moments, that is, by equating the first $k$ sample moments respectively with the first $k$ theoretical moments, and solving the $k$ equations.
• Denote these estimates by $\hat{a}_1, \hat{a}_2, \ldots, \hat{a}_k$.
• Estimate the theoretical frequencies for the different values of X by
$$\hat{f}_i = n f(x_i \mid \hat{a}_1, \hat{a}_2, \ldots, \hat{a}_k),$$
where $n = \sum_{i=1}^{\infty} f_i$, the total number of observations in the data set

Example: Fitting a Binomial(m, p) distribution

Estimation of parameters
• Equations to be solved:
$$\begin{cases} mp = \bar{x} \\ mpq = s^2 \end{cases} \iff \begin{cases} \hat{p} = 1 - \dfrac{s^2}{\bar{x}} \\ \hat{m} = \dfrac{\bar{x}}{\hat{p}} \end{cases}$$
• Generally, $m$ is known from the definition of the variable, so that
$$\hat{p} = \frac{\bar{x}}{m}$$

Example (contd.)

• Estimation of the expected frequencies for $x = 0, 1, 2, \ldots, m$, with $\hat{q} = 1 - \hat{p}$ and $n$ the total number of observations:
$$\hat{f}_x = n\binom{m}{x} \hat{p}^x \hat{q}^{\,m-x}$$

Value of X            0             1             2             …    m
Observed frequency    $f_0$         $f_1$         $f_2$         …    $f_m$
Expected frequency    $\hat{f}_0$   $\hat{f}_1$   $\hat{f}_2$   …    $\hat{f}_m$

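The fitting steps for a Binomial(m, p) distribution with known m can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function name `fit_binomial` and the observed frequencies are hypothetical, and the expected frequencies are scaled by the total count n.

```python
import math

def fit_binomial(observed, m):
    """Fit Binomial(m, p) by the method of moments, m known.

    observed[x] is the observed frequency of the value x, x = 0, 1, ..., m.
    Returns the estimate of p and the list of expected frequencies.
    """
    n = sum(observed)  # total number of observations
    mean = sum(x * f for x, f in enumerate(observed)) / n
    p_hat = mean / m   # solves m * p = sample mean
    q_hat = 1 - p_hat
    expected = [
        n * math.comb(m, x) * p_hat ** x * q_hat ** (m - x)
        for x in range(m + 1)
    ]
    return p_hat, expected

observed = [6, 20, 28, 12, 8, 6]  # hypothetical frequencies for x = 0..5
p_hat, expected = fit_binomial(observed, m=5)

print(p_hat)
for x, (o, e) in enumerate(zip(observed, expected)):
    print(x, o, round(e, 2))
```

The expected frequencies add up to the total observed count, which is a useful sanity check before comparing the two rows.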
Assessing Goodness of Fit

By quantile-quantile (Q-Q) plots
• By plotting the sample quantiles against the theoretical quantiles
• Checking how close the plot is to a straight line passing through the origin

Assessing Goodness of Fit (contd.)

Through statistical tests of significance
• Pearson’s $\chi^2$ test of goodness of fit
$$\chi^2 = \sum_{i=1}^{k} \frac{(O_i - E_i)^2}{E_i} \sim \chi^2_{k-c-1} \quad \text{(approximately)}$$
where $O_i, E_i$ are the observed and expected frequencies of the i-th non-empty cell, $k$ is the number of such cells and $c$ is the number of parameters estimated.
• Kolmogorov-Smirnov test (to be discussed later)

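The Pearson statistic is a one-line sum. The Python sketch below assumes the usual convention that the degrees of freedom are the number of cells minus one, minus the number of estimated parameters; the function name and the frequencies are hypothetical, and no p-value is computed.

```python
def pearson_chi_square(observed, expected, n_params):
    """Return Pearson's chi-square statistic and its degrees of freedom.

    Assumes every expected frequency is positive and that the degrees of
    freedom are (number of cells) - 1 - (number of estimated parameters).
    """
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    df = len(observed) - 1 - n_params
    return stat, df

observed = [18, 22, 20, 25, 15]  # hypothetical observed frequencies
expected = [20.0] * 5            # fully specified model, nothing estimated
stat, df = pearson_chi_square(observed, expected, n_params=0)
print(stat, df)
```

The statistic would then be compared with an upper percentage point of the $\chi^2$ distribution with `df` degrees of freedom.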
Univariate Continuous Distributions

General Notions

• Let X be a continuous random variable taking values with non-zero probabilities in the interval $(a, b)$
• Then the probability density function (p.d.f.) $f(x)$ of X is defined to be such that the probability of X taking values in an infinitesimal interval $\left(x - \frac{\Delta x}{2},\ x + \frac{\Delta x}{2}\right)$ of length $\Delta x$ for any $x \in (a, b)$ is $f(x)\,\Delta x$, that is,
$$P\left(x - \frac{\Delta x}{2} \le X \le x + \frac{\Delta x}{2}\right) = f(x)\,\Delta x$$

General Notions (contd.)

• Cumulative Distribution Function (c.d.f.)
$$F(x) = P(X \le x) = \int_{-\infty}^{x} f(t)\,dt$$
• Expectation or Expected Value
$$\mu = E(X) = \int_{-\infty}^{\infty} t f(t)\,dt$$
• Raw moments of order r, $r = 1, 2, 3, \ldots$
$$\mu'_r = E(X^r) = \int_{-\infty}^{\infty} t^r f(t)\,dt$$

General Notions (contd.)

• Central moments of order r, $r = 1, 2, 3, \ldots$
$$\mu_r = E(X - \mu)^r = \int_{-\infty}^{\infty} (t - \mu)^r f(t)\,dt$$
• Variance: $\sigma^2 = \mu_2$
• Median is defined to be that value $\tilde{\mu}$ for which
$$F(\tilde{\mu} - 0) \le \frac{1}{2} \le F(\tilde{\mu}), \quad \text{that is,} \quad \int_{-\infty}^{\tilde{\mu}} f(x)\,dx = \frac{1}{2}$$
• Mode $\mu_0$ is defined to be that value of X for which
$$f(\mu_0) = \max_x f(x)$$

General Notions (contd.)

• Skewness
$$\beta_1 = \frac{\mu_3^2}{\mu_2^3}, \qquad \gamma_1 = \frac{\mu_3}{\sigma^3}$$
• Kurtosis
$$\beta_2 = \frac{\mu_4}{\mu_2^2}, \qquad \gamma_2 = \beta_2 - 3.$$

The Uniform Distribution

• Let $a$ and $b$ be two real numbers such that $a < b$.
• A continuous random variable $X$ is said to have a uniform or rectangular distribution over $(a, b)$ if its p.d.f. is
$$f(x) = \begin{cases} \dfrac{1}{b-a} & \text{if } a \le x \le b, \\ 0 & \text{otherwise} \end{cases}$$

The Continuous Uniform Distribution (contd.)

[Figure: p.d.f. and c.d.f. of the continuous uniform distribution]

The Continuous Uniform Distribution (contd.)

• c.d.f.: $F(x) = \begin{cases} 0 & \text{if } x < a, \\ \dfrac{x-a}{b-a} & \text{if } a \le x \le b, \\ 1 & \text{if } x > b \end{cases}$
• Expectation: $\mu = \frac{a+b}{2}$
• Variance: $\sigma^2 = \frac{(b-a)^2}{12}$
• Skewness: $\gamma_1 = 0$
• Kurtosis: $\gamma_2 = -1.2$
• Median: $\tilde{\mu} = \frac{a+b}{2}$

The Exponential Distribution

• The p.d.f. of a random variable $X$ having an exponential distribution with parameter $\lambda$ $(> 0)$ is
$$f(x) = \lambda e^{-\lambda x}, \qquad 0 \le x < \infty$$

The Exponential($\lambda$) Distribution (contd.)

• c.d.f.: $F(x) = 1 - e^{-\lambda x}$, for $x \ge 0$
• Expectation: $\mu = \frac{1}{\lambda}$
• Variance: $\sigma^2 = \frac{1}{\lambda^2}$
• Skewness: $\gamma_1 = 2$
• Kurtosis: $\gamma_2 = 6$
• Median: $\tilde{\mu} = \frac{\ln 2}{\lambda}$
• Mode: $\mu_0 = 0$

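The median follows from solving $F(x) = \frac{1}{2}$, which gives $x = \ln 2 / \lambda$. The Python sketch below inverts the c.d.f. in closed form; the function names and the value of $\lambda$ are hypothetical.

```python
import math

def exponential_cdf(x, lam):
    """F(x) = 1 - exp(-lam * x) for x >= 0, and 0 for x < 0."""
    return 1 - math.exp(-lam * x) if x >= 0 else 0.0

def exponential_quantile(u, lam):
    """Inverse of the c.d.f.: the x with F(x) = u, for 0 <= u < 1."""
    return -math.log(1 - u) / lam

lam = 0.5  # hypothetical rate parameter
median = exponential_quantile(0.5, lam)

print(median, math.log(2) / lam)     # the two values agree
print(exponential_cdf(median, lam))  # close to 0.5
```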
The Exponential($\lambda$) Distribution (contd.)

[Figure: p.d.f. and c.d.f. of the exponential distribution]

The Gamma Distribution

• The p.d.f. of a gamma random variable with shape parameter $\alpha$ and rate parameter $\beta$, where $\alpha, \beta > 0$, is
$$f(x \mid \alpha, \beta) = \begin{cases} \dfrac{\beta^{\alpha}}{\Gamma(\alpha)}\, x^{\alpha-1} e^{-\beta x} & \text{if } 0 \le x < \infty, \\ 0 & \text{otherwise} \end{cases}$$

The Gamma($\alpha, \beta$) Distribution (contd.)

• c.d.f.: $F(x) = \frac{\beta^{\alpha}}{\Gamma(\alpha)} \int_0^x t^{\alpha-1} e^{-\beta t}\,dt$
• Expectation: $\mu = \frac{\alpha}{\beta}$
• Variance: $\sigma^2 = \frac{\alpha}{\beta^2}$
• Skewness: $\gamma_1 = \frac{2}{\sqrt{\alpha}}$
• Kurtosis: $\gamma_2 = \frac{6}{\alpha}$
• Mode: $\mu_0 = \frac{\alpha-1}{\beta}$, for $\alpha \ge 1$

The Gamma($\alpha, \beta$) Distribution (contd.)

[Figure: p.d.f. and c.d.f. of the gamma distribution. In the figure's notation, $k = \alpha$ and $\theta = 1/\beta$]

The Beta Distribution

• The beta distribution with shape parameters $\alpha$ and $\beta$ $(\alpha, \beta > 0)$ has p.d.f.
$$f(x \mid \alpha, \beta) = \begin{cases} \dfrac{1}{\mathrm{B}(\alpha, \beta)}\, x^{\alpha-1} (1-x)^{\beta-1} & \text{if } 0 \le x \le 1, \\ 0 & \text{otherwise,} \end{cases}$$
where $\mathrm{B}(\alpha, \beta) = \dfrac{\Gamma(\alpha)\Gamma(\beta)}{\Gamma(\alpha+\beta)}$

The Beta($\alpha, \beta$) Distribution (contd.)

• c.d.f.: $F(x) = I_x(\alpha, \beta)$
• Expectation: $\mu = \frac{\alpha}{\alpha+\beta}$
• Variance: $\sigma^2 = \frac{\alpha\beta}{(\alpha+\beta)^2(\alpha+\beta+1)}$
• Mode: $\mu_0 = \frac{\alpha-1}{\alpha+\beta-2}$, for $\alpha, \beta > 1$
• Symmetric if $\alpha = \beta$, positively skewed if $\alpha < \beta$, negatively skewed if $\alpha > \beta$.

The Beta($\alpha, \beta$) Distribution (contd.)

[Figure: p.d.f. and c.d.f. of the beta distribution]

The Normal Distribution

• The normal distribution with (parameters) mean $\mu$ and standard deviation $\sigma$ has the p.d.f.
$$f(x \mid \mu, \sigma) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2\sigma^2}(x-\mu)^2}, \qquad -\infty < x < \infty.$$

The Normal($\mu, \sigma$) Distribution (contd.)

• c.d.f.: $F(x) = \Phi\left(\frac{x-\mu}{\sigma}\right)$, with $\Phi$ as defined in the last bullet
• Expectation: $\mu$
• Variance: $\sigma^2$
• Skewness: $\gamma_1 = 0$
• Kurtosis: $\gamma_2 = 0$
• Median: $\tilde{\mu} = \mu$
• Mode: $\mu_0 = \mu$

• The standard normal distribution: $N(0, 1)$ with p.d.f. $\phi(x)$ and c.d.f. $\Phi(x)$

The Normal($\mu, \sigma$) Distribution (contd.)

[Figure: p.d.f. and c.d.f. of the normal distribution]

Some Properties of the Normal Distribution

[Figure: points of inflection of the $\mathcal{N}(\mu, \sigma)$ curve]

Distributions Associated with the Normal Distribution

A Few Such Commonly Used Distributions

• The log-normal distribution
• The chi-square or $\chi^2$-distribution
• The Student’s t distribution
• The F distribution

The Log-normal Distribution

• Is a continuous probability distribution of a random variable X whose logarithm is normally distributed.
• That is, if X is log-normally distributed, then $Y = \ln X$ has a normal distribution.
• Equivalently, if Y has a normal distribution, then $X = e^Y$ has a log-normal distribution.
• Its p.d.f. is
$$f(x \mid \mu, \sigma) = \frac{1}{x\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2\sigma^2}(\ln x - \mu)^2}, \qquad 0 < x < \infty.$$

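The Log-normal Distribution (contd.)

Here $\mu$ and $\sigma$ are the mean and standard deviation of $\ln X$.
• c.d.f.: $F(x) = \Phi\left(\frac{\ln x - \mu}{\sigma}\right)$, where $\Phi$ is the standard normal c.d.f.
• Expectation: $E(X) = e^{\mu + \sigma^2/2}$
• Variance: $\mathrm{Var}(X) = \left(e^{\sigma^2} - 1\right) e^{2\mu + \sigma^2}$
• Skewness: $\gamma_1 = \left(e^{\sigma^2} + 2\right)\sqrt{e^{\sigma^2} - 1}$
• Kurtosis: $\gamma_2 = e^{4\sigma^2} + 2e^{3\sigma^2} + 3e^{2\sigma^2} - 6$
• Median: $\tilde{\mu} = e^{\mu}$
• Mode: $\mu_0 = e^{\mu - \sigma^2}$
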
The lognormal $\ln \mathcal{N}(\mu, \sigma^2)$ (contd.)

[Figure: the log-normal distribution]

The (Central) $\chi^2$-Distribution with k Degrees of Freedom

• If $X_1, X_2, \ldots, X_k$ are independent $\mathcal{N}(0, 1)$ random variables, then
$$\chi^2_k = X_1^2 + X_2^2 + \cdots + X_k^2$$
is said to have a central $\chi^2$-distribution with k degrees of freedom (d.f.)
• Its p.d.f. is
$$f(x \mid k) = \begin{cases} \dfrac{2^{-k/2}}{\Gamma\left(\frac{k}{2}\right)}\, x^{\frac{k}{2}-1} e^{-\frac{x}{2}} & \text{if } 0 \le x < \infty, \\ 0 & \text{otherwise} \end{cases}$$
which is identical with the Gamma$\left(\frac{k}{2}, \frac{1}{2}\right)$ p.d.f.

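The identity between the $\chi^2_k$ density and the gamma density with shape $k/2$ and rate $1/2$ can be checked pointwise. The Python sketch below uses `math.gamma` from the standard library; the function names and the grid of test points are hypothetical, and both densities are evaluated only for $x > 0$.

```python
import math

def gamma_pdf(x, alpha, beta):
    """Gamma density with shape alpha and rate beta, for x > 0."""
    if x <= 0:
        return 0.0
    return beta ** alpha / math.gamma(alpha) * x ** (alpha - 1) * math.exp(-beta * x)

def chi_square_pdf(x, k):
    """Central chi-square density with k degrees of freedom, for x > 0."""
    if x <= 0:
        return 0.0
    return 2 ** (-k / 2) / math.gamma(k / 2) * x ** (k / 2 - 1) * math.exp(-x / 2)

# Compare the two densities over several degrees of freedom and points
max_gap = max(
    abs(chi_square_pdf(x, k) - gamma_pdf(x, k / 2, 0.5))
    for k in (1, 2, 5, 10)
    for x in (0.1, 0.5, 1.0, 3.0, 8.0)
)
print(max_gap)  # zero up to floating-point rounding
```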
The Central $\chi^2(k)$ Distribution (contd.)

• c.d.f.: $F(x) = \frac{1}{2^{k/2}\,\Gamma\left(\frac{k}{2}\right)} \int_0^x t^{\frac{k}{2}-1} e^{-\frac{t}{2}}\,dt$
• Expectation: $\mu = k$
• Variance: $\sigma^2 = 2k$
• Skewness: $\gamma_1 = \sqrt{\frac{8}{k}}$
• Kurtosis: $\gamma_2 = \frac{12}{k}$
• Mode: $\mu_0 = \max(k - 2, 0)$

The Central $\chi^2(k)$ Distribution (contd.)

[Figure: p.d.f. and c.d.f. of the central $\chi^2$ distribution]

The (Central) Student’s t-Distribution

• Student was the pen name of William Sealy Gosset (1876–1937), an English statistician.
• He developed the Student's t-distribution.
• If $X \sim \mathcal{N}(0, 1)$, $Y \sim \chi^2_k$, and $X, Y$ are independent, then
$$t_k = \frac{X}{\sqrt{Y/k}}$$
is said to have a central t-distribution with $k$ degrees of freedom, with p.d.f.
$$f(t \mid k) = \frac{\Gamma\left(\frac{k+1}{2}\right)}{\sqrt{k\pi}\,\Gamma\left(\frac{k}{2}\right)} \left(1 + \frac{t^2}{k}\right)^{-\frac{k+1}{2}}, \qquad \text{for } -\infty < t < \infty.$$

The Central $t$-Distribution with k d.f. (contd.)

• Expectation: $\mu = 0$
• Variance: $\sigma^2 = \frac{k}{k-2}$, $k > 2$
• Skewness: $\gamma_1 = 0$, $k > 3$
• Kurtosis: $\gamma_2 = \frac{6}{k-4}$, $k > 4$
• Mode: $\mu_0 = 0$
• Median: $\tilde{\mu} = 0$

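As a numerical check, the t density can be integrated over a wide symmetric interval to confirm that the total mass is 1 and that the variance equals $k/(k-2)$. The Python sketch below uses a plain trapezoidal rule; the integration limits, the step count and the choice $k = 10$ are assumptions made for illustration.

```python
import math

def t_pdf(t, k):
    """Density of the central t-distribution with k degrees of freedom."""
    c = math.gamma((k + 1) / 2) / (math.sqrt(k * math.pi) * math.gamma(k / 2))
    return c * (1 + t * t / k) ** (-(k + 1) / 2)

def integrate(f, a, b, steps=100_000):
    """Composite trapezoidal rule on [a, b]."""
    h = (b - a) / steps
    total = 0.5 * (f(a) + f(b)) + sum(f(a + i * h) for i in range(1, steps))
    return total * h

k = 10  # hypothetical degrees of freedom
# The tails beyond +/-100 contribute a negligible amount for k = 10.
total_mass = integrate(lambda t: t_pdf(t, k), -100, 100)
variance = integrate(lambda t: t * t * t_pdf(t, k), -100, 100)

print(total_mass)             # close to 1
print(variance, k / (k - 2))  # both close to 1.25
```

The density is symmetric about 0, so integrating over the positive half-line alone would give only one half.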
The Central $t$-Distribution with k d.f. (contd.)

[Figure: p.d.f. and c.d.f. of the central t-distribution]

The Central F-Distribution

• If $X \sim \chi^2_m$, $Y \sim \chi^2_n$ and $X, Y$ are independent, then
$$F_{m,n} = \frac{X/m}{Y/n}$$
is said to have a central F-distribution with $m, n$ degrees of freedom.
• Its p.d.f. for $0 < F < \infty$ is
$$f(F \mid m, n) = \frac{1}{\mathrm{B}\left(\frac{m}{2}, \frac{n}{2}\right)} \left(\frac{m}{n}\right)^{\frac{m}{2}} F^{\frac{m}{2}-1} \left(1 + \frac{m}{n}F\right)^{-\frac{m+n}{2}}.$$
• Note that
$$\frac{\frac{m}{n}F}{1 + \frac{m}{n}F} \sim \text{Beta}\left(\frac{m}{2}, \frac{n}{2}\right)$$

The Central $F$-Distribution with $(m, n)$ d.f. (contd.)

• Expectation: $\mu = \frac{n}{n-2}$, $n > 2$.
• Variance: $\sigma^2 = \frac{2n^2(m+n-2)}{m(n-2)^2(n-4)}$, $n > 4$

The Central $F$-Distribution with $(m, n)$ d.f. (contd.)

[Figure: p.d.f. and c.d.f. of the central F-distribution for (m=1, n=1), (m=2, n=1), (m=5, n=2), (m=10, n=1) and m=n=100]
