
MTH220

Statistical Methods and Inference


“Statistical thinking will one day be as necessary for efficient citizenship as the ability to read and write”
- H. G. Wells (1866 – 1946)

Seminar 1
17 March 2022

MTH220 Lecture Notes SU01 Jan22 1


Study Unit 1
◼ Introduction to Statistics

◼ Review on Probability

◼ Point Estimation

◼ Sampling Distribution

◼ Normal Approximation for Discrete Distributions



Chapter 1
Introduction to Statistics



1.1 Introduction to Statistical Methods and Inference
◼ Statistical study
1. What questions are we trying to answer?
(e.g., determine if walking 30 minutes per day will reduce the occurrence of heart attacks)
2. What is the population?
(e.g., people at risk of heart attacks)
3. What variables will be recorded?
(e.g., number of volunteers, number of minutes walking, number of heart attacks)
4. Making statistical inference



1.1 Introduction to Statistical Methods and Inference
◼ Population: all items of interest
• Random variable

• Probability distribution

◼ Sample: a subset of a population

Remark: we use sample information to make inferences about the population



1.1 Introduction to Statistical Methods and Inference
◼ Learning Statistics with R
❑ Download R from https://cran.rstudio.com/
❑ Download RStudio from
https://www.rstudio.com/products/rstudio/download/
❑ Details can be found on page SU1-5 of the Study Guide



Chapter 2
Review on Probability



2.1 Basic Probability
◼ Experiment:
◼ Sample Space: the set of all possible outcomes of an experiment

e.g. flipping a coin, the sample space is {𝐻, 𝑇}


◼ Event: any subset of the sample space

◼ Random Variable: a function that assigns a real number to each outcome in the sample space of a random experiment. Notation: X, Y



2.1 Basic Probability
◼ Discrete Random Variable: is a random variable with a finite (or
countably infinite) range.
◼ Continuous Random Variable: is a random variable with an interval
(either finite or infinite) of real numbers for its range.
◼ Expectation (expected value): refers to the “average” value of a random
variable, 𝐸(𝑋)
◼ Variance: “spread” or “variation”, Var(𝑋)



2.2 Bernoulli distribution - discrete
◼ Bernoulli trial: a trial whose outcome can be classified as either a success or a failure.
◼ Random variable $X \sim \mathrm{Bernoulli}(p)$, the probability mass function (pmf)
$$f(x) = \begin{cases} p, & \text{if } x = 1 \\ 1 - p, & \text{if } x = 0 \end{cases}$$
where $x = 1$ represents success and $x = 0$ represents failure.
◼ The mean and variance of $X$ are

$$E(X) = p \quad \text{and} \quad \mathrm{Var}(X) = p(1 - p)$$



2.2 Binomial distribution - discrete
◼ Suppose that n independent trials are performed, each of which results in a
success with probability 𝑝 and a failure with probability 1 − 𝑝.
◼ If $X$ represents the number of successes that occur in $n$ trials, then $X \sim B(n, p)$
◼ The probability mass function of $X$
$$f(x) = \begin{cases} \binom{n}{x} p^x (1 - p)^{n - x}, & x = 0, 1, \dots, n \\ 0, & \text{otherwise} \end{cases}$$
◼ The mean and variance of $X$ are
$$E(X) = np \quad \text{and} \quad \mathrm{Var}(X) = np(1 - p)$$
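
The binomial pmf and its moments can be checked numerically. The following is a minimal Python sketch (the course itself uses R); the function name `binomial_pmf` and the values `n = 10`, `p = 0.3` are illustrative assumptions, not taken from the notes.

```python
from math import comb


def binomial_pmf(x: int, n: int, p: float) -> float:
    """Return P(X = x) for X ~ B(n, p); the pmf is zero outside 0..n."""
    if x < 0 or x > n:
        return 0.0
    return comb(n, x) * p**x * (1 - p) ** (n - x)


# Illustrative parameter values (assumed for this sketch).
n, p = 10, 0.3

# Tabulate the pmf over the whole support 0, 1, ..., n.
pmf = [binomial_pmf(x, n, p) for x in range(n + 1)]

# Mean and variance computed directly from the definition.
mean = sum(x * f for x, f in enumerate(pmf))
variance = sum((x - mean) ** 2 * f for x, f in enumerate(pmf))

print("total probability:", sum(pmf))
print("mean:", mean, "formula np:", n * p)
print("variance:", variance, "formula np(1-p):", n * p * (1 - p))
```

Setting `n = 1` reduces this to the Bernoulli case, where the pmf takes the values 1 - p and p.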





2.2 Geometric distribution - discrete
◼ Suppose that independent trials are performed, each having a probability 𝑝
of being a success.
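◼ Let $X$ be the number of trials required until the first success, then $X \sim \mathrm{Geom}(p)$
◼ The probability mass function of $X$
$$f(x) = \begin{cases} (1 - p)^{x - 1} p, & x = 1, 2, 3, \dots \\ 0, & \text{otherwise} \end{cases}$$
◼ The mean and variance of $X$ are
$$E(X) = \frac{1}{p} \quad \text{and} \quad \mathrm{Var}(X) = \frac{1 - p}{p^2}$$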
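
As a rough numerical check of the geometric formulas, the sketch below sums the pmf over a long but finite range in Python. The name `geometric_pmf`, the value `p = 0.2` and the truncation point 500 are assumptions chosen for illustration; the neglected tail is far below floating-point precision for this `p`.

```python
def geometric_pmf(x: int, p: float) -> float:
    """Return P(X = x), where X counts trials up to and including the first success."""
    if x < 1:
        return 0.0
    return (1 - p) ** (x - 1) * p


p = 0.2                 # illustrative success probability (assumed)
support = range(1, 500)  # truncated support; the tail beyond 500 is negligible here

total = sum(geometric_pmf(x, p) for x in support)
mean = sum(x * geometric_pmf(x, p) for x in support)
second_moment = sum(x**2 * geometric_pmf(x, p) for x in support)
variance = second_moment - mean**2

print("total probability:", total)
print("mean:", mean, "formula 1/p:", 1 / p)
print("variance:", variance, "formula (1-p)/p^2:", (1 - p) / p**2)
```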



2.2 Poisson distribution - discrete
◼ The Poisson distribution models the number of “events” occurring in
a certain period of time, e.g., the number of claims to an insurer in a
year.
◼ Let random variable $X \sim \mathrm{Poisson}(\lambda)$, the probability mass function
$$f(x) = \begin{cases} \dfrac{\lambda^x e^{-\lambda}}{x!}, & x = 0, 1, 2, \dots \\ 0, & \text{otherwise} \end{cases}$$
◼ The mean and variance of $X$ are
$$E(X) = \lambda \quad \text{and} \quad \mathrm{Var}(X) = \lambda$$
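
The equality of the Poisson mean and variance can be illustrated with a short Python sketch. The name `poisson_pmf`, the rate `lam = 3.5` and the truncation of the support at 60 are assumptions made for this example only.

```python
from math import exp, factorial


def poisson_pmf(x: int, lam: float) -> float:
    """Return P(X = x) for X ~ Poisson(lam); zero for negative x."""
    if x < 0:
        return 0.0
    return lam**x * exp(-lam) / factorial(x)


lam = 3.5            # illustrative rate (assumed)
support = range(60)  # truncated support; the tail is negligible for this rate

total = sum(poisson_pmf(x, lam) for x in support)
mean = sum(x * poisson_pmf(x, lam) for x in support)
variance = sum((x - mean) ** 2 * poisson_pmf(x, lam) for x in support)

print("total probability:", total)
print("mean:", mean, "variance:", variance, "lambda:", lam)
```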





2.2 Exponential distribution - continuous
◼ Exponential distribution describes the time until a specific event occurs
when the events occur according to a Poisson process with rate 𝜆
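◼ Let random variable $X \sim \mathrm{Exp}(\lambda)$, the probability density function (pdf)
$$f(x) = \begin{cases} \lambda e^{-\lambda x}, & x > 0 \\ 0, & \text{otherwise} \end{cases}$$
◼ The mean and variance of $X$ are
$$E(X) = \frac{1}{\lambda} \quad \text{and} \quad \mathrm{Var}(X) = \frac{1}{\lambda^2}$$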
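
For a continuous distribution the sums become integrals. The sketch below approximates them with a simple midpoint rule in Python; the rate `lam = 0.25`, the upper limit 200 and the number of steps are assumptions picked so that the truncated tail and the discretisation error are both small.

```python
from math import exp


def exponential_pdf(x: float, lam: float) -> float:
    """Return the Exp(lam) density at x; zero for x <= 0."""
    return lam * exp(-lam * x) if x > 0 else 0.0


lam = 0.25        # illustrative rate (assumed)
upper = 200.0     # integration limit; the density beyond this point is negligible
steps = 200_000
h = upper / steps

# Midpoints of the integration cells.
mids = [(i + 0.5) * h for i in range(steps)]

area = sum(exponential_pdf(x, lam) for x in mids) * h
mean = sum(x * exponential_pdf(x, lam) for x in mids) * h
second_moment = sum(x**2 * exponential_pdf(x, lam) for x in mids) * h
variance = second_moment - mean**2

print("area under pdf:", area)
print("mean:", mean, "formula 1/lambda:", 1 / lam)
print("variance:", variance, "formula 1/lambda^2:", 1 / lam**2)
```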



2.2 Normal distribution - continuous
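◼ A random variable $X \sim N(\mu, \sigma^2)$, the probability density function
$$f(x) = \frac{1}{\sqrt{2\pi}\,\sigma} e^{-\frac{1}{2}\left(\frac{x - \mu}{\sigma}\right)^2}, \quad -\infty < x < \infty$$
◼ The mean and variance of $X$ are

$$E(X) = \mu \quad \text{and} \quad \mathrm{Var}(X) = \sigma^2$$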



2.2 Standard normal distribution
◼ Standard normal distribution $N(0, 1)$
◼ An important implication is:
if $X \sim N(\mu, \sigma^2)$, then $Z = \dfrac{X - \mu}{\sigma} \sim N(0, 1)$.
We say $Z$ is a standard normal random variable.



2.2 Standard normal distribution
◼ Standard normal distribution $Z \sim N(0, 1)$
◼ The cumulative distribution function of a standard normal random variable is
$$\Phi(z) = P(Z \le z)$$
◼ Properties:
$$P(Z < -z) = P(Z > z)$$

$$\Rightarrow \Phi(-z) = 1 - \Phi(z)$$
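
Both the symmetry property and standardisation can be verified with Python's `statistics.NormalDist`. This is only a sketch: the values `z = 1.96`, `mu = 100`, `sigma = 15` and `x = 120` are hypothetical and are not taken from the notes.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal N(0, 1)

# Symmetry: Phi(-z) equals 1 - Phi(z).
z = 1.96  # hypothetical value
left_tail = Z.cdf(-z)
right_tail = 1 - Z.cdf(z)
print("Phi(-z):", left_tail, "1 - Phi(z):", right_tail)

# Standardisation: P(X <= x) for X ~ N(mu, sigma^2) equals Phi((x - mu) / sigma).
mu, sigma, x = 100.0, 15.0, 120.0  # hypothetical values
direct = NormalDist(mu, sigma).cdf(x)
standardised = Z.cdf((x - mu) / sigma)
print("direct:", direct, "via Z:", standardised)
```

Note that `NormalDist` takes the standard deviation, not the variance, as its second argument.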



2.2 Standard normal distribution
◼ The standard normal distribution table (page SU1-12)





Chapter 3
Point Estimation



3.1 Populations and samples
◼ Population: all items of interest
◼ Population parameter: a characteristic that describes a
population; it is usually unknown
◼ For example,
❑ Population mean $\mu$
❑ Population variance $\sigma^2$
❑ Population proportion $p$



3.2 Sample statistic
◼ Sample: a subset of a population. The part of the population we actually examine and
for which we do have data.
◼ We denote a random sample of size $n$ by $X_1, X_2, \dots, X_n$ and its observations are
denoted by $x_1, x_2, \dots, x_n$
◼ Statistic: is a numerical quantity that can be calculated from a sample of data
❑ Sample mean: $\bar{X} = \frac{1}{n} \sum_{i=1}^{n} X_i$
❑ Sample variance: $S^2 = \frac{1}{n - 1} \sum_{i=1}^{n} (X_i - \bar{X})^2$
❑ Sample standard deviation: $S = \sqrt{S^2}$
❑ Sample total: $T = \sum_{i=1}^{n} X_i$
❑ Other quantities: $\dfrac{\bar{X}}{S / \sqrt{n}}$ and $\sum_{i=1}^{n} X_i^2$



3.2 Statistics from a random sample
Examples: The height data (in cm) for female university students is given in
the table

◼ Sample mean:
$$\bar{X} = \frac{1}{10} \sum_{i=1}^{10} X_i = 161$$
◼ Sample variance:
$$S^2 = \frac{1}{10 - 1} \sum_{i=1}^{10} (X_i - \bar{X})^2 = 56.667$$
◼ Sample standard deviation:
$$S = \sqrt{S^2} = 7.5277$$


3.3 Point estimators
◼ Point estimator: is a statistic that is used to estimate a population parameter.
◼ For example
❑ Sample mean $\bar{X}$ is a point estimator for population mean $\mu$
❑ Sample variance $S^2$ is a point estimator for population variance $\sigma^2$
❑ Sample standard deviation $S$ is a point estimator for population standard deviation $\sigma$
◼ Question: what are all the possible values of the point estimator?
◼ Example:
❑ Sample 1: 9.2, 10.1, 10.5, 9.8, 10 (sample mean = 9.92)

❑ Sample 2: 10.4, 9.9, 11.0, 8.9, 9.5 (sample mean = 9.94)
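
The two samples above give different point estimates of the same population mean, which is why an estimator has a distribution of its own. A minimal Python sketch using the standard `statistics` module reproduces the sample means and also computes the sample variance and standard deviation (both with the n - 1 divisor); the variable names are assumptions for illustration.

```python
from statistics import mean, stdev, variance

# The two samples listed in the example.
samples = {
    "Sample 1": [9.2, 10.1, 10.5, 9.8, 10.0],
    "Sample 2": [10.4, 9.9, 11.0, 8.9, 9.5],
}

for name, data in samples.items():
    x_bar = mean(data)     # point estimate of the population mean
    s2 = variance(data)    # sample variance, divisor n - 1
    s = stdev(data)        # sample standard deviation, sqrt(s2)
    print(f"{name}: mean={x_bar:.2f}, variance={s2:.4f}, sd={s:.4f}")
```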



Chapter 4
Sampling distribution



4.1 Sample mean
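◼ Suppose $X_1, \dots, X_n$ is a random sample of size $n$ from a population with expectation $\mu$ and
variance $\sigma^2$.
◼ The sample mean is
$$\bar{X} = \sum_{i=1}^{n} \frac{X_i}{n}.$$
◼ The expectation of the sample mean
$$E(\bar{X}) = E\left(\sum_{i=1}^{n} \frac{X_i}{n}\right) = \frac{1}{n} \sum_{i=1}^{n} E(X_i) = \frac{1}{n}(\mu + \mu + \dots + \mu) = \mu$$
◼ The variance of the sample mean
$$\mathrm{Var}(\bar{X}) = \mathrm{Var}\left(\sum_{i=1}^{n} \frac{X_i}{n}\right) = \frac{1}{n^2} \sum_{i=1}^{n} \mathrm{Var}(X_i) = \frac{1}{n^2}(\sigma^2 + \sigma^2 + \dots + \sigma^2) = \frac{\sigma^2}{n}$$

◼ Question: what is the distribution of the sample mean $\bar{X}$?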



4.2 Sampling distribution of the sample mean from a
Normal distribution
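◼ Suppose $X_1, \dots, X_n$ is a random sample of size $n$ from a normal population
$N(\mu, \sigma^2)$.
◼ The sample mean $\bar{X} = \sum_{i=1}^{n} \frac{X_i}{n}$ is normally distributed with mean
$$\mu_{\bar{X}} = E(\bar{X}) = \mu$$
and variance
$$\sigma_{\bar{X}}^2 = \mathrm{Var}(\bar{X}) = \frac{\sigma^2}{n}$$
◼ We have
$$\bar{X} \sim N\left(\mu, \frac{\sigma^2}{n}\right)$$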
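
A small simulation can make this result concrete. The Python sketch below repeatedly draws normal samples and looks at the resulting sample means; the parameter values `mu = 50`, `sigma = 8`, `n = 16`, the number of repetitions and the seed are all assumptions chosen for illustration.

```python
import random
from statistics import fmean, variance

random.seed(220)  # fixed seed so the run is reproducible

mu, sigma, n = 50.0, 8.0, 16  # hypothetical population parameters and sample size
reps = 20_000                 # number of simulated samples

# One sample mean per simulated sample of size n.
sample_means = [
    fmean([random.gauss(mu, sigma) for _ in range(n)])
    for _ in range(reps)
]

print("average of sample means:", fmean(sample_means), "theory:", mu)
print("variance of sample means:", variance(sample_means), "theory:", sigma**2 / n)
```

The simulated variance should be close to sigma^2 / n, which is smaller than the population variance by a factor of n.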



4.2 Sampling distribution of the sample mean from a
Normal distribution
◼ Example: Suppose $X_1, \dots, X_n$ is a random sample of size $n$ from $N(\mu, \sigma^2)$. Determine
the sampling distribution of the sample mean for
1. $n = 4$
2. $n = 16$



4.3 Central Limit Theorem
◼ Let $X_1, \dots, X_n$ be a random sample of size $n$ from a non-normal population with
expectation $\mu$ and variance $\sigma^2$. Then the sample mean satisfies
$$\bar{X} \sim N\left(\mu, \frac{\sigma^2}{n}\right)$$
approximately, when $n$ is sufficiently large.
◼ Remark: $n > 30$ is sufficiently large for a good approximation, for most distributions.
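
The Central Limit Theorem can be explored by simulation. The following Python sketch draws samples from an Exp(0.25) population, which is clearly non-normal, and checks how the sample means behave; the sample size `n = 40`, the number of repetitions and the seed are assumptions for illustration.

```python
import random
from math import sqrt
from statistics import fmean, variance

random.seed(220)  # fixed seed so the run is reproducible

lam = 0.25                       # rate of the exponential population
mu, sigma2 = 1 / lam, 1 / lam**2  # population mean and variance of Exp(lam)
n = 40                           # sample size (assumed, above the n > 30 guideline)
reps = 20_000                    # number of simulated samples

sample_means = [
    fmean([random.expovariate(lam) for _ in range(n)])
    for _ in range(reps)
]

# If X-bar is roughly N(mu, sigma2 / n), about 95% of the sample means
# should fall within 1.96 standard errors of mu.
se = sqrt(sigma2 / n)
coverage = sum(abs(m - mu) < 1.96 * se for m in sample_means) / reps

print("average of sample means:", fmean(sample_means), "theory:", mu)
print("variance of sample means:", variance(sample_means), "theory:", sigma2 / n)
print("share within 1.96 standard errors:", coverage)
```

Rerunning with a smaller `n`, such as 2 or 5, shows how the normal approximation degrades for a skewed population.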



4.4 Interpretation on sampling distribution
◼ Sampling distribution of $\bar{X}$ from the $\mathrm{Exp}(0.25)$ distribution



4.4 Interpretation on sampling distribution
◼ Sampling distribution of $\bar{X}$ from the $\mathrm{Bernoulli}(0.2)$ distribution



4.5 Some common sampling distributions
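◼ Suppose $X_1, \dots, X_n$ and $Y_1, \dots, Y_m$ are two independent
random samples from $N(\mu, \sigma^2)$, then
1. $\bar{X} = \sum_{i=1}^{n} \dfrac{X_i}{n} \sim N\left(\mu, \dfrac{\sigma^2}{n}\right)$.
2. $\dfrac{(n - 1)}{\sigma^2} S_X^2 \sim \chi^2_{n-1}$.
3. $\dfrac{\bar{X} - \mu}{S_X / \sqrt{n}} \sim t_{n-1}$ and $\dfrac{\sqrt{n + m - 2}\,(\bar{X} - \bar{Y})}{\sqrt{\frac{1}{n} + \frac{1}{m}}\,\sqrt{(n - 1) S_X^2 + (m - 1) S_Y^2}} \sim t_{n+m-2}$.
4. $\dfrac{S_X^2}{S_Y^2} \sim F_{n-1, m-1}$.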
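
Result 2 can be checked informally by simulation, using the known facts that a chi-square distribution with k degrees of freedom has mean k and variance 2k. The Python sketch below is only an illustration; the values `sigma = 2`, `n = 5`, the number of repetitions and the seed are assumptions.

```python
import random
from statistics import fmean, variance

random.seed(220)  # fixed seed so the run is reproducible

mu, sigma, n = 0.0, 2.0, 5  # hypothetical normal population and sample size
reps = 20_000               # number of simulated samples

scaled_variances = []
for _ in range(reps):
    sample = [random.gauss(mu, sigma) for _ in range(n)]
    # (n - 1) * S^2 / sigma^2, which should follow chi-square with n - 1 df.
    scaled_variances.append((n - 1) * variance(sample) / sigma**2)

print("simulated mean:", fmean(scaled_variances), "chi-square mean:", n - 1)
print("simulated variance:", variance(scaled_variances), "chi-square variance:", 2 * (n - 1))
```

Matching the first two moments does not prove the distributional result, but a clear mismatch would indicate an error in the formula or the code.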



4.6 Example
The mean contents of bags of salt labelled as containing one kilogram is 1,005 g,
and the standard deviation of their contents is 3 g. Suppose a
random sample of five bags is taken.
1. What is the expected value of the total contents of a sample of five bags?
Determine the standard deviation of the total contents of a sample of five bags.

2. Find the expected value of the mean contents of a sample of five bags. What is
the variance of the mean contents of a sample of five bags?

3. Assume that the contents of the bags of salt are normally distributed. What is
the probability that the mean contents of a sample of five bags will be less than
1,002 g?
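
One possible way to work through the three parts of this example is sketched below in Python, using the sample total and sample mean results from Sections 4.1 and 4.2. The variable names are assumptions; the numbers come from the problem statement, and part 3 relies on the stated normality assumption.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 1005.0, 3.0, 5  # grams, from the problem statement

# Part 1: total contents T = X_1 + ... + X_n of n independent bags.
expected_total = n * mu      # E(T) = n * mu
sd_total = sqrt(n) * sigma   # SD(T) = sqrt(n * sigma^2)

# Part 2: mean contents X-bar.
expected_mean = mu           # E(X-bar) = mu
var_mean = sigma**2 / n      # Var(X-bar) = sigma^2 / n

# Part 3: X-bar ~ N(mu, sigma^2 / n) under normality, so P(X-bar < 1002)
# is a normal cdf evaluated at 1002.
z = (1002.0 - mu) / sqrt(var_mean)
prob_below_1002 = NormalDist().cdf(z)

print("E(T):", expected_total, "SD(T):", round(sd_total, 4))
print("E(X-bar):", expected_mean, "Var(X-bar):", var_mean)
print("z:", round(z, 4), "P(X-bar < 1002):", round(prob_below_1002, 4))
```

The same probability can be read from the standard normal table by looking up the z value and applying the symmetry property Φ(−z) = 1 − Φ(z).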



4.7 Exercise
A particular cargo lift is designed to transport
a maximum of 11,000 pounds. A load of cargo containing 40 boxes must be
transported via this cargo lift. Given that the weight of a box of this type of cargo
follows a probability distribution with mean 250 pounds and standard deviation 60
pounds, what is the probability that all 40 boxes can be loaded onto the cargo lift and
transported simultaneously?
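
One possible approach, sketched here in Python, treats the 40 box weights as an independent random sample and applies the Central Limit Theorem to their total (n = 40 exceeds the n > 30 guideline). Independence of the box weights is an assumption not stated in the exercise, and the result is an approximation.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 250.0, 60.0, 40  # pounds per box and number of boxes, from the exercise
capacity = 11_000.0             # maximum load of the lift in pounds

# Total weight T = X_1 + ... + X_n is approximately N(n * mu, n * sigma^2).
expected_total = n * mu
sd_total = sqrt(n) * sigma

# P(T <= capacity) via standardisation.
z = (capacity - expected_total) / sd_total
prob_within_capacity = NormalDist().cdf(z)

print("E(T):", expected_total, "SD(T):", round(sd_total, 2))
print("z:", round(z, 4), "P(T <= 11000):", round(prob_within_capacity, 4))
```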



Reading Material
◼ Study unit 1 of the Study Guide
◼ Sections 1.2 and 1.3 and Chapters 4, 5 and 6 of the e-textbook
◼ Visit https://www.r-project.org/ to browse R tutorials for
beginners.

