
Probability and Continuous Distributions: Normal and Student’s t Distributions

1) Normal distribution and its population parameters.
2) Testing for normality.
3) Standard normal distribution, its cumulative distribution function with probabilities, and its corresponding percentiles.
4) Student’s t distribution and its percentiles.

EPHD-310 Basic Biostat Dr. Jaffa 1


Lecture 2

Normal Distribution
• A normal distribution has a symmetrical, bell-shaped curve.

• The curve of the normal distribution is symmetrical about the population mean μ.

• A normal distribution is defined in terms of the population mean μ and population variance σ² and is referred to as the N(μ, σ²) distribution.

• μ and σ² are the two population parameters that define the normal distribution.

• The normal distribution is the most important distribution, since many common parametric hypothesis tests assume that the continuous variable being tested is normally distributed.
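As a rough illustration of the two parameters and of the symmetry about μ, the sketch below uses Python's standard-library `statistics.NormalDist`. The values μ = 50 and σ² = 49 are hypothetical and chosen only for the example. Note that `NormalDist` takes the standard deviation σ, not the variance, so the square root is taken first.

```python
from math import sqrt
from statistics import NormalDist

# Hypothetical parameters, chosen for illustration only.
mu = 50.0        # population mean
variance = 49.0  # population variance (sigma squared)

# NormalDist expects the standard deviation, so take the square root.
dist = NormalDist(mu=mu, sigma=sqrt(variance))

# Symmetry about the mean: the density is the same at mu - d and mu + d.
d = 10.0
left_density = dist.pdf(mu - d)
right_density = dist.pdf(mu + d)

# Half of the probability lies below the mean.
prob_below_mean = dist.cdf(mu)

print(left_density, right_density, prob_below_mean)
```

The two densities printed should match, and the probability below the mean should be one half, which is what symmetry about μ implies.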

Comparison of Different Normal Distributions with Same Variance but Different Means

[Figure: density curves of N(50, 7) and N(62, 7) over the range 20 to 100; the N(62, 7) curve is shifted to the right.]


Comparison of Different Normal Distributions with Same Means but Different Variances

[Figure: density curves of N(50, 5) and N(50, 10) over the range 20 to 100; the N(50, 10) curve is more spread out.]

Statistical Test of Normality

H0: Age is normally distributed

H1: Age is not normally distributed

Test of normality: if the P-value of the Shapiro-Wilk test of normality is > 0.05, we do not reject H0 and we assume normality.

The Shapiro-Wilk test of normality for the variable age in this example has a P-value of 0.959, which is > 0.05. This indicates that the data are consistent with age having a normal distribution.

Statistical Test of Normality

Graphically, the distribution of age looks normal: a symmetrical bell shape.

Statistical Test of Normality

The Q-Q plot shows that all the observations fall on the line of normality, so age is normally distributed.


Standard Normal Distribution

• A standard normal distribution is a special type of normal distribution that is characterized by population mean 0 and population variance 1, i.e., N(0,1).

• A table for the standard normal distribution is available; it shows different probabilities (areas under the normal curve) as well as the corresponding percentiles.

Cumulative Distribution Function for a Standard Normal Distribution

• The cumulative distribution function (cdf) for a standard normal distribution is the area under the curve to the left of x and is defined as phi of small x:
Φ(x) = Pr(X ≤ x) where X~N(0,1).

• This cdf is read as the probability that a random variable capital X is less than or equal to small x. Its value is obtained from the table of the standard normal distribution.

• The next slide presents the standard normal curve showing the cdf.

• Note that X~N(0,1) is read as “X is distributed as a standard normal distribution with mean = 0 and variance = 1”.
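For readers without the printed table, Φ(x) can also be evaluated numerically. The sketch below is one possible way, not the lecture's method. It uses the standard identity Φ(x) = ½(1 + erf(x/√2)) and cross-checks it against `statistics.NormalDist` from the Python standard library. The helper name `phi` is introduced here for illustration.

```python
from math import erf, sqrt
from statistics import NormalDist

def phi(x: float) -> float:
    """Standard normal cdf, Phi(x) = Pr(X <= x), via the error function."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

# Same distribution expressed through the standard library class.
standard_normal = NormalDist(mu=0.0, sigma=1.0)

# Print x, Phi(x) from the formula, and Phi(x) from NormalDist side by side.
for x in (-1.0, 0.0, 1.0):
    print(x, round(phi(x), 4), round(standard_normal.cdf(x), 4))
```

Both columns should agree, and the value at x = 0 is 0.5 because half of the area lies to the left of the mean.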


Cumulative Distribution Function for a Standard Normal Distribution

[Figure: standard normal density curve for x from -3 to 3, with the area to the left of x marked as Pr(X ≤ x) = Φ(x).]

Standard Normal Distribution

Symmetry properties of the standard normal distribution:


Φ(-x) = Pr(X ≤ -x) = Pr(X ≥ x) = 1 – Pr(X ≤ x) = 1 - Φ(x)

[Figure: standard normal density curve for x from -3 to 3, with the left tail Pr(X < -1) = Φ(-1) and the right tail Pr(X > 1) = 1 - Φ(1) marked.]


Standard normal table

Standard Normal Distribution
Examples: If X ~ N(0,1), use column a in the standard normal table to obtain the following probabilities.

Φ(1.96) = Pr(X ≤ 1.96) = 0.975 (column a from standard normal table)

Pr(X ≤ -1.96)
= Pr(X ≥ 1.96) (can be obtained from column b in the standard normal table)
= 1 – Pr(X ≤ 1.96) (or using column a from the normal table)
= 1 – Φ(1.96)
= 1 – 0.975
= 0.025

Φ(1) = Pr(X ≤ 1) = 0.8413 (column a from normal table)
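The table lookups above can be checked numerically. The following is a minimal sketch using `statistics.NormalDist` from the Python standard library in place of the printed table. The variable names are illustrative only.

```python
from statistics import NormalDist

z = NormalDist()  # mean 0 and standard deviation 1 by default

phi_196 = z.cdf(1.96)      # Pr(X <= 1.96)
lower_tail = z.cdf(-1.96)  # Pr(X <= -1.96)
phi_1 = z.cdf(1.0)         # Pr(X <= 1)

print(round(phi_196, 3))     # 0.975
print(round(lower_tail, 3))  # 0.025
print(round(phi_1, 4))       # 0.8413

# Symmetry: Pr(X <= -1.96) equals 1 - Pr(X <= 1.96) up to rounding error.
print(abs(lower_tail - (1.0 - phi_196)) < 1e-12)
```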


Standard Normal Distribution

Note that for any numbers a and b with a ≤ b,

Pr(a ≤ X ≤ b) = Pr(X ≤ b) – Pr(X ≤ a)
= Pr(X < b) – Pr(X < a)

(the second line holds because X is continuous, so Pr(X = a) = Pr(X = b) = 0).

Pr(-1 ≤ X ≤ 1.5) = Pr(X ≤ 1.5) – Pr(X ≤ -1)
= Pr(X ≤ 1.5) – Pr(X > 1)
= Pr(X ≤ 1.5) – (1 – Pr(X ≤ 1))
= Φ(1.5) – (1 – Φ(1))
= 0.9332 – (1 – 0.8413)
= 0.7745 (using column a from the standard normal table)

Note that Pr(X > 1) can also be obtained from column b of the standard normal table.
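The interval calculation can be reproduced in a few lines. This is a sketch only: the helper `prob_between` is a name introduced here, and it relies on `statistics.NormalDist` rather than the printed table.

```python
from statistics import NormalDist

z = NormalDist()  # standard normal, N(0,1)

def prob_between(a: float, b: float) -> float:
    """Pr(a <= X <= b) for X ~ N(0,1), assuming a <= b."""
    return z.cdf(b) - z.cdf(a)

p = prob_between(-1.0, 1.5)
print(round(p, 4))  # 0.7745
```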

Percentiles of a Standard Normal Distribution

• The pth percentile of a standard normal distribution is the value x, obtained from the standard normal table, such that p percent of the standard normal data are below x.

• To obtain x from the standard normal table, go to column “a”, look for the value p/100, and report the corresponding x.

• The pth percentile is denoted as z_{p/100}.


Percentiles of a Standard Normal Distribution

Example:
The 97.5th percentile of a standard normal distribution is denoted as z_{0.975}.

Hence, 97.5% of the standard normal data is less than z_{0.975}, i.e., 97.5% of the data is to the left of z_{0.975}.

Percentiles of a Standard Normal Distribution

Example: Compute the 97.5th percentile of a standard normal distribution.

Solution: The 97.5th percentile is z_{0.975}. Go to column a in the standard normal table, look for the value 0.975, and report the corresponding x, which is 1.96.

Hence the 97.5th percentile of a standard normal distribution = 1.96.
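Looking up a percentile is the inverse of reading the cdf. As a hedged sketch, the inverse cdf in Python's `statistics.NormalDist` plays the role of the reverse table lookup. The helper name `z_percentile` is introduced here for illustration.

```python
from statistics import NormalDist

z = NormalDist()  # standard normal, N(0,1)

def z_percentile(p: float) -> float:
    """Return the pth percentile: the x with Pr(X <= x) = p/100, for 0 < p < 100."""
    return z.inv_cdf(p / 100.0)

z_975 = z_percentile(97.5)
print(round(z_975, 2))  # 1.96

# Feeding the percentile back into the cdf recovers 0.975.
print(round(z.cdf(z_975), 3))  # 0.975
```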


Student’s t Distributions

Student’s t is a family of continuous distributions. Each member is determined by its degrees of freedom (df), which equal the sample size minus one, n – 1.

The Student’s t distribution is also referred to as the t distribution.

The shape of each t distribution is determined by its df. Hence each sample size corresponds to a df and to a particular shape that the t distribution will take.

Student’s t Distributions

The t distribution is a class of distributions, since each degree of freedom (df = n – 1) corresponds to a unique t distribution.

There is a different t distribution for each sample size. Thus, each t distribution is determined uniquely by its sample size and hence by its degrees of freedom (df).

The t distribution is used in place of the standard normal distribution when the population standard deviation σ is unknown and is estimated from the sample. The larger the sample size, the closer the t distribution is to the standard normal distribution.

The t distribution is symmetric about zero, like the standard normal distribution.
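To see the t distribution approach the standard normal as df grows, one can compare the two densities at zero. The sketch below assumes the usual textbook formula for the t density, which is not stated in these slides. It uses only the Python standard library, and `t_pdf` is a helper name introduced here.

```python
from math import exp, lgamma, log, pi
from statistics import NormalDist

def t_pdf(x: float, df: float) -> float:
    """Density of Student's t with df degrees of freedom (standard formula)."""
    # Work on the log scale so that large df does not overflow the gamma function.
    log_norm = lgamma((df + 1) / 2) - lgamma(df / 2) - 0.5 * log(df * pi)
    return exp(log_norm - (df + 1) / 2 * log(1 + x * x / df))

normal_peak = NormalDist().pdf(0.0)

# Gap between the normal and t densities at x = 0 for growing df.
gaps = [normal_peak - t_pdf(0.0, df) for df in (1, 2, 5, 10, 1000)]
print([round(g, 4) for g in gaps])

# Symmetry about zero: the density is equal at -1 and +1.
print(t_pdf(-1.0, 5) == t_pdf(1.0, 5))
```

The gaps shrink toward zero as df increases, which is the numerical counterpart of the statement that larger samples bring the t distribution closer to N(0,1).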

t distribution with different degrees of freedom

[Figure: density curves of t distributions with df = 1, 2, 5, 10 and infinity; as df → ∞ the t distribution approaches the standard normal.]

Percentiles of a t Distribution

The percentile of a t distribution is determined by the critical value denoted as t_{n-1, 1-α/2}.

The 1 - α/2 determines the percentile of this t distribution, which has n - 1 df.

The α is referred to as the level of significance in hypothesis testing (discussed in the next lecture).

The percentiles of a t distribution can be obtained from the t Table.


Table of percentiles for the t distributions

Percentiles of a t Distribution

What does, for example, t_{20, 0.95} mean?

t_{20, 0.95} is the 95th percentile, or the upper 5th percentile, of a t distribution with 20 df. Using the t Table, t_{20, 0.95} = 1.725.
Interpretation: 95% of the data from the t distribution with 20 df have values of 1.725 and below.

Percentiles of a t Distribution
Find the 95th percentile of a t distribution with 23 df.

To answer this question we need to find t_{23, 0.95} from the table of the t distribution.

The 95th percentile of a t distribution with 23 df is t_{23, 0.95} = 1.714.

Interpretation: 95% of the data in a t distribution with 23 df have values of 1.714 and below.
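The table values for the t percentiles can be approximated numerically. The Python standard library has no t distribution, so the sketch below takes a different route from the lecture's table lookup. It integrates the standard t density with Simpson's rule and inverts the cdf by bisection. The helper names `t_pdf`, `t_cdf` and `t_percentile`, the step count, and the search bracket of -100 to 100 are all choices made here for illustration.

```python
from math import exp, lgamma, log, pi

def t_pdf(x: float, df: float) -> float:
    """Density of Student's t with df degrees of freedom (standard formula)."""
    log_norm = lgamma((df + 1) / 2) - lgamma(df / 2) - 0.5 * log(df * pi)
    return exp(log_norm - (df + 1) / 2 * log(1 + x * x / df))

def t_cdf(x: float, df: float, steps: int = 2000) -> float:
    """Pr(T <= x) by Simpson's rule on [0, |x|], using symmetry about zero."""
    upper = abs(x)
    h = upper / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        # Simpson weights: 4 for odd interior points, 2 for even ones.
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    half_area = total * h / 3
    return 0.5 + half_area if x >= 0 else 0.5 - half_area

def t_percentile(p: float, df: float) -> float:
    """Value t with Pr(T <= t) = p, by bisection on an assumed bracket [-100, 100]."""
    lo, hi = -100.0, 100.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

t20 = t_percentile(0.95, 20)
t23 = t_percentile(0.95, 23)
print(round(t20, 3), round(t23, 3))  # 1.725 1.714
```

The printed values agree with the t Table entries quoted for 20 and 23 df. The fixed bracket means this sketch is only suitable for percentiles whose value lies between -100 and 100.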

EPHD310 Basic Biostatistics: Course Learning Outcomes (LOs) per FHS Catalogue

• Describe commonly used statistical probability distributions.

• Interpret results of statistical analyses found in public health studies and biomedical sciences.

• Apply ethical principles to data management and analysis.
