Normal Distribution

Objectives
The student will be able to:
• identify properties of the normal distribution
• apply mean, standard deviation, and z-scores to the normal distribution graph
• determine probabilities based on z-scores
Theoretical Distribution
• Empirical distributions
• based on data
• Theoretical distribution
• based on mathematics
• derived from model or estimated from data
Probability
Distributions
Normal Distribution
Why are normal distributions so important?
• Many dependent variables are commonly assumed to be normally
distributed in the population
• If a variable is approximately normally distributed we can make
inferences about values of that variable
• Example: Sampling distribution of the mean
Normal Distribution Curve

A normal distribution curve is a symmetrical, bell-shaped curve defined by the mean and standard deviation of a data set.

The normal curve is a probability distribution with a total area under the curve of 1.
Normal Distribution
• Symmetrical, bell-shaped curve
• Also known as Gaussian distribution
• Point of inflection = 1 standard deviation from
mean
• Mathematical formula

$f(X) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{(X-\mu)^2}{2\sigma^2}}$
Normal Probability Distribution
• Can take on an infinite
number of possible
values.
• The probability of any
one of those values
occurring is essentially
zero.
• Curve has area or
probability = 1
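The density formula for the normal curve, and the claim that the total area under it equals 1, can be checked numerically. The sketch below is only an illustration: the names `normal_pdf` and `area_under_curve`, the midpoint-rule settings, and the example mean and standard deviation are assumptions, not part of the original slides.

```python
import math


def normal_pdf(x, mu=0.0, sigma=1.0):
    """Density of the normal distribution with mean mu and standard deviation sigma."""
    coeff = 1.0 / (sigma * math.sqrt(2.0 * math.pi))
    exponent = -((x - mu) ** 2) / (2.0 * sigma ** 2)
    return coeff * math.exp(exponent)


def area_under_curve(mu, sigma, n_steps=20000, width=8.0):
    """Midpoint-rule area between mu - width*sigma and mu + width*sigma."""
    lo = mu - width * sigma
    step = (2.0 * width * sigma) / n_steps
    heights = (normal_pdf(lo + (i + 0.5) * step, mu, sigma) for i in range(n_steps))
    return sum(heights) * step


# Example values chosen for illustration only.
total_area = area_under_curve(mu=500.0, sigma=100.0)
peak = normal_pdf(0.0)  # height of the standard normal curve at its mean

print(f"total area ~ {total_area:.6f}")
print(f"peak height of the standard normal ~ {peak:.4f}")
```

The area stays close to 1 for any choice of mean and standard deviation, which is why areas under the curve can be read as probabilities.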
Normal Distribution
• The standard normal distribution will allow us to make claims about
the probabilities of values related to our own data
• How do we apply the standard normal distribution to our data?
Translation to the Standardized Normal Distribution

• Translate from X to the standardized normal (the “Z” distribution) by subtracting the mean of X and dividing by its standard deviation:

$Z = \frac{X - \mu}{\sigma}$

Z always has mean = 0 and standard deviation = 1

Statistics for Managers Using Microsoft Excel, 4e © 2004 Prentice-Hall, Inc. Chap 6-14
z-score formula

$z = \frac{x - \mu}{\sigma}$

Where x represents an element of the data set, the mean is represented by μ and the standard deviation by σ.
z-scores

A z-score reflects how many standard deviations above or below the mean a raw score is.

The z-score is positive if the data value lies above the mean and negative if the data value lies below the mean.
z-scores

When a set of data values is normally distributed, we can standardize each score by converting it into a z-score.

z-scores make it easier to compare data values measured on different scales.
Important z-score info
• Z-score tells us how far above or below the mean a value is in terms of
standard deviations
• It is a linear transformation of the original scores
• X is multiplied (or divided) by a constant and/or has a constant added to (or subtracted from) it
• Relationship of the observations to each other remains the same
Z = (X - μ)/σ
then
X = σZ + μ
[equation of the general form Y = mX + c]
Analyzing the data

Suppose SAT scores among college students are normally distributed with a mean of 500 and a standard deviation of 100. If a student scores a 700, what would be her z-score?

$z = \frac{700 - 500}{100} = 2$

Her z-score would be 2, which means her score is two standard deviations above the mean.
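The SAT calculation, and the reverse step X = σZ + μ, can be sketched in a few lines of Python. The function names `z_score` and `raw_score` are illustrative choices rather than anything defined in the slides.

```python
def z_score(x, mean, sd):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    if sd <= 0:
        raise ValueError("standard deviation must be positive")
    return (x - mean) / sd


def raw_score(z, mean, sd):
    """Inverse transformation: recover the raw score from a z-score."""
    return mean + z * sd


sat_z = z_score(700, mean=500, sd=100)         # SAT example from the slide
sat_back = raw_score(sat_z, mean=500, sd=100)  # round trip back to the raw score

print(sat_z, sat_back)
```

Because the transformation is linear, converting to z and back returns the original score.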
Analyzing the data
A set of math test scores has a mean of 70 and a standard deviation of 8.
A set of English test scores has a mean of 74 and a standard deviation of 16.
For which test would a score of 78 have a higher standing?

To solve: Find the z-score for each test.

math z-score = $\frac{78 - 70}{8} = 1$

English z-score = $\frac{78 - 74}{16} = .25$

The math score would have the higher standing since it is 1 standard deviation above the mean while the English score is only .25 standard deviation above the mean.
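The same comparison can be written as a short script. This is a minimal sketch; the dictionary layout and variable names are assumptions made for the illustration.

```python
def z_score(x, mean, sd):
    """Standardize x against a distribution with the given mean and standard deviation."""
    return (x - mean) / sd


# (mean, standard deviation) for each test, taken from the slide.
tests = {"math": (70, 8), "English": (74, 16)}

# The same raw score of 78 is standardized against each test's distribution.
standings = {name: z_score(78, mean, sd) for name, (mean, sd) in tests.items()}
best = max(standings, key=standings.get)

for name, z in standings.items():
    print(f"{name}: z = {z:.2f}")
print("higher standing:", best)
```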
Probabilities and z scores: z tables
• Total area = 1
• Probability corresponds to area, so only an interval (a width along the z axis) has a nonzero probability
• With an infinite number of possible z scores, any single point has a probability of 0
• Tables typically do not report negative z values
• The curve is symmetrical, so the area below a negative value = the area above its positive value
• Always helps to draw a sketch!
One standard deviation away from the mean (μ) in either direction on the horizontal axis accounts for around 68 percent of the data. Two standard deviations away from the mean accounts for roughly 95 percent of the data, with three standard deviations representing about 99.7 percent of the data.
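These percentages can be reproduced with the `statistics.NormalDist` class from the Python standard library (Python 3.8 or later). The helper name `within` is an illustrative assumption.

```python
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, standard deviation 1


def within(k):
    """Probability that a normal value lies within k standard deviations of the mean."""
    return std_normal.cdf(k) - std_normal.cdf(-k)


coverage = {k: within(k) for k in (1, 2, 3)}

for k, p in coverage.items():
    print(f"within {k} SD: {100 * p:.2f}%")
```

The results do not depend on the particular mean or standard deviation, because every normal distribution standardizes to the same Z distribution.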
Key Areas under the Curve

• For normal distributions
± 1 SD ~ 68%
± 2 SD ~ 95%
± 3 SD ~ 99.7%
Probabilities are depicted by areas under the curve

• Total area under the curve is 1
• The area in red is equal to p(z > 1)
• The area in blue is equal to p(-1 < z < 0)
• Since the properties of the normal distribution are known, areas can be looked up in tables or calculated by computer.
Finding Normal Probabilities

• Suppose X is normal with mean 8.0 and standard deviation 5.0
• Find P(X < 8.6)

$Z = \frac{X - \mu}{\sigma} = \frac{8.6 - 8.0}{5.0} = 0.12$

[Figure: the X curve (μ = 8, σ = 5) shaded below 8.6, beside the Z curve (μ = 0, σ = 1) shaded below 0.12]

P(X < 8.6) = P(Z < 0.12)
Solution: Finding P(Z < 0.12)

Standardized Normal Probability Table (Portion)

Z     .00     .01     .02
0.0   .5000   .5040   .5080
0.1   .5398   .5438   .5478
0.2   .5793   .5832   .5871
0.3   .6179   .6217   .6255

P(X < 8.6) = P(Z < 0.12) = .5478
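The table lookup for P(Z < 0.12) can be cross-checked in software. The sketch below uses `statistics.NormalDist` from the Python standard library; the variable names are illustrative.

```python
from statistics import NormalDist

x_dist = NormalDist(mu=8.0, sigma=5.0)  # the X distribution from the example

# Route 1: ask for the cumulative probability of X directly.
p_direct = x_dist.cdf(8.6)

# Route 2: standardize first, then use the standard normal (the "Z table").
z = (8.6 - 8.0) / 5.0
p_via_z = NormalDist().cdf(z)

print(f"z = {z:.2f}")
print(f"P(X < 8.6) = {p_direct:.4f}")
print(f"P(Z < z)   = {p_via_z:.4f}")
```

Both routes give the same probability, which is the point of standardizing: one table serves every normal distribution.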
Upper Tail Probabilities

• Suppose X is normal with mean 8.0 and standard deviation 5.0.
• Now find P(X > 8.6)

P(X > 8.6) = P(Z > 0.12) = 1.0 - P(Z ≤ 0.12)
= 1.0 - .5478 = .4522
Probability Between Two Values

• Suppose X is normal with mean 8.0 and standard deviation 5.0. Find P(8 < X < 8.6)

Calculate Z-values:

$Z = \frac{X - \mu}{\sigma} = \frac{8 - 8}{5} = 0$

$Z = \frac{X - \mu}{\sigma} = \frac{8.6 - 8}{5} = 0.12$

P(8 < X < 8.6) = P(0 < Z < 0.12)
Solution: Finding P(0 < Z < 0.12)

Standardized Normal Probability Table (Portion)

Z     .00     .01     .02
0.0   .5000   .5040   .5080
0.1   .5398   .5438   .5478
0.2   .5793   .5832   .5871
0.3   .6179   .6217   .6255

P(8 < X < 8.6)
= P(0 < Z < 0.12)
= P(Z < 0.12) - P(Z ≤ 0)
= .5478 - .5000 = .0478
Probabilities in the Lower Tail

• Suppose X is normal with mean 8.0 and standard deviation 5.0.
• Now find P(7.4 < X < 8)

P(7.4 < X < 8)
= P(-0.12 < Z < 0)
= P(Z < 0) - P(Z ≤ -0.12)
= .5000 - .4522 = .0478

The Normal distribution is symmetric, so this probability is the same as P(0 < Z < 0.12).
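The upper-tail, between-two-values, and lower-side calculations for this example all come from differences of cumulative probabilities. A minimal sketch with `statistics.NormalDist` (variable names are illustrative):

```python
from statistics import NormalDist

x_dist = NormalDist(mu=8.0, sigma=5.0)

# Upper tail: everything that is not below 8.6.
p_upper = 1.0 - x_dist.cdf(8.6)

# Between two values: subtract the smaller cumulative area from the larger.
p_between = x_dist.cdf(8.6) - x_dist.cdf(8.0)

# Mirror-image interval just below the mean.
p_below_mean = x_dist.cdf(8.0) - x_dist.cdf(7.4)

print(f"P(X > 8.6)     = {p_upper:.4f}")
print(f"P(8 < X < 8.6) = {p_between:.4f}")
print(f"P(7.4 < X < 8) = {p_below_mean:.4f}")
```

The last two values agree because the curve is symmetric about its mean.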
Suppose Z has a standard normal distribution.
• Find p(0 < Z < 1.23)
• Find p(-1.57 < Z < 0)
• Find p(Z > .78)
• Calculate p(-1.2 < Z < .78)
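One way to check answers to these practice problems is to compute them rather than read them from a table. The helper `prob_between` and the variable names below are assumptions made for this sketch.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal: mean 0, standard deviation 1


def prob_between(lo, hi):
    """P(lo < Z < hi) as a difference of cumulative probabilities."""
    return Z.cdf(hi) - Z.cdf(lo)


p_a = prob_between(0.0, 1.23)    # p(0 < Z < 1.23)
p_b = prob_between(-1.57, 0.0)   # p(-1.57 < Z < 0)
p_c = 1.0 - Z.cdf(0.78)          # p(Z > .78)
p_d = prob_between(-1.2, 0.78)   # p(-1.2 < Z < .78)

for label, p in [("a", p_a), ("b", p_b), ("c", p_c), ("d", p_d)]:
    print(f"({label}) {p:.4f}")
```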
Example: IQ
• A common example is IQ
• IQ scores are theoretically normally distributed.
• Mean of 100
• Standard deviation of 15
IQs are normally distributed with mean 100 and standard deviation 15. Find the probability that a randomly selected person has an IQ between 100 and 115.

$P(100 \le X \le 115)$
$= P(100 - 100 \le X - 100 \le 115 - 100)$
$= P\left(\frac{100 - 100}{15} \le \frac{X - 100}{15} \le \frac{115 - 100}{15}\right)$
$= P(0 \le Z \le 1) = .3413$
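The value .3413 comes from the standard normal cumulative probability, which can be written in terms of the error function. The sketch below uses `math.erf` from the Python standard library; the function name `phi` is an illustrative choice.

```python
import math


def phi(z):
    """Standard normal cumulative probability P(Z <= z), via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


MEAN_IQ, SD_IQ = 100.0, 15.0

# Standardize both ends of the interval.
z_low = (100.0 - MEAN_IQ) / SD_IQ
z_high = (115.0 - MEAN_IQ) / SD_IQ

p_iq = phi(z_high) - phi(z_low)

print(f"P({z_low:.0f} < Z < {z_high:.0f}) = {p_iq:.4f}")
```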
Finding the X value for a Known Probability

• Steps to find the X value for a known probability:
1. Find the Z value for the known probability
2. Convert to X units using the formula:

$X = \mu + Z\sigma$
Finding the X value for a Known Probability (continued)

Example:
• Suppose X is normal with mean 8.0 and standard deviation 5.0.
• Now find the X value so that only 20% of all values are below this X

[Figure: normal curve with a lower-tail area of .2000 shaded; the cut-off on the X and Z axes is the unknown]
Find the Z value for 20% in the Lower Tail

1. Find the Z value for the known probability

Standardized Normal Probability Table (Portion)

Z      …   .03     .04     .05
-0.9   …   .1762   .1736   .1711
-0.8   …   .2033   .2005   .1977
-0.7   …   .2327   .2296   .2266

• 20% area in the lower tail is consistent with a Z value of -0.84
Finding the X value

2. Convert to X units using the formula:

$X = \mu + Z\sigma = 8.0 + (-0.84)(5.0) = 3.80$

So 20% of the values from a distribution with mean 8.0 and standard deviation 5.0 are less than 3.80
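The two steps (find Z for the known probability, then convert to X units) correspond to an inverse cumulative lookup. A minimal sketch using `NormalDist.inv_cdf` from the Python standard library follows; variable names are illustrative.

```python
from statistics import NormalDist

MU, SIGMA = 8.0, 5.0

# Step 1: Z value with 20% of the area below it.
z_20 = NormalDist().inv_cdf(0.20)

# Step 2: convert to X units with X = mu + Z * sigma.
x_20 = MU + z_20 * SIGMA

# The same cut-off obtained in one step from the X distribution itself.
x_direct = NormalDist(mu=MU, sigma=SIGMA).inv_cdf(0.20)

print(f"z = {z_20:.4f}")
print(f"x = {x_20:.2f}")
```

The computed cut-off is slightly below the table-based 3.80 because the table rounds Z to -0.84.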
Suppose GRE scores are normally distributed with mean 500 and standard deviation 100. Find the probability that a randomly selected GRE score is greater than 620.

• We want the probability of getting a score of 620 or beyond.

$z = \frac{620 - 500}{100} = 1.2$

• p(z > 1.2)
• Result: The probability of randomly getting a score of 620 or greater is ~.12
Wrap up
• Assuming our data are normally distributed lets us use the properties of the normal distribution to assess the likelihood of some outcome
• This gives us a means of judging whether one hypothesis is more plausible than another (even if we don’t get a direct likelihood of either hypothesis)
The Normal Curve
• A mathematical model, or an idealized conception, of the form a distribution might have taken under certain circumstances.
• The mean of a sufficiently large random sample from almost any distribution is approximately normally distributed (Central Limit Theorem)
• Many observations (height of adults, weight of children in California, intelligence) have approximately Normal distributions
• Shape
• Bell shaped graph, most of data in
middle
• Symmetric, with mean, median and
mode at same point
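The Central Limit Theorem point can be illustrated with a small simulation: means of samples drawn from a clearly non-normal distribution still pile up in a bell shape. Everything specific below (the exponential distribution, the sample size of 50, 2000 repetitions, the seed) is an assumption chosen for the sketch.

```python
import math
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

SAMPLE_SIZE = 50   # observations per sample
N_SAMPLES = 2000   # number of sample means to collect

# Exponential data with rate 1 are strongly skewed; their mean and SD are both 1.
sample_means = [
    statistics.fmean([random.expovariate(1.0) for _ in range(SAMPLE_SIZE)])
    for _ in range(N_SAMPLES)
]

# Theory: sample means have mean 1 and standard error 1 / sqrt(SAMPLE_SIZE).
expected_se = 1.0 / math.sqrt(SAMPLE_SIZE)
grand_mean = statistics.fmean(sample_means)
within_one_se = sum(abs(m - 1.0) <= expected_se for m in sample_means) / N_SAMPLES

print(f"mean of sample means: {grand_mean:.3f}")
print(f"share within one standard error: {within_one_se:.3f}")
```

If the sample means are roughly normal, the share within one standard error should land near the 68% expected for a normal curve.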

Percent of Values Within One Standard Deviation

68.26% of Cases

Percent of Values Within Two Standard Deviations

95.44% of Cases

Percent of Values Within Three Standard Deviations

99.73% of Cases

Percent of Values Greater than 1 Standard Deviation

Percent of Values Greater than -2 Standard Deviations

Percent of Values Greater than +2 Standard Deviations

Data in Normal Distribution

$\bar{X} \pm 1S$ contains about 68% of the scores
$\bar{X} \pm 2S$ contains about 95% of the scores
$\bar{X} \pm 3S$ contains about 99.7% of the scores
Properties Of Normal Curve
• Normal curves are symmetrical.
• Normal curves are unimodal.
• Normal curves have a bell-shaped form.
• Mean, median, and mode all have the same value.

Standard Scores
• One use of the normal curve is to explore Standard Scores. Standard
Scores are expressed in standard deviation units, making it much
easier to compare variables measured on different scales.
• There are many kinds of Standard Scores. The most common standard score is the ‘z’ score.
• A ‘z’ score states the number of standard deviations by which the original score lies above or below the mean of a normal curve.
The Z Score
• The normal curve is not a single curve but a family of curves, each of
which is determined by its mean and standard deviation.
• In order to work with a variety of normal curves, we cannot have a
table for every possible combination of means and standard
deviations.

The Z Score
• What we need is a standardized normal curve which can be used for any normally distributed variable. Such a curve is called the Standard Normal Curve.

$z = \frac{X_i - \bar{X}}{S}$
The Standard Normal Curve
• The Standard Normal Curve (z distribution) is the distribution of
normally distributed standard scores with mean equal to zero and a
standard deviation of one.
• A z score is nothing more than a figure, which represents how many
standard deviation units a raw score is away from the mean.

Example Z Score
• For scores above the mean, the z score has a positive sign. Example: z = +1.5.
• Below the mean, the z score has a minus sign. Example: z = -0.5.
• Calculate the z score for a blood pressure of 140 if the sample mean is 110 and the standard deviation is 10
• Z = (140 – 110) / 10 = 3
Comparing Scores from Different
Distributions
• Interpreting a raw score requires additional information about the
entire distribution. In most situations, we need some idea about the
mean score and an indication of how much the scores vary.
• For example, assume that an individual took two tests in reading and
mathematics. The reading score was 32 and mathematics was 48. Is it
correct to say that performance in mathematics was better than in
reading?

Z Scores Help in Comparisons
• Not without additional information. One method to interpret the raw
score is to transform it to a z score.
• The advantage of the z score transformation is that it takes into
account both the mean value and the variability in a set of raw scores.

Did Sara improve?
• Score in pretest was 18 and post test was
42
• Sara’s score did increase. From 18 to 42.
• But her relative position in the Class
decreased.
                     Pretest   Post test
Observation          18        42
Mean                 17        49
Standard deviation   3         49
Z score              0.33      -0.14
Area When Score is Known
• For a normal distribution with mean of 100 and standard deviation of
20, what proportion of cases fall below 80?
• ~16%

Score When Area Is Known
• For a normal distribution with mean of 100 and standard deviation of
20, find the score that separates the upper 20% of the cases from the
lower 80%
• Answer = 116.8
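Both directions (area when the score is known, score when the area is known) for a normal distribution with mean 100 and standard deviation 20 can be sketched with `statistics.NormalDist`. The variable names are illustrative.

```python
from statistics import NormalDist

scores = NormalDist(mu=100.0, sigma=20.0)

# Area when the score is known: proportion of cases below 80 (z = -1).
p_below_80 = scores.cdf(80.0)

# Score when the area is known: the cut-off with 80% of cases below it,
# which separates the upper 20% from the lower 80%.
cutoff_top_20 = scores.inv_cdf(0.80)

print(f"proportion below 80: {p_below_80:.4f}")
print(f"cut-off for the upper 20%: {cutoff_top_20:.1f}")
```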

Transforming Standard Scores
• Sometimes it is more convenient to work with
standard scores that do not have negative
numbers or decimals.
• Standard scores can be transformed to have any
desired mean and standard deviation.
• SAT and GRE are transformed scores (similar to z) with a mean of 500 and an SD of 100
• (z x 100) + 500
• Widely used cognitive and personality tests (such as the Wechsler IQ test) are standardized to have a mean of 100 and an SD of 15
• (z x 15) + 100
Transforming a raw score of 12 on
Behavioral Problem Index
• Age 5: Mean: 10.0 SD: 2.0
• Age 6: Mean: 12.0 SD: 3.0
• Age 7: Mean: 14.0 SD: 3.0
• Age 5: Z = (12-10) / 2 = 1.0
• Age 6: Z = (12-12) / 3 = 0.0
• Age 7:Z = (12-14) / 3 = -0.67
• Age 5: Standard score (mean 100, SD 15) = (1.0 X 15) + 100 = 115
• Age 6: Standard score (mean 100, SD 15) = (0.0 X 15) + 100 = 100
• Age 7: Standard score (mean 100, SD 15) = (-0.67 X 15) + 100 = 90 (rounded from 89.95)
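The transformation used for the Behavioral Problem Index (standardize against the age group, then rescale to mean 100 and SD 15) can be sketched as one function. The name `rescale` and the dictionary of age norms are illustrative assumptions built from the figures on the slide.

```python
def rescale(raw, mean, sd, new_mean=100.0, new_sd=15.0):
    """Convert a raw score to a z-score, then to a scale with new_mean and new_sd."""
    z = (raw - mean) / sd
    return new_mean + z * new_sd


# (mean, SD) of the Behavioral Problem Index by age, from the slide.
norms = {5: (10.0, 2.0), 6: (12.0, 3.0), 7: (14.0, 3.0)}

# The same raw score of 12 means something different at each age.
standard_scores = {age: rescale(12.0, mean, sd) for age, (mean, sd) in norms.items()}

for age, score in standard_scores.items():
    print(f"age {age}: standard score = {score:.0f}")
```

Passing a different `new_mean` and `new_sd` gives other reporting scales built the same way.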

Other Standard Scores
• A T score is created from a z score simply by multiplying each
standard deviation unit by 10 to get rid of the decimals, and then
adding 50 to each of these scores to get rid of the negatives.
• Now the mean becomes 50 ([10*0] + 50 = 50).
• Plus 1 z becomes 60 ([10*1] + 50 = 60).

Multiple Transformation of Data

The Normal Curve & Probability
• The normal curve also is central to many aspects of inferential
statistics. This is because the normal curve can be used to answer
questions concerning the probability of events.
• For example, by knowing that 16% of adults have a Wechsler IQ
greater than 115 (z = +1.00), one can state the probability(p) of
randomly selecting from the adult population a person whose IQ is
greater than 115.
• You are correct if you suspect that P is .16.

Data on the IQ Scores of 1000 Sixth Grade Children

The Normal Curve & Probability
• The mean of the distribution is 100 and the SD is 15
• What is the probability that a randomly selected student from this
population would have an IQ score of 115 or greater?
• Approximately .16
• 16 percent of the total area under the curve in the distribution
