
Data distribution

• The three basics of statistics are:


1. Variability: making sense from variation
2. Inference: making generalizations
3. Probability: making proportion and chance
Standard Normal Distribution
[Figure: standard normal distribution with the central 95% of the area marked; 2.5% of the probability lies in each tail.]
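The areas marked in the figure can be checked numerically. The following is a minimal sketch using `statistics.NormalDist` from Python's standard library; the variable names are illustrative and do not come from the slides.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1.
std_normal = NormalDist(mu=0.0, sigma=1.0)

# Cut-off that leaves 2.5% of the probability in the upper tail.
upper_cutoff = std_normal.inv_cdf(0.975)
lower_cutoff = -upper_cutoff  # by symmetry, 2.5% lies below this value

# Probability between the two cut-offs (the central area).
central_area = std_normal.cdf(upper_cutoff) - std_normal.cdf(lower_cutoff)

print(f"cut-offs: {lower_cutoff:.2f} and {upper_cutoff:.2f}")
print(f"central area: {central_area:.3f}")
```

The cut-offs come out close to -1.96 and +1.96, so about 95% of the probability lies within roughly two standard deviations of the mean.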

Example:
• Suppose the mean height of a normally distributed
population is 10 inches
• For an observed value of 20 inches and a standard
deviation of 5 inches,
• $Z = \frac{20 - 10}{5} = 2$
• This means that the observation of 20 inches is
located 2 SDs above the mean
• Example: Male systolic blood pressure (SBP),
mean = 125 mmHg, SD = 14 mmHg
if SBP = 167 mmHg: $Z = \frac{167 - 125}{14} = 3.0$
if SBP = 83 mmHg: $Z = \frac{83 - 125}{14} = -3.0$
if SBP = 97 mmHg: $Z = \frac{97 - 125}{14} = -2.0$
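The Z score arithmetic in the examples above can be sketched as a small function. The name `z_score` and the constants are hypothetical, introduced only for illustration.

```python
def z_score(observed, mean, sd):
    """Return how many standard deviations `observed` lies from `mean`."""
    if sd <= 0:
        raise ValueError("standard deviation must be positive")
    return (observed - mean) / sd


MEAN_SBP = 125  # mmHg, from the blood pressure example
SD_SBP = 14     # mmHg, from the blood pressure example

for sbp in (167, 83, 97):
    print(f"SBP = {sbp} mmHg -> Z = {z_score(sbp, MEAN_SBP, SD_SBP):+.1f}")
```

A positive result means the observation lies above the mean, and a negative result means it lies below.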

What is the Usefulness of a Standard Normal
Score Z?
• It tells you how many SDs (s) an observation
lies from the mean
• Thus, it is a way of quickly assessing how
“abnormal” an observation is.
• Example: Suppose the mean SBP is 125
mmHg, and the standard deviation is 14 mmHg
– Is 167 mmHg an unusually high measure?
– Is 83 mmHg an unusually low measure?
– If we know Z = 3.0 or -3.0, does that help us?
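One way to answer these questions is to look at how much probability lies beyond the observed value. The sketch below assumes that SBP follows a normal distribution with the stated mean and SD; the variable names are illustrative.

```python
from statistics import NormalDist

# Assumption: SBP is normally distributed with mean 125 and SD 14 mmHg.
sbp = NormalDist(mu=125, sigma=14)

p_at_least_167 = 1 - sbp.cdf(167)  # upper tail beyond Z = +3.0
p_at_most_83 = sbp.cdf(83)         # lower tail below Z = -3.0
p_at_most_97 = sbp.cdf(97)         # lower tail below Z = -2.0

print(f"P(SBP >= 167) = {p_at_least_167:.4f}")
print(f"P(SBP <= 83)  = {p_at_most_83:.4f}")
print(f"P(SBP <= 97)  = {p_at_most_97:.4f}")
```

Under this assumption, values 3 SDs from the mean occur in roughly 0.1% of people on each side, while a value 2 SDs below the mean occurs in about 2.3%, so 167 and 83 mmHg are both quite unusual.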
Probability
• Proportion – the relative size of the portion of
a population with a certain characteristic.
• Random selection – a selection where each
person has an equal chance of being selected.
• The chance is measured by the proportion, a
number between 0 and 1, called the
probability.

• It is also of interest to investigate how the
information contained in a sample can be used
to infer the characteristics of the population
from which it was drawn.

• The foundation for statistical inference is the
theory of probability.

Proportion vs. Probability
• Proportion measures size in descriptive statistics
• Probability measures chance
• Probability is always represented by a fraction in which
the numerator is part of the denominator
• It can never be greater than one or less than zero

• Example: A box with 100 balls (90 of them red,
10 of them black)
– If you see the contents of the box you would say
90% of the balls are red.
– Without seeing the balls, you would say there is a
90% chance of selecting a red one or 9/10.
Probability distribution
• Suppose we toss 10 coins one million times
• Each time, record the number of heads and tails
• Plot the frequency distribution curve of the results
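The coin-tossing experiment can be simulated. This is a rough sketch, assuming fair coins and using 100,000 repetitions instead of one million to keep the run short; the function name `heads_distribution` and the seed are hypothetical choices.

```python
import random
from collections import Counter


def heads_distribution(n_coins=10, n_trials=100_000, seed=0):
    """Toss `n_coins` fair coins `n_trials` times and count how often
    each number of heads occurs."""
    rng = random.Random(seed)
    counts = Counter()
    for _ in range(n_trials):
        heads = sum(rng.random() < 0.5 for _ in range(n_coins))
        counts[heads] += 1
    return counts


counts = heads_distribution()
total = sum(counts.values())

# Text histogram: one '#' per percentage point of the trials.
for heads in range(11):
    share = counts[heads] / total
    print(f"{heads:2d} heads: {share:.3f} {'#' * round(share * 100)}")
```

The printed histogram peaks at 5 heads and falls off symmetrically on both sides, which is the shape of the frequency distribution curve the slide describes.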
Probability and Random Sampling
• Suppose that out of N = 100,000 persons a total of
5,500 have a positive test for AIDS
– “The probability that a randomly selected person
from the target population has a positive test
for AIDS is 0.055, or 5.5%”
• Rationale: on an initial draw the person may or may
not have a positive test. However when this process
is repeated over and over again a large number of
times, the relative frequency of positive people will
approximate 0.055.
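The rationale above can be illustrated with a short simulation. It is only a sketch: each draw is modelled as an independent event with probability 0.055 (sampling with replacement), and the function name `relative_frequency` and the seed are hypothetical.

```python
import random


def relative_frequency(p_positive=0.055, n_draws=100_000, seed=1):
    """Draw `n_draws` people at random and return the share with a
    positive test. Assumes independent draws with replacement."""
    rng = random.Random(seed)
    positives = sum(rng.random() < p_positive for _ in range(n_draws))
    return positives / n_draws


for n in (10, 1_000, 100_000):
    freq = relative_frequency(n_draws=n)
    print(f"{n:>7} draws: relative frequency = {freq:.4f}")
```

With only a few draws the relative frequency can be far from 0.055, but it settles close to that value as the number of draws grows.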

Normal distribution in binomial population
• How can the normal distribution be used to
evaluate large binomial samples?
• Assume a very large jar filled with red and
blue balls (an infinite population), known to contain 51%
red balls and 49% blue
• Suppose we take several samples from the jar
(n samples = 1795)
• Assume the first sample has 10% red balls and 90%
blue
(p = 0.1 and q = 0.9, where q = 1 - p)
• In the 2nd sample, 20% red balls and 80% blue
(0.2 and 0.8), and so on until you have drawn 1795 samples
of red balls and zero yellow
• The values from many samples, each occurring with a
different frequency, give rise to a normal curve
• In a large sample, the binomial distribution can be
approximated by the continuous normal
distribution.
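The idea that repeated samples produce a normal-shaped curve can be sketched with a simulation. Two assumptions are made here that the slides do not settle: 1795 is treated as the number of balls in each sample, and 1,000 repeated samples are drawn to keep the run short. All names are illustrative.

```python
import random
from statistics import mean, stdev

P_RED = 0.51        # population proportion of red balls (from the slide)
SAMPLE_SIZE = 1795  # assumption: 1795 is read as balls per sample
N_SAMPLES = 1000    # assumption: number of repeated samples, chosen for speed


def sample_proportion(rng, n, p):
    """Draw n balls from an effectively infinite jar and return the
    share that is red."""
    return sum(rng.random() < p for _ in range(n)) / n


rng = random.Random(42)
proportions = [sample_proportion(rng, SAMPLE_SIZE, P_RED) for _ in range(N_SAMPLES)]

# Theoretical standard error of a proportion: sqrt(p * q / n), with q = 1 - p.
theoretical_se = (P_RED * (1 - P_RED) / SAMPLE_SIZE) ** 0.5

print(f"mean of sample proportions: {mean(proportions):.4f}")
print(f"SD of sample proportions:   {stdev(proportions):.4f}")
print(f"theoretical standard error: {theoretical_se:.4f}")
```

The sample proportions cluster around the population value of 0.51, and their spread is close to the theoretical standard error, which is what the normal approximation relies on.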
How likely is it to draw a sample with 55% red balls
from the large jar?
Translation to normal distributions
Binomial distribution           Normal distribution
P                               µ: center of normal curve (mean)
Standard error of proportion    σ: standard deviation of normal curve
Observed sample proportion      Observed sample outcome
Standard normal deviate (z)     Z score
Z score for binomial distribution
How likely is it to draw a sample with 55% red balls
from the large jar?
• P = 0.51: center of normal curve (mean proportion of red balls)
• Observed sample outcome: 0.55
• Standard error of proportion: standard deviation of the normal curve
• Standard normal deviate (z)
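The Z score for the jar example can be worked through as follows. This is a sketch under a stated assumption: 1795 is taken to be the number of balls in the sample, which the slides do not state unambiguously. The function name `proportion_z` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def proportion_z(p_observed, p_population, n):
    """Z score of a sample proportion under the normal approximation
    to the binomial distribution."""
    # Standard error of a proportion: sqrt(p * q / n), with q = 1 - p.
    standard_error = sqrt(p_population * (1 - p_population) / n)
    return (p_observed - p_population) / standard_error


# Assumption: n = 1795 is the number of balls in the sample.
z = proportion_z(p_observed=0.55, p_population=0.51, n=1795)

# Chance of a sample proportion this high or higher.
upper_tail = 1 - NormalDist().cdf(z)

print(f"z = {z:.2f}, upper-tail probability = {upper_tail:.5f}")
```

Under that assumption the Z score is a little above 3, so a sample with 55% red balls would be a rare outcome from a jar with 51% red balls.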
• The normal distribution is centered on the
population mean
• The binomial distribution is centered on the
population proportion P, not on the sample
proportion
• The previous normal curve represents a
probability distribution of sample outcomes
drawn from a population with 51% red
balls and 49% blue balls
