
Dr. Moataza Mahmoud Abdel Wahab


Lecturer of Biostatistics
High Institute of Public Health
University of Alexandria

moatazamahmoud@yahoo.com
Important statistical terms
Population:
a set which includes all
measurements of interest
to the researcher
(The collection of all
responses, measurements, or
counts that are of interest)

Sample:
A subset of the population
Why sampling?

Get information about large populations


Lower cost
Less field time
More accuracy, i.e. can do a better job of
data collection
When it is impossible to study the whole
population
Target Population:
The population to be studied/ to which the
investigator wants to generalize his results
Sampling Unit:
smallest unit from which sample can be selected
Sampling frame
List of all the sampling units from which sample is
drawn
Sampling scheme
Method of selecting sampling units from sampling
frame
Types of sampling

Non-probability samples

Probability samples
Non probability samples

Convenience samples (ease of access)


sample is selected from elements of a population
that are easily accessible
Snowball sampling (friend of a friend, etc.)

Purposive sampling (judgemental)

You choose who you think should be in the
study
Quota sample
Non probability samples

Probability of being chosen is unknown


Cheaper, but unable to generalise
Potential for bias
Probability samples

Random sampling
Each subject has a known probability of
being selected
Allows application of statistical sampling
theory to results to:
Generalise
Test hypotheses
Conclusions

Probability samples are the best

Ensure
Representativeness
Precision
Methods used in probability
samples

Simple random sampling


Systematic sampling

Stratified sampling

Multi-stage sampling

Cluster sampling
Simple random sampling
Table of random numbers

684257954125632140
582032154785962024
362333254789120325
985263017424503686
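
In practice the table of random numbers is often replaced by a pseudo-random generator. The following is a minimal sketch, assuming a hypothetical sampling frame of 100 numbered units; the function name `simple_random_sample` and the seed are illustrative only.

```python
import random

def simple_random_sample(frame, n, seed=None):
    """Draw n units from the sampling frame without replacement.

    Every unit in the frame has the same chance of being selected.
    """
    rng = random.Random(seed)  # a fixed seed makes the draw reproducible
    return rng.sample(frame, n)

# Hypothetical sampling frame: units numbered 1..100
frame = list(range(1, 101))
sample = simple_random_sample(frame, 10, seed=42)
print(sorted(sample))
```

Fixing the seed is only a convenience for reproducing the same draw; omit it for a fresh random selection.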
Systematic sampling

Sampling fraction
Ratio between sample size and population
size
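
A minimal sketch of systematic sampling, assuming the usual procedure: the sampling interval k is the inverse of the sampling fraction (population size divided by sample size, rounded down), the first unit is picked at random within the first interval, and every k-th unit is taken after that. The frame and the function name `systematic_sample` are hypothetical.

```python
import random

def systematic_sample(frame, n, seed=None):
    """Select n units at a fixed interval after a random start."""
    population_size = len(frame)
    if not 0 < n <= population_size:
        raise ValueError("sample size must be between 1 and the frame size")
    k = population_size // n          # sampling interval (1 / sampling fraction)
    rng = random.Random(seed)
    start = rng.randrange(k)          # random start within the first interval
    return [frame[start + i * k] for i in range(n)]

# Hypothetical frame of 100 units; sampling fraction 10/100, so k = 10
frame = list(range(1, 101))
sample = systematic_sample(frame, 10, seed=1)
print(sample)
```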
Systematic sampling
Cluster sampling
Cluster: a group of sampling units close to each
other i.e. crowding together in the same area or
neighborhood
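
A minimal sketch of one-stage cluster sampling, assuming that whole clusters are drawn at random and that every sampling unit inside a chosen cluster is studied. The cluster names and household labels are hypothetical.

```python
import random

def cluster_sample(clusters, n_clusters, seed=None):
    """Pick n_clusters clusters at random and return all their units."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)  # random choice of clusters
    units = [unit for name in chosen for unit in clusters[name]]
    return chosen, units

# Hypothetical area divided into five sections, each with a few households
clusters = {
    "Section 1": ["S1-H1", "S1-H2", "S1-H3"],
    "Section 2": ["S2-H1", "S2-H2"],
    "Section 3": ["S3-H1", "S3-H2", "S3-H3", "S3-H4"],
    "Section 4": ["S4-H1", "S4-H2"],
    "Section 5": ["S5-H1", "S5-H2", "S5-H3"],
}
chosen, units = cluster_sample(clusters, 2, seed=0)
print(chosen)
print(units)
```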
Cluster sampling
[Figure: area divided into Sections 1 to 5]
Stratified sampling
Multi-stage sampling
Errors in sample

Systematic error (or bias)


Inaccurate response (information bias)
Selection bias

Sampling error (random error)


Type 1 error
The probability of finding a difference between
our sample and the population when
there really isn't one.

Known as α (or type 1 error)

Usually set at 5% (or 0.05)


Type 2 error

The probability of not finding a difference
that actually exists between our sample
and the population

Known as β (or type 2 error)

Power is (1 - β) and is usually 80%
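
The chosen type 1 error rate and power are turned into standard normal deviates before they enter a sample size calculation. The sketch below uses Python's standard library for this. It assumes a two-sided test, and it assumes that the factor F used in two-group sample size formulas is the squared sum of the two deviates; the slides do not define F, so treat that reading as an assumption. The function names are illustrative.

```python
from statistics import NormalDist

def z_two_sided(alpha):
    """Standard normal deviate for a two-sided type 1 error rate alpha."""
    return NormalDist().inv_cdf(1 - alpha / 2)

def z_power(power):
    """Standard normal deviate for the desired power (1 - type 2 error)."""
    return NormalDist().inv_cdf(power)

def f_factor(alpha, power):
    """Assumed definition: F = (z for alpha + z for power) squared."""
    return (z_two_sided(alpha) + z_power(power)) ** 2

print(round(z_two_sided(0.05), 2))      # 1.96
print(round(z_two_sided(0.01), 2))      # 2.58
print(round(f_factor(0.05, 0.80), 1))   # 7.8
print(round(f_factor(0.05, 0.90), 1))   # 10.5
```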


Sample size

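Quantitative:

$$n = \frac{Z^2 \sigma^2}{D^2} \qquad n = \frac{(\sigma_1^2 + \sigma_2^2) \times F}{D^2}$$

Qualitative:

$$n = \frac{Z^2 \pi (1 - \pi)}{D^2} \qquad n = \frac{2 P (1 - P) F}{D^2}$$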
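
The four sample size formulas can be sketched as small Python functions. This assumes the following reading of the symbols: Z is the standard normal deviate for the confidence level, D is the accepted difference (error), the quantitative formulas use standard deviations, the qualitative formulas use proportions, and F is a tabulated factor that depends on the confidence level and power. The function names and the input values in the usage lines are hypothetical.

```python
import math

def n_mean(z, sd, d):
    """One sample, quantitative variable: Z^2 * sd^2 / D^2."""
    return z**2 * sd**2 / d**2

def n_proportion(z, p, d):
    """One sample, qualitative variable: Z^2 * p * (1 - p) / D^2."""
    return z**2 * p * (1 - p) / d**2

def n_two_means(sd1, sd2, f, d):
    """Two groups, quantitative variable: (sd1^2 + sd2^2) * F / D^2."""
    return (sd1**2 + sd2**2) * f / d**2

def n_two_proportions(p_bar, f, d):
    """Two groups, qualitative variable: 2 * P * (1 - P) * F / D^2."""
    return 2 * p_bar * (1 - p_bar) * f / d**2

# Hypothetical inputs; the result is always rounded up to a whole subject
print(math.ceil(n_mean(1.96, 10, 2)))            # 97
print(math.ceil(n_proportion(1.96, 0.5, 0.05)))  # 385
```

Rounding up rather than to the nearest integer keeps the sample at or above the computed minimum.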
Problem 1
A study is to be performed to determine a
certain parameter in a community. From a
previous study, a standard deviation (SD) of 46
was obtained. If a sampling error of up to 4 is
to be accepted, how many subjects should be
included in this study at the 99% level of
confidence?
Answer
$$n = \frac{Z^2 \sigma^2}{D^2} = \frac{2.58^2 \times 46^2}{4^2} = 880.3 \approx 881$$
Problem 2
A study is to be done to determine the effect
of 2 drugs (A and B) on blood glucose
level (BGL). From previous studies using those
drugs, SDs of BGL of 8 and 12 g/dl were
obtained respectively.
A confidence level of 95% and a power of
90% are required to detect a mean
difference between the two groups of 3
g/dl. How many subjects should be included
in each group?
Answer
$$n = \frac{(\sigma_1^2 + \sigma_2^2) \times F}{D^2} = \frac{(8^2 + 12^2) \times 10.5}{3^2} = 242.7 \approx 243$$
in each group
Problem 3
It was desired to estimate the proportion of
anaemic children in a certain preparatory
school. In a similar study at another school
a proportion of 30% was detected.
Compute the minimal sample size required
at a confidence level of 95%, accepting
a difference of up to 4% from the true
population proportion.
Answer
$$n = \frac{Z^2 \pi (1 - \pi)}{D^2} = \frac{1.96^2 \times 0.3 (1 - 0.3)}{(0.04)^2} = 504.21 \approx 505$$
Problem 4
In previous studies, the percentage of
hypertensives among diabetics was 70%
and among non-diabetics was 40% in a
certain community.
A researcher wants to perform a
comparative study of hypertension among
diabetics and non-diabetics at a
confidence level of 95% and a power of 80%.
What is the minimal sample to be taken
from each group with a 4% accepted
difference from the true value?
Answer
$$n = \frac{2 P (1 - P) F}{D^2} = \frac{2 \times 0.55 (1 - 0.55) \times 7.8}{0.04^2} = 2413.1 \approx 2414$$
in each group, where P = (0.70 + 0.40) / 2 = 0.55
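
The arithmetic in Problems 1 to 4 can be re-checked with a few lines of Python. This is only a sketch that substitutes the same numbers used in the answers (including the tabulated F values 10.5 and 7.8) and rounds each result up to the next whole subject.

```python
import math

# (label, unrounded n) pairs, using the values substituted in the answers
checks = [
    ("Problem 1", 2.58**2 * 46**2 / 4**2),
    ("Problem 2", (8**2 + 12**2) * 10.5 / 3**2),
    ("Problem 3", 1.96**2 * 0.3 * (1 - 0.3) / 0.04**2),
    ("Problem 4", 2 * 0.55 * (1 - 0.55) * 7.8 / 0.04**2),
]

for label, n in checks:
    # Always round up: a fraction of a subject still requires one more subject
    print(f"{label}: n = {n:.2f}, rounded up to {math.ceil(n)}")
```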
Precision
Cost
