
STATISTICS 2

DR Taher
Sampling
•  Sampling is that part of statistical practice concerned with
the selection of a subset of individuals from within a
population to yield some knowledge about the whole
population, especially for the purposes of making
predictions based on statistical inference.
Why sampling?
•  Get information about large populations
•  Less cost
•  Less field time
•  Needed when it is impossible to study the whole population
Sampling terms

•  Sampling: the process of selecting a sample from a population
•  Sampling frame: the set (list) of sampling units from
which the sample is to be selected
•  Sampling unit: the unit of selection in the sampling process.
Examples: person, household, etc.
•  Sampling method (types): probability and non-probability
STEPS IN SAMPLING PROCESS
•  STEP 1: Define the target population
•  STEP 2: Identify the sampling frame
•  STEP 3: Specify the sampling unit
•  STEP 4: Select the sampling method
•  STEP 5: Determine the sample size
•  STEP 6: Specify the sampling plan
•  STEP 7: Select the sample
Sampling process
Example of a sample
•  Study: evaluation of fourth-year students' grades in
community medicine.
•  The population of fourth-year students is 600; the 200
students taking the community medicine exam form the
target population, and 100 of them are chosen as the
sample for the actual study.
Types of Samples
Probability sampling

•  Simple random sample: used with a homogeneous
population. A sampling frame is prepared for the study
population, and a random number table is used to select
the sample members
•  Every sampling unit has the same chance of being selected
•  Advantages: 1- representative and subject only to
sampling error
•  2- estimates are easy to calculate
•  Disadvantages: 1- a complete sampling frame is required and may be large
•  2- minority subgroups of interest may not be present in the
sample in sufficient numbers for the study
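The selection step of a simple random sample can be sketched in Python as below. This is only an illustration: it uses Python's pseudo-random generator in place of the random number table mentioned on the slide, and the frame of 200 student ID numbers, the function name and the seed are all assumptions made for the example.

```python
import random

def simple_random_sample(frame, n, seed=None):
    """Draw n units from the sampling frame without replacement."""
    rng = random.Random(seed)    # seeded generator so the draw can be repeated
    return rng.sample(frame, n)  # every unit has the same chance of selection

# Hypothetical sampling frame: ID numbers of 200 students
frame = list(range(1, 201))
sample = simple_random_sample(frame, 100, seed=42)
print(len(sample), sorted(sample)[:5])
```

Sampling is done without replacement, so no student can appear in the sample twice.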
•  Systematic random sampling:
relies on arranging the target population according to
some ordering scheme and then selecting elements at
regular intervals through that ordered list.
•  Advantages:
1- easy to select
2- a suitable sampling frame can be identified easily
3- the sample is evenly spread over the entire population
•  Disadvantage: may be biased if a hidden periodicity in the
population coincides with the sampling interval
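Selection at regular intervals can be sketched as follows. The sketch assumes the usual convention of a random starting point within the first interval and an interval k equal to the frame size divided by the sample size (rounded down); the slide does not spell out these details, and the frame and names are hypothetical.

```python
import random

def systematic_sample(frame, n, seed=None):
    """Pick every k-th unit of an ordered frame after a random start."""
    if not 0 < n <= len(frame):
        raise ValueError("n must be between 1 and the size of the frame")
    k = len(frame) // n          # sampling interval
    rng = random.Random(seed)
    start = rng.randrange(k)     # random start inside the first interval
    return [frame[start + i * k] for i in range(n)]

# Hypothetical ordered frame of 200 units, sample of 20, so k = 10
ordered_frame = list(range(1, 201))
every_kth = systematic_sample(ordered_frame, 20, seed=3)
print(every_kth)
```

If the ordered list had a cycle of the same length as k (for example, every tenth unit being a corner house), each selected unit would fall at the same point of the cycle, which is the periodicity bias noted above.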
•  Stratified random sample: suitable for a heterogeneous
population. The population is divided into strata according to
characteristics of interest, then a simple random sample is selected
from each stratum
•  Advantages: 1- every unit in a stratum has the same chance of being
selected. 2- using the same sampling fraction for all strata ensures
proportionate representation. 3- adequate representation of minority
groups
•  Disadvantage: the sampling frame for the entire population has to be
prepared separately for each stratum
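A minimal sketch of proportionate stratified sampling, where the same sampling fraction is applied to every stratum. The two strata, their sizes and the 25% fraction are invented for illustration.

```python
import random

def proportionate_stratified_sample(strata, fraction, seed=None):
    """Take a simple random sample of the same fraction from each stratum."""
    rng = random.Random(seed)
    sample = {}
    for name, units in strata.items():
        size = round(len(units) * fraction)    # same fraction in every stratum
        sample[name] = rng.sample(units, size)
    return sample

# Hypothetical strata: 120 male and 80 female students
strata = {
    "male": [f"M{i}" for i in range(1, 121)],
    "female": [f"F{i}" for i in range(1, 81)],
}
picked = proportionate_stratified_sample(strata, 0.25, seed=1)
print({name: len(units) for name, units in picked.items()})
```

With a 25% fraction the sample keeps the 120:80 ratio of the population (30 and 20 units), which is what "proportionate representation" refers to.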

•  Cluster sample: an example of 'two-stage sampling'. In the first stage, a
sample of areas is chosen;
•  in the second stage, a sample of respondents within those areas is selected.
The population is divided into clusters of homogeneous units, usually based on
geographical contiguity. Sampling units are groups rather than individuals. A
sample of such clusters is then selected. In one-stage cluster sampling, all
units from the selected clusters are studied. Example: a classroom of school children
•  Advantage: reduces the cost of preparing a sampling frame
•  Disadvantage: sampling error is higher than for a simple random sample
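The case where all units of the selected clusters are studied can be sketched as below. The five classrooms of 30 pupils each are hypothetical, and a second-stage draw within clusters is deliberately left out.

```python
import random

def cluster_sample(clusters, n_clusters, seed=None):
    """Randomly choose whole clusters and keep every unit inside them."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)  # the sampling unit is the cluster
    units = [unit for name in chosen for unit in clusters[name]]
    return chosen, units

# Hypothetical clusters: five classrooms of 30 pupils each
clusters = {
    name: [f"{name}-{i}" for i in range(1, 31)]
    for name in ["A", "B", "C", "D", "E"]
}
chosen, units = cluster_sample(clusters, 2, seed=7)
print(chosen, len(units))
```

Only a list of classrooms is needed to draw the sample, not a list of every pupil, which is where the saving on the sampling frame comes from.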
NON PROBABILITY SAMPLING

•  Any sampling method where some elements of the population
have no chance of selection (these are sometimes
referred to as 'out of coverage' / 'undercovered'), or where
the probability of selection can't be accurately determined.
It involves the selection of elements based on
assumptions regarding the population of interest, which
form the criteria for selection. Hence, because the
selection of elements is nonrandom, nonprobability
sampling does not allow the estimation of sampling errors.
TYPES OF NON PROBABILITY
SAMPLING
•  Convenience sampling: use the units (results) that are easiest to obtain
•  Quota sampling: The population is first segmented into
mutually exclusive sub-groups, just as in stratified sampling.
•  Then judgment is used to select subjects or units from each
segment based on a specified proportion.
•  For example, an interviewer may be told to sample 200 females
and 300 males between the ages of 45 and 60.
•  It is this second step which makes the technique one of
non-probability sampling.
•  In quota sampling the selection of the sample is non-random.
•  For example, interviewers might be tempted to interview those
who look most helpful. The problem is that these samples may
be biased because not everyone gets a chance of selection.
This non-random element is its greatest weakness, and quota
versus probability sampling has been a matter of controversy for many
years
•  Snowball sampling: people (seeds) who meet the criteria
are asked to name others who meet these criteria. It is
useful mostly in hidden populations
Sample Size
•  Example:
•  From a population of 10,000 clients with tuberculosis, a
nurse-researcher selected a sample size with a margin of
error of 5%.
•  The desired sample size is computed to be 385
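The slide does not say which formula gives 385. Two common candidates both reproduce it, so the sketch below shows both as assumptions: Slovin's formula, n = N / (1 + N·e²), and Cochran's formula for a proportion, n = z²·p(1 − p) / e², with the conventional z = 1.96 and p = 0.5 and no finite population correction.

```python
import math

def slovin_sample_size(population, margin_of_error):
    """Slovin's formula: n = N / (1 + N * e^2), rounded up."""
    return math.ceil(population / (1 + population * margin_of_error ** 2))

def cochran_sample_size(margin_of_error, z=1.96, p=0.5):
    """Cochran's formula for a proportion: n = z^2 * p * (1 - p) / e^2, rounded up."""
    return math.ceil(z ** 2 * p * (1 - p) / margin_of_error ** 2)

n_slovin = slovin_sample_size(10_000, 0.05)   # 10000 / 26 = 384.6..., rounded up
n_cochran = cochran_sample_size(0.05)         # 0.9604 / 0.0025 = 384.16, rounded up
print(n_slovin, n_cochran)
```

Both results are rounded up because a fraction of a respondent cannot be sampled.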


Sampling error (random error)

•  A sample is a subset of a population. Because of this
property of samples, results obtained from them cannot
reflect the full range of variation found in the larger group
(population). This type of error, arising from the sampling
process itself, is called sampling error, which is a form of
random error. Sampling error can be reduced by
increasing the size of the sample.

• When n = N ⇒ sampling error = 0
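One way to see both statements is the standard error of a sample mean with the finite population correction, SE = (σ / √n) · √((N − n) / (N − 1)). This formula is not on the slide and applies to simple random sampling without replacement; the population size and standard deviation below are made up for the sketch.

```python
import math

def standard_error_of_mean(sigma, n, N):
    """Standard error of the mean for a simple random sample of n out of N units."""
    fpc = math.sqrt((N - n) / (N - 1))   # finite population correction
    return sigma / math.sqrt(n) * fpc

# Hypothetical population of N = 1000 units with standard deviation 10
se_small = standard_error_of_mean(10, 100, 1000)
se_large = standard_error_of_mean(10, 400, 1000)
se_census = standard_error_of_mean(10, 1000, 1000)
print(se_small, se_large, se_census)
```

The error shrinks as n grows and becomes exactly zero when the whole population is measured (n = N).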


Non-sampling error (bias)

•  It is a type of systematic error in the design or conduct of
a sampling procedure which results in distortion of the
sample, so that it is no longer representative of the
reference population. We can eliminate or reduce
non-sampling error (bias) by careful design of the sampling
procedure, not by increasing the sample size.
•  Example: taking a sample of only female students to evaluate
smoking among Libyan students.
DESCRIPTIVE
STATISTICS
Data presentation:

•  Tabulation
•  Charts and diagrams
•  Text
Tabulation

•  The content of a table depends on the type of variable
•  Qualitative variables need a categorical form
•  Numerical variables are displayed in non-overlapping intervals
•  General principles of designing a table:
•  The table should be numbered
•  A brief and self-explanatory title should be given to each table
•  The headings of the columns or rows should be clear and
concise
•  The data must be presented according to size or
importance, chronologically, alphabetically or geographically
•  If percentages or averages are to be compared, they should be
placed as close together as possible
•  No table should be too large
•  Vertical arrangement is better than horizontal
•  Footnotes can be given when necessary
Simple table
Cross tabulation

•  Presents two or more variables at the same time
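A cross tabulation of two qualitative variables amounts to counting the records in each combination of categories. A minimal sketch with invented records and variable names (sex and smoking status):

```python
from collections import Counter

def cross_tabulate(records, row_var, col_var):
    """Count the records in each (row category, column category) cell."""
    return Counter((r[row_var], r[col_var]) for r in records)

# Hypothetical survey records
records = [
    {"sex": "male", "smoker": "yes"},
    {"sex": "male", "smoker": "no"},
    {"sex": "female", "smoker": "no"},
    {"sex": "female", "smoker": "no"},
    {"sex": "male", "smoker": "yes"},
]
table = cross_tabulate(records, "sex", "smoker")
for (sex, smoker), count in sorted(table.items()):
    print(sex, smoker, count)
```

Cells with no records are simply absent from the result and read as zero when looked up.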


Graphical presentation

•  Better retained in the memory than tables
•  Presentation of qualitative data:
•  Bar chart:
•  A bar chart is a graph with rectangular bars. Each bar's
length or height is proportional to the value it
represents. In other words, the length or height of the bar
represents the quantity within that category. The graph
usually shows a comparison between different categories.
Used with qualitative data.
•  The vertical scale should start at zero
•  a) Simple bar chart
•  b) Multiple bar chart (grouped bar chart)
•  c) Component bar chart (stacked bar chart)
Bar Chart
Pie chart

•  It is a circle divided into sectors with areas
proportional to the frequencies
•  Used for qualitative data
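Because sector area is proportional to frequency, each category's central angle is its share of the total multiplied by 360 degrees. A small sketch with made-up blood group counts:

```python
def pie_sector_angles(frequencies):
    """Central angle in degrees for each category: frequency / total * 360."""
    total = sum(frequencies.values())
    return {category: count * 360 / total for category, count in frequencies.items()}

# Hypothetical frequencies of blood groups among 100 people
blood_groups = {"A": 40, "B": 10, "AB": 5, "O": 45}
angles = pie_sector_angles(blood_groups)
print(angles)
```

The angles always add up to 360 degrees, whatever the total frequency is.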
Stem and leaf plot

•  Used for small quantitative data sets
•  Provides information about the range of the data set
•  Shows the location of the highest concentration of
measurements and the presence or absence of symmetry
•  Shows the information as single values
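A stem and leaf plot for two-digit whole numbers can be built by splitting each value into a tens digit (stem) and a units digit (leaf). The sketch assumes that convention, and the data values are invented.

```python
from collections import defaultdict

def stem_and_leaf(values):
    """Group whole numbers by stem (tens) and list their leaves (units) in order."""
    plot = defaultdict(list)
    for value in sorted(values):
        stem, leaf = divmod(value, 10)
        plot[stem].append(leaf)
    return dict(plot)

# Hypothetical ages of eight patients
ages = [23, 12, 34, 21, 15, 38, 23, 41]
plot = stem_and_leaf(ages)
for stem, leaves in plot.items():
    print(stem, "|", " ".join(str(leaf) for leaf in leaves))
```

Every original value can be read back from the plot, unlike a grouped frequency table.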
Histogram

•  Consists of tabular frequencies, shown as adjacent
rectangles, erected over discrete intervals (bins), with an
area proportional to the frequency of the observations in the
interval. Used for quantitative data or ordinal data
•  Disadvantage: can't be used to compare two variables
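The frequencies behind a histogram come from counting observations in non-overlapping intervals. The sketch below assumes equal-width bins that include their lower limit and exclude their upper limit; the data, starting point and bin width are hypothetical.

```python
def frequency_table(values, start, width, n_bins):
    """Count values in equal-width bins [lower, upper), ignoring values outside them."""
    counts = [0] * n_bins
    for value in values:
        index = int((value - start) // width)
        if 0 <= index < n_bins:
            counts[index] += 1
    return [((start + i * width, start + (i + 1) * width), count)
            for i, count in enumerate(counts)]

# Hypothetical ages grouped into 10-year intervals starting at 20
ages_grouped = [21, 25, 29, 30, 34, 38, 41, 45, 45, 52]
bins = frequency_table(ages_grouped, start=20, width=10, n_bins=4)
for (lower, upper), count in bins:
    print(f"{lower}-{upper}: {count}")
```

With equal-width bins the height of each rectangle can be read directly as the frequency, because area and height are then proportional.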


Frequency polygon ( line charts)

•  Suitable for comparing two or more distributions
•  It is constructed by joining the midpoints of the tops of the
histogram bars
Frequency curve

•  It is a frequency polygon after it is smoothed
•  Different variables have different distribution patterns
(bell-shaped, unimodal, bimodal, etc.)
Line diagram

•  Used to show the trend of events with the passage of time


Pictogram

•  It is easy to explain to the general population
•  Uses small pictures or symbols to present data
Statistical map
Scatter plot

•  Shows the relation between two variables
