
SLIDE 1- group name and group number

SLIDE 2- AGE SPECIFIC FERTILITY RATES

SLIDE 3-
DCOVA
 In the DCOVA framework, the words define, collect, organize, visualize, and
analyze are used as mnemonics to remind the student of the five steps that form the
process of applying a statistical method.

SLIDE 4-
DEFINE
TYPE OF VARIABLE: Numerical (quantitative) variables have values that represent quantities.
Discrete variables: numerical variables that arise from a counting process.

SLIDE 5-
LEVEL OF MEASUREMENT: Ratio scale: an ordered scale in which the difference between
the measurements is a meaningful quantity and the measurements have a true zero point.

 The data show ages ranging from 15 to 49.
 The data show the number of women who are pregnant in each specific age range.

SLIDE 6-
COLLECT
Secondary Sources: The person performing data analysis is not the data collector
 Analyzing census data
 Examining data from print journals or data published on the internet.

SLIDE 7
HOW DID WE COLLECT OUR DATA?
 The data were collected using a probability sample, specifically the stratified method.

SLIDE 8
STRATIFIED

SLIDE 9
 First, we divided the population into strata according to continents.
 Then, we chose two countries from each of the six continents (excluding Antarctica), resulting
in 12 countries in total.
 We used a random number table to select the two countries we analyzed in each
stratum.
 We used a stratified sample to ensure the representation of individuals across the entire
population.
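
A minimal sketch of the selection procedure described above, assuming a hypothetical list of candidate countries per continent. The candidate lists are illustrative only and are not the sampling frame we actually used, and Python's pseudo-random generator stands in for the random number table.

```python
import random

# Hypothetical sampling frame: candidate countries per continent (stratum).
# Illustrative only; the real frame would list every country in each continent.
frame = {
    "Africa": ["Nigeria", "Madagascar", "Kenya", "Egypt"],
    "Asia": ["China", "Philippines", "India", "Japan"],
    "Europe": ["Germany", "Norway", "France", "Italy"],
    "North America": ["Belize", "Haiti", "Canada", "Mexico"],
    "South America": ["Peru", "Venezuela", "Brazil", "Chile"],
    "Oceania": ["Fiji", "Nauru", "Samoa", "Tonga"],
}


def stratified_sample(strata, per_stratum, seed=None):
    """Draw the same number of units at random from every stratum."""
    rng = random.Random(seed)  # stands in for the random number table
    return {name: rng.sample(units, per_stratum) for name, units in strata.items()}


sample = stratified_sample(frame, per_stratum=2, seed=1)
total_countries = sum(len(chosen) for chosen in sample.values())

for continent, chosen in sample.items():
    print(continent, chosen)
print("Countries selected:", total_countries)
```

Because every stratum contributes two countries, each continent is represented no matter which countries the random draw picks.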
SLIDE 10
SAMPLE

SLIDE 11
TABLE

SLIDE 12
TYPE OF SURVEY ERROR
Coverage error: in analyzing our data, coverage error (selection bias) exists because some
countries are excluded.
Sampling error: this error also exists because variation from sample to sample is always present.
SLIDE 13
ORGANIZE
Summary table
 A summary table tallies the frequencies or percentages of items in a set of categories so that you
can see differences between categories.

(INSERT THE ‘TABLE 2’ SHEET FROM EXCEL. IT IS VERY LARGE, SO I DID NOT PUT IT HERE, BUT IT
SHOULD ALSO BE SHOWN.)

SLIDE 14
CATEGORICAL DATA
 A contingency table is used to study patterns that may exist between the responses of two or
more categorical variables.
 We tallied the data using a contingency table.
SLIDE 15 (CONTINGENCY TABLE)

SLIDE 16 (INSIGHT OF THE CONTINGENCY TABLE)


INSIGHT

SLIDE 17
NUMERICAL DATA
 We organized the data using an ordered array, frequency distributions, and cumulative
distributions.

SLIDE 18
Sum of No. of fertility Age-specific fertility rates
Row Labels 15-19 20-24 25-29 30-34 35-39 40-44 45-49 Grand Total
Latest 766 1788 1850 1566 939 414 95 7418
Belize 90 193 172 123 71 27 4 680
China 5 95 101 54 22 10 7 294
Fiji 31 148 156 111 58 16 2 522
Germany 9 38 81 90 47 8 0 273
Haiti 68 161 169 175 132 65 13 783
Madagascar 148 234 207 169 131 63 13 965
Nauru 69 200 155 141 56 50 0 671
Nigeria 121 225 265 241 161 87 44 1144
Norway 8 61 125 131 57 11 1 394
Peru 61 124 124 108 72 25 3 517
Philippines 54 163 172 136 84 38 6 653
Venezuela 102 146 123 87 48 14 2 522
Grand Total 766 1788 1850 1566 939 414 95 7418

SLIDE 19
ORDERED ARRAY
 An ordered array is a sequence of data, in rank order, from the smallest value to the largest
value.
 Shows range (minimum value to maximum value)
 May help identify outliers (unusual observations)
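
As an illustration, the Grand Total column of the Slide 18 table can be put into an ordered array. This is a sketch only: the choice of the Grand Total column is an assumption, and the ordered array shown on the slide may use different values.

```python
# Grand Total per country, taken from the Slide 18 table.
totals = {
    "Belize": 680, "China": 294, "Fiji": 522, "Germany": 273,
    "Haiti": 783, "Madagascar": 965, "Nauru": 671, "Nigeria": 1144,
    "Norway": 394, "Peru": 517, "Philippines": 653, "Venezuela": 522,
}

# Ordered array: values ranked from smallest to largest.
ordered = sorted(totals.values())
smallest, largest = ordered[0], ordered[-1]

print(ordered)
print("Minimum:", smallest, "Maximum:", largest)
```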

SLIDE 20
ORDERED ARRAY

SLIDE 21
FREQUENCY DISTRIBUTION
 The frequency distribution is a summary table in which the data are arranged into numerically
ordered classes.

WHY USE A FREQUENCY DISTRIBUTION?

 It condenses the raw data into a more useful form


 It allows for a quick visual interpretation of the data
 It enables the determination of the major characteristics of the data set including where the
data are concentrated / clustered
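
A rough sketch of how such a table could be tallied, again using the Grand Total values from Slide 18. The class width of 200 and the starting boundary of 200 are assumptions chosen for illustration and are not necessarily the classes used in our Excel sheet.

```python
# Grand Total per country, from the Slide 18 table.
values = [680, 294, 522, 273, 783, 965, 671, 1144, 394, 517, 653, 522]

# Assumed class layout: five classes of width 200, starting at 200.
lower, width, n_classes = 200, 200, 5

# Tally each value into its class.
counts = [0] * n_classes
for v in values:
    counts[(v - lower) // width] += 1

# Print frequency, percent frequency, and cumulative frequency per class.
cumulative = 0
for i, freq in enumerate(counts):
    start = lower + i * width
    cumulative += freq
    percent = 100 * freq / len(values)
    print(f"{start} but less than {start + width}: "
          f"frequency={freq}, percent={percent:.1f}%, cumulative={cumulative}")
```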

SLIDE 22
FREQUENCY DISTRIBUTION TABLE

SLIDE 23
RELATIVE & PERCENT FREQUENCY DISTRIBUTION
SLIDE 24
CUMULATIVE FREQUENCY DISTRIBUTION

SLIDE 25
VISUALIZE
CATEGORICAL DATA
SIDE BY SIDE BAR CHART
 The side-by-side bar chart represents the data from a contingency table.

SLIDE 26
CONTINGENCY TABLE FOR TWO VARIABLES
Sum of No. of fertility Age-specific fertility rates
Row Labels 15-19 20-24 25-29 30-34 35-39 40-44 45-49 Grand Total
Latest 100.00% 100.00% 100.00% 100.00% 100.00% 100.00% 100.00% 100.00%
Belize 11.75% 10.79% 9.30% 7.85% 7.56% 6.52% 4.21% 9.17%
China 0.65% 5.31% 5.46% 3.45% 2.34% 2.42% 7.37% 3.96%
Fiji 4.05% 8.28% 8.43% 7.09% 6.18% 3.86% 2.11% 7.04%
Germany 1.17% 2.13% 4.38% 5.75% 5.01% 1.93% 0.00% 3.68%
Haiti 8.88% 9.00% 9.14% 11.17% 14.06% 15.70% 13.68% 10.56%
Madagascar 19.32% 13.09% 11.19% 10.79% 13.95% 15.22% 13.68% 13.01%
Nauru 9.01% 11.19% 8.38% 9.00% 5.96% 12.08% 0.00% 9.05%
Nigeria 15.80% 12.58% 14.32% 15.39% 17.15% 21.01% 46.32% 15.42%
Norway 1.04% 3.41% 6.76% 8.37% 6.07% 2.66% 1.05% 5.31%
Peru 7.96% 6.94% 6.70% 6.90% 7.67% 6.04% 3.16% 6.97%
Philippines 7.05% 9.12% 9.30% 8.68% 8.95% 9.18% 6.32% 8.80%
Venezuela 13.32% 8.17% 6.65% 5.56% 5.11% 3.38% 2.11% 7.04%
Grand Total 100.00% 100.00% 100.00% 100.00% 100.00% 100.00% 100.00% 100.00%
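
The percentages above are column percentages: each cell of the Slide 18 table divided by its column total. A short sketch for the 15-19 column follows; the variable names are illustrative.

```python
# 15-19 column of the Slide 18 table.
col_15_19 = {
    "Belize": 90, "China": 5, "Fiji": 31, "Germany": 9,
    "Haiti": 68, "Madagascar": 148, "Nauru": 69, "Nigeria": 121,
    "Norway": 8, "Peru": 61, "Philippines": 54, "Venezuela": 102,
}

column_total = sum(col_15_19.values())  # Grand Total row for this column

# Each country's share of the column total, as a percentage.
percents = {
    country: round(100 * value / column_total, 2)
    for country, value in col_15_19.items()
}

for country, pct in percents.items():
    print(f"{country}: {pct:.2f}%")
```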
SLIDE 27
SIDE-BY-SIDE BAR CHART

[Side-by-side bar chart: percentage share by country (Belize through Venezuela) for each age
group (15-19 through 45-49); horizontal axis from 0.00% to 50.00%.]

SLIDE 28
STEM-AND-LEAF DISPLAY
 A simple way to see how the data are distributed and where concentrations of data exist.

SLIDE 29
HISTOGRAM
 A vertical bar chart of the data in a frequency distribution is called a histogram.
 The class boundaries (or class midpoints) are shown on the horizontal axis.
 The vertical axis is either frequency, relative frequency, or percentage.
 The height of the bars represents the frequency, relative frequency, or percentage.

SLIDE 30
HISTOGRAM

SLIDE 31
PIE CHART
 The pie chart is a circle broken up into slices that represent categories. The size of each slice of
the pie varies according to the percentage in each category.

SLIDE 32
PIE CHART

SLIDE 33
TIME-SERIES PLOT
 A Time-Series Plot is used to study patterns in the values of a numeric variable over time.
 The numeric variable is measured on the vertical axis and the time period is measured on the
horizontal axis.

SLIDE 34
TIME-SERIES PLOT

SLIDE 35
MEASURES OF CENTRAL TENDENCY
Mean
 The arithmetic mean (often just called the “mean”) is the most common measure
of central tendency.
 Mean = sum of the values divided by the number of values.
 Affected by extreme values (outliers).

SLIDE 36
MEASURES OF CENTRAL TENDENCY
Median
 In an ordered array, the median is the “middle” number (50% above, 50% below).
 The location of the median, when the values are in numerical order (smallest to
largest), is the (n+1)/2 ranked value.

 If the number of values is odd, the median is the middle number.


 If the number of values is even, the median is the average of the two middle
numbers.

SLIDE 37
MEASURES OF CENTRAL TENDENCY
MODE
 Value that occurs most often.
 Not affected by extreme values.
 Used for either numerical or categorical data.
 There may be no mode.
 There may be several modes.
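
The three measures of central tendency can be checked with Python's standard `statistics` module. This is a sketch that assumes the Grand Total column of the Slide 18 table as the data set; our Excel output may be based on a different column.

```python
import statistics

# Grand Total per country, from the Slide 18 table.
totals = [680, 294, 522, 273, 783, 965, 671, 1144, 394, 517, 653, 522]

mean = statistics.mean(totals)        # sum of the values / number of values
median = statistics.median(totals)    # average of the two middle values (n is even)
modes = statistics.multimode(totals)  # every value tied for most frequent

print("Mean:", round(mean, 2))
print("Median:", median)
print("Mode(s):", modes)
```

With 12 values the median is the average of the 6th and 7th ranked values (522 and 653), and 522 is the only value that occurs twice.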

SLIDE 38
MEASURES OF VARIATION
 Measures of variation give information on the spread or variability or
dispersion of the data values.

RANGE
 Simplest measure of variation.
 Difference between the largest and the smallest values: Range = Xlargest - Xsmallest

SLIDE 39
MEASURES OF VARIATION: The Sample Variance
 Average (approximately) of squared deviations of values from the mean.

SLIDE 40
MEASURES OF VARIATION: The Sample Standard Deviation
 Most commonly used measure of variation.
 Shows variation about the mean.
 Is the square root of the variance.
 Has the same units as the original data.

SLIDE 41
STEPS FOR COMPUTING STANDARD DEVIATION:
1. Compute the difference between each value and the mean.
2. Square each difference.
3. Add the squared differences.
4. Divide this total by n-1 to get the sample variance.
5. Take the square root of the sample variance to get the sample standard deviation.
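
The five steps above can be sketched as follows, assuming the Grand Total column of the Slide 18 table as the sample. The result is compared with the `statistics` module only as a sanity check.

```python
import math
import statistics

# Grand Total per country, from the Slide 18 table.
values = [680, 294, 522, 273, 783, 965, 671, 1144, 394, 517, 653, 522]
n = len(values)
mean = sum(values) / n

# Step 1: difference between each value and the mean.
differences = [x - mean for x in values]
# Step 2: square each difference.
squared = [d ** 2 for d in differences]
# Step 3: add the squared differences.
sum_of_squares = sum(squared)
# Step 4: divide by n - 1 to get the sample variance.
sample_variance = sum_of_squares / (n - 1)
# Step 5: square root of the variance gives the sample standard deviation.
sample_std = math.sqrt(sample_variance)

print("Sample variance:", round(sample_variance, 2))
print("Sample standard deviation:", round(sample_std, 2))
```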

SLIDE 42
MEASURES OF VARIATION: The Coefficient of Variation
 Measures relative variation.
 Always in percentage (%).
 Shows variation relative to mean.
 Can be used to compare the variability of two or more sets of data measured in
different units.
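
A small sketch of the coefficient of variation, computed as the sample standard deviation divided by the mean, times 100. The helper name `coefficient_of_variation` is hypothetical, and comparing the 15-19 column with the Grand Total column is only an example of comparing two data sets with very different magnitudes.

```python
import statistics


def coefficient_of_variation(values):
    """Sample standard deviation expressed as a percentage of the mean."""
    return statistics.stdev(values) / statistics.mean(values) * 100


# Two columns from the Slide 18 table.
ages_15_19 = [90, 5, 31, 9, 68, 148, 69, 121, 8, 61, 54, 102]
grand_totals = [680, 294, 522, 273, 783, 965, 671, 1144, 394, 517, 653, 522]

cv_15_19 = coefficient_of_variation(ages_15_19)
cv_totals = coefficient_of_variation(grand_totals)

print(f"CV, ages 15-19: {cv_15_19:.1f}%")
print(f"CV, grand totals: {cv_totals:.1f}%")
```

The 15-19 column has the smaller standard deviation in absolute terms, yet its CV is larger, which is the kind of comparison the CV is meant for.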

SLIDE 43
LOCATING EXTREME OUTLIERS: Z-Score
 To compute the Z-score of a data value, subtract the mean and divide by the
standard deviation.
 The Z-score is the number of standard deviations a data value is from the mean.
 A data value is considered an extreme outlier if its Z-score is less than -3.0 or
greater than +3.0.
 The larger the absolute value of the Z-score, the farther the data value is from
the mean.
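
A sketch of the Z-score rule, assuming the Grand Total column of the Slide 18 table as the data set. On these values no country falls outside the -3.0 to +3.0 limits; Nigeria is the farthest from the mean, at roughly two standard deviations above it.

```python
import statistics

# Grand Total per country, from the Slide 18 table.
totals = {
    "Belize": 680, "China": 294, "Fiji": 522, "Germany": 273,
    "Haiti": 783, "Madagascar": 965, "Nauru": 671, "Nigeria": 1144,
    "Norway": 394, "Peru": 517, "Philippines": 653, "Venezuela": 522,
}

mean = statistics.mean(totals.values())
std = statistics.stdev(totals.values())  # sample standard deviation

# Z-score: subtract the mean, then divide by the standard deviation.
z_scores = {country: (x - mean) / std for country, x in totals.items()}

# Extreme outliers have a Z-score below -3.0 or above +3.0.
extreme = [country for country, z in z_scores.items() if abs(z) > 3.0]

for country, z in z_scores.items():
    print(f"{country}: Z = {z:.2f}")
print("Extreme outliers:", extreme)
```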

SLIDE 43
SHAPE OF A DISTRIBUTION
 Describes how data are distributed.
 Two useful shape-related statistics are:
o Skewness: measures the extent to which data values are not symmetrical.
o Kurtosis: measures the peakedness of the curve of the distribution, that is, how
sharply the curve rises approaching the center of the distribution.
SLIDE 44
SHAPE OF A DISTRIBUTION (Skewness)
 Measures the extent to which data values are not symmetrical.

SLIDE 45
SHAPE OF A DISTRIBUTION (Kurtosis)
 Measures how sharply the curve rises approaching the center of the distribution.

SLIDE 46
GENERAL DESCRIPTIVE STATS USING MICROSOFT EXCEL FUNCTIONS

NOTE: INSERT THE PIC FOR THE GENERAL DESCRIPTIVE STATS USING
MICROSOFT EXCEL FUNCTIONS.

SLIDE 47
EXPLORING NUMERICAL DATA USING QUARTILES
 The distribution of the values for a numerical variable can be explored by:
o Computing the quartiles.
o Computing the five-number summary.
o Constructing a boxplot.

SLIDE 48
QUARTILE MEASURES: Locating Quartiles
 Find a quartile by determining the value in the appropriate position in the ranked data, where:
o First quartile position: Q1 = (n+1)/4 ranked value.
o Second quartile position: Q2 = (n+1)/2 ranked value.
o Third quartile position: Q3 = 3(n+1)/4 ranked value.

 Where n is the number of observed values.
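
A sketch of the position rule, assuming the Grand Total column of the Slide 18 table as the data. When a position is not a whole number, this sketch interpolates linearly between the two neighbouring ranked values. That is an assumption: some textbooks round the position instead, which gives slightly different quartiles. The helper `ranked_value` is hypothetical and only handles positions between 1 and n.

```python
def ranked_value(sorted_values, position):
    """Value at a 1-based ranked position, interpolating between neighbours."""
    lower = int(position)      # whole part of the position
    fraction = position - lower
    if fraction == 0:
        return sorted_values[lower - 1]
    below = sorted_values[lower - 1]
    above = sorted_values[lower]
    return below + fraction * (above - below)


# Grand Total per country, from the Slide 18 table, in rank order.
values = sorted([680, 294, 522, 273, 783, 965, 671, 1144, 394, 517, 653, 522])
n = len(values)

q1 = ranked_value(values, (n + 1) / 4)      # position 3.25
q2 = ranked_value(values, (n + 1) / 2)      # position 6.5
q3 = ranked_value(values, 3 * (n + 1) / 4)  # position 9.75

print("Q1:", q1, "Q2:", q2, "Q3:", q3)
```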

SLIDE 49
QUARTILE MEASURES: The Interquartile Range (IQR)
 The IQR is Q3 – Q1 and measures the spread in the middle 50% of the data.
 The IQR is also called the midspread because it covers the middle 50% of the
data.
 The IQR is a measure of variability that is not influenced by outliers or extreme
values.
 Measures like Q1, Q3, and IQR that are not influenced by outliers are called
resistant measures.

SLIDE 50
THE FIVE NUMBER SUMMARY
The five numbers that help describe the center, spread and shape of data are:
 Xlargest
 Third Quartile (Q3)
 Median (Q2)
 First Quartile (Q1)
 Xsmallest
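
A sketch of the five-number summary and the IQR for the Grand Total column of the Slide 18 table. It relies on `statistics.quantiles` with `method="exclusive"`, which places the quartiles at the (n+1)/4, (n+1)/2 and 3(n+1)/4 positions and interpolates between ranked values; Excel or a textbook rounding rule may report slightly different quartiles.

```python
import statistics

# Grand Total per country, from the Slide 18 table.
values = [680, 294, 522, 273, 783, 965, 671, 1144, 394, 517, 653, 522]

# Quartiles at the (n+1)-based positions, with linear interpolation.
q1, q2, q3 = statistics.quantiles(values, n=4, method="exclusive")

# Five-number summary: Xsmallest, Q1, median, Q3, Xlargest.
five_number = (min(values), q1, q2, q3, max(values))

# Interquartile range: spread of the middle 50% of the data.
iqr = q3 - q1

print("Five-number summary:", five_number)
print("IQR:", iqr)
```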

SLIDE 51
RELATIONSHIPS AMONG THE FIVE-NUMBER SUMMARY AND DISTRIBUTION SHAPE
SLIDE 52
FIVE NUMBER SUMMARY AND THE BOXPLOT
 The BOXPLOT: A graphical display of the data based on the five-number
summary:

SLIDE 53
FIVE NUMBER SUMMARY: Shape of Boxplots
 If data are symmetric around the median, then the box and central line are
centered between the endpoints.

 A Boxplot can be shown in either a vertical or horizontal orientation.

SLIDE 54

SLIDE 55
ETHICAL CONSIDERATIONS
Numerical descriptive measures:
 Should document both good and bad results.
 Should be presented in a fair, objective and neutral manner.
 Should not use inappropriate summary measures to distort facts.
