Probability
- derived from the verb ‘probe’, meaning ‘to search into’ or ‘to look for’
Statistics
-a field of study that deals with the collection,
organization, presentation, summarization, analysis
and interpretation of numerical data.
CATEGORIES OF STATISTICS
DESCRIPTIVE STATISTICS
o Concerned with the organization, classification, and presentation of
collected data.
o Involves techniques to describe or characterize a set of gathered
data without any attempt to draw inferences or conclusions about
them.
o Also concerned with measuring the relationship between two or
more variables.
INFERENTIAL STATISTICS
o Refers to the technique of interpreting values obtained from
sample data in order to draw conclusions, make generalizations, or
make predictions or inferences about the population.
o The inference about the population is based on values computed
using the methods of descriptive statistics.
“Statistical Methodology may be looked upon as being three types: descriptive,
correlational, and inferential.”
- Downie and Heath (1984)
REVIEW EXERCISES
1. A newspaper article reports the average salaries of health
practitioners based on the average salaries obtained from samples
in different health centers and hospitals.
2. A social psychologist is interested in determining whether
individuals who graduate from technical vocational schools earn
more than those who finished a four-year college degree. He
gathered data from 150 randomly selected graduates of technical
vocational schools and 150 randomly selected graduates of
four-year college degree programs, and presented the results he
obtained using tables and graphs.
BASIC TERMS in STATISTICS
EXPERIMENT
- It is a systematic, planned and controlled activity aimed at
obtaining results that would yield a set of data.
POPULATION
- It refers to the collection of people, objects, individuals, or scores
that can be described as having a unique combination of qualities.
SAMPLE
- a part of a population
- it is a collection of some elements of a population that is
representative of the entire population.
VARIABLE
- is any property or characteristic of interest about each
individual unit of a population or of a sample
Types of Variables
PARAMETER
- is a summary measure calculated on the data of an entire population
- it quantifies the characteristics of the population under investigation.
Symbols:
• µ – for population mean
• π – for population proportion
• σ – for population standard deviation
STATISTIC
- is a summary measure, or value, calculated from sample data.
- it quantifies the characteristics of the sample which
represents the population.
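The difference between a parameter and a statistic can be illustrated with a short sketch. The population values below are hypothetical and chosen only for illustration, and the variable names (`mu`, `sigma`, `pi`, `x_bar`, `s`) are assumptions that mirror the symbols above rather than anything defined in these slides.

```python
import random
import statistics

# Hypothetical population: heights (cm) of all 10 members of a small club.
population = [150, 152, 155, 158, 160, 162, 165, 168, 170, 180]

# Parameters: summary measures computed on the ENTIRE population.
mu = statistics.mean(population)                           # population mean
sigma = statistics.pstdev(population)                      # population standard deviation
pi = sum(h >= 160 for h in population) / len(population)   # proportion at least 160 cm

# Statistics: the same kinds of measures computed on a SAMPLE only.
rng = random.Random(1)                # fixed seed so the draw is repeatable
sample = rng.sample(population, 4)    # 4 units drawn without replacement
x_bar = statistics.mean(sample)       # sample mean
s = statistics.stdev(sample)          # sample standard deviation (n - 1 divisor)

print("parameters:", mu, round(sigma, 2), pi)
print("statistics:", x_bar, round(s, 2))
```

The parameters are fixed for a given population, while the statistics change from sample to sample.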
DATA
- referred to as the raw material of statistics
- It is a set of values collected for the variable from each of the
elements of the sample.
Framework of Statistical Analysis
1. Planning
2. Collection of Data
3. Organization and Presentation of Data
4. Analysis of Collected Data
5. Interpretations and Conclusions
6. Recommendations based on conclusions
Statistical data are the raw material of statistical investigations and they arise when
measurements
Planning – starts with a concise and clear definition of the problem. There should be a
clear vision of priorities, and how to achieve them.
Data Collection – refers to the process of acquiring measurements, counts, or raw
data.
Organization and Presentation of Data – Summarizing, organizing and presenting data
is another phase of statistical study.
Data Analysis – includes conversion of the data into relevant information that leads to
the formulation of clear, summarized and comprehensible numerical description.
Conclusion and Interpretation of Results – Intelligent conclusions are drawn from the
analysis of data.
Recommendations – based on the interpretations and conclusions, recommendations
are made.
NATURE OF DATA
Data are classified according to source as primary or secondary, and
according to type as qualitative or quantitative.
PRIMARY DATA
- gathered directly from an original source.
Advantages:
- accuracy, reliability and relevance to the study because of the
researcher’s direct participation in gathering the information or
data
SECONDARY DATA
- information gathered from published or unpublished
materials that have been previously obtained by other
individuals or agencies.
QUALITATIVE DATA
- measure a quality, an attribute, or a characteristic of each
experimental unit.
- they are labels indicating the category or class into which an
individual, object, or process falls
QUANTITATIVE DATA
- measure a numerical quantity or amount in each
experimental unit.
Categories:
• Discrete variables – can assume a finite or countable number of
values.
• Continuous variables – can assume the infinitely many values
corresponding to the points on a line interval; measurable,
expressed on a continuous scale
Examples:
Discrete Variables:
- number of defective items
- number of orders per day (for a certain product)
- number of times you visit a doctor
- number of family members
Continuous Variables:
- Height
- Weight
- Time
- Volume
- Serum chl
Review Exercises
Identify which of the following represent continuous variables
and which represent discrete variables.
1. number of male students in statistics class
2. how many kinds of fruits you have eaten last week
3. life span of a sample of batteries
4. number of words you can encode in one minute
5. speed of the horses in a race
6. height of the grade 6 pupils in your school
7. reaction time of the subjects in an experiment
Answers:
1. Discrete
2. Discrete
3. Continuous
4. Discrete
5. Continuous
6. Continuous
7. Continuous
METHODS OF DATA COLLECTION
1. INTERVIEW METHOD
- a direct method of investigation because the collection of
information and data is face-to-face or through a direct verbal
interaction between the interviewer and the interviewee
1. INTERVIEW METHOD: ADVANTAGES
1. INTERVIEW METHOD: DISADVANTAGES
2. QUESTIONNAIRE METHOD
- an indirect method of investigation
- the respondents are asked to provide responses to a
prepared and well-planned list of questions
2. QUESTIONNAIRE METHOD: ADVANTAGES
3. REGISTRATION METHOD
- usually enforced by law
Examples:
▪ Registration of births, marriages, and deaths with the
Philippine Statistics Authority
▪ Registration of motor vehicles and securing drivers’ licenses
from the Land Transportation Office
3. REGISTRATION METHOD: ADVANTAGES
4. OBSERVATION METHOD
- the investigator collects information on the characteristics
of the units under study by actual measurements or by
observing the behavior of persons or organizations and their
outcomes
4. OBSERVATION METHOD: ADVANTAGES
5. EXPERIMENTATION METHOD
- used to describe any process that generates a set of data
- used when the objective is to determine the cause-and-effect
relationship of certain phenomena under controlled
conditions, such as in scientific research.
Classification of Measurement Data
NOMINAL SCALE
- consists of labels or names to classify the observed elements into the
categories to which they belong
Examples:
- gender distribution of 100 adults (55 male and 45 female)
- marital status (married, single, widowed, separated)
- outcome of tossing a coin (head or tail)
ORDINAL OR RANKING SCALE
- elements are arranged in some meaningful kind of natural order
which corresponds to their relative position or size, but with no information
about the difference between adjacent positions.
Examples:
- academic performance of students (poor, fair, good, very good, outstanding)
- choice of SIM (most preferred, next preferred, least preferred)
- size of 100 shirts (25 are small, 25 are medium, 25 are large and 25 are extra large)
RATIO SCALE
- here, we have not only the order property, a unit of measurement,
and a meaningful difference between elements but we also have a
fixed origin or zero point as opposed to an arbitrary origin in the
interval scale.
Examples:
- height of pine trees in Camp John Hay
- volume of helium gas in balloons
- time (in minutes) of each runner in a marathon
[Figure: classification of the different measurement scales (Nominal: named variables)]
b) Classify the product into three or more categories according to
some characteristics such as: good, better, best.
c) Inspect each handy phone to determine the storage capacity:
32 GB, 64 GB, or 128 GB
d) Count the number of units produced per day for a given number
of days to determine the average daily production and measure its
variation
Answers:
a) Nominal
b) Ordinal
c) Ratio
d) Ratio
e) Nominal
f) Ordinal
Sampling
The act, process, or technique of selecting a suitable sample, or a
representative part of a population, for the purpose of
determining parameters or characteristics of the whole
population
Illustration: In a population of 22,000 students enrolled at
Saint Louis University in a particular semester, what sample
size is needed to get an accurate result for a study using a
margin of error of:
a) 1 %
b) 2.5 %
c) 5 %
Answers:
a) n = 6875
b) n = 1492
c) n = 393
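The answers above are consistent with Yamane's formula, n = N / (1 + N e^2), where N is the population size and e is the margin of error as a decimal. The sketch below is one way to compute it; the function name `yamane_sample_size` is an assumption made for illustration, and the result is rounded up to the next whole respondent.

```python
import math
from fractions import Fraction

def yamane_sample_size(population_size, margin_of_error):
    """Sample size by Yamane's formula n = N / (1 + N * e**2), rounded up."""
    # Exact rational arithmetic avoids float round-off pushing an exact
    # result (such as 6875) over to the next integer when rounding up.
    e = Fraction(str(margin_of_error))
    n = Fraction(population_size) / (1 + population_size * e ** 2)
    return math.ceil(n)

for e in (0.01, 0.025, 0.05):
    print(f"margin of error {e}: n = {yamane_sample_size(22_000, e)}")
```

Run on N = 22,000, this reproduces the three answers listed for the illustration.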
Sampling Techniques
Sampling is the procedure of gathering sampling units or observations from the
population.
Probability Sampling
Simple Random Sampling
In simple random sampling, a sample of size n is selected from a population of size N such
that each member of the population has an equal and independent chance of being
drawn and included in the sample.
Systematic Random Sampling
This method consists of randomly selecting one unit and choosing additional elements
at equal intervals until the desired sample size is reached.
Stratified Sampling
Under this method, the researcher selects simple random samples from each of the
subpopulations or strata of the population.
Steps:
1. Divide the population into subpopulations (strata).
2. From each stratum, obtain a random sample of size proportional to the size of the
stratum.
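The simple random and systematic selection rules described above can be sketched as follows. This is a minimal illustration, assuming a hypothetical population of 100 student ID numbers and a sampling interval of k = N // n; the function names are illustrative and do not come from the slides.

```python
import random

def simple_random_sample(population, n, rng):
    """Draw n distinct units; every unit has the same chance of selection."""
    return rng.sample(population, n)

def systematic_sample(population, n, rng):
    """Pick a random starting unit, then take every k-th unit after it."""
    k = len(population) // n      # sampling interval
    start = rng.randrange(k)      # random start within the first interval
    return [population[start + i * k] for i in range(n)]

rng = random.Random(2024)             # fixed seed so the draws are repeatable
students = list(range(1, 101))        # hypothetical population: IDs 1..100

srs = simple_random_sample(students, 10, rng)
systematic = systematic_sample(students, 10, rng)

print("simple random:", sorted(srs))
print("systematic:   ", systematic)
```

In the systematic sample, consecutive selected IDs always differ by the interval k, whereas the simple random sample has no such pattern.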
Stratified Sampling: Illustration B
The following table shows the share of each stratum (age group) if the desired sample
size, n, is 1500.
Stratum (age range)   Population   % share               Sample (ni)
30 – 44               1,500        1500 / 15000 = 10%    0.1 x 1500 = 150
Total:                15,000                             1500
Stratified Sampling: Illustration A
First, determine the population. Then using a margin of error, say 5%, determine the
sample size using Yamane’s formula. n=363 (round up).
Next, determine the number of respondents per group:
- Determine the proportion of the sample size and the population size. p = n/N =
362.79/3900 = 0.09302
- Multiply each subpopulation (Ni) by the computed proportion.
Department                Number of Students   Sample
Business Administration   1,500                140
Management                1,200                112
Finance                   850                  79
Entrepreneurship          200                  19
Culinary Arts             150                  14
Total (N)                 3,900                364
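The proportional allocation in Illustration A can be checked with a short sketch. It assumes each stratum's share is rounded to the nearest whole number, which is an inference from the table values, not something the slide states. Because each stratum is rounded separately, the allocated total (364) comes out one more than the overall n of 363.

```python
# Department sizes (Ni) as given in Illustration A.
departments = {
    "Business Administration": 1500,
    "Management": 1200,
    "Finance": 850,
    "Entrepreneurship": 200,
    "Culinary Arts": 150,
}

N = sum(departments.values())     # total population
e = 0.05                          # margin of error
n = N / (1 + N * e ** 2)          # Yamane's formula, left unrounded here
p = n / N                         # sampling proportion, about 0.09302

# Assumed rounding rule: nearest whole respondent per stratum.
allocation = {dept: round(Ni * p) for dept, Ni in departments.items()}
total = sum(allocation.values())

for dept, ni in allocation.items():
    print(f"{dept:<25} {departments[dept]:>6} {ni:>5}")
print(f"{'Total':<25} {N:>6} {total:>5}")
```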
Multi-stage Sampling
This method uses several stages or phases in getting random samples from the general
population.
This is useful in conducting nationwide surveys or any survey involving a very large
population.
Sampling Techniques
Non-Probability Sampling
In non-probability sampling, the selection of units is solely determined by rules or
guidelines set by the researcher/investigator.
Purposive Sampling
The researcher lays down the criteria, and subjects that satisfy the criteria are included
in the sample.
Quota Sampling
The interviewer’s aim is just to fill the prescribed quota, provided he follows the
definite instructions given about the section of the public he is to question.
Judgment Sampling
The samples are selected according to the opinion of someone who is familiar with the
relevant characteristics of the population. Often used when the required sample is
small or when the population is highly heterogeneous.
Snowball Sampling
Snowball sampling is especially useful when populations are inaccessible or hard to
find.
In snowball sampling, the investigator begins by identifying someone who meets the
criteria for inclusion in the study, then asks that person to recommend others they
know who also meet the criteria.
Presentation of Data
Data may be presented in various ways:
▪ Textual
▪ Graphical (e.g. pie charts, bar charts)
▪ Tabular
Textual Presentation – the use of words, statements, and paragraphs to present data
or information
Graphical Presentation – a method wherein the set of data is presented in visual
forms called graphs.
Tabular Presentation – the use of tables, one of which is the frequency distribution table.
Stem-and-leaf Plots
• Data are sorted according to a pattern which involves separating a number into
two parts, usually the first digit and the other digits.
Data of the daily price quotations for a certain stock over a period of 20 days:
10 11 15 23 27
28 38 38 39 39
40 41 44 45 46
46 52 57 58 65
Stem-and-leaf Plot for the given data:
Stem   Leaves   Frequency (f)
1      015      3
2      378      3
3      8899     4
4      014566   6
5      278      3
6      5        1
Total           N=20
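The stem-and-leaf plot above can be reproduced with a few lines of code. This sketch assumes two-digit values, so the stem is the tens digit and the leaf is the units digit; the function name `stem_and_leaf` is illustrative.

```python
from collections import defaultdict

def stem_and_leaf(values):
    """Group sorted values by stem (tens digit); leaves are the units digits."""
    plot = defaultdict(list)
    for value in sorted(values):
        stem, leaf = divmod(value, 10)
        plot[stem].append(leaf)
    return dict(plot)

# Daily price quotations from the slide (20 days).
prices = [10, 11, 15, 23, 27,
          28, 38, 38, 39, 39,
          40, 41, 44, 45, 46,
          46, 52, 57, 58, 65]

plot = stem_and_leaf(prices)
for stem, leaves in plot.items():
    # Columns: stem, leaves, frequency
    print(f"{stem}  {''.join(map(str, leaves)):<8} {len(leaves)}")
print("Total N =", sum(len(leaves) for leaves in plot.values()))
```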
Frequency Distribution Table
A frequency distribution for qualitative data lists all categories and the number of
elements that belong to each of the categories.
A sample of rural county arrests gave the following set of offenses with which
individuals were charged:
rape      theft     burglary
robbery   arson     murder
burglary  burglary  murder
Frequency Table
Offense   Tally   Frequency
Rape      II      2
Robbery   III     3
Frequency Distribution Table for Ungrouped Data
Large masses of data can be analyzed better and quicker when organized and
arranged in some meaningful order like in a frequency distribution table.
➢ Relative Frequency – tabular arrangement of data showing the proportion of each
frequency to the total frequency, RF = f/N. It may be expressed in decimal or in
percentage.
➢ Cumulative Frequency of each score equals the sum of its frequency and the
frequencies of all the scores below it.
A class of 20 students receive the following scores on a quiz of 35 points:
30 35 28 26 32
32 29 32 33 31
28 29 29 32 33
29 29 27 31 31
Relative Frequency and Cumulative Frequency Distribution Table
Score   Tally   f    RF (%)   CF
35      I       1    5        20
34              0    0        19
33      II      2    10       19
32      IIII    4    20       17
31      III     3    15       13
30      I       1    5        10
29      IIIII   5    25       9
28      II      2    10       4
27      I       1    5        2
26      I       1    5        1
Total           20   100
Interpretation:
▪ 5% of the class got a perfect score of 35 points.
▪ Half of the class got more than 30 points: the CF at score 30 is 10, which is half
of 20, so 10 students scored 30 or below and the other 10 scored higher.
▪ The highest percentage is at the score 29, which means 25% of the class obtained a
score of 29, followed by the score 32 with 20% of the class obtaining it.
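The relative and cumulative frequencies in the quiz-score table can be recomputed from the raw scores. The sketch below follows the definitions given earlier (RF = f/N as a percentage; CF accumulated from the lowest score upward); the variable names are illustrative.

```python
from collections import Counter

# Quiz scores of the 20 students, as listed on the slide.
scores = [30, 35, 28, 26, 32,
          32, 29, 32, 33, 31,
          28, 29, 29, 32, 33,
          29, 29, 27, 31, 31]

counts = Counter(scores)
N = len(scores)

rows = []
running_total = 0
for score in range(min(scores), max(scores) + 1):   # lowest to highest score
    f = counts[score]                # frequency (0 if the score never occurs)
    running_total += f               # cumulative frequency up to this score
    rows.append((score, f, 100 * f / N, running_total))

# Print from the highest score down, matching the layout of the table.
print("Score   f  RF(%)  CF")
for score, f, rf, cf in reversed(rows):
    print(f"{score:>5} {f:>3} {rf:>6.0f} {cf:>3}")
```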
Frequency Distribution Table for Grouped Data
➢ Interval Width – the number of units from the lower class limit to the upper
class limit
➢ Daniel (1999) cited Sturges’s rule (1926) as a guide in deciding how many class
intervals are needed.
The following data are the time, in minutes, it took a group of volunteer workers to
perform a given task. Construct a frequency distribution table.
   12 12 13 15 15 16
17 20 21 21 21 22 22
22 23 24 26 27 27 27
28 29 29 30 31 32 34
35 37 41 41 42 45 47
50 53 56 60 62 52 21
4. Compute for class midpoint (mark).
5. Identify class boundaries (true limits).
Class Interval   Tally               f    Midpoint   Class Boundaries   RF (%)   CF
11-19            IIIII-III           8    15         10.5-19.5          19.05    8
20-28            IIIII-IIIII-IIIII   15   24         19.5-28.5          35.71    23
29-37            IIIII-III           8    33         28.5-37.5          19.05    31
38-46            IIII                4    42         37.5-46.5          9.52     35
47-55            IIII                4    51         46.5-55.5          9.52     39
56-64            III                 3    60         55.5-64.5          7.14     42
Total                                42                                 100
Interpretation: The highest percentage 35.71% which means that most group of
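The grouped table can be rebuilt from the class limits and frequencies alone. The sketch below assumes the common form of Sturges's rule, k = 1 + 3.322 log10(N), which the slides name but do not write out, and it takes the class limits and frequencies directly from the table above instead of recounting the raw data.

```python
import math

# Class limits and frequencies taken from the grouped table.
classes = [(11, 19), (20, 28), (29, 37), (38, 46), (47, 55), (56, 64)]
freqs = [8, 15, 8, 4, 4, 3]
N = sum(freqs)

# Sturges's rule (assumed common form) for the suggested number of classes.
k = 1 + 3.322 * math.log10(N)

table = []
cumulative = 0
for (lower, upper), f in zip(classes, freqs):
    cumulative += f
    midpoint = (lower + upper) / 2               # class mark
    boundaries = (lower - 0.5, upper + 0.5)      # true limits
    rf = round(100 * f / N, 2)                   # relative frequency in percent
    table.append((lower, upper, f, midpoint, boundaries, rf, cumulative))

print(f"N = {N}, Sturges's k = {k:.2f} (about {round(k)} classes)")
for lower, upper, f, mid, (lb, ub), rf, cf in table:
    print(f"{lower}-{upper}  f={f:<3} mid={mid:<5} {lb}-{ub}  RF={rf:<6} CF={cf}")
```

With N = 42 the rule suggests about six classes, which matches the six class intervals used in the table.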
Graphical Presentation
HISTOGRAM
POLYGONS
OGIVES
[Figure: Frequency distribution for the time (in minutes) it took a group of volunteer
workers to perform a given task. Horizontal axis: CLASSES (11-19, 20-28, 29-37, 38-46,
47-55, 56-64); vertical axis: FREQUENCY (8, 15, 8, 4, 4, 3).]