
STATIS130

Basic Statistical Methods Applied to Research

Renato V. Herrera, Jr. Ph.D.

UNO-R Vision-Mission
Vision: A Catholic University committed to the integral
formation of the human person with passion for
excellence and service to the Church and society.
Mission: An Augustinian Recollect University that
educates the mind and heart by providing the climate,
structure, and the means to develop the vocation,
knowledge, skills, talents, and attitude of the community
as permeated by the Gospel values for the service of
humanity, love, and praise to the One God.
Recoletos De Bacolod GS Vision-Mission
Vision: To be a leading graduate education institution
that is preferred by clients because of its quality
graduate programs responsive to the contemporary
glocal needs of academe, industry, the Church, and the
society.

Mission: To help students achieve professional growth
and develop their technical competence, character, and
faith in God through quality graduate education programs
in the spirit of Caritas et Scientia.
Institutional Graduate Attributes
Spiritually sound individuals who are Christ-centered and Marian-inspired.
Intellectually cultured individuals who are able to rationally and
eloquently communicate their ideas and appreciate the arts as a reflection
of the infinite beauty of God.
Morally healthy individuals who can weigh values with a great sense
of accountability.
Physically healthy individuals who give due respect to the human
body, keeping it fit as a temple of the Holy Spirit.
Culturally conscious individuals who value heritage earned by past
generations, enriching it by promoting desirable traditions and
rendering authentic service to the Church and the country for the
common good.
Socially concerned individuals who are sensitive and responsive to
the needs of the marginalized sector of the community and the
society.
Technically proficient individuals who have superior skills in the practice
of their professions.
Scholarly leaders of science who extend the frontiers of knowledge
through experimentation and verification, bringing about a deeper
evaluation of problems that will make them see profoundly the synthesis of
faith, reason, culture, and life.
Course Information
Basic Statistical Methods Applied to Research

• Basic concepts and methods of descriptive and inferential
statistics and their use in research design, analysis, and
interpretation.
• Provide students with basic statistical concepts,
methods, and applications.
• Prepare students for advanced statistical
analyses.
Topics: types of statistical data, collection of data,
tabulation and presentation of numerical data, validity
and reliability of the instrument, the different measures
of central tendency, measures of variability or spread,
normal probability distributions, hypothesis testing,
independent and dependent samples t-test, analysis of
variance, correlation and regression.
Course Outcomes
Basic Statistical Methods Applied to Research

• Appreciate the role of statistics as part of quantitative
research methodology.
• Apply the skills in collecting, tabulating, and graphing
numerical data.
• Describe quantitative results using descriptive statistics.
• Use inferential statistics to test hypotheses.
• Develop competence in the use of statistical software
for classifying and describing data, as well as for
inference.
• Plan and carry out basic statistical analyses of
research data.
• Choose appropriate statistical methods according to
circumstances.
Grading System
Basic Statistical Methods Applied to Research

• Class Standing (assignments/problem sets, reports, class
participation, etc.) - 40%
• Examination - 20%
• Project/Performance Task - 40%
The “INC” (Incomplete) mark will be given to students who fail
to complete and submit the necessary requirements of the
course.
The student must complete the requirements of the course within
one year after the final exam period.

Course Outputs
Basic Statistical Methods Applied to Research

• Problem Sets
• Performance Tasks
Attendance Policy
Basic Statistical Methods Applied to Research

• Prompt and regular attendance in all classes is required of all students
from the first meeting and throughout the entire duration of the
semester or term.

• Excuses are for the time missed only. All work covered in class during
the absence of the student shall be made up to the satisfaction of the
professor within a reasonable time from the date of absence.

• A student is allowed a maximum number of absences equal to 20% of the
total number of class hours for the whole semester. A student may
be excused due to the following: death in the family, illness, and a
school activity with a letter of endorsement by the Dean. In case of
illness, the student is required to present a medical certificate.

• A student who has incurred absences beyond the allowable maximum
of 20% shall be marked dropped from the subject.
Introduction to Statistics
Overview of Statistics
Data Classification
Data Collection

Dr. Herrera / October 15, 2022

Overview of Statistics
Definition of Statistics

• Statistics is a science of
collecting, organizing,
summarizing, analyzing,
and interpreting data in
order to make decisions or
provide answers or solutions
to an inquiry.
Overview of Statistics
Statistics enables us to:

• Characterize persons, objects, situations, and phenomena.
• Explain relationships among variables.
• Formulate objective assessments and comparisons.
• Make evidence-based decisions and predictions.
Overview of Statistics
Statistics in the context of research
Overview of Statistics
Branches of Statistics

• The study of statistics has two major branches: descriptive
statistics and inferential statistics.
• Descriptive statistics is the branch of statistics that involves
the organization, summarization, and display of data.
• Summaries usually consist of graphs and numbers such as
averages and percentages.
• Inferential statistics is the branch of statistics that involves
using a sample to draw conclusions or make predictions
about a population.
• Performing estimations and hypothesis tests and determining
relationships among variables are also part of inferential
statistics.

Example:

1. A large sample of men, aged 48, was studied for 18 years. For
unmarried men, approximately 70% were alive at age 65. For married
men, 90% were alive at age 65. (Source: The Journal of Family Issues)
Determine which part of the study represents the descriptive branch of
statistics and what conclusion might be drawn from the study using
inferential statistics.

Solution:
Descriptive: “For unmarried men, approximately 70% were alive at age
65” and “For married men, 90% were alive at age 65”
Possible Inference: “Being married is associated with a longer life for
men”

Example:

2. Determine whether descriptive or inferential statistics were used.

a. A survey of 2234 people conducted by the Harris Poll found that 55%
of the respondents said that excessive complaining by adults was the
most annoying social media habit.

b. Last August 31, 2021, the OCTA Research Group predicted that the
average daily cases of COVID for the first week of September 2021
would be 25,000.

Solution:
a. Descriptive statistics were used.
b. Inferential statistics were used.
Overview of Statistics
Variable and Data

• A variable is a characteristic or attribute that can assume
different values.
• Data are the values (measurements or observations)
that the variables can assume.
• A collection of data values forms a data set. Each value in the
data set is called a data value.
Data Classification
Classification of Variables

Variables
  Qualitative
  Quantitative
    Discrete
    Continuous

• Qualitative variables express a categorical attribute.
• Quantitative variables are those that can be counted or measured.

Example:
Qualitative or Quantitative?

a. Amount of time it takes to assemble a simple puzzle.
b. Number of errors in a midterm exam.
c. Rating of a newly elected politician (excellent, good, fair, poor)
d. The Region in which a person lives.

Solution:
a. Quantitative b. Quantitative c. Qualitative d. Qualitative

• Discrete variables assume values that can be counted.
• Continuous variables assume an infinite number of values
between any two specific values. They are obtained by measuring
and often include fractions and decimals.

Discrete Variable
• Possible outcomes are listed/counted
• Countable number of values
• Possible outcomes are a set of separate values

Continuous Variable
• Possible outcomes are measured
• Uncountably infinite number of values
• Possible outcomes form an interval

Example:
Discrete or Continuous?

a. Each of several physicians plans to count the number of physical
examinations given during the next full week.
b. When the typical patient has blood drawn as part of a routine
examination, the volume of blood drawn is between 0 mL and 50 mL.

Solution:
a. Discrete b. Continuous

Example:
Discrete or Continuous?

c. When studying the relationship between lengths of feet and heights so
that footprint evidence at a crime scene can be used to estimate the
height of the suspect, a researcher records the exact length of feet from
a large sample of random subjects.
d. The number of times that randomly selected drivers texted while
driving during the past 7 days.

Solution:
c. Continuous d. Discrete

Data Classification
Levels of Measurement
• There are four levels of measurement of variables: nominal,
ordinal, interval, and ratio.
• Nominal level of measurement arises when we have variables that
are categorical and non-numeric or where the numbers have no
sense of ordering.
• Ordinal level classifies data into categories that can be ranked;
however, precise differences between the ranks do not exist.
• Interval level ranks data, and precise differences between units of
measure do exist; however, there is no meaningful zero.
• Ratio level possesses all the characteristics of interval
measurement, and there exists a true zero.

Example:
What level of measurement would be used to measure each variable?

a. Temperature
b. Weight
c. Zip Code
d. Grade Level

Solution: a. Interval b. Ratio c. Nominal d. Ordinal

Data Collection
Methods of Collecting Data

• Statistics is a tool for converting data into information.

Data → STATISTICS → Information

Where does the data come from? Is it representative of the population
from which it was drawn?
How is the data gathered?
How do we ensure it is accurate? Is the data reliable?

• Published Sources. Data in print or electronic form, including data
found on internet websites. Primary data sources are those published
by the individual or group that collected the data. Secondary sources
are those compiled from primary sources.

• Experiments. A study that examines the effect on a variable of
varying the values of another variable or variables while keeping all
other things equal. A typical experiment contains both a treatment
group and a control group.

• Surveys. A process that uses questionnaires or similar means to
gather values for the responses from a set of participants.

• Example Published Sources. Many government agencies publish
primary data sources that are available at the data.gov.ph website.

• Example Experiments. Pharmaceutical companies use experiments to
determine whether a new drug is effective. A group of patients who have
many similar characteristics is divided into two subgroups. The
treatment group receives the new drug. The control group often
receives a placebo, a substance that has no medical effects.

• Example Surveys. Globe Telecommunication uses its Customer
Information Survey to know its customers, understand their needs,
and deliver more relevant and customized products and services.
Globe and TM customers are randomly selected as respondents to the
survey.
Data Collection
Population and Sample

• Two types of data sets: population and sample.
• A population is the collection of all outcomes, responses,
measurements, or counts that are of interest. (Census)
• A sample is a subset, or part, of a population. (Survey)

Example:
In a survey, 614 small business owners in the Philippines were asked
whether they thought their company’s Facebook presence was valuable.
Two hundred fifty-eight of the 614 respondents said yes. Identify the
population and the sample, and describe the sample data set.

Solution:
Population: All small business owners in the Philippines.
Sample: 614 small business owners in the survey.
Data set: consists of 258 owners who said yes and 356 owners who
said no.

Example:
Forty-eight hospitals from different regions were randomly selected. Then,
data were gathered from these 48 hospitals to estimate the hospital bed
occupancy rate of hospitals in the Philippines. What is the population?
Sample? Describe the data set.

Solution:
Population: All hospitals in the Philippines
Sample: 48 hospitals
Data set: 48 hospital bed occupancy rates

Data Collection
Parameter and Statistic
• A parameter is a numerical description of a population
characteristic.
• A statistic is a numerical description of a sample
characteristic.

Example:
Determine whether the numerical value describes a population parameter
or sample statistic.

1. A survey of approximately 4,000 employers reported that the average
starting salary for marketing majors is 15,000 pesos.
2. Only 25% of all senior citizens in the Philippines have been fully
vaccinated against COVID-19.
3. According to the 2020 census of population and housing, three in every
five institutional living quarters are of residential type.
4. Based on a nationwide survey, 52.4% of elementary graduates can read,
write, compute, and comprehend.

Solution:
1. The average starting salary of 15,000 pesos is a sample statistic.
2. The percent, 25%, is a population parameter.
3. Three in every five is a population parameter.
4. The percent, 52.4%, is a sample statistic.
Data Collection
Sampling Techniques
• Random Sampling: A random sample is a sample in which all members
of the population have an equal chance of being selected.

• Systematic Sampling: A systematic sample is a sample obtained by
selecting every kth member of the population, where k is a counting
number.

• Stratified Sampling: A stratified sample is a sample obtained by
dividing the population into groups or strata according to some
characteristic relevant to the study. Then subjects are selected from
each subgroup.

• Cluster Sampling: A cluster sample is obtained by dividing the
population into sections or clusters and then selecting one or more
clusters and using all members in the clusters as members of the
sample.

Example:
State which sampling method was used.

a. Out of 10 hospitals in a municipality, a researcher selects one and
collects records for a 24-hour period on the types of emergencies that
were treated there.
b. A researcher divides a group of students according to gender, major
field, and low, average, and high grade point average. Then she randomly
selects six students from each group to answer questions in a survey.
c. The subscribers to a magazine are numbered. Then a sample of these
people is selected using random numbers.
d. Every 10th bottle of Super-Duper Cola is selected, and the amount of
liquid in the bottle is measured. The purpose is to see if the machines
that fill the bottles are working properly.

Solution: a. Cluster b. Stratified c. Random d. Systematic
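
The four sampling techniques can be tried out on a small list. The sketch below is only an illustration: the population of 100 student IDs, the odd/even strata, the five sections, and all function names are hypothetical, and the random starting point in the systematic sample is an assumed refinement of "every kth member".

```python
import random

def simple_random_sample(population, n, rng):
    """Every member has an equal chance of being selected."""
    return rng.sample(population, n)

def systematic_sample(population, k, rng):
    """Select every k-th member, starting at a random position among the first k."""
    start = rng.randrange(k)
    return population[start::k]

def stratified_sample(population, stratum_of, n_per_stratum, rng):
    """Group members into strata, then sample randomly within each stratum."""
    strata = {}
    for member in population:
        strata.setdefault(stratum_of(member), []).append(member)
    sample = []
    for members in strata.values():
        sample.extend(rng.sample(members, n_per_stratum))
    return sample

def cluster_sample(clusters, n_clusters, rng):
    """Select whole clusters at random and keep all of their members."""
    chosen = rng.sample(list(clusters), n_clusters)
    return [member for name in chosen for member in clusters[name]]

rng = random.Random(2022)        # fixed seed so the sketch is repeatable
students = list(range(1, 101))   # hypothetical population: student IDs 1..100

srs = simple_random_sample(students, 10, rng)
systematic = systematic_sample(students, 10, rng)
stratified = stratified_sample(
    students, lambda s: "odd" if s % 2 else "even", 5, rng)
sections = {f"Section {i}": students[i * 20:(i + 1) * 20] for i in range(5)}
cluster = cluster_sample(sections, 2, rng)

print("Random:    ", sorted(srs))
print("Systematic:", systematic)
print("Stratified:", sorted(stratified))
print("Cluster:   ", len(cluster), "students from 2 sections")
```

Note that the stratified sample takes some members from every stratum, while the cluster sample takes every member from only some clusters.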

Data Collection
Other Sampling Techniques

• Convenience Sampling: A sampling technique that involves using
respondents who are convenient to the researcher.
• Volunteer or Self-selected Sampling: A sampling technique where
people decide for themselves if they wish to be included in the
sample.
• Other non-probability sampling techniques: purposive
sampling and snowball sampling.

Sampling error is the difference between the results obtained from a
sample and the results obtained from the population from which the
sample was selected.
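
A short sketch can make sampling error concrete. It assumes a made-up population of 1,000 starting salaries (the figures are randomly generated, not real data) and takes the mean as the result being compared, so the sampling error is the sample statistic minus the population parameter.

```python
import random
from statistics import mean

rng = random.Random(7)  # fixed seed so the sketch is repeatable

# Hypothetical population: monthly starting salaries (pesos) of 1,000 graduates.
population = [rng.randint(12_000, 18_000) for _ in range(1_000)]

parameter = mean(population)         # numerical description of the population
sample = rng.sample(population, 50)  # simple random sample of 50 graduates
statistic = mean(sample)             # numerical description of the sample

# Sampling error: result from the sample minus result from the population.
sampling_error = statistic - parameter

print(f"parameter = {parameter:.2f}")
print(f"statistic = {statistic:.2f}")
print(f"sampling error = {sampling_error:.2f}")
```

Drawing a different sample would give a different statistic and therefore a different sampling error, while the parameter stays fixed.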
Organizing Data
Organizing Categorical Variables

Summary Table: A summary table tallies the set of individual values as
frequencies or percentages for each category. A summary table helps
reveal differences among the categories by displaying the frequency,
amount, or percentage of items in a set of categories in a separate
column.

Contingency Table: A contingency table cross-tabulates, or tallies jointly,
the data of two or more categorical variables, which enables one to study
patterns that may exist between the variables. Tallies can be shown as a
frequency, a percentage of the overall total, a percentage of the row total,
or a percentage of the column total.
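
Both tables can be built by simple tallying. The sketch below uses eight invented survey records (sex and preferred learning mode are hypothetical variables, not data from the course) and Python's `collections.Counter` to count categories.

```python
from collections import Counter

# Hypothetical survey records: (sex, preferred learning mode).
records = [
    ("Female", "Online"), ("Male", "Face-to-face"), ("Female", "Online"),
    ("Female", "Face-to-face"), ("Male", "Online"), ("Male", "Face-to-face"),
    ("Female", "Online"), ("Male", "Face-to-face"),
]
n = len(records)

# Summary table: frequency and percentage for each category of one variable.
mode_counts = Counter(mode for _, mode in records)
print(f"{'Mode':<14}{'Frequency':>10}{'Percent':>10}")
for mode, count in sorted(mode_counts.items()):
    print(f"{mode:<14}{count:>10}{100 * count / n:>9.1f}%")

# Contingency table: joint tallies of two categorical variables.
joint = Counter(records)
sexes = sorted({sex for sex, _ in records})
modes = sorted(mode_counts)
print()
print(f"{'':<8}" + "".join(f"{m:>14}" for m in modes) + f"{'Total':>8}")
for sex in sexes:
    row = [joint[(sex, m)] for m in modes]
    print(f"{sex:<8}" + "".join(f"{c:>14}" for c in row) + f"{sum(row):>8}")
```

Dividing each cell by the overall total, the row total, or the column total turns the joint frequencies into the three kinds of percentages.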

Example Summary Table: Typos on a resume do not make a very good
impression when applying for a job. Senior executives were asked how many
typos in a resume would make them not consider a job candidate (“Job
Seekers Need a Keen Eye,” USA Today). The resulting data are summarized in
the summary table.

Example Contingency Table: The paper “Facial Expression of Pain in
Elderly Adults with Dementia” (Journal of Undergraduate Research [2006])
examined the relationship between a nurse’s assessment of a patient’s
facial expression and his or her self-reported level of pain. Data for
89 patients are summarized in the given table.

Organizing Data
Organizing Numerical Variables

• Numerical variables are organized using a frequency distribution,
relative frequency distribution, percentage distribution, and
cumulative frequency distribution.

• A frequency distribution tallies the values of a numerical variable
into a set of numerically ordered classes.
• A frequency distribution should have at least 5 and
no more than 15 classes.

• A relative frequency distribution presents the relative frequency, or
proportion, of the total that each class represents.
The proportion or relative frequency is equal to the number of
values in each class divided by the total number of values.

• A percentage distribution presents the percentage of the total
that each class represents.

• A cumulative percentage distribution provides a way of presenting
information about the percentage of values that are less than a
specific amount.
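
The three distributions follow from the class frequencies by simple arithmetic. The sketch below uses invented classes and frequencies for 20 exam scores; none of the numbers come from the course data set.

```python
from itertools import accumulate

# Hypothetical class intervals and frequencies for 20 exam scores.
classes = ["50-59", "60-69", "70-79", "80-89", "90-99"]
frequencies = [2, 4, 7, 5, 2]
total = sum(frequencies)

# Relative frequency: class frequency divided by the total number of values.
relative = [f / total for f in frequencies]
# Percentage: relative frequency expressed out of 100.
percentage = [100 * r for r in relative]
# Cumulative percentage: running total, i.e. the percentage of values
# that fall below the upper boundary of each class.
cumulative_pct = list(accumulate(percentage))

print(f"{'Class':<8}{'f':>4}{'Rel. f':>8}{'%':>7}{'Cum. %':>8}")
for c, f, r, p, cp in zip(classes, frequencies, relative, percentage,
                          cumulative_pct):
    print(f"{c:<8}{f:>4}{r:>8.2f}{p:>7.1f}{cp:>8.1f}")
```

The relative frequencies should add up to 1 and the last cumulative percentage should be 100, which is a quick check on the tallies.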
Steps for constructing a frequency distribution:

1. Find the range of the scores.

2. Determine the width of each class interval.

3. List the limits of each class interval, placing the interval containing the lowest
score value at the bottom.

4. Tally the raw scores into the appropriate class intervals.

5. Add the tallies for each interval to obtain the interval frequency.
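
The five steps above can be sketched as follows. The 20 raw scores are invented, and two choices are assumptions rather than part of the steps: aiming for 5 classes (the low end of the 5 to 15 guideline) and starting the lowest interval at a multiple of the class width.

```python
import math

# Hypothetical raw scores (not the course data set).
scores = [52, 58, 61, 64, 67, 69, 70, 72, 73, 75,
          76, 78, 79, 81, 83, 85, 86, 88, 91, 97]

# Step 1: find the range of the scores.
score_range = max(scores) - min(scores)

# Step 2: determine the width of each class interval.
desired_classes = 5  # assumed target number of classes
width = math.ceil((score_range + 1) / desired_classes)

# Step 3: list the limits of each class interval.
lowest_limit = (min(scores) // width) * width  # assumed starting point
n_classes = (max(scores) - lowest_limit) // width + 1
limits = [(lowest_limit + i * width, lowest_limit + (i + 1) * width - 1)
          for i in range(n_classes)]

# Steps 4 and 5: tally each score into its interval and total the tallies.
frequencies = [0] * n_classes
for score in scores:
    frequencies[(score - lowest_limit) // width] += 1

# Print with the interval containing the lowest score at the bottom.
for (low, high), f in reversed(list(zip(limits, frequencies))):
    print(f"{low}-{high}: {f}")
```

A different target number of classes gives a different width and a different table, so the choice in Step 2 is a matter of judgment.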

• Example Frequency Table/Distribution: Scores from statistics exam (N = 70)
Organize the given data set through (a) frequency, (b) frequency percentage,
and (c) cumulative frequency percentage.

Visualizing Data
Bar Graph and Histogram

• A bar graph visualizes a categorical variable as a series of bars,
each bar separated by space, called a gap. In a bar chart, each bar
represents the tallies for a single category, and the length of each
bar represents either the frequency or percentage of values for a
category.

• A histogram visualizes a numerical variable as a vertical bar chart
in which each bar represents a class interval from a frequency or
percentage distribution. There are never any gaps between adjacent
bars in a histogram.
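
As a rough illustration only, a histogram can be drawn in plain text with one bar per class interval. The sketch below draws the bars sideways instead of vertically, and the classes, frequencies, and the `text_histogram` helper are hypothetical.

```python
# Hypothetical class intervals and frequencies from a frequency distribution.
classes = ["50-59", "60-69", "70-79", "80-89", "90-99"]
frequencies = [2, 4, 7, 5, 2]

def text_histogram(labels, counts, symbol="#"):
    """Return one line per class; the bar length equals the class frequency."""
    return [f"{label:>7} | {symbol * count} ({count})"
            for label, count in zip(labels, counts)]

lines = text_histogram(classes, frequencies)
print("\n".join(lines))
```

Because the class intervals are adjacent ranges of a numerical variable, the rows follow one another with no gaps, unlike the separated bars of a bar graph.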

Example Bar Graph: Students enrolled in various undergraduate majors
in a college of arts and sciences.

Organizing and Visualizing Data using JASP
Exercise 1: Scores from statistics exam (N = 70). Construct a histogram
for the given data set.
Exercise 2: Create a summary table and contingency table for Gender
(Sex) and Ethnic (Ethnicity).

Keep safe everyone!
