
Section 2 – Mathematics as a Tool (Part 1)

Module 4

Data Management
OUTLINE:
•Review of
o Descriptive Statistics
o Normal Distribution
o Hypothesis Testing
o Regression & Correlation
•Chi-Square Distribution
•Planning or Conducting an Experiment or Study
NOTE:
•The topics to be reviewed are expected to have
been covered in Junior and Senior High School.
•The focus should then be on deepening and using
these to be able to critically examine information
from various sources (e.g. newspapers)
•Make an effort to use technology that is available to
students.
LEARNING OUTCOME
•Use a variety of statistical tools to process and
manage numerical data.
•Use the methods of linear regression and
correlations to predict the value of a variable given
certain conditions.
•Advocate the use of statistical data in making
important decisions.
STATISTICS
• Statistics plays a vital role in every field of human activity.

• Example:
 The Philippine Statistics Authority (PSA) collects data on
the population of the Philippines.
 It issues statistical reports that show changes and
trends in the Philippine population.
The trend in the Philippine population from 1990 to 2015.
HOUSING CHARACTERISTICS IN THE PHILIPPINES
(Results of the 2015 Census of Population)
STATISTICS
Statistics involves the collection,
organization, summarization,
presentation, and interpretation of
data.
DESCRIPTIVE STATISTICS
 Describing data is an essential part of statistical analysis aiming
to provide a complete picture of the data before moving to
advanced methods.
 The statistical methods used for this purpose are
called descriptive statistics.
 They include both numerical tools (e.g. mean, mode, variance…)
and graphical tools (e.g. histogram, boxplot…), which allow us to
summarize a set of data and extract important information
such as central tendency and dispersion.
 Moreover, we can use them to describe the association
between several variables.
LEARNING OUTCOME: Use a variety of statistical tools to
process and manage numerical data.
Descriptive Statistics
• Measures of Central Tendency
o Mean
o Median
o Mode
• Measures of Relative Position
o Z-score
o Quantiles
o Percentiles
• Measures of Dispersion
o Range
o Standard Deviation
o Variance
o Coefficient of Variation
• Measures of Shape
o Skewness
o Kurtosis
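
The measures listed above can be computed with Python's standard `statistics` module. This is a minimal sketch: the `grades` list is hypothetical data made up for illustration, and the sample (n − 1) versions of variance and standard deviation are assumed.

```python
import statistics

# Hypothetical sample of classroom grades (not taken from the slides).
grades = [70, 75, 80, 80, 85, 90, 94]

# Measures of central tendency
mean = statistics.mean(grades)
median = statistics.median(grades)
mode = statistics.mode(grades)

# Measures of dispersion
data_range = max(grades) - min(grades)
variance = statistics.variance(grades)   # sample variance (divides by n - 1)
std_dev = statistics.stdev(grades)       # sample standard deviation
cv_percent = std_dev / mean * 100        # coefficient of variation, in percent

# Measures of relative position
z_highest = (max(grades) - mean) / std_dev       # z-score of the highest grade
quartiles = statistics.quantiles(grades, n=4)    # three cut points: Q1, Q2, Q3

print(f"mean={mean}, median={median}, mode={mode}")
print(f"range={data_range}, variance={variance:.2f}, sd={std_dev:.2f}, cv={cv_percent:.1f}%")
print(f"z-score of highest grade={z_highest:.2f}, quartiles={quartiles}")
```

The coefficient of variation expresses the standard deviation relative to the mean, which makes the spread of data sets with different units or scales comparable.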
| Type of Variable | Objective | Example | Descriptive |
|---|---|---|---|
| Quantitative | Estimate a frequency distribution | How many people per age class attended this event? (Here the investigated variable is age in a quantitative form.) | Frequency Table |
| Quantitative | Measure the central tendency of one sample | What is the average grade in a classroom? | Mean, Median, Mode |
| Quantitative | Measure the dispersion of one sample | How widely or narrowly are the grades dispersed around the mean grade in a classroom? | Range, Standard Deviation, Variance, Coefficient of Variation |
| Quantitative | Characterize the shape of a distribution | Is the employee wage distribution in a company symmetric? | Skewness and Kurtosis |
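
To illustrate the shape question (is a wage distribution symmetric?), the sketch below computes moment-based skewness and excess kurtosis. The `wages` values are hypothetical, and the plain population-moment formulas are an assumption; statistical packages often apply small-sample corrections, so their figures may differ slightly.

```python
def central_moment(values, k):
    """k-th central moment, dividing by n (population form)."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** k for v in values) / len(values)

def skewness(values):
    """Moment coefficient of skewness: m3 / m2^(3/2)."""
    return central_moment(values, 3) / central_moment(values, 2) ** 1.5

def excess_kurtosis(values):
    """Moment coefficient of kurtosis minus 3 (0 for a normal distribution)."""
    return central_moment(values, 4) / central_moment(values, 2) ** 2 - 3

# Hypothetical data: one very high wage pulls the right tail out.
wages = [18, 20, 21, 22, 23, 25, 60]
symmetric = [1, 2, 3, 4, 5]

print(f"wages:     skewness={skewness(wages):.2f}, excess kurtosis={excess_kurtosis(wages):.2f}")
print(f"symmetric: skewness={skewness(symmetric):.2f}, excess kurtosis={excess_kurtosis(symmetric):.2f}")
```

A skewness near zero suggests symmetry, while a positive value indicates a longer right tail, as in the hypothetical wage data.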
| Type of Variable | Objective | Example | Descriptive |
|---|---|---|---|
| Qualitative | Compute the frequencies of different categories | How many clients said they are satisfied by the service and how many said they were not? | Frequency Table |
| Qualitative | Detect the most frequent category | What is the most frequent hair color in SLSU? | Mode |
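
For qualitative data, a frequency table and the mode can be obtained by counting categories. A minimal sketch, assuming a hypothetical list of survey `responses`:

```python
from collections import Counter

# Hypothetical client responses (not from the slides).
responses = [
    "satisfied", "not satisfied", "satisfied",
    "satisfied", "not satisfied", "satisfied",
]

freq = Counter(responses)      # frequency of each category
total = sum(freq.values())

# Print a simple frequency table with counts and percentages.
for category, count in freq.most_common():
    print(f"{category:<15}{count:>5}{count / total:>8.1%}")

# The mode is the most frequent category.
modal_category, modal_count = freq.most_common(1)[0]
print("Mode:", modal_category)
```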
What is a Statistical Distribution?
 The distribution of a statistical data set (or a population) is a
listing or function showing all the possible values (or
intervals) of the data and how often they occur.
 When a distribution of categorical data is organized, you see
the number or percentage of individuals in each group.
 When a distribution of numerical data is organized, the values are
often ordered from smallest to largest, broken into
reasonably sized groups (if appropriate), and then put into
graphs and charts to examine the shape, center, and amount
of variability in the data.
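
As a rough illustration of organizing numerical data, the sketch below orders hypothetical `ages`, groups them into classes of width 10 (an arbitrary choice), and prints a text histogram.

```python
from collections import Counter

# Hypothetical ages of event attendees (not from the slides).
ages = [35, 12, 27, 41, 19, 22, 38, 15, 31, 47, 23, 34]
class_width = 10  # assumed class width

def class_label(value, width):
    """Return the class interval that contains value, e.g. '20-29'."""
    lower = (value // width) * width
    return f"{lower}-{lower + width - 1}"

# Sorting first means the classes are counted in increasing order.
distribution = Counter(class_label(age, class_width) for age in sorted(ages))

for label, count in distribution.items():
    print(f"{label:>6} | {'*' * count}")
```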
NORMAL DISTRIBUTION
 One of the most well-known distributions is called the normal
distribution, also known as the bell-shaped curve.
 The normal distribution is based on numerical data that is
continuous; its possible values lie on the entire real number line.
 Its overall shape, when the data are organized in graph form, is
a symmetric bell-shape.
 In other words, most of the data (around 68%) lie within one
standard deviation of the mean (giving you the middle part of the bell), and as
you move farther out on either side of the mean, you find fewer
and fewer values (representing the downward-sloping sides of
the bell).
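
The 68% figure can be checked numerically with `statistics.NormalDist` from the standard library. The mean of 100 and standard deviation of 15 below are arbitrary assumptions; the shares are the same for any normal distribution.

```python
from statistics import NormalDist

# Arbitrary example parameters; any mean and standard deviation give the same shares.
dist = NormalDist(mu=100, sigma=15)

def share_within(d, k):
    """Proportion of a normal distribution within k standard deviations of the mean."""
    return d.cdf(d.mean + k * d.stdev) - d.cdf(d.mean - k * d.stdev)

within_1 = share_within(dist, 1)
within_2 = share_within(dist, 2)
within_3 = share_within(dist, 3)

print(f"within 1 SD: {within_1:.4f}")
print(f"within 2 SD: {within_2:.4f}")
print(f"within 3 SD: {within_3:.4f}")
```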
STANDARD NORMAL DISTRIBUTION
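
For reference, a value from a normal distribution is converted to the standard normal scale by the z-score. The symbols below are the usual notation and are assumed here, not taken from the slides: x is an observed value, μ the mean, and σ the standard deviation.

```latex
z = \frac{x - \mu}{\sigma}, \qquad Z \sim N(0, 1)
```

The standard normal distribution therefore has mean 0 and standard deviation 1.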
HYPOTHESIS TESTING
Definitions (Hypothesis, Hypothesis Testing)
Components of Hypothesis Testing
• Hypotheses (Null & Alternative Hypothesis)
• Type I and Type II Errors
• Level of Significance
• Test Statistic
• Rejection Region
• Critical Value
Steps in Hypothesis Testing
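
The components listed above can be seen together in a small worked case. This is only a sketch of a two-tailed one-sample z-test; every number (hypothesized mean, known standard deviation, sample size, sample mean) is hypothetical.

```python
from math import sqrt
from statistics import NormalDist

# Hypotheses (hypothetical): H0: mu = 50   versus   H1: mu != 50
mu_0 = 50           # mean claimed under the null hypothesis
sigma = 10          # population standard deviation, assumed known
n = 25              # sample size
sample_mean = 54.5  # observed sample mean
alpha = 0.05        # level of significance (probability of a Type I error)

# Test statistic
z_stat = (sample_mean - mu_0) / (sigma / sqrt(n))

# Critical value and rejection region for a two-tailed test: |z| > z_crit
z_crit = NormalDist().inv_cdf(1 - alpha / 2)

# p-value for the two-tailed test
p_value = 2 * (1 - NormalDist().cdf(abs(z_stat)))

reject_h0 = abs(z_stat) > z_crit
print(f"z = {z_stat:.2f}, critical value = {z_crit:.2f}, p-value = {p_value:.4f}")
print("Decision:", "reject H0" if reject_h0 else "fail to reject H0")
```

Rejecting H0 when |z| exceeds the critical value is equivalent to rejecting when the p-value is below the level of significance.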
LEARNING OUTCOME: Use the methods of linear regression and
correlations to predict the value of a variable given certain
conditions.

Regression and Correlation


When one wants to determine whether two
variables are related, a correlation analysis is useful.
If the variables are related, regression analysis can
then be used to find an equation that models the
relationship.
CORRELATION
Scatter Diagrams and Types of Linear Correlation Between 𝒙 and 𝒚
Example:
Data:
Result:
 The Pearson correlation coefficient is computed as 0.95, which indicates
a strong positive linear relationship between store size and annual
sales, as shown by the SPSS output below:

 The variables are also significantly related, as shown by the reported
p-value of 0.00. Hence, we may find an equation that can be used to
model the relationship and use regression analysis.
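
The store data and SPSS output are not reproduced in this text, so the sketch below uses hypothetical values only to show how a Pearson correlation coefficient is computed. The function name `pearson_r` and both lists are illustrative assumptions, and the result will not match the 0.95 reported above.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Hypothetical data: store size (thousands of square feet) and annual sales.
store_size = [1.5, 2.0, 2.5, 3.0, 3.5, 4.5]
annual_sales = [3.0, 4.5, 4.0, 6.0, 6.5, 9.0]

r = pearson_r(store_size, annual_sales)
print(f"r = {r:.2f}")
```

Values of r close to +1 or −1 indicate a strong linear relationship, while values near 0 indicate little or no linear relationship.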
Regression Analysis:
• Using regression analysis, we will find the prediction equation and use
it to forecast the annual sales for a new store with
4,000 square feet.

Result:
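
The original output is not reproduced in this text. As a sketch of the procedure only, the code below fits a least-squares line to hypothetical data and forecasts sales for a 4,000-square-foot store. The data, the sales units, and the helper name `fit_line` are assumptions.

```python
def fit_line(xs, ys):
    """Least-squares intercept and slope for the line y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

# Hypothetical data: store size in square feet, annual sales in arbitrary units.
store_size_sqft = [1500, 2000, 2500, 3000, 3500, 4500]
annual_sales = [3.0, 4.5, 4.0, 6.0, 6.5, 9.0]

intercept, slope = fit_line(store_size_sqft, annual_sales)

# Prediction equation: predicted sales = intercept + slope * size
predicted = intercept + slope * 4000
print(f"sales = {intercept:.4f} + {slope:.6f} * size")
print(f"forecast for 4,000 sq ft: {predicted:.2f}")
```

A forecast is most trustworthy when the new value (here 4,000 square feet) lies within the range of sizes used to fit the line.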
CHI-SQUARE
The Chi-Square test is a statistical test used to examine
patterns in categorical variables.
This test is used in:
• testing whether two categorical variables are independent of
one another (also known as the Test of Independence).
• testing how closely a sample matches an expected
distribution (also known as the Goodness-of-Fit Test).
CHI-SQUARE
Example: We would use a Chi-Square Goodness-of-Fit test to evaluate whether
there was a preference in the types of lunch that grade 11 students bought
in the canteen.
Research Question: Do grade 11 students prefer a certain type of lunch?
Data:
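
The original data table is not reproduced in this text. The sketch below uses hypothetical counts for four lunch types and tests them against equal preference. The critical value 7.815 is the standard table value for 3 degrees of freedom at the 0.05 level.

```python
# Hypothetical counts of lunches bought by 100 grade 11 students.
observed = {"rice meal": 35, "pasta": 25, "sandwich": 22, "salad": 18}

# H0: no preference, so each lunch type is expected equally often.
total = sum(observed.values())
expected = total / len(observed)

# Chi-square statistic: sum of (observed - expected)^2 / expected
chi_sq = sum((count - expected) ** 2 / expected for count in observed.values())

df = len(observed) - 1   # degrees of freedom = number of categories - 1
critical_value = 7.815   # chi-square table value for df = 3, alpha = 0.05

reject_h0 = chi_sq > critical_value
print(f"chi-square = {chi_sq:.2f}, df = {df}, critical value = {critical_value}")
print("Decision:", "reject H0" if reject_h0 else "fail to reject H0")
```

With these made-up counts the statistic falls below the critical value, so the sample would not provide enough evidence of a preference.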
CHI-SQUARE
Example: A Chi-Square Test of Independence is used when analyzing
whether women are more likely than men to vote for a Republican or
a Democratic candidate.
Research Question: Is voting pattern independent of gender?
Data:
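
The original data table is not reproduced in this text. The sketch below runs the test on a hypothetical 2 × 2 table of counts, without a continuity correction. The critical value 3.841 is the standard table value for 1 degree of freedom at the 0.05 level.

```python
# Hypothetical counts: rows are gender, columns are party voted for.
observed = {
    "men":   {"Republican": 60, "Democratic": 40},
    "women": {"Republican": 45, "Democratic": 55},
}

row_totals = {g: sum(cols.values()) for g, cols in observed.items()}
parties = list(next(iter(observed.values())).keys())
col_totals = {p: sum(observed[g][p] for g in observed) for p in parties}
grand_total = sum(row_totals.values())

# Expected count under independence: row total * column total / grand total
chi_sq = 0.0
for g in observed:
    for p in parties:
        expected = row_totals[g] * col_totals[p] / grand_total
        chi_sq += (observed[g][p] - expected) ** 2 / expected

df = (len(observed) - 1) * (len(parties) - 1)   # (rows - 1) * (columns - 1)
critical_value = 3.841                          # table value for df = 1, alpha = 0.05

reject_h0 = chi_sq > critical_value
print(f"chi-square = {chi_sq:.3f}, df = {df}")
print("Decision:", "reject independence" if reject_h0 else "fail to reject independence")
```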
PLANNING OR CONDUCTING AN EXPERIMENT OR STUDY
 Census vs. Sample
 Sampling Methods
 Experimental Design
o Treatments
o Randomization
o Replication
o Blocking
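
As a small illustration of sampling and randomization, the sketch below draws a simple random sample from a hypothetical population and randomly assigns the sampled units to three treatments with equal replication. The names, sizes, and seed are assumptions, and blocking is not shown.

```python
import random

rng = random.Random(42)  # fixed seed so the example is reproducible

# Hypothetical population of 40 students.
population = [f"student_{i:02d}" for i in range(1, 41)]

# Simple random sample of 12 units, drawn without replacement.
sample = rng.sample(population, 12)

# Randomization: shuffle the sampled units, then deal them out to treatments.
treatments = ["A", "B", "C"]
shuffled = sample[:]
rng.shuffle(shuffled)
assignment = {
    t: shuffled[i::len(treatments)] for i, t in enumerate(treatments)
}

# Replication: each treatment receives the same number of units.
for treatment, units in assignment.items():
    print(treatment, len(units), units)
```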
