
STATISTICAL ANALYSIS WITH COMPUTER APPLICATION

LESSON 1 – Basics of Statistics
What you will learn…

• Importance of Statistics
• Basic Statistics Concepts
• Variables
• Scale of Measurement
• Types of Data
What is Statistics?

Wait! Before answering the question, let us first ask: what can Statistics do?

Basics of Statistics
Introduction

Liu Chang works as a team leader in the Learning & Development (L&D) Department at Helios Inc.

In his recent appraisal, Liu was promoted to Assistant Manager.
Introduction

Liu’s key responsibilities as an Assistant Manager would be to determine the training and development needs of the employees of Helios.

He would also determine the type and level of training required by various teams and new employees, and analyze the pre-training and post-training skill levels of the employees.
Introduction

Therefore, Liu’s main focus in his new role would be to collect and analyze data on employees’ skills, their areas for improvement, and the areas where they need to develop new skill sets.
Introduction

Liu had always been appreciated by senior management for his dedicated and proficient work as a team leader.

However, after his promotion, Liu is finding it difficult to cope with the challenges of his new role as an Assistant Manager.
Introduction

Liu found that in his new role he is constantly required to analyze large amounts of data.

He assigned one of his subordinates, Kate, to collect various data related to his queries.

However, when Kate collected the data and collated it all in the form of graphs, Liu was simply not able to understand it.
Introduction

Liu had to design a training program to help one of the teams in the Operations department enhance their soft skills.

He asked Kate to collect data on the current soft skill levels of the team members and on the areas where they feel they need to improve.
Introduction

Kate collected the data and represented it in the form of a pie chart so that Liu could analyze and interpret the data at a glance.

However, Liu could not understand or interpret the pie chart.
Introduction

Instead, he chose to go through almost 20 pages of data that Kate had collected for the purpose.

Yet Liu could not make any sense of those 20 pages.

This was because he could not see the complete data in a snapshot, as the pie chart showed it.
Introduction

Hence, Liu understood that it was important for him to learn and understand Statistics to succeed in his new role.
Introduction

Therefore, you can understand that a good grasp of statistics is crucial to succeeding in the corporate world. This is because in any organization, almost all information is represented in the form of data that is communicated through reports. Large amounts of such quantitative data are collected, organized, and presented using statistics in a concise form to help people analyze and interpret the data for their purposes.
Introduction

• Statistics is the branch of applied mathematics that is concerned with the collection, organization, analysis, interpretation, and presentation of quantitative data.

• It transforms data into useful information for decision makers.

• It’s all about making sense of data and figuring out how to put that information to use.

• It helps us make decisions in uncertain situations.


MEANINGS OF STATISTICS

Field of Statistics
The study and practice of collecting and
analyzing data

Statistics
Facts about, or summaries of data
TYPES OF STATISTICS

Descriptive Statistics involves organizing, summarizing, and presenting data.

Inferential Statistics uses sample data to draw conclusions about a population.
DESCRIPTIVE STATISTICS

Collect data
• e.g. Survey
Present data
• e.g. Tables and graphs
Characterize data
• e.g. Sample mean $\bar{X} = \frac{\sum X_i}{n}$
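
As a quick illustration of characterizing data with a sample mean (the sum of the observations divided by how many there are), the sketch below averages a small set of weights. The values are made up for the example and do not come from the lesson.

```python
from statistics import mean

# Hypothetical sample of five weights in pounds (illustrative values only).
weights = [118, 122, 125, 119, 121]

# Sample mean: sum of the observations divided by the number of observations.
sample_mean = sum(weights) / len(weights)

# The standard library function gives the same result.
assert sample_mean == mean(weights)
print(sample_mean)
```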

INFERENTIAL STATISTICS

Estimation – estimating some population parameter with a certain level of precision
• e.g. Estimate the population mean weight using the sample mean weight

Hypothesis testing – determining how much evidence the data provide for or against a hypothesized relationship
• e.g. Test the claim that the population mean weight is 120 pounds

Both involve drawing conclusions and/or making decisions concerning a population based on sample results.
IMPORTANT TERMS

Population
The collection of all responses, measurements, or counts that are of interest.

The population must be defined explicitly before the study begins, and the research hypothesis/questions specify the population being studied.

Defined by certain characteristics:


- Inclusion criteria
- Exclusion criteria
IMPORTANT TERMS

Sample
A portion or subset of the population.

Sampling Error
• Reflects the fact that the result we get from our sample will not be exactly equal to the result we would have obtained had we been able to measure the entire population, and that each possible sample we could take gives a different result.
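
A small simulation can make sampling error concrete. The sketch below builds a made-up population, then draws several samples and compares each sample mean with the population mean. The population values, the sample size of 30, and the seed are all assumptions chosen for illustration.

```python
import random
from statistics import mean

rng = random.Random(42)  # fixed seed so the run is repeatable

# Hypothetical population: 1,000 incomes in arbitrary units.
population = [rng.gauss(50_000, 12_000) for _ in range(1000)]
population_mean = mean(population)  # the population value we want to know

# Draw five different samples of 30 members and compute each sample mean.
sample_means = [mean(rng.sample(population, 30)) for _ in range(5)]

for m in sample_means:
    # Sampling error: the gap between the sample result and the population result.
    print(f"sample mean = {m:10.2f}   sampling error = {m - population_mean:9.2f}")
```

Each sample gives a slightly different mean, and none is expected to match the population mean exactly.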
IMPORTANT TERMS

• Parameter

A number that describes a population characteristic.

e.g. Average gross income of all people in the Philippines in 2019.

• Statistic

A number that describes a sample characteristic.

e.g. Average 2019 gross income of people in a sample of 3 regions.
SAMPLING TECHNIQUES
SAMPLING

Sampling is the process of identifying the sample from the population to ensure that what is true for the sample is also true for the population, or simply “the process of measuring a small portion of something and making a general statement about the whole thing.”
TYPES OF SAMPLING

1. Probability - each element in the population has a known, nonzero chance of being selected. The goal is to obtain a sample representative of the target population.

Examples:
• Simple random sampling
• Stratified random sampling
• Cluster sampling
• Systematic sampling

2. Nonprobability

Examples:
• Consecutive sampling: commonly used in intervention studies.
• Convenience sampling
• Purposive sampling: commonly used in qualitative research.
Random Sampling: Each member of the population has an equal chance of being selected.
Simple Random Sampling: All samples of the same size are equally likely.

• Assign a number to each member of the population.
• Random numbers can be generated by a random number table, software program or a calculator.
• Data from members of the population that correspond to these numbers become members of the sample.
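
The three steps above can be sketched with Python's standard `random` module standing in for the random number table. The population of 50 numbered employees and the sample size of 10 are assumptions made for the example.

```python
import random

rng = random.Random(7)  # fixed seed so the run is repeatable

# Step 1: assign a number to each member of the (hypothetical) population.
population = {number: f"Employee {number}" for number in range(1, 51)}

# Step 2: generate random numbers, without repeats, using software.
chosen_numbers = rng.sample(sorted(population), k=10)

# Step 3: members whose numbers were drawn become the sample.
sample = [population[number] for number in chosen_numbers]
print(sample)
```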
STRATIFIED RANDOM SAMPLING

Divide the population into groups (strata) and select a random sample from each group. Strata could be age groups, gender or levels of education, for example.
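
A minimal sketch of stratified random sampling, assuming a made-up population of 200 people labelled with one of four age groups and a fixed sample of 5 per stratum. Real studies often size each stratum's sample in proportion to the stratum, which this sketch does not do.

```python
import random
from collections import defaultdict

rng = random.Random(1)  # fixed seed so the run is repeatable

# Hypothetical population of (person_id, age_group) records.
age_groups = ["18-29", "30-44", "45-59", "60+"]
population = [(i, age_groups[i % 4]) for i in range(200)]

# Divide the population into strata by age group.
strata = defaultdict(list)
for person_id, group in population:
    strata[group].append(person_id)

# Select a simple random sample of 5 members from each stratum.
stratified_sample = {
    group: rng.sample(members, 5) for group, members in strata.items()
}
print(stratified_sample)
```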

CLUSTER SAMPLING

Divide the population into individual units or groups and randomly select one or more units. The sample consists of all members from selected unit(s).
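
In contrast with stratified sampling, cluster sampling randomly picks whole units and keeps everyone in them. The sketch below assumes four made-up departments as the units and selects two of them.

```python
import random

rng = random.Random(3)  # fixed seed so the run is repeatable

# Hypothetical clusters: each department is one unit.
clusters = {
    "Operations": ["Ana", "Ben", "Carla"],
    "Finance": ["Dino", "Ella"],
    "Sales": ["Faye", "Gino", "Hana", "Ivan"],
    "IT": ["Jun", "Kaye"],
}

# Randomly select two whole units.
selected_units = rng.sample(sorted(clusters), k=2)

# The sample consists of all members of the selected units.
cluster_sample = [member for unit in selected_units for member in clusters[unit]]
print(selected_units, cluster_sample)
```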

SYSTEMATIC SAMPLING

Choose a starting value at random. Then choose sample members at regular intervals.

[Figure: rows of population members, each shown as x]

We say we choose every kth member. In this example, k = 5. Every 5th member of the population is selected.
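
Systematic sampling with k = 5 can be sketched with a list slice. The population of 105 numbered members and the seed are assumptions for illustration.

```python
import random

rng = random.Random(5)  # fixed seed so the run is repeatable

population = list(range(1, 106))  # hypothetical members numbered 1..105
k = 5                             # sampling interval

# Choose a random starting position among the first k members (index 0..k-1).
start = rng.randrange(k)

# Then take every kth member from that starting position onward.
systematic_sample = population[start::k]
print(systematic_sample)
```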

OTHER SAMPLING TECHNIQUE

Convenience Sampling: Choose readily available members of the population for your sample.
BRIEF OVERVIEW OF SAMPLE SIZE CALCULATIONS

Purpose: To make a rough estimate of how many subjects are required to answer the research question. During the design of the study, the sample size calculation will indicate whether the study is feasible. During the review phase, it will reassure the reviewers that not only is the study feasible, but that resources are not being wasted by recruiting more subjects than is necessary.

Types of sample size calculation:
1. Hypothesis-based
2. Confidence interval-based
HYPOTHESIS-BASED SAMPLE SIZES

Hypothesis-based sample sizes indicate the number of subjects necessary to reasonably test the primary study hypothesis. Hypotheses can be shown to be wrong, but they can never be proven correct. This is because the investigator cannot test all people in the world with the condition of interest. The investigator attempts to test the research hypothesis through a sample of the larger target population.

From the data collected, inferences are made about the larger population. For example, if 80% of patients self-administering analgesia report good pain control, whereas only 40% of patients receiving nurse-administered analgesia report good pain control, one would conclude that there is a difference between the two methods and that self-administered analgesia is superior. However, because we have only used a sample of all possible patients, there is always a possibility that there is in fact no difference between the two and the results have occurred just by chance. To test this formally, a statistical test would be done.
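
One common choice of test for comparing two percentages is a pooled two-proportion z-test. The lesson does not say which test was used or how many patients were in each group, so the sketch below assumes 15 patients per group purely for illustration, and its result need not match any figure quoted in the lesson. With groups this small the normal approximation is rough.

```python
from math import sqrt
from statistics import NormalDist

def two_proportion_p_value(successes_a, n_a, successes_b, n_b):
    """Two-sided P value from a pooled two-proportion z-test."""
    p_a, p_b = successes_a / n_a, successes_b / n_b
    # Pooled proportion under the assumption of no difference between groups.
    pooled = (successes_a + successes_b) / (n_a + n_b)
    standard_error = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / standard_error
    # Probability of a result at least this extreme in either direction.
    return 2 * (1 - NormalDist().cdf(abs(z)))

# ASSUMPTION: 15 patients per group (group sizes are not given in the lesson).
# 12 of 15 = 80% report good pain control; 6 of 15 = 40%.
p_value = two_proportion_p_value(12, 15, 6, 15)
print(round(p_value, 3))
```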
HYPOTHESIS-BASED SAMPLE SIZES

In this case the P value is 0.03. This P value means that the probability of obtaining these results, or results even more extreme, if in truth there is no difference between the two methods, is 3%. Therefore, either self-administered analgesia is better than nurse-administered analgesia or a very unusual event has occurred. When there is truly no difference between two interventions, but the results of our study suggest there is a difference, a type 1 error has occurred. Generally, studies will accept a 5% risk (α level) of making a type 1 error. Because the calculated P value (0.03) is below this α level, the difference is declared statistically significant. The P value is not itself the probability that a type 1 error has been made; it is computed on the assumption that there is no true difference.
HYPOTHESIS-BASED SAMPLE SIZES

A type 2 error occurs when we conclude there is no evidence of a difference between two groups, when in truth there is. Most investigators accept a greater risk of making a type 2 error, usually 10% or 20% (β level).

Components of the Hypothesis-based Sample Size Calculation
1. Type 1 error (α): falsely rejects a true null hypothesis
- Usual risk: 0.05
2. Type 2 error (β): fails to reject a false null hypothesis
- Usual risk: 0.1 - 0.2
- Study’s power = 1 - β
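
To show how α and β feed into a sample size, here is a sketch using a widely taught normal-approximation formula for comparing two proportions. The formula is an assumption on my part and is not given in the lesson; the 80% and 40% rates are borrowed from the analgesia example, and the function name is hypothetical.

```python
from math import ceil
from statistics import NormalDist

def sample_size_per_group(p1, p2, alpha=0.05, beta=0.20):
    """Approximate subjects needed per group to detect p1 versus p2."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided type 1 error
    z_beta = NormalDist().inv_cdf(1 - beta)        # power = 1 - beta
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return ceil((z_alpha + z_beta) ** 2 * variance / (p1 - p2) ** 2)

n_power_80 = sample_size_per_group(0.80, 0.40, alpha=0.05, beta=0.20)
n_power_90 = sample_size_per_group(0.80, 0.40, alpha=0.05, beta=0.10)
print(n_power_80, n_power_90)
```

Accepting a smaller type 2 error risk (higher power) raises the number of subjects needed per group.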
VARIABLES and their MEASUREMENT
A. VARIABLES
VARIABLE

• A characteristic of persons, objects or events that differs in value across persons, objects or events

Dichotomous variable
• A variable that can have only two values
Qualities of Variables

• Exhaustive
– Should include all possible answerable responses
• Mutually exclusive
– No respondent should be able to have two attributes simultaneously
e.g. Employed vs. Unemployed - it is possible to be both if looking for a second job while employed
Variables

• Qualitative / categorical variable
- Levels of measurement: 1. Nominal, 2. Ordinal
• Quantitative / numerical variable
- Types: Discrete, Continuous
- Levels of measurement: 3. Interval, 4. Ratio
Qualitative Variable

• Variable whose observations vary in kind but not in degree
e.g. Sex
Religion
Marital status
Quantitative Variable

• Variable whose observations vary in magnitude
e.g. Age
No. of children
Income
Discrete Variables

• Quantitative variables whose observations can assume only a countable number of values

e.g. No. of children in the family
No. of family planning methods heard
No. of dates in the past month
Continuous Variables

• Quantitative variables whose observations can assume any one of the infinitely many values in an interval

e.g. Height
Weight
Time
Independent Variables

• Cause or determine or influence the dependent variable(s)
Dependent Variables

• Presumed outcome of the influence of the independent variable(s)
Direct relationship between independent and dependent variables

Independent Variables → (Cause or Determine or Influence) → Dependent Variables
Intervening Variables

• Sometimes referred to as test or control variables
• Used to test whether the observed relations between the independent and dependent variables are spurious
• Serve either to increase or decrease the effect the independent variable has on the dependent variable
Intervening variable

Independent Variables → Intervening Variables → Dependent Variables
B. LEVELS OF MEASUREMENT

Measurement refers to the procedure of attributing qualities or quantities to specific characteristics of objects, persons or events. Measurement is a key process in quantitative research and evaluation. If the measurement procedures are inadequate, the usefulness of the research will be limited (Polgar & Thomas, 2008).

Levels of Measurement
• Nominal
• Ordinal
• Interval
• Ratio
Nominal level

• A measurement level in which numbers are used as labels or names rather than to reflect quantitative information
e.g. Sex: 1 = Male, 2 = Female

- Marital status
- Religion
- Type of car used
Ordinal level

• A measurement level in which values reflect only rank order
e.g. Educational attainment: 1 = Elementary, 2 = High School, 3 = College

- Service quality rating
- Opinion on an issue (Strongly agree, Agree, Neutral, Disagree, Strongly disagree)
Interval level

• A measurement level with an arbitrary zero point in which numerically equal intervals at different locations on the scale reflect the same quantitative difference

e.g. Temperature in Celsius or Fahrenheit
IQ level
Standardized exam score
Ratio level

• The highest level of measurement that has all the characteristics of the interval scale plus a true zero point

e.g. Income
No. of children
Age
Weekly mobile data load spending
Properties held by each level of measurement

Level of measurement | Categories | Ranks | Equal intervals | True zero point
Nominal              | Yes        | No    | No              | No
Ordinal              | Yes        | Yes   | No              | No
Interval             | Yes        | Yes   | Yes             | No
Ratio                | Yes        | Yes   | Yes             | Yes


Appropriate statistical techniques for each level of measurement

Level of Measurement         | Nominal     | Ordinal            | Interval/Ratio
Measures of central tendency | Mode        | Mode, Median       | Mode, Median, Mean
Measures of dispersion       | -           | Min/Max/Range, IQR | Min/Max/Range, IQR, Std. Deviation
Graph                        | Bar/Pie     | Bar/Pie            | Histogram
Procedures                   | Frequencies | Frequencies        | Frequencies, Descriptives
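
The central tendency row of this table can be sketched as a small helper that reports only the measures suited to a variable's level. The function name, the sample data, and the choice of `median_low` (so the ordinal median is always an observed category code) are assumptions made for illustration.

```python
from statistics import mean, median_low, mode

def central_tendency(values, level):
    """Return the measures of central tendency suited to a measurement level."""
    summary = {"mode": mode(values)}  # the mode is valid at every level
    if level in ("ordinal", "interval", "ratio"):
        # median_low returns an observed value, so it stays a valid rank code.
        summary["median"] = median_low(values)
    if level in ("interval", "ratio"):
        summary["mean"] = mean(values)
    return summary

# Hypothetical data for each kind of variable.
religion = ["A", "B", "A", "C", "A"]   # nominal labels
education = [1, 2, 2, 3, 3, 3, 1]      # ordinal: 1 = Elementary, 2 = High School, 3 = College
ages = [21, 25, 25, 30, 34]            # ratio

print(central_tendency(religion, "nominal"))
print(central_tendency(education, "ordinal"))
print(central_tendency(ages, "ratio"))
```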
Levels of measurement guidelines

• It is usually best to gather data at the highest level of measurement possible because one can perform more mathematical operations and gain greater precision of measurement

• Interval and ratio variables can be changed to become ordinal or nominal variables but not vice versa
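
The second guideline can be illustrated by recoding a ratio variable (age in years) into ordinal groups. The cut-offs and group names below are illustrative assumptions, not part of the lesson.

```python
def age_group(age):
    """Recode a ratio-level age into an ordinal category (cut-offs are illustrative)."""
    if age < 18:
        return "Minor"
    if age < 60:
        return "Adult"
    return "Senior"

ages = [12, 17, 18, 35, 59, 60, 72]
groups = [age_group(a) for a in ages]
print(groups)
```

Going the other way is not possible: knowing only that someone is an "Adult" does not recover the exact age.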
DATA CLASSIFICATION
Data

• Values associated with a variable
• The “data” to be analyzed
Why do we need data?

• To provide input to a survey
• To provide input to a study
• To measure performance of a service or production process
• To evaluate conformance to standards
• To assist in formulating alternative courses of action
• To satisfy curiosity
Data Sources

• Primary (Data Collection)
- Observation
- Survey
- Experimentation
• Secondary (Data Compilation)
- Print or Electronic
Types of data

Data
• Categorical (Qualitative)
• Numerical (Quantitative)
- Discrete (integers)
- Continuous (takes any value)
LESSON OVERVIEW

In this lesson, you have been introduced to the role of statistics in turning data into information. In addition, you have studied the basics of Statistics, which include the different types of variables and their levels of measurement. Furthermore, you have been familiarized with the various types of data and their sources.
