Lesson 1 Basics of Statistics PDF

STATISTICAL ANALYSIS WITH
COMPUTER APPLICATION
LESSON 1 – Basics of Statistics
What you will learn…
• Importance of Statistics
• Basic Statistics Concepts
• Variables
• Scale of Measurement
• Types of Data
What is Statistics?
Wait! Before answering the question, let us first ask ‘what Statistics can do’.
Basics of Statistics
Introduction
Liu Chang works as a team leader in the Learning & Development (L&D) Department at Helios Inc. In his recent appraisals, Liu was promoted to Assistant Manager.
Liu’s key responsibilities as an Assistant Manager would be to determine the training and development needs of the employees of Helios, to determine the type and level of training required by various teams and new employees, and to analyze the pre-training and post-training skill levels of the employees.
Therefore, Liu’s main focus in his new role would be to collect and analyze data regarding the various skills of employees, their areas of improvement, the areas where they need to develop new skill sets, etc.
Liu had always been appreciated by the senior management for his dedicated and proficient work as a team leader. However, after his promotion, Liu is finding it difficult to cope with the challenges in his new role as an Assistant Manager.
Liu found that in his new role, he is constantly required to analyze large amounts of data. He has assigned one of his subordinates, Kate, to collect various data related to his queries. However, when Kate collected the data and collated it all in the form of graphs, Liu was just not able to understand the data.
Liu had to design a training program for one of the teams in the Operations department to help them enhance their soft skills. He asked Kate to collect data regarding the current soft skill levels of the members of the team and also the areas where they perceive that they need to improve.
Kate collected the data and represented it in the form of a pie chart so that Liu could analyze and interpret the data at a glance. However, Liu could not understand or interpret the pie chart.
Instead, he chose to go through almost 20 pages of data that Kate had collected for the purpose. Yet Liu could not make any sense of the 20 pages. This was because he could not see the complete data in a snapshot, as the pie chart showed it.
Hence, Liu understood that it was important for him to learn and understand Statistics to succeed in his new role.
Therefore, you can understand that it is crucial to have a good grasp of statistics to succeed in the corporate world. This is because in any organization, almost all information is represented in the form of data that is communicated through reports. Large chunks of such quantitative data are collected, organized and presented using statistics in a concise form to help people analyze and interpret the data for their purposes.
Statistics
• The branch of applied mathematics that is concerned with the collection, organization, analysis, interpretation and presentation of quantitative data.
• Transforms data into useful information for decision makers.
• It is all about making sense of data and figuring out how to put that information to use.
• It helps us make decisions in uncertain situations.
MEANINGS OF STATISTICS
Field of Statistics
The study and practice of collecting and
analyzing data
Statistics
Facts about, or summaries of data
TYPES OF STATISTICS
Descriptive Statistics involves organizing, summarizing, and presenting data.
Inferential Statistics uses sample data to draw conclusions about a population.
DESCRIPTIVE STATISTICS
Collect data
• e.g. Survey
Present data
• e.g. Tables and graphs
Characterize data
• e.g. Sample mean: $\bar{X} = \frac{\sum X_i}{n}$
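The three descriptive steps above can be sketched in a few lines of Python. The training-hours values are hypothetical and exist only to make the example runnable; they are not data from this lesson.

```python
from collections import Counter

# Collect: hypothetical survey answers (training hours attended per employee).
hours = [12, 15, 9, 20, 14, 15, 12, 15]

# Present: a simple frequency table, one row per distinct value.
frequency = Counter(hours)
for value in sorted(frequency):
    print(f"{value:>3} hours: {frequency[value]} employee(s)")

# Characterize: the sample mean is the sum of the observations divided by n.
x_bar = sum(hours) / len(hours)
print(f"Sample mean = {x_bar}")
```

A graph would normally replace the printed table, but the idea is the same: many raw values are reduced to a summary that can be read at a glance.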
INFERENTIAL STATISTICS
Drawing conclusions and/or making decisions concerning a population based on sample results.
Estimation – estimating some population parameter with a certain level of precision
• e.g. Estimate the population mean weight using the sample mean weight
Hypothesis testing – determining how much evidence the data provides for or against a hypothesized relationship
• e.g. Test the claim that the population mean weight is 120 pounds
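As a rough illustration of both ideas, the sketch below estimates a population mean weight from a sample and then measures how far the sample mean lies from the claimed value of 120 pounds. The eight weights are invented for illustration, and the one-sample t statistic is one common choice of test, not necessarily the one intended by the lesson.

```python
import math
from statistics import mean, stdev

# Hypothetical sample of body weights in pounds (illustrative values only).
weights = [118, 125, 130, 112, 121, 127, 119, 124]

n = len(weights)
x_bar = mean(weights)                # point estimate of the population mean
s = stdev(weights)                   # sample standard deviation (n - 1 denominator)
standard_error = s / math.sqrt(n)    # how precise the estimate is

# Hypothesis testing: distance between the sample mean and the claimed
# population mean, expressed in standard errors (one-sample t statistic).
claimed_mean = 120
t_statistic = (x_bar - claimed_mean) / standard_error

print(f"Estimate: {x_bar} pounds, standard error {standard_error:.2f}")
print(f"t statistic against the claim of {claimed_mean}: {t_statistic:.2f}")
```

Turning the t statistic into a P value requires a t distribution with n - 1 degrees of freedom, which usually comes from a table or a statistics package.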
IMPORTANT TERMS
Population
The collection of all responses,
measurements, or counts that are of
interest.
The population must be defined explicitly before the study begins, and the research hypothesis or questions should specify the population being studied.
Defined by certain characteristics:
- Inclusion criteria
- Exclusion criteria
Sample
A portion or subset of the population.
Sampling Error
• Reflects the fact that the result we get from our sample will not be exactly equal to the result we would have obtained had we been able to measure the entire population. Each possible sample we could take would give a different result.
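Sampling error can be seen directly by drawing several samples from the same population and comparing their means. In the sketch below the population is simulated (1,000 hypothetical values), so its mean is known and each sample's error can be computed; in a real study the population mean is unknown.

```python
import random
from statistics import mean

rng = random.Random(42)  # fixed seed so the run is repeatable

# Hypothetical population of 1,000 values, generated only for illustration.
population = [rng.gauss(50, 10) for _ in range(1000)]
population_mean = mean(population)

# Draw five samples of 30 members each and record each sample mean.
sample_means = [mean(rng.sample(population, 30)) for _ in range(5)]

# Sampling error: the gap between each sample result and the population result.
sampling_errors = [m - population_mean for m in sample_means]

for m, e in zip(sample_means, sampling_errors):
    print(f"sample mean {m:6.2f}   sampling error {e:+.2f}")
```

Each sample gives a slightly different mean, which is exactly the point made above.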
• Parameter
A number that describes a population characteristic.
e.g. Average gross income of all people in the Philippines in 2019.
• Statistic
A number that describes a sample characteristic.
e.g. Average 2019 gross income of people in a sample of 3 regions.
SAMPLING TECHNIQUES
SAMPLING
Sampling is the process of selecting a sample from the population so that what is true for the sample is also likely to be true for the population, or simply “the process of measuring a small portion of something and making a general statement about the whole thing.”
TYPES OF SAMPLING
1. Probability sampling - each element in the population has a known, nonzero chance of being selected, and the selection is made at random. The goal is to obtain a sample representative of the target population.
Examples:
• Simple random sampling
• Stratified random sampling
• Cluster sampling
• Systematic sampling
2. Nonprobability sampling
• Consecutive sampling: commonly used in intervention studies.
• Convenience sampling
• Purposive sampling: commonly used in qualitative research.
Random Sampling: Each member of the population
has an equal chance of being selected.
Simple Random Sampling: All samples of the same size are
equally likely.
• Assign a number to each member of the population.
• Random numbers can be generated by a random number table, software program or a calculator.
• Data from members of the population that correspond to these numbers become members of the sample.
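The numbering procedure for simple random sampling can be sketched as follows, using Python's `random` module as the "software program". The population of 50 named members and the sample size of 10 are assumptions made for the example.

```python
import random

rng = random.Random(2024)  # fixed seed so the run is repeatable

# Hypothetical population of 50 members.
population = [f"member_{i:02d}" for i in range(1, 51)]

# Step 1: assign a number to each member of the population.
numbered = dict(enumerate(population, start=1))

# Step 2: generate random numbers, without repeats, in the range 1..50.
chosen_numbers = rng.sample(range(1, len(population) + 1), k=10)

# Step 3: members whose numbers were drawn become the sample.
sample = [numbered[i] for i in chosen_numbers]
print(sample)
```

Because `sample` draws without replacement, no member can be selected twice.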
STRATIFIED RANDOM SAMPLING
Divide the population into groups (strata) and select a random sample from each group. Strata could be age groups, gender or levels of education, for example.
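A minimal sketch of stratified random sampling is shown below. The helper `stratified_sample`, the employee records and the choice of taking the same fraction from every stratum (proportional allocation) are assumptions for illustration; the lesson only requires that a random sample be taken from each stratum.

```python
import random
from collections import defaultdict

def stratified_sample(records, stratum_of, fraction, rng):
    """Group records into strata, then draw a random sample from each stratum."""
    strata = defaultdict(list)
    for record in records:
        strata[stratum_of(record)].append(record)
    sample = []
    for members in strata.values():
        k = max(1, round(len(members) * fraction))  # at least one per stratum
        sample.extend(rng.sample(members, k))
    return sample

rng = random.Random(7)
levels = ["elementary", "high school", "college"]
# Hypothetical population: 90 employees, 30 per level of education.
employees = [{"id": i, "education": levels[i % 3]} for i in range(90)]

sample = stratified_sample(employees, lambda e: e["education"], 0.1, rng)
print(len(sample), "employees sampled")
```

Every stratum is guaranteed to appear in the sample, which simple random sampling does not guarantee.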
CLUSTER SAMPLING
Divide the population into individual units or groups and randomly select one or more units. The sample consists of all members from selected unit(s).
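Cluster sampling differs from stratified sampling in that whole groups are chosen at random and every member of a chosen group is included. The sketch below uses four hypothetical teams as clusters and selects two of them.

```python
import random

rng = random.Random(11)

# Hypothetical population already divided into clusters (teams).
clusters = {
    "team_a": ["a1", "a2", "a3"],
    "team_b": ["b1", "b2"],
    "team_c": ["c1", "c2", "c3", "c4"],
    "team_d": ["d1", "d2", "d3"],
}

# Randomly select two clusters, not individual members.
selected = rng.sample(sorted(clusters), k=2)

# The sample consists of ALL members of the selected clusters.
sample = [member for name in selected for member in clusters[name]]
print(selected, sample)
```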
SYSTEMATIC SAMPLING
Choose a starting value at random. Then choose sample members at regular intervals.
We say we choose every kth member. For example, with k = 5, every 5th member of the population is selected.
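Systematic sampling with k = 5 can be sketched as below. The helper name `systematic_sample` and the population of 35 numbered members are assumptions for illustration.

```python
import random

def systematic_sample(population, k, rng):
    """Pick a random start among the first k members, then take every kth member."""
    start = rng.randrange(k)       # random starting index: 0, 1, ..., k - 1
    return population[start::k]

rng = random.Random(5)
population = list(range(1, 36))    # hypothetical members numbered 1..35

sample = systematic_sample(population, k=5, rng=rng)
print(sample)
```

Only the starting point is random; once it is chosen, the rest of the sample is fixed by the interval.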
OTHER SAMPLING TECHNIQUE
Convenience Sampling: Choose readily available members of the population for your sample.
BRIEF OVERVIEW OF SAMPLE SIZE CALCULATIONS
Purpose: To make a rough estimate of how many subjects are required to answer the research question. During the design of the study, the sample size calculation will indicate whether the study is feasible. During the review phase, it will reassure the reviewers that not only is the study feasible, but that resources are not being wasted by recruiting more subjects than necessary.
1. Hypothesis-based
2. Confidence interval-based
HYPOTHESIS-BASED SAMPLE SIZES
Hypothesis-based sample sizes indicate the number of subjects necessary to reasonably test the primary study hypothesis. Hypotheses can be shown to be wrong, but they can never be proven correct. This is because the investigator cannot test all people in the world with the condition of interest. The investigator attempts to test the research hypothesis through a sample of the larger target population.
From the data collected, inferences are made about the larger population. For example, if 80% of patients self-administering analgesia report good pain control, whereas only 40% of patients receiving nurse-administered analgesia report good pain control, one would conclude that there is a difference between the two methods and that self-administered analgesia is superior.
However, because we have only used a sample of all possible patients, there is always a possibility that there is in fact no difference between the two methods and the results have occurred by chance. To test this formally, a statistical test would be done. In this case the P value is 0.03. This P value means that, if in truth there is no difference between the two methods, the probability of obtaining these results or results even more extreme is 3%. Therefore, either self-administered analgesia is better than nurse-administered analgesia or a very unusual event has occurred.
When there is truly no difference between two interventions, but the results of our study suggest there is a difference, a type 1 error has occurred. Generally, studies will accept a 5% risk (α level) of making a type 1 error. Because the P value of 0.03 is below this α level, the difference is declared statistically significant. Note that the P value is calculated on the assumption that there is no difference; it is not itself the probability that a type 1 error has been made.
A type 2 error occurs when we conclude there is no evidence of a difference between two groups, when in truth there is. Most investigators accept a greater risk of making a type 2 error, usually 10% or 20% (β level).
Components of the Hypothesis-based Sample Size Calculation
1. Type 1 error (α): falsely rejects the null hypothesis
   - Usual risk 0.05
2. Type 2 error (β): fails to reject a false null hypothesis
   - Usual risk 0.1 - 0.2
   - Study’s power = 1 - β
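To show how α and β enter a sample size calculation, the sketch below applies a widely used normal-approximation formula for comparing two proportions, with the 80% and 40% figures from the analgesia example as the expected proportions. The formula and the function name `sample_size_two_proportions` are assumptions chosen for illustration; the lesson does not state which formula it has in mind.

```python
import math
from statistics import NormalDist

def sample_size_two_proportions(p1, p2, alpha=0.05, power=0.80):
    """Approximate subjects needed per group for a two-sided test of p1 vs p2."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)   # controls the type 1 error
    z_beta = NormalDist().inv_cdf(power)            # power = 1 - beta
    p_bar = (p1 + p2) / 2
    numerator = (
        z_alpha * math.sqrt(2 * p_bar * (1 - p_bar))
        + z_beta * math.sqrt(p1 * (1 - p1) + p2 * (1 - p2))
    ) ** 2
    return math.ceil(numerator / (p1 - p2) ** 2)

# alpha = 0.05 and beta = 0.20 (power 0.80), the usual risks listed above.
n_per_group = sample_size_two_proportions(0.80, 0.40, alpha=0.05, power=0.80)
print(f"About {n_per_group} subjects per group")
```

Lowering β (raising the power) or expecting a smaller difference between the groups both increase the required number of subjects.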
VARIABLES and their MEASUREMENT
A. VARIABLES
VARIABLE
• A characteristic of persons, objects or events that differs in value across persons, objects or events
Dichotomous variable
• A variable that can have only two values
Qualities of Variables
• Exhaustive
  – Should include all possible responses
• Mutually exclusive
  – No respondent should be able to have two attributes simultaneously
  e.g. Employed vs. Unemployed
  - these categories overlap if a respondent is looking for a second job while employed
Variables
  Qualitative / categorical variable
    Levels of measurement: 1. Nominal, 2. Ordinal
  Quantitative / numerical variable
    Discrete, Continuous
    Levels of measurement: 3. Interval, 4. Ratio
Qualitative Variable
• Variable whose observations vary in kind but not in degree
e.g. Sex
     Religion
     Marital status
Quantitative Variable
• Variable whose observations vary in magnitude
e.g. Age
     No. of children
     Income
Discrete Variables
• Quantitative variables whose observations can assume only a countable number of values
e.g. No. of children in the family
     No. of family planning methods heard
     No. of dates in the past month
Continuous Variables
• Quantitative variables whose observations can assume any one of the countless number of values in a line interval
e.g. Height
     Weight
     Time
Independent Variables
• Cause or determine or influence the dependent variable(s)
Dependent Variables
• Presumed outcome of the influence of the independent variable(s)
Direct relationship between independent and dependent variables
Independent Variables -> (cause or determine or influence) -> Dependent Variables
Intervening Variables
• Sometimes referred to as test or control variables
• Used to test whether the observed relations between the independent and dependent variables are spurious
• Serve either to increase or decrease the effect the independent variable has on the dependent variable
Intervening variable
Independent Variables -> Intervening Variables -> Dependent Variables
B. LEVELS OF MEASUREMENT
Measurement refers to the procedure of attributing qualities or quantities to specific characteristics of objects, persons or events. Measurement is a key process in quantitative research and evaluation. If the measurement procedures are inadequate, the usefulness of the research will be limited (Polgar & Thomas, 2008).
Levels of Measurement
• Nominal
• Ordinal
• Interval
• Ratio
Nominal level
• A measurement level in which numbers are used as labels or names rather than to reflect quantitative information
e.g. Sex: 1 = Male, 2 = Female
     Marital status
     Religion
     Type of car used
Ordinal level
• A measurement level in which values reflect only rank order
e.g. Educational attainment: 1 = Elementary, 2 = High School, 3 = College
     Service quality rating
     Opinion on an issue (Strongly agree, Agree, Neutral, Disagree, Strongly disagree)
Interval level
• A measurement level with an arbitrary zero point in which numerically equal intervals at different locations on the scale reflect the same quantitative difference
e.g. Temperature in Celsius or Fahrenheit
     IQ level
     Standardized exam score
Ratio level
• The highest level of measurement that has all the characteristics of the interval scale plus a true zero point
e.g. Income
     No. of children
     Age
     Weekly mobile data load spending
Properties held by each level of measurement
Level of measurement   Categories   Ranks   Equal intervals   True zero point
Nominal                Yes          No      No                No
Ordinal                Yes          Yes     No                No
Interval               Yes          Yes     Yes               No
Ratio                  Yes          Yes     Yes               Yes
Appropriate statistical techniques for each level of measurement
Level of measurement           Nominal       Ordinal              Interval/Ratio
Measures of central tendency   Mode          Mode, Median         Mode, Median, Mean
Measures of dispersion         -             Min/Max/Range, IQR   Min/Max/Range, IQR, Std. Deviation
Graph                          Bar/Pie       Bar/Pie              Histogram
Procedures                     Frequencies   Frequencies          Frequencies, Descriptives
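The central tendency row of the table can be expressed as a small rule: report only the measures that the level of measurement supports. The function `central_tendency` and the coded example values below are hypothetical. For ordinal data the sketch uses `median_low` so that the reported median is always one of the observed categories; that is a design choice of this example, not something the lesson prescribes.

```python
from statistics import mean, median, median_low, mode

def central_tendency(values, level):
    """Return the measures of central tendency suited to a level of measurement."""
    summary = {"mode": mode(values)}             # valid at every level
    if level == "ordinal":
        summary["median"] = median_low(values)   # always an observed category
    elif level in ("interval", "ratio"):
        summary["median"] = median(values)
        summary["mean"] = mean(values)           # needs equal intervals
    return summary

marital_status = [1, 2, 2, 3, 2]        # nominal codes (labels only)
education = [1, 2, 2, 3, 3, 3]          # ordinal codes: 1 < 2 < 3
ages = [21, 25, 25, 30, 34]             # ratio data

print(central_tendency(marital_status, "nominal"))
print(central_tendency(education, "ordinal"))
print(central_tendency(ages, "ratio"))
```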
Levels of measurement guidelines
• It is usually best to gather data at the highest level of measurement possible because one can perform more mathematical operations and gain greater precision of measurement
• Interval and ratio variables can be changed to become ordinal or nominal variables but not vice versa
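The second guideline can be illustrated by recoding a ratio variable into an ordinal one. The age bands and the function name `age_to_group` are arbitrary choices made for this sketch.

```python
def age_to_group(age):
    """Map an age in years (ratio level) to an ordered category code (ordinal level)."""
    if age < 18:
        return 1   # under 18
    if age < 40:
        return 2   # 18 to 39
    if age < 60:
        return 3   # 40 to 59
    return 4       # 60 and over

ages = [16, 23, 41, 67, 35]                 # hypothetical exact ages
groups = [age_to_group(a) for a in ages]
print(groups)
```

The conversion loses information: ages 23 and 35 receive the same code, so the exact ages cannot be recovered from the groups. This is why data gathered only at the ordinal level cannot later be turned into ratio data.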
DATA CLASSIFICATION
Data
• Values associated with a variable
• The “data” to be analyzed
Why do we need data?
• To provide input to a survey
• To provide input to a study
• To measure performance of a service or production process
• To evaluate conformance to standards
• To assist in formulating alternative courses of action
• To satisfy curiosity
Data Sources
  Primary (Data Collection)
    Observation
    Survey
    Experimentation
  Secondary (Data Compilation)
    Print or Electronic
Types of data
Data
  Categorical (Qualitative)
  Numerical (Quantitative)
    Discrete (integers)
    Continuous (takes any value)
LESSON OVERVIEW
In this lesson, you have been introduced to the role of statistics in turning data into information. In addition, you have studied the basics of Statistics, which include the different types of variables and their levels of measurement. Furthermore, you have been familiarized with the various types of data and their sources.
