
Chapter 1
Defining and Collecting Data

Objectives

In this chapter you learn:

• To understand issues that arise when defining variables.
• How to define variables.
• To understand the different measurement scales.
• How to collect data.
• To identify different ways to collect a sample.
• To understand the issues involved in data preparation.
• To understand the types of survey errors.

ALWAYS LEARNING Copyright © 2020 Pearson Education Ltd. All Rights Reserved.

Classifying Variables By Type
DCOVA

• Categorical (qualitative) variables take categories as their values, such as “yes”, “no”, or “blue”, “brown”, “green”.
• Numerical (quantitative) variables have values that represent a counted or measured quantity.
  • Discrete variables arise from a counting process.
  • Continuous variables arise from a measuring process.

Examples of Types of Variables

Question                                                       Responses         Variable Type
Do you have a Facebook profile?                                Yes or No         Categorical
How many text messages have you sent in the past three days?   ---------------   Numerical (discrete)
How long did the mobile app update take to download?           ---------------   Numerical (continuous)
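The classification above can be sketched in code. This is a minimal illustration, not a general rule: it assumes counts are stored as Python integers, measurements as floats, and everything else is a category. The function name `variable_type` and the sample responses are hypothetical. A measurement that happens to be recorded in whole numbers, or a numeric code such as a postal code, would be misclassified, so the result still needs human judgment.

```python
from numbers import Integral, Real


def variable_type(values):
    """Guess the variable type from how the responses are stored."""
    # bool is a subclass of int in Python, so exclude it from the numeric checks.
    if all(isinstance(v, Integral) and not isinstance(v, bool) for v in values):
        return "numerical (discrete)"      # counted quantity
    if all(isinstance(v, Real) and not isinstance(v, bool) for v in values):
        return "numerical (continuous)"    # measured quantity
    return "categorical"


# Hypothetical responses to the three example questions.
responses = {
    "Do you have a Facebook profile?": ["Yes", "No", "Yes"],
    "Text messages sent in the past three days": [12, 0, 47],
    "Download time of the app update (seconds)": [31.5, 28.0, 40.2],
}
for question, values in responses.items():
    print(f"{question}: {variable_type(values)}")
```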

Measurement Scales

A nominal scale classifies data into distinct categories in which no ranking is implied.

Categorical Variables             Categories
Do you have a Facebook profile?   Yes, No
Type of investment                Growth, Value, Other
Cellular Provider                 AT&T, Sprint, Verizon, Other, None

Measurement Scales (cont.)

An ordinal scale classifies data into distinct categories in which ranking is implied.

Categorical Variable             Ordered Categories
Student class designation        Freshman, Sophomore, Junior, Senior
Product satisfaction             Very unsatisfied, Fairly unsatisfied, Neutral, Fairly satisfied, Very satisfied
Faculty rank                     Professor, Associate Professor, Assistant Professor, Instructor
Standard & Poor’s bond ratings   AAA, AA, A, BBB, BB, B, CCC, CC, C, DDD, DD, D
Student Grades                   A, B, C, D, F


Measurement Scales (cont.): Interval and Ratio Scales

• An interval scale is an ordered scale in which the difference between measurements is a meaningful quantity but the measurements do not have a true zero point (for example, temperature in degrees Celsius).
• A ratio scale is an ordered scale in which the difference between the measurements is a meaningful quantity and the measurements have a true zero point (for example, weight or age).

Types of Variables

Variables
• Categorical
  • Nominal (defined categories). Examples: Marital Status, Political Party, Eye Color.
  • Ordinal (ordered categories). Examples: Ratings such as Good, Better, Best or Low, Med, High.
• Numerical
  • Discrete (counted items). Examples: Number of Children, Defects per hour.
  • Continuous (measured characteristics). Examples: Weight, Voltage.

Data Is Collected From Either A Population or A Sample

POPULATION
A population contains all of the items or individuals of interest that you seek to study.

SAMPLE
A sample contains only a portion of a population of interest.


Population vs. Sample

Population: All the items or individuals about which you want to reach conclusion(s). (Figure: A Population of Size 40)
Sample: A portion of the population of items or individuals. (Figure: A Sample of Size 4)

Collecting Data Via Sampling Is Used When Doing So Is

• Less time consuming than selecting every item in the population.
• Less costly than selecting every item in the population.
• Less cumbersome and more practical than analyzing the entire population.

Parameter or Statistic?

• A population parameter summarizes the value of a specific variable for a population.
• A sample statistic summarizes the value of a specific variable for sample data.

Sources Of Data Arise From The Following Activities

• Capturing data generated by ongoing business activities.
• Distributing data compiled by an organization or individual.
• Compiling the responses from a survey.
• Conducting a designed experiment and recording the outcomes.
• Conducting an observational study and recording the results.
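To make the parameter versus statistic distinction above concrete, here is a small sketch. The population of 40 order values is made up for illustration (the sizes 40 and 4 echo the population and sample figures on the earlier slide), and the mean is only one of many possible summaries.

```python
import random
from statistics import mean

random.seed(1)  # fixed seed so the sketch is repeatable

# Hypothetical population: order values (in dollars) for all 40 customers.
population = [random.randint(10, 200) for _ in range(40)]

# Parameter: a summary computed from every item in the population.
population_mean = mean(population)

# Statistic: the same summary computed from a sample of size 4,
# drawn here without replacement.
sample = random.sample(population, 4)
sample_mean = mean(sample)

print(f"population mean (parameter): {population_mean:.2f}")
print(f"sample mean (statistic):     {sample_mean:.2f}")
```

The statistic will usually differ from the parameter, and a different sample would give a different statistic.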

Examples of Data Collected From Ongoing Business Activities

• A bank studies years of financial transactions to help them identify patterns of fraud.
• Economists utilize data on searches done via Google to help forecast future economic conditions.
• Marketing companies use tracking data to evaluate the effectiveness of a web site.

Examples Of Data Distributed By An Organization or Individual

• Financial data on a company provided by investment services.
• Industry or market data from market research firms and trade associations.
• Stock prices, weather conditions, and sports statistics in daily newspapers.

Examples of Survey Data

• A survey asking people which laundry detergent has the best stain-removing abilities.
• Political polls of registered voters during political campaigns.
• People being surveyed to determine their satisfaction with a recent product or service experience.

Examples of Data From A Designed Experiment

• Consumer testing of different versions of a product to help determine which product should be pursued further.
• Material testing to determine which supplier’s material should be used in a product.
• Market testing on alternative product promotions to determine which promotion to use more broadly.

Examples of Data Collected From Observational Studies

• Market researchers utilizing focus groups to elicit unstructured responses to open-ended questions.
• Measuring the time it takes for customers to be served in a fast food establishment.
• Measuring the volume of traffic through an intersection to determine if some form of advertising at the intersection is justified.

Observational Studies & Designed Experiments Have A Common Objective

• Both are attempting to quantify the effect that a process change (called a treatment) has on a variable of interest.
• In an observational study, there is no direct control over which items receive the treatment.
• In a designed experiment, there is direct control over which items receive the treatment.
Sources of Data

• Primary Sources: The data collector is the one using the data for analysis:
  • Data from a political survey.
  • Data collected from an experiment.
  • Observed data.
• Secondary Sources: The person performing data analysis is not the data collector:
  • Analyzing census data.
  • Examining data from print journals or data published on the Internet.

A Sampling Process Begins With A Sampling Frame

• The sampling frame is a listing of items that make up the population.
• Frames are data sources such as population lists, directories, or maps.
• Inaccurate or biased results can result if a frame excludes certain groups or portions of the population.
• Using different frames to generate data can lead to dissimilar conclusions.

Types of Samples

Samples
• Non Probability Samples
  • Judgment
  • Convenience
• Probability Samples
  • Simple Random
  • Systematic
  • Stratified
  • Cluster

Types of Samples: Nonprobability Sample

• In a nonprobability sample, items included are chosen without regard to their probability of occurrence.
  • In convenience sampling, items are selected based only on the fact that they are easy, inexpensive, or convenient to sample.
  • In a judgment sample, you get the opinions of pre-selected experts on the subject matter.

Types of Samples: Probability Sample

• In a probability sample, items in the sample are chosen on the basis of known probabilities.

Probability Samples
• Simple Random
• Systematic
• Stratified
• Cluster

Probability Sample: Simple Random Sample

• Every individual or item from the frame has an equal chance of being selected.
• Selection may be with replacement (selected individual is returned to frame for possible reselection) or without replacement (selected individual isn’t returned to the frame).
• Samples obtained from table of random numbers or computer random number generators.
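The two selection modes can be sketched with a computer random number generator. This is one possible illustration using Python's standard `random` module; the function name `simple_random_sample` and the 850-item frame are assumptions made for the example.

```python
import random


def simple_random_sample(frame, n, with_replacement=False, seed=None):
    """Draw n items from the frame, each with an equal chance of selection."""
    rng = random.Random(seed)
    if with_replacement:
        # A selected item goes back into the frame, so it can appear again.
        return rng.choices(frame, k=n)
    # A selected item is not returned, so no item appears twice.
    return rng.sample(frame, n)


# Hypothetical frame of 850 numbered items.
frame = [f"item-{i:03d}" for i in range(1, 851)]

print(simple_random_sample(frame, 5, with_replacement=False, seed=0))
print(simple_random_sample(frame, 5, with_replacement=True, seed=0))
```

Sampling without replacement requires `n` to be no larger than the frame; sampling with replacement has no such limit.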

Selecting a Simple Random Sample Using A Random Number Table

Sampling Frame For Population With 850 Items

Item Name   Item #
Bev R.      001
Ulan X.     002
.           .
.           .
.           .
Joann P.    849
Paul F.     850

Portion Of A Random Number Table

49280 88924 35779 00283 81163 07275
11100 02340 12860 74697 96644 89439
09893 23997 20048 49420 88872 08401

The First 5 Items in a simple random sample

Item # 492
Item # 808
Item # 892 (does not exist, so ignore)
Item # 435
Item # 779
Item # 002

Probability Sample: Systematic Sample

• Decide on sample size: n
• Divide frame of N individuals into groups of k individuals: k = N/n
• Randomly select one individual from the 1st group
• Select every kth individual thereafter

(Figure: N = 40, n = 4, k = 10, with the first group marked.)
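Both procedures above can be sketched in a few lines. The first function reads the random number table as one stream of digits, three at a time, and skips codes that fall outside the frame; skipping repeated codes (sampling without replacement) is an assumption, since the slide does not say which mode is used. The second function follows the systematic steps with k = N/n rounded down. The function names are hypothetical.

```python
import random


def sample_from_random_table(table_rows, frame_size, n, digits=3):
    """Read item numbers from a random number table, `digits` at a time."""
    stream = "".join(table_rows).replace(" ", "")
    chosen = []
    for start in range(0, len(stream) - digits + 1, digits):
        code = int(stream[start:start + digits])
        # Ignore codes with no matching item, and (assumed) repeated codes.
        if 1 <= code <= frame_size and code not in chosen:
            chosen.append(code)
        if len(chosen) == n:
            break
    return chosen


def systematic_sample(frame, n, seed=None):
    """Pick one item at random from the first group, then every k-th item."""
    k = len(frame) // n                        # k = N / n, rounded down
    start = random.Random(seed).randrange(k)   # random position in the 1st group
    return [frame[start + i * k] for i in range(n)]


first_row = ["49280 88924 35779 00283 81163 07275"]
print(sample_from_random_table(first_row, frame_size=850, n=5))
# [492, 808, 435, 779, 2]   (892 is skipped because the frame has only 850 items)

frame = list(range(1, 41))                     # N = 40
picked = systematic_sample(frame, n=4, seed=7) # k = 10
print(picked)                                  # four items spaced 10 apart
```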

Probability Sample: Stratified Sample

• Divide population into two or more subgroups (called strata) according to some common characteristic.
• A simple random sample is selected from each subgroup, with sample sizes proportional to strata sizes.
• Samples from subgroups are combined into one.
• This is a common technique when sampling population of voters, stratifying across racial or socio-economic lines.

Probability Sample: Cluster Sample

• Population is divided into several “clusters,” each representative of the population.
• A simple random sample of clusters is selected.
• All items in the selected clusters can be used, or items can be chosen from a cluster using another probability sampling technique.
• A common application of cluster sampling involves election exit polls, where certain election districts are selected and sampled.

(Figure: Population divided into 16 clusters, with the randomly selected clusters for the sample marked.)
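A rough sketch of both designs follows, assuming the frame fits in memory. The function names, the urban/rural strata, and the district data are all hypothetical. The stratified version uses proportional allocation with rounding, so the total can differ slightly from the requested size; the cluster version keeps every item in each selected cluster, which is only one of the two options the slide mentions.

```python
import random
from collections import defaultdict


def stratified_sample(items, stratum_of, n, seed=None):
    """Simple random sample within each stratum, sized proportionally."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for item in items:
        strata[stratum_of(item)].append(item)
    sample = []
    for members in strata.values():
        # Proportional allocation: stratum share of the population times n.
        size = round(n * len(members) / len(items))
        sample.extend(rng.sample(members, size))
    return sample


def cluster_sample(clusters, n_clusters, seed=None):
    """Randomly select whole clusters and keep every item in them."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)
    return [item for name in chosen for item in clusters[name]]


# Hypothetical voters: 60 urban and 40 rural.
voters = [(i, "urban" if i < 60 else "rural") for i in range(100)]
strat = stratified_sample(voters, lambda v: v[1], n=10, seed=3)
print(sum(1 for v in strat if v[1] == "urban"), "urban,",
      sum(1 for v in strat if v[1] == "rural"), "rural")

# Hypothetical population split into 16 districts of 5 voters each.
districts = {f"district-{d:02d}": [f"d{d:02d}-voter{v}" for v in range(5)]
             for d in range(16)}
print(len(cluster_sample(districts, n_clusters=4, seed=3)), "voters from 4 districts")
```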


Probability Sample: Comparing Sampling Methods

• Simple random sample and Systematic sample:
  • Simple to use.
  • May not be a good representation of the population’s underlying characteristics.
• Stratified sample:
  • Ensures representation of individuals across the entire population.
• Cluster sample:
  • More cost effective.
  • Less efficient (need larger sample to acquire the same level of precision).

Data Cleaning Is An Important Data Preprocessing Task Prior To Analysis

Data cleaning corrects irregularities in the data:
• Invalid variable values, including:
  • Non-numerical data for numerical variable.
  • Invalid categorical values for a categorical variable.
  • Numeric values outside a defined range.
• Coding errors, including:
  • Inconsistent categorical values.
  • Inconsistent case for categorical values.
  • Extraneous characters.
• Data integration errors, including:
  • Redundant columns.
  • Duplicated rows.
  • Differing column lengths.
  • Different units of measure or scale for numerical variables.
Data Cleaning Cannot Be A Fully Automated Process

• Excel, JMP, Minitab, and Tableau have functionality to lessen the burden of data cleaning.
• The software guides in the book explain this functionality.
• When performing data cleaning, always preserve a copy of the original data for later reference.

Cleaning Invalid Variable Values Can Be Semi-Automated

• Invalid variable values can be identified by simple scanning techniques, for example:
  • Non-numeric entries for numerical variables.
  • Values for categorical variables that don’t match a pre-defined category.
  • Values for a numeric variable outside a pre-defined explicit range.
• Features exist in Excel, JMP, or Minitab to assist in this task.
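The three scanning checks listed above can be sketched as a single pass over the data. This is an illustration in plain Python rather than a description of how Excel, JMP, or Minitab do it; the function name `scan_invalid`, the column names, and the age range of 0 to 120 are assumptions. The scan only reports problems and leaves the original rows untouched, in line with the advice to preserve the original data.

```python
def scan_invalid(rows, numeric_ranges, categories):
    """Return (row_index, column, value, reason) for each suspicious cell."""
    problems = []
    for i, row in enumerate(rows):
        for col, (low, high) in numeric_ranges.items():
            raw = row[col]
            try:
                value = float(raw)
            except (TypeError, ValueError):
                problems.append((i, col, raw, "non-numeric"))
                continue
            if not low <= value <= high:
                problems.append((i, col, raw, "out of range"))
        for col, allowed in categories.items():
            if row[col] not in allowed:
                problems.append((i, col, row[col], "unknown category"))
    return problems


# Hypothetical raw data as it might arrive from an import.
rows = [
    {"age": "34", "gender": "F"},
    {"age": "abc", "gender": "M"},
    {"age": "210", "gender": "New York"},
]
issues = scan_invalid(rows, {"age": (0, 120)}, {"gender": {"F", "M"}})
for issue in issues:
    print(issue)
```

Deciding what to do with each flagged cell is the part that stays manual.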

Examples Of Coding Errors

Copy-and-paste or data import can result in poor recording or entry of data.

Categorical variable: Gender, Correct coding: F or M
• Correctable error: Female.
• Invalid data: New York.
• Correctable or software tolerated: m.
• Extraneous and nonprintable characters:
  • Leading or trailing space(s): _F or F_.
  • Other nonprintable characters may also be leading or trailing.

Data Integration Errors From Combining Two Different Computerized Data Sources

• Data integration errors often require time-consuming manual effort.
• Some examples:
  • Variable names or definitions may differ.
  • Duplicated rows (observations) may also occur.
  • Different units of measurement (or scale) may not be obvious without human interpretation.
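The coding-error cases in the Gender example can be handled with a small normalizing function. This is a sketch under assumptions: the function name `clean_gender` is hypothetical, and mapping "male" to M is added by analogy with the slide's "Female" case. Values that cannot be corrected, such as "New York", are returned as `None` so a person can review them.

```python
def clean_gender(raw):
    """Return 'F' or 'M', or None when the value cannot be corrected."""
    # Drop nonprintable characters, then trim leading and trailing spaces.
    text = "".join(ch for ch in raw if ch.isprintable()).strip()
    # Case-insensitive lookup covers 'm' and 'Female' style entries.
    corrections = {"f": "F", "female": "F", "m": "M", "male": "M"}
    return corrections.get(text.lower())


for value in ["F", "Female", "m", " F", "F ", "M\x00", "New York"]:
    print(repr(value), "->", clean_gender(value))
```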

Data Can Be Formatted and / or Encoded In More Than One Way

• Some electronic formats are more readily usable than others.
• Different encodings can impact the precision of numerical variables and can also impact data compatibility.
• As you identify and choose sources of data you need to consider / deal with these issues.

Stacked vs Unstacked Data

• For unstacked data you create separate numerical variables for different groups (e.g., genders, locations, etc.).
• For stacked data you create a single column for the variable of interest and create additional columns for the potential grouping variables.
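The two layouts hold the same information, which a short conversion sketch can show. The test scores, the column names `group` and `value`, and the helper names `stack` and `unstack` are made up for illustration; statistical packages provide their own commands for this.

```python
from collections import defaultdict

# Unstacked: one separate numerical variable (column) per group.
unstacked = {"Female": [72, 85, 90], "Male": [68, 77, 81]}  # hypothetical scores


def stack(unstacked, group_name="group", value_name="value"):
    """One row per observation: a grouping column plus a single value column."""
    return [{group_name: g, value_name: v}
            for g, values in unstacked.items() for v in values]


def unstack(stacked, group_name="group", value_name="value"):
    """Rebuild one column of values per group from stacked rows."""
    columns = defaultdict(list)
    for row in stacked:
        columns[row[group_name]].append(row[value_name])
    return dict(columns)


stacked = stack(unstacked)
for row in stacked:
    print(row)
print(unstack(stacked) == unstacked)
```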


After Collection It Is Often Helpful To Recode Some Variables

• Recoding a variable can either supplement or replace the original variable.
• Recoding a categorical variable involves redefining categories.
• Recoding a numerical variable involves changing this variable into a categorical variable.
• When recoding be sure that the new categories are mutually exclusive (categories do not overlap) and collectively exhaustive (categories cover all possible values).

Evaluating Survey Worthiness

• What is the purpose of the survey?
• Is the survey based on a probability sample?
• Coverage error – appropriate frame?
• Nonresponse error – follow up.
• Measurement error – good questions elicit good responses.
• Sampling error – always exists.
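Returning to the recoding guidance above, here is a sketch of recoding a numerical variable into categories that are mutually exclusive and collectively exhaustive. The age cut points and labels are assumptions chosen for the example. Using a sorted list of cut points with `bisect_right` gives half-open intervals, so every value lands in exactly one category and no value is left out.

```python
from bisect import bisect_right

CUTS = [18, 65]                                   # hypothetical cut points
LABELS = ["under 18", "18 to 64", "65 and over"]  # one more label than cuts


def recode_age(age):
    """Map a numerical age to exactly one category."""
    # bisect_right counts how many cut points are <= age, which is the
    # index of the interval: (-inf, 18), [18, 65), [65, +inf).
    return LABELS[bisect_right(CUTS, age)]


for age in (17, 18, 64.9, 65):
    print(age, "->", recode_age(age))
```

Keeping the original `age` column alongside the recoded one treats the recode as a supplement rather than a replacement. Note that this sketch does not reject impossible values such as negative ages; that belongs to the data cleaning step.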

Types of Survey Errors

• Coverage error or selection bias:
  • Exists if some groups are excluded from the frame and have no chance of being selected.
• Nonresponse error or bias:
  • People who do not respond may be different from those who do respond.
• Sampling error:
  • Variation from sample to sample will always exist.
• Measurement error:
  • Due to weaknesses in question design and / or respondent error.

Types of Survey Errors (continued)

Coverage error      Excluded from frame
Nonresponse error   Follow up on nonresponses
Sampling error      Random differences from sample to sample
Measurement error   Bad or leading question
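Sampling error, the variation from sample to sample, can be seen in a small simulation. Everything here is made up for illustration: the population is 10,000 simulated values, and the helper `spread_of_sample_means` is a hypothetical name. The sketch draws many simple random samples of a given size and reports how far apart their means fall.

```python
import random
from statistics import mean

rng = random.Random(42)  # fixed seed so the sketch is repeatable

# Hypothetical population of 10,000 measurements.
population = [rng.gauss(50, 10) for _ in range(10_000)]


def spread_of_sample_means(size, repeats=200):
    """Range (max minus min) of the sample mean across repeated samples."""
    means = [mean(rng.sample(population, size)) for _ in range(repeats)]
    return max(means) - min(means)


print(f"population mean: {mean(population):.2f}")
for size in (10, 100, 1000):
    print(f"n = {size:4d}: sample means span {spread_of_sample_means(size):.2f}")
```

The span shrinks as the sample size grows but does not reach zero, which is the sense in which sampling error always exists.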

Ethical Issues About Surveys

• Coverage error and nonresponse error can be leveraged by survey designers to purposely bias survey results.
• Sampling error can be an ethical issue if the findings are purposely not reported with the associated margin of error.
• Measurement error can be an ethical issue:
  • Survey sponsor chooses leading questions.
  • Interviewer purposely leads respondents in a particular direction.
  • Respondent(s) willfully provide false information.

Chapter Summary

In this chapter we have discussed:

• Understanding issues that arise when defining variables.
• How to define variables.
• Understanding the different measurement scales.
• How to collect data.
• Identifying different ways to collect a sample.
• Understanding the issues involved in data preparation.
• Understanding the types of survey errors.
