Summary of Chapter 1

This file includes some references to our data in the restaurant file that we will be using in the lab.
These references appear in bold italic type.

 Statistics – a way of making sense of our world through the collection, analysis and
understanding of data
e.g. We gathered sample data and we will eventually summarize our data so that we can
better understand our customers and our business.
 Statistics – quantities calculated from a subset (aka, a sample) of a complete data set (aka, a
population). (A quantity calculated from a population is called a parameter.)
e.g. We will summarize our data with, for example, the average amount spent per visit. The
average amount spent in this sample is a statistic.
 Plan, do, report – carefully plan what you intend to do before you carry out your plan; follow
your plan; carefully explain your findings
e.g., Based on our objective of better knowing our customers and our business, we carefully
determined what type of data to obtain, from whom to obtain the data, etc. The data were
then collected, and we analyzed them according to the plan and wrote a report that both
concisely summarized our findings and made well-thought-out recommendations based on
these findings.
 Who, What, When, Where, Why and hoW
o Who and What are essential for collecting data used for producing useful information
o Who can be described as respondents to surveys, subjects in an experiment,
participants, experimental units, cases
‘Who’ were our customers, who were surveyed using an effective sampling scheme
designed to give us the best information possible based on our budget and our time
constraints.
o What are the variables of interest (i.e., characteristics) measured from the Who, with
their specific values being the data
‘What’ were the variables included in our data set (e.g. geographical location, gender,
etc.) which we felt would best help us understand our customers and our business with
the objective of improving our profits.
o Why is used to determine the reason for collecting data on specific variables and the
reasons for measuring variables in specific ways
‘Why’ was used to answer the questions, ‘Why are we conducting this study?’ and ‘Why
did we collect the data that we did collect on each variable?’
o How describes how the data were collected (e.g., through observation, from a survey,
from an experiment). The quality of the data depends on several factors (e.g.,
understanding what is being asked, accuracy, from whom the data were obtained)
‘How’ we chose to collect the data was by questioning randomly selected customers
as they were leaving the restaurant.
o When determines when the data were or should be collected
‘When’ was used to determine on what day or days and at what times of the day we
would sample our customers. (The time of day was included as another variable
because it could have had an effect on satisfaction and amount spent.)
o Where determines where the data were collected or should be collected
‘Where’ was used to determine that it was best to collect data from a random sample
of the chain’s restaurants across Canada.
 Types of Variables (and Types of Data)
o categorical (or qualitative) variable
geographical location, gender, loyalty card, satisfaction, meal are qualitative
variables even though numbers were initially assigned to identify the categories
o quantitative (or numerical) variable
age, no. of visits and amount spent are quantitative variables
o continuous (no gaps between possible values) and discrete variables (gaps between
possible values) (quantitative)
recorded age, no. of visits and amount spent are all discrete variables because there
are gaps between possible values (e.g., a customer may visit 25 times or 26 times but
not 25.3 times). Although the age reported is discrete (e.g. 45 years old), the actual
age is continuous (e.g. 45.2374859… years old). Although amount spent is actually
discrete because there are gaps between possible values, it may be treated as
continuous because these gaps are insignificant.
o identifier (used to identify individuals for example), binary (used to identify categories
when there are only two categories), nominal (used to identify categories when there is
no significance to the ordering of categories), and ordinal (used to identify categories
when there is significance to the ordering of categories) variables (categorical)
ID is an identifier (and also nominal), gender and loyalty card are binary (and also nominal),
location is nominal, and satisfaction is ordinal
o interval (ratio of values has no significance) and ratio variables (ratio of values has
significance) (quantitative)
age, no. of visits, and amount spent are all ratio variables because the ratio of two ages,
of two numbers of visits, or of two amounts spent has significance (e.g., a customer who
spent $40 spent twice as much as one who spent $20)
o cross-sectional and time-series variables (generally apply only to quantitative data)
age, no. of visits, and amount spent are all cross-sectional data because they were all
collected during one period of time
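
The variable types listed above determine which summaries make sense for each variable. The following is a minimal Python sketch of that idea, not the method used in the lab. It assumes a handful of hypothetical customer records: the values, the field names, the numeric category codes, and the 1-to-5 satisfaction scale are all invented for illustration and are not taken from the restaurant file.

```python
from collections import Counter
from statistics import mean, median_low

# Hypothetical sample: invented values, NOT taken from the restaurant file.
# Category codes (location, gender) and the 1-5 satisfaction scale are assumptions.
sample = [
    {"id": 1, "location": 2, "gender": 1, "satisfaction": 4, "age": 45, "visits": 25, "spent": 18.50},
    {"id": 2, "location": 1, "gender": 0, "satisfaction": 5, "age": 31, "visits": 26, "spent": 22.00},
    {"id": 3, "location": 2, "gender": 0, "satisfaction": 2, "age": 58, "visits": 3, "spent": 9.50},
    {"id": 4, "location": 3, "gender": 1, "satisfaction": 4, "age": 27, "visits": 10, "spent": 30.00},
]

# Variable types as classified in the summary above.
VARIABLE_TYPES = {
    "id": "identifier",
    "location": "nominal",
    "gender": "binary",
    "satisfaction": "ordinal",
    "age": "ratio",
    "visits": "ratio",
    "spent": "ratio",
}


def summarize(records, name):
    """Return a summary of one variable that suits its type."""
    values = [record[name] for record in records]
    kind = VARIABLE_TYPES[name]
    if kind == "identifier":
        # Identifiers only label cases, so there is nothing to summarize.
        return None
    if kind in ("nominal", "binary"):
        # The numeric codes are only labels, so count cases per category.
        return Counter(values)
    if kind == "ordinal":
        # Order is meaningful, so report a middle category (the low median).
        return median_low(values)
    # Quantitative (ratio) variables support arithmetic, so average them.
    return mean(values)


for variable in VARIABLE_TYPES:
    print(variable, summarize(sample, variable))
```

Because these records are a sample rather than the full customer population, the average amount spent computed here is a statistic, not a parameter. Averaging the location codes would run without error but would be meaningless, which is why the sketch counts categories instead.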
