
INTRODUCTION: DESCRIBING DATA


Lectures 1 to 3
Based on Chapter 1 of the textbook

COURSE OUTLINE

IMPORTANT NOTE

In this course, I base my lectures on the posted
lecture notes, but I elaborate on them in class
and solve problems on the board.
These notes are useful for review, but they are not
meant to take the place of lectures. Students who
rely only on the notes traditionally do not do well
on the exams in this course.

Anything that appears on the slides, is written on the
board, or is explained by me in class can be on your exams.

AVAILABLE
RESOURCES FOR
PRACTICE QUESTIONS

• Examples solved during lectures (the most important
  source).
• Sets of practice questions and their answers, posted by the
  instructor on Owl.
• Group assignments in the course.
• Bonus pop quizzes.
• The student solutions manual, a separate book bundled
  with your textbook. It contains detailed solutions to all
  even-numbered exercises and applications in the textbook.
• MyStatLab, which is bundled with your textbook and contains
  algorithmically generated tutorial exercises for unlimited
  practice. The student access code for MyStatLab is inside
  the student access kit. The kit also gives the web address
  for online registration and step-by-step guidance.
  The CourseCompass Course ID is entezarkheir87955.

GROUP ASSIGNMENTS


• You have to form groups of 5 students.
• Each group has to give me the names of its members
  (official names, not nicknames) on Thursday (September 19).
• Give me one page with the official names, either typed or
  written in legible handwriting.
• If you have not formed your own groups by September 19,
  I will go over the class list and group the remaining
  students in alphabetical order.
• Each group will solve the assignment and hand in only
  one copy, typed (not handwritten) and stapled, on the due
  date of the assignment at the very beginning of class.
• Late assignments will receive a mark of zero.

MY EXPECTATIONS OF YOU FOR THIS
COURSE ARE:
• Attend lectures on time, so that you do not miss a
  pop quiz and its bonus mark.
• Read the chapters of the textbook before attending
  lectures.
• Be focused and attentive during lectures, and ask
  questions if you have any. Remember that I
  also hold office hours for your questions.
• Solve practice questions.
• Participate in solving your group assignments and
  hand them in on time.
• Take the midterm and final exams.

SOME OF THE FORMULAS IN THE LECTURES

You are not expected to memorize some of the formulas.
As I lecture, I identify them on the slides.
I will give you these formulas on the exam without any
explanation. You should know what they are and how to
use them.
However, there are other formulas that you have to
memorize. This is normal for an econometrics
course.
This is an econometrics course filled with formulas and
equations. Some of you might find the formulas hard to follow.
Please let me know if you cannot follow me, or
come to my office hours.

Do cigarette taxes
lead to obesity?!
Entezarkheir, Sen, Wilson (2010)
Tax

=>
9

Lecture Goals for Students
After completing the first few lectures of the course, you
should be able to:
• Explain key definitions:
   - Population vs. Sample
   - Parameter vs. Statistic
   - Descriptive vs. Inferential Statistics
• Describe random sampling
• Explain the difference between descriptive and inferential
  statistics
• Identify types of variables and levels of measurement

Lecture Goals for Students
• Create and interpret graphs to describe categorical
  variables:
   - frequency distribution, bar chart, pie chart, Pareto
     diagram
• Create a line graph to describe time-series data
• Create and interpret graphs to describe numerical variables:
   - frequency distribution, histogram, ogive, stem-and-leaf
     display
• Construct and interpret graphs to describe relationships
  between variables:
   - scatter plot, cross table

What Are Data?
• Data values or observations are information
collected regarding some subject
• Data are often organized into a data table such
as the one below

12

What Are Data? Ctd.
• The characteristics recorded about each individual,
  case, or subject are called variables.
• Variables are usually shown as the columns of a
  data table and identify what has been measured.

Variable Types: Categorical and Numerical
When a variable names categories and answers
questions about how cases fall into those
categories, it is called a categorical variable

When a variable has measured numerical values
with units and the variable tells us about the
quantity of what is measured, it is called a
quantitative or numerical variable
14

Categorical variables
Categorical variables …
• arise from descriptive responses to questions
such as “What kind of advertising do you use?”
• may only have two possible values (like “Yes”
or “No”)
• may be a number like a telephone area code

15

Numerical Variables
Numerical or quantitative variables have units. The
units indicate
• how each value has been measured

• the corresponding scale of measurement
• how much of something we have
• how far apart two values are
16

FURTHER CLASSIFICATIONS OF VARIABLES
Variables
• Categorical
• Numerical (quantitative)
   - Discrete (counted items)
     Examples: number of children, defects per hour
   - Continuous (measured characteristics)
     Examples: weight, voltage

MEASUREMENT LEVELS OF CATEGORICAL
VARIABLES: ORDINAL AND NOMINAL
• Nominal data: the numbers are used only for
  convenience and do not imply any ordering. The
  responses are words that describe the categories.
  (Example: gender coded as 1 if female and 0 if male.)
• Ordinal data: the values show a rank ordering of items
  and, as with nominal data, the responses are words that
  describe the categories. (Example: product quality rating:
  1: poor, 2: average, 3: good.)

EXAMPLE 1
Upon visiting a newly opened Starbucks store,
customers were given a brief survey. Is the answer to
each of the following questions categorical or
numerical? If categorical, give the level of
measurement. If numerical, is it discrete or
continuous?
a) Is this your first visit to this Starbucks store?
b) On a scale from 1 (very dissatisfied) to 5 (very
satisfied), rate your level of satisfaction with today’s
purchase.
c) What was the actual cost of your purchase today?

DECISION MAKING IN AN UNCERTAIN
ENVIRONMENT

Everyday decisions are based on uncertainty or
incomplete information
Examples: Will the job market be strong when I
graduate?

Data are used to assist decision making. How?

Statistics is a tool to help process, summarize,
analyze, and interpret data
20

SOME KEY DEFINITIONS
• A population is the collection of all items of
  interest or under investigation.
   - N represents the population size.
• A sample is an observed subset of the population.
   - n represents the sample size.
• A parameter is a specific characteristic of a
  population.
• A statistic is a specific characteristic of a
  sample.

POPULATION VS. SAMPLE
• Population: values calculated using population
  data are called parameters.
• Sample: values computed from sample data
  are called statistics.

EXAMPLE 2
Suppose we are interested in examining
household income in Canada.
• Population: all households in Canada
• Sample: the group of households that participated in the
  Labour Force Survey (LFS) by Statistics Canada

WHY DO WE NEED SAMPLES?
• Getting access to the whole population is not
  always feasible, and it is very costly.
• Thus, we select a sample from the
  population to make a statement about that
  population.
• How valid is this statement? It is valid if the
  sample is representative of the population.
• How can we achieve a representative
  sample? One important principle is selecting
  a random sample.

RANDOM SAMPLING
Simple random sampling is a procedure in which
• each member of the sample is chosen
  strictly by chance,
• each member of the population is equally
  likely to be chosen,
• and every possible sample of n objects is
  equally likely to be chosen.
The resulting sample is called a random
sample.
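
A minimal sketch of simple random sampling in Python, which also separates a parameter (computed from the population) from a statistic (computed from the sample). The income figures, the seed, and the variable names are made up for illustration only and are not course data.

```python
import random

# Hypothetical population: household incomes (in $1000s), N = 10 households.
population = [42, 55, 61, 38, 75, 49, 90, 33, 58, 67]
N = len(population)

# Parameter: a characteristic of the whole population (here, the mean).
population_mean = sum(population) / N

# Simple random sample of n = 4 households, drawn without replacement.
# random.sample gives every possible sample of size n the same chance.
rng = random.Random(2024)  # fixed seed only so the draw can be repeated
n = 4
sample = rng.sample(population, n)

# Statistic: the same characteristic, computed from the sample only.
sample_mean = sum(sample) / n

print(f"N = {N}, population mean (parameter) = {population_mean:.1f}")
print(f"n = {n}, sample = {sample}, sample mean (statistic) = {sample_mean:.1f}")
```

Drawing with a different seed gives a different sample and, usually, a different sample mean, while the population mean stays fixed.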


• Once we have collected a random sample, we
  look for ways to illustrate the
  information in the data before calculating
  a statistic and using it to make a decision
  under uncertainty.
• Therefore, we use descriptive statistics.

DESCRIPTIVE AND INFERENTIAL STATISTICS
Two branches of statistics:
• Descriptive statistics
   - Graphical and numerical procedures to
     summarize and process data
   - Includes tables, graphs, the mean, etc.
• Inferential statistics
   - Using data to make predictions, forecasts, and
     estimates to assist decision making
   - Inference is the process of drawing conclusions
     or making decisions about a population based on
     sample results

DESCRIPTIVE STATISTICS: GRAPHICAL
PRESENTATION OF DATA

Data in raw form are usually not easy to
use for decision making

Some type of organization is needed
 Table
 Graph

The type of the organization to use
depends on whether the variable is
categorical or numerical.

28

DESCRIPTIVE STATISTICS: GRAPHICAL
PRESENTATION OF DATA CTD.
Categorical variables
• Frequency distribution
• Cross table
• Bar chart
• Pie chart
• Pareto diagram

Numerical variables
• Line chart
• Frequency distribution
• Histogram and ogive
• Stem-and-leaf display
• Scatter plot

DESCRIPTIVE STATISTICS: TABLES AND GRAPHS
FOR CATEGORICAL VARIABLES
Categorical data
• Tabulating data
   - Frequency distribution table
   - Cross table
• Graphing data
   - Bar chart
   - Stacked or component bar chart
   - Pie chart
   - Pareto diagram

DESCRIPTIVE STATISTICS: FREQUENCY
DISTRIBUTION FOR TABULATING CATEGORICAL
VARIABLE
• A frequency distribution table is a table used
  to organize data.
• The left column (called classes or groups)
  includes all possible responses on a
  variable being studied.
• The right column is a list of frequencies, or
  number of observations, for each class.

DESCRIPTIVE STATISTICS: THE FREQUENCY DISTRIBUTION
TABLE FOR CATEGORICAL VARIABLE

Example 3: Hospital Patients by Unit

Hospital Unit      Number of Patients    Percent (rounded)
Cardiac Care                    1,052                11.93
Emergency                       2,245                25.46
Intensive Care                    340                 3.86
Maternity                         552                 6.26
Surgery                         4,630                52.50
Total:                          8,819                100.0

(The Hospital Unit column is categorical; the Number of
Patients column gives the frequencies.)

A NOTE
 Frequency is the number of observations in each
category.
 Relative frequency is obtained by dividing each
frequency by the number of observations.
 Percent is obtained from dividing each frequency
by the number of observations and multiplying the
resulting proportion by 100%.
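
As a quick check of the three definitions above, here is a small Python sketch that recomputes the relative frequencies and percents for the Example 3 hospital data. The dictionary and variable names are illustrative choices, not part of the course material.

```python
# Frequencies from Example 3 (Hospital Patients by Unit).
frequencies = {
    "Cardiac Care": 1052,
    "Emergency": 2245,
    "Intensive Care": 340,
    "Maternity": 552,
    "Surgery": 4630,
}

total = sum(frequencies.values())  # number of observations

# Relative frequency = frequency / number of observations.
relative = {unit: f / total for unit, f in frequencies.items()}

# Percent = relative frequency * 100, rounded to two decimals as on the slide.
percent = {unit: round(100 * r, 2) for unit, r in relative.items()}

for unit, f in frequencies.items():
    print(f"{unit:<15} {f:>6,} {relative[unit]:>8.4f} {percent[unit]:>7.2f}%")
print(f"{'Total':<15} {total:>6,}")
```

The relative frequencies always add up to 1, so the percents add up to 100 apart from rounding.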

33

DESCRIPTIVE STATISTICS: BAR CHART FOR CATEGORICAL
VARIABLE

• When we want to draw attention to the
  frequency of each category of a categorical
  variable (as listed in the frequency distribution
  table), we use a bar chart.
• The height of the rectangle for a category is the
  frequency, or number of observations, in that
  category.
• There is no need for the bars to touch each
  other.

EXAMPLE 3 CTD.
Bar chart for patient data

[Figure: bar chart "Hospital Patients by Unit". Horizontal axis:
hospital unit (Cardiac Care, Emergency, Intensive Care,
Maternity, Surgery). Vertical axis: number of patients per year
(0 to 5000). Bar heights are the frequencies: 1,052; 2,245;
340; 552; 4,630.]

DESCRIPTIVE STATISTICS: PIE CHARTS FOR
CATEGORICAL VARIABLE
• If the goal is to draw attention to the
  proportion of frequencies in each category
  of the frequency table, a pie chart is appropriate.
• The circle, or pie, represents the total.
• The pieces of the pie display the shares of the total
  (frequencies or percentages) for each
  category of the categorical variable.

EXAMPLE 3 CTD.
Pie chart for patient data

Hospital Unit      Number of Patients    % of Total
Cardiac Care                    1,052         11.93
Emergency                       2,245         25.46
Intensive Care                    340          3.86
Maternity                         552          6.26
Surgery                         4,630         52.50

[Figure: pie chart "Hospital Patients by Unit" with slices
Cardiac Care 12%, Emergency 25%, Intensive Care 4%,
Maternity 6%, Surgery 53%. Percentages are rounded to the
nearest percent.]

DESCRIPTIVE STATISTICS: CROSS TABLES FOR
CATEGORICAL VARIABLES
• Cross tables (or contingency tables) list the
  number of observations for every combination
  of values of two categorical variables.
• If there are r categories for the first variable
  (rows) and c categories for the second
  variable (columns), the table is called an
  r × c cross table.
• When you want to display two categorical
  variables together, you describe them with a cross
  table and picture them with a component bar
  chart.

EXAMPLE 4

3 x 3 Cross Table for Investment Choices by Investor
(values in $1000’s)

Investment Category   Investor A   Investor B   Investor C   Total
Stocks                        46           55           27     128
Bonds                         32           44           19      95
Cash                          15           20           33      68
Total                         93          119           79     291
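
The row, column, and grand totals of the Example 4 cross table can be verified with a short Python sketch. The data structure and names below are one illustrative way to hold an r × c table, not a prescribed method.

```python
# Example 4 cross table: investment category (rows) by investor (columns),
# values in $1000s.
investors = ["Investor A", "Investor B", "Investor C"]
table = {
    "Stocks": [46, 55, 27],
    "Bonds": [32, 44, 19],
    "Cash": [15, 20, 33],
}

# Row totals: sum across investors for each investment category.
row_totals = {category: sum(values) for category, values in table.items()}

# Column totals: sum across categories for each investor.
column_totals = [sum(column) for column in zip(*table.values())]

# The grand total is the same whether rows or columns are added up.
grand_total = sum(row_totals.values())

for category, values in table.items():
    print(category, values, row_totals[category])
print("Total", column_totals, grand_total)
```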

EXAMPLE 4 CTD.
• To display the cross table, we use the stacked
  or component bar chart.

EXAMPLE 5 (Q 1.12 ON PAGE 14)

41

ANSWER TO EXAMPLE 5
[Figure: stacked bar chart "Employee Performance". Horizontal
axis: experience (Less than 3 months, 3 to 6 months, 6 to 9
months, 9 to 12 months). Vertical axis: number of employees
(0 to 60). Bar components: < 5 min, 5 to <10 min,
10 to <15 min.]

At Home for extra practice:
 Read example 1.2 on page 10 (Cross
table and component bar chart).
 Read example 1.3 on page 11 (Pie
chart).

43

DESCRIPTIVE STATISTICS: PARETO DIAGRAM FOR
CATEGORICAL VARIABLE
• Many managers who need to identify the major
  causes of problems, and to correct
  them quickly at minimum cost, use a
  special bar chart known as a Pareto diagram.
• The underlying idea is that in most cases a
  small number of factors (the "vital few") are
  responsible for most of the problems, while the
  remaining factors (the "trivial many") account
  for only a small share.
• This graph is used to separate the "vital few"
  from the "trivial many".

DESCRIPTIVE STATISTICS: PARETO DIAGRAM CTD.
• A Pareto diagram is used to portray
  categorical variables.
• A Pareto diagram is a bar chart in which
  categories are shown in descending order
  of frequency.

EXAMPLE 6
400 defective items are examined for cause of defect.
Display the Pareto diagram.

Source of Manufacturing Error    Number of defects
Bad Weld                                        34
Poor Alignment                                 223
Missing Part                                    25
Paint Flaw                                      78
Electrical Short                                19
Cracked case                                    21
Total                                          400

ANSWER TO EXAMPLE 6
Step 1: Sort the defect causes in descending order of number of defects
Step 2: Determine % in each category

Source of Manufacturing Error    Number of defects    % of Total Defects
Poor Alignment                                 223                 55.75
Paint Flaw                                      78                 19.50
Bad Weld                                        34                  8.50
Missing Part                                    25                  6.25
Cracked case                                    21                  5.25
Electrical Short                                19                  4.75
Total                                          400                  100%
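
Steps 1 and 2 can be sketched in Python as follows. The cumulative percent column is an addition beyond the table above; it is the quantity a Pareto diagram's line graph plots. Variable names are illustrative.

```python
# Example 6: number of defects by source of manufacturing error.
defects = {
    "Bad Weld": 34,
    "Poor Alignment": 223,
    "Missing Part": 25,
    "Paint Flaw": 78,
    "Electrical Short": 19,
    "Cracked case": 21,
}

total = sum(defects.values())

# Step 1: sort causes in descending order of number of defects.
ordered = sorted(defects.items(), key=lambda item: item[1], reverse=True)

# Step 2: percent of total for each cause, plus the running (cumulative) percent.
rows = []
cumulative = 0.0
for cause, count in ordered:
    pct = 100 * count / total
    cumulative += pct
    rows.append((cause, count, pct, cumulative))

for cause, count, pct, cum in rows:
    print(f"{cause:<17} {count:>4} {pct:>6.2f}% {cum:>7.2f}%")
```

In this data set the first two causes alone account for about three quarters of all defects, which is the "vital few" pattern the diagram is meant to reveal.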

ANSWER TO EXAMPLE 6 CTD.
Step 3: Show results graphically
Note: ignore the line for now.

[Figure: "Pareto Diagram: Cause of Manufacturing Defect".
Bar graph: % of defects in each category (axis from 0% to 60%),
with categories in the order Poor Alignment, Paint Flaw,
Bad Weld, Missing Part, Cracked case, Electrical Short.
Line graph: cumulative % (axis from 0% to 100%).]

DESCRIPTIVE STATISTICS: GRAPHS TO DESCRIBE
TIME-SERIES VARIABLES
• A line chart (time-series plot) is used to
  show the values of a variable over time.
• Time is measured on the horizontal axis.
• The variable of interest is measured on the
  vertical axis.

Note: Time-series data are numerical data.

EXAMPLE 7: LINE CHART

50

REMINDER: CLASSIFICATION OF VARIABLES

Data
• Categorical
• Numerical
   - Discrete
   - Continuous

REMINDER: TABLES AND GRAPHS FOR
CATEGORICAL VARIABLES
Categorical variables
• Tabulating data
   - Frequency distribution table
   - Cross table
• Graphing data
   - Bar chart
   - Component bar chart
   - Pie chart
   - Pareto diagram

DESCRIPTIVE STATISTICS: GRAPHS TO DESCRIBE
NUMERICAL VARIABLES
Numerical data
• Frequency distributions and cumulative distributions
• Histogram
• Ogive
• Stem-and-leaf display

DESCRIPTIVE STATISTICS: FREQUENCY DISTRIBUTION
TABLE FOR NUMERICAL VARIABLES
• As with categorical variables, we also have the
  frequency distribution table for numerical variables to
  organize data.
• This table contains class groupings (categories or
  ranges within which the data fall) and the
  corresponding frequencies (number of observations)
  with which data fall within each class or category.
• In contrast to the case of categorical variables,
  groups in the frequency distribution table for
  numerical variables are defined by numbers.

DESCRIPTIVE STATISTICS: FREQUENCY
DISTRIBUTION TABLE FOR NUMERICAL VARIABLE
CTD.
• Therefore, we need to know:
   1. How many classes or groups do we have?
   2. How wide should each class be?
• Classes should have the same width. This makes
  comparison of groups easier.
• Classes should never overlap and must be
  inclusive. Why?

DESCRIPTIVE STATISTICS: FREQUENCY
DISTRIBUTION TABLE FOR NUMERICAL VARIABLE
CTD.
• No overlap, because we do not want to
  count the same observation in two different
  groups.
• Inclusive, because we want to include all the
  information that we have.

The above points imply that we need more
information than in the case of a categorical
variable to construct the frequency distribution
table.

CLASS INTERVALS AND CLASS BOUNDARIES FOR
FREQUENCY DISTRIBUTION TABLE OF NUMERICAL
VARIABLES
• Each class grouping has the same width.
• Determine the width of each interval by

$$ w = \text{interval width} = \frac{\text{largest number} - \text{smallest number}}{\text{number of desired intervals}} $$

• Use at least 5 but no more than 15 to 20 intervals.
• Intervals never overlap.
• Round up the interval width to get desirable
  interval endpoints.

EXAMPLE 8
A manufacturer of insulation randomly selects 20
winter days and records the daily high
temperature data:

24, 35, 17, 21, 24, 37, 26, 46, 58, 30,
32, 13, 12, 38, 41, 43, 44, 27, 53, 27

Find the frequency distribution table for this
variable.
58

ANSWER TO EXAMPLE 8
• Sort raw data in ascending order:
  12, 13, 17, 21, 24, 24, 26, 27, 27, 30, 32, 35, 37, 38,
  41, 43, 44, 46, 53, 58
• Find range: 58 - 12 = 46
• Select number of classes: 5 (usually between 5
  and 15)
• Compute interval width: 10 (46/5 = 9.2, then round up)
• Determine interval boundaries: 10 but less than 20,
  20 but less than 30, . . . , 50 but less than 60
• Count observations and assign to classes
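
The steps above can be sketched in Python as follows. Two choices here are assumptions made for the sketch: the width is rounded up to the next whole number (which happens to give 10 for these data), and the first class starts at the multiple of the width at or below the smallest value. In practice the endpoints are chosen by judgment.

```python
import math

# Example 8: daily high temperatures for 20 winter days.
temperatures = [24, 35, 17, 21, 24, 37, 26, 46, 58, 30,
                32, 13, 12, 38, 41, 43, 44, 27, 53, 27]

num_classes = 5
data_range = max(temperatures) - min(temperatures)  # 58 - 12 = 46
width = math.ceil(data_range / num_classes)          # 46 / 5 = 9.2, rounded up to 10

# Assumption: start the first class at a multiple of the width.
start = (min(temperatures) // width) * width         # 10

# Classes are "lower but less than upper", so they never overlap.
classes = [(start + i * width, start + (i + 1) * width)
           for i in range(num_classes)]

# Count the observations that fall in each class.
frequencies = [sum(lower <= t < upper for t in temperatures)
               for lower, upper in classes]

for (lower, upper), f in zip(classes, frequencies):
    print(f"{lower} but less than {upper}: {f}")
print("Total:", sum(frequencies))
```

Because the classes are inclusive and do not overlap, the frequencies add up to the number of observations, 20.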

ANSWER TO EXAMPLE 8 CTD.
Data in ordered array:

12, 13, 17, 21, 24, 24, 26, 27, 27, 30, 32, 35, 37, 38, 41, 43, 44, 46, 53, 58

60

DESCRIPTIVE STATISTICS: HISTOGRAM FOR
NUMERICAL VARIABLE
• A histogram is a graph that consists of vertical
  bars constructed on a horizontal line that is
  marked off with intervals for the variable (in the
  frequency distribution table) being displayed.
• The interval endpoints are shown on the
  horizontal axis.
• The vertical axis is either frequency, relative
  frequency, or percentage for each interval.
• A histogram displays the distribution of the
  variable.

EXAMPLE 8 CTD.: PLOT THE HISTOGRAM FOR THE
MANUFACTURER OF INSULATION

[Figure: histogram "Daily High Temperature". Horizontal axis:
temperature in degrees, with interval endpoints 0, 10, 20, 30,
40, 50, 60. Vertical axis: frequency. No gaps between bars.]

QUESTIONS FOR GROUPING DATA INTO INTERVALS
FOR FREQUENCY DISTRIBUTION

1. How wide should each interval be?
(How many classes should be used?)

$$ w = \text{interval width} = \frac{\text{largest number} - \text{smallest number}}{\text{number of desired intervals}} $$

2. How should the endpoints of the
intervals be determined?
• Often answered by trial and error, subject to
  user judgment
• The goal is to create a distribution that is
  neither too "jagged" nor too "blocky"
• The goal is to appropriately show the pattern of
  variation in the data

HOW MANY CLASS INTERVALS?

Few (wide class intervals)
• may compress variation too
  much and yield a blocky
  distribution
• can obscure important patterns
  of variation

Many (narrow class intervals)
• may yield a very jagged
  distribution with gaps from
  empty classes
• can give a poor indication of
  how frequency varies across
  classes

[Figures: two histograms of frequency against temperature, one
with few wide class intervals and one with many narrow class
intervals. X axis labels are upper class endpoints.]

SHAPE OF A DISTRIBUTION
A histogram helps us determine the shape of the data
distribution.
• We can visually determine whether data are evenly
  spread around the middle or center.
• A distribution is said to be symmetric if the
  observations are balanced or evenly distributed
  around its center.
• If you fold the histogram along a vertical line through
  the middle and the edges match closely,
  you have a symmetric distribution.

SHAPE OF A DISTRIBUTION CTD.
• The usually thinner ends of a distribution
  are called tails.
• A distribution is skewed, or asymmetric, if
  the observations are not symmetrically
  distributed on either side of the center.
• If one tail stretches out farther than the
  other, the distribution is skewed to the side
  of the longer tail.

SHAPE OF A DISTRIBUTION CTD.
• A skewed-right distribution (sometimes called
  positively skewed) has a tail that extends farther to
  the right.
• A skewed-left distribution (sometimes called
  negatively skewed) has a tail that extends farther to
  the left.

A REMINDER
 Frequency is the number of observations in each
category.
 Relative frequency is obtained by dividing each
frequency by the number of observations.
 Percent is obtained from dividing each frequency
by the number of observations and multiplying the
resulting proportion by 100%.

68

THE CUMULATIVE FREQUENCY DISTRIBUTION

A cumulative frequency distribution contains the
total number of observations whose values are less
than the upper limit for each class in the frequency
distribution table of numerical variables.

or

The cumulative frequency of a class is the frequency of
that class plus the frequencies of all previous
classes.
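
A short Python sketch of the second definition, using the class frequencies obtained by counting the Example 8 temperature data into the classes 10 to <20 through 50 to <60. The list names are illustrative.

```python
from itertools import accumulate

# Upper class limits and class frequencies for the Example 8 temperature data
# (classes: 10 to <20, 20 to <30, 30 to <40, 40 to <50, 50 to <60).
upper_limits = [20, 30, 40, 50, 60]
frequencies = [3, 6, 5, 4, 2]

# Cumulative frequency: each class frequency plus all previous ones.
cumulative = list(accumulate(frequencies))

# Cumulative percent: cumulative frequency as a share of all observations.
n = cumulative[-1]
cumulative_percent = [100 * c / n for c in cumulative]

for limit, c, p in zip(upper_limits, cumulative, cumulative_percent):
    print(f"Less than {limit}: {c} observations ({p:.0f}%)")
```

The last cumulative frequency always equals the number of observations, so the last cumulative percent is 100.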

EXAMPLE 8 CTD.: MANUFACTURER OF INSULATION

Data in ordered array:

12, 13, 17, 21, 24, 24, 26, 27, 27, 30, 32, 35, 37, 38, 41, 43, 44, 46, 53, 58

At Home for extra practice:
 Read example 1.9 on page 22 (frequency
and cumulative frequency for numerical
variables).

71

DESCRIPTIVE STATISTICS: OGIVE GRAPH FOR
NUMERICAL VARIABLES
• An ogive, sometimes called a cumulative
  line graph, is a line that connects points that
  are the cumulative percent of observations
  below the upper limit of each interval in a
  cumulative frequency distribution.

EXAMPLE 8 CTD.: INSULATION MANUFACTURER
Plot the ogive graph.

[Figure: ogive "Daily High Temperature". Horizontal axis: upper
interval endpoints (10 to 60). Vertical axis: cumulative
percentage (0 to 100). The accompanying table of cumulative
percentages begins with the class "Less than 10", whose
entries are all 0.]

EXAMPLE 9
Suppose we have the following data:
17, 62, 15, 65, 28, 51, 24, 65, 39, 41, 35, 15, 39, 32,
36, 37, 40, 21, 44, 37, 59, 13, 44, 56, 12, 54, 64, 59
Construct a frequency distribution table.
Construct a histogram.

DESCRIPTIVE STATISTICS: STEM-AND-LEAF
DIAGRAM FOR NUMERICAL VARIABLES

• A stem-and-leaf diagram is a simple way to see
  distribution details in a data set.
• It is only good for small data sets.
• METHOD: Separate the sorted data series
  into leading digits (the stem) and
  trailing digits (the leaves).
   - The number of digits in each class indicates
     the class frequency.
   - The individual digits indicate the pattern of
     values within each class.

EXAMPLE 10

Data in ordered array:
21, 24, 24, 26, 27, 27, 30, 32, 38, 41

Here, use the 10’s digit for the stem unit:

                  Stem   Leaf
21 is shown as       2      1
38 is shown as       3      8

EXAMPLE 10 CTD.

Data in ordered array:
21, 24, 24, 26, 27, 27, 30, 32, 38, 41

Completed stem-and-leaf diagram:

Stem   Leaves
   2   1 4 4 6 7 7
   3   0 2 8
   4   1
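
The diagram above can be reproduced with a few lines of Python. This is only a sketch for two-digit data with the 10’s digit as the stem; the `|` separator and the names are illustrative, and stems with no leaves are simply skipped here.

```python
from collections import defaultdict

# Example 10 data.
data = [21, 24, 24, 26, 27, 27, 30, 32, 38, 41]

# Split each sorted value into its 10's digit (stem) and 1's digit (leaf).
stems = defaultdict(list)
for value in sorted(data):
    stem, leaf = divmod(value, 10)
    stems[stem].append(leaf)

# One line per stem; the number of leaves on a line is the class frequency.
lines = [f"{stem} | {' '.join(str(leaf) for leaf in leaves)}"
         for stem, leaves in sorted(stems.items())]

print("\n".join(lines))
```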

USING OTHER STEM UNITS

Using the 100’s digit as the stem:

Round off the 10’s digit to form the leaves

                     Stem   Leaf
613 would become        6      1
776 would become        7      8
...
1224 becomes           12      2

USING OTHER STEM UNITS

Using the 100’s digit as the stem:

Data:

613, 632, 658, 717, 722, 750,
776, 827, 841, 859, 863, 891,
894, 906, 928, 933, 955, 982,
1034, 1047, 1056, 1140, 1169,
1224

The completed stem-and-leaf display:

Stem   Leaves
   6   1 3 6
   7   2 2 5 8
   8   3 4 6 6 9 9
   9   1 3 3 6 8
  10   3 5 6
  11   4 7
  12   2

EXAMPLE 11
The data presented below were collected on the amount of
time, in hours, it takes an employee to process an order at a
local plumbing wholesaler.

 2.8   4.9   0.5  13.2  14.2   8.9   3.7  15.2  11.2  13.4
 5.5  10.2   1.1  14.2   7.8   4.5  10.9   8.8  18.2  17.1

Construct a stem-and-leaf display of the data.
Answer:
This time the stem unit is 0.1.

So far
• we used the bar chart, pie chart, and Pareto
  diagram to describe a single categorical
  variable.
• we used the component bar chart to describe
  two categorical variables.
• we used histograms, ogives, and stem-and-leaf
  graphs to describe a single numerical
  variable.
Now we use the scatter plot to describe two
numerical variables.

Descriptive statistics: Scatter
Diagrams or Plots for numerical
variables
• Scatter diagrams are used for paired
  observations taken from two numerical
  variables.
• In the scatter diagram, one variable is measured
  on the vertical axis and the other variable is
  measured on the horizontal axis.

EXAMPLE 12: PLOT THE SCATTER DIAGRAM FOR THE
TABLE.
Average SAT scores by state: 1998

State          Verbal    Math
Alabama           562     558
Alaska            521     520
Arizona           525     528
Arkansas          568     555
California        497     516
Colorado          537     542
Connecticut       510     509
Delaware          501     493
D.C.              488     476
Florida           500     501
Georgia           486     482
Hawaii            483     513
W.Va.             525     513
Wis.              581     594
Wyo.              548     546

LECTURE SUMMARY FOR STUDENTS
• Introduced key definitions:
   - Population vs. Sample
   - Parameter vs. Statistic
   - Descriptive vs. Inferential statistics
• Described random sampling
• Examined the decision making process

LECTURE SUMMARY FOR STUDENTS

• Reviewed types of data and measurement levels
• Data in raw form are usually not easy to use for
  decision making. Some type of organization is
  needed:
   - Table
   - Graph
• Techniques reviewed in this chapter:
   - Frequency distribution (categorical variables)
   - Cross tables
   - Bar chart
   - Pie chart
   - Pareto diagram
   - Line chart
   - Frequency distribution (numerical variables)
   - Histogram and ogive
   - Stem-and-leaf display
   - Scatter plot