Statistics

Unit 3: Statistics


Introduction
In this unit, you will learn to calculate the three measures of central tendency and to understand
when one is more appropriate than another. You will examine and create graphs for given data and
learn which graph is most appropriate for a given data set. You will also create frequency tables from
raw data and then graph them. Finally, you will examine how statistics and graphs can be manipulated
to support a specific point of view.

Assessment Checklist:

Check off the following as you complete each:

o Lesson 1 Assignment: Measures of Central Tendency
o Lesson 3 Assignment: Bar Charts and Histograms

o Lesson 4 Assignment: Circle Graphs and Line Graphs
o Lesson 5 Assignment: Misuse of Statistics
o Lesson 6: Graphing on Excel

Test Unit 3: Statistics

STATISTICS

Introduction:
Statistics is generally a mathematical process involving:
1. the collection of data
2. the organization of data
a. numerically
b. graphically
3. the analysis of the data
4. making predictions and decisions based on the data

Collecting Data:
Population: the entire group that you are interested in collecting data from
For example: the entire population of SHARP students taking grade 11 math. A census also uses the entire
population.

Sample: a small group of people chosen from the population that you are interested in collecting data from
For example: a sample of 40 grade 11 students from Miles Macdonell

LESSON 1: MEASURES OF CENTRAL TENDENCY

Say that we surveyed a random sample of 50 grade 11 students and measured their heights. A list of 50 heights
does not tell us very much on its own. To summarize the data in a more informative way, we use the
measures of central tendency.

There are three measures of central tendency. Each is a single number that represents the centre of a set of data:
1. mean
2. median
3. mode

Each of these three measures identifies a different feature of the data set so it is important to understand when it
is appropriate to use each. Each has its advantages and disadvantages in its ability to describe the data.

1. Mean: is the average of the data; it is calculated by adding up all the data and then dividing by how
many data points there are.

Formula:

$$\bar{x} = \frac{\text{the sum of the data}}{\text{number of data points}}$$

Example:
You have the following test scores on five math tests: 70, 92, 84, 91, 70. What is the mean (or average) of your
scores?

Solution:

$$\bar{x} = \frac{70 + 92 + 84 + 91 + 70}{5} = \frac{407}{5} = 81.4$$

2. Median: is the middle value of the data when it is arranged in increasing (or decreasing) order

Example:
You have the following test scores on five math tests: 70, 92, 84, 91, 70. What is the median of your scores?

Solution:
Step 1: Arrange the data in order from smallest to largest.

70, 70, 84, 91, 92

Step 2: Find the middle value; it is the median.

median = 84

3. Mode: is the number that occurs the most in the data set (if all values occur only once then there is no mode)

Example:
You have the following test scores on five math tests: 70, 92, 84, 91, 70. What is the mode of your scores?

Solution:

The mode is the most common data point (the one that shows up the most in the list of data).

mode = 70

When is it appropriate?

- It is most appropriate to use the mean when there are no extreme values (outliers: values that are either
really small or really big and don't fit with the rest of the data) within the data set.
- It is most appropriate to use the median when there are a few extreme values, because the median is not
influenced by outliers.
- It is most appropriate to use the mode when you are interested in the "most common" value.
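
The three definitions above can be checked with a short program. The following is a minimal sketch in Python, offered only as an illustration (Python is not part of this unit, and the helper name `modes` is made up for this example). It uses the five test scores from the examples above and follows the rule that there is no mode when every value occurs only once.

```python
from collections import Counter
from statistics import mean, median


def modes(data):
    """Return the most common value(s), or an empty list if no value repeats."""
    counts = Counter(data)
    highest = max(counts.values())
    if highest == 1:
        # Every value occurs only once, so there is no mode.
        return []
    return sorted(value for value, count in counts.items() if count == highest)


scores = [70, 92, 84, 91, 70]

score_mean = mean(scores)      # sum of the data divided by the number of data points
score_median = median(scores)  # middle value once the data is sorted
score_modes = modes(scores)    # value(s) that occur most often

print("mean:", score_mean)
print("median:", score_median)
print("mode:", score_modes)
```

A list is returned for the mode because a data set can have more than one value tied for most common.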

Example:
Sara is taking a flying course and had to write six tests. Her results were: 65, 66, 100, 63, 64, 63.
a. Find the mean, median and mode for this data.
b. Which measure best represents Sara's knowledge of flying? Why?

Solution:
a. Mean:

$$\bar{x} = \frac{65 + 66 + 100 + 63 + 64 + 63}{6} = \frac{421}{6} \approx 70.17$$

Median: 63, 63, 64, 65, 66, 100

There is no exact middle number, so here we have to average the two numbers that are closest to the middle.

$$\text{median} = \frac{64 + 65}{2} = 64.5$$

mode = 63

b. The median is the most appropriate to use because the data has an outlier: the score of 100 does not fit in
with the rest of her test scores. It is much higher, which causes the mean to be too high. The mode is too low
and does not reflect that she has test scores that are higher than 63.

Example:
A recent newspaper article stated the average income for people living on Sine Street was \$99,500. A letter was
written to the editor of the newspaper claiming that the average income for people living on Sine Street was
\$30,000. What is the truth: the information in the newspaper article or the information in the letter to the editor?
Name       Income
Baker      \$500,000
Smith      \$220,000
Simpson    \$70,000
Ford       \$60,000
Campbell   \$40,000
Wyatt      \$30,000
Grant      \$30,000
Bender     \$20,000
Burns      \$15,000
Milhouse   \$10,000
a. Using measures of central tendency, determine the position each person took on this issue.
b. Which measure of central tendency do you think gives the best picture of the "average" income on
Sine Street?

Solution:
a. Mean:

$$\bar{x} = \frac{500\,000 + 220\,000 + 70\,000 + 60\,000 + 40\,000 + 30\,000 + 30\,000 + 20\,000 + 15\,000 + 10\,000}{10} = \frac{995\,000}{10} = \$99\,500$$

$$\text{median} = \frac{40\,000 + 30\,000}{2} = \$35\,000$$

mode = \$30 000

The newspaper article reported the mean (\$99,500); the letter to the editor reported the mode (\$30,000).

b. The mean is not a good choice because there are outliers. There are two household incomes that are far
higher than what the rest of the households earn. These make the mean higher than a typical income on the
street, so we should use the median.

Example:
In some Olympic events such as gymnastics, the final mark is determined by dropping the lowest and highest
scores that a contestant receives from the panel of judges. Use the scores given below to answer the following
questions.
Gymnast #1 Scores Gymnast #2 Scores
8.8 9.4
8.7 9.6
8.6 6.0
8.8 8.0
6.5 9.2
9.7 9.2
9.9 9.1

a. Without dropping the high and low score, calculate the three measures of central tendency for each
gymnast.
b. Which gymnast would win the gold if the mean was used? The median? The mode?
c. Drop the high and low scores, recalculate the mean, median and mode for each gymnast.
d. In real Olympic competition, the mean is used to decide the medal winners. Which gymnast would
win?
e. Why do you think the high and low scores are dropped?

Solution:
a. Gymnast #1:

$$\bar{x} = \frac{8.8 + 8.7 + 8.6 + 8.8 + 6.5 + 9.7 + 9.9}{7} = \frac{61}{7} \approx 8.7$$

Median: 6.5, 8.6, 8.7, 8.8, 8.8, 9.7, 9.9, so the median is 8.8
mode = 8.8

Gymnast #2:

$$\bar{x} = \frac{9.4 + 9.6 + 6.0 + 8.0 + 9.2 + 9.2 + 9.1}{7} = \frac{60.5}{7} \approx 8.6$$

Median: 6.0, 8.0, 9.1, 9.2, 9.2, 9.4, 9.6, so the median is 9.2
mode = 9.2

b. Gymnast #1 would win if the mean was used. Gymnast #2 would win if the median or mode were used.

c. Gymnast #1 (dropping 6.5 and 9.9):

$$\bar{x} = \frac{8.8 + 8.7 + 8.6 + 8.8 + 9.7}{5} = \frac{44.6}{5} = 8.92$$

Median: 8.6, 8.7, 8.8, 8.8, 9.7, so the median is 8.8
mode = 8.8

Gymnast #2 (dropping 6.0 and 9.6):

$$\bar{x} = \frac{9.4 + 8.0 + 9.2 + 9.2 + 9.1}{5} = \frac{44.9}{5} = 8.98$$

Median: 8.0, 9.1, 9.2, 9.2, 9.4, so the median is 9.2
mode = 9.2

d. Gymnast #2 would win.

e. The high and low scores are dropped to get rid of outliers that affect the mean.
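
The idea of dropping the highest and lowest score before averaging can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `olympic_mean` is made up here, and the scores are the ones from the table in this example.

```python
from statistics import mean


def olympic_mean(scores):
    """Mean after dropping one lowest and one highest score."""
    trimmed = sorted(scores)[1:-1]  # remove the first (lowest) and last (highest)
    return mean(trimmed)


gymnast_1 = [8.8, 8.7, 8.6, 8.8, 6.5, 9.7, 9.9]
gymnast_2 = [9.4, 9.6, 6.0, 8.0, 9.2, 9.2, 9.1]

for name, scores in (("Gymnast #1", gymnast_1), ("Gymnast #2", gymnast_2)):
    print(name,
          "all scores:", round(mean(scores), 2),
          "high and low dropped:", round(olympic_mean(scores), 2))
```

With all seven scores Gymnast #1 has the higher mean, but once the high and low scores are dropped Gymnast #2 comes out ahead, which matches parts b and d.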

Example:
Greenwood Manufacturing is trying to recruit new employees so they can expand their company. In the
advertisement they claim that the average salary of an employee is \$44,000 a year. Below is a chart showing the
payroll information for Greenwood.
a. Determine the mean, median and mode.
b. Is the company falsely advertising when they said that the average salary is \$44,000?
c. What measure (mean, median, mode) is most appropriate to show what a typical salary is for an
employee? Why?

Job Title        Number of Employees   Salary
President        1                     \$250,000
Vice-President   1                     \$130,000
Plant Manager    2                     \$75,000
Supervisor       10                    \$50,000
Labourer         30                    \$37,000
Sales Clerk      10                    \$24,000

Solution: Notice that this set of data has more than one person earning each of the salaries. For the
salaries that are earned by multiple people, it is easiest to multiply the salary by how many people earn
it to get the total amount. To find the mean, add up these totals and divide by the total number of
employees, not by the number of job titles.

Job Title         Number of Employees   Salary      Total
President         1                     \$250,000   \$250 000
Vice-President    1                     \$130,000   \$130 000
Plant Manager     2                     \$75,000    \$150 000
Supervisor        10                    \$50,000    \$500 000
Labourer          30                    \$37,000    \$1 110 000
Sales Clerk       10                    \$24,000    \$240 000
Total Employees   54                                \$2 380 000

a. Mean:

$$\bar{x} = \frac{\$2\,380\,000}{54} \approx \$44\,074.07$$

median = \$37 000
mode = \$37 000

b. The company used the mean, which is about \$44,000.

c. The mode is the most accurate of the three measures of central tendency (here the median has the same
value, \$37,000). The mean is affected by the outliers and is too high.

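The multiply-then-divide approach used in this solution can be sketched in Python as follows. This is a hedged illustration only: the variable names are made up for this example, and the payroll figures are the ones from the chart above.

```python
from statistics import median, mode

# (number of employees, salary) for each job title in the payroll chart
payroll = [
    (1, 250_000),   # President
    (1, 130_000),   # Vice-President
    (2, 75_000),    # Plant Manager
    (10, 50_000),   # Supervisor
    (30, 37_000),   # Labourer
    (10, 24_000),   # Sales Clerk
]

total_employees = sum(count for count, _ in payroll)
total_payroll = sum(count * salary for count, salary in payroll)

# Divide by the number of employees, not by the number of job titles.
mean_salary = total_payroll / total_employees

# Write out one salary per employee to find the median and the mode.
salaries = [salary for count, salary in payroll for _ in range(count)]

print("employees:", total_employees)
print("mean:", round(mean_salary, 2))
print("median:", median(salaries))
print("mode:", mode(salaries))
```

Expanding the table into one salary per employee is simple but only practical for small data sets; it is used here so the median and mode follow directly from their definitions.
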
Curriculum Outcomes:
11E3.S.1 Develop statistical reasoning

Lesson 1 Assignment: Measures of Central Tendency

See your teacher for Lesson 1 Assignment

LESSON 2: ORGANIZING DATA GRAPHICALLY

There are many different graphs that can be created. But, depending on the type of data you have collected,
some graphs are better than others to display the data.

Bar Chart: A bar chart is a way of summarizing a set of categorical data.

Other names: Column

Categorical data is data that consists of only a small number of values, each corresponding to a specific
category value or label. It is data that is usually better expressed as labels than as numbers. The label
may describe a classification, category, or group of the item of interest.

For example, for data on reasons people were absent from work, the classifications might include
categories such as illness, vacation, holiday, or funeral leave.

A bar chart displays the data using a number of rectangles, of the same width, each of which
represents a particular category. The length of each rectangle is the number of cases in the category it
represents.

Notice the difference between bar charts, where the bars are drawn with a gap between them, and a
histogram, where the bars are drawn immediately next to each other.

[Figure: sample bar chart with the categories Apples, Oranges, Star Fruit, Mango and Papaya; vertical axis from 10 to 60]

Variations: Clustered bar, zero-line


Line Graph: A graph of ordered pairs, (x, y), where the points are connected, in order, by line segments.

Other names: Time series

Good for comparing one set of values to another. Also good for displaying trends.

A line graph is a way to summarize how two pieces of information are related and how they vary
depending on one another. The numbers along a side of the line graph are called the scale.

Line graphs show interpolated points and slopes well.

Variations: In finance, high/low/close (in commodities field, also called "bar"); candle charts.

Histogram: A histogram is a way of summarising data that are measured on an interval scale.

Other names: Step

Good for comparing counts. Shows frequency distributions as steps or bars. Good when values fall into
discrete sets and not good when they don't.

Note: Histogram bars always touch. Bars (or sets of bars) on bar charts do not touch.

The histogram is only appropriate for variables whose values are numerical and measured on an
interval scale.

[Figure: sample histogram titled Test Scores, with Score intervals (0 - 49%, 50 - 59%, 60 - 69%, 70 - 79%, 80 - 89%, 90 - 100%) on the horizontal axis and Number of Students (0 to 10) on the vertical axis]

Variations: Pyramid histogram


Scatterplot: A scatterplot provides a visual representation of numerical data and allows us to look for
any trends or patterns in the data.

Other names: Scattergram, XY scatter

Good for spotting clusters or out-of-range points.

Each data point is the intersection of two variables plotted against the two axes.

Variations: Bubble chart

Pie Graph: A circle graph (or pie chart) is a way of summarising a set of categorical data. It is a circle
which is divided into segments, and each segment's size represents how much of the whole that category makes up.

Other names: Circle, cake, sector

Good for showing snapshots of proportional relationships, one snapshot per period of time. One
pie is one whole (100 percent).

Bad for comparing two or more relationships. Most people find it hard to compare wedge-shaped areas
from one pie chart to the next.

LESSON 3: BAR CHARTS AND HISTOGRAMS:
Bar Charts and Histograms are useful to display data that falls into specific categories.

Bar Charts: the bars in the graph do NOT touch
Histograms: the bars in the graph do touch

Each is constructed in a similar manner. When constructing graphs it is important that we first identify the
independent and the dependent variables. The independent variable is ALWAYS on the x-axis (the horizontal
axis along the bottom of our graph) and the dependent variable is on the y-axis (the vertical axis along the side
of the graph).

Example:
The table below represents the number of incidents of various types of crime for the town of Thompson.

Year               1999   2000   2001   2002   2003
Number of Crimes   1109   1200   1287   1350   1443

Construct a histogram to represent the above data.

Solution:
The independent variable is the Year and the dependent variable is the Number of Crimes.

[Histogram: Number of Crimes, with Year (1999 to 2003) on the horizontal axis and Number of Crimes (950 to 1500) on the vertical axis]

Example:
The following data was collected about which introductory courses first year university students take. Draw a
bar graph to represent the data.

Course               Chemistry   Physics   Math   Psychology   Economics
Number of Students   155         120       200    300          250

Solution:
The dependent variable is the Number of Students and the independent variable is the Course.

[Bar graph: Course (Chemistry, Physics, Math, Psychology, Economics) on the horizontal axis and Number of Students (0 to 320) on the vertical axis]

Example:
Councils in two BC towns conducted a survey to determine how people feel about the different options for
protecting the bears that live in the area but still keep the communities safe. The results are shown below; create
a bar graph to represent this data.

Bear Smart Program
Suggestion                                     Town 1   Town 2
Use safe electric fences around the landfill   1020     711
Remove brush in town                           294      47
Use bear-proof garbage bins                    701      710
Move problem bears to the wild                 773      479
Put out garbage on pickup day only             948      518
Lock commercial garbage bins                   60       76

Solution:

[Double bar graph: Bear Smart Program, with Suggestions on the horizontal axis, a vertical axis from 0 to 1100, and a legend for Town 1 and Town 2]

Example:
The data below shows the number of acres on 32 beet farms. Quinn's family wants to grow beets on their farm
this year. How can Quinn use this data to help them decide how many acres they should dedicate to growing
sugar beets?

139 61 358 169
126 350 62 159
502 290 150 74
61 462 59 122
187 72 76 66
123 66 150 191
130 145 150 231
398 800 208 420

Solution:

Step 1: Determine the range of the data. This is the highest value minus the lowest value.

$$\text{range} = 800 - 59 = 741$$

If the range is large then we must create larger intervals for our data to be grouped into.
If the range is small then we can create smaller intervals.

Since 741 is a fairly large range we will use larger intervals to organize our data.

Step 2: Determine how many intervals you would need for the size you choose:

$$\text{\# of intervals} = \frac{\text{range of data}}{\text{size you choose}}$$

Let us choose intervals of 50. How many intervals would we need?

$$\text{\# of intervals} = \frac{741}{50} \approx 15 \quad \text{(too many to draw)}$$

If we choose intervals of 100:

$$\text{\# of intervals} = \frac{741}{100} \approx 7.4 \quad \text{(so about 8 intervals)}$$

Step 3: Organize the data into intervals to create a Frequency Table. For each interval count the number of
farms that fall into that category size.

# of Acres   0 - 100   101 - 200   201 - 300   301 - 400   401 - 500   501 - 600   601 - 700   701 - 800
# of Farms   9         13          3           3           2           1           0           1
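
Steps 1 to 3 can also be carried out by a short program. The sketch below is in Python and is only an illustration under stated assumptions: the variable names are made up for this example, and the intervals are taken to start at 0 as in the table above (0 - 100, 101 - 200, and so on).

```python
import math

acres = [
    139, 61, 358, 169,
    126, 350, 62, 159,
    502, 290, 150, 74,
    61, 462, 59, 122,
    187, 72, 76, 66,
    123, 66, 150, 191,
    130, 145, 150, 231,
    398, 800, 208, 420,
]

interval_size = 100  # the size chosen in Step 2

# Step 1: range = highest value - lowest value
data_range = max(acres) - min(acres)

# Step 2: round up, because a partly used interval still needs its own bar
num_intervals = math.ceil(data_range / interval_size)

# Step 3: count the farms in each interval.
# The table starts at 0, so enough bins are needed to reach the largest value.
num_bins = math.ceil(max(acres) / interval_size)
counts = [0] * num_bins
for size in acres:
    # 1 to 100 acres (and 0) go in the first interval, 101 to 200 in the second, ...
    index = max(0, (size - 1) // interval_size)
    counts[index] += 1

for i, count in enumerate(counts):
    low = 0 if i == 0 else i * interval_size + 1
    high = (i + 1) * interval_size
    print(f"{low} - {high}: {count}")
```

Adding up the counts is a useful check: the total should equal the number of farms in the data.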

Step 4: Draw the graph using the Frequency Table


[Histogram: Number of Sugar Beet Farms, with Size of Farm (acres) in intervals from 0 - 100 to 701 - 800 on the horizontal axis and # of Farms (0 to 14) on the vertical axis]

Curriculum Outcomes:
11E3.S.1 Solve problems that involve creating and interpreting graphs, including: bar graphs, histograms, line graphs, and circle
graphs
Lesson 3 Assignment: Bar Charts and Histograms
See your teacher for Lesson 3 Assignment

LESSON 4: CIRCLE GRAPHS:
Circle graphs are useful to display data that are in percentages.

Example:
Complete the chart and then create a circle graph that represents the data.

Pet Survey
Pet       Number   Percentage (%)   Part of the Circle
No Pets   420
Dog       240
Cat       200
Bird      50
Other     90

Solution:
To draw a circle graph we first need to convert the data into a part of a circle. Recall that circles have 360°, so
we need to convert our data to degrees.

To do this we first need to know the percent that each category is out of the total.

$$\text{percent} = \frac{\text{\# of that category}}{\text{total number}} \times 100\%$$

$$\text{Percent No Pets} = \frac{420}{1000} \times 100\% = 42\%$$

$$\text{Percent Dog} = \frac{240}{1000} \times 100\% = 24\%$$

$$\text{Percent Cat} = \frac{200}{1000} \times 100\% = 20\%$$

$$\text{Percent Bird} = \frac{50}{1000} \times 100\% = 5\%$$

$$\text{Percent Other} = \frac{90}{1000} \times 100\% = 9\%$$

Next we have to change the percent to the number of degrees it is out of the whole circle.

$$\text{Part of Circle} = (\text{percent as a decimal})(360°)$$

$$\text{No Pets} = (0.42)(360°) \approx 151°$$

$$\text{Dog} = (0.24)(360°) \approx 86°$$

$$\text{Cat} = (0.20)(360°) = 72°$$

$$\text{Bird} = (0.05)(360°) = 18°$$

$$\text{Other} = (0.09)(360°) \approx 32°$$


Pet Survey
Pet       Number   Percentage (%)   Part of the Circle
No Pets   420      42%              151°
Dog       240      24%              86°
Cat       200      20%              72°
Bird      50       5%               18°
Other     90       9%               32°
Total     1000     100%             359°

Notice that the Part of the Circle does not add up to 360° but only to 359°. This is because we didn't carry any
decimal places and therefore we have rounding errors. As long as it adds up to be close to 360° it is fine.
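
The percent and degree conversions for the Pet Survey can be sketched in Python. This is an illustration only; the variable names are made up for this example, and each angle is rounded to the nearest whole degree as in the table.

```python
pets = {"No Pets": 420, "Dog": 240, "Cat": 200, "Bird": 50, "Other": 90}

total = sum(pets.values())

percents = {}
degrees = {}
for pet, count in pets.items():
    percents[pet] = count * 100 / total        # percent of the total
    degrees[pet] = round(count * 360 / total)  # part of the 360 degree circle

for pet in pets:
    print(f"{pet}: {percents[pet]:.0f}% -> {degrees[pet]} degrees")

print("total degrees:", sum(degrees.values()))
```

The rounded angles add up to 359 rather than 360, which is the rounding error described above.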

Now we use the degrees, a compass and a protractor to draw the circle graph.

[Circle graph: Pet Survey, with sectors for No Pets, Dog, Cat, Bird and Other]

Example:
The following data was collected from a telephone poll of 1000 Canadians. Each person was asked to name
their favourite sport to watch. Draw a circle graph that represents the data.

Sport Number
Hockey 450
Football 240
Baseball 120
Soccer 58
Volleyball 24
Other 19

Solution:

Sport Number Percent Degrees
Hockey 450
Football 240
Baseball 120
Soccer 58
Volleyball 24
Other 19
Total

Line Graphs
A line graph is a way to summarize how two pieces of information are related and how they vary depending on
one another. They are also used to show trends in data.

Example:
The table below shows daily temperatures for New York City, recorded for 6 days. Draw a line graph to
represent the data.

Temperature in NY City
Day   Temperature
1     43°F
2     53°F
3     50°F
4     57°F
5     59°F
6     67°F
Solution:

[Line graph: Temperature in New York, with Day (1 to 6) on the horizontal axis and Temperature in °F (0 to 80) on the vertical axis]

Example:
Sarah bought a new car in 2001 for \$24,000. The dollar value of her car changed each year as shown in the table
below. Construct a line graph to represent the data. Explain any trend that the graph shows. In 2009 what would
you predict the value of her car to be?
Value of Sarah's Car
Year Value
2001 \$24,000
2002 \$22,500
2003 \$19,700
2004 \$17,500
2005 \$14,500
2006 \$10,000
2007 \$5,800
Solution:

[Line graph: Value of Sarah's Car, with Year (2001 to 2007) on the horizontal axis and Value in \$ (\$0 to \$26,000) on the vertical axis]

Example:
A tourist resort in Jasper, Alberta, has made a graph of the nationalities of their visitors over the past year, to
help them direct their marketing campaign for next year.

[Circle graph: Visitors to Jasper Resort, by Nationality]

American         21%
Mexican          13%
British          17%
Other European    5%
Australian       15%

a. After Canadians, visitors of what nationality are most common?

American, 21%

b. What percentage of visitors are Canadian?

c. If there were 1500 British visitors, how many Americans visited the resort?

$$\text{\# of visitors} = (\text{\% of visitors as a decimal})(\text{total \# of visitors})$$

Total number of visitors:

We will use the information about the British to calculate the total number of visitors.

$$1500 = (0.17)(\text{Total})$$

$$\text{Total} = \frac{1500}{0.17} \approx 8824$$

So,

$$\text{\# of American visitors} = (0.21)(8824) \approx 1853$$
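
The same two-step calculation can be sketched in Python. This is an illustration only, with variable names made up for this example; as in the solution, the total is rounded to a whole number of visitors before it is used again.

```python
british_visitors = 1500
british_share = 0.17   # 17% of all visitors, as a decimal
american_share = 0.21  # 21% of all visitors, as a decimal

# Step 1: work backwards from the British count to the total number of visitors.
total_visitors = round(british_visitors / british_share)

# Step 2: apply the American percentage to that total.
american_visitors = round(american_share * total_visitors)

print("total visitors:", total_visitors)
print("American visitors:", american_visitors)
```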

Curriculum Outcomes:
11E3.S.1 Solve problems that involve creating and interpreting graphs, including: bar graphs, histograms, line graphs, and circle
graphs

Lesson 4 Assignment: Circle Graphs and Line Graphs

See your teacher for Lesson 4 Assignment

LESSON 5: MANIPULATION OF DATA
The manipulation of data was introduced in the lesson for the measures of central tendency.

Recall that the mean is influenced by outliers in the data and therefore may not be the best representation for the
data when the data contains outliers. If an individual wanted to use the data to mislead others they may choose
the mean over the median or mode.

Manipulating data can also be done using graphs. One method of manipulating the graph is changing the y-axis.
By changing the y-axis the graph can be made to appear different and therefore be used to support a specific
point of view or opinion. Thus, it is important to be aware of how the graph is created to be able to understand
the data.

Example:
The following chart lists the annual sales at a local company.

Year 2005 2006 2007 2008 2009
Sales (in Thousands) 136 140 144 148 155

a. Draw a bar graph to represent this data accurately.
b. The president of the company would like the increase in sales to appear as large as possible in the graph.
Draw a graph to represent the data so that the increase appears much greater than it actually is.
c. A competing company wants to make sales over the past five years appear as small as possible. Draw a
graph that would represent the data so that the increase in sales appears much smaller.

Solution:
a. Choose a graph with a vertical axis beginning at 0; to create the most accurate graph the vertical axis
should begin at zero (although, this sometimes is difficult depending on the range of the data)

[Bar graph: Sales (in thousands) by Year (2005 to 2009), with the vertical axis running from 0 to 160]

b. To make the sales increase over the past five years appear larger, choose a vertical scale that begins at
130 and a smaller interval of increase for the vertical axis.

Note: truncating the vertical axis exaggerates the difference between the heights of the bars in the graph,
creating the impression that the sales increases were larger. By paying close attention to the scale on the axis
you would know that the increase over the past five years has only been about 5 thousand each year.

[Bar graph: Sales (in thousands) by Year (2005 to 2009), with the vertical axis running from 125 to 160]

c. To make the sales increase appear much smaller choose a larger scale and a large maximum value so
the difference in heights of the bars is reduced and the appearance of a sales increase is minimized.

[Bar graph: Sales (in thousands) by Year (2005 to 2009), with the vertical axis running from 0 to 500]
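
The effect of the vertical axis on how the bars look can be put into numbers. The sketch below is in Python and is only an illustration under stated assumptions: the two helper functions are made up for this example, a bar's drawn height is taken to be its value minus the value where the axis starts, and the axis ranges are the ones described in parts a, b and c.

```python
sales = {2005: 136, 2006: 140, 2007: 144, 2008: 148, 2009: 155}


def bar_height_ratio(first, last, axis_start):
    """How many times taller the last bar is drawn than the first bar."""
    return (last - axis_start) / (first - axis_start)


def share_of_axis(first, last, axis_start, axis_end):
    """Fraction of the vertical axis taken up by the change from first to last."""
    return (last - first) / (axis_end - axis_start)


first, last = sales[2005], sales[2009]

honest_ratio = bar_height_ratio(first, last, axis_start=0)       # axis starts at 0
inflated_ratio = bar_height_ratio(first, last, axis_start=130)   # axis starts at 130

honest_share = share_of_axis(first, last, axis_start=0, axis_end=160)
shrunk_share = share_of_axis(first, last, axis_start=0, axis_end=500)

print("bar height ratio, axis from 0:", round(honest_ratio, 2))
print("bar height ratio, axis from 130:", round(inflated_ratio, 2))
print("share of axis, 0 to 160:", round(honest_share, 3))
print("share of axis, 0 to 500:", round(shrunk_share, 3))
```

With the axis starting at 0 the 2009 bar is drawn only slightly taller than the 2005 bar, but with the axis starting at 130 it is drawn more than four times as tall, even though the data has not changed. Stretching the axis to 500 has the opposite effect: the same increase takes up a much smaller part of the graph.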

Example:
An advertisement for the Widget Company claims that the price of widgets has dropped dramatically over the
last six months. The company supports its claim with the following graph.
a. Explain how the graph distorts the data.
b. Construct a more accurate graph of the data.

[Graph: Price of Widgets, with Month (Apr to Aug) on the horizontal axis and Price (68 to 79) on the vertical axis]

Solution:
a. The graph distorts the data because the y-axis does not start at 0. Because it does not start at 0, the
decrease in price appears to be larger from month to month. Also, the scale on the y-axis is small (it only
increases by 1), which exaggerates the difference between months.
b. A more accurate graph illustrating the decrease in price would have a y-axis beginning at 0 and a scale
of 10.

[Graph: Price of Widgets, with Month (Apr to Aug) on the horizontal axis and Price (0 to 90) on the vertical axis]

Manipulation of bar graphs and histograms can also be achieved by changing the width of the bars. To draw an
accurate bar graph or histogram the bars must be the same size. If one bar is larger than another, the impression
is that there is more data collected for that category. Thus, changing the x-axis also changes the appearance of
the graph and may be used to support a specific point of view.

Example:
a) Draw an accurate double bar graph for the data below.
b) Draw a second double bar graph so that it appears that mothers buy more books than fathers.
c) Draw a third graph so that it appears that very few mothers buy sporting equipment as gifts.

Item      Video Games   Art Supplies   Books   Sport Equipment
Mothers   12            29             21      15
Fathers   25            16             22      26

Solution:
a. [Double bar graph: Mothers and Fathers for each item (Video Games, Art Supplies, Books, Sport Equipment), with the vertical axis running from 0 to 32]

b. If we want it to look like Fathers bought significantly more books than Mothers, then we make that bar
wider and the bar representing Mothers' purchases thinner.

[Double bar graph: Mothers and Fathers for each item (Video Games, Art Supplies, Books, Sport Equipment), with the vertical axis running from 0 to 32 and bars of different widths]
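
The effect of bar width described in part b can be checked with numbers. The sketch below is in Python and is only an illustration: the function name and the widths of 1 and 2 units are assumptions made up for this example, and the idea that the eye compares the areas of the bars is a simplification.

```python
def bar_area(height, width):
    """Area of a bar, used here as a rough measure of how big it looks."""
    return height * width


mothers_books = 21
fathers_books = 22

# Accurate graph: both bars have the same width.
equal_width_ratio = bar_area(fathers_books, 1) / bar_area(mothers_books, 1)

# Distorted graph: the Fathers bar is drawn twice as wide (assumed width).
unequal_width_ratio = bar_area(fathers_books, 2) / bar_area(mothers_books, 1)

print("Fathers vs Mothers, equal widths:", round(equal_width_ratio, 2))
print("Fathers vs Mothers, Fathers bar twice as wide:", round(unequal_width_ratio, 2))
```

With equal widths the two bars look almost the same size. Doubling the width of one bar makes it look about twice as large, even though the heights still differ by only one book.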

Curriculum Outcomes:
E-3 manipulate the presentation of data to represent a point of view

Lesson 5 Assignment: Misuse of Statistics

See your teacher for Lesson 5 Assignment

LESSON 6: GRAPHING ON EXCEL:
Using Excel create graphs for the following data.

1. The following data was collected from a poll of 1000 Canadians. Each person was asked to name their
favourite sport to watch. Draw
a. a bar chart and
b. a circle graph for the following data.
Sport Number (out of 1000)
Hockey 450
Football 240
Baseball 120
Soccer 58
Volleyball 24
Other 19

2. Create a Line Graph for the following data:
Year John's Weight (kg)
1991 68
1992 70
1993 74
1994 74
1995 73

3. Create a line graph to represent the depreciation of a car's value versus the mileage of the car.
Car's Value (\$) Kilometers on Odometer
\$14,000 0
\$12,000 20,000
\$8,000 40,000
\$5,000 60,000
\$4,000 80,000
\$3,000 120,000

4. Create a circle graph to represent the toppings that people like on their pizza. The data below was collected
from 1000 people.
Topping Number of People Part of the Circle
Sausage 75
Cheese 250
Tomato 125
Mushroom 50
Pepperoni 250
Meatlovers 250

5. Bob is a store manager and wants to create a bar graph to track the number of hours each of his employees
works. Below is the schedule for the week. Create a bar graph showing each employee's hours for each day.

Employee           Sun   Mon   Tues   Wed   Thurs   Fri   Sat   Total
Chantel            7     8     0      0     7       8     4
Chris              0     6     4      4     4       4     0
John               0     4     8      7     0       0     8
Dawn               4     4     4      0     4       4     0
Total hours /day

6. Larry researched the number of people working in the different industries in Winnipeg and recorded the data
in the following table.
a. Create a spreadsheet and graph for this data.
b. Why did you choose the type of graph that you used in part a?
Number of People Working in Industries in Winnipeg
Agriculture Construction Manufacturing Wholesale Retail Finance Health Education Business Other
services
47 595 32 310 62 580 23 040 65 475 31 505 75 915 47 365 95 353 121 030

7. Go to the following website (http://www.vancouver2010.com/) to collect data from the 2010 Winter
Olympic Games.
a. Create a Bar Chart to show the top 12 winning countries for TOTAL medal count.
b. Create separate Bar Charts to show the top 5 countries for
- Gold,
- Silver and
- Bronze Medals.
c. Recreate the following graph BUT be sure to include the titles for each axis as well as a legend.

[Graph to recreate: vertical axis running from 0 to 40]
d. Create two Line Graphs to show 1) the total medal count and 2) the total Gold medals won for Canada
from the 1998 Nagano, 2002 Salt Lake City, 2006 Torino, and the 2010 Vancouver Winter Olympic
Games.
e. What can you say about the total number of medals won and the total number of Gold medals won by
Canada from the 1998 to the 2010 Winter Olympics?