
Edexcel GCSE (9 – 1)

Statistics
Mr M Dominguez
mdominguez@kegs.org.uk
Chapter 2: Processing and Representing Data
2.1 Tables
2.2 Two-way tables
2.3 Pictograms
2.4 Bar charts
2.5 Stem and leaf diagrams
2.6 Pie charts
2.7 Comparative pie charts
2.8 Population pyramids
2.9 Choropleth Maps
2.10 Histograms and frequency polygons
2.11 Cumulative frequency charts
2.12 The shape of the distribution
2.13 Histograms with unequal class-widths
2.14 Misleading diagrams
2.15 Choosing the right format
Lesson 1: 2.1, 2.2, 2.3, 2.4, 2.5
Lesson 2: 2.6, 2.7
Lesson 3: 2.8, 2.9
Lesson 4: 2.10, 2.11
Lesson 5: 2.11, 2.12
Lesson 6: 2.13
Lesson 7: 2.14, 2.15
§ 2.1 Tables

A table, database or spreadsheet is a collection of data.
We can use tables to construct different charts and
graphs.
If there are gaps in the table we can write “not available”.
We need to be aware that out of date data may be
unreliable.
Page 60/61 Question 6 & 7
§ 2.2 Two-way tables

• A two-way table is a way of representing bivariate data (two categories).
• We will use them when finding a stratified sample.
• They may be used for probability, in particular conditional probability.
1) The two-way table shows some information about the numbers of boys, girls and teachers at three schools.

2) Felicity asked 100 students how they came to school one day.
Each student walked or came by bicycle or came by car.
49 of the 100 students are girls.
10 of the girls came by car.
16 boys walked.
21 of the 41 students who came by bicycle are boys.
Construct a two-way table for this data.
[Question 1 table not reproduced; values recovered from the slide: 168, 93, 9, 27, 65, 110]

         Car   Walk   Cycle   Total
Girls     10     19      20      49
Boys      14     16      21      51
Total     24     35      41     100
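The deductions behind the completed table for question 2 can be checked with a short Python sketch. The variable names are illustrative only; every number comes from the facts stated in the question.

```python
# Facts stated in the question (Felicity's survey of 100 students).
total_students = 100
total_girls = 49
girls_car = 10
boys_walk = 16
total_cycle = 41
boys_cycle = 21

# Deduce the remaining cells step by step.
total_boys = total_students - total_girls            # everyone who is not a girl
girls_cycle = total_cycle - boys_cycle               # cyclists who are not boys
girls_walk = total_girls - girls_car - girls_cycle   # girls left over
boys_car = total_boys - boys_walk - boys_cycle       # boys left over

table = {
    "Girls": {"Car": girls_car, "Walk": girls_walk, "Cycle": girls_cycle},
    "Boys": {"Car": boys_car, "Walk": boys_walk, "Cycle": boys_cycle},
}

# Add a row total to each row, then build the column-total row.
for row in table.values():
    row["Total"] = sum(row.values())

columns = ["Car", "Walk", "Cycle", "Total"]
table["Total"] = {c: table["Girls"][c] + table["Boys"][c] for c in columns}

for name, row in table.items():
    print(name, [row[c] for c in columns])
```

A useful check on any two-way table is that the row totals and the column totals both add up to the same grand total, here 100.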
§ 2.3 Pictograms

• A pictogram uses symbols or pictures to represent a certain number of items.
• It must have a key to tell you the number of items represented by a single symbol or picture.
• It is used to represent the frequency of qualitative data.
What are the errors with this pictogram?

[Pictogram titled “What do you have on your toast?” with rows Butter, Marmalade, Jam and Marmite]
Key: one symbol = 8 people

             Frequency
Butter          56
Marmalade       72
Jam             60
Marmite         66
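One check that may help when looking for errors is to work out how many symbols each row should contain, using the key and the frequency table. This is a minimal sketch; the pictogram itself is not reproduced here, so it only shows what a correct drawing would need.

```python
people_per_symbol = 8  # from the key
frequencies = {"Butter": 56, "Marmalade": 72, "Jam": 60, "Marmite": 66}

# Number of symbols each row needs (part-symbols show up as decimals).
symbols_needed = {item: freq / people_per_symbol for item, freq in frequencies.items()}

for item, count in symbols_needed.items():
    print(f"{item}: {count:g} symbols")
```

Rows that need a fraction of a symbol (such as a half or a quarter) are the ones most easily drawn wrongly.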
§ 2.4 Bar Charts

• Bar charts can be used to represent the frequencies of qualitative data and discrete quantitative data.
• Axes must be labelled.
• There should be gaps between bars, as the data is not continuous.
• Bars should be of equal width.
You also need to be aware of dot plots and vertical line graphs.


Multiple (Comparative) Bar Charts

• We can compare two sets of data using bar charts

• Place the bars for the two sets of data next to each
other for each category
Composite (stacked) bar chart

• Shows the total frequency for each category.


• Shows how each category is made up from
separate groups
• Can compare total frequencies and frequencies of
each group.
§ 2.5 Stem and Leaf Diagrams

Find the following using the stem and leaf diagram below
(The total is 25)
1. Median
2. Range
3. Mode
4. Modal group
Back-to-back stem and leaf diagrams
§ 2.6 Pie charts

A drinks machine dispenses 540 drinks on a Monday. The information is displayed in the pie chart. Use the information to find the number of each drink sold.

[Pie chart with sectors Tea, Coffee, Milk, Cola, Squash and Chocolate; angles shown: 84°, 108°, 24°, 60°, 48°, 36°]

Tea:
Milk:
Chocolate: 54
Squash: 72
Cola: 90
Coffee: 126
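The method is to find how many drinks one degree represents and then multiply by each angle. The sketch below applies this to the six angles recovered from the slide. The pairing of angles to drinks is not fully legible in the extracted chart, so the results are listed by angle; four of them (126, 54, 72 and 90) match the answers given for Coffee, Chocolate, Squash and Cola.

```python
total_drinks = 540
angles = [84, 108, 24, 60, 48, 36]  # sector angles in degrees, as recovered from the slide

# The sectors should fill the whole circle.
assert sum(angles) == 360

drinks_per_degree = total_drinks / 360  # 540 / 360 = 1.5 drinks per degree

counts = {angle: angle * drinks_per_degree for angle in angles}
for angle, drinks in counts.items():
    print(f"{angle} degrees -> {drinks:g} drinks")
```

The six results should add up to the 540 drinks dispensed, which is a quick way to catch an arithmetic slip.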
The pie charts show some information about the numbers of matches won, drawn and lost by a cricket team and by a hockey team last year.

The cricket team won 15 matches.

(a) How many matches did the cricket team lose?

(b) Which team won the most matches last year?
(Tick one box to show your answer.)

Cricket / Hockey / Not enough information

Explain your answer.


The pie chart gives information about the mathematics exam grades of some students.

Diagram NOT accurately drawn

(a) What grade was the mode?
(b) What fraction of the students got grade D?
8 of the students got grade C.
(c) (i) How many of the students got grade F?
    (ii) How many students took the exam?
This accurate pie chart gives information about the English exam grades for a different set of students.

Sean says “More students got a grade D in English than in mathematics.”
(d) Sean could be wrong.
Explain why.
§ 2.7 Comparative pie charts

• Comparative pie charts compare different sets of data.
• The area of each circle is proportional to the total frequency of its data set.
• Hence we will need to calculate the radius.
$$\left(\frac{r_2}{r_1}\right)^2 = \frac{F_2}{F_1}$$

Two comparative pie charts are constructed.

1. Chart 1 has a frequency of 8 and a radius of 5 cm; chart 2 has a frequency of 18. Find the radius of chart 2.

2. Chart 1 has a frequency of 125 and a radius of 4 cm; chart 2 has a frequency of 64. Find the radius of chart 2.
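Because area is proportional to total frequency, the radius of chart 2 is the radius of chart 1 multiplied by the square root of the frequency ratio. A minimal Python sketch of that rearrangement, applied to the two questions above (the function name `comparative_radius` is just an illustrative choice):

```python
import math

def comparative_radius(r1, f1, f2):
    """Radius of chart 2 so that the areas are in the ratio f2 : f1."""
    return r1 * math.sqrt(f2 / f1)

r_q1 = comparative_radius(5, 8, 18)    # question 1
r_q2 = comparative_radius(4, 125, 64)  # question 2

print(f"Question 1: {r_q1:.2f} cm")
print(f"Question 2: {r_q2:.2f} cm")
```

Note that a larger frequency gives a larger radius, but the radius grows more slowly than the frequency because of the square root.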

Then do Q 6 & 7 on page 83


§ 2.8 Population Pyramids

• Population pyramids have the same shape as a back-to-back stem and leaf diagram, but they can be used for continuous data and are more like a back-to-back histogram.
• They are often used to compare a population by gender and age.
§ 2.9 Choropleth Maps

• A choropleth map is a thematic map widely used by geographers.
• The shading ranges from light to dark in proportion to the density of the variable.
• A key shows what each shade represents.
• Examples of use are: population density, income from tourism, land use, and food produced.
§ 2.10 Histograms and Frequency Polygons

Important note: the diagrams in this section are not true histograms. They are more similar to bar charts for continuous data.

This method is only used when all the class widths are the same,
which is very unlikely in an exam.

Key points:
The x-axis is in groups but needs to be plotted as a continuous
scale. Each bar starts at the lower bound of the group and ends at
the upper bound: hence there should be no gaps between bars.
A frequency polygon is constructed by joining the midpoints of the
top of the bars together.
Draw a bar chart for this data
Draw a frequency polygon for this data
§ 2.11 Cumulative frequency charts

Cumulative frequency is calculated as a running total of the frequencies of each group.
We can then plot the points on a graph. We ALWAYS plot the points at the upper bound of each group.

Continuous data
• For a cumulative frequency graph we join the points up in a smooth
curve.
• For a cumulative frequency polygon we join the points one by one with a
ruler.

Discrete Data
• We can draw a cumulative frequency step polygon.
Draw a Cumulative Frequency Curve of the following information
Cumulative Frequency is also called a ‘running total’
Temperature (°F)     Frequency   Cumulative Frequency
below 60                17              17
60 to below 70          46              63
70 to below 80          73             136
80 to below 90          52             188
90 to below 100         12             200
Estimate the number of days with a temperature below 71°F
Find an estimate of the median temperature.
Draw a Cumulative Frequency Curve of the following information
Estimate the number of days with a temperature below 70°F.
Find an estimate of the median temperature.

Temp (°F)   <60   <70   <80   <90   <100
C.F          17    63   136   188    200

Key points to always remember:
1) ALWAYS plot at the upper boundary of each group, i.e. 60, 70, 80, 90, 100.
2) Connect the points with a single smooth curve.

[Graph: cumulative frequency (0 to 200) against temperature (°F), 40 to 100]

• Approximately 63 days
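The running total and the readings from the graph can be sketched in Python. One assumption to flag: the code joins the plotted points with straight lines (a cumulative frequency polygon), so its readings may differ slightly from those taken off a hand-drawn smooth curve. The function names are illustrative.

```python
from itertools import accumulate

upper_bounds = [60, 70, 80, 90, 100]  # upper boundary of each class (°F)
frequencies = [17, 46, 73, 52, 12]    # number of days in each class

cumulative = list(accumulate(frequencies))  # running total

def days_below(temp):
    """Cumulative frequency at an upper boundary taken from the table."""
    return cumulative[upper_bounds.index(temp)]

def value_at(target):
    """Temperature at which the cumulative frequency reaches `target`,
    using straight-line interpolation between plotted points."""
    points = list(zip(upper_bounds, cumulative))
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if y0 <= target <= y1:
            return x0 + (target - y0) / (y1 - y0) * (x1 - x0)
    raise ValueError("target lies outside the plotted points")

n = cumulative[-1]
print(cumulative)
print(days_below(70))              # days with a temperature below 70°F
print(round(value_at(n / 2), 1))   # estimate of the median temperature
```

For grouped data with a large total, the median is usually read at half the total frequency (here the 100th day), which is what `value_at(n / 2)` does.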
Draw a Cumulative Frequency Curve of the following information
Estimate the number of students who achieved more than 65 marks

Mark   <20   <40   <60   <80   <100
C.F      7    31   114   166    200

[Graph: cumulative frequency (0 to 200) against marks, 0 to 100]

Careful! The curve gives the number of students with up to 65 marks (about 123), so subtract this from the total:
200 – 123 = 77 students
The table shows how many items of junk mail Kavina’s parents get each day.
We can draw a cumulative frequency step polygon for this data.
A cumulative frequency step polygon can be thought of as a ‘less than graph’.

Use the graph to estimate the median number of items of junk mail a day.
Work out the cumulative frequencies for Kavina’s parents’ personal mail.
[Graph: cumulative frequency (0 to 20) against goals, 0 to 4]
Plot the points, number of goals against cumulative frequency.
Join the points up by going across then up.
§ 2.12 The shape of a distribution

The skewness of data can be described using diagrams, measures of location and measures of spread.
[Box plots marked with Q1, Q2 and Q3 for three cases: Symmetrical, Positive Skew, Negative Skew]

Data which is spread evenly → Symmetrical
Data which is mostly at the lower values → Positive Skew
Data which is mostly at the higher values → Negative Skew
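One common way to decide the skew from the quartiles is to compare the gap Q2 − Q1 with the gap Q3 − Q2. The slides do not state this rule explicitly, so treat the sketch below as an assumption about the intended method. The quartile values are hypothetical, chosen only to illustrate each case.

```python
def describe_skew(q1, q2, q3):
    """Compare the two halves of the box: Q2 - Q1 against Q3 - Q2."""
    lower_gap = q2 - q1
    upper_gap = q3 - q2
    if upper_gap > lower_gap:
        return "positive skew"   # median sits nearer the lower quartile
    if upper_gap < lower_gap:
        return "negative skew"   # median sits nearer the upper quartile
    return "symmetrical"

# Hypothetical quartiles, not taken from the slides.
print(describe_skew(10, 20, 30))
print(describe_skew(10, 14, 30))
print(describe_skew(10, 26, 30))
```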
A histogram shows how the data is distributed across the class intervals.
There are three types of distribution.
Interpret your answer in context.
The distribution is symmetrical. It has no skew.
• Most of the data is near the median
• The spread of data above and below the mean is the same

The distribution has positive skew.


• Most of the data values are at
the lower end.
• More of the data is less than the
mean
• The data above the mean has a
greater spread than the data
below the mean

The distribution has negative skew.


• Most of the data values are at
the upper end.
• More of the data is above the
mean
• The data below the mean has a
greater spread than the data
above the mean
Describe and interpret the shape of the distribution

[Histogram: frequency density (0 to 10) against speed (mph), 0 to 80]

Negative skew. The data below the mean is more spread out than the data above the mean.
Describe and interpret the shape of the distribution

Positive skew. The data above the mean is more spread out than the data below the mean.
§ 2.13 Histograms with unequal class widths
In a true histogram the area of each bar is proportional to its frequency.
In GCSE maths and statistics we can assume that the area of the bar is equal to its frequency.
For this reason the y-axis becomes frequency density.

$$\text{Frequency density} = \frac{\text{Frequency}}{\text{Class width}}$$

Class width is the difference between the lower and upper bound of each group.
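Sometimes it is easier to remember the formula below. This helps us remember that the area of each bar is equal to the frequency.

$$\text{Frequency} = \text{Frequency density} \times \text{Class width}$$

This can then be shown in a triangle, with Frequency at the top and Frequency density and Class width along the bottom.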
Time taken (t seconds)   10 < t ≤ 30   30 < t ≤ 35   35 < t ≤ 40   40 < t ≤ 50   50 < t ≤ 70
Frequency                     5             4             8            27            24

Time taken (t seconds)   Class Width   Frequency   Frequency Density
10 < t ≤ 30                  20             5            0.25
30 < t ≤ 35                   5             4            0.8
35 < t ≤ 40                   5             8            1.6
40 < t ≤ 50                  10            27            2.7
50 < t ≤ 70                  20            24            1.2
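The frequency density column can be reproduced with a few lines of Python. This is only a sketch of the calculation (frequency divided by class width) using the class boundaries and frequencies from the table.

```python
# (lower bound, upper bound, frequency) for each class, from the table
classes = [
    (10, 30, 5),
    (30, 35, 4),
    (35, 40, 8),
    (40, 50, 27),
    (50, 70, 24),
]

rows = []
for lower, upper, freq in classes:
    width = upper - lower       # class width = upper bound - lower bound
    density = freq / width      # frequency density = frequency / class width
    rows.append((lower, upper, width, freq, density))
    print(f"{lower} < t <= {upper}: width {width}, frequency {freq}, density {density:g}")

densities = [row[4] for row in rows]
```

Multiplying each density back by its class width returns the original frequency, which is the "area equals frequency" idea.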
[Histogram: frequency density (0 to 3) against time t (seconds), 0 to 70, drawn from the class widths and frequency densities in the table]
What if you were asked:
Estimate the number of people who took between 30 and 90 seconds to complete the test?

[Histogram: frequency density (0 to 8) against time (seconds), 0 to 200, with the region from 30 to 90 seconds split into rectangles 1 and 2]

Rectangle 1 → 0.5 by 30 = 15 people
Rectangle 2 → 1.5 by 30 = 45 people
Total = 60 people!
Estimate the number of people who took between 45 and 60 seconds.

[Histogram: frequency density (0 to 3) against time t, 0 to 70, with the region from 45 to 60 seconds split into rectangles (1) and (2)]
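The slide does not show a worked answer for this question. Assuming the histogram is the one drawn from the time-taken table (its axes match), the estimate can be sketched as below: each class contributes the share of its frequency that lies between 45 and 60 seconds, which is the same as adding the areas of rectangles (1) and (2). The function name is illustrative.

```python
# (lower bound, upper bound, frequency) for each class in the time-taken table
classes = [
    (10, 30, 5),
    (30, 35, 4),
    (35, 40, 8),
    (40, 50, 27),
    (50, 70, 24),
]

def estimate_between(a, b):
    """Estimated frequency between a and b, assuming values are spread
    evenly within each class (the histogram area between a and b)."""
    total = 0.0
    for lower, upper, freq in classes:
        overlap = min(b, upper) - max(a, lower)   # width of the class inside [a, b]
        if overlap > 0:
            total += freq * overlap / (upper - lower)
    return total

print(estimate_between(45, 60))
```

Under these assumptions the two rectangles are 5 × 2.7 and 10 × 1.2, so the result is an estimate rather than an exact count.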
Use the Histogram to complete the table

[Histogram: frequency density against speed (mph), 0 to 90]

Speed (mph)       Area of bar      Frequency
0 – under 20      20 x 2 = 40          40
20 – under 50     30 x 6 = 180        180
50 – under 60     10 x 8 = 80          80
60 – under 70     10 x 5 = 50          50
70 – up to 90     20 x 1 = 20          20
Estimate the number of people whose average speed was 10 to 40 mph.

[Histogram: frequency density against speed (mph), 0 to 90, with the region from 10 to 40 mph split into rectangles 1 and 2]

Rectangle 1: 10 x 2 = 20 people
Rectangle 2: 20 x 6 = 120 people
So 140 people in total!
Estimate the number of people whose average speed was 45 to 65 mph.

[Histogram: frequency density against speed (mph), 0 to 90, with the region from 45 to 65 mph split into rectangles 1, 2 and 3]

Rectangle 1: 5 x 6 = 30 people
Rectangle 2: 10 x 8 = 80 people
Rectangle 3: 5 x 5 = 25 people
So 135 people in total!
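The three speed examples all use the same idea: frequency is the area under the histogram. The sketch below takes the bar heights from the worked answers (the histogram itself is not reproduced here) and recomputes the table and both estimates. The function name `area_between` is illustrative.

```python
# (lower bound, upper bound, frequency density) for each bar,
# read from the worked answers for the speed histogram
bars = [
    (0, 20, 2),
    (20, 50, 6),
    (50, 60, 8),
    (60, 70, 5),
    (70, 90, 1),
]

def area_between(a, b):
    """Area of the histogram between a and b, which estimates the frequency."""
    total = 0
    for lower, upper, density in bars:
        overlap = min(b, upper) - max(a, lower)   # width of this bar inside [a, b]
        if overlap > 0:
            total += overlap * density
    return total

frequencies = [area_between(lower, upper) for lower, upper, _ in bars]
print(frequencies)            # whole-bar frequencies for the table
print(area_between(10, 40))   # 10 to 40 mph
print(area_between(45, 65))   # 45 to 65 mph
```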
100 Farms produced 400-450 gallons of milk. How many produced 150-250 gallons?

[Histogram: frequency density against gallons of milk produced by farm, 0 to 450; the 400-450 bar is labelled 100]

50 x ? = 100
The height must be 2 units, making the height of the gridlines 4 units!
100 x 10 = 1000 Farms
900 Farms produced up to 150 Gallons. Estimate the number that produced 200-300 Gallons.

[Histogram: frequency density against gallons of milk produced by farm, 0 to 450; the 0-150 bar is labelled 900 and the region from 200 to 300 is split into rectangles 1 and 2]

150 x ? = 900
The height must be 6 units!
Rectangle 1: 50 x 15 = 750 Farms
Rectangle 2: 50 x 18 = 900 Farms
Total = 1650 Farms
Which interval will the median be in?

[Histogram: frequency density against gallons of milk produced by farm, 0 to 450; the bars are labelled with frequencies 900, 1200, 1800, 450 and 150, giving running totals of 900, 2100 and 3900]

Median position = (n + 1) ÷ 2
= (4500 + 1) ÷ 2
= 2250.5

After the first 2 groups, we have had 2100 farms from the total. After the 250-350 group, we have had 3900 farms. The middle farm must therefore be in the 250-350 group.
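The same reasoning can be sketched in Python: keep a running total of the bar frequencies and stop at the first group whose total reaches the median position. The frequencies are those read from the slide; the group is reported by its position because only the 250-350 interval is named in the text.

```python
from itertools import accumulate

frequencies = [900, 1200, 1800, 450, 150]  # bar frequencies read from the slide
running_totals = list(accumulate(frequencies))

n = running_totals[-1]
median_position = (n + 1) / 2

# The median group is the first one whose running total reaches the position.
median_index = next(
    i for i, total in enumerate(running_totals) if total >= median_position
)

print(n, median_position)
print(running_totals)
print(f"The median lies in group number {median_index + 1}")
```

Here the running total first passes 2250.5 in the third group, which is the 250-350 group named on the slide.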
A shoe manufacturer measured the length l mm of 200 people’s feet.
The results are summarised in the table below.

The incomplete histogram shows information about the data.
Drawing Conclusions From Histograms

What percentage of people in the survey below watched 20 hours or more of TV?
What do we do first?
Calculate how many people are in each group:
Now find the percentage:
The Histogram represents the birth weights of 150 babies.
Thirty babies weighed over 4.5kg. Babies weighing under 2kg are taken to a Special Care unit.
Calculate the number of babies taken to the Special Care unit.

[Histogram: frequency density against weight (kg); the bar above 4.5 kg has width 1.5 and is labelled 30, and the region under 2 kg is split into rectangles 1 and 2]

30 ÷ 1.5 = 20
Rectangle 1: 1 x 12 = 12 babies
Rectangle 2: 0.5 x 16 = 8 babies
Total = 20 babies
§ 2.14 Misleading Diagrams
Key points:
Make sure that scales are consistent.
Axes should be labelled.
3D graphs can be misleading.
The scale should be appropriate.
When looking at changes over time, proportions or
percentages may be more useful.
The source of the data should be clear and reliable.
§ 2.15 Choosing the right format

The graph or chart we choose first depends on the type of data we are looking at.
We can then split the types of graphs into two lists: those that show trends in data and those that show proportions.
                          Advantages                  Disadvantages
Tables                    Shows exact values          Does not demonstrate patterns or trends clearly
Bar charts, line graphs   Shows trends and patterns   Data can only be read if the changes are small
Pie charts                Shows proportions           Does not show accurate data values
Ways to improve data presentation:
• Use a graph that reflects proportions or different totals, such as comparative pie charts.
• Ensure that the scale on the y-axis is not too large. You do not need to start at 0 if looking at trends over time.
