6.1 Large Data Set
© 2015-2023 Save My Exams, Ltd. · Revision Notes, Topic Questions, Past Papers
Head to savemyexams.co.uk for more awesome resources
Size (capacity) of the engine, measured in cubic centimetres (cc)
Year registered
Vehicles included in the extract were first registered in either 2002 or 2016
The introduction says the precise dates are:
3 June 2002 – 9 June 2002
6 June 2016 – 12 June 2016
Knowing that only a few days from each year are included gives an idea of the sheer size
of the full database
Mass
Measured in kilograms (kg)
The mass of an average driver (75 kg) is included in the figures quoted
Emissions
The remaining data values all concern the emissions from the vehicles
CO2 – Carbon dioxide emissions, measured in g/km
CO – Carbon monoxide emissions, measured in g/km
NOX – Oxides of nitrogen emissions, measured in g/km
part – Particulate emissions, measured in g/km
(this measure only applies to diesel cars)
hc – Hydrocarbon emissions, measured in g/km
Random number
A random number is generated by the spreadsheet for each vehicle, so it is not part of the
data set, but it can be used to randomly select vehicles when sampling
Be aware that the random numbers change each time the spreadsheet is refreshed
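The idea of using a random number per vehicle to choose a sample can be sketched in Python. This is only an illustration, not the spreadsheet's own method: the vehicle labels are hypothetical stand-ins for rows, the function name is made up, and the random numbers are assumed to lie between 0 and 1.

```python
import random


def sample_by_random_number(vehicles, n, seed=None):
    """Pick n vehicles by tagging each with a random number and taking the n smallest."""
    rng = random.Random(seed)
    # Tag every vehicle with its own random number (assumed here to be in [0, 1)).
    tagged = [(rng.random(), vehicle) for vehicle in vehicles]
    # Order the vehicles by their random number only.
    tagged.sort(key=lambda pair: pair[0])
    # The first n vehicles in this order form a simple random sample.
    return [vehicle for _, vehicle in tagged[:n]]


# Hypothetical stand-ins for 100 rows of the spreadsheet.
vehicles = [f"vehicle_{i}" for i in range(1, 101)]
sample = sample_by_random_number(vehicles, 30, seed=1)
```

Fixing the seed plays the role of not refreshing the spreadsheet: the same random numbers, and so the same sample, are produced each time.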
Is the data complete?
Various data values are blank within the spreadsheet; others are 0 where this makes no
sense (such as the mass of the car)
There is no information as to why these occur but be aware they exist
Under the “Definition of fields” tab there is some extra information about the emissions
data
CO2 emissions are known for 83% of vehicles in the whole database
CO emissions are known for 82% of vehicles in the whole database
NOX emissions are known for 81% of vehicles in the whole database
Part emissions are only recorded for diesel vehicles (24% of the whole database)
Hc emissions are known for 51% of vehicles in the whole database
The above means that the data should be cleaned before samples are taken
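As a rough sketch of what cleaning could look like for the mass field, the Python below drops any row whose mass is blank or zero. The rows, field names and values are invented for illustration and are not taken from the large data set.

```python
def clean_masses(rows):
    """Keep only the rows whose mass is present and greater than zero."""
    cleaned = []
    for row in rows:
        mass = row.get("mass")
        # Skip blank entries (missing or empty values).
        if mass is None or mass == "":
            continue
        # Skip zeros (or negatives), which make no sense for a vehicle's mass.
        if float(mass) <= 0:
            continue
        cleaned.append(row)
    return cleaned


# Made-up rows: two usable masses, one zero and two blanks.
rows = [
    {"make": "A", "mass": 1340},
    {"make": "B", "mass": 0},
    {"make": "C", "mass": None},
    {"make": "D", "mass": ""},
    {"make": "E", "mass": 1120},
]
cleaned = clean_masses(rows)
```

Only rows A and E survive, so a sample taken from the cleaned list cannot include a blank or zero mass.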
What are the key features I need to know about the data set?
These have been mentioned in the lists above, but here is a summary of those we have seen
used in exam and practice papers:
There are only five makes, and Ford was the most frequently registered
There is only one electric vehicle in the database
Data is from a few days in summer and only in two years – 2002 and 2016
The mass of a vehicle includes an average 75 kg driver
Emissions data (CO2, CO and NOX) is only known for around 80% of the whole
database
Particulate emissions are only applicable to diesel cars
Worked Example
Jay collects data on the masses of vehicles first registered in 2002, taking a random
sample of size 30.
(a)
Use your knowledge of the large data set to explain why Jay should clean the data
before taking a sample.
(b)
Jay’s calculations show the mean mass of a vehicle in his sample is 1340 kg.
Using your knowledge of the large data set, write down an estimate for the mean
mass of an empty vehicle in the whole database, justifying your answer.
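One possible line of reasoning, based only on the facts in these notes: for (a), some masses in the spreadsheet are blank or recorded as 0, so they should be removed before sampling. For (b), the quoted masses include a 75 kg driver, so that amount can be subtracted from the sample mean, with a random sample assumed to be representative. The short Python check below shows the arithmetic; the variable names are illustrative.

```python
DRIVER_MASS_KG = 75    # stated in the notes: quoted masses include an average driver
sample_mean_kg = 1340  # Jay's sample mean from part (b)

# Remove the driver's mass to estimate the mean mass of an empty vehicle.
empty_vehicle_mean_kg = sample_mean_kg - DRIVER_MASS_KG
print(empty_vehicle_mean_kg)  # 1265
```

This gives an estimate of 1265 kg for the mean mass of an empty vehicle.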
Exam Tip
As vehicle emissions are frequently mentioned in news articles, be wary of
confusing popular opinion with what can be justified using the information
contained within the large data set.