What is statistics?
by Professor David Hand FBA
23 OCT 2020
Statistics is sometimes seen as a part of mathematics, but in a real sense the two disciplines
have diametrically opposed aims. In caricature, mathematics takes an artificial world (a set of
axioms) and aims to deduce the consequences – what the world would look like. In contrast,
statistics takes the consequences (the data) and aims to deduce what kind of world could have
produced those data. Of course, statistical tools are described in mathematical terms, but so
are the tools of surveying, accountancy, physics, economics and so on, and they are not seen
as part of mathematics.
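
The two directions described above can be made concrete with a small sketch. The coin-flip model, the function names `deduce_distribution` and `infer_p`, and the numbers are illustrative assumptions, not anything taken from the essay itself.

```python
import random
from math import comb

def deduce_distribution(p, n):
    """Mathematical direction: start from an assumed world (a coin that
    lands heads with probability p) and deduce the consequences, here the
    probability of seeing k heads in n flips for k = 0..n."""
    return [comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)]

def infer_p(flips):
    """Statistical direction: start from the observed data (a list of 0/1
    flips) and estimate what kind of coin could have produced them.
    The sample proportion is used as a simple estimate."""
    return sum(flips) / len(flips)

# Deduction: a fair coin flipped twice gives 0, 1 or 2 heads.
print(deduce_distribution(0.5, 2))

# Inference: simulate data from a hypothetical "unknown" coin, then estimate p.
rng = random.Random(0)
hidden_p = 0.3  # made-up value, used only to generate example data
flips = [1 if rng.random() < hidden_p else 0 for _ in range(1000)]
print(infer_p(flips))
```

The estimate from `infer_p` will generally be close to, but not exactly equal to, the value used to generate the data, which is the uncertainty that the rest of the essay goes on to discuss.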
The core of data science is statistics, supplemented with some computer science for
manipulating data along with domain knowledge about the problems and properties of the
data being dealt with. Likewise, statistical concepts and methods lie at the heart of machine
learning and artificial intelligence, which can be thought of as statistical systems which adapt
to incoming data.
Probability plays a central role in modern statistics. This is because data are seldom perfect,
having associated uncertainty, and also because the aim is often to make an inference to a
population from a sample of values – for example, to estimate the average income within a
country based on observing only some incomes, or to see if treatment A is better than
treatment B in a clinical trial based on only a few hundred people. It will be obvious that there
are dangers in this: if you collect income data solely from people who work in the City of
London you are likely to obtain a biased result, and likewise if you give treatment A
preferentially to the sicker people. Separate branches of statistics, notably survey sampling
and experimental design, are concerned with how best to collect data to avoid such problems
– who you should approach in a social survey, who should get which treatment in a clinical
trial and so on.
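
The income example can be sketched as a short simulation. Everything in it is a hypothetical assumption made for illustration: the log-normal population, its parameters, the sample size, and the choice to stand in for "people who work in the City of London" with the top 5% of earners.

```python
import random
import statistics

def simulate_income_survey(seed=0, population_size=100_000, sample_size=500):
    """Compare a simple random sample with a sample drawn only from
    high earners. Returns (true_mean, random_mean, biased_mean)."""
    rng = random.Random(seed)

    # Hypothetical population of incomes (made-up log-normal parameters).
    incomes = [rng.lognormvariate(10.0, 0.6) for _ in range(population_size)]
    true_mean = statistics.fmean(incomes)

    # Simple random sample: every person is equally likely to be chosen.
    random_sample = rng.sample(incomes, sample_size)

    # Biased sample: respondents are drawn only from the top 5% of earners.
    cutoff = population_size // 20
    top_earners = sorted(incomes)[-cutoff:]
    biased_sample = rng.sample(top_earners, sample_size)

    return (
        true_mean,
        statistics.fmean(random_sample),
        statistics.fmean(biased_sample),
    )

true_mean, random_mean, biased_mean = simulate_income_survey()
print(f"population mean:     {true_mean:,.0f}")
print(f"random sample mean:  {random_mean:,.0f}")
print(f"biased sample mean:  {biased_mean:,.0f}")
```

Under these assumptions the random sample lands near the population mean, while the biased sample overstates it by a wide margin however many people are surveyed. The same logic motivates random assignment in a clinical trial: deciding who gets treatment A by chance, rather than by how sick they are, prevents the comparison from being distorted in the same way.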
Statistics is often seen as merely concerned with aggregate phenomena, summarising masses
of data, but many applications are very much about the individual: data collected from the
many are certainly summarised, but then the summary is combined with data about the
individual to inform decisions about that individual, for example to determine which
treatment is most likely to benefit someone, or whether someone is in a high-risk
category for insurance.
Statistics has sometimes suffered from a bad press in the past – you will have heard the old
remark about “lies, damned lies, and statistics”. The truth, however, is that, while yes, it is
possible to lie with statistics, it is a damned sight easier to lie without them.
David Hand FBA is Emeritus Professor of Mathematics at Imperial College London. He was
elected a Fellow of the British Academy in 2003. His books include Statistics: A Very Short
Introduction, The Improbability Principle and Dark Data.