Introduction
The Data Ethics Framework guides appropriate and responsible data use in government and the wider public sector. It helps public servants understand ethical considerations, address these within their projects, and encourages responsible innovation.

Who is it for?

This guidance is aimed at anyone working directly or indirectly with data in the public sector, including data practitioners (statisticians, analysts and data scientists), policymakers, operational staff and those helping to produce data-informed insight.

Teams should work through the framework together throughout the process of planning, implementing, and evaluating a new project. Each part of the framework is designed to be regularly revisited throughout your project, especially when any changes are made to your data collection, storage, analysis or sharing processes.

Questions in each section will guide you through various ethical considerations in your project. Record your answers to create a data ethics assessment. The self-assessment scoring scale is designed to help you identify which aspects of your project need further work.

The framework is split into overarching principles and specific actions. Overarching principles are applicable throughout the entire process and underpin all actions and all aspects of the project. Specific actions will guide you through different stages of the project and provide practical considerations. In addition, the framework provides specific actions you can take at each stage of the project to advance transparency, accountability, and fairness. These are marked within each section.
Overarching principles:

● Transparency
● Accountability
● Fairness

Specific actions:

1. Define and understand public benefit and user need
2. Involve diverse expertise
3. Comply with the law
4. Review the quality and limitations of the data
5. Evaluate and consider wider policy implications
Overarching principles
Transparency

Transparency means that your actions, processes and data are made open to inspection by publishing information about the project in a complete, open, understandable, easily-accessible, and free format. In your work with and on data and AI, use the available guidance, e.g. the Open Government Playbook, to ensure transparency throughout the entirety of your process.

Score (0-5), recorded at the start of, during, and after the project:
0 = Information about the project, its methods, and outcomes is not publicly available.
5 = Information about the project, its methods, and outcomes is widely available to the public.
Accountability

Accountability means that there are effective governance and oversight mechanisms for any project. Public accountability means that the public or its representatives are able to exercise effective oversight and control over the decisions and actions taken by the government and its officials, in order to guarantee that government initiatives meet their stated objectives and respond to the needs of the communities they are designed to benefit.

Score (0-5), recorded at the start of, during, and after the project:
0 = Mechanisms for scrutiny, governance, or peer review for the project haven't been established.
5 = Long-term oversight and public scrutiny mechanisms are built into the project cycle.
Fairness

It is crucial to eliminate your project's potential to have unintended discriminatory effects on individuals and social groups. You should aim to mitigate biases which may influence your model's outcome and ensure that the project and its outcomes respect the dignity of individuals, are just, non-discriminatory, and consistent with the public interest, including human rights and democratic values.

Score (0-5), recorded at the start of, during, and after the project:
0 = There is a significant risk that the project will result in harm or detrimental discriminatory effects for the public or certain groups.
5 = The project promotes just and equitable outcomes, has negligible detrimental effects, and is aligned with human rights considerations.

You can read more about fairness and its different types in the Understanding artificial intelligence ethics and safety guide developed by the Government Digital Service and the Office for Artificial Intelligence.

You can read more about the standards regarding human rights in AI and machine learning in the Toronto Declaration developed by Access Now and Amnesty International.
Self-assessment

Please use the following self-assessment to mark, track and share your overall progress over time against the three overarching principles (Transparency, Accountability, Fairness) and the following 5 specific actions. Score each principle and action from 0 to 5, with a score recorded at the start of, during, and after the project.
1. Define and understand public benefit and user need

Using the circles, move them to where your project is on the spectrum. Do this at the start of a project, during a project, and after a project has completed. This will help you track, communicate and share your progress.

Score (0-5):
0 = Public benefit and user need are not clearly defined or understood.
5 = Public benefit and user need are well defined and understood by all team members.
1.1

● What are direct benefits for individuals in this project? (e.g. saving time when applying for a government service)
● How does the project deliver positive social outcomes for the wider public?
● How can you measure and communicate the benefits of this project to the public?
● What are the groups that would be disadvantaged by the project, or that would not benefit from the project? What can you do about this?

1.2

● What would be the harm in not using data? What social outcomes might not be met?
● What are the potential risks or negative consequences of the project versus the risk in not proceeding with the project?
● Could the misuse of the data/algorithm or poor design of the project contribute to reinforcing social and ethical problems and inequalities?
● What kind of mechanisms can you put in place to prevent this from happening?
● What specific groups benefit from the project? What groups can be denied opportunities or face negative consequences because of the project?
Human rights considerations

● How does the design and implementation of the project/algorithm respect human rights and democratic values?
● How does the project/algorithm work towards advancing human capabilities, advancing inclusion of underrepresented populations, and reducing economic, social, gender, racial, and other inequalities?
● What are the environmental implications of the project? How could they be mitigated?

1.4 Justify the benefit for the taxpayers and appropriate use of public resources in your project

● How can you demonstrate the value for money of your project?
● Is there effective governance and decision-making oversight to ensure success of the project?
1.5 Make your user need and public benefit transparent

● Where can you publish information on how the project delivers positive social outcomes for the public?
● How have you shared your understanding of the user need with the user?
User need

1.6 Does everyone in your team understand the user need?

● Often projects involving data analysis are requested by non-practitioners - people with an ill-defined problem they would like to understand better. Reframing their request as a user need will help you understand what they're asking for and why, or expose what you don't know yet.

1.7 Repeatedly revisit your user need throughout the project.

Consider these questions as the project evolves:

● What is the overall problem you're trying to solve?

1.8

1.9
2. Involve diverse expertise

Score (0-5), recorded at the start of, during, and after the project.

2.1

● Beyond data scientists, who are the relevant policy experts and practitioners in your team?
● What other disciplines and subject matter experts need to be involved? What steps have been taken to involve them?
● What should their roles be? Have you defined who does what in the project?

2.2

● How have you ensured diversity in your team? Having a diverse team helps prevent biases and encourages more creativity and diversity of thought.
● Avoid forming homogenous teams; embrace the diversity of lived experiences of people from different backgrounds. If you find yourself in a homogenous team, challenge it.

2.3 Effective governance structures with experts

● How have you engaged external domain experts in your project (e.g. academics, ethicists, researchers)?

2.5 Transparency

● Wherever appropriate, publish information on expert consultations and the project team structure.
3. Comply with the law

Score (0-5), recorded at the start of, during, and after the project.

3.1

3.2

3.3 Data protection by design and DPIA

3.4 Data analysis or automated decision making must not result in outcomes that lead to discrimination as defined in the Equality Act 2010.

3.6 Ensure effective governance of your data

3.8
4. Review the quality and limitations of the data

Score (0-5), recorded at the start of, during, and after the project.
● Are individuals and/or organisations providing the data aware of how it will be used? If the user is repurposing data for analysis without individual consent, how have you ensured that the new purpose is compatible with the original reason for collection (Article 6(4) GDPR)?
● Do you understand how the data for the project is generated? Remember that depending on where the data came from, the field may not represent what the field name/metadata indicates.
● What processes do you have in place to ensure and maintain data integrity?
● What are the caveats? How will the caveats be taken into account for any future policy or service which uses this work as an evidence base?
● Would using synthetic data be appropriate for the project? Synthetic data is entirely fabricated or abstracted from real data through various processes, e.g. anonymisation or record switching. It is often created with specific features to test or train an algorithm.
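Synthetic data of the kind described above can be fabricated with whatever features you need to exercise an algorithm. A minimal sketch (the field names, value ranges and seed are illustrative assumptions, not part of the framework):

```python
# Hedged sketch of the synthetic data idea above: fabricate records with
# no link to any real individual, shaped to exercise an algorithm.
# Field names, value ranges and the seed are illustrative assumptions.
import random

def make_synthetic_records(n, seed=0):
    """Fabricate n synthetic records; a fixed seed keeps test data reproducible."""
    rng = random.Random(seed)
    return [
        {
            "age_band": rng.choice(["18-24", "25-44", "45-64", "65+"]),
            "region": rng.choice(["North", "South", "East", "West"]),
            "service_used": rng.random() < 0.5,
        }
        for _ in range(n)
    ]

records = make_synthetic_records(100)
```

Because the records are generated from scratch, they carry no personal data, but they also carry none of the real data's statistical quirks, so they are suited to testing pipelines rather than validating conclusions.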
4.2

● Would the proposed use of data be deemed inappropriate by those who provided the data (individuals and/or organisations)?
● Would the proposed use of data for secondary purposes make it less likely that people would want to give that data again for the primary purpose it was collected for?
● How can you explain why you need to use this data to members of the public?
● Does this use of data interfere with the rights of individuals? If yes, is there a less intrusive way of achieving the objective?
● Is the input data suitable and necessary to achieve the aim?

Data minimisation: you must use the minimum data necessary to achieve your desired outcome (Article 5(1)(c) of the GDPR). Personal data should be adequate, relevant and limited to what is necessary in relation to the purposes for which they are processed.

● How can you meet the project aim using the minimum personal data possible?
● Is there a way to achieve the same aim with less identifiable data, e.g. pseudonymised data?
● If using personal data identifying individuals, what measures are in place to control access?

4.3

● How has the data being used to train a model been assessed for potential bias? You should consider:
  ● whether the data might (accurately) reflect biased historical practice that you do not want to replicate in the model (historical bias)
  ● whether the data might be a biased misrepresentation of historical practice, e.g. because only certain categories of data were properly recorded in a format accessible to the project (selection bias)
● If using data about people, is it possible that your model or analysis may be identifying proxy variables for protected characteristics which could lead to a discriminatory outcome? Such proxy variables can potentially be a cause of indirect discrimination; you should consider whether the use of these variables is appropriate in the context of your service (i.e. is there a reasonable causal link between the proxy variable and the outcome you're trying to measure? Do you assess this to be a proportionate means to achieve a legitimate aim in accordance with the Equality Act 2010?)
● What measures have you taken to mitigate bias?
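As a first, crude screen for the proxy variables discussed above, you can measure how strongly each candidate feature tracks a protected characteristic. A minimal sketch, assuming a simple correlation threshold (the threshold, field names and toy data are illustrative; a real fairness assessment needs domain review, not correlation alone):

```python
# Hedged sketch: flag features strongly correlated with a protected
# characteristic as candidate proxy variables. The 0.7 threshold and
# the toy data are illustrative assumptions, not a recommended standard.
from statistics import mean

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs) ** 0.5
    vy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (vx * vy) if vx and vy else 0.0

def flag_proxies(features, protected, threshold=0.7):
    """Return names of features whose |correlation| with the protected
    attribute exceeds the threshold."""
    return [
        name for name, values in features.items()
        if abs(pearson(values, protected)) > threshold
    ]

# Toy example: 'postcode_area' closely tracks the protected attribute.
protected = [0, 0, 1, 1, 0, 1, 0, 1]
features = {
    "postcode_area": [0, 0, 1, 1, 0, 1, 1, 1],  # near-copy: likely proxy
    "visit_count":   [3, 1, 2, 4, 2, 3, 1, 2],  # unrelated
}
print(flag_proxies(features, protected))  # -> ['postcode_area']
```

A flagged feature is only a prompt for the causal and proportionality questions above: correlation cannot tell you whether the link is reasonable or the use is a proportionate means to a legitimate aim.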
4.4 Data anonymisation

If you plan to anonymise or pseudonymise personal data before linking or analysis, make sure you follow the ICO's Anonymisation: managing data protection risk code of practice and document your methods. You can find more technical advice in the UK Anonymisation Network's anonymisation guidance.

If the assumption is that data in the project is anonymised, consider the following:

● How can you demonstrate that the data has been de-identified to the greatest degree possible?
● Can the data be matched to any other datasets that will make individuals easily identifiable? What measures have you taken to mitigate this? Have you considered any determined intruder testing?

4.5 Robust practices

● If necessary, how can you (or external scrutiny) validate that the algorithm is achieving the correct output decision when new data is added?
● How can you demonstrate that you have designed the project for reproducibility?
  ● Could another analyst repeat your procedure based on your documentation? Have they tried?
  ● Have you followed the 3 requirements for reproducibility? (Applying the same logic (code, model or process), to the same data, in the same environment.)
  ● How have you ensured that high quality documentation will be kept?
  ● Have you considered Reproducible Analytical Pipelines (RAP)?
● How confident are you that the algorithm is robust, and that any assumptions are met?
● What is the quality of the model outputs, and how does this stack up against the project objectives?
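Where the questions above point towards pseudonymised rather than directly identifying data, one common technique (not prescribed by the framework) is keyed hashing of identifiers. A minimal sketch, assuming the secret key is stored securely and separately from the dataset; the "case_ref" field is hypothetical:

```python
# Hedged sketch: pseudonymise a direct identifier with a keyed hash
# (HMAC-SHA256). Key handling and the "case_ref" field are illustrative
# assumptions; follow the ICO code of practice for real projects.
import hashlib
import hmac

SECRET_KEY = b"replace-with-a-securely-stored-key"  # assumption: held outside the dataset

def pseudonymise(identifier: str) -> str:
    """Map an identifier to a stable token: consistent for record linkage,
    but not reversible without the secret key."""
    return hmac.new(SECRET_KEY, identifier.encode(), hashlib.sha256).hexdigest()

record = {"case_ref": "case-001", "age_band": "45-64"}
record["case_ref"] = pseudonymise(record["case_ref"])
```

The same identifier always maps to the same token, so records can still be linked for analysis, but the mapping cannot be recovered without the key. Note this is pseudonymisation, not anonymisation: the re-identification questions above still apply.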
4.6

The second is the trained model itself (the result of applying the methodology to the dataset). Releasing this model allows others to scrutinise and test it, and may highlight issues that you can fix as part of your continual improvement.

4.8
5. Evaluate and consider wider policy implications

Score (0-5), recorded at the start of, during, and after the project.

5.3

5.6 Public scrutiny

● Do members of the public / end users of the project have the ability to raise concerns and place complaints about the project? If yes, how? If not, why?

5.7 Share your learnings