
THE DUBAI DATA MANUAL

DATA INVENTORIES
Version 1.2 (November 2020)
Purpose: This document describes how to create a list of datasets
which are collected, managed or maintained by the Entity.
While it may not be possible to create a complete list in one
step, this standard helps Entities ensure that the most
valuable data assets are listed as an initial priority, and then
to expand the Inventory over time.
Who has lead responsibility for implementing this standard in each
Government Entity: Data Administrator, reporting to the Data Leader.

When to use this module of the Dubai Data Manual: After setting up Governance Roles and
Processes, and before the Data Prioritization and Data Cataloguing processes.

Requirements: It is required that you produce an inventory listing datasets
managed by your Entity, together with a brief description of
each.
It is recommended that you review your inventory regularly
(e.g. annually) to identify additional datasets that can be
included.

Document owner: Dubai Data Establishment

Definitions: Terminology in this standard follows the definitions set out in
the Dubai Data Manual Glossary.

Licensing: This document is published under the terms of a Creative
Commons Attribution 4.0 International Licence in order to
facilitate its re-use by other governments and private sector
organisations. In summary this means you are free to share
and adapt the material, including for commercial purposes,
provided that you give appropriate credit to the Dubai Data
Establishment as its owner and do not suggest the Dubai
Data Establishment endorses your use.

Version: Version 1.2, dated 15 November 2020.

© 2020, Dubai Data Establishment. All Rights Reserved. This document forms part of the
Dubai Data Manual, and is freely available for reuse under the terms of a
Creative Commons Attribution 4.0 International License

OVERVIEW
Before deciding what data to publish or share, each Entity needs to make an inventory to
understand what data it holds. This document helps Government Entities undertake this
inventory by setting out:

• What types of data should be considered;
• The process for producing the initial version of the inventory;
• How your Entity can demonstrate that this initial inventory is in conformance with
this module of the Dubai Data Manual;
• The process for expanding and enriching the inventory over time (annual review).

TYPES OF DATA
You should list existing data [1] which your Entity uses, maintains or collects. This could be
data frequently used by the departments within this Entity which might not have a clear
owner. It also includes all data where the Entity is responsible for collecting and updating
the data, even if this work is done by someone else (such as a private company) on behalf
of the Entity.
There is no need to change or rearrange data before adding it to the list, or to collect any
data which is not already held.
Data should be listed in the form of datasets. A dataset consists of data with its metadata.
The metadata provides context and information about the data. Therefore, a dataset
should be an individual object that makes sense as a whole by itself.
A dataset may be a database or spreadsheet. It could also be a map, a published table or
even a paper file.
It may be more practical to count a collection of data, such as a database, as one dataset
or as several. You should count it as one dataset if the data within a database is:
• thematically related
• easiest to describe as a whole
• interrelated.
Otherwise, it likely consists of several datasets. The split depends on the existing and
potential use of the data. It is up to the data owners, who understand the data best, to
judge how data should be listed as datasets.

[1] The requirements, standards and processes described here only refer to structured data. Although the Dubai Data programme in the
longer term also embraces unstructured data, the current focus of Dubai Data initiatives is on structured data only (see Module 1
Overview for more information).


PROCESS FOR DEVELOPING THE INITIAL INVENTORY

The diagram below summarises the process you should follow for developing the initial
version of your Entity’s Data Inventory, illustrating which of the key data governance roles
will normally have lead responsibility for each step of the process. Each step is then
described in more detail.

1. IDENTIFY A DATA REPRESENTATIVE PER DEPARTMENT WITHIN THE ENTITY
Each department or business unit within the Entity should have a named responsible Data
Steward or Specialist who has a good understanding of the data their department
produces, uses and manages. This person should be someone who is in a senior role (or
appointed directly by a senior role) and regularly deals with data and is aware of the
variety of data which exists within their department. For larger departments that handle a
lot of data, this role may be covered by more than one person. Further guidance on the
role of Data Stewards and Data Specialists is set out in Data Governance Roles and
Processes.


2. ENSURE EACH DEPARTMENT PRODUCES A DRAFT LIST OF DATASETS, WITH INITIAL
PRIORITISATION
Under the co-ordination of the Data Administrator, the Data Steward / Specialist within
each department should go through the following process to develop an initial list of what
data they know or expect to be managed within their department, whether or not it
should be Open Data.

2.1 Look at existing data lists

Draw together existing lists of datasets that are collected, maintained or managed by the
Entity. These may include:

• The list of Dubai Registers identified by the Dubai Data Establishment

• All critical information assets listed on your Information Asset Register (as required
by the Information Security Regulation)

• Any datasets listed previously within the Entity for other purposes: for example,
documentation of previous information audits, configuration management
databases or software asset lists.

2.1.1 About Dubai Registers

The Dubai Data Policy defines Registers as “any register of data sets identified by the
Dubai Data Establishment as being essential core data sets, in accordance with Article (7)
of the Dubai Data Law.”
A Register is an authoritative digital list that has been officially recognized by the
Dubai Data Establishment as an official reference register and unique identifier for the
subject it represents.
It acts as a single source of truth for the most important aspects of city life in the
emirate of Dubai: people, economy, land, and assets.
Examples of a register are country codes or a list of officially recognized schools.

2.2 Brainstorm other datasets

Brainstorm any data that your department:
• collects
• stores
• maintains and updates
• commissions externally
You should aim to be as comprehensive as possible, but it is not expected that all datasets
will be captured in the first inventory. You should:
First, identify any datasets that have already been openly published or are currently
shared with other Government Entities.
Second, use the example table below to prompt thoughts about the other data you use.
Focus in particular on high value datasets: data that might be used to help make Dubai
more efficient, provide a service to citizens or businesses (for example, by helping them
catch the right bus or plan their finances), or make government more transparent.


Finally, ‘sanity check’ your list: are there any obvious datasets missing? That is, have you
included all the key datasets that an informed citizen or business in Dubai would expect the
Entity to hold?
You don’t need to include core reference data that you use which is provided by another
department in your Entity or by a different government Entity – when you are confident
that the data is being included within the Inventory of the relevant body. Where it is not
clear where ultimate responsibility for such data lies, please include it within your list for
the time being – so that ownership can be clarified at Step 3 of the process by your Data
Administrator, if necessary, in consultation with the Dubai Data Establishment.

TYPE OF DATASET: REAL-TIME DATA
Definition: Constantly updating data, often high volume and high velocity.
Examples: Weather data; footfall through airport; cars passing toll booths; pollution levels;
real-time location data; electricity usage.
Metro-specific examples:
• Current location of metro trains
• Current passenger numbers

TYPE OF DATASET: OPERATIONAL DATA
Definition: These are the records that are made as part of an organization carrying out its
day-to-day business.
Examples: Entity organization chart; forecast or modelling data; buildings owned/maintained;
budget; spending; staff levels; performance against metrics.
Metro-specific examples:
• Staff numbers
• Duty rota for staff at metro stations

TYPE OF DATASET: REFERENCE DATA
Definition: Authoritative or definitive data about things that rarely changes, and that is
necessary to help understand other data, often because it includes identifiers for those other
things. Often produced by the public sector as a service in itself due to its high importance
and value.
Examples: Timetables; names and locations of schools, hospitals, bus stops, metro stations;
tax codes; land holdings; mapping data; indicators; address data; citizen ID.
Metro-specific examples:
• Metro timetables
• Capacity per metro line
• Metro station locations

TYPE OF DATASET: AGGREGATED DATA
Definition: Analyzed and summarized data, which provides overview information of reference or
administrative data.
Examples: Hospital operation success rates; school exam pass rates; population statistics;
housing; tourist numbers by month / year; nationalities of visitors.
Metro-specific examples:
• Metro usage per month
• Metro trends over time

Note: this table and examples are illustrative; not all types of data may apply to your Entity or be collected. You may
have other types of data not included, and some of the types may overlap.
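
If the inventory is kept in a tool that supports tagging, the four illustrative types could be recorded against each dataset. The sketch below is one possible way to do this in Python; the class name DatasetType, the helper has_type and the combined tag are assumptions made for illustration and are not part of this standard. A flag type is used because, as the note says, some types may overlap.

```python
from enum import Flag, auto


class DatasetType(Flag):
    """The four illustrative dataset types from the table above."""
    REAL_TIME = auto()
    OPERATIONAL = auto()
    REFERENCE = auto()
    AGGREGATED = auto()


# Tagging of some metro examples listed in the table.
metro_examples = {
    "Current location of metro trains": DatasetType.REAL_TIME,
    "Duty rota for staff at metro stations": DatasetType.OPERATIONAL,
    "Metro timetables": DatasetType.REFERENCE,
    "Metro usage per month": DatasetType.AGGREGATED,
}

# Types may overlap, so a dataset can carry more than one tag.
# This combined tag is an invented illustration, not taken from the table.
station_footfall = DatasetType.REAL_TIME | DatasetType.AGGREGATED


def has_type(tags: DatasetType, wanted: DatasetType) -> bool:
    """Return True if `wanted` is among the tags."""
    return bool(tags & wanted)
```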


2.3 Create a table of datasets, including initial prioritisation

For each identified dataset, within a spreadsheet or table, list:

• A name for the dataset. If it doesn’t have an existing name that you are aware of,
choose a short, descriptive name.
• A brief description to clarify what data is being referred to and its scope.
• The department or business unit responsible for managing the data
• The list of data attributes (normally column headings for tabular data) that are
used in this dataset.
• The data steward, if known, i.e. the role or person within the department
responsible for the data
• The data steward’s initial assessment of the extent to which this dataset should be
a priority for initial publication as open or shared data (using the process and
prioritization tool set out in the module of the Dubai Data Manual on prioritization).
Example

An example of how the table might look is shown below.

DATASET | DESCRIPTION / NOTES | RESPONSIBLE DEPARTMENT | DATA ATTRIBUTES | PRIORITY SCORE | DATA STEWARD
TRANSPORT TIMETABLES | All current bus and metro timetables | Operations dept | Bus number, bus stop location, arrival time, departure time, frequency | 90% | Mark Jones
STATION LOCATIONS | Street addresses and geodata of stations and bus stops | Estates dept | Station name, bus stop reference, station address, bus stop address | 76% | Unsure
RTA BUDGET | Annual budget of Dubai Roads & Transport Authority | Finance dept | Expenditure category, value (AED) | 34% | Maria Vanski
...
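
For Entities that prefer to build the table programmatically, the following is a minimal sketch of how the example rows above could be held as records and written out as CSV text. The class name InventoryEntry, its field names and the to_csv helper are assumptions chosen to mirror the column headings; this standard does not prescribe a file format or schema. The steward shown as "Unsure" in the example is recorded here as an empty value.

```python
import csv
import io
from dataclasses import asdict, dataclass, fields
from typing import List, Optional


@dataclass
class InventoryEntry:
    # Field names are assumptions that mirror the example table's columns.
    dataset: str
    description: str
    responsible_department: str
    data_attributes: List[str]
    priority_score: Optional[int]  # percentage; None if not yet assessed
    data_steward: Optional[str]    # None if not yet known


entries = [
    InventoryEntry("Transport timetables", "All current bus and metro timetables",
                   "Operations dept",
                   ["Bus number", "bus stop location", "arrival time",
                    "departure time", "frequency"], 90, "Mark Jones"),
    InventoryEntry("Station locations",
                   "Street addresses and geodata of stations and bus stops",
                   "Estates dept",
                   ["Station name", "bus stop reference", "station address",
                    "bus stop address"], 76, None),
    InventoryEntry("RTA budget", "Annual budget of Dubai Roads & Transport Authority",
                   "Finance dept", ["Expenditure category", "value (AED)"],
                   34, "Maria Vanski"),
]


def to_csv(rows: List[InventoryEntry]) -> str:
    """Serialise entries to CSV text, joining attributes with '; '."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=[f.name for f in fields(InventoryEntry)])
    writer.writeheader()
    for row in rows:
        record = asdict(row)
        record["data_attributes"] = "; ".join(row.data_attributes)
        # Unknown values are written as empty cells.
        writer.writerow({k: ("" if v is None else v) for k, v in record.items()})
    return buffer.getvalue()


csv_text = to_csv(entries)
```

Entries with an empty data_steward are the ones the Data Administrator would need to follow up on at Step 3.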

3. INTEGRATE AND VALIDATE


Each draft list from each department should be reviewed by the head of that department,
and then combined into a single Entity-wide list and reviewed for duplication and
accuracy by the Data Administrator.
The Data Administrator should:

• Repeat steps 2.2 and 2.3 to identify any missing datasets that may not sit within
any individual department or which have been missed.
• Check and ensure dataset names are unique and their descriptions are clear.
• If any information is missing, contact the relevant person to get this resolved.
Ensure all datasets have an assigned Data Steward.
• Ask department data leads to make any corrections necessary, and to confirm that
in their view, the inventory includes all existing open datasets, all obvious datasets
and all high-value datasets held by their unit (see types of data above).


• Identify any core reference data used by the Entity where you are not the owner.
Take these out of the Inventory and forward them to the Dubai Registers team within the
Dubai Data Establishment for resolution. The Dubai Data Establishment will then
decide which Entity is ultimately responsible for maintaining these datasets and
will be the authoritative source of the data. If your Entity is identified, those
datasets will go back in the Inventory.
• Ensure there are no duplicates in the inventory.
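
Some of the checks in this step lend themselves to automation once the department lists are in a common format. The sketch below combines per-department draft lists and reports duplicate names, missing descriptions and missing Data Stewards. The function names integrate and validate, the dictionary keys and the case-insensitive name comparison are all assumptions made for illustration; the review for accuracy and for missing datasets still needs human judgement.

```python
from collections import Counter
from typing import Any, Dict, List

Entry = Dict[str, Any]


def integrate(department_lists: Dict[str, List[Entry]]) -> List[Entry]:
    """Combine per-department draft lists into one Entity-wide list."""
    combined = []
    for department, drafts in department_lists.items():
        for draft in drafts:
            combined.append({**draft, "department": department})
    return combined


def validate(inventory: List[Entry]) -> Dict[str, List[str]]:
    """Report duplicate names, missing descriptions and missing Data Stewards."""
    # Names are compared ignoring case and surrounding spaces (an assumption).
    counts = Counter(e["name"].strip().lower() for e in inventory)
    return {
        "duplicate_names": sorted(n for n, c in counts.items() if c > 1),
        "missing_description": [e["name"] for e in inventory if not e.get("description")],
        "missing_steward": [e["name"] for e in inventory if not e.get("steward")],
    }


# Hypothetical draft lists from two departments.
drafts = {
    "Operations": [
        {"name": "Transport timetables",
         "description": "All current bus and metro timetables",
         "steward": "Mark Jones"},
    ],
    "Estates": [
        {"name": "Station locations",
         "description": "Street addresses and geodata of stations and bus stops",
         "steward": None},
        {"name": "transport timetables ", "description": "", "steward": "Mark Jones"},
    ],
}
issues = validate(integrate(drafts))
```

Each non-empty list in issues points to entries where the relevant person should be contacted before the inventory moves on to prioritisation.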

4. PRIORITISE
The Data Inventory should then follow the Prioritization Process. This will build on the
initial prioritization scores given by Data Stewards and Data Specialists in Step 2.3 above,
ensuring that all datasets listed on the Inventory have been prioritized on a consistent
basis.

For Dubai Registers, the Dubai Data Establishment team will validate the candidates and
prioritize those that qualify.

5. ASSESSMENT AND APPROVAL

The full initial Prioritized Inventory should then be reviewed by the Data Leader who
should verify that:

• the inventory contains a reasonably comprehensive list of data held by the Entity
• no key datasets are missing
• it was carried out by the appropriate staff members
• it contains the prioritization information specified in the Data Prioritization process
The Data Leader should then seek sign-off for the combined, prioritised Inventory at the
highest level of the Entity, before it is sent to the Dubai Data Establishment.


6. DUBAI DATA ESTABLISHMENT SIGN OFF


The Dubai Data Establishment will review whether the Prioritised Data Inventory conforms
to this standard (using the Conformance checks detailed below) and send it back for
additional work if they judge key datasets to be missing or priorities mislabelled.
The Dubai Data Establishment has ultimate responsibility for approval of the prioritised list
and the power to re-prioritise or add datasets it believes are missing.
Once approved, the top 10 datasets (or top set of datasets assessed as doable depending
on ease of publication ranking) should now be Catalogued. Following completion of the
cataloguing process, the Data Inventory should be expanded with the additional
cataloguing information now available for these high priority datasets.
Note: The Cataloguing process may mean that some datasets end up being child datasets
(for example, with only ‘open’ classification attributes). These should be added to the
inventory with the same priority as the parent dataset.
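
As a rough illustration of the selection and of the note on child datasets, the sketch below picks the highest-scoring datasets from an approved inventory and adds a child dataset with its parent's priority. The function names top_priority and add_child, the sample scores and the alphabetical tie-break are assumptions; the sketch ranks on priority score alone and does not model the ease of publication ranking mentioned above.

```python
from typing import Dict, List


def top_priority(scores: Dict[str, int], n: int = 10) -> List[str]:
    """Return up to n dataset names, highest priority score first."""
    # Ties are broken alphabetically so the result is deterministic (an assumption).
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [name for name, _ in ranked[:n]]


def add_child(scores: Dict[str, int], parent: str, child: str) -> None:
    """Add a child dataset with the same priority as its parent."""
    scores[child] = scores[parent]


# Scores taken from the example inventory table; the child name is hypothetical.
scores = {"Transport timetables": 90, "Station locations": 76, "RTA budget": 34}
add_child(scores, "Station locations", "Station locations (open attributes)")
shortlist = top_priority(scores, n=2)
```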

CONFORMANCE
Each Entity’s Data Inventory will be assessed by the Dubai Data Establishment to ensure it
conforms to this module of the Dubai Data Manual. To demonstrate conformance, the
Entity must provide:

• The Data Inventory spreadsheet itself
• A priority ranking of the datasets on the Inventory that follows the process
specified in the module of the Dubai Data Manual on Prioritization
• Written confirmation from the Entity’s Data Leader that:
- they have reviewed and approved this list;
- it includes all obvious and high value datasets (e.g. business critical
data, operational, reference, real-time and aggregated data), already
published data and data shared with other organizations, and all existing
identified lists of this Entity’s data (including identified Dubai Registers and
the Entity’s Information Asset Register datasets); and
- the process was followed and checked by the Data Administrator.
• A list of all departments involved in creating the draft lists, with the name / role of
the data lead responsible for each.
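
Before submitting, an Entity might want a quick self-check that each piece of evidence listed above is in place. The sketch below is one hypothetical way to track this; the Submission class, its field names, the confirmation keys and the placeholder file name are all assumptions, and the check is no substitute for the Dubai Data Establishment's own assessment.

```python
from dataclasses import dataclass, field
from typing import Dict, List

# Hypothetical keys for the three points the Data Leader confirms in writing.
REQUIRED_CONFIRMATIONS = (
    "reviewed_and_approved",
    "covers_required_datasets",
    "process_checked_by_data_administrator",
)


@dataclass
class Submission:
    inventory_file: str = ""  # name of the inventory spreadsheet
    priority_ranking: List[str] = field(default_factory=list)  # highest priority first
    leader_confirmations: Dict[str, bool] = field(default_factory=dict)
    department_leads: Dict[str, str] = field(default_factory=dict)  # department -> data lead


def missing_evidence(submission: Submission) -> List[str]:
    """List the evidence items that are still missing from a submission."""
    gaps = []
    if not submission.inventory_file:
        gaps.append("inventory spreadsheet")
    if not submission.priority_ranking:
        gaps.append("priority ranking")
    for key in REQUIRED_CONFIRMATIONS:
        if not submission.leader_confirmations.get(key, False):
            gaps.append(f"Data Leader confirmation: {key}")
    # Every listed department needs a named data lead.
    leads = submission.department_leads
    if not leads or not all(leads.values()):
        gaps.append("departments and data leads")
    return gaps


draft = Submission(
    inventory_file="inventory.xlsx",  # placeholder file name
    priority_ranking=["Transport timetables", "Station locations", "RTA budget"],
    leader_confirmations={"reviewed_and_approved": True, "covers_required_datasets": True},
    department_leads={"Operations": "Mark Jones", "Estates": ""},
)
gaps = missing_evidence(draft)
```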


ANNUAL REVIEW: EXPANDING THE INVENTORY
The inventory process should be repeated at regular intervals (for example, annually) in
order to:
• Identify new datasets managed by the Entity (these could be completely new or
extensions and reformulations of existing data)
• Respond to user demand for data
o For Dubai Registers candidates, the inventory update should be immediate.
o Although Dubai Registers might be thought of as consolidated datasets with
limited scope for extension, Dubai’s social, technical and geographical
landscape brings continuous innovation and creativity. As a result,
Government Entities develop new datasets that can be considered Dubai
Registers even where no such dataset existed previously. Examples include
datasets identifying licences for drone piloting, device licences for
self-driving vehicles, new land or island demarcation, new roads, etc.
• Review the existing inventory in light of publication and sharing: both lessons
learned and feedback received from other Entities, the Dubai government, external
stakeholders and internal staff.
The process to follow should be similar, but instead of listing all possible datasets it should
involve using the existing inventory as a basis and using each step of the process to see
how the inventory can be expanded or amended. Expansion should cover both:
• Extending the inventory, by adding new datasets
• Enriching the inventory, by increasing the proportion of datasets that have been
catalogued (using metadata, quality standards and common data formats).
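
The two kinds of expansion can be tracked with simple measures from one review to the next. The sketch below reports which datasets are new since the previous inventory (extension) and what share of the current inventory has been catalogued (enrichment). The function names, the sample data and the choice of a plain proportion as the enrichment measure are assumptions for illustration only.

```python
from typing import Dict, Set


def new_datasets(previous: Set[str], current: Set[str]) -> Set[str]:
    """Datasets in the current inventory that were not in the previous one."""
    return current - previous


def catalogued_share(catalogued: Dict[str, bool]) -> float:
    """Proportion of datasets in the inventory that have been catalogued."""
    if not catalogued:
        return 0.0
    return sum(catalogued.values()) / len(catalogued)


# Previous inventory, using the names from the example table.
previous = {"Transport timetables", "Station locations", "RTA budget"}

# Current inventory: dataset name -> whether it has been catalogued.
current_status = {
    "Transport timetables": True,
    "Station locations": True,
    "RTA budget": False,
    "Drone pilot licences": False,  # hypothetical new dataset
}

added = new_datasets(previous, set(current_status))
share = catalogued_share(current_status)
```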
