Data Governance
ORIGINAL
Microsoft 365
DATA GOVERNANCE
Electronic Version is Current—Uncontrolled Hard Copy Valid Only at the Time of Printing
Security Class
Retention
Next Review Date
Storage Type
OMF
Approved
All major changes to the document at this current revision are recorded below.
(Does not include correction of typos or formatting changes)
0.2
0.3
0.3
0.4
Data governance (DG) is the process of managing the availability, usability, integrity and
security of the data in enterprise systems, based on internal data standards and policies
that also control data usage. Effective data governance ensures that data is consistent
and trustworthy and doesn't get misused. It's increasingly critical as organizations face
new data privacy regulations and rely more and more on data analytics to help optimize
operations and drive business decision-making.
Poor data governance can also hamper regulatory compliance initiatives, which could
cause problems for companies that need to comply with new data privacy and
protection laws, such as the European Union's GDPR and the California Consumer
Privacy Act (CCPA). An enterprise data governance program typically results in the
development of common data definitions and standard data formats that are applied in
all business systems, boosting data consistency for both business and compliance uses.
A key goal of data governance is to break down data silos in an organization. Such silos
commonly build up when individual business units deploy separate transaction
processing systems without centralized coordination or an enterprise data architecture.
Data governance aims to harmonize the data in those systems through a collaborative
process, with stakeholders from the various business units participating.
Another data governance goal is to ensure that data is used properly, both to avoid
introducing data errors into systems and to block potential misuse of personal data
about customers and other sensitive information. That can be accomplished by creating
uniform policies on the use of data, along with procedures to monitor usage and
enforce those policies.
Besides more accurate analytics and stronger regulatory compliance, the benefits that
data governance provides include improved data quality; lower data management costs;
and increased access to needed data for data scientists, other analysts and business
users. Ultimately, data governance can help improve business decision-making by giving
executives better information. Ideally, that will lead to competitive advantages and
increased revenue and profits.
In most organizations, various people are involved in the data governance process. That
includes business executives, data management professionals and IT staffers, as well as
end users who are familiar with relevant data domains in an organization's systems.
These are the key participants and their primary governance responsibilities.
Chief data officer. The chief data officer (CDO), if there is one, often is the senior
executive who oversees a data governance program and has high-level responsibility for
its success or failure. The CDO's role includes securing approval, funding and staffing for
the program, playing a lead role in setting it up, monitoring its progress and acting as
an advocate for it internally. If an organization doesn't have a CDO, another C-suite
executive usually will serve as an executive sponsor and handle the same functions.
Data governance manager and team. In some cases, the CDO or an equivalent
executive -- a director of enterprise data management, for example -- may also be the
hands-on data governance program manager. In others, organizations appoint a data
governance manager or lead specifically to run the program. Either way, the program
manager typically heads a data governance team that works on the program full time.
Sometimes more formally known as the data governance office, it coordinates the
Data governance committee. The governance team usually doesn't make policy or
standards decisions, though. That's the responsibility of the data governance committee
or council, which is primarily made up of business executives and other data owners.
The committee approves the foundational data governance policy and associated
policies and rules on things like data access and usage, plus the procedures for
implementing them. It also resolves disputes, such as disagreements between different
business units over data definitions and formats.
Data stewards. The responsibilities of data stewards include overseeing data sets to
keep them in order. They're also in charge of ensuring that the policies and rules
approved by the data governance committee are implemented and that end users
comply with them. Workers with knowledge of particular data assets and domains are
generally appointed to handle the data stewardship role. That's a full-time job in
some companies and a part-time position in others; there can also be a mix of IT and
business data stewards.
The initial step in implementing a data governance framework involves identifying the
owners or custodians of the different data assets across an enterprise and getting them
or designated surrogates involved in the governance program. The CDO, executive
sponsor or dedicated data governance manager then takes the lead in creating the
program's structure, working to staff the data governance team, identify data stewards
and formalize the governance committee.
Data mapping and classification. Mapping the data in systems helps document data
assets and how data flows through an organization. Different data sets can then be
classified based on factors such as whether they contain personal information or other
sensitive data. The classifications influence how data governance policies are applied to
individual data sets.
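The classification step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the column-name hints, the three labels ("sensitive", "personal", "general") and the sample data sets are hypothetical, and a real program would take its rules from its own governance policy rather than from column names alone.

```python
from dataclasses import dataclass, field

# Hypothetical column-name hints; not taken from any standard or product.
PERSONAL_HINTS = {"name", "email", "phone", "address", "birth_date"}
SENSITIVE_HINTS = {"salary", "diagnosis", "card_number"}


@dataclass
class DataSet:
    name: str
    source_system: str
    columns: list[str]
    feeds: list[str] = field(default_factory=list)  # downstream data sets (data flow)


def classify(data_set: DataSet) -> str:
    """Return a classification label based on column names only."""
    cols = {c.lower() for c in data_set.columns}
    if cols & SENSITIVE_HINTS:
        return "sensitive"
    if cols & PERSONAL_HINTS:
        return "personal"
    return "general"


# Illustrative data map: which data sets exist, where they live, what they feed.
data_sets = [
    DataSet("crm_contacts", "CRM", ["contact_id", "name", "email"], feeds=["sales_mart"]),
    DataSet("payroll", "HR", ["employee_id", "salary"]),
    DataSet("product_list", "ERP", ["sku", "description"]),
]

classification = {ds.name: classify(ds) for ds in data_sets}
```

The resulting labels could then drive which governance policies apply to each data set, for example stricter access rules for anything labelled "sensitive".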
Data catalog. Data catalogs collect metadata from systems and use it to create an
indexed inventory of available data assets that includes information on data lineage,
search functions and collaboration tools. Information about data governance policies
and automated mechanisms for enforcing them can also be built into catalogs.
Consultant Anne Marie Smith details the key steps for building a data catalog.
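As a rough illustration of what a catalog holds, the sketch below models an indexed inventory with a keyword search and a lineage lookup. The class names, fields and sample entries are assumptions made for this example; commercial catalog tools expose their own, different interfaces.

```python
from dataclasses import dataclass, field


@dataclass
class CatalogEntry:
    name: str
    description: str
    owner: str
    upstream: list[str] = field(default_factory=list)  # lineage: direct sources
    policies: list[str] = field(default_factory=list)  # governance policies attached


class DataCatalog:
    def __init__(self) -> None:
        self._entries: dict[str, CatalogEntry] = {}

    def register(self, entry: CatalogEntry) -> None:
        self._entries[entry.name] = entry

    def search(self, term: str) -> list[str]:
        """Return names of entries whose name or description contains the term."""
        term = term.lower()
        return sorted(
            e.name for e in self._entries.values()
            if term in e.name.lower() or term in e.description.lower()
        )

    def lineage(self, name: str) -> list[str]:
        """Return all upstream entries, nearest first (breadth-first walk)."""
        seen: list[str] = []
        queue = list(self._entries[name].upstream)
        while queue:
            current = queue.pop(0)
            if current in seen:
                continue
            seen.append(current)
            if current in self._entries:
                queue.extend(self._entries[current].upstream)
        return seen


catalog = DataCatalog()
catalog.register(CatalogEntry("crm_contacts", "Customer contact records", "Sales"))
catalog.register(CatalogEntry("orders", "Customer order transactions", "Finance"))
catalog.register(CatalogEntry(
    "sales_mart", "Reporting mart for revenue analysis", "BI team",
    upstream=["orders", "crm_contacts"], policies=["mask email for analysts"],
))
```

Here `catalog.search("customer")` finds the two source data sets, and `catalog.lineage("sales_mart")` lists where the reporting mart gets its data.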
To the extent that data governance may impose strictures on how data is handled and
used, it can become controversial in organizations. A common concern among IT and
data management teams is that they'll be seen as the "data police" by business users if
they lead data governance programs. To promote user buy-in and avoid resistance to
In a report published in October 2019, Gartner analyst Saul Judah listed these seven
foundations for successfully governing data and analytics applications:
Often, the early steps in data governance efforts can be the most difficult because
different parts of an organization typically have diverging views of key
enterprise data entities, such as customers or products. These differences must be
resolved as part of the data governance process -- for example, by agreeing on
common data definitions and formats. That can be a fraught and fractious undertaking,
which is why the data governance committee needs a clear dispute-resolution
procedure.
Other common challenges that organizations face on data governance include the
following.
Demonstrating its business value. That often starts at the very beginning: "It can be a
real struggle to get your data governance initiative approved in the first place," data
governance consultant and trainer Nicola Askham wrote in a September 2019 blog post.
To help build a business case for a data governance program, Askham recommended
that proponents document data quality horror stories and tie the expected outcomes of
the program to specific corporate priorities.
Governing big data. The deployment of big data systems also adds new governance
needs and challenges. Data governance programs traditionally focused on structured
data stored in relational databases, but now they must deal with the mix of structured,
unstructured and semi-structured data that big data environments typically contain, as
well as a variety of data platforms, including Hadoop and Spark systems, NoSQL
databases and cloud object stores. Also, sets of big data are often stored in raw form in
data lakes and then filtered as needed for analytics uses.
Data governance programs are underpinned by several other facets of the overall data
management process. Most notably, that includes the following:
Data governance tools are available from various vendors. That includes major IT
vendors, such as IBM, Informatica, Information Builders, Oracle, SAP and SAS Institute, as
well as data management specialists like Adaptive, ASG Technologies, Ataccama,
Collibra, Erwin, Infogix and Talend. In most cases, the governance tools are offered as
part of larger suites that also incorporate metadata management features and data
lineage functionality.
The optimal approach to a data governance framework includes a program team, a data
governance council and stewards. Expert Anne Marie Smith explains what role each
group plays.
Many organizations struggle with the design and implementation of their data
governance programs for a variety of reasons. They may not understand the need for an
enterprise approach to data governance and may attempt to implement policies via a
line of business, an application or a functional area division.
A set of data stewardship teams from each business area that helps design
the policies and practices with the data governance team and implement
them in conjunction with the data custodians from IT. Each team of data
stewards is directed by a lead business data steward. The leads then form a
data stewardship coordinating group that settles cross-group differences with
policies and practices and refines definitions and other metadata as needed.
Although these groups may use different names in other organizations, this structure
has been proven to be the most successful approach to implementing enterprise data
governance programs.
Back in the day, business analysts had to go cap in hand to the IT department because
they didn't know how to navigate an IMS or IDMS database and wouldn't have been
granted access even if they could. The IT department printed out monthly reports and
distributed them, like Moses descending from the mountain with tablets of stone. There
was no real notion of, or requirement for, data governance.
With the advent of the PC, the balance of power shifted radically. Suddenly, business
users had access to spreadsheets and could create their own calculations and analyses.
Data analytics could now be done by business analysts and executives. But without
agreement on the legitimacy of the data sources, chaos ensued. The mushrooming data
silos hampered business intelligence (BI) and analytics efforts, and the first glimmers of
a need to better govern data came to light.
Sure, the original data was still stored in different applications and formats. But with
enough effort, a data warehouse could be coaxed into making sense of it all by
providing consolidated data in multiple dimensions, such as customer, product, asset
and location. However, to actually produce consistent sets of data, the inconsistencies of
the underlying systems had to be resolved.
Master data management (MDM) was born and, alongside it, the concept of a data
governance strategy arose. Business users were encouraged or cajoled into deciding
which classifications of customers and products were "golden records" to be held aloft
across the enterprise and which were to be cast into the wilderness of department-
specific, local terminology. This frequently was -- and still is -- an acrimonious process,
with different departments and data owners arguing over the best way to classify data.
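One common way to settle such arguments is a survivorship rule: for each attribute, keep the value from the most trusted source that actually has one. The sketch below assumes a hypothetical source priority order and made-up records; it is one possible rule, not a description of how any particular MDM product builds its golden records.

```python
# Hypothetical priority order, as might be agreed by the governance committee.
SOURCE_PRIORITY = ["MDM", "CRM", "Billing"]


def golden_record(records):
    """Merge records for one entity, preferring values from higher-priority sources.

    Empty values are skipped, so a lower-priority source can fill the gaps.
    """
    rank = {source: i for i, source in enumerate(SOURCE_PRIORITY)}
    ordered = sorted(records, key=lambda r: rank[r["source"]])
    merged = {}
    for record in ordered:
        for key, value in record.items():
            if key != "source" and value and key not in merged:
                merged[key] = value
    return merged


# Two departmental views of the same customer (illustrative data).
records = [
    {"source": "Billing", "name": "ACME Corp.", "segment": "SMB", "country": "US"},
    {"source": "CRM", "name": "Acme Corporation", "segment": "", "country": ""},
]
golden = golden_record(records)
```

In this example the CRM name wins because CRM ranks above Billing, while the segment and country are taken from Billing because the CRM record leaves them blank.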
It seems clear that, in a lot of companies, the freedom fighters are now in the ascendant.
A sign of this is the growing market for self-service data preparation tools. These
products can access data from a wide variety of sources, including traditional databases,
big data systems, packaged business applications, Excel and systems outside the
corporate firewall. They enable some data quality techniques, such as data profiling, and
empower business users and data scientists to set up data transformations and
manipulate data to their heart's content. Extracts, transformations and data scrubbing
can also be automated via repeatable workflow processes.
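Data profiling, mentioned above, amounts to summarising what a column actually contains before anyone relies on it. A minimal sketch, assuming a column is available as a plain Python list and that `None` and the empty string both count as missing:

```python
from collections import Counter


def profile_column(values):
    """Summarise a column: row count, missing values, distinct values, top value."""
    non_null = [v for v in values if v not in (None, "")]
    counts = Counter(non_null)
    return {
        "rows": len(values),
        "missing": len(values) - len(non_null),
        "distinct": len(counts),
        "most_common": counts.most_common(1)[0][0] if counts else None,
    }


# Illustrative column with inconsistent casing and gaps.
country = ["US", "us", "DE", None, "US", ""]
profile = profile_column(country)
```

Even this small profile surfaces two typical quality problems: missing values, and "US" and "us" being counted as different values.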
Such a market simply wouldn't exist if corporate data warehouses and MDM systems
were doing their job. Data quality checks and transformations are supposed to happen
before data is fed into a data warehouse. The trouble is that the data warehouse has
been stretched beyond its natural limits. Data now comes from so many sources and in
such volumes that traditional data management approaches are breaking down.
E-commerce systems may generate web traffic logs of such size that normal databases
can't handle the processing. Sensors on vehicles and machinery -- airplanes, cars, smart
meters, elevators, pipelines and so forth -- transmit huge volumes of streaming data. All
this is in addition to corporate transaction data, plus data from business partners and
data brokers.
With that much data coming at you, who has time for meetings to discuss the merits of
different customer classification hierarchies or the attributes of a particular data asset as
part of an MDM project?
Nonetheless, corporations need to take back control of this fast-flowing stream of data
if they are to make sense of it for enterprise business analytics uses -- also to help boost
data security and compliance with GDPR, the California Consumer Privacy Act and other
new laws that regulate the use of personal data by companies.
Developing a data governance strategy may not be a sexy topic, but it's at the heart of
what needs to be done, from data collection and processing to analytics
applications. Data lakes can become data swamps if there's no way to peer into their
depths and bring some order to the data pooled there. And all the pretty charts that
analysts create with BI and analytics tools mean nothing if you can't agree on what the
underlying data means -- and whether it's trustworthy.
In the absence of some governance structure, we'll just be back to the old days, with
analysts waving their charts at each other and arguing over whose data is correct.
Putting the data genie back in the bottle is difficult, but in all too many organizations,
things now feel chaotic rather than managed. It's not about imposing rules from the top
down but about embedding data management and analytics discipline throughout the
layers of the organization. Otherwise, valuable business insights may be overlooked, and
potential competitive advantages lost.
Not having a single source of truth can be a huge issue for data professionals. But
strong data governance policies can prevent data silos and lead to better-quality data.
One of the leading causes of data quality problems is siloed data. It produces duplicate
data and, in many cases, different understandings of important business processes.
That's why it's important to make breaking down data silos a central component of
any data quality initiative.
Cultural issues. Business units often see data they control as an asset that increases
their importance to the organization. A less dysfunctional reason for a business unit to
keep data siloed is a concern that sharing the information may lead to a loss of control
over data quality, or to unwanted changes to its systems if data definitions are
modified to meet enterprise needs.
If you have been in IT for any length of time, there's a good chance you understand
the problems data silos cause, including:
Different definitions for the same data elements in separate systems, leading
to inconsistencies and poor data quality
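Conflicting definitions of this kind can be detected mechanically once metadata from each system has been collected. The sketch below assumes the metadata is available as simple tuples; the system names, element names and definitions are invented for illustration.

```python
from collections import defaultdict

# Hypothetical metadata rows: (system, element, data type, business definition).
definitions = [
    ("CRM", "customer_id", "string", "Account identifier assigned at signup"),
    ("Billing", "customer_id", "integer", "Identifier of the paying entity"),
    ("CRM", "country", "string", "ISO 3166-1 alpha-2 code"),
    ("Billing", "country", "string", "ISO 3166-1 alpha-2 code"),
]


def find_conflicts(rows):
    """Return elements that are defined differently in at least two systems."""
    by_element = defaultdict(set)
    for _system, element, dtype, meaning in rows:
        by_element[element].add((dtype, meaning))
    return sorted(e for e, variants in by_element.items() if len(variants) > 1)


conflicts = find_conflicts(definitions)
```

Here only `customer_id` is flagged, because the two systems disagree on both its type and its meaning, while `country` is defined identically in both.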
Reducing the number of data silos and their negative impact will require that you
develop both prevention and corrective-action strategies. Here are some steps you
should take to do so.
First, build a culture of viewing all corporate data as an enterprise asset. Promote the
benefits of good data quality, data sharing and a single source of truth across IT and
business units.
Once you identify a data silo, meet with the owner, evaluate the data's usage and
determine if the silo should be consolidated, replaced or treated as a system of record
for the organization.
If you don't already have one, evaluate the need for a data warehouse environment to
consolidate data from multiple operational sources. Data warehouses help you to tightly
govern and easily share the information you are duplicating from operational systems.
Also, evaluate master data management software and data quality tools to facilitate
effective data governance and help you build a single source of truth. The alternatives
range from open-source offerings like Talend to commercial platforms from IBM,
Informatica, Information Builders, Oracle, SAP, SAS and other vendors.
A data governance policy outlines the roles, rules, processes and best practices that an
organization follows to ensure the quality and proper use of its data; it also can assist in
breaking down data silos. A suggested outline includes a statement of purpose that
defines the governance policy's mission and goals, backed up by executive sponsor
signatures. Additionally, a policy should cover the following aspects of a data
governance program:
Data creation. There should be policies that control the ownership, storage and
definition of new data elements. These policies should also include security and
Data access. Data request procedures should cover security and regulatory adherence
reviews, read vs. update, data access procedures and tools, and performance impact
metrics.
Data integrity. Institute procedures that protect the quality and lineage of existing data,
including guidelines on modifying data definitions and updating data elements.
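The read vs. update distinction in the data access item can be expressed as a small policy table that access requests are checked against. This is a sketch only: the roles, data set names and the `POLICY` structure are hypothetical, and a real implementation would also record the security and regulatory reviews the policy calls for.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AccessRequest:
    requester_role: str
    data_set: str
    mode: str  # "read" or "update"


# Hypothetical policy table: data set -> access mode -> roles allowed.
POLICY = {
    "crm_contacts": {"read": {"analyst", "steward"}, "update": {"steward"}},
    "payroll": {"read": {"hr_manager"}, "update": {"hr_manager"}},
}


def is_allowed(request: AccessRequest) -> bool:
    """Deny by default: unknown data sets, modes or roles are refused."""
    modes = POLICY.get(request.data_set, {})
    return request.requester_role in modes.get(request.mode, set())


can_read = is_allowed(AccessRequest("analyst", "crm_contacts", "read"))
can_update = is_allowed(AccessRequest("analyst", "crm_contacts", "update"))
```

Denying by default means a data set that has not yet been brought under the policy cannot be accessed by accident.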
There are many tools on the market now that can help you with your data governance
initiative. In particular, there are numerous products that hold and manage your data
glossary, data catalogue and data dictionaries. These have proved very popular and the
number of players in the market has increased over the last few years.
If you are lucky enough to have the budget to purchase such a tool, please make sure
that you're well prepared so that you can choose the right vendor for your
organisation’s needs. If you select the wrong tool, it won’t help your data governance
initiative and, even worse, it could distract from or even derail it.
Let's look at the most common pitfalls first. The three main ones that I've seen are:
Firstly, there's little or no business involvement early in the process. Many people wait
until the tool is purchased and even being implemented before they involve business
users. In my experience, this is a huge problem and should be avoided at all costs.
I have seen a few implementations go wrong because the eventual business users were
not involved in selection. Think about it from their point of view. They have not asked
for such a tool, nor does it help them to do an existing task more quickly or easily. So,
when you come to implement your shiny new tool, the business users feel they're
having some IT tool foisted upon them. Generally, they do not react well and I can recall
one instance when the whole implementation had to go back to the drawing board.
Once the business users understood what they needed to use the tool for, their
requirements were vastly different from what had been delivered.
The second pitfall is being unclear on what you require of the tool. Often someone has
latched on to the fact that a tool could help them and dived straight in and bought one
without being really clear what they want the tool to do. Please make sure that you take
time to work out what your objective is from having the tool. Once you've worked that
out, progress to defining some clear detailed requirements (just a requirement to have a
data glossary is not sufficient).
I've seen implementations of tools fail or the wrong tool selected because of vague or
overly complex requirements (just because the tool does it, does not mean that your
business really needs it).
Now that we've looked at the main pitfalls, I want to share a few questions that are
useful to ask vendors to check that they're a good fit for you and your data governance
initiative. Since I've highlighted the need for objectives and clear requirements, the first
question to ask is how their tool meets your requirements. Notice that I say “how does it
meet” and not “does it meet”. If you ask, “does your tool meet our requirements”, most
vendors will say yes.
What you want to know is how. Is it simply out-of-the-box functionality that is ready to
go, or will there have to be manual workarounds or, even worse, a lot of customisation or
configuration in the tool that may make future upgrades very difficult for you?
Secondly, I'd ask what implementation support will be given to you. You have to
remember that these tools are, by their very nature, flexible, and you need to set them up in a
way that works for your business. This means that you will need some support from the
vendor. So, make sure that you are very clear upfront about what kind of support they
will be giving you. Knowing what is and isn't covered will prevent any nasty surprises in
the future.
Thirdly, ask what training they provide for both you and the team implementing it.
Perhaps they may even support training your business users on how to use their tool.
Definitely work out what training you want and ask what training is available.
I’ve said it already but please remember that to successfully choose the right tool for
your company, it is absolutely vital that you are very clear on what you need the tool to
do before starting a selection process. Clear requirements should be the start of the
process.
Make sure that you understand not only the support arrangements of the tool (as I
mentioned in the last section) but also the upgrade path of the tool. I've come across
more than one situation where an organisation has customised a tool to such a degree
that it is not possible to follow the upgrade path. On one occasion they needed a project
to redesign and implement a new data glossary to be able to upgrade and take
advantage of the new functionality.
Lastly, when you're working with vendors, going through workshops or maybe an RFP
process, you are going to meet a whole variety of personalities. Bear in mind that these
are not the people you will be working with if you select the tool. They will not be
around for the implementation, and the ongoing support will be provided by other
people. So don't let yourself be swayed just because you like or dislike their sales team!
Just remember that such tools can be great enablers of your data governance initiative,
but they need to be put in place once the initiative is already under way,
so that you are very clear on what you want.