This service was carried out on <##> distinct <Customer> <application name> systems:
- <System #1> (<System #1 location or description>)
- <System #2> (<System #2 location or description>)
for the data object <Object>.
From each system, <Customer> provided SAP with a representative sample of approximately <##>% of the
overall <Object> data. SAP then performed an analysis on this data with regard to:
- completeness (measured by fill rates of fields)
- use of placeholders (invalid or inaccurate values)
- correct formats (measured by validity of patterns)
- duplication (multiple records representing the same business entity)
Key Findings
The overall results show there are some significant issues in regard to <to be completed …>
<Summarize the use of data within the systems – fill rates, columns used, keys or patterns used, etc.>
<Summarize duplication within each source system and across all data sets combined.>
<Summarize the potential impact of the identified data quality issues on data-based decisions or proposed
changes to business processes.>
Analysis Disclaimer
If the full scope of the project was not completed, provide those caveats here. Else remove this section.
Due to limited <Customer> resource availability and abbreviated access to <Customer> data, this data
quality assessment was halted after about eight hours of analysis. The information provided in this executive
summary is based upon the completed analysis. However, not all information was gathered. To demonstrate
the types of analysis available, some fictional reports are provided within this document and are noted as
such.
SCOPE
Outline the systems and data reviewed during this assessment.
This service was carried out on <##> distinct <Customer> <application name> systems:
- System #1 (<System #1 location or description>)
- System #2 (<System #2 location or description>)
for the data object <Object>.
Following our initial analysis, it was agreed up-front between <Customer> and SAP that <##> relevant tables
from the source system(s) would be analyzed. The tables included in the analysis were:
The data provided represents approximately <##>% of the total dataset from each source system and is
considered representative of the overall data quality. <Table> represents the main table and relates the key
<Object> field to the standard company name, address, and contact fields (among others). The remaining
tables contain additional information about the <object> and also include lookup and check tables.
Each source system was analyzed separately using the SAP BusinessObjects toolset, and a comparison
was made of the overall data quality with respect to the following criteria:
- Field fill rates
- Placeholders
- Patterns
- Duplication
<Note that whereas it makes sense to analyze the fill rates on all the tables concerned, it only makes sense
to analyze the use of placeholders and patterns on the tables that comprise the main attributes of the
<Object> object. These tables are touched by end users while creating and maintaining data.>
The field fill rate analysis shows which of the fields in the respective tables are actually used and to what
extent. This can provide useful information with regard to the underlying business processes and
demonstrates quite clearly which tables and fields are actually required when making data-based decisions,
for example when determining the source of truth for a data object. Note that a fill rate analysis only takes into
consideration whether a field actually contains a value – it does not provide an indication of whether the
value is reasonable or correct.
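As an illustration only, a fill rate calculation can be sketched in a few lines of Python. This is not the SAP BusinessObjects implementation; the function name `fill_rates`, the sample records, and the rule that whitespace-only values count as empty are assumptions made for this example.

```python
def fill_rates(records, fields):
    """Return, per field, the share of records (0.0 to 1.0) holding a non-empty value.

    Assumption: None, the empty string, and whitespace-only strings count as empty.
    """
    total = len(records)
    rates = {}
    for field in fields:
        filled = sum(
            1
            for rec in records
            if rec.get(field) is not None and str(rec.get(field)).strip() != ""
        )
        rates[field] = filled / total if total else 0.0
    return rates


# Hypothetical sample records, invented for this sketch.
sample = [
    {"name": "Acme Ltd", "post_code": "69160", "phone": ""},
    {"name": "Beta GmbH", "post_code": "11111", "phone": None},
    {"name": "", "post_code": "SW1A 1AA", "phone": "999"},
    {"name": "Delta Inc", "post_code": None, "phone": "   "},
]

rates = fill_rates(sample, ["name", "post_code", "phone"])
for field, rate in rates.items():
    print(f"{field}: {rate:.0%}")
```

Note that the values "11111" and "999" count as filled here, which is exactly why a fill rate alone says nothing about whether a value is reasonable or correct.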
The analysis of the use of placeholders serves primarily to find out whether values entered into fields are
indeed valid, or merely entered to complete a transaction. For example, a data record must be entered
but not all business-required data is available. Instead of a valid value, a placeholder is specified at the time
of data creation. The person entering the new record may intend to correct the value later, but often this is
forgotten. Typical examples that are often found here include post codes, whereby a combination such as
“11111” is commonly found, or telephone numbers where a dummy number or simply “999” is entered.
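A simple placeholder check along these lines could look like the following Python sketch. The list `KNOWN_PLACEHOLDERS` and the repeated-character rule are assumptions chosen for illustration, not the rules applied by the SAP toolset. A flagged value is only a candidate placeholder and still needs review.

```python
import re

# Hypothetical list of literal placeholder values; a real assessment would
# derive this list from the data and from discussions with the business users.
KNOWN_PLACEHOLDERS = {"999", "n/a", "na", "unknown", "tbd", "xxx"}


def is_placeholder(value):
    """Return True if the value looks like a placeholder rather than real data."""
    text = str(value).strip().lower()
    if not text:
        return False  # empty values are covered by the fill rate analysis
    if text in KNOWN_PLACEHOLDERS:
        return True
    # Drop common separators, then flag one character repeated three or
    # more times, such as "11111" or "000-0000".
    compact = re.sub(r"[\s\-./()]", "", text)
    return len(compact) >= 3 and len(set(compact)) == 1


candidates = ["11111", "999", "69160", "000-0000", "SW1A 1AA", ""]
flagged = [v for v in candidates if is_placeholder(v)]
print(flagged)
```

The repeated-character rule is deliberately broad, so the share of flagged values per field is best read as an upper bound on placeholder use.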
The pattern analysis is slightly more complex and consists of establishing whether valid values are entered
that match predefined formats, standards, or patterns. An example of this can be the format of post codes
depending on the country where a business partner is located. In Germany, for example, the only valid
format is a post code with 5 numeric digits (e.g. “69160”), whereas in the UK there are 6 possible formats:
A9 9AA
A99 9AA
A9A 9AA
AA9 9AA
AA99 9AA
AA9A 9AA
(Note: ‘A’ signifies an alphabetic letter and ‘9’ signifies a numeric digit. Variations may also include the use
of a dash (-) or a single space.) Other countries may have similar or different valid formats.
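The pattern notation above lends itself to a short Python sketch: each value is translated into its 'A'/'9' pattern and compared with the patterns allowed for the country. The names `to_pattern`, `VALID_PATTERNS`, and `matches_country_format` are hypothetical, and only the German and UK formats listed above are included. The dash variation is not handled here.

```python
from collections import Counter


def to_pattern(value):
    """Translate a value into pattern notation: letters to 'A', digits to '9'."""
    out = []
    for ch in value.strip():
        if ch.isascii() and ch.isalpha():
            out.append("A")
        elif ch.isascii() and ch.isdigit():
            out.append("9")
        else:
            out.append(ch)  # spaces, dashes, and other characters stay as they are
    return "".join(out)


# Valid post code patterns per country, taken from the formats listed above.
VALID_PATTERNS = {
    "DE": {"99999"},
    "GB": {"A9 9AA", "A99 9AA", "A9A 9AA", "AA9 9AA", "AA99 9AA", "AA9A 9AA"},
}


def matches_country_format(value, country):
    """Return True if the value's pattern is valid for the given country code."""
    return to_pattern(value) in VALID_PATTERNS.get(country, set())


# Hypothetical sample of post codes, invented for this sketch.
post_codes = ["69160", "6916", "SW1A 1AA", "M1 1AE", "10115"]
pattern_counts = Counter(to_pattern(pc) for pc in post_codes)
for pattern, count in pattern_counts.most_common():
    print(pattern, count)
```

Counting the patterns per field in this way gives both the number of distinct patterns and the occurrence of each, which can then be compared against the defined formats.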
QUALITY CRITERIA – GENERAL DESCRIPTION
This section describes general rules used to assess the data. If different criteria were used, adjust this section.
Fill Rates
Experience has shown that fields can be usefully divided into various fill rate ranges and then analyzed
further.
Caution: Even if a field listed in the report may be 100% complete, it is possible that this is merely a system
field and not necessarily relevant for a particular business process.
Common causes for incomplete data in these fields thus include information that is generally difficult to
obtain, and the entry of equivalent data values in different fields. The latter in particular can lead to disruption
in the business processes.
A proportion of these fields represent the structural framework of the data model on which the business
processes run. It should be noted, however, that some fields may be simply system fields that are required
by the application backend, and they have no actual relevance to the business processes themselves.
<Object Name / Description – Field Name / Description, repeat section as necessary for each field>
<Key findings from the analysis of the given object should be described in detail in this section. Some
findings that may be described could include …>
<Summarize the table/column contents and importance of high data quality>
<Provide tables or graphs. Side-by-side comparison of multiple source systems will highlight differences in
placeholder data between multiple sources.>
<Possible stat: # patterns within field and the occurrence of each pattern.>
<Possible stat: % column data that conforms to defined formats or standards.>
<Do the patterns conform to defined formats or standards?>
RESULTS – DUPLICATES
By analyzing key fields for master data, it is possible to identify duplicate entities within a given data set or
across all data sets. Within a single source, duplicate entries are created due to a lack of governance within
the master data creation process or by integration processes that lack checks for existing master data entities.
Often duplicates are created during the integration of multiple data sets or during mergers and acquisitions.
Across multiple data sets, duplication (if managed by a master data process) may be acceptable.
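To make the idea concrete, the following Python sketch groups records whose key fields are identical after simple normalization. It is a simplified stand-in, not the matching logic of the SAP toolset, which is not described in this document. The function names, the key fields, and the sample records are assumptions for illustration.

```python
import re
from collections import defaultdict


def match_key(record, fields):
    """Build a comparison key: lower-case, keeping only ASCII letters and digits."""
    parts = []
    for field in fields:
        text = str(record.get(field) or "").lower()
        parts.append(re.sub(r"[^a-z0-9]", "", text))
    return "|".join(parts)


def find_duplicates(records, fields):
    """Return lists of record indexes that share the same normalized key."""
    groups = defaultdict(list)
    for index, record in enumerate(records):
        key = match_key(record, fields)
        if key.strip("|"):  # skip records where every key field is empty
            groups[key].append(index)
    return [indexes for indexes in groups.values() if len(indexes) > 1]


# Hypothetical records, for example combined from two source systems.
records = [
    {"name": "Acme Ltd.", "post_code": "SW1A 1AA"},
    {"name": "ACME LTD", "post_code": "sw1a1aa"},
    {"name": "Beta GmbH", "post_code": "69160"},
    {"name": "", "post_code": ""},
]

duplicates = find_duplicates(records, ["name", "post_code"])
print(duplicates)
```

Exact matching on a normalized key only catches differences in case, spacing, and punctuation. Spelling variants, abbreviations, and non-ASCII characters would need fuzzy or rule-based matching on top of this.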
www.sap.com/contactsap
The information contained herein may be changed without prior notice. Some software products marketed by SAP SE and its distributors contain proprietary software components of other software vendors.
National product specifications may vary.
These materials are provided by SAP SE or an SAP affiliate company for informational purposes only, without representation or warranty of any kind, and SAP or its affiliated companies shall not be liable
for errors or omissions with respect to the materials. The only warranties for SAP or SAP affiliate company products and services are those that are set forth in the express warranty statements
accompanying such products and services, if any. Nothing herein should be construed as constituting an additional warranty.
In particular, SAP SE or its affiliated companies have no obligation to pursue any course of business outlined in this document or any related presentation, or to develop or release any functionality
mentioned therein. This document, or any related presentation, and SAP SE’s or its affiliated companies’ strategy and possible future developments, products, and/or platform directions and functionality are
all subject to change and may be changed by SAP SE or its affiliated companies at any time for any reason without notice. The information in this document is not a commitment, promise, or legal obligation
to deliver any material, code, or functionality. All forward-looking statements are subject to various risks and uncertainties that could cause actual results to differ materially from expectations. Readers are
cautioned not to place undue reliance on these forward-looking statements, and they should not be relied upon in making purchasing decisions.
SAP and other SAP products and services mentioned herein as well as their respective logos are trademarks or registered trademarks of SAP SE (or an SAP affiliate company) in Germany and other
countries. All other product and service names mentioned are the trademarks of their respective companies. See http://www.sap.com/corporate-en/legal/copyright/index.epx for additional trademark
information and notices.