Value Proposition
Data quality is considered the number one issue facing companies today and is a
prerequisite for all data-driven initiatives (e.g., data warehousing, system conversion).
Data Governance Structure
Data Quality
➢ Identify data owners and stewards.
➢ Define standards and business rules.
➢ Perform profiling of data and generate data quality score cards.

➢ Analyze profiling results, data standards, and impact to downstream applications.
➢ Extract required data out of production into the cleansing environment.
➢ Cleanse the data, test the data, and restore the data to production environment.

➢ Create a baseline for all data and store in a data quality repository.
➢ Establish monitoring intervals and infrastructure.
➢ Monitor the data as an ongoing function.
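The baseline and monitoring steps can be illustrated with a minimal Python sketch. It assumes the baseline and each later measurement are stored as simple metric-to-percentage mappings; the function name `compare_to_baseline`, the `tolerance` parameter, and all numbers are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class MetricCheck:
    metric: str
    baseline: float
    current: float
    degraded: bool


def compare_to_baseline(baseline: dict, current: dict, tolerance: float = 1.0) -> list:
    """Flag each metric whose current score fell more than `tolerance` points below baseline."""
    checks = []
    for metric, base in baseline.items():
        if metric not in current:
            continue  # metric not measured in this monitoring run
        now = current[metric]
        checks.append(MetricCheck(metric, base, now, now < base - tolerance))
    return checks


# Illustrative (made-up) scores, in percent.
baseline_scores = {"completeness": 98.0, "validity": 95.0}
current_scores = {"completeness": 97.5, "validity": 90.0}

checks = compare_to_baseline(baseline_scores, current_scores)
degraded_metrics = [c.metric for c in checks if c.degraded]
print(degraded_metrics)
```

Running such a comparison at each monitoring interval turns the stored baseline into an ongoing check rather than a one-time measurement.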
Platform Overview
[Figure: Data Quality platform overview diagram. Labels recoverable from the original: Integrations, Security, Informatica MDM, Power Center, Hub, Data Services, XMap, User Group, User Database, Authorization, Authentication.]
[...] enable its widespread adoption in business.

Data Profiling includes:
▪ Defining Dimensions
▪ Profiling Data

[Figure: Data Quality Management Framework. Components: Dimensions, Controls, Enrichment, Service Levels. Driven by business value; linked to strategic initiatives; enabled with a comprehensive change management program.]
Deloitte Consulting LLP 2012 9
Data Profiling Dimensions
Some of the key data profiling dimensions are as follows. Any effective data quality
management program tracks the quality of data against these dimensions.
Attribute | Description | Example metric
Accuracy | Is the data free from error, with a high assessment corresponding to a small error? | Percent of values that are correct when compared to the actual value, e.g., M = Male when the subject is male.
Completeness | Are values present in all the attributes that require them? | Percent of data fields having values entered into them.
Validity | Does the data follow an adequate system of classification? Does the data meet ‘the rules’? | Percent of data having values that fall within their respective domain of allowable values, e.g., ‘Individual’ or ‘Corporate’ customer types.
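The three example metrics can be expressed as simple percentages. The sketch below is one possible reading in Python, not a vendor implementation: the function names, the treatment of blank strings as missing, and the sample data are all assumptions made for illustration.

```python
from typing import Optional, Sequence


def percent(part: int, whole: int) -> float:
    """Return part/whole as a percentage; 0.0 for an empty data set."""
    return 100.0 * part / whole if whole else 0.0


def completeness(values: Sequence[Optional[str]]) -> float:
    """Percent of fields that have a value entered (assumption: None or blank = missing)."""
    filled = sum(1 for v in values if v is not None and v.strip() != "")
    return percent(filled, len(values))


def validity(values: Sequence[Optional[str]], allowed: set) -> float:
    """Percent of values that fall within the domain of allowable values."""
    valid = sum(1 for v in values if v in allowed)
    return percent(valid, len(values))


def accuracy(recorded: Sequence[str], actual: Sequence[str]) -> float:
    """Percent of recorded values that match the actual (reference) value."""
    correct = sum(1 for r, a in zip(recorded, actual) if r == a)
    return percent(correct, len(recorded))


# Made-up sample data for illustration only.
customer_types = ["Individual", "Corporate", "", "Partner", None]
gender_recorded = ["M", "F", "M", "F"]
gender_actual = ["M", "F", "F", "F"]

completeness_pct = completeness(customer_types)                       # 3 of 5 filled
validity_pct = validity(customer_types, {"Individual", "Corporate"})  # 2 of 5 in domain
accuracy_pct = accuracy(gender_recorded, gender_actual)               # 3 of 4 correct
print(completeness_pct, validity_pct, accuracy_pct)
```

Note that accuracy needs a trusted reference to compare against, whereas completeness and validity can be computed from the data set and its rules alone.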
Benefits of Data Profiling
Data Cleansing Benefits
➢ Provides an initial assessment about the quality of data
➢ Helps discover various data anomalies within data elements, including but not
limited to unclear and multiple definitions of data, completeness of data, and
validity of data
➢ Helps discover level of uniqueness within a data set
➢ Helps discover data types and formats/patterns within data elements
➢ Helps discover relationships among data sets
➢ Data profiling results form a basis for defining the scope and approach of data
cleansing activities
Business Benefits
➢ Comprehensive data profiling supports successful realization of business goals from
the implementation of MDM (master data management) and Data Quality (DQ)
projects
➢ Helps identify data errors at the source end and determine necessary corrective
action
➢ Reduces cost of managing data by removing inefficiency and redundancy within data
➢ Improves confidence in data among business users and customers
Data Quality Features in Informatica Developer Client
➢ Column Profile: A column profile determines the characteristics of columns in a
data source, such as value frequency, percentages, and patterns.
➢ Column profiling discovers the following facts about data:
▪ The number of unique and null values in each column, expressed as a number
and a percentage.
▪ The patterns of data in each column and the frequencies with which these
values occur.
▪ Statistics about the column values, such as the maximum and minimum
lengths of values and the first and last values in each column.
➢ Column Profile Options: Column profile options can be used to select the
columns on which the profile needs to be run, set data sampling options, and set
drilldown options when a profile is created.
➢ When a profile is created with the Column Profiling option, the profile wizard can
be used to define filter and sampling options. These options determine how the
profile reads rows from the data set.
➢ After completing the steps in the profile wizard, a rule can be added to the
profile.
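The facts a column profile reports can be approximated in a few lines of Python. This is a rough sketch and not the Informatica implementation: it assumes "unique" means distinct non-null values, takes first and last values in read order, and models sampling with a hypothetical `sample_size` parameter that reads only the first N rows.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class ColumnProfile:
    row_count: int
    unique_count: int
    unique_percent: float
    null_count: int
    null_percent: float
    min_length: int
    max_length: int
    first_value: Optional[str]
    last_value: Optional[str]


def profile_column(values: Sequence[Optional[str]],
                   sample_size: Optional[int] = None) -> ColumnProfile:
    """Profile one column; `sample_size` (hypothetical option) limits rows read."""
    rows = list(values if sample_size is None else values[:sample_size])
    total = len(rows)
    non_null = [v for v in rows if v is not None]
    lengths = [len(v) for v in non_null]

    def pct(n: int) -> float:
        return 100.0 * n / total if total else 0.0

    unique_count = len(set(non_null))  # assumption: unique = distinct non-null values
    null_count = total - len(non_null)
    return ColumnProfile(
        row_count=total,
        unique_count=unique_count,
        unique_percent=pct(unique_count),
        null_count=null_count,
        null_percent=pct(null_count),
        min_length=min(lengths, default=0),
        max_length=max(lengths, default=0),
        first_value=non_null[0] if non_null else None,
        last_value=non_null[-1] if non_null else None,
    )


# Made-up sample column for illustration only.
state_column = ["NY", "CA", None, "NY", "TX", "Texas"]
full_profile = profile_column(state_column)
sampled_profile = profile_column(state_column, sample_size=3)
print(full_profile)
```

Comparing `full_profile` with `sampled_profile` shows how sampling options change what the profile sees: the sampled run reads only three rows, so its counts and lengths differ from the full run.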
Property | Description
Values | List of all values for the column in the profile.
Frequency | Number of times a value appears in a column.
Percent | Number of times a value appears in a column, expressed as a percentage of all values in the column.
Chart | Bar chart for the percentage.
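The value properties above map directly onto a frequency count. The following Python sketch shows one way to derive them; the function name `value_frequencies`, the text-based bar, and the sample data are illustrative assumptions rather than anything the tool exposes.

```python
from collections import Counter
from typing import Sequence


def value_frequencies(values: Sequence[str], bar_width: int = 20) -> list:
    """Return (value, frequency, percent, bar) rows, most frequent value first."""
    counts = Counter(values)
    total = len(values)
    rows = []
    for value, freq in counts.most_common():
        pct = 100.0 * freq / total
        bar = "#" * round(pct / 100 * bar_width)  # crude text stand-in for the bar chart
        rows.append((value, freq, pct, bar))
    return rows


# Made-up sample column: NY appears 5 times, CA twice, TX once.
states = ["NY", "CA", "NY", "TX", "NY", "CA", "NY", "NY"]
value_rows = value_frequencies(states)
for value, freq, pct, bar in value_rows:
    print(f"{value:<6} {freq:>3} {pct:6.1f}% {bar}")
```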
Property | Description
Patterns | Pattern for the selected column.
Frequency | Number of times a pattern appears in a column.
Percent | Number of times a pattern appears in a column, expressed as a percentage of all values in the column.
Chart | Bar chart for the percentage.
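Pattern profiling can be sketched the same way by first reducing each value to a character-class pattern. The notation below (letters become `X`, digits become `9`, other characters are kept) is an assumption for illustration and may differ from the notation the Informatica tools display; the function names and sample data are also hypothetical.

```python
from collections import Counter
from typing import Sequence


def to_pattern(value: str) -> str:
    """Map a value to a pattern: digit -> '9', letter -> 'X', anything else unchanged."""
    out = []
    for ch in value:
        if ch.isdigit():
            out.append("9")
        elif ch.isalpha():
            out.append("X")
        else:
            out.append(ch)
    return "".join(out)


def pattern_frequencies(values: Sequence[str]) -> list:
    """Return (pattern, frequency, percent) rows, most frequent pattern first."""
    counts = Counter(to_pattern(v) for v in values)
    total = len(values)
    return [(pattern, freq, 100.0 * freq / total)
            for pattern, freq in counts.most_common()]


# Made-up phone-number column for illustration only.
phones = ["555-1234", "555-9876", "5551234", "555-ABCD"]
pattern_rows = pattern_frequencies(phones)
for pattern, freq, pct in pattern_rows:
    print(f"{pattern:<10} {freq:>3} {pct:6.1f}%")
```

Low-frequency patterns, such as the letter-bearing phone number here, are often the first candidates to review for cleansing.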