Critical Capabilities for Data Quality Tools


Published: 18 December 2015 ID: G00275753
Analyst(s): Ted Friedman, Saul Judah

Summary
Data quality tools are important in a wide range of use cases, and providers in this market exhibit varying strengths across the functionalities critical to
these use cases. Information leaders need to understand the tools' relative strengths to make optimal selections of vendors and tools.

Overview
Key Findings
Significant growth in the data quality tools market is fueled by demand across an increasingly wide range of use cases, from traditional
operational/transactional data quality applications to contemporary big data and analytics scenarios.

The tools available in the market exhibit significantly different degrees of strength across an increasing range of requirements. While all vendors
support the core capabilities of parsing, standardization and cleansing very well, there is much larger variability in the workflow, profiling, and
multidomain support provided in the tools.
Buyers often struggle to find tools that can be readily applied to ever-changing use cases and functional needs — in particular, those that support use
by business-oriented roles outside of central IT. The functional characteristics of data quality tools are an increasingly important dimension of
evaluation on which buyers need to focus.
If the information a company holds on its customers, products, suppliers and assets (and their interrelationships) is not fit for purpose, efforts to
achieve business objectives are impeded.

Recommendations
Information leaders considering data quality tools should:
Plan for the full range of use cases that need to be supported, both now and in the future. For most large enterprises, this will include master data
management, operational/transactional data quality, big data and analytics, information governance, data integration, and data migration.
Focus on critical product capabilities that are commonly needed across many use cases and the data domains in which you must operate.
Understand that usability is increasingly important, as the tools vary significantly in this area, and suitability for use by less technical roles grows more
important.
Recognize that tool functionality is just one dimension to evaluate — also consider market presence, availability of skills, support and service
capabilities, and pricing model/price points. Leaders in the overall market may not be those vendors scoring at the top of the various use cases in this
Critical Capabilities research.

What You Need to Know


Buyers in the data quality tools market need to recognize that the critical capabilities assessed in this document represent a subset of the evaluation
criteria that Gartner recommends when selecting vendors and tools. Therefore, the rankings of vendors expressed here do not represent overall vendor
positioning in the market, and do not always align with positioning of vendors in the corresponding "Magic Quadrant for Data Quality Tools."

The Critical Capabilities model drills into the most-critical functional capabilities of the tools without consideration of other vendor characteristics. While
certain vendors score consistently strongly based on customer feedback about the critical capabilities assessed here, product capabilities alone do not
provide a complete vendor and tool evaluation. Organizations must also consider each vendor's market presence, track record, financial and
organizational strength, availability of skills, product support, and the depth of its professional services — see the companion Magic Quadrant for a high-
level vendor assessment that considers these additional characteristics.

The functional characteristics of data quality tools are an increasingly important dimension of evaluation on which buyers — including information
management leaders, information governance-related stakeholders and data quality project leaders — need to focus. The critical capabilities for data
quality tools defined in this research represent the key functional characteristics that are most relevant across the range of contemporary use cases to
which data quality tools are applied.

Organizations need to understand the relative importance of each capability to the use cases they are facing, and use this insight to assess each
provider's suitability for supporting those use cases in the specific way and to the specific depth required. By leveraging the critical capabilities ratings
and use-case scores, buyers can identify a set of providers that may be the best fit to deliver the product functionality necessary to succeed in their
efforts to improve data quality.

In the case of our chosen set of critical capabilities, all vendors score above the "meets requirements" (3.0) level for all use cases (see the Critical
Capabilities Methodology section at the end of this research). This reflects the fact that these vendors offer the most functionally complete data quality
tools in the market. In addition, the relevance of the vendors across all use cases is validated by the customer survey data, which shows that all vendors
have a subset of their customers actively applying their tools in each of the use cases.
The critical capabilities defined in this research and used to assess the vendors align with a subset of the product evaluation criteria in the
corresponding Magic Quadrant. The ratings allocated to each vendor are driven primarily by customer feedback resulting from a survey of reference
customers provided by each vendor. The degree to which the customer base feels a given capability meets its needs, along with the frequency of usage
across the customer sample, influences the rating. As such, the rankings of the vendors regarding strength in each use case are predominantly derived
from assessment of actual usage of a product's capabilities "in the field." However, Gartner's analysis of the vendor capabilities — which is based on
detailed discussions with the vendors on product functionality and input from other customers using the vendors' products (via the Gartner client inquiry
channel and other vehicles of interaction) — is also applied to further refine the ratings.

Gartner recommends that buyers complement this Critical Capabilities assessment with our "Magic Quadrant for Data Quality Tools" (to understand the
vendor landscape beyond product capabilities) and our "Toolkit: RFP Template for Data Quality Tools" (to ensure visibility to the complete breadth of
product capabilities relevant to this market).

Analysis
Critical Capabilities Use-Case Graphics
Figure 1. Vendors' Product Scores for Big Data & Analytics Use Case

Source: Gartner (December 2015)

Figure 2. Vendors' Product Scores for Data Integration Use Case
Source: Gartner (December 2015)

Figure 3. Vendors' Product Scores for Data Migration Use Case
Source: Gartner (December 2015)

Figure 4. Vendors' Product Scores for Information Governance Initiatives Use Case
Source: Gartner (December 2015)

Figure 5. Vendors' Product Scores for Master Data Management Use Case
Source: Gartner (December 2015)

Figure 6. Vendors' Product Scores for Operational/Transactional Data Quality Use Case
Source: Gartner (December 2015)

Vendors
Ataccama
Ataccama's DQ Analyzer, Data Quality Center (DQC), DQ Issue Tracker and DQ Dashboard products provide a range of capabilities that suit all of the
main data quality tools use cases, and the Ataccama customer base reflects broad usage across that range. Reference customers provide very positive
feedback regarding their experiences with the vendor's tools relative to most of the critical capabilities, with data profiling, visualization, performance
and usability cited as particularly strong. While the Ataccama data quality tools are capable of supporting the full range of use cases, the vendor's
strength in data profiling, combined with the core data quality operations (parsing, standardization and cleansing), performance and usability, supports a
strong score for the information governance, data migration and data integration use cases. In addition, Ataccama's work in big data environments
(including support for Hadoop) makes the vendor's tools particularly relevant to the big data use case, garnering its strongest score here. Ataccama's
customers rated the vendor's support for multiple data domains lower than other capabilities, contributing to the vendor's weakest score in the master
data management use case (although still substantially above the "meets requirements" level).
BackOffice Associates
The products that make up BackOffice Associates' Data Stewardship Platform (dspMigrate, dspMonitor, dspCompose and dspCloud) represent the vendor's
recent strategy shift toward selling software, with an initial emphasis on the data quality tools market. These products support all required data quality
technology capabilities, with an emphasis (due to the vendor's historical focus) on product data and the SAP applications space. Reference customers
cite the fundamentals of parsing, standardization and cleansing, as well as performance and usability, as strengths. This contributes to the vendor's
strongest scores in the data migration, data integration, and operational data quality use cases. Multidomain weakness drives a substantially lower
score in the master data management use case, although for organizations focused predominantly on product/materials master data this should not be
a concern. Given the vendor's historical focus on SAP data migration services, a majority of BackOffice customers are applying the vendor's tools to the
data migration and master data management use cases, with less usage in others.

DataMentors
DataMentors' DataFuse, ValiData and NetEffect products support the expected range of data quality technology capabilities, with a strong focus on the
customer/party data domain. Reference customers report very positive experiences with this vendor's data profiling, matching, core data quality
operations (parsing, standardization and cleansing), performance and usability, but report significant limitations when working across multiple data
domains (beyond customer/party). Multidomain weakness drives a lower score in the master data management use case, although for organizations
focused predominantly on customer/party master data this should not be a deterrent to adoption. Relatively strong tallies across the other critical
capabilities, as reflected in feedback from reference customers, enable DataMentors to score consistently well across the other use cases, with data
integration, data migration, and big data and analytics having the highest scores for the vendor.
Experian
Experian Pandora, and the company's Capture, Clean and Enhance data quality tools, represent a rebranding of assets acquired from QAS and
X88 Software (acquired in 2014, and now part of the Experian group). The company's profiling functionality, related visualization capabilities, scalability
and performance, and usability are cited as significant strengths by reference customers. These strong capabilities were introduced largely as a result of
the X88 acquisition. This contributes to the vendor scoring most strongly against the data migration, data integration, big data and analytics, and
information governance use cases. Extremely limited use of the functionality beyond the customer/party domain — as well as weaker feedback on
workflow and matching capabilities — supports lower scores for the master data management use case, although Experian has demonstrated strength
in domain-specific, MDM-related deployments focused on customer/party master data.

IBM
IBM's InfoSphere Information Server for Data Quality provides broad coverage for data quality functionality, including all of the critical capabilities
identified for this market. Reference customers indicate very positive experiences with nearly all of them. They rated as particularly strong the core
functionality (parsing, standardization and cleansing), data profiling, matching and performance. The overall strong ratings support a strong score for
IBM across the full range of data quality technology use cases, with particularly positive ratings for the data integration, data migration, operational data
quality, and big data and analytics use cases. With ratings for multidomain support and workflow capabilities lagging slightly behind the others
(although still well ahead of the basic requirements of the market), IBM garners a slightly lower score for the information governance and master data
management use cases.
Informatica
Informatica's data quality products — consisting of Informatica Data Quality, Data as a Service and Rev — provide comprehensive coverage of the main
data quality functionality required by the market. Reference customers rate the vendor's data profiling, parsing, standardization and cleansing, matching
and performance as particularly strong. With these characteristics, Informatica supports the full range of data quality tools use cases well, with a
particularly strong score for the data integration, data migration and big data and analytics use cases. Reference customers demonstrate successful
usage of the tools in each of these scenarios. Multidomain support, visualization and usability slightly lag the other capabilities (although still
well ahead of the basic requirements of the market). Regardless, Informatica still garners strong scores for master data management, information
governance and operational data quality.

Information Builders
Information Builders' iWay Data Quality Suite provides a full range of data quality functionality. The tools are seen in deployments supporting each of the
main data quality tools use cases. Reference customers rate the capabilities for profiling, performance and usability as particularly strong — the vendor's
historical depth in analytics, business intelligence and reporting lends itself well to providing differentiation in these areas. As a result, Information
Builders shows the strongest affinity for the data integration, data migration, and big data and analytics use cases. Capabilities for multidomain support
and workflow are rated lower by the customer base. As a result, Information Builders scores solidly, but slightly lower, for the operational data quality
and master data management use cases.
Innovative Systems
Innovative Systems' data quality products include the i/Lytics Enterprise Data Quality Suite, FinScan, Enlighten, and i/Lytics PostLocate. The portfolio
provides broad functionality with a focus on the customer data domain, which represents the vendor's historical focus and base of strength. Reference
customers cite the vendor's core data quality functionality (parsing, standardization and cleansing) and matching as primary and outstanding strengths.
They also praise the vendor's ability to deliver suitable scalability and performance in large-scale deployments, as well as good usability characteristics.
Customers also rate the vendor's data profiling and visualization capabilities as adequate, but weaker than the core capabilities. Innovative's tools are
rarely deployed outside of the customer/party domain, indicating weakness in multidomain capabilities. As a result, this vendor received its strongest
scores for operational/transactional, data integration, data migration and big data and analytics use cases. The lower profiling and visualization ratings
and the extremely limited multidomain usage, in contrast, are reflected in the information governance and master data management use cases having
lower scores.

MIOsoft
MIOsoft is one of the newest vendors in the data quality tools market. Its MIOvantage product provides functionality for a variety of technology markets,
but the vendor has only recently started to position the product directly for data quality use cases, and has substantially fewer customers and history in
this market relative to competitors. In this product, the vendor provides a full range of data quality functionality, and customer deployments span all of
the main use cases. Reference customers overwhelmingly rate the vendor's functionality as extremely strong against each of the critical
capabilities, with usability standing out as the greatest strength. Customer deployments continue to show limited usage for master data management
and in multidomain contexts, contributing to a lower score for that use case than for the others. However, very positive customer perceptions for all
other capabilities drive very high scores for each of the other use cases, with data integration, data migration and big data and analytics standing out as
most advantageous for MIOsoft.
Neopost
Neopost's products in the data quality tools market — DataCleaner, DataCleaner Cloud, DataEntry and DataHub — reflect the vendor's longtime focus on
customer data. These products cover the expected range of data quality functionality with an emphasis on customer/party and related-location master
data, and the vendor is increasing its focus on product data. Reference customers tell us that Neopost's tools are strong for data profiling, visualization,
parsing, standardization, cleansing and matching. The strong emphasis on customer/party, almost to the exclusion of other domains, drives a low rating
for multidomain support. Reference customers also indicate that workflow capabilities, while meeting basic requirements, are in need of improvement.
As such, Neopost scores most strongly for the data integration, data migration, and big data and analytics use cases, followed by information
governance and operational data quality. While scoring lower for the master data management use case due to limited usage in multidomain scenarios,
Neopost remains strong in the customer/party master data context.
Oracle
The Oracle Enterprise Data Quality product provides support for each of the critical capabilities, with the Oracle EDQ Product Data Extension providing
deep support for the product/materials domain. Reference customers exhibit usage of Oracle's data quality tools in each of the main use cases for this
market. They generally highlight data profiling, parsing, standardization and cleansing as key functional strengths, and also rate Oracle's usability highly.
Visualization and workflow functionality represents an opportunity for improvement. As a result, while Oracle scores well across the full range of data
quality use cases, the information governance, data integration, data migration, and big data and analytics use cases are where the vendor
demonstrates the greatest relevance and value. Limited usage and below-average customer feedback in multidomain scenarios contribute to a slightly
lower score for the master data management use case.
Pitney Bowes
Pitney Bowes' primary data quality product is the Spectrum Technology Platform, which it positions as its strategic solution both for new customers and
for those seeking to replace its legacy data quality offerings. Reference customers cite the core data quality capabilities (parsing, standardization
and cleansing) and matching as particularly strong, and also cite the scalability and performance of the tools. While the vendor's functionality can be
used with various data domains, observation of some actual deployments and the vendor's strong focus on customer/party data contribute to a lower
rating for multidomain capabilities. As a result, while the master data management use case gains Pitney Bowes a slightly lower score, the vendor
exhibits fairly balanced and solid scores across the range of data quality use cases. The data integration, data migration, and big data and analytics use
cases stand out as the greatest areas of affinity for Pitney Bowes' data quality tools.
RedPoint
The RedPoint Data Management solution provides the full range of data quality functionality expected in this market. This solution also supports
requirements in related markets, such as data integration tools. The vendor's data quality tools are seen in deployments across the full range of use
cases. Reference customers say that the tools' ability to support the core data quality operations of parsing, standardization and cleansing, as well as
matching, is very strong. In addition, scalability and performance are cited as a significant strength, in both traditional environments and big data
environments such as Hadoop (where the vendor has made substantial investments and continues to innovate). Only the multidomain support
capabilities are rated by customers below the "meets requirements" level. As a result, RedPoint scores most favorably for big data and analytics, data
integration, data migration and operational data quality use cases, but also scores narrowly in the "meets or exceeds" range for the others.

SAP
SAP's Data Quality Management, Information Steward and Data Services offerings provide comprehensive data quality functionality for both SAP and
non-SAP application and data environments. The vendor's tools are seen in deployments across all of the key data quality use cases with reasonable
frequency. Reference customers rate most of the critical capabilities as consistently meeting or exceeding requirements, with almost equal satisfaction.
The only exception is the multidomain support capabilities, given that most deployments focus on customer data, with few examples of focus on other
or multiple data domains. All other capabilities are rated almost equally, and close to the "meets or exceeds" level. SAP scores solidly above the "meets
requirements" level for all use cases, representing a very consistent and balanced set of capabilities for the broad demand in the market.
SAS
SAS's Data Quality, Data Management, and Data Quality Desktop products represent a full suite of data quality functionality able to support deployments
of all sizes. The vendor's tools are seen in deployments across all of the key data quality use cases with equal frequency. Reference customers report a
very positive experience (often meeting and exceeding requirements) for each of the critical capabilities, with data profiling, usability and the
core data quality operations (parsing, standardization and cleansing) standing out as the greatest perceived strengths. The vendor
garners slightly lower ratings for visualization, workflow and matching, although each of these ratings is solidly above the "meets requirements"
threshold. Only multidomain support falls slightly below this level. The overall solid ratings contribute to highly consistent scores across each of the key
use cases, all near the "meets or exceeds" level. The information governance, data integration, data migration, and big data and analytics use cases
receive the strongest scores, and represent the scenarios of greatest applicability for SAS's data quality tools.

Talend
Talend's Open Studio for Data Quality and Talend Data Management Platform are the products via which the vendor delivers data quality functionality to
the market. Talend's data quality capabilities are most commonly seen in data integration scenarios (a core component of its broader product portfolio),
but appear in each of the other use cases as well. In particular, usage in master data management scenarios is increasing as the vendor expands its
presence in the master data management solutions market. Reference customers generally report that each of the critical capabilities meets or exceeds
their requirements, with parsing, standardization, cleansing, performance and usability recognized as the greatest areas of strength. As a result,
Talend garners its highest scores in the data integration, data migration, operational data quality, and big data and analytics use cases. Visualization,
workflow and multidomain support are rated slightly lower, and contribute to weaker scores for the information governance and master data
management use cases.
Trillium Software
Trillium approaches the data quality tools market via its Trillium Software System, Trillium Cloud, Trillium Big Data, Trillium Global Locator, Trillium for
SAP (CRM, ERP and Master Data Governance), Trillium for Microsoft Dynamics CRM, and Trillium for Salesforce products. With comprehensive
functionality that can support each of the key data quality use cases, Trillium's critical capabilities are rated by reference customers as generally very
strong, reaching the "meets or exceeds" level in all areas with the exception of workflow and multidomain functionality. The latter is due to
Trillium's historical focus on customer/party data, with fewer deployments focused on other data domains. This capability is viewed as suitable to meet
common requirements, and appears to exceed requirements in customer master data scenarios. Use-case scores are balanced and reflect suitable
support for each of the scenarios, with a particular affinity for the data integration, data migration, and big data and analytics use cases. The master data
management use case scores lower due to the domain-specific (customer data) orientation and experience base of the vendor.

Uniserv
Uniserv's data quality products — Data Analyzer, Data Cleansing, Data Protection and Data Governance — support the main functional requirements of
this market, and are typically deployed in customer/party data applications. Reference customers rate data profiling, parsing, standardization and
cleansing, matching, performance and usability as particularly strong, exceeding requirements. Multidomain support is rated substantially lower
given the vendor's historical focus solely on customer/party data, with limited functionality for, and experience with, other data domains. The resulting
scores reflect good relevance and ability to meet or exceed requirements for all the key use cases, with big data and analytics, operational data quality,
and data integration showing the strongest applicability. The master data management use case receives the lowest score due to the vendor's domain-
specific focus, but organizations intending to deploy these tools in the customer/party domain will find them highly relevant.

Context
Digital business, fueled by increasingly complex information sources and sophisticated analytics, represents a massive opportunity for organizations in
all industries. As business leaders seek to capitalize on these opportunities to fulfill a digital business agenda, they turn to enabling technologies that
support their business objectives. This synergy — which sees business opportunities fueling technology improvements that in turn give rise to new
business models and markets — has been the main undercurrent in the software market, generating strong growth in the past few years. We expect this
trend to continue in the medium to long term.

It is in this context that data quality technologies and practices must be considered. Organizations often have an understanding of what they need to
achieve their strategic objectives, which are typically revenue growth, operational cost reduction, adherence to regulations, and better customer
experience and retention. One requirement is that the information a company holds on its customers, products, suppliers and assets — and their
interrelationships — be fit for purpose. Where this isn't the case, efforts to achieve objectives are impeded, which results in less value being delivered to
shareholders, reduced competitiveness, rising operational costs, loss of customers to competitors, and fines due to noncompliance with industry and
legal regulations.

The data quality tools market remains dynamic, owing to growth in its size and to volatility on both the supply side and the demand side. New vendors
continue to enter the space with differentiated product capabilities that align well with market demand. The positioning of MIOsoft and RedPoint in
the use-case rankings reflects this trend. We see high demand for data quality tools, including from midsize organizations (which traditionally tended not
to buy them). This demand is driven partly by activities in the fields of business intelligence and big data analytics (analytical scenarios), master data
management (operational scenarios) and digital business. Also contributing to demand are information governance programs, which are growing in
number, and requirements to support ongoing operations and data migrations.
Specifically, the following set of most-significant use cases for data quality tools has emerged:
Master data management

Operational/transactional data quality


Information governance

Data integration
Data migration

Big data and analytics


Each of these use cases requires emphasis (and thus weighting) on a different combination of critical capabilities for the tools, as detailed in the Use
Cases section. This means that versatility and strength in many areas are critical as use cases grow more diverse, and organizations grow concerned
about data quality across a wider range of use cases in their enterprise. The critical capabilities for data quality tools defined in this research represent
the most important of these functional characteristics given the trends in data quality demand in the market over the next several years.

Product/Service Class Definition


This market includes vendors that offer stand-alone software products to address the most-critical functional requirements of the data quality
improvement discipline, which are:
Data profiling and data quality measurement: The analysis of data to capture statistics (metadata) that provide insight into the quality of data and
help to identify data quality issues.
Parsing and standardization: The decomposition of text fields into component parts and the formatting of values into consistent layouts, based on
industry standards, local standards (for example, postal authority formats for address data), user-defined business rules, and knowledge bases of
values and patterns.

Generalized cleansing: The modification of data values to meet domain restrictions, integrity constraints, or other business rules that define when the
quality of data is sufficient for an organization.
Matching: The identifying, linking or merging of related entries within or across sets of data.

Monitoring: The deployment of controls to ensure that data continues to conform to business rules that define data quality for an organization (see the sketch after this list).
Issue resolution and workflow: The identification, quarantining, escalation and resolution of data quality issues through processes and interfaces that
enable collaboration with key roles, such as the data steward.
Enrichment: The enhancement of the value of internally held data by appending related attributes from external sources (for example, consumer
demographic attributes and geographic descriptors).
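
To make the monitoring capability concrete, here is a minimal sketch in Python. The rule names, record fields and sample values are hypothetical; a production tool would attach such rules to live tables or feeds and trend the conformance figures over time.

```python
# Minimal sketch of rule-based data quality monitoring (illustrative only).
# Rule names and record fields are hypothetical.

RULES = {
    # Each rule returns True when a record conforms to the business rule.
    "age_in_range": lambda r: r["age"].isdigit() and 0 < int(r["age"]) < 120,
    "country_is_iso2": lambda r: len(r["country"]) == 2 and r["country"].isupper(),
}

def monitor(rows):
    """Return the share of conforming records per rule."""
    return {
        name: sum(1 for r in rows if rule(r)) / len(rows)
        for name, rule in RULES.items()
    }

rows = [
    {"age": "34", "country": "US"},
    {"age": "", "country": "us"},     # missing age, lowercase country code
    {"age": "290", "country": "DE"},  # out-of-range age
]
print(monitor(rows))  # {'age_in_range': 0.333..., 'country_is_iso2': 0.666...}
```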

The tools provided by vendors in this market are generally used by organizations for internal deployment within their IT infrastructures, and more broadly
across the business by non-IT roles. They use them to directly support transactional processes that require data quality operations and to enable staff in
data-quality-oriented roles (such as data stewards) to perform data quality improvement work. Off-premises solutions — in the form of hosted data
quality offerings, SaaS delivery models and cloud services — continue to evolve and grow in popularity.
Critical Capabilities Definition
Matching, Linking & Merging
Identifying, linking or merging related entries within or across sets of data via a variety of algorithmic and rule-based approaches.

A common requirement in many data quality tools use cases is the ability to identify relationships between records, and determine whether or not they
are related or represent the same instance of a business concept. Sophisticated capabilities for matching — which can also compensate for the wide
variety of semantics and representations in the typical large enterprise — are increasingly valuable.
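
As a rough illustration of the concept, the sketch below flags record pairs whose names score above a similarity threshold, using only Python's standard library. The records, field names and threshold are hypothetical; commercial matching engines add phonetic keys, probabilistic weighting and merge/survivorship rules on top of this basic idea.

```python
# Illustrative fuzzy matching with the standard library (not a production matcher).
from difflib import SequenceMatcher

def similarity(a: str, b: str) -> float:
    """Normalized similarity between two strings, from 0.0 to 1.0."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

records = [  # hypothetical sample data
    {"id": 1, "name": "Acme Corporation", "city": "Boston"},
    {"id": 2, "name": "ACME Corp", "city": "Boston"},
    {"id": 3, "name": "Apex Industries", "city": "Chicago"},
]

THRESHOLD = 0.7  # tuned per matching scenario in practice
for i, left in enumerate(records):
    for right in records[i + 1:]:
        score = similarity(left["name"], right["name"])
        if score >= THRESHOLD:
            print(f"possible match: {left['id']} <-> {right['id']} ({score:.2f})")
```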
Multidomain Support
The ability to address multiple data subject areas (such as various master data domains), and depth of packaged support for specific subject areas.
While the roots of the data quality tools market are largely grounded in requirements to verify the quality of customer and party data types, demand is
now very diverse. Other master data domains, as well as a wide range of scenarios beyond master data, must be supported by the tools.

Parsing, Standardizing & Cleansing


The decomposition and formatting of values based on industry standards, local standards, user-defined business rules, and knowledge bases of values
and patterns. Includes the modification of data values to meet domain restrictions, integrity constraints or other business rules.

Parsing, standardization and cleansing represent the fundamental building blocks of data quality improvement — performing the core operations of
manipulating the syntax and semantics of data to meet "fit-for-purpose" requirements.
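
A toy illustration of these operations, assuming a simple "<number> <street> <type>" address pattern and a small hypothetical knowledge base of street-type variants:

```python
# Illustrative parsing and standardization of a street address (toy example).
import re

# Hypothetical knowledge base mapping street-type variants to standard forms.
STREET_TYPES = {"st": "Street", "ave": "Avenue", "rd": "Road", "blvd": "Boulevard"}

# Decompose "<number> <street name> <street type>" into components.
ADDRESS = re.compile(r"^(?P<number>\d+)\s+(?P<street>.+?)\s+(?P<type>[A-Za-z]+)\.?$")

def parse_and_standardize(raw: str) -> dict:
    match = ADDRESS.match(raw.strip())
    if not match:
        return {"raw": raw, "parsed": False}  # route to exception handling
    parts = match.groupdict()
    parts["street"] = parts["street"].title()
    parts["type"] = STREET_TYPES.get(parts["type"].lower(), parts["type"].title())
    return parts

print(parse_and_standardize("1600 pennsylvania ave"))
# {'number': '1600', 'street': 'Pennsylvania', 'type': 'Avenue'}
print(parse_and_standardize("221b baker st"))  # fails the toy pattern -> parsed: False
```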

Profiling
The analysis of data attributes and datasets to capture statistics (metadata) that provide insight into the quality of data and help to identify data quality
issues.

Data profiling functionality is increasingly critical as organizations wish to expose the facts about quality of data in the enterprise, help stakeholders to
clearly understand levels of data quality, and rapidly identify shifts in the shape of data they are using to proactively identify new data quality flaws. The
ability to analyze diverse sets of data, and to generate metadata and statistics that can be readily assessed to drive data quality improvement efforts, is
increasingly important to buyers in this market.
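
For intuition, a minimal profiling pass over tabular records might compute counts, null rates and frequency distributions per column, as in this sketch (the sample rows are hypothetical):

```python
# Illustrative column profiling with the standard library.
from collections import Counter

rows = [  # hypothetical sample data
    {"customer_id": "C001", "country": "US", "age": "34"},
    {"customer_id": "C002", "country": "us", "age": ""},
    {"customer_id": "C003", "country": "DE", "age": "29"},
    {"customer_id": "C003", "country": "DE", "age": "290"},  # duplicate key, suspect age
]

def profile(rows, column):
    """Basic statistics (metadata) for one column."""
    values = [r[column] for r in rows]
    filled = [v for v in values if v != ""]
    return {
        "count": len(values),
        "null_rate": round(1 - len(filled) / len(values), 2),
        "distinct": len(set(filled)),
        "top_values": Counter(filled).most_common(2),
    }

for col in ("customer_id", "country", "age"):
    print(col, profile(rows, col))
```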
Scalability & Performance
The ability to deliver suitable throughput and response times to satisfy performance SLAs given increasingly substantial transaction and data volumes.
Data volumes continue to escalate, and the complexity of the application and data landscape continues to grow. New types of solutions, such as those
engaging with the Internet of Things, are fueling this trend. As a result, data quality tools must be able to deal with highly complex and large-volume
scenarios while delivering adequate throughput, response times and reliability.
Usability
Suitability of the tools for use by the range of roles they must enable, with a particular emphasis on engagement of nontechnical roles outside the IT
organization.
As the use cases grow more diverse, organizations seek tools that enable rapid deployment, faster time to value, and superior adaptability. In addition,
as more individuals outside of IT begin to engage with these tools (data stewards and information governance stakeholders, for example), complexity
and technical know-how become barriers to use, leading buyers to seek tools with better ease of use.

Visualization
Presentation of data profiling, monitoring and operations results and activity — in the form of reports, dashboards or other representation metaphors —
and the openness and configurability of these capabilities.

The increasing desire for business-side stakeholders, information stewards and other nontechnical roles to engage in the data quality improvement
process means that new functionality is needed. These roles require the capability to visually assess the state of data quality, identify issues, and track
data quality metrics over time.

Workflow
Monitoring and identification, quarantine, escalation and event-based resolution of data quality issues through processes and interfaces supporting key
data-quality-related roles (data stewards, data owners or data quality analysts, for example).

With responsibility for stewardship of data moving out of IT and into the lines of business, and the desire to have a more formalized process around
identifying and resolving data quality issues, workflow functionality grows more critical. This capability enables the design and deployment of an
automated set of activities by which roles, such as the information steward, can most effectively support resolution of data quality issues in a managed
and repeatable manner.
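
A skeletal version of that lifecycle, with hypothetical states and roles, could look like the sketch below; real tools add routing rules, notifications and audit trails around the same core idea.

```python
# Illustrative issue-resolution workflow: identify -> quarantine -> escalate -> resolve.
from dataclasses import dataclass, field

STATES = ("identified", "quarantined", "escalated", "resolved")

@dataclass
class Issue:
    record_id: str
    rule: str
    state: str = "identified"
    history: list = field(default_factory=list)

    def advance(self, new_state: str, actor: str) -> None:
        """Move the issue forward through the lifecycle, recording who acted."""
        if STATES.index(new_state) <= STATES.index(self.state):
            raise ValueError(f"cannot move from {self.state} to {new_state}")
        self.history.append((self.state, new_state, actor))
        self.state = new_state

issue = Issue(record_id="C002", rule="age_in_range")
issue.advance("quarantined", actor="monitor")
issue.advance("escalated", actor="data_steward")
issue.advance("resolved", actor="data_owner")
print(issue.state, issue.history)
```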

Use Cases
Big Data & Analytics
The ingestion, correlation and interpretation of big data sources in support of operational analytics, performance management, sentiment analysis and
other analytic scenarios.
Big data scenarios increasingly involve the combination of diverse sets of data, many of which will come from outside the enterprise and are of
unknown quality. Therefore, this use case has a heavier emphasis on data profiling and matching capabilities. In addition, performance and scalability
are key.
Data quality capabilities are a useful supporting component of analytic solutions. This use case is focused on the degree to which data quality tools
support this use case. It is not an assessment of the big data analytics capabilities of the vendors rated in this Critical Capabilities research. Vendors
appearing here may or may not sell analytics and other big data solutions, and their assessment is independent of such offerings.
Data Integration
Capabilities applied within data integration processes and architectures, in support of both data consolidation for analytics and operational application
integration.
Data integration initiatives cannot be successful without mechanisms to assure the quality of the data being integrated and delivered. Core data quality
operations, performed in the face of increasing complexity and greater numbers of data sources, as well as performance/scalability, are key to success.
Data quality capabilities are one component of a comprehensive data integration solution. This use case is focused on the degree to which data quality
tools support this use case. It is not an assessment of the data integration capabilities of the vendors rated in this Critical Capabilities research. Vendors
appearing here may or may not sell data integration solutions, and their assessment is independent of such offerings.

Data Migration
Capabilities used in the context of a data conversion, migration or modernization initiative (such as conversion from legacy to modern applications).
These initiatives require a strong focus on identifying data quality issues upfront — therefore this use case emphasizes the critical capabilities of data
profiling and visualization, while also demanding strong scalability and performance to support large-scale migration efforts.
Data quality capabilities are one component of a comprehensive data migration solution. This use case is focused on the degree to which data quality
tools support this use case. It is not an assessment of the data migration capabilities of the vendors rated in this Critical Capabilities research. Vendors
appearing here may or may not sell data migration solutions or services, and their assessment is independent of such offerings.
Information Governance Initiatives
Capabilities supporting the goals of an information governance initiative, and its associated roles and stakeholders (data stewards, for example).
Information leadership roles (such as the chief data officer) and initiatives focused on increasing the value of information assets are being established
by more enterprises. A strong focus on information governance requires capabilities for data profiling, visualization, workflow and superior usability to
support information governance roles at all levels. This includes information stewards, members of information governance boards/councils, and other
business-side stakeholders. These roles are increasingly held by nontechnical individuals.
Master Data Management
Capabilities applied to various key master data domains in the context of master data management (MDM) initiatives and the deployment of custom or
packaged MDM solutions.
The master data management use case stresses the matching, workflow and multidomain capabilities of the tools most heavily, due to the common
requirements to resolve master data authored in disparate sources. The tools must also support the work tasks of information stewards who perform
the authoring and maintenance, and deal with an increasingly wide range of master data domains.
Data quality capabilities are one component, among many, that make up a comprehensive master data management solution. This use case is focused
on the degree to which data quality tools support this use case. It is not an assessment of the master data management capabilities of the vendors
rated in this Critical Capabilities research. Vendors appearing here may or may not sell master data management solutions, and their assessment is
independent of such offerings.
Operational/Transactional Data Quality
Capabilities applied to controlling the quality of data created by, maintained by, and housed within transactional applications.
As data quality controls are increasingly applied upstream, close to the source of data, the ability to embed data quality capabilities closely with
operational applications is key. This use case emphasizes the core data quality operations (parsing, standardization, cleansing and matching) as well as
the need for strong scalability and performance in the face of ever-increasing transaction volumes.

Vendors Added and Dropped


Added: BackOffice Associates (recently added to the companion Magic Quadrant).
Dropped: X88 (due to acquisition by Experian).

Inclusion Criteria
In the context of this Critical Capabilities analysis, we are using the same inclusion criteria as our "Magic Quadrant for Data Quality Tools."
To be included, vendors had to meet the following criteria:
They must offer stand-alone packaged software tools or cloud-based services (not only embedded in, or dependent on, other products and services)
that are positioned, marketed and sold specifically for general-purpose data quality applications.
They must deliver functionality that addresses, at minimum, profiling, parsing, standardization/cleansing, matching and monitoring. Vendors that offer
narrow functionality (supporting only address cleansing and validation, or dealing only with matching, for example) are excluded because they do not
provide complete suites of data quality tools. Specifically, vendors must offer all of the following:
Profiling and visualization: They must provide packaged functionality for attribute-based analysis (for example, minimum, maximum, frequency
distribution and so on) and dependency analysis (cross-table and cross-dataset analysis). Profiling results must be exposed in either a tabular or a
graphical user interface delivered as part of the vendor's offering. Profiling results must be able to be stored and analyzed across time boundaries
(trending).
Parsing: They must provide packaged routines for identifying and extracting components of textual strings, such as names, mailing addresses and
other contact-related information. Parsing algorithms and rules must be applicable to a wide range of data types and domains, and must be
configurable and extensible by the customer.
Matching: They must provide configurable matching rules or algorithms that enable users to customize their matching scenarios, audit the results,
and tune the matching scenarios over time. The matching functionality must not be limited to specific data types and domains, nor limited to the
number of attributes that can be considered in a matching scenario.
Standardization and cleansing: They must provide both packaged and extensible rules for handling syntax (formatting) and semantic (values)
transformation of data to ensure conformance with business rules.
Monitoring: They must support the ability to deploy business rules for proactive, continuous monitoring of common and user-defined data
conditions.
They must support this functionality with packaged capabilities for data in more than one language and for more than one country.
They must support this functionality both in scheduled (batch) and interactive (real-time) modes.
They must support large-scale deployment via server-based runtime architectures that can support concurrent users and applications.
They must maintain an installed base of at least 100 production, maintenance/subscription-paying customers for the data quality product(s) meeting
the above functional criteria. The production customer base must include customers in more than one region (North America, Latin America, EMEA
and Asia/Pacific).

They must be able to provide reference customers that demonstrate multidomain and/or multiproject use of the product(s) meeting the above
functional criteria.
Vendors meeting the above criteria but limited to deployments in a single specific application environment, industry or data domain are excluded from
this Critical Capabilities research.
There are many vendors of data quality tools, but most do not meet the above criteria and are therefore not included. Many vendors provide products
that deal with one very specific data quality problem — such as address cleansing and validation — but that cannot support other types of application, or
that lack the full breadth of functionality expected of today's data quality solutions. Others provide a range of functionality, but operate only in a single
country or support only narrow, departmental implementations. Others may meet all the functional, deployment and geographic requirements but are at
a very early stage in their "life span" and, therefore, have few, if any, production customers.
Table 1.   Weighting for Critical Capabilities in Use Cases

| Critical Capabilities | Big Data & Analytics | Data Integration | Data Migration | Information Governance Initiatives | Master Data Management | Operational/Transactional Data Quality |
| --- | --- | --- | --- | --- | --- | --- |
| Matching, Linking & Merging | 20% | 20% | 10% | 5% | 25% | 10% |
| Multidomain Support | 5% | 5% | 5% | 10% | 20% | 15% |
| Parsing, Standardizing & Cleansing | 5% | 20% | 20% | 5% | 10% | 20% |
| Profiling | 20% | 10% | 20% | 25% | 5% | 5% |
| Scalability & Performance | 15% | 20% | 15% | 5% | 5% | 25% |
| Usability | 15% | 10% | 10% | 15% | 10% | 5% |
| Visualization | 15% | 5% | 10% | 20% | 5% | 5% |
| Workflow | 5% | 10% | 10% | 15% | 20% | 15% |
| Total | 100% | 100% | 100% | 100% | 100% | 100% |

As of November 2015

Source: Gartner (December 2015)

This methodology requires analysts to identify the critical capabilities for a class of products/services. Each capability is then weighted in terms of its
relative importance for specific product/service use cases.

Critical Capabilities Rating


Each of the products/services has been evaluated on the critical capabilities on a scale of 1 to 5; a score of 1 = Poor (most or all defined requirements
are not achieved), while 5 = Outstanding (significantly exceeds requirements).
Table 2.   Product/Service Rating on Critical Capabilities

| Critical Capabilities | Ataccama | BackOffice Associates | DataMentors | Experian | IBM | Informatica | Information Builders | Innovative Systems | MIOsoft | Neopost | Oracle | Pitney Bowes |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Matching, Linking & Merging | 3.7 | 3.0 | 4.5 | 3.8 | 4.3 | 4.2 | 4.1 | 4.3 | 4.4 | 4.6 | 3.5 | 4.0 |
| Multidomain Support | 2.6 | 2.5 | 1.3 | 2.1 | 3.2 | 3.5 | 2.5 | 1.8 | 2.8 | 1.9 | 2.8 | 1.5 |
| Parsing, Standardizing & Cleansing | 3.9 | 4.1 | 4.0 | 4.1 | 4.5 | 4.2 | 4.1 | 4.2 | 4.4 | 4.2 | 4.3 | 4.1 |
| Profiling | 4.7 | 3.6 | 4.0 | 4.8 | 4.4 | 4.8 | 4.6 | 3.4 | 4.4 | 4.1 | 4.2 | 3.4 |
| Scalability & Performance | 4.2 | 4.0 | 4.4 | 4.2 | 4.6 | 4.2 | 4.4 | 4.3 | 4.5 | 3.9 | 3.4 | 4.0 |
| Usability | 4.1 | 4.1 | 4.1 | 4.0 | 3.4 | 3.7 | 4.1 | 4.2 | 4.8 | 3.7 | 3.8 | 3.7 |
| Visualization | 4.0 | 3.4 | 3.8 | 3.9 | 4.0 | 3.6 | 3.8 | 2.8 | 4.4 | 4.4 | 3.2 | 3.8 |
| Workflow | 3.7 | 3.3 | 3.7 | 3.1 | 3.9 | 4.0 | 3.9 | 3.1 | 4.3 | 3.1 | 3.0 | 3.5 |

Source: Gartner (December 2015)

Table 3 shows the product/service scores for each use case. The scores, which are generated by multiplying the use-case weightings by the
product/service ratings, summarize how well the critical capabilities are met for each use case.

Table 3.   Product Score in Use Cases

| Use Cases | Ataccama | BackOffice Associates | DataMentors | Experian | IBM | Informatica | Information Builders | Innovative Systems | MIOsoft | Neopost |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Big Data & Analytics | 4.04 | 3.54 | 4.00 | 4.00 | 4.12 | 4.11 | 4.11 | 3.69 | 4.39 | 4.00 |
| Data Integration | 3.94 | 3.62 | 4.02 | 3.91 | 4.21 | 4.13 | 4.10 | 3.86 | 4.37 | 3.95 |
| Data Migration | 4.03 | 3.65 | 3.94 | 4.00 | 4.19 | 4.16 | 4.12 | 3.70 | 4.37 | 3.92 |
| Information Governance Initiatives | 4.00 | 3.50 | 3.71 | 3.86 | 3.99 | 4.06 | 3.99 | 3.33 | 4.29 | 3.75 |
| Master Data Management | 3.63 | 3.28 | 3.55 | 3.45 | 3.94 | 3.97 | 3.77 | 3.42 | 4.11 | 3.56 |
| Operational/Transactional Data Quality | 3.79 | 3.55 | 3.70 | 3.67 | 4.14 | 4.04 | 3.92 | 3.60 | 4.19 | 3.64 |

Source: Gartner (December 2015)

To determine an overall score for each product/service in the use cases, multiply the ratings in Table 2 by the weightings shown in Table 1.
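
Expressed as code, this is a simple weighted sum. The following sketch applies the Big Data & Analytics weights from Table 1 to Ataccama's ratings from Table 2 and reproduces the 4.04 score shown in Table 3.

```python
# Use-case score = sum of (capability weight x capability rating).
weights = {  # Big Data & Analytics column of Table 1
    "matching": 0.20, "multidomain": 0.05, "parsing": 0.05, "profiling": 0.20,
    "scalability": 0.15, "usability": 0.15, "visualization": 0.15, "workflow": 0.05,
}
ratings = {  # Ataccama column of Table 2
    "matching": 3.7, "multidomain": 2.6, "parsing": 3.9, "profiling": 4.7,
    "scalability": 4.2, "usability": 4.1, "visualization": 4.0, "workflow": 3.7,
}
score = sum(weights[c] * ratings[c] for c in weights)
print(f"{score:.3f}")  # 4.035 -> reported as 4.04 in Table 3
```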

Evidence
This research is based on:
Extensive data on functional capabilities, customer-base demographics, financial status, pricing and other quantitative attributes gained via an RFI
process engaging vendors in this market.

Interactive briefings in which the vendors provided Gartner insight on their product capabilities.
A Web-based survey of reference customers provided by each vendor, which captured data on usage patterns, levels of satisfaction with major
product functionality categories, various nontechnology vendor attributes (such as pricing, product support and overall service delivery) and more. In
total, 329 organizations across all major world regions provided input on their experiences with vendors and tools in this manner.
Feedback about tools and vendors captured during conversations with users of Gartner's client inquiry service.

Critical Capabilities Methodology


This methodology requires analysts to identify the critical capabilities for a class of products or services. Each capability is then weighted in terms of its
relative importance for specific product or service use cases. Next, products/services are rated in terms of how well they achieve each of the critical
capabilities. A score that summarizes how well they meet the critical capabilities for each use case is then calculated for each product/service.
"Critical capabilities" are attributes that differentiate products/services in a class in terms of their quality and performance. Gartner recommends that
users consider the set of critical capabilities as some of the most important criteria for acquisition decisions.
In defining the product/service category for evaluation, the analyst first identifies the leading uses for the products/services in this market. What needs
are end-users looking to fulfill, when considering products/services in this market? Use cases should match common client deployment scenarios.
These distinct client scenarios define the Use Cases.
The analyst then identifies the critical capabilities. These capabilities are generalized groups of features commonly required by this class of
products/services. Each capability is assigned a level of importance in fulfilling that particular need; some sets of features are more important than
others, depending on the use case being evaluated.
Each vendor’s product or service is evaluated in terms of how well it delivers each capability, on a five-point scale. These ratings are displayed side-by-
side for all vendors, allowing easy comparisons between the different sets of features.
Ratings and summary scores range from 1.0 to 5.0:
1 = Poor: most or all defined requirements not achieved
2 = Fair: some requirements not achieved

3 = Good: meets requirements


4 = Excellent: meets or exceeds some requirements
5 = Outstanding: significantly exceeds requirements
To determine an overall score for each product in the use cases, the product ratings are multiplied by the weightings to come up with the product score
in use cases.
The critical capabilities Gartner has selected do not represent all capabilities for any product and, therefore, may not represent those most important for a
specific use situation or business objective. Clients should use a critical capabilities analysis as one of several sources of input about a product before
making a product/service decision.

© 2015 Gartner, Inc. and/or its affiliates. All rights reserved. Gartner is a registered trademark of Gartner, Inc. or its affiliates. This publication may not be reproduced or
distributed in any form without Gartner's prior written permission. If you are authorized to access this publication, your use of it is subject to the Usage Guidelines for
Gartner Services (/technology/about/policies/usage_guidelines.jsp) posted on gartner.com. The information contained in this publication has been obtained from
sources believed to be reliable. Gartner disclaims all warranties as to the accuracy, completeness or adequacy of such information and shall have no liability for errors,
omissions or inadequacies in such information. This publication consists of the opinions of Gartner's research organization and should not be construed as statements of
fact. The opinions expressed herein are subject to change without notice. Gartner provides information technology research and advisory services to a wide range of
technology consumers, manufacturers and sellers, and may have client relationships with, and derive revenues from, companies discussed herein. Although Gartner research
may include a discussion of related legal issues, Gartner does not provide legal advice or services and its research should not be construed or used as such. Gartner is a
public company, and its shareholders may include firms and funds that have financial interests in entities covered in Gartner research. Gartner's Board of Directors may
include senior managers of these firms or funds. Gartner research is produced independently by its research organization without input or influence from these firms, funds or
their managers. For further information on the independence and integrity of Gartner research, see "Guiding Principles on Independence and Objectivity.
(/technology/about/ombudsman/omb_guide2.jsp)"
