Licensed for individual use only

Data Virtualization Or Data Fabric: Which Is Right For You?
by Noel Yuhanna
March 24, 2020 | Updated: March 27, 2020

Why Read This Report

Data-driven organizations use data virtualization and data fabric architectures to get value from
data quickly and to support new business requirements such as real-time and integrated insights.
While many organizations use one or the other architecture, some use both to support their
digital business initiatives. This brief for enterprise architect professionals explains the
differences between the two architectures and illustrates what use cases are often deployed by
organizations.

Key Takeaways

Data Virtualization Offers A Fast And Easy Way To Integrate Data Silos
Data virtualization creates a data abstraction layer by connecting, gathering, and transforming
data silos to support real-time and near real-time insights. It gives you direct access to
transactional and operational systems in real time whether on-premises or cloud. You can also find
ways to obtain insights that aren’t practical with traditional batch-oriented ETL technology.

Data Fabric Delivers Comprehensive Data Management To Support Broader Use Cases
Data fabric focuses on broader business use cases such as customer 360, customer intelligence,
and IoT analytics. Data fabric supports many more stack components such as data catalog, data
preparation, data governance, and data modeling, delivering end-to-end data management
capabilities.

Leverage An Architecture That Fits Your Business Needs
Both architectures deliver agile, self-service, and real-time insights across data silos. If you’re
supporting simple data integration requirements, go with data virtualization and extend to data
fabric over time. If you’re dealing with complex data sets or need to support broader business use
cases such as customer 360, IoT analytics, or fraud detection, consider data fabric from the start.
For Enterprise Architecture Professionals

Data Virtualization Or Data Fabric: Which Is Right For You?

by Noel Yuhanna
with Gene Leganza, Daniel Weber, and Peggy Dostie
March 24, 2020 | Updated: March 27, 2020

Data Virtualization And Data Fabric Overcome New Data Challenges


Connecting data across on-premises and multiple cloud sources is often not trivial, especially when
supporting large data sets, complex data structures, and real-time needs. Poorly integrated business
data leads to poor business decisions, reduces customer satisfaction and competitive advantage,
and slows down innovation and growth. In addition, organizations want to democratize data to allow
business users to access and leverage all data themselves to support faster and more accurate
business decisions. As a result, organizations are now investing in self-service and real-time data
integration architectures to support trusted and timely insights. Two data integration architectures
in particular help overcome these data challenges:

›› Data virtualization that can quickly and easily integrate data from silos. Data virtualization
creates a data abstraction layer by connecting, gathering, and transforming data from various
sources to support real-time dashboards and insights (see Figure 1). This technology evolved
primarily because traditional batch-oriented ETL technology could not move data quickly
enough to support the growing demand for real-time analytics. With data virtualization, you
can directly access transactional and operational systems and perform real-time integration and
transformations to support BI and analytical needs. [1] The primary use cases for data virtualization
are real-time reporting, ad hoc queries, and search across disparate data sources.

›› Data fabric that can support end-to-end data management to enable more use cases.
Data fabric focuses on addressing broader business use cases such as customer 360, customer
intelligence, and IoT analytics. It includes many more components such as data catalog, data
transformation, data preparation, data discovery, data governance, and data modeling, thus
providing end-to-end data management capabilities (see Figure 2). [2] Forrester
often encounters configurations that include data virtualization as a data source to the data fabric,
enabling firms to leverage the best of both worlds (see Figure 3). Unlike data virtualization, data
fabric architecture is still evolving, largely because the stack components in the fabric need to work
in tandem to support the business use cases. Forrester finds that organizations often
evolve their information architectures by leveraging data virtualization initially and then transitioning
toward a data fabric architecture over time.
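
The abstraction-layer idea described above can be sketched in a few lines. This is a minimal illustration, not any vendor's implementation: two in-memory SQLite databases stand in for separate silos, and the names `crm`, `orders`, and `revenue_by_customer` are hypothetical. The point it shows is that the virtual view reads the sources at query time instead of copying them into a warehouse first.

```python
import sqlite3

# Two independent in-memory databases stand in for separate data silos
# (hypothetical examples: a CRM system and an order system).
crm = sqlite3.connect(":memory:")
crm.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
crm.executemany("INSERT INTO customers VALUES (?, ?)", [(1, "Acme"), (2, "Globex")])

orders = sqlite3.connect(":memory:")
orders.execute("CREATE TABLE orders (customer_id INTEGER, amount REAL)")
orders.executemany(
    "INSERT INTO orders VALUES (?, ?)", [(1, 120.0), (1, 80.0), (2, 50.0)]
)


def revenue_by_customer():
    """Virtual view: query both silos on demand and join them in the layer.

    No data is persisted in between, so each call reflects the sources'
    current state.
    """
    names = dict(crm.execute("SELECT id, name FROM customers"))
    totals = {}
    for customer_id, amount in orders.execute(
        "SELECT customer_id, amount FROM orders"
    ):
        name = names.get(customer_id, "unknown")
        totals[name] = totals.get(name, 0.0) + amount
    return totals


print(revenue_by_customer())
# A new transaction in the order system is visible on the very next query.
orders.execute("INSERT INTO orders VALUES (2, 25.0)")
print(revenue_by_customer())
```

A batch ETL job would show the new order only after its next scheduled load; here the second call already includes it, which is the real-time property the report attributes to data virtualization.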

Forrester Research, Inc., 60 Acorn Park Drive, Cambridge, MA 02140 USA


+1 617-613-6000 | Fax: +1 617-613-5000 | forrester.com
© 2020 Forrester Research, Inc. Opinions reflect judgment at the time and are subject to change. Forrester®,
Technographics®, Forrester Wave, TechRadar, and Total Economic Impact are trademarks of Forrester Research,
Inc. All other trademarks are the property of their respective companies. Unauthorized copying or distributing
is a violation of copyright law. Citations@forrester.com or +1 866-367-7378

FIGURE 1 Data Virtualization Architecture

Layers shown, from top to bottom:
›› Consumers: dashboards, BI, reporting apps
›› Interface/protocol: ODBC/JDBC/REST/API
›› Data virtualization: cache; metadata; data transformation and data integration; data security,
governance, and monitoring
›› Connectors, adapters
›› Data sources (on-premises, cloud)
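
Figure 1 places a cache inside the data virtualization layer. The following is a minimal sketch of how such a cache might trade data freshness for reduced load on the source, assuming a simple time-to-live policy. The report does not specify a caching policy, and `CachedView`, `query_source`, and the 30-second TTL are hypothetical names and values chosen for illustration.

```python
import time


class CachedView:
    """Serve a source query from memory until the entry is older than the TTL."""

    def __init__(self, fetch, ttl_seconds, clock=time.monotonic):
        self._fetch = fetch        # callable that queries the underlying source
        self._ttl = ttl_seconds
        self._clock = clock        # injectable so the behavior is easy to test
        self._value = None
        self._loaded_at = None     # None means nothing has been cached yet

    def get(self):
        now = self._clock()
        if self._loaded_at is None or now - self._loaded_at >= self._ttl:
            # Cache miss or stale entry: go back to the source.
            self._value = self._fetch()
            self._loaded_at = now
        return self._value


source_calls = []


def query_source():
    """Hypothetical stand-in for a query against an operational system."""
    source_calls.append(1)
    return {"open_orders": 42}


fake_now = [0.0]  # controllable clock, in seconds
view = CachedView(query_source, ttl_seconds=30, clock=lambda: fake_now[0])

view.get()          # first call reaches the source
view.get()          # second call is served from the cache
fake_now[0] = 31.0
view.get()          # entry is older than the TTL, so the source is queried again
print(len(source_calls))
```

A shorter TTL keeps results closer to real time at the cost of more source queries; a longer one does the opposite.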


FIGURE 2 Data Fabric Architecture

Layers shown, from top to bottom (each layer is marked AI/ML):
›› Global data access: global distributed platform, in-memory, embedded, self-service, APIs
›› Data discovery: data modeling, preparation, curation, graph engine
›› Data orchestration: transformation, integration, cleansing
›› Data processing/persistence: data platform (Hadoop, NoSQL, Spark), data lake processing, EDW/BDW
›› Data ingestion/streaming: ingestion, streaming, data movement
›› Data sources: cloud, on-premises

Data management spans all layers: metadata/catalog, data security, data governance, data
processing, data quality, data lineage, and policies.


FIGURE 3 Data Virtualization Vs. Data Fabric

›› DF = Enterprise data management. Data fabric connects, gathers, transforms, prepares, and
curates data sets to support broader use cases. Components shown: distributed in-memory,
catalog, AI/ML, data pipeline, data lake/object store (repositories), and push-down processing.
›› DV = Data abstraction layer. Data virtualization connects, gathers, and transforms data.
›› Sources shown beneath the data fabric: data lake, object store, and data virtualization.


Choose Between Data Virtualization And Data Fabric, Depending On The Use Case

While data virtualization and data fabric provide some similar benefits, the technologies — and the effort
needed to implement them — are quite different. To choose between them, consider these key factors:

›› Time-to-value. Organizations often find data virtualization to be the fastest way to integrate
disparate data sources, whether on-premises or cloud. Data virtualization offers many connectors to
various data sources and can transform and curate data for visualization, reports, and dashboards.
Success with data fabric, on the other hand, requires more elaborate planning; a team composed
of enterprise architects, data architects, developers, data security professionals, and business
analysts; and an initial target business use case.

›› Availability of skilled data architects and data engineers. Today, most large and complex data
fabric deployments are done by SI/consulting organizations, primarily because several components
within a fabric need to work together to deliver an outcome. However, with intelligent and highly
automated data fabrics emerging, the need for consulting services will decline in the coming years. [3]


›› Target use cases and the target future state architecture. The key differences between data
virtualization and data fabric lie largely in the use cases and in the end-to-end data management
capabilities that the fabric provides (see Figure 4). Data virtualization can serve as a data source
to a data fabric, enabling organizations to extend their platforms easily. Prioritize your architecture
build-out based on business need. For example, Forrester recommends leveraging data
virtualization in one of the “enhance/harden/re-architect” BI governance process steps as either a
target architecture or a transition point to a data fabric-based architecture. [4]

›› Investment needed to get started. Data virtualization is a low-cost integration solution that
does not require huge investments, especially when the need is simple, such as federating and
transforming a few data sources to deliver real-time or near real-time insights. Organizations often
find that data fabric takes significantly longer to deliver ROI than data virtualization.
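
The selection factors above can be condensed into a rough rule of thumb. The sketch below is an illustrative simplification rather than a Forrester scoring method: the `Requirements` fields and the `recommend` function are hypothetical, and the rules only encode statements made in this report (broader use cases, catalog or data quality needs, and non-structured data point toward data fabric; otherwise start with data virtualization).

```python
from dataclasses import dataclass

# Broader business use cases that the report associates with data fabric.
BROAD_USE_CASES = {"customer 360", "iot analytics", "fraud detection"}


@dataclass
class Requirements:
    """Hypothetical summary of what a team needs from the platform."""

    use_cases: set
    needs_catalog_or_quality: bool = False   # data catalog or data cleansing
    mostly_structured_data: bool = True


def recommend(req: Requirements) -> str:
    """Return a starting architecture under the simplified rules above."""
    broad = any(use_case.lower() in BROAD_USE_CASES for use_case in req.use_cases)
    if broad or req.needs_catalog_or_quality or not req.mostly_structured_data:
        return "data fabric"
    # Simple integration needs: start here and extend to data fabric over time.
    return "data virtualization"


print(recommend(Requirements({"real-time reporting", "ad hoc queries"})))
print(recommend(Requirements({"Customer 360"})))
```

A real decision would also weigh time-to-value, available skills, and budget, which this sketch deliberately leaves out.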


FIGURE 4 Comparing EDV And EDF Functionality

| Business functionality | Description | Data virtualization | Data fabric |
|---|---|---|---|
| Use cases | Top use cases deployed | Reporting, ad hoc queries across distributed data, visualization, and BI | Customer 360, IoT analytics, fraud detection, global analytics, real-time analytics, and data science |
| Time-to-value | The time it takes to deploy a simple use case from start to finish | 1-4 weeks for simple use case | 1-3 months for simple use case |
| Self-service for business users | Leveraging the platform by business users to support data and analytics | Yes | Yes |
| Initial cost | Initial investment to support a simple use case | Typically 50K-100K | Typically 200K-400K |
| Vendors (sample) | Vendors that offer the capabilities | Data Virtuality, Denodo, Hitachi, IBM, Lucidworks, Molecula, Oracle, SAP, Teradata, Tibco, Vantara | Cambridge Semantics, Cloudera, DataRobot, Denodo, Hitachi, IBM, Informatica, Infoworks, Oracle, Qlik, SAP, Solix, Talend, Tibco, Vantara |
| SI/consulting (sample) | SI/consulting vendors that support the deployments | Accenture, Bearing Point, Capgemini, Cognizant, Deloitte, IBM | Accenture, Capgemini, Cognizant, Deloitte, HCL, IBM, Infosys, TCS, Wipro |


FIGURE 4 Comparing EDV And EDF Functionality (Cont.)

| Development functionality | Description | Data virtualization | Data fabric |
|---|---|---|---|
| Data pipeline (integrated) | The platform supports data pipeline for ingesting data and also performs transformations | Limited | Yes |
| Data modeling | Data modeling to model business data, and prebuilt data models | Limited | Yes |
| Data catalog | Support for data catalog within the platform | No | Yes |
| AI/ML capabilities (development) | Built-in AI/ML capabilities to support development | No | Yes |
| Data type | Type of data supported such as structured, unstructured, or semi-structured | Mostly structured data | All kinds of data types |
| Data management capabilities | The level of data management capabilities built into the platform | Basic | Comprehensive |
| Data quality | Whether or not data quality including data cleansing is supported within the platform | No | Yes |
| Data connectivity/ingestion | The level of data connectivity/ingestion capabilities provided | Extensive | Extensive |
| Data transformation | The nature of data transformation capabilities built into the platform | Basic | Comprehensive |
| Data integration | Support of data integration capabilities within the platform | Yes | Yes |
| API access | Offers the ability to access data using various application programming interfaces | Limited | Comprehensive |


FIGURE 4 Comparing EDV And EDF Functionality (Cont.)

| Deployment functionality | Description | Data virtualization | Data fabric |
|---|---|---|---|
| Push-down processing | The platform provides the ability to push down data processing to various platforms such as data lakes, object stores, EDW, etc. | No | Yes |
| Caching/in-memory | The platform has the capability to support distributed cache/in-memory to deliver low-latency access to critical data | Optional | Extensive |
| Data security/governance | Supports data security/governance within the platform and can be enabled through policies | Yes | Yes |
| Data persistence | Can persist data from various sources when needed | Limited | Yes |
| AI/ML deployment | Built-in AI/ML capabilities to support deployment | No | Yes |
| Geodistributed | Supports geodistributed architecture out of the box | No | Yes |
| Data preparation | Offers data preparation capabilities | No | Yes |
| Data access | Has data access capabilities from various industry standard protocols such as ODBC, JDBC, and others | Yes | Yes |
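
The push-down processing row in Figure 4 refers to running work where the data lives instead of pulling raw rows into the integration layer. The sketch below illustrates the general technique under stated assumptions: an in-memory SQLite database stands in for a data lake or EDW, and the table and function names are hypothetical. It does not represent how any listed vendor implements push-down.

```python
import sqlite3

# In-memory database standing in for a remote platform such as an EDW.
source = sqlite3.connect(":memory:")
source.execute("CREATE TABLE events (region TEXT, amount REAL)")
source.executemany(
    "INSERT INTO events VALUES (?, ?)",
    [("EU", 10.0), ("EU", 15.0), ("US", 7.0), ("US", 3.0), ("APAC", 4.0)],
)


def total_without_pushdown(region):
    """Pull every row, then filter and sum in the integration layer."""
    rows = source.execute("SELECT region, amount FROM events").fetchall()
    total = sum(amount for row_region, amount in rows if row_region == region)
    return total, len(rows)   # second value: rows transferred from the source


def total_with_pushdown(region):
    """Ship the filter and aggregation to the source; one row comes back."""
    rows = source.execute(
        "SELECT COALESCE(SUM(amount), 0.0) FROM events WHERE region = ?",
        (region,),
    ).fetchall()
    return rows[0][0], len(rows)


print(total_without_pushdown("EU"))
print(total_with_pushdown("EU"))
```

Both functions return the same total, but the push-down version transfers a single aggregated row instead of the whole table, which is where the benefit comes from on large data sets.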

Endnotes

[1] Data virtualization enables real-time integration of disparate data sources. See the Forrester
report “Create A Roadmap For A Real-Time, Agile, Self-Service Data Platform.”

[2] Data fabric is the orchestration of disparate data sources intelligently and securely in a
self-service manner. See the Forrester report “Big Data Fabric 2.0 Drives Data Democratization.”

[3] With AI/machine learning (ML) functionality, big data fabric enables a higher degree of automation
to support advanced data intelligence for new insights and simplified data sharing across users. Data
fabric also learns from data and automatically identifies data patterns and connected data to support
more adaptive intelligence, driving accelerated actionable insights as well as automatic
recommendations and alerts. See the Forrester report “Big Data Fabric 2.0 Drives Data
Democratization.”

[4] See the Forrester report “Divide (BI Governance From Data Governance) And Conquer.”
