

Hype Cycle for Storage and Data Protection Technologies, 2019

Published 11 July 2019 - ID G00389644 - 75 min read

By Analyst Julia Palmer

This Hype Cycle evaluates storage and data protection technologies in terms of their business impact,
adoption rate and maturity level to help IT leaders build more stable, scalable, efficient and agile storage
and data protection platforms for digital business initiatives.

Analysis

What You Need to Know

The storage and data protection market is evolving to address new challenges in enterprise IT, such as
exponential data growth, changing skills demands, rapid digitalization and globalization of business,
requirements to connect and collect everything, and the expansion of data privacy and sovereignty laws.
Requirements for robust, scalable, simple and performant storage are on the rise. IT leaders also
expect storage to evolve from rigid appliances delivered in core data centers to flexible
storage platforms capable of enabling hybrid cloud data flow at the edge and in public cloud IaaS.

Here, Gartner has assessed 23 of the most relevant storage and data protection technologies that IT
leaders must evaluate to address the fast-evolving needs of the enterprise. For more information about
how peer infrastructure and operations (I&O) leaders view the technologies aligned with this Hype
Cycle, please see “2019-2021 Emerging Technology Roadmap for Large Enterprises.”

The Hype Cycle

IT leaders responsible for storage and data protection increasingly must cope with the fast-changing
requirements of digital business, exponential growth of storage capacity, introduction of new
workloads, and the desire to leverage public cloud and enable edge capabilities. This research informs
I&O leaders and infrastructure technology vendors about new and innovative storage technologies that
are entering the market, and shows how Gartner evaluates highly hyped technologies or concepts and
how quickly enterprises are adopting innovative technologies.

Half of the technologies reviewed in the 2019 Hype Cycle are poised to mature within five to 10
years, while 60% of the technologies have the potential to deliver high benefits if driven by genuine business
requirements. To provide readers with clearer, more focused research that supports their analysis and
planning, this year we have reduced the number of innovation profiles. We have only included the ones
that are most relevant to IT leaders today, as well as those having a strong link to the storage and data
protection Hype Cycle and its theme.
There are three new innovation profiles that have been added in 2019: storage class memory,
immutable data vault, and data transfer and storage edge appliances. While very different in their value
proposition, these technologies reflect IT leaders’ priorities to take advantage of new flash technologies,
improve and modernize data protection, and leverage public cloud IaaS while enabling hybrid cloud
workflow.

Fast-moving technologies this year include erasure coding, distributed file systems, hyperconvergence
and infrastructure SDS, which all continue to show healthy adoption rates, driven largely by a desire to
leverage storage software innovation to enable scalable, yet resilient, storage infrastructure based on
industry-standard hardware.

Figure 1. Hype Cycle for Storage and Data Protection Technologies, 2019


The Priority Matrix

The Priority Matrix maps the benefit rating for each technology against the length of time before
Gartner expects it to reach the beginning of mainstream adoption. This alternative perspective can help
users determine how to prioritize their storage hardware, software and data protection technology
investments and adoption. In general, companies should begin with fast-moving technologies that are
rated transformational or high in business benefits and are likely to reach mainstream adoption quickly.
These technologies tend to have the most dramatic impact on business processes, revenue or cost-
cutting efforts. After these transformational technologies, users are advised to evaluate high-impact
technologies that will reach mainstream adoption status in the near term, and work downward and to
the right from there.

Organizations that have not already done so should evaluate and implement
continuous data protection and virtual machine backup and recovery to drive improved resiliency and
data protection efficiency. They should also consider implementing distributed file systems and
object storage to address the growing needs of unstructured data. Hyperconverged storage is increasing
in popularity, experiencing year-over-year growth while replacing storage arrays for enterprises looking
to improve simplicity of management and streamline implementation in the data center and at the
edge.

Figure 2. Priority Matrix for Storage and Data Protection Technologies, 2019


Off the Hype Cycle

The online data compression and data deduplication innovation profiles have been removed because
they were Obsolete Before Plateau.

The following profiles have been removed this year because they are not directly related to the 2019
enterprise storage and data protection theme:
Solid-state DIMMs

Hybrid DIMMs

Disaster recovery as a service (DRaaS)

Data classification

Data sanitization

Information dispersal algorithms

The following profiles have changed as stated:

Data backup for mobile devices was consolidated with enterprise endpoint backup.

SaaS archiving of messaging data was consolidated with enterprise information archiving.

Hyperconverged integrated systems and hyperconverged infrastructure were consolidated as
hyperconvergence.

Emerging data storage protection schemes was renamed erasure coding.

On the Rise

Data Transfer and Storage Edge Appliances

Analysis By: Raj Bala; Julia Palmer; Santhosh Rao

Definition: Data transfer and edge appliances are physical devices capable of transporting bulk data to
public cloud IaaS providers via package carriers rather than relying solely on a network to transfer data.
Such appliances are often designed with ruggedized cases so that they are self-contained shipping units.
The appliances can optionally be equipped with compute capacity in order to preprocess data before it is
transported to the cloud.

Position and Adoption Speed Justification: Data transfer and edge appliances are emerging as an
efficient means of transporting large quantities of data when no network exists or network conditions
are less than ideal. They are playing an important role in enabling data transfer from data centers or
edge locations to public cloud IaaS for processing and analytics. Gartner clients are evaluating ways to
enable continuous collection of data, centralized in the cloud at significant scale, for which network-based
transfer is simply too limited. As data continues to grow at a rapid pace, Gartner predicts that, within
four years, 50% of enterprise-generated data will be created and processed at the edge. Data transfer
and edge appliances will therefore play critical roles in enabling customers to store, process and transfer
data going forward.

User Advice: Enterprises are increasingly interested in using public cloud IaaS for an ever-expanding set
of workloads, but find migrating such workloads and the data they require to be a challenge. As a
general guide, moving roughly 100 TB of data within a 24-hour period requires a sustained 10 Gbps
network link. Such large network links may not be feasible, depending on the amount of new data being
generated per day.
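
The arithmetic behind that guideline is straightforward; below is a minimal sketch (in Python, with illustrative figures) of the sustained bandwidth needed to move a given volume of data in a given window.

```python
# Back-of-the-envelope sizing for bulk data movement over a network.
# Figures are illustrative; real transfers also lose time to protocol overhead,
# retries and contention, so usable throughput is lower than line rate.

def required_gbps(terabytes: float, hours: float) -> float:
    """Sustained link speed (Gbps) needed to move `terabytes` in `hours`."""
    gigabits = terabytes * 1000 * 8          # decimal TB -> Gb
    return gigabits / (hours * 3600)

print(round(required_gbps(100, 24), 1))      # ~9.3 Gbps -> plan for a 10 Gbps link
print(round(required_gbps(1000, 24), 1))     # ~92.6 Gbps -> a shipped appliance wins
```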

The physical movement of data is merely part of the challenge. Planning the procedure and preparing
the data take more effort and often more time than the shipment itself. Start planning these data
shipments well in advance of the date on which you need to ship the appliance.

Business Impact: Getting data to the public cloud can be challenging due to network bottlenecks. There
are distinct advantages to shipping data using transfer appliances when the data is unwieldy and the
network bandwidth is constrained. Enterprise backups, for example, can be seeded at the public cloud
target such that only incremental backups need to be sent to the cloud. Data can also be collected in
low-connectivity or disconnected environments and then processed using public cloud services.

Benefit Rating: Moderate

Market Penetration: 1% to 5% of target audience

Maturity: Emerging

Sample Vendors: Amazon Web Services; Google; IBM; Microsoft; Oracle

Container-Native Storage

Analysis By: Julia Palmer; Arun Chandrasekaran

Definition: Container-native storage (CNS) is specifically designed to support container workloads and
focuses on addressing the unique scale and performance demands of cloud-native applications while
providing deep integration with container orchestration systems. CNS is designed to align with
microservices architecture principles and adhere to the requirements of container-native data services,
such as being hardware-agnostic, API-driven and based on a distributed architecture.

Position and Adoption Speed Justification: As a result of growing interest around containers, many
container-based applications in enterprise production environments now require support for data
persistence. In order to address this demand, vendors are now delivering container-native storage
solutions and integrated systems platforms that have been specifically designed to run cloud-native
applications. The majority of those solutions are deployed as a software-only product on commodity
hardware, while some are being offered as an integrated appliance (compute and storage bundled
together). While the technology for data persistence is not yet standardized, the common foundation is
typically based on a distributed, software-defined single pool of storage, where application containers
and persistent storage services are running on the same platform, similar to a hyperconverged solution.
In addition, the entire stack is most often orchestrated with Kubernetes or, alternatively, Apache Mesos
or Swarm, to create container life cycle integration and enable self-service operations for developers.
Some of these solutions can be deployed on a wide choice of commodity hardware on-premises and as
software in the public cloud. By running on top of highly available and robust container platforms,
container-native storage frees the DevOps team from managing hardware components. However, IT
leaders need to exercise caution, as the technology is constantly evolving and many vendors in this space
are early-stage startups.
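
As an illustration of the container life cycle integration described above, the sketch below uses the Kubernetes Python client to request persistent storage from a CNS back end through a StorageClass. The class name and size are hypothetical; each CNS product exposes its own classes and parameters.

```python
# A hedged sketch: an application team self-services persistent storage by creating
# a PersistentVolumeClaim; the CNS layer (via its StorageClass/CSI driver) provisions
# the volume. "cns-replicated" is a hypothetical StorageClass name.
from kubernetes import client, config

config.load_kube_config()                      # or load_incluster_config() inside a pod
core = client.CoreV1Api()

pvc = client.V1PersistentVolumeClaim(
    metadata=client.V1ObjectMeta(name="app-data"),
    spec=client.V1PersistentVolumeClaimSpec(
        access_modes=["ReadWriteOnce"],
        storage_class_name="cns-replicated",   # hypothetical CNS-backed class
        resources=client.V1ResourceRequirements(requests={"storage": "20Gi"}),
    ),
)
core.create_namespaced_persistent_volume_claim(namespace="default", body=pvc)
```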

User Advice: The adoption of containers is rapidly growing in the enterprise, especially for new
applications, and is starting to expand beyond the initial stateless use cases. At this stage, end users
should evaluate container-native storage systems for cloud-native applications. Container-native storage
is designed for containers and can employ data services with a level of granularity not typically possible
with a solution that is not purpose-built for these workloads. The projects that require such solutions are
usually intended to scale quickly and run at massive scale in production, and they must leverage systems
designed to handle the agility, portability, developer workflow integration and scale involved.

While traditional storage array solutions have Docker plug-ins and are suitable for supporting
monolithic applications in containers, they fall short of true integration with orchestration systems and
large-scale container deployments.

Business Impact: Containers can deliver agility in the end-to-end life cycle of deploying applications.
Containers and related DevOps tooling can streamline the process of creating, testing, deploying and
scaling applications. But applying traditional approaches to storage for an otherwise streamlined
container infrastructure can be a bottleneck to agility. Container-native storage aims to eliminate the
bottlenecks to achieving agility in the end-to-end process of building and deploying applications.

Benefit Rating: High

Market Penetration: 1% to 5% of target audience

Maturity: Emerging

Sample Vendors: Diamanti; Hedvig; OpenEBS; Portworx; Red Hat; Robin.io; StorageOS

Recommended Reading: “An I&O Leader’s Guide to Storage for Containerized Workloads”

“The Future of Software-Defined Storage in Data Center, Edge and Hybrid Cloud”

“How to Select a Storage Approach for Persistent Containers”

Hybrid Cloud Storage

Analysis By: Raj Bala; Julia Palmer

Definition: Hybrid cloud storage encompasses a number of deployment patterns with varying underlying
technologies. It can take the form of purpose-built hybrid cloud storage appliances, software-defined
storage, broader storage systems with hybrid cloud features or the use of storage technologies from
within colocation facilities connected by private network link to cloud service providers. The common
thread among the varying patterns is the notion of a seamless bridge between disparate data centers
and public cloud storage services.
Position and Adoption Speed Justification: The term “hybrid cloud storage” was first used in 2009 by
vendors in the cloud storage gateway segment to describe their nascent offerings. Those early hybrid
cloud products treated public cloud storage as an archive tier for infrequently used, low-value data. But
the current market for hybrid cloud storage has moved well past the early products in the cloud storage
gateway market. Hybrid cloud storage is now used for modern workloads that transform data using the
elasticity that public cloud compute provides. These workloads typically start off as large, bulky datasets
that require transformation to a smaller result. Examples include videos and a broad range of analytics-
oriented data. In the case of videos, artifacts of a video are collected over time and then rendered into a
final result using the compute capabilities of public cloud IaaS providers.
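
For the tactical pattern described above (tiering cold data to public cloud object storage), a minimal sketch using boto3 is shown below. The bucket name and storage class choice are illustrative; purpose-built hybrid cloud storage products handle this movement, caching and metadata transparently.

```python
# A minimal sketch of cloud tiering: push a cold file to object storage and recall it
# on demand. Bucket and key names are illustrative.
import boto3

s3 = boto3.client("s3")
BUCKET = "example-hybrid-tier"   # hypothetical bucket

def tier_to_cloud(local_path: str, key: str) -> None:
    # Send an infrequently accessed file to a lower-cost storage class.
    s3.upload_file(local_path, BUCKET, key, ExtraArgs={"StorageClass": "STANDARD_IA"})

def recall_from_cloud(key: str, local_path: str) -> None:
    # Bring the data back when an on-premises workload needs it again.
    s3.download_file(BUCKET, key, local_path)
```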

User Advice: Evaluate vendors of hybrid cloud storage across two imperatives: tactical and strategic
uses. The tactical approach includes uses such as tiering data to the cloud. The strategic approach
includes using public cloud compute services to transform data into usable results. Most vendors
focused on tactical use cases are unable to provide the strategic, transformational capabilities that are
emerging in the market.

Business Impact: Tactical uses of hybrid cloud storage have been available for nearly a decade. These
solutions are often designed such that data is not easily readable in the public cloud due to the opaque
storage formats used by vendors. As a result, these methods limit the full breadth of functionality that
can be unlocked in the cloud.

The strategic uses of hybrid cloud storage are often developed with modern approaches in mind. As
such, vendors have taken care to ensure that data can not only be read in the public cloud, but also
modified and synchronized back to its source. This end-to-end capability requires that providers of
hybrid cloud storage solutions integrate deeply with the cloud service provider in a manner that far
exceeds the functionality required to simply tier to the cloud.

Benefit Rating: High

Market Penetration: 1% to 5% of target audience

Maturity: Embryonic

Sample Vendors: Amazon Web Services; Microsoft; Nasuni; NetApp; Panzura; Peer Software; Qumulo;
SwiftStack

Recommended Reading: “Top Five Approaches to Hybrid Cloud Storage — An Analysis of Use Cases,
Benefits and Limitations”

“Magic Quadrant for Public Cloud Storage Services, Worldwide”

Immutable Data Vault

Analysis By: Mark Jaggers; Michael Hoeck


Definition: Immutable data vaults are meant to be used by infrastructure and operations (I&O) recovery
teams to rebuild operational systems after data destruction events, often associated with malware,
ransomware or malicious insider attacks. They typically consist of data storage environments that have
immutable properties and are used in conjunction with isolated recovery environments to cleanse data
as it is restored.

Position and Adoption Speed Justification: While many financial services organizations have been rapidly
adopting immutable data vaults and focusing on building out cyber resilience capabilities, the broader
market remains largely unaware of these offerings. As the threat and impact of data destruction events
(malware, ransomware, malicious insider activity) grow, awareness and adoption of these capabilities
will broaden.

User Advice: The idea of having a separate copy of important data on a media type resistant to change
and placed in a secure location can be traced back to the first tape cartridges stored in an off-site
document vault. Even though organizations have moved away from using tape as a deep archive, the
need for secured, immutable storage has remained, although it is important to note that immutable
data vault capabilities are independent of the underlying technology and media.

Immutable data vaults are storage environments or products, but they are not complete cyber resilience
capabilities unto themselves. They also need complementary isolated recovery capabilities to scan,
cleanse and repair data in order to be fully functional for operational recovery. Since immutable data
vaults are not the first line of defense for disaster recovery, a fast recovery speed may not be necessary.
However, close proximity to recovery compute capacity is needed to help restart overall operations.
This proximity of compute capabilities differentiates immutable data vaults from data bunkers, which
did not require compute capabilities for their function.

Air-gapping of storage environments is one method of isolation, but any requirements for it need to
be evaluated. Correctly securing the storage device and any recovery catalogs or software configuration
files is also important. Be mindful that the data stored within an immutable data vault may also
contain the agent or infectious code; the real purpose of the isolated and immutable design is to
keep that data from being corrupted or deleted. Only scanning, cleansing and repairing that data
will prevent the reinfection of other systems during the recovery and restoration process.
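
One concrete way to obtain WORM-style immutability for vaulted copies is object storage with retention locks. The sketch below uses Amazon S3 Object Lock in compliance mode purely as an example; the bucket, key and retention window are illustrative, the bucket must have been created with Object Lock enabled, and vendor vault products provide equivalent controls through their own interfaces.

```python
# A hedged illustration of writing an immutable copy: S3 Object Lock in COMPLIANCE
# mode prevents the object version from being overwritten or deleted until the
# retention date. Names and the 90-day window are illustrative.
from datetime import datetime, timedelta, timezone
import boto3

s3 = boto3.client("s3")
retain_until = datetime.now(timezone.utc) + timedelta(days=90)

with open("backup-catalog.db", "rb") as f:
    s3.put_object(
        Bucket="example-immutable-vault",      # hypothetical vault bucket
        Key="catalogs/backup-catalog.db",
        Body=f,
        ObjectLockMode="COMPLIANCE",           # retention cannot be shortened or removed
        ObjectLockRetainUntilDate=retain_until,
    )
```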

Business Impact: The business impact is low today. The actual impacts will vary based on regulatory
changes or audit findings for industries, the desire of individual organizations to protect their critical
data from potential destruction, and the increasing impact of malware, ransomware, and malicious
activities.

Benefit Rating: Low

Market Penetration: 1% to 5% of target audience

Maturity: Emerging
Sample Vendors: Continuity Software; Dell EMC; IBM (Business Resiliency Services); Iron Mountain;
Sheltered Harbor; Sungard Availability Services

Recommended Reading: “Market Guide for IT Resilience Orchestration”

“Critical Capabilities for Disaster Recovery as a Service”

“Magic Quadrant for Disaster Recovery as a Service”

Management Software-Defined Storage

Analysis By: Julia Palmer; John McArthur

Definition: Management software-defined storage (MSDS) coordinates the delivery of storage services
to enable greater storage agility. It can be deployed as an out-of-band technology with robust policy
management, I/O optimization and automation functions to configure, manage and provision other
storage resources. Products in the management SDS category enable abstraction, mobility,
virtualization, storage resource management (SRM) and I/O optimization of storage resources to reduce
expenses and enable portability.

Position and Adoption Speed Justification: While MSDS, for the most part, remains a vision, it could
revolutionize storage architectural approaches and storage consumption models over time. The concept
of abstracting and separating physical or virtual storage services by bifurcating the storage control plane
(action signals) from the data plane (how data actually flows) is foundational to SDS. This is
achieved largely through programmable interfaces (such as APIs), which are still evolving. MSDS
requests will negotiate capabilities through software that, in turn, will translate those capabilities into
storage services that meet a defined policy or SLA. Storage virtualization abstracts storage resources,
which is also foundational to MSDS, whereas the concepts of policy-based automation and orchestration
— possibly triggered and managed by applications and hypervisors — are key differentiators between
simple virtualization and MSDS.
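
To make the control-plane idea concrete, the toy sketch below translates a named policy (SLA) into a placement decision across heterogeneous back ends. Every name in it is hypothetical; real MSDS products expose this logic through their own policy engines and APIs.

```python
# A toy control-plane sketch: a policy tier resolves to media, replica count and
# snapshot cadence on a hypothetical set of back ends. No data-plane I/O happens here.
POLICIES = {
    "gold":   {"media": "nvme",   "replicas": 3, "snapshot_hours": 1},
    "silver": {"media": "ssd",    "replicas": 2, "snapshot_hours": 4},
    "bronze": {"media": "object", "replicas": 1, "snapshot_hours": 24},
}
BACKENDS = {"nvme": "array-a", "ssd": "array-b", "object": "cloud-bucket-c"}

def provision(volume: str, size_gib: int, policy: str) -> dict:
    """Resolve a policy name to a concrete provisioning request."""
    sla = POLICIES[policy]
    return {
        "volume": volume,
        "size_gib": size_gib,
        "backend": BACKENDS[sla["media"]],
        "replicas": sla["replicas"],
        "snapshot_interval_h": sla["snapshot_hours"],
    }

print(provision("erp-db", 500, "gold"))
```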

User Advice: MSDS targets end-user use cases where the ultimate goal is to improve or extend existing
storage capabilities and improve operating expenditure (opex). However, value propositions and leading
use cases of MSDS are not clear, as the technology itself is fragmented by many subcategories. When
looking at different products, identify and focus on use cases applicable to your enterprise, and
investigate each product for its capabilities.

Implement proofs of concept (POCs) to determine a product’s suitability for broader deployments.

The top reasons for interest in MSDS, as gathered from interactions with Gartner clients, include:

Improving the management and agility of the overall storage infrastructure through better
programmability, interoperability, automation and orchestration

Hybrid cloud enablement

Storage virtualization and abstraction


Performance improvement by optimizing and aggregating storage I/O

Better linkage of storage to the rest of IT and the software-defined data center

Opex reductions via reducing the demands of administrators

Capital expenditure (capex) reductions via more efficient utilization of existing storage systems

Despite the promise of SDS, there are potential problems with some storage point solutions that have
been rebranded as SDS to present a higher value proposition versus built-in storage features; such
products need to be carefully examined for their ROI benefits.

Business Impact: MSDS’s ultimate value is to provide broad capability in the policy management and
orchestration of many storage resources. While some management SDS products are focusing on
enabling provisioning and automation of storage resources, more-comprehensive solutions feature
robust utilization and management of heterogeneous storage services, allowing mobility between
different types of storage platforms on-premises, at the edge and in the public cloud. As a subset of
MSDS, I/O optimization products can reduce storage response times, improve storage resource
utilization and control costs by deferring major infrastructure upgrades. The benefits of MSDS are in
improved operational efficiency by unifying storage management practices and providing common
layers across different storage technologies. The operational ROI of management SDS will depend on IT
leaders’ ability to quantify the impact of improved ongoing data management, increased operational
excellence and reduction of opex.

Benefit Rating: Moderate

Market Penetration: 1% to 5% of target audience

Maturity: Emerging

Sample Vendors: DataCore Software; Dell EMC; FalconStor; HammerSpace; HubStor; IBM; Komprise;
Leonovus; Nodeum; Peer Software

Recommended Reading: “The Future of Software-Defined Storage in Data Center, Edge and Hybrid
Cloud”

“Top Five Approaches to Hybrid Cloud Storage — An Analysis of Use Cases, Benefits and Limitations”

“Competitive Landscape: Infrastructure Software-Defined Storage”

NVMe and NVMe-oF

Analysis By: Julia Palmer; Joseph Unsworth

Definition: Nonvolatile memory express (NVMe) and nonvolatile memory express over fabrics (NVMe-
oF) are host controller and network protocols that take advantage of the parallel-access and low-
latency features of solid-state storage and the PCIe bus. NVMe-oF extends access to nonvolatile memory
(NVM) remote storage subsystems across a network. The specification defines a common protocol
interface and is designed to work with high-performance fabric technologies, including Fibre Channel,
RDMA (InfiniBand, RoCEv2, iWARP) and TCP.

Position and Adoption Speed Justification: NVMe is a storage protocol that is being used internally
within solid-state arrays and servers. It takes advantage of the latest nonvolatile memory to address the
needs of extreme-low-latency workloads. However, NVMe-oF, which requires a storage network, is still
emerging and developing at different rates depending on the network encapsulation method. Today,
many NVMe-oF offerings that use fifth generation and/or sixth generation Fibre Channel (FC-NVMe) are
available, but adoption of NVMe-oF within 25/40/50/100 Gigabit Ethernet is slower. In November 2018,
the NVMe standards body ratified NVMe/TCP as a new transport mechanism. In the future, it’s likely
that TCP/IP will evolve to be an important data center transport for NVMe. NVMe technology is fast-
evolving and replacing server-side flash used to accelerate workloads at the compute layer; but servers
have limited capacity and are managed as a silo on a server-per-server basis. NVMe technology becomes
increasingly important for next generation storage class memory (SCM) that has superior performance
attributes compared to flash. The NVMe-oF protocol can take advantage of high-speed RDMA networks
and will accelerate the adoption of next-generation storage architectures, such as disaggregated
compute, scale-out software-defined storage and hyperconverged infrastructures, bringing HPC-like
performance to the mainstream enterprise. Unlike server-attached flash storage, shared accelerated
NVMe and NVMe-oF can scale out to high capacity with high-availability features and be managed from
a central location, serving dozens of compute clients. Most vendors have already debuted at least one
NVMe-oF capable product with nearly all vendors expected to do so during 2020. Users should expect to
see diverse NVMe-based solid-state array, software-defined storage and hyperconverged integrated
system product offerings during the next two to three years.
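
As a small illustration of how a host attaches to an NVMe-oF target over the ratified TCP transport, the sketch below wraps the open-source nvme-cli utility. It assumes nvme-cli is installed, and the target address and NVMe Qualified Name (NQN) are illustrative placeholders.

```python
# A hedged sketch: connect a Linux host to a remote NVMe-oF subsystem over TCP using
# nvme-cli. The subsystem NQN and address below are illustrative placeholders.
import subprocess

TARGET_NQN = "nqn.2019-07.com.example:subsystem1"   # hypothetical subsystem NQN
TARGET_ADDR = "192.0.2.10"                          # documentation address (illustrative)

subprocess.run(
    ["nvme", "connect",
     "-t", "tcp",        # transport: tcp (rdma and fc are also defined)
     "-a", TARGET_ADDR,  # target address
     "-s", "4420",       # IANA-assigned NVMe-oF port
     "-n", TARGET_NQN],  # NVMe Qualified Name of the remote subsystem
    check=True,
)
```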

User Advice: Buyers should clearly identify workloads where the scalability and performance of NVMe-
based SSAs and NVMe-oF justify the premium cost of an end-to-end NVMe-deployed SSA, such as AI/ML
or transaction processing. Next, identify appropriate potential array, NIC/HBA and network fabric
suppliers to verify that interoperability testing has been performed and that reference customers are
available. Buyers that do not need the immediate performance gains, and want to avoid the associated
costs, should investigate how simply and nondisruptively their existing products can migrate to NVMe,
to ensure investment protection for the future.

During the next 12 months, most SSA vendors will offer solid-state arrays with internal NVMe storage,
followed by support of NVMe-oF connectivity to the compute hosts. HCIS vendors will deliver NVMe
storage in an integrated offering during the next 12 to 18 months, but customers need to verify the
availability of NVMe-oF networks between HCI nodes. Similarly, when customers require NVMe-oF
storage networks that encompass switches, host bus adapters (HBAs), and OS kernel drivers, IT
infrastructure modernization will be required.

This will constrain the adoption of leading-edge solutions in mainstream enterprises. However, due to
the increased interoperability and availability of FC-NVMe within the next two to three years, I&O
leaders implementing NVMe-oF within an existing Fibre Channel SAN infrastructure will have a simpler
transition than those moving to NVMe-oF with 25/50/100/400 Gigabit Ethernet and RoCEv2. Investment
protection for customers with existing fifth-generation or sixth-generation FC SANs is compelling
because customers can implement new fast NVMe storage arrays and connect via NVMe-oF to servers
while using the same SAN. Therefore, old and new storage, network switches and host bus adapters can
run together in the same FC-based storage network (SAN), with SCSI and NVMe storage separated by
zones, as long as compatible fifth- or sixth-generation FC equipment is used. NVMe-oF over TCP/IP can
leverage existing Ethernet deployment, thereby easing the transition and providing investment
protection.

Business Impact: Today, NVMe and NVMe-oF offerings can have a dramatic impact on business use
cases where low-latency requirements are critical to the bottom line. Though requiring potential
infrastructure enhancements, the clear benefits these technologies can provide will immediately attract
high-performance computing customers who can quickly show a positive ROI. Designed for all low-
latency workloads where performance is a business differentiator, NVMe and NVMe-oF will deliver
architectures that extend and enhance the capabilities of modern general-purpose solid-state arrays.
Most workloads will not need the multimillion IOPS performance that these new technologies offer, but
most customers are demanding the lower, consistent response times provided by NVMe-based systems.

Benefit Rating: High

Market Penetration: 1% to 5% of target audience

Maturity: Emerging

Sample Vendors: Dell EMC; E8 Storage; Excelero; Hitachi Vantara; IBM; Kaminario; NetApp; Pavilion Data
Systems; Pure Storage; Vexata

Recommended Reading: “Competitive Landscape: Solid-State Arrays, Worldwide”

“2019 Strategic Roadmap for Storage”

“The Future of Storage Networking — Will This Picture Ever Change”

“Cool Vendors in Storage Technologies”

At the Peak

Storage Class Memory

Analysis By: Alan Priestley; Joseph Unsworth

Definition: Storage class memory (SCM), also referred to as persistent memory, is a new class of memory
technology providing nonvolatile memory (byte- or block-addressable) with access speeds close to those of
traditional DRAM-based memory modules. SCM refers to the application of emerging memory
technology to storage, which differs from persistent memory, which interfaces with the main
DDR memory channels in a compute environment.
Position and Adoption Speed Justification: Increasing dataset sizes are driving demand for large-density
memory systems. While DRAM density increases on a regular cadence, its cost per bit remains more than
10 times that of current nonvolatile memory technologies (such as NAND flash), and DRAM cannot retain
its contents without supercapacitor or battery backup. The introduction of SCM to storage arrays was
announced by leading vendors over three years ago; however, only now is SCM starting to become
generally available to customers. Today, there are no converged or hyperconverged solutions with SCM
available. SCM has manifested itself in storage environments in three ways: as a small cache directly on
the SSD, as a cache in the storage array and as its own tier of storage. All of these approaches
complement existing flash/SSD technology, and while a 100% SCM array is technically feasible today, it
would be prohibitively expensive for all but the most extreme performance workloads.

To date, SCM-based SSDs all use NVMe PCIe for maximum throughput and low latency. Currently, the
underlying technology has typically been 3D XPoint, developed by Intel and Micron, but STT-MRAM
(from Everspin) has also been used — mostly as a low-density SSD cache. A new category of high-
performance NAND flash, called FastNAND, has emerged to rival emerging memory technologies.
Samsung has its ZNAND technology and Toshiba has its XL-NAND flash technology, which is a modified
SLC NAND flash that achieves greater performance and reliability, albeit at substantially higher cost than
conventional NAND technologies.

User Advice: Given the superior attributes of SCM, the technology can provide sustained, consistently
low latency and high bandwidth for extreme performance workloads where persistence and high
availability are critical. SCM is also likely to be used with lower-quality flash-based technology, such as
QLC, where it can complement the less performant and less reliable flash technology to boost
performance. I&O leaders must understand the application workload requirements and the return on
investment of SCM-based solutions in order to justify the cost premiums. They must also assess the
potential SCM performance impact of disaster recovery (DR) synchronous replication, which may negate
SCM benefits. Furthermore, I&O leaders must have a full infrastructure view before deploying the
technology to ensure that there are no infrastructure and application bottlenecks.

Business Impact: SCM will be compelling for the very highest performance application workloads, such as
big data analytics for real-time analytic processing, in-memory databases, AI/ML workloads and other
workloads where performance premiums can be justified. However, these workloads will also see
benefits from DIMM-based persistent memory installed in the processor’s main memory array, which
may offer a compelling alternative to SCM for many workloads. Early adoption will be in key verticals
such as finance, government, natural resources, biomedical and other select verticals where extreme
performance is essential.

Benefit Rating: High

Market Penetration: 1% to 5% of target audience

Maturity: Emerging
Sample Vendors: Dell EMC; E8 Storage; Hewlett Packard Enterprise; Intel; NetApp; VAST Data; Vexata

Recommended Reading: “2019 Strategic Roadmap for Storage”

“Critical Capabilities for Solid-State Arrays”

“Market Guide for Compute Platforms”

File Analysis

Analysis By: Julian Tirsu; Alan Dayley

Definition: File analysis (FA) software analyzes, indexes, searches, tracks and reports on file metadata
and file content. FA tools are offered as both on-premises and SaaS options. FA software reports on file
attributes and provides detailed metadata and contextual information to enable better data governance
and data management actions.

Position and Adoption Speed Justification: FA software is an emerging technology that assists
organizations in understanding the ever-expanding repository of unstructured “dark” data. This includes
file shares, email databases, Microsoft SharePoint, content collaboration platforms and cloud platforms,
especially with the rapid adoption of Microsoft Office 365 and Google G Suite. Metadata reports include
data owner, location, duplicate copies, size, last accessed or modified, security attribute changes, file
types, and custom metadata. The primary use cases of FA software for unstructured data environments
include:

Organizational efficiency and cost optimization

Regulatory compliance

Risk mitigation

Text analytics

The desire to mitigate business risks (including security and privacy risks), identify sensitive data,
optimize storage costs and implement information governance is a key factor driving the adoption of FA
software. The hype associated with the GDPR, and the desire to adhere to the numerous subarticles of
this European privacy regulation, have greatly raised interest in and awareness of FA software. When
exposed through the use of FA software, the potential value of contextually rich unstructured data, such
as messaging or IoT data, is capturing the interest of data and analytics teams. Key features of FA
software, including the identification, classification, migration, protection, remediation and disposition
of data, have also garnered interest within a combination of IT and business departments.
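
A minimal sketch of the kind of metadata inventory FA tools build (at far larger scale, and with content indexing and classification on top) is shown below; it walks a hypothetical POSIX file share and records owner, size and last-modified time for each file.

```python
# A toy metadata crawl: walk a file share and write owner, size and last-modified
# for each file to a CSV. POSIX-only owner lookup; the share path is illustrative.
import csv
import os
import pwd
from datetime import datetime, timezone

SHARE = "/mnt/fileshare"   # hypothetical mount point

with open("file_inventory.csv", "w", newline="") as out:
    writer = csv.writer(out)
    writer.writerow(["path", "owner", "size_bytes", "last_modified"])
    for root, _dirs, files in os.walk(SHARE):
        for name in files:
            path = os.path.join(root, name)
            st = os.stat(path)
            writer.writerow([
                path,
                pwd.getpwuid(st.st_uid).pw_name,
                st.st_size,
                datetime.fromtimestamp(st.st_mtime, timezone.utc).isoformat(),
            ])
```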

User Advice: Organizations should use FA software not only to better grasp the risk of their unstructured
data footprint, including where it resides and who has access to it, but also to expose another rich
dataset for driving business decisions. Data visualization maps created by FA software can be presented
to other parts of the organization and used to better identify the value and risk of the data. This, in
turn, can enable IT, line-of-business and compliance organizations to make better-informed decisions
regarding classification, data governance, storage management and content migration. Once identified,
redundant, outdated and trivial data can be defensibly deleted, migrated or quarantined, and
retention policies can be applied to other data.

Business Impact: FA software reduces risk by identifying which files reside where and who has access to
them. It supports remediation in areas such as the elimination or quarantining of sensitive data, the
identification and protection of intellectual property, and the discovery and elimination of redundant
and outdated data that may create unnecessary business risk. FA shrinks costs by reducing the amount
of data stored. It also assists in classifying valuable business data so it can be more easily leveraged and
analyzed, and it supports e-discovery efforts for legal and regulatory investigations. In addition, FA
software feeds data into corporate retention initiatives through the use of standard and custom file
attributes.

Benefit Rating: Moderate

Market Penetration: 5% to 20% of target audience

Maturity: Adolescent

Sample Vendors: Active Navigation; Adlib; Condrey; Ground Labs; Index Engines; SailPoint; STEALTHbits
Technologies; TITUS; Varonis; Veritas Technologies

Recommended Reading: “Market Guide for File Analysis Software”

“Why ‘Store Everything’ Is Not an Effective Information Governance Strategy”

Cloud Data Backup

Analysis By: Jerry Rozeman; Chandra Mukhyala; Michael Hoeck

Definition: Policy-based, cloud data backup tools back up and restore production data generated
natively in the cloud. The data can be generated by SaaS applications (e.g., Microsoft Office 365 or
Salesforce) or by infrastructure as a service (IaaS) compute services (e.g., Amazon Elastic Compute Cloud
[Amazon EC2] instances). Backup copies can be stored in the same or a different cloud location, or on-
premises in the data center, where restore/recovery options should be offered in terms of restore
granularity and recovery location.

Position and Adoption Speed Justification: Backup of data generated natively in the public cloud is an
emerging requirement, because cloud providers focus on infrastructure high availability and disaster
recovery, but are not responsible for application or user data loss. Most SaaS applications’ natively
included data protection capabilities are not true backup, and they lack secure access control and
consistent recovery points to recover from internal and external threats.

As Microsoft Office 365 (O365) gains more momentum, O365 backup capabilities have begun to emerge
from mainstream backup vendors and small vendors. IaaS data backup, on the other hand, is a more
nascent area that caters to organizations’ need to back up production data generated in the IaaS cloud.
Native backup of IaaS has usually relied on snapshots and scripting, which may lack application
consistency, restore options, data mobility, storage efficiency and policy-based automation. However,
more data center backup vendors now offer improved cloud storage backup capabilities that automate
snapshot management and address some cloud-native limitations.
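
The snapshot automation that these tools layer policies on top of can be illustrated with a short boto3 sketch: create a tagged EBS snapshot that a retention job can later find and expire. The region, volume ID and tag values are illustrative.

```python
# A hedged sketch of a policy-driven IaaS backup primitive: take a tagged EBS snapshot
# so a scheduler/retention job can locate and expire it later. IDs and tags are illustrative.
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")

snapshot = ec2.create_snapshot(
    VolumeId="vol-0123456789abcdef0",          # hypothetical volume
    Description="nightly policy-based backup",
    TagSpecifications=[{
        "ResourceType": "snapshot",
        "Tags": [{"Key": "retention-days", "Value": "30"}],
    }],
)
print(snapshot["SnapshotId"], snapshot["State"])
```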

User Advice: Before migrating critical on-premises applications to SaaS or IaaS, organizations need a
thorough understanding of cloud-native backup and recovery capabilities and should compare them to
their situations today. If the native capabilities seem to fall short (e.g., in application consistency,
security requirements and recovery point objective [RPO]), factor additional backup costs into the total
cost of ownership (TCO) calculation before migrating to the cloud. Organizations planning to use cloud-
native recovery mechanisms should ensure that their contracts with cloud providers clearly specify the
capabilities and costs associated with the following items in terms of native data protection:

Backup/restore methods — This describes how user data backup and restore are done, including any
methods to prevent users from purging their own “backup copies” and to speed up recovery after a
propagated attack, such as ransomware.

Backup/restore performance — Some users have observed poor recovery time objectives (RTOs) when
restoring or recovering data from cloud object storage.

Retention period — This measures how long cloud providers can retain native backups free of charge or
with additional cost.

Clear expectations in writing, if not service-level agreement (SLA) guarantees, regarding recovery time
objectives — RTO measures how long it takes to restore at different granular levels, such as a file, a
mailbox or an entire application.

Additional storage cost due to backup — Insist on concrete guidelines on how much storage IaaS’s
native snapshots will consume, so that organizations can predict backup storage cost.

For third-party backup tools, focus on ease of cloud deployment, policy automation for easy
management, data mobility, storage efficiency and flexible options in terms of backup/recovery
granularity and location.

Business Impact: As more production workloads migrate to the cloud (in the form of SaaS or IaaS), it has
become critical to protect data generated natively in the cloud. Deploying data protection for cloud-
based workloads is an additional investment; however, this is often an afterthought, because it was not
part of the business case. Without additional protection of cloud-based data, customers face additional
risks, due to the impact of data loss, data corruption or ransomware attacks on their data.

SaaS and IaaS providers typically offer infrastructure resiliency and availability to protect their systems
from site failures. However, when data is lost due to their infrastructure failure, the providers are not
financially responsible for the value of lost data, and provide only limited credit for the period of
downtime. When data is lost to user errors, software corruption or malicious attacks, user organizations
are fully responsible themselves. The more critical cloud-generated data is, the more critical it is for
users to provide recoverability of such data.

Benefit Rating: Moderate

Market Penetration: 5% to 20% of target audience

Maturity: Emerging

Sample Vendors: Actifio; Cohesity; Commvault; Dell EMC; Druva; Rubrik; Spanning Cloud Apps; Veeam;
Veritas

Recommended Reading: “Adopt Office 365 Backup for Damage Control and Fast Recovery After
Malicious Attacks”

“Debunking the Myth of Using EFSS for Backup”

Open-Source Storage

Analysis By: Julia Palmer; Arun Chandrasekaran

Definition: Open-source storage is a form of software-defined storage for which the source code is made
available to the public through a free distribution license. Open-source storage supports many of the
same features as proprietary storage, including support of primary, secondary and tertiary storage tiers,
as well as heterogeneous management.

Position and Adoption Speed Justification: Although open-source storage (OSS) has been around for over
a decade, it has been adopted mainly by hyperscalers, technical service providers and large organizations.
Recent innovations in x86 hardware and flash, combined with an innovative open-source ecosystem, are
making open-source storage attractive for cloud and big data workloads and a potential alternative to
proprietary storage. As cloud computing, microservices application architectures, big data analytics and
information archiving push the capacity, pricing and performance frontiers of traditional scale-up
storage architectures, there has been renewed interest in open-source software as a means to achieve
high scalability in capacity and performance at lower acquisition costs.

The emergence of open-source platforms such as Apache Hadoop and Kubernetes, which are backed by
large, innovative communities of developers and vendors, together with vendors such as Red Hat
(Gluster Storage, Ceph Storage), SUSE (Ceph Storage) and DDN (Lustre), is enabling enterprises to
consider open-source storage for use cases such as cloud storage, big data, stateful microservices
workloads and archiving. There have also been open-source storage projects for container-based
storage, such as Minio.

User Advice: Although open-source storage offers a less-expensive upfront alternative to proprietary
storage, IT leaders need to measure the benefits, risks and costs accurately. Some enterprise IT leaders
often overstate the benefits and understate the costs and risks. Conversely, with the emerging maturity
of open-source storage solutions, enterprise IT buyers should not overlook the value proposition of
these solutions. IT leaders should actively deploy pilot projects, identify internal champions, train
storage teams and prepare the overall organization for this disruptive trend. Although source code can
be downloaded for free, it is advisable to use a commercial distribution and to obtain support through a
vendor, because OSS requires significant effort and expertise to install, maintain and support. IT leaders
deploying “open core” or “freemium” storage products need to carefully evaluate the strength of lock-in
against the perceived benefits. This is a model in which the vendor provides proprietary software — in
the form of add-on modules or management tools — that functions on top of OSS.

In most cases, open-source storage is not general-purpose storage. Therefore, choose use cases that
leverage the strengths of open-source platforms — for example, batch processing or a low-cost archive
for Hadoop and test/development use cases for containers — and use them appropriately. It is
important to focus on hardware design and choose cost-effective reference architectures that have
been certified by the vendors and for which support is delivered in an integrated manner. Overall, on-
premises integration, management automation and customer support should be key priorities when
selecting open-source storage solutions.

Business Impact: Open-source storage is playing an important role in enabling cost-effective, scalable
platforms for new cloud and big data workloads. Today, Gartner clients are evaluating open-source
storage across block, file and object protocols. Gartner is seeing adoption among technology firms and
service providers, as well as in research and academic environments. Big data, dev/test and
private cloud use in enterprises are also promising use cases for open-source storage, where Gartner is
witnessing keen interest. As data continues to grow at a frantic pace, open-source storage will enable
customers to store and maintain data, particularly unstructured data, at a lower acquisition cost, with
“good enough” availability, performance and manageability.

Benefit Rating: Moderate

Market Penetration: 1% to 5% of target audience

Maturity: Emerging

Sample Vendors: BeeGFS; Cloudera; DataDirect Networks (DDN); iXsystems; Minio; Openfiler; OpenIO;
Red Hat; SUSE; SwiftStack

Recommended Reading: “Reinvent Your Open-Source Software Strategy to Stay Relevant”

“Use Service-Level Requirements to Drive Decisions Between Commercial and Self-Support for Open-
Source Software”

“Four Steps to Adopt Open-Source Software as Part of the DevOps Toolchain”

Sliding Into the Trough

Copy Data Management

Analysis By: Chandra Mukhyala; Michael Hoeck


Definition: Copy data management (CDM) products capture application-consistent data via snapshots
from primary storage to create live “golden images” in secondary storage systems where virtual copies
in native disk format can be mounted for backup/recovery, disaster recovery or test/development. This
differs from storage-hardware-based snapshots in primary storage devices. Support for heterogeneous
primary storage is essential. Different CDM products may offer additional data management and data
reduction capabilities (e.g., compression and deduplication).

Position and Adoption Speed Justification: CDM is not a widely understood concept, although it is more
than six years old and more backup/recovery and hyperconverged integrated systems (HCIS) vendors
are promoting it. Adoption rates vary greatly among different vendors and products because of different
functions and varying sales focus. Although DevOps teams often need test/development workflow
automation, they don’t typically evaluate the same product with the backup team. In fact, the concept
and value proposition of activating backup data for test/development is foreign to many organizations.
The main challenge faced by CDM products is that they have to target different buying centers and
decision makers. The lack of products from major vendors and inconsistent use of the term “CDM”
impede greater adoption.

User Advice: Organizations face increased cost and productivity challenges, due to the management
complexities of provisioning multiple application copies for test/development. CDM products should be
evaluated to improve time to market and reduce storage waste. CDM could also be useful for
organizations that are looking for active access to secondary data sources for reporting or analytics due
to its separation from the production environment. Enterprises should also look at opportunities for
database and application archiving for storage reduction or governance initiatives to further justify
investment. Due to the short history of the new architecture and vendors, new use cases beyond the
common ones (e.g., backup and test/development enablement) are not field-proven and should be
approached with caution.

Business Impact: IT organizations have historically used different hardware and software products to
deliver backup, archive, replication, test/development, legacy application archiving and other data-
intensive services with little control or management across these services. This results in overinvestment
in storage capacity, software licenses and operational expenditure (opex) costs associated with
managing multiple solutions. CDM facilitates the use of one copy of data for many or all of these
functions via virtual copies, dramatically reducing the need for multiple physical copies of data and
enabling organizations to cut the costs associated with multiple disparate software licenses and storage
islands.

The separation of the virtual “golden image” from the production environment can facilitate aggressive
recovery point objectives (RPOs) and recovery time objectives (RTOs). In the case of test/development,
CDM improves the workflow process and operational efficiency by giving database administrators and
application developers more self-service capabilities.

Benefit Rating: High

Market Penetration: 1% to 5% of target audience


Maturity: Emerging

Sample Vendors: Actifio; Catalogic Software; Cohesity; Delphix; Rubrik; Veritas Technologies

Recommended Reading: “Innovation Insight: Copy Data Management Accelerates Bimodal IT”

Infrastructure SDS

Analysis By: Julia Palmer; Chandra Mukhyala

Definition: Infrastructure software-defined storage (SDS) creates and provides data center services to
replace or augment traditional storage arrays. It can be deployed as a virtual machine, as a container or
as software on a bare-metal, industry-standard server, allowing organizations to deploy a storage-as-
software package. This creates a storage platform that can be accessed by file, block or object protocols.

Position and Adoption Speed Justification: Infrastructure SDS changes the delivery model and potentially
the economics of enterprise storage infrastructures. Whether deployed independently, or as an element
of a hyperconverged infrastructure, SDS alters how organizations buy and deploy enterprise storage.
Following web-scale IT’s lead, I&O leaders are deploying SDS as hardware-agnostic storage and breaking
the bond with proprietary, external-controller-based (ECB) storage hardware. The power of multicore
Intel x86 processors, the use of software-based RAID or erasure coding, the use of flash and high-throughput
networking have essentially eliminated most hardware-associated differentiation, transferring the value
to storage software. Expect new infrastructure SDS vendors and products to emerge targeting a broad
range of delivery models and workloads, including server virtualization, backup, archiving, big data
analytics, HPC, containers and unstructured data. Comprehensive analyses of SDS total cost of
ownership (TCO) benefits involve both capital expenditure (capex) and operating expenditure (opex),
including administration, verification, deployment, and ongoing management, maintenance and
support, as well as a potential improvement in business agility.
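
As a toy illustration of the software-based data protection mentioned above, the sketch below computes RAID-4-style XOR parity and rebuilds a lost block. Production SDS products use far more sophisticated erasure codes and distribution logic, so this is only a conceptual example.

```python
# Toy single-parity protection: XOR equal-length data blocks into a parity block,
# then reconstruct any one missing block from the survivors plus the parity.
def xor_parity(blocks: list[bytes]) -> bytes:
    parity = bytearray(len(blocks[0]))
    for block in blocks:
        for i, b in enumerate(block):
            parity[i] ^= b
    return bytes(parity)

def rebuild_missing(surviving: list[bytes], parity: bytes) -> bytes:
    # XOR of the survivors and the parity yields the missing block.
    return xor_parity(surviving + [parity])

data = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_parity(data)
assert rebuild_missing([data[0], data[2]], parity) == data[1]   # lost block recovered
```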

User Advice: Infrastructure SDS is the delivery of data services and storage-array-like functionality on top
of industry standard hardware. Enterprises choose a software-defined approach when they wish to
accomplish some or all of the following goals:

Build a storage solution at a low acquisition price point on a commodity x86 platform.

Decouple storage software and hardware to standardize their data center platforms.

Establish a scalable solution specifically geared toward Mode 2 workloads.

Build an agile, “infrastructure as code” architecture, enabling storage to be a part of a software-defined
data center automation and orchestration framework.

Take advantage of the latest innovations in storage hardware before they are supported in traditional
ECB storage arrays.

Advice to end users:


Recognize that infrastructure SDS remains a nascent, but growing, deployment model that is primarily
focused on web-scale deployment agility, but also has applicability at the edge and in public cloud
deployments.

Implement infrastructure SDS solutions that enable you to decouple software from hardware, reduce
TCO and enable greater data mobility.

Assess emerging storage vendors, technologies and approaches, and create a matrix that matches these
offerings with the requirements of your specific workloads.

Deploy infrastructure SDS for single workloads or use cases. Take the lessons learned from this first
deployment and apply SDS to additional use cases.

For infrastructure SDS products, identify upcoming initiatives where SDS could deliver high value. Use
infrastructure SDS with commodity hardware as the basis for a new application deployment aligned with
these initiatives.

Build infrastructure SDS efficiency justification as a result of a proof-of-concept deployment, based on
capex, ROI data and opex impact, as well as better alignment with core business requirements.

Business Impact: Infrastructure SDS is a hardware-agnostic platform. It breaks the dependency on
proprietary storage hardware and lowers acquisition costs by utilizing the industry standard x86 server
platform of the customer’s choice. Some Gartner customers report up to 40% TCO reduction with
platform of the customer’s choice. Some Gartner customers report up to 40% TCO reduction with
infrastructure SDS that comes from the use of x86 industry standard hardware and lower cost upgrades
and maintenance. However, the real value of infrastructure SDS in the long term is increased flexibility
and programmability that is required for Mode 2 workloads. I&O leaders that successfully deployed and
benefited from infrastructure SDS have usually belonged to large enterprises or cloud service providers
that pursued web-scale-like efficiency, flexibility and scalability, and viewed SDS as a critical enablement
technology for their IT initiatives. I&O leaders should look at infrastructure SDS not as another storage
product, but as an investment in improving storage economics and providing data mobility, including
hybrid cloud storage integration.

Benefit Rating: Transformational

Market Penetration: 5% to 20% of target audience

Maturity: Adolescent

Sample Vendors: Hedvig; IBM; Nutanix; Red Hat; Scality; StorMagic; SUSE; SwiftStack; VMware; WekaIO

Recommended Reading: “The Future of Software-Defined Storage in Data Center, Edge and Hybrid
Cloud”

“Top Five Approaches to Hybrid Cloud Storage — An Analysis of Use Cases, Benefits and Limitations”

“Competitive Landscape: Infrastructure Software-Defined Storage”


“Magic Quadrant for Distributed File Systems and Object Storage”

“An I&O Leader’s Guide to Storage for Containerized Workloads”

“Market Insight: How to Dominate the Unstructured Data Market”

Hyperconvergence

Analysis By: John McArthur; Philip Dawson

Definition: Hyperconvergence is scale-out software-integrated infrastructure designed for IT leaders
seeking operational simplification. Hyperconvergence provides a building block approach to compute,
network and storage on standard hardware under unified management. Hyperconvergence vendors
build appliances using off-the-shelf infrastructure, engage with system vendors that package software as
an appliance, or sell software for use in a reference architecture or certified server. Hyperconvergence
may also be delivered as a service or in a public cloud.

Position and Adoption Speed Justification: Hyperconvergence solutions are maturing and adoption is
increasing as organizations seek management simplicity. VMware vSAN utilization within VMware ESXi
customers and Storage Spaces Direct utilization within Microsoft Windows Server 2016 and 2019 Data
Center Edition customers are on the rise. Nutanix, an early innovator in HCIS appliances, has largely
shifted to a software revenue model and continues to increase the number of OEM relationships.
Hyperconvergence vendors are achieving certification for more-demanding workloads, including SAP
HANA, and end users are beginning to consider hyperconvergence as an alternative to integrated
infrastructure systems for some workloads. Meanwhile, suppliers are expanding hybrid cloud
deployment offerings. Larger clusters are now in use, and midsize organizations are beginning to
consider hyperconvergence as the preferred alternative for on-premises infrastructure for block storage.
In addition, a growing number of hyperconvergence suppliers are delivering scale-down solutions to
address the needs of ROBO and edge environments previously addressed only by niche vendors.

User Advice: IT leaders should implement hyperconvergence when agility, modular growth and
management simplicity are of greatest importance. The acquisition cost of hyperconvergence may be
higher and the resource utilization rate lower than for three-tier architectures, but management
efficiency is often superior.

Hyperconvergence requires alignment of compute and storage refresh cycles, consolidation of budgets,
operations and capacity planning roles, and retraining for organizations still operating separate silos of
compute, storage and networking. Adopt hyperconvergence for mission-critical workloads only after developing
knowledge with lower-risk deployments, such as test and development. Workload-specific proofs of
concept are an important step in meeting the performance needs of applications. Consider the impact
on DR and networking. Test under a variety of failure scenarios, as solutions vary greatly in performance
under failure, their time to return to a fully protected state and the number of failures they can tolerate.

Consider nonappliance options to enable scale-down optimization of resources for high-volume edge
deployments. In product evaluations, consider the ability to independently scale storage and compute,
retraining costs, and the ability to avoid additional operating system, application, database software and
hypervisor license costs. In large deployments, plan for centralized management of multiple smaller
clusters, and for data center deployments, ensure that clusters are sufficiently large to meet
performance and availability requirements during single and double node failures. While servers are
perceived as commodities, they differ greatly in terms of power, cooling and floor space requirements,
and performance, so evaluate hyperconvergence software on a variety of hardware platforms for lowest
total cost of ownership and best performance.
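
To make the node-failure sizing advice concrete, the sketch below estimates the smallest homogeneous cluster that can still carry a given workload after one or two node failures without exceeding a target utilization. It is an illustrative model with hypothetical inputs, not a vendor sizing tool, and it ignores replication overhead and rebuild traffic.

```python
# Simplified sizing check: how many homogeneous nodes are needed so that the
# surviving nodes can still carry the workload after N node failures without
# exceeding a target utilization? Hypothetical inputs; real sizing must also
# account for replication/erasure-coding overhead and rebuild traffic.

def min_nodes(required_capacity_tb, node_capacity_tb, tolerated_failures,
              max_utilization=0.75):
    """Smallest cluster where (nodes - failures) still fits the workload."""
    nodes = tolerated_failures + 1
    while (nodes - tolerated_failures) * node_capacity_tb * max_utilization \
            < required_capacity_tb:
        nodes += 1
    return nodes

# Example: 120 TB of effective capacity needed, 20 TB usable per node,
# cluster must survive two simultaneous node failures at <= 75% utilization.
print(min_nodes(120, 20, tolerated_failures=2))   # -> 10
```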

Business Impact: The business impact of hyperconvergence is greatest in dynamic organizations with
short business planning cycles and long IT planning cycles. Hyperconvergence enables IT leaders to be
responsive to new business requirements in a modular, small-increment fashion, avoiding the big-
increment upgrades typically found in three-tier infrastructure architectures. Hyperconvergence
provides simplified management that decreases the pressure to hire hard-to-find specialists and will,
over time, lead to lower operating costs, especially as hyperconvergence supports a greater share of the
compute and storage requirements of the data center. For large organizations, hyperconverged
deployments will remain another silo to manage. Hyperconvergence is of particular value to midsize
enterprises that can standardize on hyperconvergence and the remote sites of large organizations that
need cloud-like management efficiency with on-premises edge infrastructure. As more vendors support
public cloud deployments, hyperconvergence will also be a stepping stone toward public cloud agility.

Benefit Rating: High

Market Penetration: 20% to 50% of target audience

Maturity: Early mainstream

Sample Vendors: Cisco; Dell; HPE; Huawei; Microsoft; Nutanix; Pivot3; Red Hat; Scale Computing;
VMware

Recommended Reading: “Magic Quadrant for Hyperconverged Infrastructure”

“Critical Capabilities for Hyperconverged Infrastructure”

“Toolkit: Sample RFP for Hyperconverged Infrastructure”

“The Road to Intelligent Infrastructure and Beyond”

“Use Hyperconverged Infrastructure to Free Staff for Public Cloud Management”

Object Storage

Analysis By: Raj Bala; Chandra Mukhyala

Definition: Object storage refers to a system that houses data in structures called “objects,” and serves
hosts via APIs such as Amazon Simple Storage Service (S3). Conceptually, objects are similar to files, in
that they are composed of content and metadata. In general, objects support richer metadata and are
stored in a flat namespace, compared with file-and-block-based storage platforms. Object storage
products are available to be deployed as virtual appliances, managed hosting, purpose-built hardware
appliances or software.

Position and Adoption Speed Justification: The market for on-premises, deployed object storage
platforms is not growing rapidly, particularly when compared with adjacent storage segments, such as
hyperconverged integrated system (HCIS) and solid-state arrays (SSAs). However, the market for object
storage is increasing, albeit slowly, as enterprises seek petabyte-scale storage infrastructures at a lower
total cost of ownership (TCO).

Hybrid cloud storage capabilities from emerging vendors and refreshed products from large storage
portfolio vendors are expected to further stimulate adoption from end users, as mainstream enterprises
seek seamless interaction between on-premises and public cloud infrastructure. Although cost
containment of traditional storage area network/network-attached storage (SAN/NAS) infrastructure
continues to be the key driver for object storage adoption, cloud-native use cases in industries, such as
media and entertainment, life sciences, the public sector, and education/research, are spawning new
investments.

User Advice: IT leaders that require highly scalable, self-healing and cost-effective storage platforms for
unstructured data should evaluate the suitability of object storage products, but not when the primary
use case requires file protocols, such as Network File System (NFS) and Common Internet File System
(CIFS). Most object storage vendors offer lackluster implementations of file protocols as the engineering
effort is substantial. The common use cases that Gartner sees for object storage are archiving, content
distribution, analytics and backup. When building on-premises object storage repositories, customers
should evaluate the product’s API support for dominant public cloud providers, so that they can extend
their workloads to a public cloud, if needed.

Amazon’s S3 has emerged as the dominant API over vendor-specific APIs and OpenStack Swift, which is
in precipitous decline. Select object storage vendors that offer a wide choice of deployment (software-
only versus packaged appliances versus managed hosting) and licensing models (perpetual versus
subscription) that can provide flexibility and reduce TCO. These products are capable of huge capacity
scale, and are better suited to workloads that require high bandwidth than to transactional workloads
that demand high input/output (I/O) operations per second (IOPS) and low latency.
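
Validating S3 API compatibility during a proof of concept usually takes only a few lines of code. The snippet below is a minimal sketch using the open-source boto3 SDK; the endpoint URL, credentials, bucket and object key are hypothetical placeholders for an S3-compatible object store.

```python
# Minimal sketch: write and read an object with custom metadata through the
# S3 API using boto3. The endpoint, bucket and credentials are hypothetical
# placeholders; most S3-compatible object stores accept an endpoint_url.

import boto3

s3 = boto3.client(
    "s3",
    endpoint_url="https://objectstore.example.com",  # placeholder S3-compatible endpoint
    aws_access_key_id="ACCESS_KEY",
    aws_secret_access_key="SECRET_KEY",
)

# Objects are content plus metadata, addressed by bucket and key in a flat namespace.
s3.put_object(
    Bucket="archive-bucket",
    Key="projects/2019/report.pdf",
    Body=b"example object content",
    Metadata={"department": "research", "retention": "7y"},
)

obj = s3.get_object(Bucket="archive-bucket", Key="projects/2019/report.pdf")
print(obj["Metadata"])          # user-defined metadata stored with the object
print(len(obj["Body"].read()))  # object content streamed back over HTTP
```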

Business Impact: Rapid growth in unstructured data (40% year over year) and the need to store and
retrieve it in a cost-effective, automated manner will drive the growth of object storage. Enterprises
often deploy object storage on-premises when looking to provide a public cloud infrastructure as a
service (IaaS) experience in their own data centers. Object storage is well-suited to multitenant
environments and requires no lengthy provisioning for new applications. There is growing interest in
object storage from enterprise developers and DevOps team members looking for agile and
programmable infrastructures that can be extended to the public cloud. Object storage software,
deployed on commodity hardware, is emerging as a threat to external controller-based (ECB) storage
hardware vendors in big data environments with heavy volume challenges.
Benefit Rating: High

Market Penetration: 5% to 20% of target audience

Maturity: Adolescent

Sample Vendors: Caringo; Cloudian; DataDirect Networks; Dell EMC; Hitachi Vantara; IBM; NetApp; Red
Hat; Scality; SwiftStack

Recommended Reading: “Magic Quadrant for Distributed File Systems and Object Storage”

“Critical Capabilities for Object Storage”

Distributed File Systems

Analysis By: Julia Palmer; Chandra Mukhyala

Definition: Distributed file system storage uses a single parallel file system to cluster multiple storage
nodes together, presenting a single namespace and storage pool to provide high bandwidth for multiple
hosts in parallel. Data is distributed over multiple nodes in the cluster to handle availability and data
protection in a self-healing manner, and to scale both capacity and throughput linearly.
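
As a highly simplified illustration of the definition above (not any vendor's implementation), the sketch below shows one way data can be chunked and spread round-robin across storage nodes under a single namespace, so that reads are served by many nodes in parallel and capacity grows with node count.

```python
# Toy illustration of distributed placement: a file is split into fixed-size
# chunks and distributed round-robin across nodes, so that bandwidth and
# capacity scale with the number of nodes. Real distributed file systems add
# replication or erasure coding, metadata services and rebalancing.

CHUNK_SIZE = 4  # bytes, tiny for demonstration only

def place_chunks(data: bytes, nodes: list) -> dict:
    """Return {node_name: [(chunk_index, chunk_bytes), ...]} placement map."""
    placement = {n: [] for n in nodes}
    chunks = [data[i:i + CHUNK_SIZE] for i in range(0, len(data), CHUNK_SIZE)]
    for idx, chunk in enumerate(chunks):
        placement[nodes[idx % len(nodes)]].append((idx, chunk))
    return placement

def read_back(placement: dict) -> bytes:
    """Reassemble the file by gathering chunks (conceptually, in parallel)."""
    all_chunks = [c for node_chunks in placement.values() for c in node_chunks]
    return b"".join(chunk for _, chunk in sorted(all_chunks))

data = b"unstructured data spread across a scale-out namespace"
layout = place_chunks(data, ["node-a", "node-b", "node-c"])
assert read_back(layout) == data
for node, chunks in layout.items():
    print(node, [idx for idx, _ in chunks])
```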

Position and Adoption Speed Justification: The strategic importance of storing and analyzing large-scale,
unstructured data is bringing distributed scale-out storage architectures to the forefront of IT
infrastructure planning. Storage vendors are continuing to develop distributed file systems to address
performance and scalability limitations in traditional, scale-up, network-attached storage (NAS)
environments. This makes them suitable for batch and interactive processing, and other high-bandwidth
workloads. Apart from academic high-performance computing (HPC) environments, commercial vertical
industries (such as oil and gas, financial services, media and entertainment, life sciences, research and
telecommunication services) are leading adopters of distributed file systems for applications that
require highly scalable storage bandwidth.

Beyond the HPC use case, rich-media streaming, analytics, content distribution, collaboration, backup
and archiving are other common use cases for cluster file systems. Built on a “shared nothing”
architecture, distributed file systems provide resilience at the software layer and do not require
proprietary hardware. IT leaders are looking for distributed file systems to enable interoperability
between on-premises and public cloud IaaS storage. This will enable new use cases looking to leverage
public cloud computing and share application data across edge, core and cloud deployments. Vendors
are also increasingly starting to offer software-based deployment options in a capacity-based perpetual
licensing model, or with subscription-based licensing, to stimulate market adoption.

User Advice: Distributed file systems have been around for decades, although vendor maturity varies
widely. Users who need products that enable them to pay as they grow in a highly dynamic
environment, or who need high bandwidth for shared storage, should put distributed file systems on
their shortlist. Most commercial and open-source products specialize in tackling specific use cases, but
integration with application workflows may be lacking in several products. Select distributed file system
storage products based on their interoperability with the ISV solutions that are dominant in their
environment.

Validate all performance claims with proof-of-concept deployments, given that performance varies
greatly by protocol type and file sizes. Prioritize products with software-defined storage capabilities,
versus file systems that require an ECB array as the underlying storage. This approach will enable you to extend
distributed file systems to the public cloud and edge deployments. Shortlist vendors with the ability to
run natively in the public cloud and that enable hybrid cloud storage deployments with bidirectional
tiering, as this emerging paradigm is experiencing positive, early traction with enterprises.

Business Impact: Distributed file systems are scale-out alternatives to traditional
scale-up NAS architectures. Unlike NAS, they scale storage bandwidth more linearly, surpassing
expensive monolithic frame storage arrays in this capability. The business impact of distributed file
systems is most pronounced in environments in which applications generate large amounts of
unstructured data, and the primary access is through file protocols. However, they will also have an
increasing impact on traditional data centers that want to overcome the limitations of dual-controller
NAS storage designs, as well as for use cases such as backup and archiving. Many distributed file system
products are being deployed as software-only offerings on top of industry-standard x86 server
hardware, which has the potential to deliver lower TCO compared with ECB storage arrays. Many distributed
file systems will have a significant impact on private cloud services, which require a highly scalable,
resilient and elastic infrastructure. IT professionals keen to consolidate file server or NAS file sprawl
should consider using distributed file system storage products that offer operational simplicity and
nearly linear scalability.

Benefit Rating: High

Market Penetration: 20% to 50% of target audience

Maturity: Early mainstream

Sample Vendors: DataDirect Networks; Dell EMC; Elastifile; Huawei; IBM; Inspur; Pure Storage; Qumulo;
Red Hat; WekaIO

Recommended Reading: “Critical Capabilities for Distributed File Systems”

“Magic Quadrant for Distributed File Systems and Object Storage”

“The Future of Software-Defined Storage in Data Center, Edge and Hybrid Cloud”

Cloud Storage Gateways

Analysis By: Raj Bala

Definition: Cloud storage gateways are physical or virtual appliances that reside in an organization’s data
center and/or in the public cloud. They provide users and applications with seamless access to data
stored in a public or private cloud. Users and applications typically read and write data through network
file system (NFS) or host connection protocols. Data is then transparently written to remote cloud
storage through web service APIs, such as those offered by Amazon Web Services (AWS) and Microsoft
Azure.

Position and Adoption Speed Justification: Enterprise interest in cloud storage gateways is focused
mainly on specialized workloads that involve synchronizing large files found in the architecture,
construction and engineering vertical industries. Customers in these verticals often use products from
vendors such as Nasuni and Panzura, which provide a global namespace for files across disparate
locations, but with local file access performance. Cloud storage gateways originally served as technology
bridges between on-premises storage and public cloud storage. However, technology bridges are often
temporary. They are eventually dismantled when users understand how to get to the other side, which
is already happening in the market for cloud storage gateways.

As the public cloud has matured and become mainstream, enterprises no longer need a bridge to
consume public cloud infrastructure as a service (IaaS). Customer use cases for cloud storage gateways
are narrowing in on specialized, niche workloads that are not required broadly by enterprises. The
startup vendors in this market are largely trying to find their second act. Some are working to front-end
on-premises object storage platforms or perhaps become object storage themselves.

User Advice: There are no vendors in this market with high-growth revenue. Compared with adjacent
storage markets, such as hyperconverged integrated systems (HCIS) and solid-state arrays (SSA), vendors
in the cloud storage gateway market have modest revenue and customer adoption. As a result, there is
inherent risk in depending on a cloud storage gateway offered by a startup.

However, there is unique functionality that isn’t present in products from more mature markets, such as
HCIS. In particular, no other categories of storage products provide a global namespace and file locking.
These features serve collaboration and file-sharing use cases across disparate geographies that are
otherwise underserved by the larger storage market.

Enterprises that require this functionality should factor in the risk associated with small vendors that
may eventually be acquired by larger portfolio vendors.

Business Impact: Cloud storage gateways can provide compelling, cloud-based alternatives for customers
that want to move in-house backup/disaster recovery processes, archives and unstructured data off
primary infrastructure. Some organizations are deploying cloud storage gateways as virtual appliances in
compute instances running with public cloud IaaS providers, such as AWS and Google Cloud Platform
(GCP).

The gateways then connect to a customer’s enterprise data center and act as a bridge between
elastically scaled compute instances in the public cloud and the data stored on primary storage
platforms inside the customer’s data center. This scenario is particularly useful for big data workloads in
which the compute capacity is best used in a temporary, elastic fashion. This model flips the traditional
notion of an enterprise’s use of public cloud: An enterprise data center becomes an extension of the
public cloud, rather than vice versa.

Benefit Rating: Low

Market Penetration: 5% to 20% of target audience

Maturity: Adolescent

Sample Vendors: Amazon Web Services; Avere Systems; Ctera Networks; Dell EMC; Microsoft; Nasuni;
NetApp; Panzura

Recommended Reading: “Market Guide for Cloud Storage Gateways”

Integrated Backup Appliances

Analysis By: Chandra Mukhyala; Michael Hoeck

Definition: An integrated backup appliance is an all-in-one backup software and hardware solution that
combines the functions of a backup application server, media server (if applicable) and backup target
device. The appliance is typically preconfigured and fine-tuned to cater to the capabilities of the
onboard backup software. It is a more simplified and easier-to-deploy backup solution than the
traditional approach of separate software and hardware installations, but lacks flexibility on hardware
choices and, in some cases, scalability.

Position and Adoption Speed Justification: Integrated backup appliances have been around for many
years without much fanfare. The current hype is driven by existing large backup software vendors that
have started packaging their software in an appliance, and by innovative emerging vendors offering all-
in-one solutions. The momentum of integrated backup appliances is driven by the desire to simplify the
setup and management of the backup infrastructure, because complexity is a leading challenge when it
comes to backup management. Overall, integrated backup appliances have resonated well with many
small and midsize enterprise customers that are attracted by the one-stop-shop support experience and
tight integration between software and hardware. Vendors delivering appliances using a scale-out
approach remove some of the scalability concerns and target midsize to large enterprise customers.

Within the integrated backup appliance market, the former clear segmentation by backup repository
limitations has vanished, with all vendors adding cloud target or tiering capabilities.

There are generally three types of vendors selling integrated backup appliances, separated primarily by
heritage:

The first type includes backup software vendors that package their software with hardware in order to
offer customers integrated appliances. Examples include Arcserve, Commvault, Dell EMC and Veritas
Technologies.

The second type is made up of vendors offering products that tightly integrate software with hardware
in a scale-out approach, such as Cohesity and Rubrik.

The third type is a cloud backup provider that offers customers an on-premises backup appliance as
part of a cloud backup solution. Examples include Barracuda, CTERA, Datto and Unitrends.

User Advice:

Organizations should first evaluate backup software functions to ensure that their business
requirements are met, before deciding whether to acquire an integrated backup appliance or a software-
only solution.

Once a specific backup software product is chosen, deploying an appliance with that software will
simplify operational processes and address any compatibility issues and functionality gaps between
backup software-only products and deduplication backup target appliances.

Customers should keep in mind that integrated appliances can also create vendor lock-in for the duration of
the useful life of the hardware.

If customers prefer deploying backup software-only products to gain hardware flexibility, they should
carefully consider which back-end storage to choose — be it generic disk array/network-attached
storage (NAS) or deduplication backup target appliances.

Business Impact: Integrated backup appliances ride the current trend of converged infrastructure and
offer tight integration between software and hardware, simplify the initial purchase and configuration
process, and provide the one-vendor support experience with no finger-pointing risks.

On the downside, an integrated backup appliance tends to lack the flexibility and heterogeneous
hardware support offered by backup software-only solutions, which is often needed by large, complex
environments.

Benefit Rating: Moderate

Market Penetration: 20% to 50% of target audience

Maturity: Early mainstream

Sample Vendors: Arcserve; Barracuda; Cohesity; Commvault; CTERA; Datto; Dell EMC; Rubrik; Unitrends;
Veritas Technologies

Recommended Reading: “Magic Quadrant for Data Center Backup and Recovery Solutions”

“Critical Capabilities for Data Center Backup and Recovery Solutions”

“Market Guide for Data Center Backup Targets”

Climbing the Slope


Enterprise Endpoint Backup

Analysis By: Jerry Rozeman; Michael Hoeck; John Girard

Definition: Enterprise endpoint backup refers to backup products for laptops and desktops that can
recover corrupted or lost data, as well as personal settings residing on the devices. Endpoint backup
differs from file sync and share’s versioning capabilities, in that backup preserves secure, centrally
managed copies that cannot be changed or deleted by end users, and it protects PC/laptop data in a
more-comprehensive way.

Position and Adoption Speed Justification: Overall, more organizations are adopting endpoint backup to
tackle different risks, including ransomware, insider threats, and potential risks exposed by enterprise
file sync and share solutions, including Office 365 OneDrive for Business. Organizations with globally
distributed offices and employees like to leverage web-scale, public cloud storage providers and backup-
as-a-service providers that offer a multiple-country presence. As employees become more mobile,
laptop backup has been the driving force for organizations to adopt endpoint backup. Endpoint backup
restores lost data, and enables more-efficient ways to perform ongoing laptop refresh/migration. This
supports compliance with company policies, and the performance of legal hold and e-discovery.

In terms of technology, vendors have added more features to cater to the mobile nature of laptops,
such as virtual private network (VPN)-less backup over the internet, cellular network awareness and
remote wipe. Other new product developments focus on security and compliance capabilities, device
replacement/migration automation and full-text search for faster restore/recovery. Performance
issues are averted by a combination of client-side deduplication, incremental-forever backups,
near-continuous data protection technologies, and CPU and network throttling.

User Advice: Scheduled and encrypted endpoint user data backups should be part of a robust enterprise
data protection and recovery plan. This benefits users personally as much as it helps companies to
maintain continuity after loss or theft of endpoints. Choices abound for PCs, Macs and Android. iOS
backups are design-limited by Apple. To save money, many companies are asking users to voluntarily
back up to enterprise accounts for OneDrive, GDrive and so on, but these are not a replacement for
enterprise-grade protection.

Business Impact: As the global workforce becomes more mobile and creates increasing amounts of
business content on endpoint devices, endpoint backup and recovery is growing in importance.
Moreover, new malicious attacks, such as ransomware, have increased risk profiles, and organizations
often rely on backup to restore data, instead of paying the ransom. If employees don’t back up their
endpoint devices regularly (and many do not on their own), companies may face significant risks when
important or sensitive data is lost, stolen or leaked. Such risks include R&D setbacks, fines, legal actions
and the inability to produce user data in a lawsuit.

Benefit Rating: High

Market Penetration: 20% to 50% of target audience


Maturity: Early mainstream

Sample Vendors: Carbonite; Code42; Commvault; Druva; Infrascale; Micro Focus

Recommended Reading: “Adopt Microsoft Office 365 Backup for Damage Control and Fast Recovery
After Malicious Attacks”

“Debunking the Myth of Using EFSS for Backup”

“Use These Five Backup and Recovery Best Practices to Protect Against Ransomware”

“Critical Capabilities for Data Center Backup and Recovery Solutions”

“How to Address Three Key Challenges When Considering Endpoint Backup”

Erasure Coding

Analysis By: Chandra Mukhyala; Raj Bala

Definition: Erasure-code-based protection is an alternative to traditional RAID-based protection. Erasure
coding takes a block of data, divides it into “n” smaller chunks and then adds “k” chunks of encoded data,
in such a way that the original block can be reconstructed from any “n” of the “n+k” chunks; in other
words, up to “k” chunks can be lost without losing data. The chunks are distributed across devices, which
can be storage disks or storage nodes in one or more geographic locations. The advantage of erasure
coding protection over traditional RAID is that it allows for a greater number of simultaneous storage
device failures.
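
As a concrete illustration of the “n+k” scheme, the following is a minimal sketch of the simplest case, k = 1, using XOR parity (conceptually similar to RAID 5): a block is split into n data chunks plus one parity chunk, and any single lost chunk can be rebuilt from the survivors. Production systems typically use Reed-Solomon-style codes to tolerate k > 1 simultaneous failures; in either case the capacity overhead is k/n.

```python
# Minimal erasure-coding illustration for k = 1 (single XOR parity chunk):
# split a block into n data chunks, add one parity chunk, and rebuild any
# single lost chunk from the remaining n chunks. Real deployments use
# Reed-Solomon-style codes to tolerate k > 1 simultaneous failures.

def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def encode(block: bytes, n: int):
    """Split into n padded data chunks and append one XOR parity chunk."""
    size = -(-len(block) // n)  # ceiling division
    chunks = [block[i * size:(i + 1) * size].ljust(size, b"\x00") for i in range(n)]
    parity = chunks[0]
    for c in chunks[1:]:
        parity = xor_bytes(parity, c)
    return chunks + [parity]          # n + 1 chunks; capacity overhead is 1/n

def rebuild(chunks, lost_index):
    """Recover the chunk at lost_index by XOR-ing all surviving chunks."""
    survivors = [c for i, c in enumerate(chunks) if i != lost_index and c is not None]
    rebuilt = survivors[0]
    for c in survivors[1:]:
        rebuilt = xor_bytes(rebuilt, c)
    return rebuilt

data = b"erasure coding spreads protection across devices"
stored = encode(data, n=4)
lost = 2                                                  # simulate one failed device
damaged = [c if i != lost else None for i, c in enumerate(stored)]
assert rebuild(damaged, lost) == stored[lost]
```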

Position and Adoption Speed Justification: Hard-disk drive (HDD) capacity is growing faster than HDD
data rates. The result is ever-longer rebuild times that increase the probability of experiencing
subsequent disk failures before the rebuild has completed. This drives the need for protection
schemes that continue to protect data even in the presence of failures (greater fault tolerance), and a
focus on reducing rebuild times. Erasure coding algorithms take advantage of inexpensive and rapidly
increasing compute power to write blocks of data as systems of equations. These algorithms then
transform these systems of equations back into blocks of data during read operations. Modern flash-
centric storage arrays that implement coalesced or aggregated writes minimize the performance impact
of erasure coding. Allowing the user to specify the number of failures that can be tolerated before data
integrity can no longer be guaranteed enables users to trade off data protection overhead (costs)
against mean time between data loss (MTBDL). Erasure coding is typically used in large-scale or web-
scale systems where storage drive or even entire storage node failures are common, and the system is
expected to withstand such failures.

User Advice: Have vendors profile the performance/throughput of their storage systems supporting your
workloads using the various protection schemes that they support with various storage efficiency
features (such as compression and deduplication or autotiering) turned on and off to better understand
performance-overhead trade-offs. Confirm that the choice of protection scheme does not limit the use
of other value-added features. Request minimum/average/maximum rebuild times to size the likely
rebuild window of vulnerability in a storage system supporting your production workloads. Cap
microprocessor consumption at 75% of available cycles to ensure that the system’s ability to meet
service-level objectives is not compromised during rebuilds and microcode updates. Give extra credit to
vendors willing to guarantee response times (latency), MTBDL and rebuild times.

Business Impact: Advanced data protection schemes enable vendors and users to continue lowering
storage costs and power the evolution of digital businesses by enabling the deployment of low-cost,
high-capacity disks as soon as they become technically and economically attractive. The rapid adoption
of new high-capacity HDDs lowers environmental footprints and the frequency and urgency of repair
activities by encouraging the deployment of fewer, larger storage systems that may also enable users to
delay or avoid facilities upgrades or expansions.

Benefit Rating: High

Market Penetration: 20% to 50% of target audience

Maturity: Early mainstream

Sample Vendors: Caringo; DDN; Dell EMC; IBM; NetApp; NEC; Panasas; Scality; SwiftStack

Recommended Reading: “Technology Overview for Erasure Coding”

“Slow Storage Replication Requires the Redesign of Disaster Recovery Infrastructures”

Public Cloud Storage

Analysis By: Raj Bala

Definition: Public cloud storage is infrastructure as a service (IaaS) that provides block, file and/or object
storage services delivered through standard protocols and APIs. The services are stand-alone, but are often
used with compute and other IaaS products. They are priced based on capacity, data transfer and/or the
number of requests. The services provide on-demand storage and are self-provisioned. Stored data exists
in a multitenant environment, and users access that data through block, network file and representational
state transfer (REST) protocols.
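
Because services are metered on capacity, data transfer and requests, a rough monthly estimate is simple arithmetic. The sketch below uses purely illustrative unit prices, not any provider's actual rates; substitute the published prices for the region and storage class under evaluation.

```python
# Back-of-the-envelope monthly cost model for capacity-, transfer- and
# request-based pricing. All unit prices are illustrative placeholders only;
# use the provider's published price list for real estimates.

def monthly_cost(stored_gb, egress_gb, requests,
                 price_per_gb_month=0.02,      # storage (placeholder)
                 price_per_gb_egress=0.09,     # data transfer out (placeholder)
                 price_per_1k_requests=0.005): # API requests (placeholder)
    return (stored_gb * price_per_gb_month
            + egress_gb * price_per_gb_egress
            + (requests / 1000) * price_per_1k_requests)

# Example: 50 TB stored, 2 TB read back out, 10 million requests per month.
print(f"${monthly_cost(50_000, 2_000, 10_000_000):,.2f}")
```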

Position and Adoption Speed Justification: Public cloud storage is a critical part of most workloads that
use public cloud IaaS, even if it’s often invisible to end users. In fact, the default volume type used for
virtual machines (VMs) on some providers is solid-state drive (SSD)-based block storage. Unstructured
data is frequently stored in object storage services for high-scale, low-cost requirements; however, end
users are often unaware of the underlying storage type being used. The market for public cloud storage
is becoming more visible to end users, as cloud providers begin offering more-traditional enterprise
brands with data management capabilities of storage systems found on-premises.

User Advice: Do not choose a public cloud storage provider based simply on cost or on your enterprise’s
relationship with the provider. The lowest-cost providers may not have the scale and operational
capabilities required to become viable businesses that are sustainable over the long term. Moreover,
these providers are also unlikely to have the engineering capabilities to innovate at the rapid pace set by
the leaders in this market. Upheaval in this market warrants significant consideration of the risks if
organizations choose a provider that is not one of the hyperscale vendors, such as Alibaba, Amazon Web
Services (AWS), Google and Microsoft. Many of today’s Tier 2 public cloud storage offerings may not
exist in the same form tomorrow, if they exist at all.

Use public cloud storage services when deploying applications in public cloud IaaS environments,
particularly those workloads focused on analytics. Match workload characteristics and cost
requirements to a provider with equivalently suited services.

Business Impact: Public cloud storage services are part of the bedrock that underpins public cloud IaaS.
Recent advances in performance, as they relate to these storage services, have enabled enterprises to
use cloud IaaS for mission-critical workloads in addition to new, Mode-2-style applications. The security
advances enable enterprises to use public cloud storage services and experience the agility aspects of a
utility model, yet retain complete control from an encryption perspective.

Benefit Rating: High

Market Penetration: More than 50% of target audience

Maturity: Mature mainstream

Sample Vendors: Alibaba Cloud; Amazon Web Services; Google; IBM; Microsoft; Oracle; Rackspace;
Virtustream

Recommended Reading: “Magic Quadrant for Public Cloud Storage Services, Worldwide”

“Magic Quadrant for Cloud Infrastructure as a Service, Worldwide”

Enterprise Information Archiving

Analysis By: Julian Tirsu; Michael Hoeck

Definition: Enterprise information archiving (EIA) solutions provide tools for journaling data into a
distributed or centralized repository for compliance and efficiency. EIA supports multiple data types,
including email, IM, file, public and business social, and voice. These tools provide access to archived
data in the repository, either through a plug-in to the native application or via a pointer or browser
access. Some are also able to manage the data in place. EIA tools support operational efficiency,
compliance, retention management and e-discovery.

Position and Adoption Speed Justification: The number of vendors offering EIA solutions has stabilized,
and consolidation in the marketplace has been a key trend over the past two years. Driven by awareness
created through Microsoft Office 365 adoption, archiving is becoming mainstream for meeting
compliance and e-discovery needs for organizations, and adoption has spread beyond heavily regulated
industries. Compliance and regulatory requirements drive the retention of messaging data, with SaaS-
based archiving increasingly becoming the repository of choice. In financial services, the need to capture
voice has also risen as a key requirement.

Support for the capture and supervision of social media has become a requirement in regulated
industries. Support for multiple content types is also standard for most EIA products. Many companies
are looking to replace their archiving products with newer ones (particularly SaaS solutions), and many
migration services are available. The cost of these services remains an inhibitor to switching vendors in
some cases.

The appetite for email-only archiving solutions remains; however, most organizations are looking to
vendors with support for multiple communications types (such as email, social, mobile and voice).

User Advice: As requirements to store, search and discover old data grow, companies are implementing
an EIA solution, starting with email as the first managed content type. Many organizations are looking to
migrate to cloud email and productivity solutions, such as those offered by Microsoft and Google. When
migrating, associated compliance and regulatory retention requirements must be considered. In
addition, organizations should have an overall data retention plan, including the need to archive
additional content types. EIA use cases are growing to include records management, analytics and
classification abilities.

Organizations must ensure contractually that they have an appropriately priced process as well as an
option for extracting data from an archive solution — namely, from SaaS providers. Migrating personal
stores to the archive should be part of the deployment of an archive system. The migration of legacy
email archives, including into and out of a hosted solution, can be complex and expensive, and it should
be scoped during the selection phase. In SaaS-archiving contracts, organizations need to include an exit
strategy that minimizes costs and to understand that they own the data, not the SaaS providers. When
determining costs versus benefits for SaaS archiving, include soft expenses associated with on-premises
solutions for personnel and IT-involved discovery requests.

Business Impact: EIA improves application performance, delivers improved service to users and enables
a timely response to legal discovery and business requests for historical information. Archived data can
be stored in a less expensive fashion, with the opportunity to take some data offline or delete it. Moving
old data to an archive also reduces backup and recovery times by decreasing the active dataset. This
provides significant improvements to established processes that ultimately result in cost savings for the
organization.

Email and e-discovery remain the predominant content type and use case, but long-term digital
preservation for file and classification are gaining interest as EIA capabilities. Archiving offered via SaaS is
increasing in popularity because of the benefits associated with offloading low-business-value tasks to a
third party, as well as the reduced capital expense. SaaS-based message data archiving is leading the
way and is currently priced on a per-user, per-month basis, with no storage overages. As cost structure
and integration issues are ironed out, more file system data and application data will be archived in the
cloud. In addition, more organizations are seeking to create a holistic information governance strategy,
including analytics of all data, so the right selection of an archiving or retention solution becomes even
more imperative.

EIA is an important part of e-discovery, providing support for the Electronic Discovery Reference Model
(EDRM). Legal hold, retention management, search and export features are used to meet discovery and
compliance requirements. Supervision tools for sampling and reviewing messages are available with
many EIA products. This is in response to requirements specific to the regulated portion of the financial
industry. To meet the requirements of mobile workers, EIA offers a way for organizations to keep data
compliant in an archive, while providing access via mobile devices.

Benefit Rating: High

Market Penetration: 20% to 50% of target audience

Maturity: Early mainstream

Sample Vendors: Bloomberg; Global Relay; Google; Micro Focus; Microsoft; Mimecast; Proofpoint;
Smarsh; Veritas Technologies; ZL Technologies

Recommended Reading: “Design a Business-Driven Archive Strategy”

“Magic Quadrant for Enterprise Information Archiving”

“Critical Capabilities for Enterprise Information Archiving”

Virtual Machine Backup and Recovery

Analysis By: Jerry Rozeman; Michael Hoeck

Definition: Virtual machine (VM) backup and recovery focuses on protecting and recovering data from
VMs, as opposed to the physical servers they run on. Backup methods optimized for VM backup typically
leverage hypervisor-native APIs for changed block tracking (CBT), which enables block-level,
incremental-forever backup, eliminating the need for the in-guest, agent backup method. Some backup
vendors create their own CBT drivers before a hypervisor vendor introduces its own, and adopt
hypervisor-native CBT when it becomes available.
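
To illustrate the changed block tracking concept described above, the following is a simplified conceptual sketch, not the API of any hypervisor or backup product: after one full copy, each subsequent backup copies only the blocks reported as changed since the previous snapshot.

```python
# Conceptual changed-block-tracking (CBT) model: a full copy once, then
# incremental-forever backups that copy only the block IDs reported as
# changed since the previous snapshot. Not a real hypervisor API.

class VirtualDisk:
    def __init__(self, blocks):
        self.blocks = dict(blocks)   # block_id -> bytes
        self.changed = set()         # block IDs written since the last snapshot

    def write(self, block_id, data):
        self.blocks[block_id] = data
        self.changed.add(block_id)   # hypervisor-side change tracking

    def snapshot_changed_blocks(self):
        changed, self.changed = self.changed, set()
        return changed

def backup(disk, repository, full=False):
    """Copy all blocks (full) or only the blocks changed since the last backup."""
    changed = disk.snapshot_changed_blocks()          # also resets the tracker
    block_ids = disk.blocks.keys() if full else changed
    for bid in block_ids:
        repository[bid] = disk.blocks[bid]

disk = VirtualDisk({0: b"boot", 1: b"app", 2: b"data"})
repo = {}
backup(disk, repo, full=True)        # initial full backup
disk.write(2, b"data-v2")            # guest changes one block
backup(disk, repo)                   # incremental: copies only block 2
assert repo[2] == b"data-v2" and repo[0] == b"boot"
```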

Position and Adoption Speed Justification: Enterprise VM backup typically focuses on VMware and
Microsoft Hyper-V, because they are the most-often deployed hypervisors in enterprise data centers.
Increasingly, data center backup vendors also support Kernel-based Virtual Machine (KVM) hypervisors
from Amazon Web Services (AWS), Nutanix and Red Hat. Most backup software solutions have
abandoned the traditional guest OS agent approach and have adopted clientless, snapshot-based
backup with CBT technology, leveraging hypervisor APIs. “VM stun” issues remain a challenge for VM
backups for applications with high input/output (I/O) change rates. In addition, many traditional backup
applications require the installation of guest OS agents to do granular item restore for applications such
as Exchange and SharePoint running on VMware.

User Advice: Now that most companies have virtualized much of their data center applications, virtual
infrastructure recovery has become the most significant component of an organization’s overall data
availability, backup/recovery and disaster recovery (DR) plan. Evaluate VM backup and recovery
solutions on their capabilities for ease of use, scalability, cloud integration, replication, DR orchestration
and automation capabilities. In addition, look at their new recovery scenarios, such as instant recovery,
ransomware detection, and recovery features and data immutability capabilities for protecting the
backup system.

Business Impact: As production environments have become highly or completely virtualized, the need to
recover data in these environments has become critical. VM backup and recovery solutions help recover
from the impact of disruptive events, including user or administrator errors, application errors, external
or malicious attacks, equipment malfunction, and the aftermath of disaster events. The ability to protect
and recover VMs in an automated, repeatable and timely manner is important for many organizations.

Benefit Rating: High

Market Penetration: More than 50% of target audience

Maturity: Mature mainstream

Sample Vendors: Actifio; Arcserve; Cohesity; Commvault; Dell EMC; IBM; Rubrik; Unitrends; Veeam;
Veritas

Recommended Reading: “Best Practices for Repairing the Broken State of Backup”

Entering the Plateau

Continuous Data Protection

Analysis By: Santhosh Rao; Jerry Rozeman

Definition: Continuous data protection (CDP) is an approach to continuously, or nearly continuously, capture
and transmit changes to applications, files and/or blocks of data. Depending on solution architecture,
real-time changes are journaled or replicated to a local or remote storage target. This capability provides
options for more granular recovery point objectives and is used for backup/recovery, disaster recovery
and data migration use cases. Some CDP solutions can be configured to capture changes continuously
(true CDP) or at scheduled times (near CDP).

Position and Adoption Speed Justification: The difference between near CDP and regular backup is that
backup is typically performed from once to only a few (no more than four) times a day.
However, near CDP is often done every few minutes or hours, providing many more recovery options
and minimizing any potential data loss. Several products also provide the ability to heterogeneously
replicate and migrate data between two different types of storage devices, allowing for potential cost
savings for disaster recovery solutions and data or cloud migrations. Checkpoints of consistent states are
used to enable rapid recovery to known good states (such as before a patch was applied to an
application or the last time a database was reorganized). This ensures the application consistency of the
data and minimizes the number of log transactions that must be applied.

True-CDP and near-CDP capabilities are increasingly integrated with backup software capabilities but can
still be packaged as server-based software, as network-based appliances or as part of a storage
controller snapshot implementation. The delineation between frequent snapshots (one to four per hour
or less granularity) and near CDP is not crisp. Administrators often implement snapshots and CDP
solutions in a near-CDP manner to strike a balance between resource utilization and improved
recoverability.
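
To make the journaling behavior described above concrete, here is a minimal conceptual sketch (not the design of any particular CDP product, with all names hypothetical): every change is appended to a time-stamped journal, and a point-in-time restore replays the journal up to the chosen recovery point.

```python
# Conceptual CDP journal: every change is captured with a timestamp, and a
# point-in-time restore replays the journal up to (and including) the chosen
# recovery point. A near-CDP product would batch these captures on a schedule.

class ChangeJournal:
    def __init__(self):
        self._entries = []           # list of (timestamp, key, value), time-ordered

    def capture(self, timestamp, key, value):
        self._entries.append((timestamp, key, value))

    def restore_to(self, point_in_time):
        """Rebuild state as of the given point in time by replaying the journal."""
        state = {}
        for ts, key, value in self._entries:
            if ts > point_in_time:
                break
            state[key] = value
        return state

journal = ChangeJournal()
journal.capture(100, "invoice.docx", "v1")
journal.capture(160, "invoice.docx", "v2")
journal.capture(200, "invoice.docx", "corrupted by ransomware")

# Recover the last known-good state from just before the corruption at t=200.
print(journal.restore_to(199))   # {'invoice.docx': 'v2'}
```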

User Advice: Consider CDP for critical data where regular snapshots and/or backups do not enable
meeting the required recovery point objectives (RPOs). Gartner has observed that true-CDP
implementations are often used for files, email and laptop data, but adoption for replication and
recovery of VMs, databases and other applications is a common use case too. Ensure that the
bandwidth requirements of CDP are addressed before implementing the solution.

Business Impact: CDP can dramatically change the way data is protected, decreasing backup and
recovery times, as well as reducing the amount of lost data, and can provide additional recovery points.
Compared to traditional backup, which typically captures data only once a day, the amount of data lost
in a restore situation can be nearly 24 hours for backup versus minutes or a few hours with CDP. The
integration of live mounting capabilities with CDP technology can further shorten recovery time
objectives (RTOs). CDP can also be an effective countermeasure against ransomware.

Benefit Rating: High

Market Penetration: More than 50% of target audience

Maturity: Mature mainstream

Sample Vendors: Actifio; Arcserve; Code42; Commvault; DataCore; Dell EMC; Druva; Microsoft; Reduxio;
Zerto

Recommended Reading: “Magic Quadrant for Data Center Backup and Recovery Solutions”

“Critical Capabilities for Data Center Backup and Recovery Solutions”

Appendixes

Figure 3. Hype Cycle for Storage Technologies, 2018

Hype Cycle Phases, Benefit Ratings and Maturity Levels

Table 1: Hype Cycle Phases

Innovation Trigger

A breakthrough, public demonstration, product launch or other event generates significant press and
industry interest.

Peak of Inflated Expectations

During this phase of overenthusiasm and unrealistic projections, a flurry of well-publicized activity by
technology leaders results in some successes, but more failures, as the technology is pushed to its limits.
The only enterprises making money are conference organizers and magazine publishers.

Trough of Disillusionment

Because the technology does not live up to its overinflated expectations, it rapidly becomes
unfashionable. Media interest wanes, except for a few cautionary tales.

Slope of Enlightenment

Focused experimentation and solid hard work by an increasingly diverse range of organizations lead to a
true understanding of the technology’s applicability, risks and benefits. Commercial off-the-shelf
methodologies and tools ease the development process.

Plateau of Productivity

The real-world benefits of the technology are demonstrated and accepted. Tools and methodologies are
increasingly stable as they enter their second and third generations. Growing numbers of organizations
feel comfortable with the reduced level of risk; the rapid growth phase of adoption begins.
Approximately 20% of the technology’s target audience has adopted or is adopting the technology as it
enters this phase.

Years to Mainstream Adoption

The time required for the technology to reach the Plateau of Productivity.

Source: Gartner (July 2019)

Table 2: Benefit Ratings

Transformational

Enables new ways of doing business across industries that will result in major shifts in industry dynamics

High

Enables new ways of performing horizontal or vertical processes that will result in significantly increased
revenue or cost savings for an enterprise

Moderate

Provides incremental improvements to established processes that will result in increased revenue or
cost savings for an enterprise

Low

Slightly improves processes (for example, improved user experience) that will be difficult to translate
into increased revenue or cost savings

Source: Gartner (July 2019)

Table 3: Maturity Levels

Embryonic

In labs

None

Emerging

Commercialization by vendors

Pilots and deployments by industry leaders

First generation

High price

Much customization

Adolescent

Maturing technology capabilities and process understanding

Uptake beyond early adopters

Second generation

Less customization

Early mainstream

Proven technology

Vendors, technology and adoption rapidly evolving

Third generation

More out-of-box methodologies


Mature mainstream

Robust technology

Not much evolution in vendors or technology

Several dominant vendors

Legacy

Not appropriate for new developments

Cost of migration constrains replacement

Maintenance revenue focus

Obsolete

Rarely used

Used/resale market only

Source: Gartner (July 2019)

Evidence

End-user inquiries

Vendor briefings

© 2019 Gartner, Inc. and/or its affiliates. All rights reserved. Gartner is a registered trademark of
Gartner, Inc. and its affiliates. This publication may not be reproduced or distributed in any form without
Gartner's prior written permission. It consists of the opinions of Gartner's research organization, which
should not be construed as statements of fact. While the information contained in this publication has
been obtained from sources believed to be reliable, Gartner disclaims all warranties as to the accuracy,
completeness or adequacy of such information. Although Gartner research may address legal and
financial issues, Gartner does not provide legal or investment advice and its research should not be
construed or used as such. Your access and use of this publication are governed by Gartner’s Usage
Policy. Gartner prides itself on its reputation for independence and objectivity. Its research is produced
independently by its research organization without input or influence from any third party. For further
information, see "Guiding Principles on Independence and Objectivity."
