Meeting the Storage Needs of AI

With artificial intelligence (AI) workloads emerging, find out what you need to do to keep your data always available.

In this e-guide:

Artificial intelligence is transforming the way organizations do business, but when it comes to storage best practices to meet the infrastructure needs of this new technology, there are still a lot of unknowns.

This e-guide will clear things up and help you get the most out of storage for AI, covering trends in the storage consumption landscape, answers to key AI storage questions and much more.

Contents:
• How AI is changing the storage consumption landscape
• 6 key AI data storage questions answered
• 8 factors that make AI storage more efficient
• Storage options meet the needs of machine learning and AI
How AI is changing the storage consumption landscape
Marc Staimer, Founder

When it comes to storage for AI applications, the key issue isn't that these apps consume more storage than other applications -- they don't. The key issue is that they consume storage differently. With AI applications, data moves from storage to AI processing or I/O. It also moves between different storage systems and media at different points in its lifecycle.

I/O is primarily tied to throughput, regardless of the type of storage or storage media the data is stored on. AI's three modes -- machine learning, deep machine learning and neural networks -- each ingest and process data differently and, therefore, have distinctive I/O requirements. A look at each reveals how AI applications are changing storage consumption.

Speed is key to storage consumption

Machine learning, the most common form of AI, can potentially use millions of data points to make predictions or decisions based on human-derived algorithms. The accuracy of an AI app's outcomes is tied to the number of data points ingested within a specified timeframe. More data points lead to more accurate predictions and decisions.

Time is the limiting factor: If a prediction or decision is required in n milliseconds, the speed at which the machine learning algorithm can ingest and examine the data points determines the quality of the outcome. GPUs and high-performance computing have eliminated most processing bottlenecks. That leaves storage I/O as the component that must keep up with the machine learning algorithm.
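That time budget translates directly into a storage-throughput floor. A back-of-the-envelope sketch (the data-point count, record size and deadline below are hypothetical, chosen only for illustration):

```python
def required_throughput_gbps(data_points: int, bytes_per_point: int,
                             deadline_ms: float) -> float:
    """Minimum sustained read throughput (GB/s) needed to ingest
    `data_points` records of `bytes_per_point` bytes within `deadline_ms`."""
    total_bytes = data_points * bytes_per_point
    seconds = deadline_ms / 1000.0
    return total_bytes / seconds / 1e9

# Hypothetical workload: 10 million 1 KB data points, 50 ms decision window.
gbps = required_throughput_gbps(10_000_000, 1_000, 50.0)
print(f"{gbps:.0f} GB/s")  # the storage system must sustain this read rate
```

Even modest-sounding point counts and deadlines land in the hundreds of gigabytes per second, which is why throughput dominates the storage discussion that follows.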
Deep learning applications draw on millions, or even billions, of data points, and they make multiple passes over the same data. This exacerbates the I/O bottleneck problem.

Machine learning and deep machine learning algorithms can run on modern server architectures, but neural networks are different. Neural networks, also referred to as artificial neural networks, mimic the neuron architecture of the human brain. By definition, they require extensive scale-out GPUs and CPUs, ranging from dozens to millions of processors. The key to neural network storage is to provide extremely high-performance parallel file system throughput. This is where IBM Spectrum Scale, Lustre (open source), Panasas ActiveStor and WekaIO are a good fit.

Legacy block storage systems generally aren't able to deliver the hundreds-of-gigabytes- to terabytes-per-second read throughput required. However, several newer extreme-performance storage systems can meet these needs, some better than others, including: Dell EMC PowerMax, DataDirect Networks (DDN) Storage Fusion Architecture, Excelero NVMesh, Fungible, IBM FlashSystem, Oracle Exadata, Pavilion Hyperparallel Flash Array and StorCentric Vexata.

Legacy file storage systems also aren't up to the task. But newer generations of parallel file systems with global namespaces can deliver the throughput needed. They include DDN EXA5, IBM Spectrum Scale, Panasas ActiveStor and WekaIO.
Solving the data migration problem
Keep in mind that both machine learning and deep machine learning use current and
historical data to examine, learn, predict and decide. The historical data these apps use is
unlikely to reside on super-fast, expensive storage systems. It may start there, but as it ages,
it's moved to slower, cheaper storage systems or the cloud.

This is a problem. Moving data from fast, expensive storage to slower, cheaper systems must be simple and automated, and it also must be transparent to the AI applications. In other words, as far as the AI application is concerned, the data is still there. Data movement is the primary reason many AI projects fail, so getting this right is paramount to success. There are two ways to accomplish this.
The first is to build data migration into the storage system, so data is moved within the system or to the cloud. The system knows where the data resides -- i.e., the storage it is consuming -- and feeds it to the AI to process on demand. This approach to data migration and storage consumption suffers from limited system capacity. Once again, newer scale-out technologies mitigate this problem. Examples include Nimbus Data, Oracle Exadata X8M, StorOne and Vast Data.

The Oracle Exadata X8M uses high-performance Intel Optane DC Persistent Memory Modules (PMEM) in Application Direct Mode with NVMe flash in its primary storage servers and low-cost, high-capacity spinning disks in its XT storage servers. The amount of PMEM is limited to 27 TB per rack, with support for as many as 18 racks. That's potentially a lot of high-performance storage, which isn't cheap, and not all databases or all data within a database require PMEM performance. The Oracle Database moves older, less frequently accessed data from PMEM to NVMe flash and then to lower-performance, lower-cost storage servers, such as its low-cost XT storage server. The Oracle Database knows where the data is and still has access to it on demand.
Vast Data's all-flash array uses high-performance, high-endurance Optane storage class memory (SCM) drives as a cache for lower-cost, lower-performance, lower-endurance quad-level cell (QLC) flash SSDs. Clever software puts most of the writes on the SCM and limits writes to the QLC flash. The data invariably ends up on the QLC while still providing exceptional AI read performance. Storage startups Nimbus Data and StorOne claim to provide similar capabilities.

The second approach is to move the data in the background, leaving behind a hierarchical storage management (HSM) stub, symbolic link (symlink) redirect or global namespace middleware. The traditional HSM is problematic in that the data must be moved back to the original storage before it can be read, and that's too slow. Stubs can be brittle and break, orphaning data. Symlinks can also break, albeit less frequently. And a global namespace typically means something -- probably middleware -- is in the data path, adding some latency. However, that latency is nominal and applies only to the first readable byte.
storage more efficient (P.13)
Symlinks and global namespaces can provide the data required from multiple resources, concurrently feeding the AI machine learning or deep machine learning app the data points it needs. Several products deliver these functions, including Datadobi, iRODS (open source), Komprise and StrongBox Data Solutions StrongLink.

Transparent data movement is what's needed to provide AI machine learning and deep learning apps with the data volume they need at an acceptable cost. Such transparency shifts consumption from high-cost, high-performance storage to lower-cost, lower-performance storage. This approach is a necessity to effectively handle the vast volumes of data AI applications demand.
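The symlink redirect described above is easy to picture: the file is moved to cheaper storage in the background, and a link is left at the original location, so the application's path still resolves. A minimal sketch (the tier directories here are ordinary temp folders standing in for fast and cheap storage):

```python
import os
import shutil
import tempfile

def demote(path: str, archive_dir: str) -> str:
    """Move a file to cheaper storage and leave a symlink behind,
    so readers of the original path are transparently redirected."""
    target = os.path.join(archive_dir, os.path.basename(path))
    shutil.move(path, target)   # background migration to the cheap tier
    os.symlink(target, path)    # redirect left at the original location
    return target

# Demo: two directories stand in for the fast and cheap tiers.
fast = tempfile.mkdtemp(prefix="fast-")
cheap = tempfile.mkdtemp(prefix="cheap-")
src = os.path.join(fast, "training.dat")
with open(src, "wb") as f:
    f.write(b"data points")

demote(src, cheap)
with open(src, "rb") as f:      # the original path still works
    print(f.read())             # b'data points'
```

The fragility the article warns about is visible here too: if the archived file is ever moved or deleted out-of-band, the symlink dangles and the data is effectively orphaned.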

6 key AI data storage questions answered
Stacey Peterson, Senior Site Editor

Research firm IDC projected that, by 2023, global spending on AI technology will hit nearly $98 billion, up from $37.5 billion in 2019. That increase represents a compound annual growth rate of nearly 30%. All those new, complex AI applications won't be deployed in a vacuum. A range of IT infrastructure, including storage tailored for AI, must support and process these new workloads.
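The growth-rate claim follows from the two spending figures cited above; it can be checked in a couple of lines:

```python
# Compound annual growth rate implied by the IDC projection:
# $37.5B in 2019 growing to roughly $98B by 2023 (4 years).
start, end, years = 37.5, 98.0, 4
cagr = (end / start) ** (1 / years) - 1
print(f"{cagr:.1%}")  # roughly 27%, in line with the "nearly 30%" figure
```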
AI applications, particularly machine and deep learning apps, require vast amounts of data that get sent to CPUs or GPUs for processing and analysis in near real time. And then much of that data must be stored for possible future use. AI applications and technology will be among the top factors affecting infrastructure decisions over the next several years. For storage, that means enterprises must understand the data they're processing, find ways to get their storage media closer to compute technology and enhance AI data storage performance to match that of the processors.

What follows are six questions we answered in various SearchStorage articles related to how AI workloads are changing enterprise storage infrastructure.

1. What should be considered when planning AI data storage workloads?
Putting together storage for AI applications is no easy matter. There are several issues to consider and details to get right. Consultant and tech writer Robert Sheldon lists eight such factors:

• Understand your various workloads' storage requirements.
• Know your capacity and scalability requirements.
• Find out how long you'll need to hold onto the data and how it will be accessed.
• Factor in the throughput and I/O rates you'll need.
• Consider location -- data near compute will minimize latency.
• Assess the best type of storage to use -- block, file or object.
• Use intelligent and software-defined storage to enhance performance.
• Ensure all systems involved are tightly integrated.
2. What challenges does AI data storage bring?


There are two distinct challenges related to building storage for AI applications, according to
IT industry veteran Chris Evans. On the one hand, it won't always be clear at the outset of an
AI or machine learning project what data will be useful. As a result, long-term archive storage
will be required where data can be retained and accessed when needed to support the
learning effort.

The other major storage challenge lies in ensuring sufficient high-performance storage is available for the active data that needs processing. Vendors are combining fast storage with AI and machine learning capabilities to meet this need. Evans outlines the challenges of building a storage platform that balances the storage needs of AI workloads. Packaged storage products tuned for AI are part of that discussion and can be attractive because they offer a specific level of performance.

3. What challenges does object storage pose?

On paper, the high-node-count storage clusters of object storage systems should be ideal for the demands of the large, unstructured data workloads generated by AI and machine learning applications. In addition, most AI and machine learning frameworks communicate with storage via the Amazon S3 protocol, as do most object storage systems. In reality, however, other factors come into play that can interfere with object storage's effectiveness when it comes to AI data storage.

Metadata, in particular, can be a problem, overwhelming dedicated controllers and negatively affecting the performance of SSDs and HDDs. Contributor George Crump examines metadata and other issues related to cluster communications, internode networking and the protocol emulation required for IoT devices that aren't native to S3 and use NFS.
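The metadata pressure is easy to see at IoT scale: with billions of small objects, the per-object metadata becomes a substantial workload of its own. A rough illustration (the object count and per-object metadata size below are hypothetical):

```python
def metadata_fraction(objects: int, avg_object_bytes: int,
                      metadata_bytes_per_object: int) -> float:
    """Fraction of total stored bytes that is metadata rather than data."""
    data = objects * avg_object_bytes
    meta = objects * metadata_bytes_per_object
    return meta / (data + meta)

# Hypothetical sensor workload: 2 billion 4 KB readings, with roughly
# 1 KB of metadata (key, tags, ACLs, checksums, index entries) each.
frac = metadata_fraction(2_000_000_000, 4_096, 1_024)
print(f"{frac:.0%} of stored bytes are metadata")  # 20%
```

With small objects, a fifth of every byte stored and every I/O served is metadata traffic, which is exactly the load that can overwhelm dedicated controllers.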

4. What role will flash play in AI data storage workloads?

To get the information they need, AI and machine learning applications process large amounts of data. These applications usually rely on a cluster of compute nodes where at least some of the nodes use more expensive GPUs to deliver the performance of up to 100 CPUs. The storage infrastructure must ensure data is continuously provided to these GPUs so they're always in use. These systems must be able to store and process the millions, and even billions, of data files typically generated by sensors or IoT devices.

As a result, AI applications have high storage-capacity demands that can easily start in the terabyte range and scale into hundreds of petabytes. Crump looks at what the demands of AI workloads mean for the storage media used. Because of these high demands and the relatively high cost of flash, he expects AI storage to rely less on flash and more on a combination of RAM, hard disks and tape.
5. How does NVMe help meet AI data storage's needs?
NVMe provides the large bandwidth and lower latency that AI and machine learning
applications need to maximize performance. It's a natural fit with the high-performance, scale-
out storage and GPU-based compute that AI platforms use and will help eliminate I/O
bottlenecks and provide scalability.

IT journalist John Edwards tracks how NVMe is replacing traditional storage in AI environments and why that makes sense in this performance-driven world.

6. Where does storage class memory fit in?

As a new tier in the memory/storage hierarchy, storage class memory (SCM) sits between SSDs and dynamic RAM (DRAM). SCM devices connect directly to the server memory bus and, like DRAM, are byte-addressable. But, like NAND-based devices, they are persistent, and they support block-level access. SCM usually sits between memory and storage, letting applications access large data sets through system memory without forcing enterprises to pay the high price of DRAM.

AI applications are among the use cases where emerging SCM technology makes sense, according to Sheldon. SCM devices provide the low latency, high durability and optimal performance AI workloads require.
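That byte-addressable access pattern is the key difference from block storage: the application indexes directly into a mapped region instead of issuing block-sized reads. An ordinary file plus mmap gives the flavor of it (the file here is only a stand-in for a real persistent-memory region):

```python
import mmap
import tempfile

# Write a data set to a file standing in for a persistent-memory region.
with tempfile.NamedTemporaryFile(delete=False) as f:
    f.write(b"A" * 4096 + b"interesting-record" + b"B" * 4096)
    path = f.name

# Byte-addressable access: grab 18 bytes at an arbitrary offset by
# indexing into the mapping, with no explicit block I/O in app code.
with open(path, "rb") as f:
    mem = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
    record = mem[4096:4096 + 18]   # index straight into the mapping
    print(record)                  # b'interesting-record'
    mem.close()
```

With real SCM in Application Direct Mode, the same slice would be served from persistent memory at near-DRAM latency rather than from the page cache.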
8 factors that make AI storage more efficient
Robert Sheldon, Contributor

Today's AI workloads need storage systems that deliver the performance, capacity and availability necessary to ensure reliable operations throughout the application's lifecycle. AI technologies, such as machine learning, deep learning and predictive analytics, require AI storage systems that can deal with the vast amounts of diverse data they generate, along with the fluctuating and process-intensive workloads.

What follows is a look at eight factors to consider when planning storage for your AI workloads.
1. Workload patterns

AI storage requirements vary significantly from one application to the next. Applications generate different quantities of data and have a variety of access requirements and I/O patterns. For example, a deep learning application might need to access and process data more frequently than a basic machine learning one, while continuously adding data to the existing pool. You must thoroughly understand each workload's storage requirements now and in the future, and you shouldn't assume any two workloads are alike.

But understanding those requirements is no small matter. A typical AI application goes through several stages of operation, and storage requirements can vary from one stage to the next.

For example, during the ingestion stage, vast amounts of heterogeneous data are collected and saved to disk, usually as sequential write operations. But during the transformation stage, when data must be cleansed, organized and transformed, fluctuating amounts of data are read and written, incurring both random and sequential operations.
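The two stages boil down to different I/O patterns against the same data. A small sketch of both patterns (sizes are kept tiny so it runs anywhere; a real benchmark would use far larger files and direct I/O):

```python
import os
import random
import tempfile

BLOCK = 4096
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    path = tmp.name

# Ingestion stage: collected data lands on disk as sequential writes.
with open(path, "wb") as f:
    for i in range(256):
        f.write(bytes([i % 256]) * BLOCK)

# Transformation stage: cleansing touches blocks in no particular order,
# mixing random reads with in-place rewrites.
with open(path, "r+b") as f:
    for block_no in random.sample(range(256), 32):
        f.seek(block_no * BLOCK)
        data = f.read(BLOCK)        # random read
        f.seek(block_no * BLOCK)
        f.write(data)               # rewrite in place

print(os.path.getsize(path))  # 1048576 bytes: 256 blocks of 4 KB
```

A storage system sized only for the sequential ingestion pattern can stumble badly on the random pattern, which is why each stage's profile needs to be understood separately.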
2. AI storage scalability

An AI application needs lots of data. The more data available to the AI application, the more accurate its results. And that data can come from a variety of sources and in a wide range of formats. True, some AI applications require less data than others, but you must still factor in capacity and scalability requirements. Be sure to consider the need to copy, move, aggregate or otherwise manipulate and process the data.

All this storage can represent a significant investment, whether in data center systems or cloud-based services. It can be expensive to handle the storage needed on premises, especially using high-performing flash arrays, yet farming it all out to the cloud isn't always the best alternative.

One way to keep costs down and still meet scalability requirements is to use both flash and hard-disk storage, rather than relying solely on flash. Another option is to implement a hybrid or multi-cloud strategy. The challenge with this approach, however, is that you must carefully regulate the amount of data you're copying or migrating across platforms, and you must keep distances in mind. Otherwise, data duplication or migration costs could undermine the advantages of a cloud strategy.

3. Data durability
For some AI applications, the amount of data isn't the only consideration. You also must look
at how long you need to keep that data. Some applications require ongoing analytics that
continuously infuse new data into the old, a process that can span years, resulting in
enormous stockpiles of information. To ensure the data is going to be around for the duration,
you need comprehensive backup and disaster recovery strategies, in addition to heaps of
storage capacity.
When evaluating your AI application's workload patterns and scalability requirements, be sure to account for issues such as how long you must hang on to the data, how the data will be accessed going forward, what data can be archived, when it can be archived and, of course, the amount of data that needs to be stored throughout the entire lifecycle.
4. System performance

An AI solution collects, processes, aggregates, trains on and analyzes data. To carry out these operations against massive data sets, AI storage must be fast and efficient, able to deliver the necessary throughput and I/O rates while reducing latency and contention. If the storage system isn't built and optimized to meet these demands, you might be looking at weeks to complete a single iteration of the data training phase.

Today's AI products often run on high-performing, GPU-based compute systems. The storage platform must keep up with these systems to make the investment worthwhile. That means, among other things, avoiding I/O bottlenecks and performance issues. A massively parallel storage architecture is one way to achieve these AI storage goals, especially during training, which puts heavy demands on compute and storage systems alike.
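The weeks-long-iteration warning is simple arithmetic: the time to stream the training set once is roughly its size divided by sustained storage throughput. A sketch with hypothetical numbers:

```python
def epoch_hours(dataset_tb: float, throughput_gbps: float) -> float:
    """Hours to stream a full data set once at a sustained read rate."""
    seconds = dataset_tb * 1e12 / (throughput_gbps * 1e9)
    return seconds / 3600

# Hypothetical 500 TB training set:
slow = epoch_hours(500, 2.0)    # legacy array sustaining 2 GB/s
fast = epoch_hours(500, 200.0)  # parallel scale-out storage at 200 GB/s
print(f"{slow:.0f} h vs {fast:.1f} h per pass over the data")
```

At 2 GB/s a single pass takes nearly three days, and training typically needs many passes, which is how iteration times stretch into weeks on under-provisioned storage.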

5. Data locality

The location of your data plays a role in efficiently processing massive volumes. The nearer the data is stored to where it's processed, the more efficient the operations. Organizations that process and store data in their own data centers or on a single cloud platform have an advantage. Organizations that use hybrid and multi-cloud strategies could have a tougher time, undermining some of the advantages that come with cloud strategies. To implement an effective AI solution, you must minimize latencies, and distance can be one of the biggest contributors to latency.
6. Storage type

Another consideration when implementing AI workloads is how data will be stored. Object storage is the most common approach. It has the advantage of supporting extensive sets of metadata. Storing metadata along with the actual data makes it possible to describe the data in multiple ways, which, in turn, enables faster and easier searching, an important consideration with AI analytics. In addition, object storage is fast, flexible, space-efficient and highly scalable, making it an ideal match for AI workloads.
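The searchability benefit of rich metadata is easy to demonstrate with a toy in-memory object store (the object keys and tags below are invented for illustration):

```python
# Toy object store: each object carries data plus descriptive metadata,
# so data sets can be found by attributes instead of scanning content.
store = {
    "img/0001.jpg": {"data": b"...", "meta": {"sensor": "cam-3", "label": "cat"}},
    "img/0002.jpg": {"data": b"...", "meta": {"sensor": "cam-7", "label": "dog"}},
    "img/0003.jpg": {"data": b"...", "meta": {"sensor": "cam-3", "label": "dog"}},
}

def find(store: dict, **criteria) -> list:
    """Return object keys whose metadata matches every criterion."""
    return [key for key, obj in store.items()
            if all(obj["meta"].get(k) == v for k, v in criteria.items())]

print(find(store, label="dog"))                  # ['img/0002.jpg', 'img/0003.jpg']
print(find(store, sensor="cam-3", label="dog"))  # ['img/0003.jpg']
```

Real object stores index this metadata at scale, which is what lets an AI pipeline pull, say, every "dog" image from a particular sensor without touching the image bytes.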

7. Continuous optimization

Any AI storage system must be continuously optimized to maximize performance and minimize latency. Today's intelligent storage can go a long way toward keeping systems optimized. An intelligent storage system, which itself uses AI technologies, can uncover patterns in the metric data collected from the storage systems, as well as from other systems in the environment. From these patterns, the intelligent system can automatically resolve issues and optimize storage performance without human intervention.

Another trend that can benefit AI workloads is software-defined storage (SDS), a systems architecture that decouples storage software from the hardware. By abstracting the physical storage resources, SDS provides greater flexibility, simplifies management and automates operations, while optimizing storage performance, all of which benefit AI workloads.
8. Cross-platform integration

No system or application exists in a vacuum. Data almost always originates from multiple sources -- sometimes a significant number of them -- and is often stored in numerous locations. Hybrid and multi-cloud strategies only add to the mix, as do technologies such as edge computing, IoT and hyper-converged infrastructures.

No matter how your data moves or where it's stored, you must ensure all systems seamlessly integrate with one another to minimize deployment and maintenance efforts, as well as potential bottlenecks. Wherever possible, use standards-based technologies to help with this process.
Storage options meet the needs of machine learning and AI
Amy Novotny, Senior Products Editor

Organizations are using machine learning and AI to get insights they can use to improve how they do business. But these workloads differ from regular ones, as they require large amounts of data to build and train statistical models.

All this data must also be processed and stored: Active data must be moved to a high-performance platform for processing, and other data often gets transferred to long-term storage. To meet these requirements, some storage vendors offer converged infrastructure products or products that organizations can build into their machine learning and AI projects. These tools package storage, networking and compute, or scale-out file storage with GPUs. Other vendors work with partners to deliver their products as prepackaged appliances.

To help identify the storage product that will best meet your organization's AI workload needs, it's important to understand which features and functionality are most important. Evaluate features based on how your organization will be performing machine learning and AI tasks. Key considerations include scalability, high-performance capabilities and overall cost.

This buyer's guide examines products from nine industry-leading storage vendors and newer startups. A comprehensive product roundup compares numerous features and functionality, including capacity, performance and the file sizes supported.