
Data Sheet

Auto-Tiered Storage
Conquering I/O Bottlenecks with Hybrid Storage

Overview
For more than a decade, storage system performance has remained rather stagnant while drive capacities and application performance demands have steadily increased. The result of this trend is an expensive problem: storage users are forced into buying expensive hard disk drives (HDDs) to get a moderate performance boost (by reducing I/O latency) and/or forced into over-buying capacity in order to meet performance requirements.

With the advent and decreasing price of flash, storage vendors are integrating it into their products to solve this problem. ZFS technology leads the industry in its ability to automatically and intelligently use flash in a storage system that offers the appropriate capacity and performance capabilities at a total cost that is dramatically lower than most legacy storage systems.

The ZFS Hybrid Storage Pool

ZFS is a robust, scalable file system with features not available in other file systems today. One of these revolutionary features is the ZFS Hybrid Storage Pool (HSP), which allows you to combine DRAM, SSDs, and spinning HDDs into a single accelerated storage medium. Below we explore each of these components.

Adaptive Replacement Cache (ARC): The ARC lives in DRAM. It is the first destination for all data written to a ZFS pool, and it is the fastest (lowest-latency) source for data read from a ZFS pool. When data is requested from ZFS, it first looks to the ARC; if the data is there, it can be retrieved extremely fast (typically in nanoseconds) and provided back to the application. The contents of the ARC are balanced between the most recently used (MRU) and most frequently used (MFU) data.
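To make the MRU/MFU balancing concrete, here is a deliberately simplified Python sketch of a cache that splits its capacity between a recency list and a frequency list. It is a toy illustration of the idea only: the real ARC in ZFS adapts the split dynamically and is considerably more sophisticated.

```python
from collections import OrderedDict

class ToyAdaptiveCache:
    """Illustrative only: splits capacity between a recency (MRU) list
    and a frequency (MFU) list, loosely mirroring how the ZFS ARC
    balances recently used vs. frequently used data."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.recent = OrderedDict()    # seen once recently (MRU side)
        self.frequent = OrderedDict()  # seen more than once (MFU side)
        self.target_recent = capacity // 2  # fixed split; the real ARC adapts this

    def get(self, key):
        if key in self.recent:
            # A second hit promotes the entry to the frequency side.
            value = self.recent.pop(key)
            self.frequent[key] = value
            return value
        if key in self.frequent:
            self.frequent.move_to_end(key)  # refresh its position
            return self.frequent[key]
        return None  # cache miss: caller fetches from L2ARC or disk

    def put(self, key, value):
        if key in self.recent or key in self.frequent:
            self.get(key)                # treat as another access
            self.frequent[key] = value
            return
        self.recent[key] = value         # first sighting lands on the MRU side
        self._evict()

    def _evict(self):
        # Shrink whichever side exceeds its share, oldest entries first.
        while len(self.recent) + len(self.frequent) > self.capacity:
            if len(self.recent) > self.target_recent:
                self.recent.popitem(last=False)
            else:
                self.frequent.popitem(last=False)
```

A block read once stays on the recency side and ages out quickly; a block read repeatedly migrates to the frequency side and survives bursts of one-time scans, which is the behavior that matters for a read cache.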
Level-Two ARC (L2ARC): The L2ARC lives on SSDs. In concept, it is an extension of the ARC. Without an L2ARC, data that could not fit in the ARC would have to be retrieved from HDDs when requested. That is when drive speed makes a difference, but the performance difference between "fast" (e.g., 15k-RPM) and "slow" (e.g., 7,200-RPM) drives amounts to latencies of a few milliseconds versus several milliseconds; both are dramatically slower than ARC accesses measured in nanoseconds.

Buying hundreds of gigabytes of flash is cheaper than the same capacity of DRAM (though still more expensive today than HDDs), and flash's I/O latencies are typically measured in microseconds: slower than DRAM but still far faster than even "high-performance" HDDs. The L2ARC, on SSDs, fits nicely between the two, both in price and in performance.

The L2ARC is populated with data first placed in the ARC, as it becomes apparent that the data might get squeezed out of the ARC. Not every piece of data that existed in the ARC will make it to the L2ARC (data that does not would be retrieved from HDDs instead, if requested); the algorithms that manage L2ARC population are automatic and intelligent.
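Some back-of-the-envelope math shows why a tier between DRAM and disk pays off. The latency figures and hit rates below are hypothetical round numbers chosen only to match the orders of magnitude above (DRAM in nanoseconds, flash in microseconds, HDDs in milliseconds):

```python
# Back-of-the-envelope average read latency for a hybrid pool.
# All figures are hypothetical, chosen only to match the orders of
# magnitude in the text: DRAM ~ns, flash ~us, HDD ~ms.
ARC_LATENCY_S   = 100e-9   # ~100 ns DRAM access
L2ARC_LATENCY_S = 200e-6   # ~200 us SSD read
HDD_LATENCY_S   = 8e-3     # ~8 ms average seek + rotation

arc_hit   = 0.80           # hypothetical: 80% of reads served from ARC
l2arc_hit = 0.15           # 15% served from L2ARC
hdd_read  = 1.0 - arc_hit - l2arc_hit  # remaining 5% go to spinning disk

avg = (arc_hit * ARC_LATENCY_S
       + l2arc_hit * L2ARC_LATENCY_S
       + hdd_read * HDD_LATENCY_S)

print(f"average read latency: {avg * 1e6:.0f} us")  # ~430 us
# Without any caching, every read would pay the full ~8,000 us.
```

Even with only a modest L2ARC hit rate, the average latency is dominated by the small fraction of reads that still reach the HDDs, which is why absorbing ARC overflow in flash rather than on disk makes such a difference.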

ZFS Intent Log (ZIL): The ZIL is used to handle synchronous writes: write operations that are required by protocol (e.g., NFS, SMB/CIFS) to be stored in a non-volatile location on the storage device before they can be acknowledged, which ensures data stability. Databases usually require transactions to be on stable storage before continuing, so reducing the latency of synchronous writes has a direct impact on performance.

ZFS can do this by placing the ZIL on an SSD. All writes (whether synchronous or asynchronous) are written into the ARC in DRAM, and synchronous writes are also written to the ZIL before being acknowledged.

Under normal conditions, ZFS regularly bundles up all of the recent writes in the ARC and flushes them to the spinning drives, at which point the data in the ZIL is no longer relevant (because it now exists on its long-term, non-volatile destination) and can be replaced by new writes. The ZIL is only read when synchronous writes in the ARC cannot be written to spinning disk, such as after a power failure or controller failover; at that point ZFS reads the ZIL and places the data onto the spinning drives as intended. One might compare this concept to the non-volatile RAM (NVRAM) used by storage vendors such as NetApp, but where NVRAM uses batteries that can wear out and have other issues, write-optimized SLC (single-level cell) flash devices need no batteries. And while NVRAM scalability is limited to available slots, adding SLOGs is as easy as adding HDDs. Like the L2ARC, the ZIL/SLOG is managed automatically and intelligently by ZFS: writes that need it are accelerated, without any additional effort by the administrator.
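The write path just described can be sketched in a few lines of Python. Everything here is a stand-in, not a real ZFS interface: ToyLogDevice plays the role of a SLOG SSD and ToyDiskPool the spinning-disk pool.

```python
class ToyLogDevice:
    """Stand-in for a SLOG SSD (not a real ZFS interface)."""
    def __init__(self):
        self.entries = []
    def append(self, key, data):
        self.entries.append((key, data))
    def flush(self):
        pass  # a real device would force the write to stable media here
    def truncate(self):
        self.entries.clear()
    def replay(self):
        return list(self.entries)

class ToyDiskPool:
    """Stand-in for the spinning-disk pool."""
    def __init__(self):
        self.blocks = {}
    def store(self, key, data):
        self.blocks[key] = data

class ToyHybridWritePath:
    """Illustrative sketch of the sync/async write flow described above."""
    def __init__(self, log, disks):
        self.dirty = {}      # in-memory buffer of recent writes (the ARC)
        self.log = log       # fast, non-volatile log device (ZIL on SSD)
        self.disks = disks   # long-term home of the data (HDDs)

    def write(self, key, data, synchronous=False):
        self.dirty[key] = data            # every write lands in DRAM first
        if synchronous:
            # Sync writes must be on stable storage before acknowledging,
            # so append to the low-latency log device and flush it.
            self.log.append(key, data)
            self.log.flush()
        return "acknowledged"

    def txg_sync(self):
        # Periodically bundle up recent writes and push them to disk,
        # after which the log entries are no longer needed.
        for key, data in self.dirty.items():
            self.disks.store(key, data)
        self.dirty.clear()
        self.log.truncate()

    def recover(self):
        # Only after a crash/failover is the log read back: replay any
        # acknowledged sync writes that never reached the disks.
        for key, data in self.log.replay():
            self.disks.store(key, data)
```

Note the asymmetry: a synchronous write is acknowledged only after it reaches the log device, while an asynchronous write is acknowledged from DRAM and becomes durable at the next flush, so only the synchronous path's latency depends on the log device's speed.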
Hard Disk Drives (HDD): With the ARC, L2ARC, and ZIL/SLOG providing the bulk of the performance of a ZFS Hybrid Storage Pool, spinning drives are relegated to the job they do well: providing lower-performance, higher-density, low-cost storage capacity. Until the day that flash competes with HDDs on a dollar-per-gigabyte basis, the right balance of DRAM and flash for performance, and HDDs for capacity, results in a total cost of ownership (TCO) that is lower, both initially and over the long term, than meeting both requirements with all flash or all HDDs.
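A rough cost model shows the shape of this argument. The prices and requirements below are entirely hypothetical placeholders, not vendor figures; the point is only that an all-HDD design over-buys capacity to reach an IOPS target, an all-flash design over-pays for capacity, and a hybrid buys each resource from the cheapest tier that provides it.

```python
# Hypothetical, illustrative prices and requirements only.
CAPACITY_TB   = 100      # required usable capacity
REQUIRED_IOPS = 50_000   # required I/O performance

HDD_PER_TB,   HDD_IOPS   = 50,   150      # $/TB and IOPS per HDD-TB (made up)
FLASH_PER_TB, FLASH_IOPS = 1000, 20_000   # $/TB and IOPS per flash-TB (made up)

# All-HDD: must over-buy spindles until the IOPS target is met.
hdd_tb_needed = max(CAPACITY_TB, REQUIRED_IOPS / HDD_IOPS)
all_hdd_cost = hdd_tb_needed * HDD_PER_TB

# All-flash: capacity alone drives the cost.
all_flash_cost = CAPACITY_TB * FLASH_PER_TB

# Hybrid: HDDs sized for capacity, a small flash tier sized for IOPS
# (assumes the active data fits in the flash tier; see WSS below).
flash_tb_needed = REQUIRED_IOPS / FLASH_IOPS
hybrid_cost = CAPACITY_TB * HDD_PER_TB + flash_tb_needed * FLASH_PER_TB

print(f"all-HDD:   ${all_hdd_cost:,.0f}")    # ~$16,667 (333 TB of spindles!)
print(f"all-flash: ${all_flash_cost:,.0f}")  # $100,000
print(f"hybrid:    ${hybrid_cost:,.0f}")     # $7,500
```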
A New Storage Parameter: Working Set Size

For legacy storage systems, sizing means determining the necessary capacity, IOPS, and throughput, then performing some simple math to determine the number of spindles that could provide those numbers. As the industry moves towards more sophisticated caching methodologies in storage systems, a new parameter for expressing storage needs has emerged: the Working Set Size (WSS), which can be described as the subset of total data that is actively worked upon (e.g., 500GB of this quarter's sales data out of a total database of 20TB). Knowing the WSS makes it possible to size the ARC, L2ARC, and even HDDs more accurately, but few applications today have an awareness of their WSS.
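Using the example from the text (a 500GB working set inside a 20TB database), a sizing pass might look like the sketch below. The split between DRAM and flash is an arbitrary illustration, not a ZFS sizing rule:

```python
# Worked example from the text: 500 GB working set, 20 TB total data.
TOTAL_DATA_GB = 20_000
WSS_GB        = 500

# Hypothetical policy: keep the hottest ~10% of the working set in DRAM
# (ARC) and the rest of the working set on flash (L2ARC). This split is
# an arbitrary illustration, not a ZFS rule.
arc_gb   = WSS_GB * 0.10          # 50 GB of DRAM
l2arc_gb = WSS_GB - arc_gb        # 450 GB of SSD
hdd_gb   = TOTAL_DATA_GB          # HDDs still hold every byte

print(f"ARC (DRAM):   {arc_gb:.0f} GB")
print(f"L2ARC (SSD):  {l2arc_gb:.0f} GB")
print(f"HDD capacity: {hdd_gb:,} GB")
# Caching ~2.5% of the total data (500 / 20,000) covers the active I/O,
# which is why WSS, not raw capacity, should drive cache sizing.
```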
Conclusion

ZFS hybrid storage pools intelligently combine DRAM, flash, and hard disk drives to achieve the right balance of cost and performance for any given working set, while reducing the need for administrators to constantly monitor storage for I/O bottlenecks. By reducing both read and write latency with the use of flash in a ZFS hybrid storage pool, we end up with a system that performs far better than legacy storage systems while having a much lower total cost of ownership (TCO).

Nexenta Systems is the leading supplier of enterprise-class OpenStorage solutions. Its flagship software-only platform, NexentaStor, delivers high-performance, ultra-scalable, cloud- and virtualization-optimized storage solutions.

Nexenta Systems, Inc.
455 El Camino Real
Santa Clara, CA 95050
www.nexenta.com

Nexenta Systems EMEA
Camerastraat 8
1322 BC Almere
The Netherlands

www.facebook.com/nexenta
twitter.com/nexenta

Copyright © 2012 Nexenta® Systems, ALL RIGHTS RESERVED
Rev. 080712
