Lin Qian, Zhu Mei, Hengmao Pang, Jun Yu, Guangxin Zhu, Haiyang Chen, Min Bu
State Grid Electric Power Research Institute
(SGEPRI)
Nanjing, China
qianlin@sgepri.sgcc.com.cn, meizhu2016@aliyun.com, panghengmao@sgepri.sgcc.com.cn, yujun@sgepri.sgcc.com.cn,
zhuguangxin@sgepri.sgcc.com.cn, chenhaiyang@sgepri.sgcc.com.cn, bumin@sgepri.sgcc.com.cn
Abstract—This paper first briefly reviews the status of write-optimized distributed file systems and the issues surrounding write optimization. It then introduces full-stack cache acceleration and the related technology of storage tiering, followed by a detailed description of SSD Cache acceleration. The performance advantages are demonstrated through comparative testing: with the Flash cache, the processing time on four nodes is 64% lower than that of the SATA hard disk. Finally, challenges for future research work are discussed.

Keywords—write feature; cache promotion; storage tiering; flash cache; memory acceleration

I. INTRODUCTION

In recent years, with the continuous development of emerging technologies such as big data and cloud computing, the amount of data in the world has increased dramatically, posing severe challenges to storage capacity and performance. The distributed file system has become a research hotspot as the underlying mass-data storage system supporting the big data era [1].

Reading and writing are the two basic operations provided by a distributed file system. Read performance [2] is already high, thanks to the read speed of physical media such as memory and SSD, and read operations do not need to consider data-consistency guarantees; as a research topic, reads have therefore received less attention than write optimization.

When writing data, a distributed file system must resolve the contradiction between the consistency of written data imposed by the distributed architecture [3] and overall system performance; this has become a key concern in the distributed storage field in recent years. FAST, the most important academic conference in the storage field, covered write-optimized file systems as dedicated session topics in 2015 and 2016. Among these techniques, memory-accelerated write optimization [4] can significantly improve system performance, and, under the premise of ensuring data durability, the use of write buffers to relieve the file-system stalls caused by frequent concurrent IOs is a hot topic.

This article focuses on write-optimized distributed file systems. Specifically, memory is used as the acceleration medium in front of the physical disks to build a multi-level cache, so that hot and cold data are placed adaptively within the storage system. Section I briefly introduces the current status of write-optimized distributed file systems and the issues related to write optimization. Section II describes related techniques for full-stack cache acceleration and storage tiering. Section III describes full-stack cache acceleration, with a focus on SSD Cache acceleration technology. Section IV proves the performance advantages of cache acceleration through comparison tests of the Flash cache against HDD and SATA disks; it then summarizes the paper and gives a preliminary discussion of future research directions.

II. FLASH ACCELERATION RELATED TECHNOLOGIES

A. Full stack cache acceleration

With the rapid development of hardware technology, attention has turned to exploiting the locality of programs. Memory, SSD, NVM, and 3D XPoint are used as the first-level storage medium, and the storage location of data is adjusted dynamically to balance the conflict between data capacity and system performance. At the same time, we note that the performance bottleneck of flash storage media is the write operation: not only is a write more time-consuming than a read, but reads are also delayed by the lock mechanism while a write is in progress.

In this research direction, as early as the second FAST conference in 2003, N. Megiddo et al. [5] proposed an adaptive cache algorithm to accelerate storage IO performance. In the following years, cache-based optimization of storage IO performance became a hot topic at FAST, MSST, and other important international storage conferences, continuing through 2016 with new applications such as mobile storage.

S. Huang et al. [6] proposed using flash media and adaptive algorithms to optimize the underlying storage performance. R. Koller et al. [7] provided high-performance storage through client-side flash acceleration, extending the client's reach toward the bottom layer.
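To make the cache-promotion idea above concrete, the following is a minimal sketch of a write-back block cache with LRU replacement (a generic illustration of hot-data promotion and write buffering, not the ARC algorithm of [5] or any specific system's implementation; all names here are hypothetical):

```python
from collections import OrderedDict

class LRUBlockCache:
    """Minimal write-back LRU cache sketch: hot blocks live in a fast
    tier (e.g. memory or SSD); dirty blocks evicted from the fast tier
    are flushed to the slow tier (e.g. HDD)."""

    def __init__(self, capacity, backing):
        self.capacity = capacity      # max blocks held in the fast tier
        self.backing = backing        # dict-like slow tier: block_id -> data
        self.cache = OrderedDict()    # block_id -> (data, dirty flag)

    def read(self, block_id):
        if block_id in self.cache:
            self.cache.move_to_end(block_id)        # promote on hit
            return self.cache[block_id][0]
        data = self.backing.get(block_id)           # miss: fetch from slow tier
        self._insert(block_id, data, dirty=False)
        return data

    def write(self, block_id, data):
        self._insert(block_id, data, dirty=True)    # absorb write in fast tier

    def _insert(self, block_id, data, dirty):
        if block_id in self.cache:
            self.cache.move_to_end(block_id)
        self.cache[block_id] = (data, dirty)
        while len(self.cache) > self.capacity:
            old_id, (old_data, old_dirty) = self.cache.popitem(last=False)
            if old_dirty:
                self.backing[old_id] = old_data     # write-back on eviction
```

The key point for write optimization is that writes complete against the fast tier and reach the slow tier only lazily, on eviction of a dirty block.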
RAND: random replacement strategy. The essence of this strategy is that, when a block in the SSD must be replaced, the victim block is chosen by a hardware- or software-generated random number.

III. SSD CACHE ACCELERATION TECHNOLOGY

Device mapper is a mapping framework from logical devices to physical devices provided in the Linux 2.6 kernel. Under this mechanism, users can easily formulate management strategies for storage resources according to their own needs.

Device mapper is registered as a block device driver in the kernel. It contains three important object concepts: mapped device, mapping table, and target device:
• Mapped device: a logical abstraction, which can be understood as the logical device exposed by the kernel.
• Mapping table: describes the mapping from the mapped device to the target device.
• Target device: represents a physical space segment mapped by a mapped device; for the logical device represented by a mapped device, it is the physical device onto which that logical device is mapped.

The three objects in the Device mapper, together with the target driver plug-ins, form an iterable device tree, as shown in the figure below. The essential function of Device mapper is to forward IO requests from the mapped device (the logical device) to the corresponding target device, according to the mapping relationship and the IO processing rules described by the target driver.

DM-Cache utilizes the Device Mapper principle. Ordinary HDD disks and SSD disks are regarded as target devices in the DM and aggregated into one virtual device (the cache device), and blocks of the ordinary HDD disks are mapped onto blocks of the SSD disk. According to this mapping, DM-Cache converts operations on the ordinary HDD disks into operations on the SSD device, thereby realizing the function of storing hot-spot data on the SSD. The mapping of HDD blocks to SSD blocks may be a simple linear mapping or a hash mapping.

IV. PERFORMANCE TESTING

A. Comparison test with HDD hard disk

Here we chose the Flash cache as the cache software for testing. To compare the performance of the Flash cache and an HDD hard disk, the test issued 4 concurrent request streams through nginx, which forwarded the pressure to 6 clients performing mixed concurrent reads and writes; we recorded the average response time and transactions processed per load generator, the total number of transactions, and the nmon data of each node and client, using the Filebench tool for result analysis.

Flash cache, 5 nodes, 400 concurrent, 50% read, 50% write, 8k files:
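Returning briefly to the block mapping described in Section III: the hash variant, where each HDD block maps to exactly one SSD cache slot, can be sketched as follows (a direct-mapped illustration with made-up names and geometry, not DM-Cache's actual on-disk format):

```python
def ssd_slot(hdd_block, ssd_blocks):
    """Hash mapping: an HDD block number hashes to one SSD cache slot.
    Many HDD blocks share a slot, so each slot must remember which
    HDD block currently occupies it (the tag)."""
    return hdd_block % ssd_blocks

class DirectMappedCache:
    def __init__(self, ssd_blocks):
        self.tags = [None] * ssd_blocks   # tags[slot] = HDD block cached there

    def lookup(self, hdd_block):
        slot = ssd_slot(hdd_block, len(self.tags))
        hit = self.tags[slot] == hdd_block
        if not hit:
            self.tags[slot] = hdd_block   # on a miss, replace the old occupant
        return slot, hit
```

A linear mapping would instead reserve a fixed SSD region per HDD range; the hash mapping trades that rigidity for possible collisions between hot blocks that share a slot.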
From the above data, we can clearly see that with the Flash cache, read/write performance is greatly improved: roughly 6 times that of the ordinary HDD hard disk.

TABLE I. RANDOM READ AND WRITE TEST RESULTS OF FLASH CACHE AND HDD

  Disk Type        5 Node      Notes
  HDD hard disk    61 ops/s    5 nodes, 400 concurrency, 50% read, 50% write, 8k file
  Flash cache      372 ops/s   5 nodes, 400 concurrency, 50% read, 50% write, 8k file

B. Comparison test with SATA hard disk

To compare the performance of the Flash cache and a SATA hard disk, the scenario was:
• 800 concurrent random reads and writes
• File size: 128k (60%), 1M (40%)
• Number of initialization files per client: 5000; folders: 50
• Read & write strategy: 60% read, 40% write
• Network type: 10Gb
• Number of nodes: 4, 5

TABLE II. RANDOM READ AND WRITE TEST RESULTS OF FLASH CACHE AND SATA

  Disk Type              4 Node   5 Node   Notes
  SATA 7200 1T           635ms    432ms    SATA disk connected to a RAID card with 512MB memory; faster than a bare disk
  PCI-e as Flash Cache   227ms    171ms    PCI-e cache in front of SATA, size 100GB

From the above data, we can clearly see that with the Flash cache, the processing time on four nodes is 64% lower than that of the SATA hard disk.

V. CONCLUSIONS

With the rapid popularization of Internet applications, designing and developing a write-optimized distributed file system that can handle a large number of small files has become a hot topic in storage research. This article focuses on the key technology of flash-accelerated write optimization to improve database application performance, and demonstrates the performance advantages of this technology through comparative testing. The author's analysis is that future write-optimized distributed file systems still have room for optimization in terms of network latency, local file-system load awareness, and further refinement of the metadata structure.

ACKNOWLEDGMENT

This work was financially supported by the State Grid Corporation of Science and Technology Project (WBS number: 521104170019).

REFERENCES

[1] Zhou Jiang, Wang Weiping, Meng Dan, Ma Can, Gu Xiaoyan, Jiang Jie, "Key Technology in Distributed File System Towards Big Data Analysis," Journal of Computer Research & Development, vol. 51, no. 2, pp. 382-394, Feb. 2014.
[2] Y. Zhu, "Improved read performance in a cost-effective, fault-tolerant parallel virtual file system (CEFT-PVFS)," in Proceedings of the 3rd IEEE/ACM International Symposium on Cluster Computing and the Grid, pp. 730-735, 2003.
[3] J. Chen, Z. Tan, F. Wu, C. Xie, "SJournal: a new design of journaling for file systems to provide crash consistency," in Proceedings of the 9th IEEE International Conference on Networking, Architecture, and Storage, pp. 53-62, 2014.
[4] K. Chen, R. B. Bunt, D. L. Eager, "Write caching in distributed file systems," in Proceedings of the International Conference on Distributed Computing Systems, pp. 457-466, 1995.
[5] N. Megiddo and D. S. Modha, "ARC: a self-tuning, low overhead replacement cache," in Proceedings of the 2nd USENIX Conference on File and Storage Technologies (FAST 03), pp. 115-130, Berkeley, CA, USA, 2003.
[6] S. Huang, Q. Wei, J. Chen, C. Chen, and D. Feng, "Improving flash-based disk cache with lazy adaptive replacement," in Proceedings of the 29th IEEE Symposium on Mass Storage Systems and Technologies (MSST 13), pp. 1-10, 2013.
[7] R. Koller, L. Marmol, R. Rangaswami, S. Sundararaman, N. Talagala, and M. Zhao, "Write policies for host-side flash caches," in Proceedings of the 11th USENIX Conference on File and Storage Technologies (FAST 13), 2013.
[8] C. Li, P. Shilane, F. Douglis, H. Shim, S. Smaldone, and G. Wallace, "Nitro: a capacity-optimized SSD cache for primary storage," in Proceedings of the 2014 USENIX Annual Technical Conference (ATC 14), pp. 501-512, 2014.
[9] D. Arteaga, J. Cabrera, J. Xu, S. Sundararaman, and M. Zhao, "CloudCache: on-demand flash cache management for cloud computing," in Proceedings of the 14th USENIX Conference on File and Storage Technologies (FAST 16), pp. 352-369, 2016.
[10] N. S. Islam, M. Wasi-Ur-Rahman, X. Lu, D. K. Panda, "High performance design for HDFS with byte-addressability of NVM and RDMA," in Proceedings of the International Conference on Supercomputing, pp. 1-14, 2016.
[11] J. Sim, "Dynamically configuring regions of a main memory in a write-back mode or a write-through mode," Advanced Micro Devices Inc., California, 2014.
[12] Tanwir, G. Hendrantoro, A. Affandi, "Early result from adaptive combination of LRU, LFU and FIFO to improve cache server performance in telecommunication network," in International Seminar on Intelligent Technology & Its Applications, pp. 429-432, 2015.