Paper 24 - Design and Implementation For Multi-Level Cell Flash Memory Storage Systems

(IJACSA) International Journal of Advanced Computer Science and Applications, Vol. 2, No. 11, 2011
Design and Implementation for Multi-Level Cell Flash Memory Storage Systems
Amarnath Gaini, K Vijayalaxmi
Assistant Professor Department of Electronics VITS (N9), Andhra Pradesh
Sathish Mothe
Assistant Professor Department of Electronics JITS (27), Andhra Pradesh

Abstract
The flash memory management functions of write coalescing, space management, logical-to-physical mapping, wear leveling, and garbage collection require significant ongoing computation and data movement. MLC flash memory also introduces new challenges: (1) pages in a block must be written sequentially; (2) information indicating that a page is obsolete cannot be recorded in its spare area. This paper designs an MLC Flash Translation Layer (MFTL) for flash-memory storage systems that takes the new constraints of MLC flash memory and the access behaviors of the file system into consideration. A series of trace-driven simulations is conducted to evaluate the performance of the proposed scheme. Our experiment results show that the proposed MFTL outperforms other related works in terms of the number of extra page writes, the number of total block erasures, and the memory requirement for the management.
Keywords-Flash memory; MFTL; MLC; BAST; FAST.

I. INTRODUCTION
Flash memory chips are constructed from different types of cells (NOR and NAND) and with different numbers of bits stored per cell (single-level cell, or SLC, and multi-level cell, or MLC). These variations result in very different performance, cost, and reliability characteristics. NOR flash memory chips have much lower density, much lower bandwidth, much longer write and erase latencies, and much higher cost than NAND flash memory chips. For these reasons, NOR flash has minimal penetration in enterprise deployments; it is primarily used in consumer devices. Leading enterprise solid state drives (SSDs) are all designed with NAND flash. Another distinction in flash memory is SLC versus MLC. MLC increases density by storing more than a single bit per memory cell. With their increased density, the cost of MLC flash chips is roughly half that of SLC, but the MLC write bandwidth is about two times lower than that of SLC, and MLC supports from 3 to 30 times fewer erase cycles than SLC. A new generation of SSDs incorporates special firmware that closes the performance and durability gap between SLC and MLC.
Flash-memory storage systems are normally organized in layers, as shown in Figure 1. The Memory Technology Device (MTD) layer provides the lower-level functionalities of flash memory, such as read, write, and erase. Based on these services, higher-level management algorithms, such as wear leveling, garbage collection, and logical/physical address translation, are implemented in the Flash Translation Layer (FTL). The objective of the FTL is to provide transparent services for file systems such that flash memory can be accessed as a block-oriented device. As an alternative approach, a file system can be made flash-memory-aware by taking the characteristics of flash memory into consideration. JFFS and YAFFS take this approach. Although a flash-memory-aware file system is more efficient, the FTL scheme has the advantage of making flash memory directly available to existing file systems. Since most applications, especially those for portable devices, access their files under FAT file systems, this paper focuses on the FTL.
Figure 1. Architecture of Flash-Memory Storage Systems
Depending on the granularity with which the mapping information is managed, FTL schemes can be classified into page-level, block-level, and hybrid mapping schemes. In a page-level mapping scheme, each entry in the mapping table holds a physical page number (PPN) and is indexed by the logical page number (LPN). When a request arrives, the mapping scheme translates the associated logical address into an LPN and uses this LPN to locate the corresponding PPN in the mapping table. With the mapping table, a logical address can be directly mapped to a page at any location in flash memory. This makes storage management more flexible and results in better performance for random accesses. Since the mapping table is large and usually resides in SRAM, a page-level mapping scheme is considered impractical because of its cost. To reduce the SRAM requirement, block-level mapping schemes were proposed. In a block-level mapping scheme, a logical address has two components: a logical block address (LBA) and a logical page offset. The LBA is translated into a physical block address (PBA), and the logical page offset is used to locate the target page in the physical block. The size of the mapping table is thus significantly reduced and is proportional to the total number of blocks in flash memory. However, update requests usually incur a block-level copy overhead. Block-level mapping is shown in Fig. 2.
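As a rough illustration of the two translation paths described above, the following Python sketch contrasts a page-level lookup with a block-level lookup. The 64-pages-per-block geometry, the dictionary-based tables, and all function names are assumptions made for this example; they are not taken from any FTL specification.

```python
PAGES_PER_BLOCK = 64  # assumed geometry, for illustration only


def page_level_lookup(page_table, lpn):
    """Page-level mapping: one table entry per logical page (LPN -> PPN)."""
    return page_table[lpn]


def block_level_lookup(block_table, lpn):
    """Block-level mapping: translate the LBA and keep the page offset."""
    lba, offset = divmod(lpn, PAGES_PER_BLOCK)
    pba = block_table[lba]
    return pba * PAGES_PER_BLOCK + offset


def table_entries(num_blocks):
    """Return (page-level entries, block-level entries) for a given device size."""
    return num_blocks * PAGES_PER_BLOCK, num_blocks
```

With 64 pages per block, logical page 130 falls in logical block 2 at offset 2; if that block is mapped to physical block 5, the physical page is 5 × 64 + 2 = 322. The block-level table needs one entry per block instead of one per page, which is the SRAM saving discussed above.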
Figure 2: Block level mapping

A hybrid mapping scheme is a block-level mapping scheme with a limited number of records that adopt page-level mapping. In a hybrid scheme, a logical address consists of an LBA and a logical page offset, as in a block-level scheme. However, unlike the block-level scheme, it allows updated data to be written to the next available page in another physical block, i.e., a log block. A page mapping table is required for recording the actual locations of the updated data. Hybrid mapping schemes consume an extra amount of SRAM space for the records with page-level mapping but reduce the copying overhead of a block-level mapping scheme. Block Associative Sector Translation (BAST) is a hybrid mapping scheme proposed by Kim et al. [4]. BAST maintains a limited number of log blocks to handle update requests. Each data block may be associated with at most one log block. When an update request is processed, the associated data is appended sequentially to the free pages of the corresponding log block, as long as such free pages are available. A merge operation is required when no free log block is available. Fully Associative Sector Translation (FAST) is another hybrid mapping scheme, proposed by Lee et al. [6]. FAST shares the log blocks among all of the data blocks and allows an update request to be written to the current log block. As a result, FAST improves the utilization of log blocks and reduces the number of merge operations. Since a log block may be associated with several data blocks at the same time, merging a log block may require the merging of several data blocks.

Both BAST and FAST are designed for single-level cell (SLC) flash memory. Currently, multi-level cell (MLC) flash memory occupies the largest part of the flash memory market, owing to an increasing demand for larger-capacity and smaller-size storage media. In MLC flash memory, each cell can store two or more bits of data. Although MLC flash memory offers a lower-cost and higher-density solution, it imposes additional stringent constraints [8]. First, the number of partial program cycles in the same page is limited to one. This constraint implies that it is no longer permitted to mark a page as dead by simply clearing some specific bit in the spare area, a technique widely adopted in previous research. Another constraint is related to the order in which write operations are performed. In MLC, the free pages of a block must be written consecutively from the first free page to the target page in the block; a random write order is prohibited. Such a constraint makes most existing block-level mapping schemes and hybrid mapping schemes inapplicable to modern flash memory chips. Further, direct modification of these mapping schemes is not appropriate, as it would inevitably lead to a significant increase in overhead.

Motivated by the above concerns, we propose a hybrid mapping scheme for the management of modern MLC flash memory. Since the FAT file system is most commonly used for accessing flash memory, the proposed scheme also focuses on the file access behavior under the FAT file system. In the proposed scheme, the mapping granularity is adjusted adaptively according to the access pattern. Our scheme is distinguished from existing solutions in that it consumes less SRAM while providing better performance. In addition, existing solutions do not take the stringent constraints of MLC flash memory into account, while our scheme can be applied to SLC flash memory as well.

The rest of the paper is organized as follows. Section II presents the proposed scheme in detail. Section III evaluates the performance of the proposed scheme and compares it with existing works. Section IV draws the conclusion.

II. MFTL: MLC FLASH TRANSLATION LAYER
1) Overview
When a write request arrives, MFTL calculates the corresponding virtual block address (VBA) from the logical address of the request, allocates a free (physical) flash-memory block for the virtual block, and writes data to the allocated block. The PBA of the allocated block is recorded in Block Table[VBA]. Note that the VBA is similar to the LBA in block-level mapping schemes; we use a different notation to avoid confusion. Since out-place update is a widely adopted solution for handling data-update requests, an extra replacement block (a physical flash-memory block) is assigned to store the newly updated data. To keep track of the latest version of data, an UpdateRec is created whenever a replacement block is allocated for a virtual block. Note that we do not maintain an UpdateRec for every virtual block all the time, as we do for Block Table. In fact, as shown later in the experiments, maintaining five UpdateRecs is enough to deal with sustained read/write requests, and MFTL always keeps the latest UpdateRecs in SRAM. Since data updates might be random and the amount of data in each update might be small, UpdateRec alone is insufficient if fast data access is required.
2) Data Structure
Each UpdateRec contains the following fields:
a) VBA records the corresponding virtual block address.
b) Primary records the physical block address of the associated primary block.
c) Replace records the physical block address of the associated replacement block.
d) LWP is the index of the last written page in the associated replacement block.
e) Count maintains the number of backward updates that have occurred in this record. The initial value of Count is 0.
f) Mode is a one-bit flag indicating whether the mapping is in block mapping mode (0) or page mapping mode (1).
g) Priority is the priority for UpdateRec replacement. When a write/update request arrives, the Priority of the corresponding UpdateRec (if any) is set to 0xFF, while the Priority of the other UpdateRecs is decreased by 1.
Page Map Status is introduced to maintain page-mapping information for virtual blocks. When an UpdateRec switches from block mapping mode to page mapping mode, MFTL has to merge the corresponding primary block into the replacement block and then treat the resulting replacement block as the primary block. We can eliminate such overhead by switching directly to page mapping mode without a merge operation when LWP is less than half of the total pages in a block. The most significant bit (MSB) is used to indicate whether the mapped page is located in the primary block (set to 0) or in the replacement block (set to 1). Each Page Map Status contains two elements:
a) VBA records the corresponding virtual block address.
b) MapArray[N] is an integer array keeping the page mapping information of the block, where N is the number of pages per block. A value smaller than 0x80 in MapArray[i] means that the i-th logical page is located at the MapArray[i]-th page of the primary block. Otherwise, the i-th logical page is located at the (MapArray[i] AND 0x7F)-th page of the replacement block.
3) Write Flow
When the FAT file system creates a new file, it first issues a write to the root directory to update the file information.
It then updates the FAT tables to allocate the required storage space for the file. Finally, the file body is written to the allocated area. Conceptually, the storage space can be treated as containing two parts: a system area and a data area. The system area consists of a boot record, the FAT tables, and a root directory, while the data area is used to store file bodies. Any write/update to a file (data area) is accompanied by several small updates to the system area to ensure correct file information. Observing the very different access behaviors of the system area and the data area, we adopt
various mechanisms to deal with write operations in an adaptive manner. Since writing new data is trivial in flash-memory management, we focus on the operation of data updating. Upon arrival of an update request, MFTL first determines whether the corresponding UpdateRec exists by checking the VBA field of each UpdateRec. If no such UpdateRec exists, MFTL allocates a new one. If all the UpdateRecs have been assigned and the new update request does not match any of them, MFTL selects the UpdateRec with the lowest priority as a victim for replacement (ties are broken arbitrarily). Live pages in the primary block and the replacement block of the victim UpdateRec are merged into a newly allocated block, and the victim UpdateRec can then be released and reassigned to the new request. After the UpdateRec is located, MFTL determines the mapping mode of the UpdateRec. The mapping mode affects how MFTL writes data to flash memory. If the UpdateRec is in block mapping mode, some further checks are required. The update request is treated differently according to the index of its starting target page (STP) relative to the index of the last written page in the replacement block (LWP). If STP is larger than LWP, the request is treated as a forward update. Otherwise, some previously updated data is to be updated again by the request, and the request is treated as a backward update. Since backward updates incur higher page-copying overheads, page mapping mode is introduced to deal with small, random, and frequently updated data. The Count field in UpdateRec maintains the number of backward updates that have occurred in this record. When it exceeds a threshold, the mapping is switched to page mapping mode. If a backward update occurs but does not result in a switch to page mapping mode, MFTL examines whether page copying after updating the data is required.
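The update handling described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the threshold value, the initial LWP of -1, the clamping of Priority at 0, and all function names are hypothetical, while the field names follow the UpdateRec description in Section II.

```python
from dataclasses import dataclass

BLOCK_MODE, PAGE_MODE = 0, 1
COUNT_THRESHOLD = 2  # hypothetical value; the text does not state the threshold


@dataclass
class UpdateRec:
    vba: int              # virtual block address
    primary: int          # PBA of the primary block
    replace: int          # PBA of the replacement block
    lwp: int = -1         # last written page; -1 (assumed) means none written yet
    count: int = 0        # number of backward updates seen so far
    mode: int = BLOCK_MODE
    priority: int = 0xFF


def classify_update(rec, stp):
    """Classify an update starting at page STP as 'forward', 'backward', or 'page'."""
    if rec.mode == PAGE_MODE:
        return "page"
    if stp > rec.lwp:
        return "forward"
    rec.count += 1                     # a backward update has occurred
    if rec.count > COUNT_THRESHOLD:    # switch once Count exceeds the threshold
        rec.mode = PAGE_MODE
        return "page"
    return "backward"


def touch(recs, vba):
    """Apply the Priority rule: the matching record gets 0xFF, the others lose 1."""
    for rec in recs:
        if rec.vba == vba:
            rec.priority = 0xFF
        else:
            rec.priority = max(0, rec.priority - 1)  # clamping at 0 is an assumption


def pick_victim(recs):
    """Select the UpdateRec with the lowest Priority for replacement."""
    return min(recs, key=lambda r: r.priority)
```

The sketch only classifies requests and maintains Count, Mode, and Priority; the actual page writes, the LWP update after a write, and the merge of the victim's primary and replacement blocks are omitted.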
The primary reason for page copying after updating the data is to ensure data integrity, since the primary block or the replacement block would be reassigned in this case.
4) Read Flow
For a read request, MFTL locates the target data through either Block Table or the UpdateRecs (and Page Map Status records, if required). When a read request arrives, MFTL first derives the corresponding VBA and STP from the logical address of the request. Then, MFTL determines whether the derived VBA matches the VBA field of any UpdateRec. If no such UpdateRec is found, the PBA of the target data is obtained from Block Table[VBA]. With the PBA, MFTL accesses the corresponding flash-memory block and reads the required data from page STP to the last page of the block. Suppose that the UpdateRec is found and is in block mapping mode. In this case, MFTL has to identify which block to read and how many pages to read. It compares the derived STP with the LWP of the UpdateRec. If STP is smaller than or equal to LWP, a portion of the required data must be read from the corresponding replacement block. MFTL reads the required data from page STP to page
LWP in the replacement block and then reads from page LWP+1 to the last page in the primary block. Otherwise, MFTL simply reads the required data from page STP to the last page in the primary block. When the corresponding UpdateRec is found and is in page mapping mode, MFTL may not be able to read the required data sequentially, because the live pages in the replacement block are not programmed in such a manner. MFTL must look up MapArray to determine the physical locations of the needed data. For each target page i, if the value of MapArray[i] is 0xFF, MFTL reads data from page i in the primary block. Otherwise, MFTL reads from page MapArray[i] in the replacement block.

III. PERFORMANCE EVALUATION
A. Experiment Setup
This section evaluates the performance of the proposed MFTL in terms of the number of extra page writes, the number of total block erasures, and the memory requirement for the management. We compared the proposed MFTL with two well-known hybrid mapping schemes, BAST [4] and FAST [6], under different numbers of extra blocks. When the number of extra blocks is set to n, MFTL can have up to n UpdateRecs. The maximum number of blocks reserved for random accesses in BAST is n, while FAST contains one sequential write (SW) log block and (n − 1) random write (RW) log blocks. The experiments were conducted by trace-driven simulation. Traces with various access behaviors were collected under Windows XP with the FAT16 file system over a 2GB flash memory. The flash memory chip adopted in the experiment was a Samsung K9WAG08U1B 16G-bit SLC flash memory, in which each block contains 64 pages and each page is 2KB. All of the traces were captured by the SD/MMC card protocol analyzer VTE2100 [9]. Note that the FAT file system updates the root directory and the FAT tables frequently to maintain correct file information. The cluster size, which is the minimum allocation unit for a file, was set to 16KB. TABLE I summarizes the characteristics of the seven evaluation patterns.

TABLE I: CHARACTERISTICS OF EVALUATION PATTERNS

B. Experiment Results
Fig. 3 compares the number of extra page writes and the number of total block erasures of BAST, FAST, and the proposed MFTL under different numbers of extra blocks over a 2GB flash memory. Apart from the extreme case (i.e., the pattern 4KB-R), FAST performed well when the number of extra blocks was very limited; three extra blocks were enough to make its performance stable. This was because FAST shares its extra blocks among all logical blocks. However, the proposed MFTL outperformed both BAST and FAST for all patterns when four extra blocks could be provided. Based on our experimentation, the performance improvement was significant when the number of extra blocks was between four and five. There were two reasons for these results: (1) Since the FAT file system updates two FAT tables, the root directory, and the data area while writing a file to storage, four extra blocks could significantly improve the performance. (2) Windows XP writes files larger than 64KB with two threads, so five extra blocks might be required for updating a large file. As a result, providing more than six extra blocks would not noticeably reduce extra page writes or block erasures.
For the pattern 4KB-R, as illustrated in Fig. 3(a) and Fig. 3(b), the number of extra blocks seemed to have no effect on the performance improvement. This was because a tremendous number of random 4KB file updates would rapidly use up the log blocks. For BAST, each subsequent update would trigger a merge operation, and each merge operation would incur 64 extra page writes and 2 block erasures. For FAST, since a log block was shared by all data blocks, merge operations were not triggered as frequently as in BAST. Recalling that each flash-memory block contains 64 2KB pages and that the cluster size was 16KB, each merge operation would incur 8 × 64 = 512 extra page writes and 9 block erasures (compared with 8 merge operations in BAST, with 512 extra page writes and 16 block erasures). For MFTL, random 4KB file updates would not switch UpdateRecs into page mapping mode, and each 4KB file update would be treated as a forward update in block mapping mode (please refer to Section II-C1). In addition, updates to the root directory and the FAT tables could always reside in page mapping mode thanks to Count and Priority in UpdateRec. As a result, FAST and MFTL performed better in extra page writes and outperformed BAST in block erasures.
For the pattern 64KB-S, as illustrated in Fig. 3(c) and Fig. 3(d), there was no internal fragmentation issue, and readers might expect similar performance from the three hybrid mapping schemes. However, the experiment results told a different story. For BAST and FAST, when there was no log block assigned to the corresponding data block and the page offset of the write request did not align to the block boundary (i.e., page offset 0), a log block would be assigned and the updated data would be written to page 0 of that block. Even if the following write requests were sequential writes, BAST and FAST would still incur extra page writes and block erasures during merge operations.
On the other hand, as mentioned in Section II-C, MFTL uses Count in UpdateRec to prevent an immediate switch to page mapping mode.
This mechanism helped the UpdateRec stay in block mapping mode for sequential write requests. Thus, merge overheads were reduced and better performance was achieved. For patterns SUB and MP3, since large files led to many more sequential write operations, the number of merge operations was largely reduced. As shown in Fig. 3(f) and Fig. 3(h), the reduction in block erasures was not as significant as for patterns 4KB-R and 64KB-S; nevertheless, MFTL achieved better performance in block erasures and extra page writes than the other two schemes for large-file patterns when at least four extra blocks were provided. Note that in the pattern MP3 with at least four extra blocks, MFTL outperformed BAST and FAST in terms of extra page writes. This was because MFTL uses Count in UpdateRec to prevent an immediate switch to page mapping mode. With a small overhead of page copying, this mechanism helped the UpdateRec stay in block mapping mode. As a result, merge overhead was reduced
when an update request did not begin from page 0 of a block. For the memory requirements of the three hybrid mapping schemes, since the block mapping tables were the same for each scheme, we focused on the memory requirement for managing extra blocks. TABLE II lists the memory requirements for each extra block and each MapArray under the three schemes. BAST reserved one more byte to store information for the selection of a victim log block when a merge operation is triggered. MFTL reserved three extra bytes for the fields LWP, Count, Mode, and Priority. For page mapping, BAST required an array for each log block to keep track of the page mapping information. Since each block contained 64 pages, 64 entries were required for the array, and each entry was one byte. FAST required an array for each RW block. Since each RW block was shared by all data blocks, each entry of the array required three bytes to maintain complete information. MFTL did not reserve an array for each UpdateRec. Thus, MFTL required two more bytes to record the corresponding VBA.
Figure 3: Performance Comparison of BAST/FAST/MFTL under Different Number of Extra Blocks.

TABLE II: SRAM REQUIREMENTS FOR DIFFERENT SCHEMES (UNIT: BYTES)

TABLE III: SRAM REQUIREMENTS FOR MANAGING UPDATE REQUESTS WITH DIFFERENT NUMBER OF EXTRA BLOCKS. (UNIT: BYTES)

TABLE III lists the memory requirements for these schemes with respect to different numbers of extra blocks. As shown in the table, when the number of extra blocks is five, MFTL can save about 31.55% memory space compared with BAST and up to 69.55% compared with FAST. These savings can be as high as 59.44% and 83.89% when there are ten extra blocks provided by the system.
IV. CONCLUSION
Flash technology is constantly changing, providing faster program and erase cycles, a larger number of guaranteed erase and re-program cycles, and longer data retention. This paper proposes a management scheme for MLC flash memory storage systems. Observing that most existing user applications access NAND flash memory under the FAT file system, this paper takes the new constraints of MLC flash memory and the access behaviors of the FAT file system into consideration. The proposed MFTL scheme treats different access behaviors in an adaptive manner. It adopts a hybrid mapping approach to balance performance against SRAM requirement. An extensive set of experiments has been conducted to evaluate the performance of the proposed MFTL. Our experiment results show that the proposed scheme outperforms BAST and FAST in terms of extra page writes and block erasures, while the SRAM requirement is largely reduced. For future work, we shall further take crash recovery and wear leveling issues into our design and survey other newly announced hybrid mapping schemes, e.g., LAST and KAST, for comparison.

REFERENCES
[1] Journaling Flash File System (JFFS) and Journaling Flash File System 2 (JFFS2), http://sources.redhat.com/js2/js2-html/
[2] Understanding the Flash Translation Layer (FTL) Specification. Technical report, Intel Corporation, Dec 1998.
[3] Aleph One Company. Yet another Flash Filing System.
[4] H. Cho, D. Shin, and Y. I. Eom. KAST: K-Associative Sector Translation for NAND Flash Memory in Real-Time Systems. In Design, Automation and Test in Europe (DATE), pages 507–512, April 2009.
[5] J. Kim, J. M. Kim, S. H. Noh, S. L. Min, and Y. Cho. A Space Efficient Flash Translation Layer for Compact Flash Systems. In IEEE Transactions on Consumer Electronics, Vol. 48, No. 2, pages 366–375.
[6] S. Lee, D. Shin, Y.-J. Kim, and J. Kim. LAST: Locality-Aware Sector Translation for NAND Flash Memory-Based Storage Systems. ACM SIGOPS Operating Systems Review, 42:36–42, October 2008.
[7] S.-W. Lee, D.-J. Park, T.-S. Chung, D.-H. Lee, S. Park, and H.-J. Song. A Log Buffer-Based Flash Translation Layer Using Fully-Associative Sector Translation. In ACM Transactions on Embedded Computing Systems, Vol. 6, No. 3, Article 18, July 2007.
[8] M-Systems. Flash-memory Translation Layer for NAND flash (NFTL), 1998.
[9] Samsung Electronics. Samsung K9LBG08U0M (v1.0) - 32Gb DDP MLC Data Sheet.
[10] Testmetrix Inc. VTE2100.
[11] D. Woodhouse. JFFS: The Journaling Flash File System. In Ottawa Linux Symposium, 2001.
[12] A set-based mapping strategy for flash-memory reliability enhancement. Design, Automation & Test in Europe Conference & Exhibition, 2009.
[13] Program/erase selection for flash memory, Jerry A. Kreifels et al.
[14] A flash-memory based file system, TCON'95 Proceedings of the USENIX 1995 Technical Conference Proceedings, USENIX Association, Berkeley, CA, USA, 1995.
[15] Dr. John R Busch; Flash Memory Architecture Alternatives.

AUTHOR PROFILE
Amarnath Gaini received the Bachelor's degree in Electronics and Communication Engineering from JNTUH, Hyderabad, India in 2007 and the Master's degree in VLSI System Design from JNTUH, Hyderabad, India in 2009. He is currently working as an Assistant Professor in the Electronics and Communication Engineering department at VITS (N9), Andhra Pradesh, India. He holds a Life Membership in ISTE. His research interests include VLSI, embedded systems, and digital systems platforms.
