
H. Yan and Q. Yao: An Efficient File-aware Garbage Collection Algorithm for NAND Flash-based Consumer Electronics 623

An Efficient File-aware Garbage Collection Algorithm for NAND Flash-based Consumer Electronics
Hua Yan, Member, IEEE, Qian Yao

Abstract: The use of NAND flash memory is increasing in consumer electronics. Because an out-of-place update scheme is used to address the erase-before-write hardware constraint in NAND flash memory, a garbage collection algorithm should be designed into the flash translation layer (FTL) or the flash-specific file system to reclaim garbage pages and obtain free space. In this paper, an efficient file-aware garbage collection algorithm, called FaGC, is proposed for NAND flash memory systems in consumer electronics. The purpose of the proposed algorithm is to reduce garbage collection overhead and improve wear leveling in NAND flash memory systems. The experimental results show that the proposed algorithm outperforms existing garbage collection algorithms in terms of the number of copy operations, the number of erase operations, and the degree of wear leveling. Additionally, with limited cost, a desired degree of wear leveling can be achieved using a pre-designated value, which is advantageous for NAND flash memory systems in consumer electronic devices.¹

Index Terms: Garbage collection algorithm, Consumer electronics, File systems, Flash translation layer, Wear leveling.

¹ This work was supported by the National Natural Science Foundation of China (Grant No. 61172181) and the State Scholarship Fund through the China Scholarship Council (CSC) of the Ministry of Education of China.
Hua Yan is with the College of Electronics and Information Engineering, Sichuan University, Chengdu 610064, China (e-mail: yanhua@scu.edu.cn).
Qian Yao is with the College of Electronics and Information Engineering, Sichuan University, Chengdu 610064, China (e-mail: 64879927@qq.com).
Contributed Paper
Manuscript received 09/11/14
Current version published 01/09/15
Electronic version published 01/09/15. 0098 3063/14/$20.00 © 2014 IEEE
IEEE Transactions on Consumer Electronics, Vol. 60, No. 4, November 2014

I. INTRODUCTION
NAND flash memory has become one of the most popular storage media, widely used in consumer electronics such as smart phones, intelligent terminals, digital cameras, portable media players, laptop computers, and tablet personal computers [1], [2]. Compared to hard disk drives, NAND flash memory has numerous advantages, such as strong shock resistance, low power consumption, low noise, small size, light weight, and fast data access [3]. Additionally, every year its cost decreases and its storage capacity and density increase.

NAND flash memory is composed of blocks, which serve as the units for erase operations; each block has a fixed number of pages, which are the units for read and write operations. Each page is further divided into two regions, namely, the data region and the spare region. The data region is responsible for storing data, whereas the spare region is used to store the status of the data region, including error correction codes.

Although NAND flash memory has significant advantages, it has some characteristics that are different from magnetic disks. First, NAND flash memory has an erase-before-write constraint, which means that data in the memory cannot be updated directly at the same position. To address this limitation, NAND flash memory performs an out-of-place update scheme, in which new data are written to an erased or free page of the memory, and the old data in the original page are invalidated. With time, many invalidated, or garbage, pages accumulate in the memory, and the free space is gradually reduced. Eventually, the free space in the memory becomes insufficient, and a garbage collection algorithm must be activated to collect garbage space and create free space.

Second, NAND flash memory has different costs for the read, write, and erase operations. As shown in Table I, the time required for the erase operation is much greater than for the write operation, and the time for the write operation is much greater than the read operation. The garbage collection procedure selects a victim block to erase and copies the valid data in the block to free pages before the victim block is erased; consequently, a series of read, write and erase operations must be performed. Because the write and erase operations are time consuming, garbage collection usually increases overhead. Therefore, the garbage collection policy should minimize the number of copy and erase operations.

TABLE I
THE CHARACTERISTICS OF NAND FLASH MEMORY

Basic Operation       Latency
Read (2K bytes)       10 µs
Write (2K bytes)      200 µs
Erase (128K bytes)    2000 µs

Third, the number of erase operations that can be performed on each block of a NAND flash memory chip is limited. This number is typically 100,000 for single-level cell (SLC) flash memory and 10,000 for multi-level cell (MLC) flash memory. When the number of erase operations performed on a block exceeds this limit, the block will suffer from frequent write errors. Therefore, another important requirement for garbage collection is to limit the erase count for each block to prevent blocks from becoming unevenly worn, and thus to lengthen the lifespan of the entire flash memory. This requirement is called wear leveling.

Therefore, an efficient garbage collection algorithm has two goals. One goal is to minimize the number of copy and erase operations to reduce overhead, and the other goal is to improve the degree of wear leveling to extend the lifespan of the NAND flash memory. To achieve these two goals, an efficient file-aware garbage collection algorithm, called file-aware garbage collection (FaGC), is proposed for NAND flash memory in consumer electronic devices. The proposed algorithm introduces a table in which the update frequency for each part of a file is recorded; this frequency information is then used to cluster the valid pages from the victim block as they are copied to free blocks. Additionally, a hybrid wear leveling policy is adopted to improve the degree of wear leveling.

A series of experiments is conducted to evaluate the effectiveness of the proposed algorithm, and the experimental results show that the proposed algorithm is better than the state-of-the-art garbage collection algorithms in terms of the number of copy operations, the number of erase operations, and the degree of wear leveling.

The remainder of this paper is organized as follows. Section II introduces the system architecture for NAND-flash-based consumer electronic devices and reviews existing garbage collection algorithms. The proposed garbage collection algorithm, FaGC, is presented in Section III. Section IV introduces the experimental methods and presents the experimental results. Finally, conclusions are presented in Section V.

II. RELATED WORK

A. System Architecture for NAND Flash-based Consumer Electronics
As shown in Fig. 1, the system architecture for NAND flash-based consumer electronics consists of three layers: the user layer, the kernel layer, and the device layer.

[Fig. 1 diagram. User Layer: User Applications. Kernel Layer: Virtual File System (VFS); Existing File Systems (Ext2, FAT, …) on top of the Flash Translation Layer (FTL), alongside Flash Specific File Systems (JFFS, YAFFS, …); Memory Technology Device (MTD). Device Layer: NAND Flash Memory.]
Fig. 1. System architecture for NAND flash-based consumer electronics.

A virtual file system (VFS) is an abstraction layer on top of an actual file system that allows user applications to access different file systems in a uniform way. The memory technology device (MTD) is a generic subsystem for handling memory devices under an operating system. NAND flash memory is usually used for storage in consumer electronic devices, and file systems manage the NAND flash memory device as a block device.

There are two ways to support NAND flash memory in file systems. For existing file systems, such as Ext2 and FAT, a flash translation layer (FTL) can be introduced between the existing file system and the flash memory [4]. A more efficient way to use flash memory for storage in consumer electronic devices is to use a file system designed specifically for flash memory, such as JFFS (journaling flash file system) or YAFFS (yet another flash file system), without a translation layer. Either an FTL or a flash-specific file system provides address translation between physical and logical addresses to provide transparent access to the flash memory. The translation or mapping between the logical location and the physical location can be maintained either at the page level or the block level [5]. Page mapping is adopted in this paper.

B. Existing Garbage Collection Algorithms
The greedy algorithm (GR) was proposed by Wu and Zwaenepoel for garbage collection [6]. The greedy algorithm selects the block with the fewest valid pages as the victim block for garbage collection. This approach can reduce the overhead required for copying valid pages within the victim block to free space during garbage collection. However, the GR algorithm does not take into account wear leveling in flash-based consumer electronic devices. It has been shown that the GR algorithm performs well in terms of wear leveling for random memory accesses but does not perform well for memory accesses with a high spatial locality of reference.

Kawaguchi et al. proposed the cost-benefit (CB) algorithm for flash memory [7]. CB calculates a cost-benefit value for each block and selects the block with the highest value as a victim. The cost-benefit value for a block is calculated as (age · (1 − u)) / 2u, where age is the elapsed time since the last modification of a page within the block and u is the percentage of valid pages within the block. Because the CB algorithm takes into account both the age of invalid pages and the percentage of valid pages in a block, it could provide improved wear leveling in flash-based consumer electronic devices. However, because the CB algorithm does not take into account the erase count for each block, its wear leveling performance is not sufficient.

Chiang et al. proposed the cost-age-time (CAT) algorithm, which extends the CB algorithm by considering the erase count for each block when selecting a victim block [8]. The CAT algorithm attempts to maintain a balance between reducing the garbage collection overhead and improving wear leveling in flash memory.

All these algorithms are focused at the level of physical pages or blocks in the NAND flash memory, and none take the associated file structure into account.

III. PROPOSED GARBAGE COLLECTION ALGORITHM
An efficient file-aware garbage collection algorithm, called FaGC, for NAND flash memory systems in consumer electronic devices is presented in this section. FaGC reduces garbage collection overhead by reducing the number of copy and erase operations and extends the lifetime of NAND flash memory through improved wear leveling.
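
For comparison with the proposed algorithm, the GR and CB victim-selection rules reviewed in Section II-B can be sketched as follows. This is only an illustrative sketch: the `Block` record and its field names are hypothetical, u is taken as the fraction of valid pages in a block, and giving a block with no valid pages an infinite CB score is an assumption made here to avoid division by zero, not something stated in the paper.

```python
from dataclasses import dataclass


@dataclass
class Block:
    """Hypothetical per-block bookkeeping used only for this sketch."""
    block_id: int
    valid_pages: int   # pages that still hold live data
    total_pages: int   # pages per block
    age: float         # time since a page in the block was last modified


def greedy_victim(blocks):
    """GR: choose the block with the fewest valid pages."""
    return min(blocks, key=lambda b: b.valid_pages)


def cost_benefit(block):
    """CB score age * (1 - u) / (2u), with u the fraction of valid pages."""
    u = block.valid_pages / block.total_pages
    if u == 0:
        # Assumption: a block with nothing to copy is the best possible victim.
        return float("inf")
    return block.age * (1 - u) / (2 * u)


def cb_victim(blocks):
    """CB: choose the block with the highest cost-benefit score."""
    return max(blocks, key=cost_benefit)


blocks = [
    Block(0, valid_pages=10, total_pages=64, age=5.0),
    Block(1, valid_pages=20, total_pages=64, age=100.0),
    Block(2, valid_pages=60, total_pages=64, age=500.0),
]
print(greedy_victim(blocks).block_id, cb_victim(blocks).block_id)
```

With these made-up values, GR picks block 0 because it has the fewest valid pages, whereas CB picks block 1 because its greater age outweighs the extra copy cost. Neither rule looks at erase counts, which is the limitation that CAT and FaGC address.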

A. File-aware Update Frequency Table
The basic operations in user applications in consumer electronic devices are performed at the file level. In page mapping mode, a logical location is translated to a physical location at the page level. For convenience, a logical page in a file is called a chunk in this paper. Therefore, a file consists of a series of chunks mapped to physical pages in the flash memory. Each file is assigned a unique number, called File ID, and each chunk in a file is assigned a unique number, called Chunk ID.

In general, different files have different update frequencies, and different chunks in the same file have different update frequencies. In a file-aware system structure, an update frequency table (UFT) is built into random access memory (RAM) to record the update frequency for each chunk in a file. As illustrated in Fig. 2, each UFT entry contains four values: File ID, Chunk ID, Time, and Freq. Time records the most recent time that a chunk in a file has been updated, and Freq records the frequency with which a chunk has been updated. Simultaneously, the physical-to-logical translation table (PLT) maintains the File ID and Chunk ID for each block and physical page in the flash memory.

[Fig. 2 diagram. Chunks C0, C1, C2, … of File x and chunks C0, C1, … of File y are mapped to pages P0, P1, P2, P3, … of Block m and Block n. The legend distinguishes file chunks, valid pages, and invalid pages.]

PLT
Block      …   m    m    m    m   …   n    n    n    n   …
Page       …   0    1    2    3   …   0    1    2    3   …
File ID    …   x    x   -1    x   …  -1    y    y   -1   …
Chunk ID   …   0    1   -1    2   …  -1    0    1   -1   …

UFT
File ID    …   x     x     x    …   y     y    …
Chunk ID   …   0     1     2    …   0     1    …
Time       …   tx0   tx1   tx2  …   ty0   ty1  …
Freq       …   fx0   fx1   fx2  …   fy0   fy1  …

Fig. 2. Relationships between files, blocks, the physical-to-logical translation table (PLT) and the update frequency table (UFT).

When a chunk in a file is modified or updated, it is rewritten to another physical page in flash memory according to the out-of-place update scheme. At that time, File ID and Chunk ID in the PLT are updated.

Additionally, when a chunk in a file is modified or updated, the current time is recorded in the UFT, and Freq is calculated and recorded as follows.

Assume that F_i is the update frequency of a chunk at the i-th update or modification; F_{i+1} can be calculated according to (1):

\[
F_{i+1} =
\begin{cases}
F_i \cdot u, & \text{if } 1 \le F_{i+1} \le N_{block} \\
1, & \text{if } F_{i+1} < 1 \\
N_{block}, & \text{if } F_{i+1} > N_{block}
\end{cases}
\tag{1}
\]

with:

\[
u = \left(\frac{1}{2}\right)^{\left(\frac{t_{i+1} - t_i}{N_t} - 1\right)}
\tag{2}
\]

where the initial value of F_i is T_freq, which is the previously designated threshold for the update frequency of a chunk. An update frequency higher than T_freq indicates a high update frequency; otherwise, the update frequency is low. N_block is the total number of blocks in the NAND flash memory, and N_t is a factor used to adjust the level of sensitivity to the time. Typically, the value of T_freq will be N_block/4 or N_block/2. Time t_i is the time of the i-th update of a chunk, and t_{i+1} is the current time for the (i+1)-th update.

B. Detailed Implementation of the Proposed Algorithm
The proposed algorithm, FaGC, is based on the accurate maintenance of the PLT and UFT. Every time the data in a chunk is modified and then rewritten to a free page, the associated entries in the PLT and UFT must be updated according to the above steps.

In general, the block with the fewest valid pages is selected as the victim block to minimize the overhead for the copy operation, as in the GR algorithm.

After the victim block is selected, the valid pages in the victim block are copied to free space, and the victim block is then erased and reclaimed. Before the valid pages are copied, the update frequency of the chunk associated with each valid page in the victim block will be checked in the PLT and UFT. If the value of Freq for the chunk associated with a valid page in the victim block is not less than T_freq, the page is copied to the free block with the lowest erase count. If Freq is less than T_freq, the page is copied to the free block with the highest erase count. Therefore, wear leveling is improved using this clustering procedure based on the update frequency.

Additionally, the decision of when to trigger garbage collection affects the performance of NAND flash-based consumer electronic devices. Garbage collection algorithms are usually designed to execute garbage collection periodically or to trigger it when the free space in the NAND flash memory is insufficient, without considering the efficiency of the garbage collection process [9]. A scattering factor, f_scattered, is defined as follows.

\[
f_{scattered} = \frac{N_{fp} - N_{fb} \cdot N_p}{N_{fp}}
\tag{3}
\]

where N_fp is the number of free pages in the flash memory and N_fb is the number of free blocks in the flash memory. N_p is the number of pages per block.

The higher the value of f_scattered is, the more likely pages are to be scattered, and the more efficient garbage collection will be in reclaiming space. Therefore, to avoid unnecessary garbage collection, only when f_scattered is greater than a threshold T_f will garbage collection be executed.

However, because the blocks with the lowest update frequencies are likely to accumulate more valid pages, these blocks are less likely to be selected as victims, resulting in poor wear leveling. To improve wear leveling, an adaptive victim selection policy, called max-min adjustment, is adopted when a block’s erase count exceeds a value called T_erase since the last max-min adjustment victim was selected. T_erase is defined as follows.

\[
T_{erase} = T_{wl} - (e_{max} - e_{min})
\tag{4}
\]

where T_wl is a previously designated threshold used to control the degree of wear leveling, e_max is the maximum erase count for all blocks, and e_min is the minimum erase count for all blocks. As the difference between e_max and e_min increases, the value of T_erase decreases, and the likelihood of triggering the max-min adjustment will increase. If T_erase is less than 0, then T_erase is set to 0.

When the max-min adjustment policy is triggered, the policy will select the block with the minimum erase count as the victim block to increase the rate of selection of blocks with many chunks with low update frequencies. If there are multiple blocks with the minimum erase count, the block with the fewest valid pages is selected as the victim block. The value of T_erase is calculated and updated every time the max-min adjustment policy is activated.

[Fig. 3: bar chart. Vertical axis: number of copy operations (0 to 4,000,000). Bars: GR, CB, CAT, FaGC (T_wl = 300), FaGC (T_wl = 200), FaGC (T_wl = 100).]
Fig. 3. The number of copy operations performed by each algorithm.
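
The quantities defined in (1) to (4) and the hot/cold placement rule of Section III-B can be sketched in a few functions. This is a minimal sketch under stated assumptions, not the authors' implementation: all function names and the `free_erase_counts` dictionary are hypothetical, (1) is read as clamping F_i · u to the range [1, N_block], the exponent in (2) is taken as (t_{i+1} − t_i)/N_t − 1, and at least one free page is assumed to exist when the scattering factor is computed.

```python
def decay_factor(t_prev, t_now, n_t):
    """Eq. (2): u = (1/2) ** ((t_now - t_prev) / n_t - 1)."""
    return 0.5 ** ((t_now - t_prev) / n_t - 1)


def next_freq(freq, t_prev, t_now, n_block, n_t):
    """Eq. (1), read here as F_i * u clamped to [1, n_block]."""
    scaled = freq * decay_factor(t_prev, t_now, n_t)
    return min(max(scaled, 1.0), float(n_block))


def pick_free_block(freq, t_freq, free_erase_counts):
    """Placement rule: chunks with freq >= t_freq go to the least-worn free
    block, the others to the most-worn one.

    free_erase_counts maps a free block id to its erase count (hypothetical).
    """
    if freq >= t_freq:
        return min(free_erase_counts, key=free_erase_counts.get)
    return max(free_erase_counts, key=free_erase_counts.get)


def scattering_factor(n_fp, n_fb, n_p):
    """Eq. (3): share of free pages that are not inside wholly free blocks.

    Assumes n_fp > 0.
    """
    return (n_fp - n_fb * n_p) / n_fp


def erase_threshold(t_wl, e_max, e_min):
    """Eq. (4), floored at 0 as described in the text."""
    return max(t_wl - (e_max - e_min), 0)


# Example with the Table II values N_block = 512, N_t = 1, T_freq = 128.
f = next_freq(128, t_prev=0, t_now=0, n_block=512, n_t=1)   # immediate rewrite
print(f, pick_free_block(f, 128, {3: 10, 7: 2, 9: 40}))
print(scattering_factor(n_fp=1000, n_fb=10, n_p=64))
print(erase_threshold(t_wl=100, e_max=950, e_min=900))
```

Under this reading, a chunk rewritten within less than N_t time units has its frequency scaled up (by at most a factor of 2), a chunk rewritten after exactly N_t keeps its frequency, and longer gaps halve the frequency for every additional N_t that elapses.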

IV. EXPERIMENTS
The experimental methods and results are presented in this section.

A. Experimental Methods
To evaluate its performance, the proposed FaGC algorithm is compared with three existing garbage collection algorithms: the greedy algorithm (GR), the cost-benefit algorithm (CB), and the cost-age-time algorithm (CAT).

NANDsim is used to emulate NAND flash memory. The YAFFS2 file system is used, which can directly support NAND flash memory. To provide a fair comparison, YAFFS2 is modified to support all of the above algorithms and configured with caches disabled and aggressive mode selected.

The trace used for the experiments consists of a series of files with sizes ranging from 16 KB to 1024 KB. The files are generated according to the Zipf distribution and show a high locality of reference [10]. In the trace, all write operations are performed using 15% of the data, and the utilization of the NAND flash memory is 90%.

TABLE II
PARAMETER VALUES FOR THE PROPOSED ALGORITHM

N_block   N_p   N_t   T_freq   T_wl            f_scattered
512       64    1     128      100, 200, 300   90%

The simulation parameters for the NAND flash memory used in the experiments are listed in Table I. The page size, the number of pages per block, the total number of blocks, and the total size of the NAND flash memory are assumed to be 2 KB, 64, 512, and 64 MB, respectively. The parameter values and thresholds used in the proposed algorithm are listed in Table II.

B. Experimental Results
The performance metrics used are the number of copy operations, the number of erase operations, the maximum erase count for all blocks, and the degree of wear leveling. The degree of wear leveling can be measured by the maximum erase count for all blocks, the maximum difference in the erase counts, and the standard deviation of the erase counts.

[Fig. 4: bar chart. Vertical axis: number of erase operations (up to 100,000). Bars: GR, CB, CAT, FaGC (T_wl = 300), FaGC (T_wl = 200), FaGC (T_wl = 100).]
Fig. 4. The number of erase operations performed by each algorithm.

Fig. 3 and Fig. 4 show the results for the number of copy and erase operations, respectively, performed by each of the garbage collection algorithms. FaGC selects the block with the fewest valid pages as the victim, and the valid pages are clustered in the free blocks according to the update frequency of the associated file chunk. If the valid pages associated with file chunks with similar update frequencies are clustered in one free block, then the valid pages in the same block are more likely to become invalid at the same time. The more likely the valid pages in a block are to become invalid at the same time, the greater the number of invalid pages will become in the block, and the smaller the number of valid pages will become. Therefore, the number of copy and erase operations required is reduced. As shown in Fig. 3 and Fig. 4, the results are similar, and FaGC achieves the best performance regardless of the value of T_wl.

[Fig. 5: bar chart. Vertical axis: the maximum erase count (0 to 1000). Bars: GR, CB, CAT, FaGC (T_wl = 300), FaGC (T_wl = 200), FaGC (T_wl = 100).]
Fig. 5. The maximum erase count for all blocks obtained with each algorithm.

[Fig. 6: bar chart. Vertical axis: the maximum difference of erase counts (0 to 1000). Bars: GR, CB, CAT, FaGC (T_wl = 300), FaGC (T_wl = 200), FaGC (T_wl = 100).]
Fig. 6. The maximum difference in the erase counts obtained with each algorithm.
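
The three wear-leveling measures named in Section IV-B can be computed from a list of per-block erase counts, as in the short sketch below. The function name and the sample data are hypothetical, and the population standard deviation is used as an assumption, since the paper does not state which form of the standard deviation it reports.

```python
import statistics


def wear_metrics(erase_counts):
    """Return the three wear-leveling measures for per-block erase counts."""
    highest = max(erase_counts)
    lowest = min(erase_counts)
    return {
        "max_erase": highest,
        "max_diff": highest - lowest,
        # Assumption: population standard deviation over all blocks.
        "std_dev": statistics.pstdev(erase_counts),
    }


# Made-up erase counts for three blocks, for illustration only.
print(wear_metrics([10, 12, 14]))
```

Lower values of all three measures indicate more even wear across blocks.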

[Fig. 7: line chart. Vertical axis: the standard deviation of erase counts (0 to 120). Horizontal axis: the erase counts of all blocks (10,000 to 70,000). Series: GR, CB, CAT, FaGC (T_wl = 300), FaGC (T_wl = 200), FaGC (T_wl = 100).]
Fig. 7. The standard deviation of the erase counts obtained with each algorithm.

Fig. 5, Fig. 6, and Fig. 7 show the results for the maximum erase count for all blocks, the maximum difference in the erase counts, and the standard deviation of the erase counts for each garbage collection algorithm. FaGC adopts a hybrid wear-leveling policy. The valid pages associated with chunks with high update frequencies are copied to the free block with the lowest erase count, whereas the valid pages associated with chunks with low update frequencies are copied to the free block with the highest erase count. Additionally, the max-min adjustment policy takes into account the data with the lowest update frequency. As shown in Fig. 5 and Fig. 6, the proposed algorithm outperforms the other garbage collection algorithms in terms of the maximum erase count for all blocks and the maximum difference in the erase count. In Fig. 7, the standard deviation from the GR, CB, and CAT algorithms increases with time, whereas the standard deviation obtained using the FaGC algorithm converges to a stable value. Additionally, by comparing Figs. 5 and 6 and Figs. 3 and 4, it can be seen that better wear leveling can be achieved by decreasing the value of T_wl, and the increased overhead due to copy and erase operations is small. Therefore, the degree of wear leveling can be controlled using the threshold T_wl. This is an important advantage in NAND flash-based consumer electronic devices.

V. CONCLUSION
An efficient file-aware garbage collection algorithm, called FaGC, is proposed in this paper for NAND flash memory systems in consumer electronic devices. The FaGC algorithm copies valid pages in a victim block to clusters in free blocks according to the calculated update frequency of the associated chunk. The FaGC algorithm adopts a hybrid wear-leveling policy to improve the lifespan of NAND flash memory. To avoid unnecessary garbage collection, a scattering factor is defined and calculated to determine when to trigger the garbage collection policy.

Simulation experiments are conducted to evaluate the performance of the proposed algorithm, and encouraging results are obtained. Additionally, with limited cost, a desired degree of wear leveling can be achieved using a pre-designated value, which is advantageous for NAND flash memory systems in consumer electronic devices.

ACKNOWLEDGMENT
The authors thank the School of Computing, Informatics, and Decision Systems Engineering (CIDSE) Center for Embedded Systems at Arizona State University for providing support.

REFERENCES
[1] Jesung Kim, Jong Min Kim, Sam H. Noh, Sang Lyul Min, and Yookun Cho, “A space-efficient Flash Translation Layer for compact flash systems,” IEEE Transactions on Consumer Electronics, vol. 48, no. 2, pp. 366-375, May 2002.
[2] Han-Lin Li, Chia-Lin Yang, and Hung-Wei Tseng, “Energy-aware flash memory management in virtual memory system,” IEEE Transactions on Very Large Scale Integration (VLSI) Systems, vol. 16, no. 8, pp. 952-964, August 2008.
[3] Adam Leventhal, “Flash storage memory,” Communications of the ACM, vol. 51, no. 7, pp. 47-51, July 2008.
[4] Tae-Sun Chung, Dong-Joo Park, Sangwon Park, Dong-Ho Lee, SangWon Lee, et al., “A survey of Flash Translation Layer,” Journal of Systems Architecture, vol. 55, no. 5-6, pp. 332-343, May 2009.
[5] Seung-Ho Lim and Kyu-Ho Park, “An Efficient NAND Flash File System for Flash Memory Storage,” IEEE Transactions on Computers, vol. 55, no. 7, pp. 906-912, July 2006.
[6] Michael Wu and Willy Zwaenepoel, “eNvy: a non-volatile main memory storage system,” in Proceedings of the 6th International Conference on Architecture Support for Programming Languages and Operating Systems, San Jose, CA, USA, pp. 86-97, 1994.
[7] Atsuo Kawaguchi, Shingo Nishioka, and Hiroshi Motoda, “A flash memory based file system,” in Proceedings of the USENIX 1995 Technical Conference, Berkeley, CA, USA, pp. 155-164, 1995.
[8] Mei-Ling Chiang, Paul C. H. Lee, and Ruei-Chuan Chang, “Cleaning policies in mobile computers using flash memory,” Journal of Systems and Software, vol. 48, no. 3, pp. 213-231, 1999.
[9] Guangxia Xu, Manman Wang, and Yanbing Liu, “Swap-aware Garbage Collection Algorithm for NAND Flash-based Consumer Electronics,” IEEE Transactions on Computers, vol. 60, no. 1, pp. 60-65, Feb. 2014.
[10] Mingwei Lin and Shuyu Chen, “Efficient and Intelligent Garbage Collection Policy for NAND Flash-based Consumer Electronics,” IEEE Transactions on Consumer Electronics, vol. 59, no. 3, pp. 538-543, August 2013.

BIOGRAPHIES
Hua Yan (M’13) received the B.S. degree in Automation, the M.S. degree in Radio Electronics, and the Ph.D. degree in Mechanical Automation from Sichuan University in 1993, 1996, and 2008, respectively. Currently, he is an associate professor in the College of Electronics and Information Engineering at Sichuan University. His research interests include pattern recognition and intelligent systems.

Qian Yao received the B.S. degree in Electronics and Information Engineering from Henan University of Technology in 2011. Currently, he is pursuing the M.S. degree in the College of Electronics and Information Engineering at Sichuan University. His research interests include embedded systems, the Linux operating system, and flash memory.
