
HARDWARE DESIGN OF DATA CONTROLLER FOR NAND FLASH MEMORY

Under the esteemed guidance of

Dr. Jai Gopal Pandey

Principal Scientist at CSIR-CEERI

&

Dr. Jeetendra Singh

Assistant Professor

Department of Electronics and Communication Engineering
CERTIFICATE
This is to certify that the dissertation entitled “HARDWARE DESIGN OF DATA CONTROLLER FOR FLASH MEMORY” submitted by Vikas Patel is an authentic record of work done by him under our guidance and supervision. This partial project is being submitted to the National Institute of Technology, Sikkim, towards the fulfilment of the requirements for the award of the degree of Master of Technology in Microelectronics and VLSI Design. It is further declared that, to the best of our knowledge, the matter embodied in this thesis has not been submitted to any other university or institute for the award of any degree.

Dr. Jeetendra Singh

DECLARATION
I hereby declare that the work being presented in this dissertation entitled “HARDWARE DESIGN OF DATA CONTROLLER FOR FLASH MEMORY”, submitted towards the fulfilment of the requirements for the award of the degree of Master of Technology in Embedded System Design to the School of VLSI Design and Embedded Systems, National Institute of Technology, Sikkim, is an authentic record of my own work carried out under the guidance of Prof. (Dr.) Jeetendra Singh, Assistant Professor, ECE Department, National Institute of Technology, Sikkim.

I also confirm that this report has been prepared solely for my academic requirements and not for any other purpose.

Vikas Patel

ACKNOWLEDGEMENT
The work presented in this thesis would not have been possible without my close association with many people. I take this opportunity to extend my sincere gratitude and appreciation to all those who made this MTech thesis possible. First, I would like to express my deep sense of respect and gratitude to my mentor, Assistant Prof. (Dr.) Jeetendra Singh, Department of Electronics and Communication Engineering, NIT Sikkim, for his guidance, for sparking my interest in memory design through VLSI design during my master's, and for giving me the opportunity to do my dissertation on this important and interesting topic. His professional knowledge, contagious enthusiasm, and faith in me were very important and gave me the strength to conclude this work. I also want to thank Dr. Jai Gopal Pandey, Principal Scientist at CSIR-CEERI, who taught me, set a clear path for what to do and what to aim for, and whose constant encouragement, motivation, and direction led me towards my goals.

I express my grateful thanks to all faculty members and Dr. Rashmi Dhara, Department of Electronics and Communication Engineering, NIT Sikkim, for their encouragement and support in completing this
project. Finally, I express my deep sense of reverence to my dearest parents, family and my loving friends
for their unconditional support, patience and encouragement through all these years. This work would not
have been possible without their support.

Thank you to everyone who has directly or indirectly influenced this work.

Vikas Patel
CHAPTER 1
INTRODUCTION

For many years, semiconductor demand has been rising continuously, driven by the electronics market's need for cheap, fast, high-performance devices. Semiconductor memory based on complementary metal-oxide-semiconductor (CMOS) technology is divided into two classes, as shown in Figure 1:

 Volatile memories lose their information when the power supply is turned off, e.g., SRAM (Static Random-Access Memory) and DRAM (Dynamic Random-Access Memory). They offer fast writing and reading; SRAM is available in smaller capacities, while DRAM is very dense and has enough capacity to hold large amounts of data.

 Non-volatile memory retains its contents indefinitely: the system does not lose the data and information stored within the memory even after the user shuts down or interrupts the power supply. Examples are EPROM, EEPROM, and Flash. These memories trade less aggressive program and read performance for non-volatility, with higher power consumption and slower access speeds.

Thanks to these distinctive memories, systems have covered a remarkable range of applications, from consumer and automotive to computers and communications, using non-volatile memory. The different types of non-volatile memory can be compared along two axes, flexibility and cost (Figure 2). Flexibility means the memory can be programmed and erased many times in the system, at a fine granularity of data. Cost refers to the price per bit, which is determined by the density or, in simpler words, the cell size. Considering the flexibility-cost plane, Flash offers the best compromise between these two parameters, since it has the smallest cell size. When we talk about storage elements, the first name that comes to mind is flash memory, which plays a vital role in the electronics industry and has contributed to it for many years. Flash memory serves multiple applications due to its high speed, non-volatility, and low power consumption (Table 1). A qualitative comparison of NVMs in the flexibility-cost plane differentiates the cell types, and every cell type has its own array organization. NAND flash memory has matured and become widespread thanks to solid-state drives (SSDs) [2] and USB flash drives, commonly called flash storage devices. Non-volatile memory applications permit system reconfiguration, software updates, modification of stored identification codes, and regular updating of stored information.

Memory cell arrangement
NAND: Cells are arranged in series, with adjacent cells sharing source and drain (the source of one cell is connected to the drain of the next). This series connection reduces the number of ground wires and bit lines and gives NAND a smaller cell size, but it does not allow direct access to individual cells. Advantage: smaller die area and lower cost per bit.
NOR: Cells are arranged in parallel, with each cell connected between the source line and the bit line, so the system can access individual memory cells. Disadvantage: the higher signal count increases device size, requires more PCB area, and makes PCB routing more difficult.

Capacity
NAND: 1 Gb to 16 Gb. Because of its higher density, NAND Flash is used mainly for data storage applications.
NOR: 64 Mb to 2 Gb. Because of its lower density, NOR Flash is used for code storage.

Non-volatile
NAND: Yes; extremely high cell density (cells consume less chip area, so more cells fit in the same cross-section).
NOR: Yes; larger chip area per cell.

Interface
NAND: I/O only; requires toggling the CLE and ALE signals.
NOR: Full memory interface / random access.

High-speed access
NAND: Yes, sequential.
NOR: Yes, random.

Performance
NAND: Fast erase (3 ms), fast write, fast read.
NOR: Very slow erase (5 s), slow write, fast read.

Reliability
NAND: Low; requires at least one bit of error management (bit-flipping issue) and bad block management.
NOR: Standard; bit-flipping issues are reported far less often than in NAND.

Erase cycles
NAND: 100,000 - 1,000,000.
NOR: 10,000 - 100,000.

Life span
NAND: Over 10 times more than NOR.
NOR: Less than 10% the life span of NAND.

Ideal usage
NAND: Data storage only, due to complicated flash management; code is usually not stored in raw NAND flash. Examples: PC Cards, CompactFlash, Secure Digital media, MP3 players (data only).
NOR: Code storage; limited capacity due to the price of high-capacity parts; may store limited data as well. Examples: simple home appliances, embedded designs, low-end set-top boxes, low-end mobile handsets, code storage in digital cameras.

TABLE 1: Comparison of NAND Flash and NOR Flash

EEPROM (Electrically Erasable Programmable Read-Only Memory) uses a large chip area. It is more expensive, but it can be erased and programmed electrically, so its flexibility is higher than that of other non-volatile memories (Figure 3). These days flash memory is one of the most widespread, trustworthy, and adaptable non-volatile memories for keeping consistent data values and software code. It can be programmed many times and comes in two memory architectures: NAND Flash and NOR Flash [3]. Because NAND flash chips are not natively byte-addressable and can only be accessed at page granularity, the DRAM present in the SSD controller is utilized as a cache for accessed pages. The significant differences between NOR and NAND are summarized in Table 1; it illustrates why NAND and NAND-based solutions are ideal for high-capacity data storage, while NOR is best used for code storage and execution, typically in smaller capacities.

This project discusses the different features of Flash memories together with the disparities between NOR Flash and NAND Flash. A memory cell is created from a floating-gate transistor; the techniques for reading, programming, and erasing a cell are explained later. In NOR flash, one end of every transistor is connected to the source line and the other end is connected directly to the bit line, so the cells are connected in parallel like the transistors of a NOR gate. In NAND Flash, many memory cells (usually eight or more) are connected in series, like the transistors of a NAND gate. The NOR Flash architecture supplies enough address lines to map the entire memory range. It benefits from random access and short read times, making it ideal for code execution. Furthermore, NOR Flash is shipped with 100% known good bits for the part's life. Its drawbacks include bigger cell size, higher cost per bit, and slower write and erase speeds. On the other side, NAND Flash has a significantly smaller cell size and much higher write and erase speeds than NOR Flash. Its disadvantages include slower read speed and an I/O-mapped (indirect) interface, which is more complicated and does not allow random access. It is important to note that code execution from NAND Flash is achieved by shadowing the contents to a RAM, which is different from code execution directly from NOR Flash. Another major disadvantage is the presence of bad blocks: NAND Flash typically has 98% good bits when shipped, with additional bit failures over the part's life, thus requiring error-correcting code (ECC) functionality within the device.
In both NOR and NAND Flash, the memory is organized into erase blocks. This architecture helps maintain a lower cost while maintaining performance. For example, a smaller block size enables faster erase cycles; the downside of smaller blocks, however, is an increase in die area and memory cost. Because of its lower cost per bit, NAND Flash can more cost-effectively support smaller erase blocks than NOR Flash. Typical block sizes today range from 8 KB to 32 KB for NAND Flash and from 64 KB to 256 KB for NOR Flash. Erase operations in NAND Flash are straightforward, while in NOR Flash each byte needs to be written with '0' before it can be erased. This makes the erase operation for NOR Flash much slower than for NAND Flash.

As mentioned earlier, NOR Flash memory has enough address and data lines to map the entire memory region, similar to how SRAM operates. For example, a 2-Gbit (256 MB) NOR Flash with a 16-bit data bus will have 27 address lines, enabling random read access to any memory location. In NAND Flash, memory is accessed using a multiplexed address and data bus. Typical NAND Flash memories use an 8-bit or 16-bit multiplexed address/data bus with additional signals such as Chip Enable, Write Enable, Read Enable, Address Latch Enable, Command Latch Enable, and Ready/Busy. The NAND Flash needs to be given a command (read, write, or erase), followed by the address and the data. These additional operations make random reads from NAND Flash much slower. For example, the S34ML04G2 NAND Flash requires 30 µs for a random read compared to 120 ns for the S70GL02GT NOR Flash, so the NAND is 250 times slower.
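The two figures above follow directly from the stated geometry and the quoted timings. The short C sketch below merely reproduces that arithmetic; the capacities and latencies are the values quoted in the text, not the result of any additional datasheet lookup.

#include <stdio.h>
#include <stdint.h>

/* Number of address lines needed by a NOR flash of a given capacity and
 * data-bus width: one address selects one bus-wide word.                 */
static unsigned address_lines(uint64_t capacity_bits, unsigned bus_bits)
{
    uint64_t words = capacity_bits / bus_bits;   /* addressable words     */
    unsigned lines = 0;
    while ((1ULL << lines) < words)              /* ceil(log2(words))     */
        lines++;
    return lines;
}

int main(void)
{
    /* 2-Gbit NOR flash with a 16-bit data bus (figures from the text).   */
    printf("NOR address lines: %u\n",
           address_lines(2ULL * 1024 * 1024 * 1024, 16));  /* -> 27      */

    /* Random-read latencies quoted in the text.                          */
    double nand_read_ns = 30000.0;   /* S34ML04G2: about 30 us            */
    double nor_read_ns  = 120.0;     /* S70GL02GT: about 120 ns           */
    printf("NAND/NOR random-read ratio: %.0fx\n",
           nand_read_ns / nor_read_ns);                    /* -> 250     */
    return 0;
}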
To reduce the impact of slower random reads, memory is often read as pages in NAND Flash, with each page being a smaller sub-division of an erase block. The contents of one page are read sequentially, with address and command cycles issued only at the beginning of each read cycle. The sequential access duration for NAND Flash is normally lower than the random-access duration in NOR Flash devices. With the random-access architecture of NOR Flash, the address lines need to be toggled for each read cycle, so the random-access time accumulates over a sequential read. As the size of the data block to be read increases, the accumulated delay in NOR Flash becomes greater than in NAND Flash; NAND Flash can therefore be faster for sequential reads. However, due to the much higher initial read access duration of NAND Flash, the performance difference is evident only when transferring large data blocks, often for sizes above 1 KB.

In both Flash technologies, data can be written to a block only if the block is empty. The already slow erase operation of NOR Flash makes its write operation even slower. In NAND Flash, as with reads, data is usually written or programmed in pages (typically 2 KB). For example, a page write with the S34ML04G2 NAND Flash takes 300 µs. To speed up write operations, modern NOR Flash devices also employ buffer programming, similar to page writes. NOR Flash memories typically require more current than NAND Flash during initial power-on; however, the standby current for NOR Flash is much lower than for NAND Flash.
Instantaneous active power is comparable for both Flash memories, so the active energy is decided by the duration for which the memory is active. NOR Flash holds an advantage for random reads, while NAND Flash consumes comparatively much less power for erase, write, and sequential read operations.

The reliability of saved data is an important aspect of any memory device. Flash memories suffer from a phenomenon called bit-flipping, where some bits can get reversed; this phenomenon is more common in NAND Flash than in NOR Flash. NAND Flash is shipped with bad blocks scattered randomly throughout the device, due to yield considerations, and more memory cells go bad as erase and program cycles accumulate over the life cycle of the NAND Flash. Bad block handling is therefore a mandatory capability for NAND Flash. NOR Flash, on the other hand, is shipped with zero bad blocks and shows very low bad-block accumulation during the life span of the memory. Thus, when it comes to the reliability of stored data, NOR Flash has an advantage over NAND Flash. Another aspect of reliability is data retention, where NOR Flash again holds an advantage: the S70GL02GT NOR Flash offers 20 years of data retention for up to 1K program/erase cycles, while the S34ML04G2 NAND Flash offers a typical data retention of 10 years. The number of program and erase cycles used to be an important characteristic to consider, because NAND Flash memories used to offer 10 times more program and erase cycles than NOR Flash. With today's technological advancements this is no longer true, as the two memories are now comparable; for example, both the S70GL02GT NOR and the S34ML04G2 NAND support 100,000 program-erase cycles. However, due to the smaller block size used in NAND Flash, a smaller area is erased in each operation, which results in a higher overall life span compared to NOR Flash.
CHAPTER 2
FLOATING GATE MOS TRANSISTOR

A Flash cell is essentially a floating-gate MOS transistor (see Fig. 3), that is, a device with a floating gate (FG) fully enveloped by dielectric materials and electrically governed by a capacitively coupled control gate (CG). The FG acts as the storage electrode of the cell: since it is electrically isolated, charge placed in the FG is stored there, enabling adjustment of the cell transistor's "apparent" threshold voltage as viewed from the CG. The quality of the dielectrics provides non-volatility, whereas their thickness permits the cell to be written or erased using electrical pulses.

The gate dielectric, which sits between the transistor channel and the FG, is usually a 9-10 nm oxide called "tunnel oxide" because electrons tunnel through it by Fowler-Nordheim (FN) tunneling. A triple oxide-nitride-oxide (ONO) layer forms the dielectric that separates the FG from the CG; the equivalent oxide thickness of the ONO is in the region of 15-20 nm. The ONO layer is used as the interpoly dielectric in order to preserve tunnel oxide quality: growing a thermal oxide above polysilicon requires temperatures greater than 1100 °C, which would impact the tunnel oxide beneath it, and thin-oxide quality is known to be harmed by high-temperature post-annealing. If the tunnel oxide and the ONO act as ideal dielectrics, the energy band diagram of the FG MOS transistor can be represented graphically as shown in Fig. 4. The FG functions as a potential well for the charge: once charge is in the FG, the tunnel and ONO dielectrics form potential barriers around it. The logical state "1" corresponds to the neutral (or positively charged) state, while the logical state "0" corresponds to the negatively charged state, i.e., electrons stored in the FG.

The name "NOR" Flash refers to the way cells in an array are organized in a NOR-like structure, with rows and columns. The word line (WL) connects flash cells that share the same gate, while the bit line (BL) connects cells that share the same drain electrode (one contact shared by two cells). All of the cells share the source electrode in this arrangement [Fig. 5(a)]. Fig. 5(b) shows a scanning electron microscope (SEM) cross-section along a bit line of a Flash array, with three cells sharing the drain contact and the source line two by two. The cell area is calculated by multiplying the pitch along the word line by the pitch along the bit line. The active-area width and spacing, together with the requirement that the FG overlap the field oxide, determine one pitch; the cell gate length, contact-to-gate distance, half a contact, and half a source line make up the other. Both the contact and the source line are shared between two adjacent cells, as shown in Fig. 5(b).
2.1 BASIC OPERATION OF FLASH MEMORY:

The flash cell operating concepts are covered here in terms of the read, program, and erase operations. The focus of this introduction is on functional behavior; more information on the physics underlying flash cell operations may be found in the literature [10].

2.1.1 Read Operation.

The flash cell's transistor characteristic determines the read parameters. The operating read point is set on the transconductance curve (IDS versus VGS), whose slope is defined as gm. The read accuracy is influenced by the precision of all voltage levels applied to the cell terminals, and the read threshold level is determined by the gate voltage supplied. The cell's Vth is compared to the read reference Vth level during a read operation (see Fig. 6):

 The cell is defined as programmed if the Vth level is higher than the read reference level.

 A cell is defined as erased if its Vth level is lower than the read reference level.

A cell's threshold voltage in a memory array cannot be directly monitored. The read current at the array's
edge is utilized to determine the cell's condition.

 A Vth level higher than the read reference level causes a leakage/subthreshold current for the
defined reference gate voltage, indicating that the cell has been programmed.

 When the Vth level is lower than the read reference level, the cell current is detectable, allowing
the sense amplifier to identify the cell as erased.
Figure 7 displays a read of the cell transistor current IDS (current sensing on the drain side). Sense amplifier circuits monitor the cell current at the memory array's border. The differential sensing approach utilized in other memories can provide rapid sensing [11, 12]. The fundamental cell Vth operating window, a crucial parameter for every flash memory product, is defined by the read operation requirements.
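As a minimal illustration of the decision rule above, the following sketch models the single-level-cell read decision in C: a cell whose threshold voltage lies above the read reference is reported as programmed ("0"), otherwise as erased ("1"). The voltage values are placeholders chosen for illustration, not device parameters.

#include <stdio.h>

/* Read decision for a single-level cell: compare the cell's threshold
 * voltage against the read reference level (see Fig. 6).             */
typedef enum { CELL_ERASED = 1, CELL_PROGRAMMED = 0 } cell_state_t;

static cell_state_t read_cell(double vth_cell, double vth_read_ref)
{
    /* Vth above the reference -> no channel current at the read gate
     * voltage -> the sense amplifier reports a programmed cell ("0"). */
    return (vth_cell > vth_read_ref) ? CELL_PROGRAMMED : CELL_ERASED;
}

int main(void)
{
    const double vread_ref = 2.5;                 /* placeholder level  */
    printf("Vth=1.0V -> %d\n", read_cell(1.0, vread_ref)); /* erased: 1 */
    printf("Vth=4.0V -> %d\n", read_cell(4.0, vread_ref)); /* prog. : 0 */
    return 0;
}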

2.1.2 Program Operation by FN Tunneling Charge Injection

The program operation raises the Vth level of an erased cell to the desired Vth level, referred to as the programmed Vth. The shift is accomplished by introducing charge into the cell's storage element. Tunneling is the most energy-efficient approach to programming a flash cell: a strong electric field across the bottom oxide drives electrons directly into the flash cell's storage element (the floating gate) by Fowler-Nordheim (FN) tunneling [13]. The gate voltage shown in Fig. 8 controls the FN programming; tunneling begins when the electric field is strong enough, and the storage element's voltage VSE follows the gate voltage. The program duration is determined by the gate voltage used and by the thickness of the bottom oxide, also known as the tunnel oxide. The gate must be accessible for FN programming to work; the potential of the other terminals is determined by the cell/array combination chosen.

 To make the barrier thin enough, a strong electric field is necessary (Fig. 8).

 A high voltage is applied to the gate, causing electrons to tunnel into the floating gate through the bottom oxide (also known as the tunnel oxide).

 The programming efficiency is excellent (≈1); all tunneled electrons are injected into the storage element.

 There is no significant current flow during programming.

During the program period, charge is injected into the floating gate, raising its potential VFG (VSE). As charge accumulates, the electric field over the bottom oxide is lowered, and the electron flow is reduced as the field is reduced. This relationship is linear, and the VSE potential changes in lockstep with the applied gate voltage [14]. Tunneling through a barrier is, in quantum mechanics, a slow process, and the low tunneling current depends exponentially on the barrier height raised to the 3/2 power. Because the tunneling process is reversible, the storage element may be erased by applying voltages of the opposite polarity.
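For reference, the dependence just described is captured by the standard Fowler-Nordheim current-density expression (a textbook formula quoted here for context, not taken from this thesis's references):

\[
J_{FN} = A\, E_{ox}^{2} \exp\!\left(-\frac{B}{E_{ox}}\right), \qquad B \propto \Phi_{B}^{3/2},
\]

where E_ox is the electric field across the tunnel oxide, Φ_B is the barrier height, and A and B are material constants; the Φ_B^{3/2} dependence inside B is the "3/2 power" referred to above.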
2.1.3. Erase Operation

The transition from an EPROM (electrically programmable read-only memory) to an EEPROM (electrically erasable programmable read-only memory) substitutes the UV erase of the whole memory with an electrically controlled erase operation of pre-defined erase blocks. The erase operation reverses the threshold voltage shift of the program operation. The tunneling current flows into the substrate, which is shared by all cells in the erase block, and the erase procedure moves many cells, both programmed and erased, below the erase Vth level. Figures 9 and 10 show tunneling methods that may be used to remove data. Diffusion Fowler-Nordheim tunneling (DFN) was utilized to erase all cells in a FLOTOX cell: a positive voltage on the relevant source or drain region and a negative voltage on the control gate create the needed electric field. This voltage must be applied for a specific time interval, in the millisecond range. The erase time is determined by the voltage differential between the gate and the source and by the tunnel oxide thickness.

The ability to shrink the cell size along the scaling roadmap is limited by the considerable overlap between the diffusion and the gate. Negative Gate Channel FN Tunneling (NFN) was therefore devised to erase the cell via the bulk. To ensure electron flow from the floating gate into the bulk area, a negative high voltage must be applied to the chosen cells over a period ranging from hundreds of microseconds to milliseconds. When high electric fields are applied to the tunnel oxide, its quality degrades; tunnel oxide deterioration accelerates with time, resulting in leakage paths that compromise the integrity of the dielectric barrier and, as a result, the flash cell's data retention parameter. Another physical erase concept is used when erasing a flash cell in the band-to-band Hot Hole Injection (HHI) erase mode [16, 17]. Compared to the channel FN tunnel erase, a negative high-voltage pulse of half the value is applied to the gate, and a snap-back condition is imposed on the cell. Impact ionization at the drain area generates many hot holes, which are driven into the floating gate, erasing it quickly as the holes recombine with the stored electrons, as seen in Fig. 11. The conditions required for the hot hole injection mechanism demand a large amount of current flow. "The so-called Gate Induced Drain Leakage (GIDL) current is caused by band-to-band tunneling" [18]. Band-to-band hot hole injection benefits from a lower voltage bias and erases more quickly than erase operations based on Fowler-Nordheim tunneling. The relevance of the erase operation for each flash memory's performance and reliability, and the complexity of the erase algorithm, are examined in the respective chapters.
CHAPTER 3
SSD ARCHITECTURE AND FLASH MEMORY STRUCTURE

3.1 SSD Design


Fig. 12 shows an SSD design consisting of a Processor, RAM, a Host Interface, a Buffer Manager, multiple Flash Controllers and Channels, and Flash-Chips [2]. The Processor manages most SSD activities, including request/response flow and logical-to-physical address mapping. The RAM, which is made up of SRAM and DRAM, supports the processor's functions. The Host Interface establishes the physical link between the host and the SSD, such as PCIe, SATA, SAS, etc. A Flash Controller oversees the activities of the Flash-Chips on its channel and handles commands from the processor [2]; it is also in charge of data transmission between the Flash-Chips and the Buffer Manager. I/O parallelism may be exploited thanks to the multiple channels and flash-chips [21]. The Buffer Manager's primary job is to store instructions and data so that the Host Interface and the Flash Controller can handle requests and responses. Figure 13 depicts a flash-chip in greater detail than Figure 12. Dies, planes, blocks, and pages make up each flash chip [16]. Communication with the Flash Controller takes place through the I/O Control block, the flash array receives commands and addresses from the Control Logic, and read/write operations to the NAND flash array are supported via the data/cache registers.

3.1.1 Read Operation from an SSD


Read set (RS), read from NAND (RN), and read data (RD) are the three steps of an SSD read process, as depicted in Figure 14. During the RS phase, the read instruction and the associated address are transmitted to the Buffer Manager through the Host Interface. In the RN phase, the Buffer Manager transmits a command/address to the Flash Controller, which subsequently passes the command to the associated flash-chip (i.e., setup). During this time, the flash-chip is busy reading data from the NAND flash array into the Data/Cache register. After that, the flash-chip sends the data to the Flash Controller, which forwards it to the Buffer Manager (i.e., data transfer). During the RD phase, the host system uses direct memory access (DMA) to receive the data from the Buffer Manager via the Host Interface [11, 13].
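The read flow can be summarized with a small sketch. The data structures below (nand_array, cache_register, buffer_manager) are illustrative stand-ins for the blocks of Figs. 12 and 13, assumed for this example rather than taken from the thesis's controller design.

#include <stdint.h>
#include <string.h>
#include <stdio.h>

#define PAGE_SIZE 2048          /* data bytes per page (Sec. 3.2)          */
#define PAGES_PER_CHIP 16       /* tiny array, illustration only           */

/* Simplified models of the blocks in Figs. 12 and 13 (names are assumed). */
static uint8_t nand_array[PAGES_PER_CHIP][PAGE_SIZE]; /* NAND flash array  */
static uint8_t cache_register[PAGE_SIZE];             /* data/cache reg.   */
static uint8_t buffer_manager[PAGE_SIZE];             /* Buffer Manager    */

static void ssd_read(uint32_t page, uint8_t *host_buf)
{
    /* RS: the read command and address arrive at the Buffer Manager
     *     through the Host Interface (modeled implicitly by this call).   */

    /* RN: the flash-chip is busy moving the page from the NAND array into
     *     its data/cache register, then the data is transferred to the
     *     Buffer Manager via the Flash Controller.                        */
    memcpy(cache_register, nand_array[page], PAGE_SIZE);
    memcpy(buffer_manager, cache_register, PAGE_SIZE);

    /* RD: the host receives the data from the Buffer Manager by DMA.      */
    memcpy(host_buf, buffer_manager, PAGE_SIZE);
}

int main(void)
{
    uint8_t host_buf[PAGE_SIZE];
    nand_array[5][0] = 0xAB;            /* pretend page 5 holds some data  */
    ssd_read(5, host_buf);
    printf("first byte of page 5: 0x%02X\n", host_buf[0]);
    return 0;
}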

3.1.2 Write procedure on an SSD

Write set (WS), write data (WD), and write to NAND (WN) are the three steps of an SSD write process, as depicted in Figure 15. During the WS phase, the host system transmits a program instruction and the associated address to the Buffer Manager through the Host Interface. During the WD phase, the host system then transmits the data. During the WN phase, the Buffer Manager issues a program command/address, and the Flash Controller transfers the command to the associated flash-chip (setup). Following that, the data is sent from the Buffer Manager to the Flash Controller (data transfer). The associated flash-chip then writes the data to its NAND flash array, with the flash-chip's status being busy during the write process. When the write process is completed, the flash-chip transmits a status signal to indicate that the operation is complete.
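The write flow mirrors the read flow; a sketch under the same illustrative assumptions (the arrays stand in for the Buffer Manager, the chip's cache register, and the NAND array) is given below.

#include <stdint.h>
#include <string.h>
#include <stdio.h>

#define PAGE_SIZE 2048
#define PAGES_PER_CHIP 16

/* Same illustrative models as in the read sketch above. */
static uint8_t nand_array[PAGES_PER_CHIP][PAGE_SIZE];
static uint8_t cache_register[PAGE_SIZE];
static uint8_t buffer_manager[PAGE_SIZE];

static int ssd_write(uint32_t page, const uint8_t *host_buf)
{
    /* WS: program command and address are sent to the Buffer Manager
     *     through the Host Interface.                                     */

    /* WD: the host transfers the data into the Buffer Manager.            */
    memcpy(buffer_manager, host_buf, PAGE_SIZE);

    /* WN: the Buffer Manager issues the program command; the Flash
     *     Controller moves the data into the chip's cache register, and
     *     the flash-chip programs its NAND array while reporting busy.    */
    memcpy(cache_register, buffer_manager, PAGE_SIZE);
    memcpy(nand_array[page], cache_register, PAGE_SIZE);

    return 0;   /* status signal: operation complete                       */
}

int main(void)
{
    uint8_t data[PAGE_SIZE] = { 0xCD };
    ssd_write(7, data);
    printf("first byte of page 7: 0x%02X\n", nand_array[7][0]);
    return 0;
}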
3.2 NAND Overview

Because of their tiny cell size (i.e., low cost per bit), NAND flash memories are commonly employed in SSDs. As a result, this work focuses on NAND flash memory. Pages, blocks, planes, and dies make up a NAND memory chip.

A page comprises many flash cells that share the same word line and is usually 512 B to 16 KB in size. A few bytes of management information, such as ECC and mapping-table data, are stored in a spare region of each page to facilitate FTL operations [22]. Due to parallelism, the performance of read and program operations improves as the page size grows; nevertheless, RC latency and lithography restrict this improvement. In contrast to read and program operations, erase operations are executed at a block granularity of 32, 64, 128, or 256 pages. Because more cells share the select gates and contact area, a larger block size improves array efficiency; however, it also increases write amplification, i.e., the number of valid data pages that must be moved during garbage collection (see Section 6) [23]. A plane comprises thousands of blocks (1024 or 2048). Finally, a die comprises two or four planes, and each flash-chip has two or four dies. Figure 13 depicts an example NAND flash-chip, a Micron 8 Gb flash-chip with two dies. The number of flash chips determines the capacity of an SSD, hence capacity = 8 Gb x n flash-chips. A flash chip comprises two dies, each of which has 4 Gb of flash memory. Each die is made up of two planes, each with 2048 blocks. Each block comprises 64 pages, each 2 KB in size [16]. As a result, the relationships for the various granularity levels are as follows (a short sketch verifying this arithmetic is given after the list):
 1 page = 2048 + 64 (spare area) bytes

 1 block = (2048 + 64) bytes x 64 pages = 132 KB

 1 plane = 132 KB x 2048 blocks = 2112 Mb

 1 die = 2112 Mb x 2 planes = 4224 Mb

 1 flash-chip = 4224 Mb x 2 dies = 8448 Mb

 1 SSD device = 8 Gb x n flash-chip
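As a minimal check of the arithmetic above (assuming the 2 KB + 64 B page, 64-page block, 2048-block plane, 2-plane die, 2-die chip organization quoted from [16]), the following sketch recomputes each granularity level:

#include <stdio.h>
#include <stdint.h>

int main(void)
{
    /* Geometry of the example 8 Gb flash-chip described in the text [16]. */
    const uint64_t page_bytes  = 2048 + 64;           /* data + spare area  */
    const uint64_t block_bytes = page_bytes * 64;      /* 64 pages/block     */
    const uint64_t plane_bytes = block_bytes * 2048;   /* 2048 blocks/plane  */
    const uint64_t die_bytes   = plane_bytes * 2;      /* 2 planes/die       */
    const uint64_t chip_bytes  = die_bytes * 2;        /* 2 dies/chip        */

    printf("page : %llu B\n",  (unsigned long long)page_bytes);
    printf("block: %llu KB\n", (unsigned long long)(block_bytes / 1024));
    printf("plane: %llu Mb\n", (unsigned long long)(plane_bytes * 8 / (1024 * 1024)));
    printf("die  : %llu Mb\n", (unsigned long long)(die_bytes  * 8 / (1024 * 1024)));
    printf("chip : %llu Mb\n", (unsigned long long)(chip_bytes * 8 / (1024 * 1024)));
    /* Expected: 2112 B, 132 KB, 2112 Mb, 4224 Mb, 8448 Mb (about 8 Gb).    */
    return 0;
}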

Figure 16 [13] depicts a NAND flash array, which resembles one of the planes in Fig. 13. A NAND string is the set of cells sharing the same bit line (a vertical line in Fig. 16), with all transistors linked in series through their source and drain. In contrast, a NAND page is the set of cells sharing the same word line (a horizontal line in Fig. 16), with all the transistors connected through their control gates. Because every cell of a NAND page shares the same control gate, read and program operations are done at the page level. Individual cells on a page are programmed using the program and program-inhibit states depicted in Fig. 17. When data "0" must be programmed into the cell at word line (WL) 2 and bit line (BL) 1, for example (WL2, BL1), a high voltage is applied to WL2 and BL1 is grounded. The voltage differential between the control gate and the substrate at the cell (WL2, BL1) is then sufficient to trigger FN tunneling. Meanwhile, the cell (WL2, BL2) is not a target for programming data "0," but it shares the same control gate (WL2) as the cell (WL2, BL1); as a result, the program-inhibit state must be activated

to keep this cell from being programmed. This is accomplished by applying a voltage to BL2 (e.g., 8 V) that does not promote FN tunneling, as illustrated in Fig. 17. As a result, data "0" will be programmed into the cell (WL2, BL1), whereas data "1" will stay in the cell (WL2, BL2).
As previously stated, NAND memory read operations are done at page granularity. In SLC, Vread is applied to the gates of the page being read, while the gates of the other pages are driven to Vpass (typically 5 V) so that they operate as pass transistors. Then, depending on whether or not current flows through the string, the stored data may be determined. A read operation of a NAND memory array is shown in Fig. 18. Assume the following data is stored in the flash memory array: (WL1, BL1) = 1, (WL1, BL2) = 0, (WL1, BL3) = 1, (WL1, BL4) = 0, (WL2, BL1) = 1, (WL2, BL2) = 1, (WL2, BL3) = 0, (WL2, BL4) = 0. Vread (2.5 V) is applied to WL2, and Vpass (5 V) is applied to all other WLs to read the target page on WL2 in Fig. 18. Because Vread is greater than their Vt, the cells (WL2, BL1) and (WL2, BL2) enable current flow. Cells (WL2, BL3) and (WL2, BL4), on the other hand, block the current, since Vread is less than the Vt of these cells (which carry data "0"). When sensing the target WL, the data on the other pages (i.e., other WLs) is unaffected, since Vpass is greater than the Vt of either data state, as shown in Fig. 18.
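A small sketch can mimic this page-read scheme. The array contents and voltage levels below are the illustrative values from Fig. 18; a string conducts only if every cell on it turns on, i.e., if the applied word-line voltage exceeds that cell's threshold voltage.

#include <stdio.h>

#define NUM_WL 2
#define NUM_BL 4

/* Threshold voltages: erased ("1") cells have a low Vt, programmed ("0")
 * cells a high Vt.  Values are illustrative, matching Fig. 18's example.  */
static const double VT_ERASED = 0.5, VT_PROGRAMMED = 3.5;
static const double VREAD = 2.5, VPASS = 5.0;

int main(void)
{
    /* Stored data from the example: row = WL, column = BL. */
    const int data[NUM_WL][NUM_BL] = {
        { 1, 0, 1, 0 },   /* WL1 */
        { 1, 1, 0, 0 },   /* WL2 */
    };
    const int target_wl = 1;  /* read the page on WL2 (index 1) */

    for (int bl = 0; bl < NUM_BL; bl++) {
        int string_conducts = 1;
        for (int wl = 0; wl < NUM_WL; wl++) {
            double vt = data[wl][bl] ? VT_ERASED : VT_PROGRAMMED;
            double v_applied = (wl == target_wl) ? VREAD : VPASS;
            if (v_applied <= vt)          /* this cell blocks the string */
                string_conducts = 0;
        }
        /* Current flow means the selected cell is erased ("1"). */
        printf("BL%d: read %d (current %s)\n", bl + 1,
               string_conducts, string_conducts ? "flows" : "blocked");
    }
    return 0;
}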
CHAPTER 4
FLASH TRANSLATION LAYER [FTL]

Integrating flash memory into a computer system requires managing various restrictions: erase-before-write, the difference in granularity between read/write and erase operations, and the limited number of write/erase cycles that a memory cell can endure. Due to these limits, appropriate management procedures must be implemented. These can be implemented wholly in software on the host system via specialized file systems, or in hardware, that is, incorporated into the flash memory controller on the device itself. In the latter case, this layer is referred to as an FTL (Flash Translation Layer). The following are the main characteristics of FTLs discussed in this chapter:

 Logical-to-physical mapping: Because of the erase-before-write requirement, changing data necessitates rewriting it onto another page. As a result, it becomes vital to keep track of the data's position. This is accomplished through the use of mapping schemes.

 Garbage collection: Changing data entails two main steps: (1) writing the data to a new page, and (2) invalidating the original copy of the data. Because of the many changes that may occur, invalid pages tend to accumulate over time. A garbage collector examines the flash memory and recycles whole blocks in order to reclaim free space for future writes.

 Wear-leveling algorithms: A flash memory cell can only undergo a limited number of erase/write cycles, and I/O application workloads often exhibit temporal and spatial locality that concentrates writes on a few blocks. It is therefore necessary to provide a mechanism that wears out all blocks of the flash memory uniformly, to keep the entire flash space functional for as long as possible.

4.1 Basic mapping technique:

The mapping procedure entails converting a logical address from the host (the application or, more precisely, the device driver at the system level) into a physical address. A mapping table, controlled by the FTL layer, is used for this translation. The basic mapping techniques differ in the granularity with which the mapping is accomplished. These schemes may be divided into page-level mapping, block-level mapping, and hybrid mapping (combined page- and block-level). Each is described below in order of increasing sophistication; hybrid mapping combines the page and block mapping techniques and inherits properties of both.

4.1.1 Page level mapping:

In page mapping, NAND flash memory is managed at page level [Chiang et al. 1999]. A page map table is created and maintained in both the NAND flash memory and RAM. A map table entry consists of an LPN (logical page number) and a PPN (physical page number). When a write request is issued to a logical page address, the page map table finds the matching physical page number. If the page found already holds data, that page is invalidated and the new data is written to an available free page. When a write request to logical page address 5 is sent to the FTL (see Figure 19), the FTL looks up the appropriate physical page number using the logical page number as an index into the page map table.
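A minimal sketch of this page-level scheme is shown below (data structures and names such as page_map and next_free_page are assumptions made for illustration, not the thesis's controller implementation):

#include <stdint.h>
#include <stdio.h>

#define NUM_PAGES 64
#define INVALID   0xFFFFFFFFu

static uint32_t page_map[NUM_PAGES];     /* LPN -> PPN mapping table        */
static uint8_t  page_valid[NUM_PAGES];   /* validity flag per physical page */
static uint32_t next_free_page = 0;      /* naive free-page allocator       */

/* Write to a logical page: invalidate the old copy (if any) and place the
 * data in a free physical page, updating the map table.                   */
static void ftl_page_write(uint32_t lpn)
{
    if (page_map[lpn] != INVALID)
        page_valid[page_map[lpn]] = 0;   /* old physical page -> invalid    */

    uint32_t ppn = next_free_page++;     /* pick a free page (no GC here)   */
    page_valid[ppn] = 1;
    page_map[lpn] = ppn;                 /* remember the new location       */
    printf("LPN %u -> PPN %u\n", lpn, ppn);
}

int main(void)
{
    for (int i = 0; i < NUM_PAGES; i++) page_map[i] = INVALID;

    ftl_page_write(5);   /* first write of logical page 5                   */
    ftl_page_write(5);   /* overwrite: old PPN invalidated, new PPN used    */
    return 0;
}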

4.1.2 Block level mapping:

Block mapping can be used to minimize the size of the map table. In block mapping, the logical page address is separated into a logical block number and a page offset [Ban 1995]. The logical block number is used to find the physical block that contains the page, and the page offset identifies the page within that block. Because the map table is made up of block-number entries, its size may be decreased from 1 MB to 16 KB [Samsung Electronics 2005]. As a consequence of its economical memory usage, this approach may be implemented in many embedded devices. However, because the page offset is derived from the host's logical page address, the page offsets within the physical and logical blocks must be equal.

When a write request to logical page address 5 is sent to the FTL (see Figure 20), the logical page address is separated into logical block number 1 and page offset 1. First, the physical block number for logical block number 1 is determined. Once the appropriate physical block number 0 is matched, the logical page offset is appended to it, and the incoming data is then written to physical page number 1. In this situation, however, data has already been written to physical page number 1; as a result, the data must be written to a free block (physical block number 2) instead. In this method, one logical block is associated with exactly one physical block; therefore, all the other pages of the block that contains physical page number 1 are copied to the same free block.
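The address split described above can be sketched as follows (block_map, PAGES_PER_BLOCK, and the example numbers are illustrative assumptions in the spirit of Figure 20):

#include <stdint.h>
#include <stdio.h>

#define PAGES_PER_BLOCK 4         /* small value, matching the example      */
#define NUM_BLOCKS      8

static uint32_t block_map[NUM_BLOCKS];   /* logical block -> physical block */

/* Translate a logical page address into (physical block, page offset).    */
static void ftl_block_translate(uint32_t lpa,
                                uint32_t *phys_block, uint32_t *offset)
{
    uint32_t logical_block = lpa / PAGES_PER_BLOCK;
    *offset     = lpa % PAGES_PER_BLOCK;   /* same offset in both blocks    */
    *phys_block = block_map[logical_block];
}

int main(void)
{
    block_map[1] = 0;  /* logical block 1 currently maps to physical block 0 */

    uint32_t pb, off;
    ftl_block_translate(5, &pb, &off);     /* LPA 5 -> block 1, offset 1     */
    printf("LPA 5 -> physical block %u, page offset %u\n", pb, off);

    /* If physical page (pb, off) already holds data, the whole logical
     * block must be remapped to a free block (e.g., physical block 2) and
     * the remaining valid pages copied over, as described in the text.     */
    return 0;
}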

4.1.3 Hybrid level mapping


Many hybrid mapping schemes have been developed as a compromise between page mapping and block mapping, to minimize both the mapping table size and the block copy cost. Kim et al. [2002] were the first to introduce a hybrid mapping scheme known as the log block scheme. The fundamental idea of the log scheme is to reserve a limited number of log blocks in flash memory to act as write buffer blocks for overwrite operations (a write buffer block is referred to as a log block here because their functions are the same). As long as empty pages remain in the log blocks, incoming data can keep being appended using the log block technique. When the same logical page is overwritten, the incoming data is written to a free page and the previous copy becomes invalid. Suppose write requests to logical page numbers (5, 7, 7, 5) are sent to the allocated log block, as shown in Figure 21, where physical block number 0 holds the data of logical page numbers (4, 5, 6, 7). The last two writes overwrite the previous two; as a result, for logical block number 1, only the most recent requests to logical page numbers (7, 5) are valid.

These requests are depicted in the diagram as (5, 7). The log block and the accompanying data block are merged into a free block when the log block has no more free pages or when the logical block that holds the requested page differs from the preceding logical block, as illustrated in Figure 21. Ultimately, the merged free block becomes the new data block, while the old data block and the log block become two free blocks. The free block map table is not shown in the diagram, for clarity.
