
Software Spotlight

ECC Options for Improving NAND Device Reliability

ECC Options for Improving NAND Flash Memory Reliability


A Micron Software Engineer Highlights Hardware and Software ECC Solutions

By Marina Mariano, Micron Software Engineer

NAND Flash memory devices cover a wide range of applications that go beyond traditional storage in USB flash devices, memory cards, and solid state drives. NAND is increasingly used in server solutions, networking products, embedded systems, and so forth. These segments require a high level of reliability, which is at odds with process technology shrinks, a potential source of data corruption in memory storage.

In addition to increasing densities and decreasing device sizes, Micron creates products that are more reliable than traditional NAND devices, thanks to a wide range of solutions that can be adapted to a variety of markets. The needs of embedded systems, in conjunction with our standards for excellence, have made it possible to design products aimed at a stable level of performance and reliability throughout their lifespan.

To achieve the challenging level of reliability required by markets like embedded devices and mobile phones, Micron protects data integrity by adopting error correction code (ECC) in a broad variety of memory solutions. ECC was initially used in RAM, but now adds reliability to NAND devices. This article discusses some of the ECC solutions available.

Using ECC to Improve NAND Data Integrity


In this article, we discuss NAND Flash memory devices that use one of two memory cell technologies. The first is single-level cell (SLC) NAND, in which each cell stores one bit, which can be a 0 or 1. The second is multilevel cell (MLC) NAND, which stores two bits in each cell (or more in some cases). The threshold voltage determines the content of each cell, representing one of four possible values for the two bits. SLC NAND provides better data endurance than MLC NAND of the same density because of the accuracy required to distinguish the four threshold levels. Both SLC and MLC NAND require an ECC algorithm to ensure data integrity. However, the data in MLC NAND needs a significantly more sophisticated correction scheme to obtain the same level of integrity.

PDF: 09005aef848aca27/Source: 09005aef848aca46 Rev. C, 01/12 2012 Micron Technology, Inc. Micron and the Micron logo are trademarks of Micron Technology, Inc. All other trademarks are the property of their respective owners. All information is provided on an AS IS basis, without warranties of any kind.


Figure 1. ECC Bit Correction Requirements Trend for SLC and MLC NAND

ECC Algorithms for NAND


In general, ECC algorithms provide both error detection and error correction. Error detection techniques reveal that an error has occurred, while error correction makes it possible to reconstruct the original data. The general idea behind detection and correction codes is to add redundancy to the data message so that the algorithm can check whether the stored data is correct. NAND devices provide an additional storage area (called the spare area) for adding further information to the contents of a page. Typically, management data such as the ECC redundancy code is stored in the spare area.

Introducing redundancy means setting the distance between valid words of the message, which is the significant part of the user data. For example, a minimum distance of 4 bits ensures that a 1-bit error and a 2-bit error can be distinguished from each other, so that a 1-bit error can be corrected and a 2-bit error can be detected.

Among the ECC code types, linear codes are the most common because of their algebraic structure. They are easier to describe, encode, and decode than nonlinear codes. The most widely adopted code for 1-bit ECC is the Hamming code, which detects and corrects data corruption during operation on SLC NAND devices with geometries of 70nm and above. The Hamming algorithm is an industry-accepted method for error detection and correction in many NAND-based applications. It can detect 2-bit errors and correct 1-bit errors.

As previously stated, MLC NAND and SLC NAND built on smaller geometries require a more complex correction code. One family of cyclic codes is known as BCH codes. The binary BCH codes were discovered around 1960 by Hocquenghem and independently by Bose and Ray-Chaudhuri, whose names make up the algorithm's acronym. Around the time BCH codes first appeared in literature, Reed and Solomon published a work describing a subclass of these codes. Because of their error-correction capabilities, Reed-Solomon codes can be used to improve the general reliability of MLC NAND, compact discs, and other data storage equipment.

There are several algorithms for calculating 4-bit (or more) ECC. BCH is popular because it is computationally more efficient than Reed-Solomon codes. The 4-bit BCH code can correct 4-bit errors and detect 5-bit errors. Typically, ECC is calculated on a chunk of 512 bytes. As long as the errors are confined to four or fewer bits within the 512 bytes, the algorithm can correct the errors and reconstruct the data contents.
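To make the "correct 1 bit, detect 2 bits" behavior concrete, here is a minimal Hamming-style sketch in C. It is not the code from Micron's library: the type and function names (sketch_ecc_t, sketch_ecc_compute, sketch_ecc_correct) are hypothetical, and the scheme is a simplified position-parity variant that assumes the stored check bits themselves are error-free. For a 512-byte chunk it keeps the XOR of the positions of all 1 bits (12 bits) plus one overall parity bit.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SECTOR_BYTES 512u  /* chunk size assumed from the text above */

/* Hypothetical check-bit container for this sketch only. */
typedef struct {
    uint16_t pos_xor;  /* XOR of the bit positions of all 1 bits (12 bits used) */
    uint8_t  parity;   /* overall parity of the data (1 bit used) */
} sketch_ecc_t;

/* Compute the check bits for up to 512 bytes of data. */
static sketch_ecc_t sketch_ecc_compute(const uint8_t *data, size_t len)
{
    sketch_ecc_t e = { 0, 0 };
    assert(len <= SECTOR_BYTES);
    for (size_t i = 0; i < len; i++) {
        for (unsigned b = 0; b < 8; b++) {
            if (data[i] & (1u << b)) {
                e.pos_xor ^= (uint16_t)(i * 8 + b);
                e.parity ^= 1u;
            }
        }
    }
    return e;
}

/* Compare freshly computed check bits with the stored ones.
   Returns 0 if the data is clean, 1 if a single bit was corrected in place,
   and -1 if an uncorrectable (for example 2-bit) error was detected. */
static int sketch_ecc_correct(uint8_t *data, size_t len, sketch_ecc_t stored)
{
    sketch_ecc_t now = sketch_ecc_compute(data, len);
    uint16_t syndrome = now.pos_xor ^ stored.pos_xor;
    uint8_t parity_diff = now.parity ^ stored.parity;

    if (syndrome == 0 && parity_diff == 0)
        return 0;
    /* An odd number of flips: treat it as one flip at position 'syndrome'. */
    if (parity_diff == 1 && syndrome < len * 8) {
        data[syndrome / 8] ^= (uint8_t)(1u << (syndrome % 8));
        return 1;
    }
    /* Parity unchanged but positions differ: an even number of flips. */
    return -1;
}
```

A single flipped bit changes the overall parity and shifts the position XOR by exactly that bit's index, so it can be located and flipped back. Two flipped bits leave the parity unchanged while the position XOR becomes nonzero, which is why the error is detected but not corrected. Three or more flips can be miscorrected, which is the usual limit of a 1-bit scheme.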
Table 1. Bits Required for Each ECC Algorithm in 2KB NAND

Hamming
Bit Error    Overhead per Sector    Spare Area Usage (1)
             Bits     Bytes         64 Bytes    112 Bytes
1-bit        13       2             13%         7%

Reed-Solomon
Bit Error    Overhead per Sector    Spare Area Usage (1)
             Bits     Bytes         64 Bytes    112 Bytes
1-bit        18       3             19%         11%
2-bit        36       5             31%         18%
4-bit        72       9             56%         32%
8-bit        144      18            113%        64%
10-bit       180      23            144%        82%
14-bit       252      32            200%        114%

Binary BCH
Bit Error    Overhead per Sector    Spare Area Usage (1)
             Bits     Bytes         64 Bytes    112 Bytes
1-bit        13       2             13%         7%
2-bit        26       4             25%         14%
4-bit        52       7             44%         25%
8-bit        104      13            81%         46%
10-bit       130      17            106%        61%
14-bit       182      23            144%        82%
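The figures in Table 1 can be reproduced with simple arithmetic, sketched below in C. The function and type names (ecc_overhead, ecc_overhead_t) are hypothetical. The sketch assumes a 2KB page split into four 512-byte sectors, takes the cost per correctable bit from the first row of each algorithm in the table (13 bits for Hamming and binary BCH, 18 bits for Reed-Solomon), rounds bits up to whole bytes, and rounds the percentage to the nearest integer.

```c
#include <assert.h>

/* Hypothetical result type for this sketch. */
typedef struct {
    unsigned bits;       /* ECC bits per 512-byte sector */
    unsigned bytes;      /* ECC bytes per sector, rounded up */
    unsigned usage_pct;  /* share of the spare area used by ECC, in percent */
} ecc_overhead_t;

/* bits_per_error: ECC bits needed per correctable bit (13 or 18 in Table 1)
   t:              number of correctable bits per sector
   sectors:        sectors per page (4 for a 2KB page with 512-byte sectors)
   spare_bytes:    spare area size per page (64 or 112 in Table 1) */
static ecc_overhead_t ecc_overhead(unsigned bits_per_error, unsigned t,
                                   unsigned sectors, unsigned spare_bytes)
{
    ecc_overhead_t o;
    unsigned total_bytes;

    assert(spare_bytes > 0);
    o.bits = bits_per_error * t;
    o.bytes = (o.bits + 7) / 8;                 /* round up to whole bytes */
    total_bytes = o.bytes * sectors;
    /* percentage rounded to the nearest integer (half rounds up) */
    o.usage_pct = (100 * total_bytes + spare_bytes / 2) / spare_bytes;
    return o;
}
```

For example, ecc_overhead(13, 4, 4, 64) describes 4-bit BCH on a page with a 64-byte spare area: 52 bits, 7 bytes per sector, 28 bytes per page, or 44% of the spare area. A result above 100% means the ECC bytes do not fit in the spare area.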

Options for Implementing ECC


In the following sections, we describe some options available for implementing ECC, which include:
- Hardware ECC
- Software ECC
- On-die ECC for Micron NAND Flash memory devices
- Micron ClearNAND Flash

Hardware ECC

A small number of system-on-chip (SOC) solutions have a built-in controller for 4-bit ECC. For example, the following ARM processors have built-in ECC that is sufficient to support NAND:
- NXP LPC31XX series based on ARM9 has a built-in ECC controller
- OMAPL13x/C674x have a built-in ECC engine for 1-bit and 4-bit ECC

Note 1 (Table 1): Spare area usage values less than 100% indicate that errors can be corrected. Otherwise, there is not enough room in the spare area to store ECC information.


The OMAP35x, AM35x, and AM/DM37x devices do not support 4-bit or 8-bit correction in hardware. However, they do support 1-bit, 4-bit (excluding the OMAP35x), and 8-bit hardware detection. It is important to note that using NAND that requires 8-bit ECC with 4-bit hardware ECC could have negative consequences, including boot failure.

Software ECC

In the absence of hardware dedicated to calculating ECC, we have implemented a Hamming code for 1-bit ECC and a multibit ECC encoder/decoder for NAND in software. The library we provide to implement BCH ECC calculates the ECC on 512 bytes for NAND sectors that consist of a 512-byte data area and a 16-byte spare area, totaling 528 bytes per sector. A binary N-error-correcting BCH code is implemented to ensure data integrity (N represents the number of bits that can be corrected). Inputs to and outputs from the encoder/decoder are in byte format. Data is read out in two hex words per byte and the information length of the BCH code must be a multiple of four.

Even though the software implementation of the BCH algorithm is optimized, 4-bit BCH still requires many microprocessor cycles to perform the computation. ECC correction can be implemented in software for errors that are 2 bits and greater, and optimized software can have good performance. However, the algorithm has a high computational complexity and requires a large amount of RAM, which should be taken into account when the code is implemented on embedded systems with limited resources.

On-Die ECC for Micron NAND Devices

Our MT29FxGxxAxxDAxx NAND is an example of a Micron device that provides built-in ECC, which is tailored to ensure the highest level of reliability for each device. This device includes a built-in 4-bit ECC internal controller.

Although a hardware controller offers better READ speed than an on-die ECC implementation, on-die ECC ensures smoother integration, and it is nearly effortless for the user to replace existing memory devices. This means that MT29FxGxxAxxDAxx devices can replace memory devices with 1-bit hardware support or no hardware ECC support with only minor changes to the system. Enabling internal on-die ECC ensures a high level of endurance for all types of Micron NAND family devices, providing the user with a reliable memory solution.

Internal ECC enables 5-bit error detection and ensures 4-bit error correction. The on-die ECC implements a BCH error correction code, and as a result, the number of partial page programs cannot exceed four. Otherwise, the algorithm does not calculate the correction code properly. During the busy time of the PROGRAM operation, the internal ECC generates and stores the parity bits. Then, during a READ operation, the on-die controller checks for the presence of errors. When more than four bit errors occur, which cannot be corrected, the engine signals the event by setting the least significant bit of the status register. Results are unpredictable for errors that are 6 bits and greater.

Table 2 illustrates the usage of the NAND spare area, with the limitation that a small part of the spare area is not protected by ECC and the ECC user area must be included in the writing of four partial pages.


Table 2. NAND Spare Area Usage

Max Byte Address    Min Byte Address    ECC Protected    Area       Description
1FFh                000h                Yes              Main 0     User Data
3FFh                200h                Yes              Main 1     User Data
5FFh                400h                Yes              Main 2     User Data
7FFh                600h                Yes              Main 3     User Data
801h                800h                No               -          Reserved
803h                802h                No               -          User Metadata II
807h                804h                Yes              Spare 0    User Metadata I
80Fh                808h                Yes              Spare 0    ECC for Main/Spare 0
811h                810h                No               -          Reserved
813h                812h                No               -          User Metadata II
817h                814h                Yes              Spare 1    User Metadata I
81Fh                818h                Yes              Spare 1    ECC for Main/Spare 1
821h                820h                No               -          Reserved
823h                822h                No               -          User Metadata II
827h                824h                Yes              Spare 2    User Metadata I
82Fh                828h                Yes              Spare 2    ECC for Main/Spare 2
831h                830h                No               -          Reserved
833h                832h                No               -          User Metadata II
837h                834h                Yes              Spare 3    User Metadata I
83Fh                838h                Yes              Spare 3    ECC for Main/Spare 3

Bad Block Info: 2 bytes; ECC Parity: 8 bytes; User Data (Metadata): 6 bytes
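The address map in Table 2 follows a regular pattern, which the C sketch below captures. The function name nand_offset_ecc_protected and the constants are hypothetical, and the sketch assumes exactly the layout in the table: a 2048-byte main area followed by four 16-byte spare chunks, where the first 4 bytes of each chunk (Reserved and User Metadata II) are not protected and the remaining 12 bytes (User Metadata I and the ECC bytes) are.

```c
#include <stdint.h>

enum {
    MAIN_AREA_BYTES  = 0x800,  /* 000h-7FFh: Main 0 to Main 3 */
    SPARE_AREA_BYTES = 0x40,   /* 800h-83Fh: four 16-byte spare chunks */
    SPARE_CHUNK      = 16,     /* one chunk per 512-byte main sector */
    UNPROTECTED_HEAD = 4       /* Reserved (2) + User Metadata II (2) */
};

/* Returns 1 if the byte at 'offset' within the page is covered by on-die
   ECC according to Table 2, 0 if it is not, and -1 if the offset lies
   outside the 2112-byte page. */
static int nand_offset_ecc_protected(uint32_t offset)
{
    if (offset < MAIN_AREA_BYTES)
        return 1;
    if (offset >= MAIN_AREA_BYTES + SPARE_AREA_BYTES)
        return -1;
    /* position inside the 16-byte spare chunk decides protection */
    return ((offset - MAIN_AREA_BYTES) % SPARE_CHUNK) >= UNPROTECTED_HEAD ? 1 : 0;
}
```

A driver could use a check like this to keep metadata that must survive bit errors in the User Metadata I bytes (for example 804h to 807h) and out of the unprotected bytes at the start of each spare chunk.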

Code Example

Enabling on-die ECC requires only minor modifications to existing projects. Only the following steps are required to enable on-die ECC:

1. Enable the internal ECC controller.

/*******************************************************************
NAND_EnableECC
Function:     void NAND_EnableECC(void)
Arguments:    na
Return Value: na
Description:  Enable internal ECC feature
*******************************************************************/
void NAND_EnableECC(void)
{
    /* sending SET features command */
    NAND_CommandInput((NMX_uint8) 0xEF);
    /* sending feature address */
    NAND_AddressInput((NMX_uint8) 0x90);
    /* sending data */
    NAND_DataInput((NMX_uint8) 0x08);
    NAND_DataInput((NMX_uint8) 0x00);
    NAND_DataInput((NMX_uint8) 0x00);
    NAND_DataInput((NMX_uint8) 0x00);
    return;
}

2. Check the ECC enable bit by reading the feature back.

/*******************************************************************
NAND_VerifyECC
Function:     NAND_Ret NAND_VerifyECC(void)
Arguments:    na
Return Value: NAND_ECC_ENABLED   internal ECC is enabled
              NAND_ECC_DISABLED  internal ECC is disabled
Description:  Return the state of internal ECC feature
*******************************************************************/
NAND_Ret NAND_VerifyECC(void)
{
    NMX_uint8 buffer[] = { 0x00, 0x00, 0x00, 0x00 };
    NMX_uint8 i;

    /* sending GET features command */
    NAND_CommandInput((NMX_uint8) 0xEE);
    /* sending feature address */
    NAND_AddressInput((NMX_uint8) 0x90);
    /* reading the four feature parameter bytes */
    for (i = 0; i < 4; i++)
        buffer[i] = NAND_DataOutput();
    /* test first byte: bit 3 is the value written by NAND_EnableECC */
    if (buffer[0] & 0x08)
        return NAND_ECC_ENABLED;
    else
        return NAND_ECC_DISABLED;
}
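Once internal ECC is enabled, the on-die ECC section says that an uncorrectable error is reported through the least significant bit of the status register. The following C sketch shows one way a driver might test that bit after a READ. It is an illustration under assumptions, not part of the Micron driver: sim_command_input and sim_data_output are stand-ins that only simulate a status register so the sketch compiles on its own (a real project would call bus routines such as NAND_CommandInput and NAND_DataOutput from the listings above), nand_read_passed_ecc is a hypothetical helper name, and 70h is the standard NAND READ STATUS opcode.

```c
#include <stdint.h>

#define CMD_READ_STATUS  0x70u  /* standard NAND READ STATUS opcode */
#define STATUS_FAIL_BIT  0x01u  /* LSB: set when ECC could not correct the data */

/* Simulation only: stand-ins for the real bus routines. */
static uint8_t sim_status = 0x00;    /* simulated status register contents */
static uint8_t sim_last_cmd = 0x00;  /* last command byte "sent" to the device */

static void sim_command_input(uint8_t cmd)
{
    sim_last_cmd = cmd;
}

static uint8_t sim_data_output(void)
{
    /* the simulated device returns its status only after READ STATUS */
    return (sim_last_cmd == CMD_READ_STATUS) ? sim_status : 0xFF;
}

/* Hypothetical helper: call after a page READ has completed (device ready).
   Returns 1 if the data passed internal ECC (clean or corrected),
   0 if the device flagged an uncorrectable error. */
static int nand_read_passed_ecc(void)
{
    uint8_t status;

    sim_command_input(CMD_READ_STATUS);
    status = sim_data_output();
    return (status & STATUS_FAIL_BIT) == 0;
}
```

When the helper returns 0, the page contents should not be trusted. How the system reacts (retrying the read, marking the block, or reporting the failure upward) is a design decision outside the scope of this sketch.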


ClearNAND Flash

ClearNAND Flash is our response to industry demands for a higher capacity solution that also addresses existing ECC concerns. While we have been aggressively shrinking our technology processes to meet those demands, we also recognize that process shrinks have a direct effect on NAND performance and endurance and make error correction increasingly difficult to manage. The adoption of an integrated solution for calculating ECC makes it possible for users to develop one solution that can be easily migrated to future applications. Our ClearNAND solution offers excellent performance in terms of throughput. For specific applications, an effective Flash translation layer (FTL) solution should be adopted to maximize the lifespan of the cells, even if this solution impacts overall performance.

Conclusion
As lithographies shrink, both SLC and MLC NAND increasingly require the use of ECC algorithms to ensure data integrity. Solutions include hardware, software, on-die, and ClearNAND Flash. At Micron, we've developed a variety of ECC solutions for protecting data integrity in a wide variety of applications. This means that we have the right NAND device and ECC solution for every design. Micron's NAND portfolio provides comprehensive, cost-competitive memory products for computing, networking, mobile, and embedded applications. For more information on our NAND and ECC solutions, contact your Micron representative.

References
- TN-29-08: Hamming Codes for NAND Flash Memory Devices
- TN-29-62: Software Device Drivers for Large Page Micron NAND Flash Memory

About the Author


Marina Mariano is a Micron software engineer responsible for developing, debugging, and supporting software device drivers for NAND, NOR, and phase change memory (PCM). She has been instrumental in the development of low-level drivers for Micron.
