3GPP2 Turbo Decoder v2.1
DS275 February 15, 2007

Product Specification

• Drop-in module for Spartan™-3, Spartan-3E, Spartan-3A/3AN, Virtex™-II, Virtex-II Pro, Virtex-4, and Virtex-5 FPGAs
• Implements the CDMA2000/3GPP2 specification [1]. Core contains the full 3GPP2 interleaver
• Full range of 3GPP2 block sizes supported (122-12282)
• Core implements the MAX*, MAX, or MAX SCALE algorithms
• Dynamically selectable number of iterations, 1-15
• Number representation: two’s complement fractional numbers
• Data input: 2 to 5 integer bits and 1 to 4 fractional bits
• Internal calculations: 6 to 9 integer bits and 1 to 4 fractional bits
• Sliding window size of 32 or 64
• Works with all 3GPP2 code rates
• Internal or external RAM data storage
• To be used with Xilinx CORE Generator™ system
LogiCORE Facts

Core Specifics
Supported Device Family: Virtex™-II, Virtex-II Pro, Virtex-4, Virtex-5, Spartan™-3, Spartan-3A/3AN, Spartan-3E

Provided with Core
Documentation: Product Specification
Design File Formats: VHDL
Verification: VHDL Structural (UniSim) Model, Verilog Structural (UniSim) Model
Instantiation Template: VHDL Wrapper, Verilog Wrapper

Design Tool Requirements
Xilinx Implementation Tools: ISE 9.1i or later

Provided by Xilinx, Inc.

Pay Core. Requires a full or evaluation license


This version of the Turbo Convolution Code (TCC) decoder is designed to meet the 3GPP2 mobile communication system specification [1].

© 2006-2007 Xilinx, Inc. All rights reserved. XILINX, the Xilinx logo, and other designated brands included herein are trademarks of Xilinx, Inc. All other trademarks are the property of their respective owners. Xilinx is providing this design, code, or information "as is." By providing the design, code, or information as one possible implementation of this feature, application, or standard, Xilinx makes no representation that this implementation is free from any claims of infringement. You are responsible for obtaining any rights you may require for your implementation. Xilinx expressly disclaims any warranty whatsoever with respect to the adequacy of the implementation, including but not limited to any warranties or representations that this implementation is free from claims of infringement and any implied warranties of merchantability or fitness for a particular purpose.





General Description
The TCC decoder is used in conjunction with a TCC encoder to provide a reliable, extremely effective way to transmit data over noisy data channels. The Turbo Decoder core operates very well under low signal-to-noise conditions and provides a performance close to the theoretical optimal performance as defined by the Shannon limit [2].

When a decoding operation is started, the core accepts the block size and the number of iterations from two input ports. The systematic and parity data is read into the core in parallel on a clock-by-clock basis. The core then starts the decoding process and implements the required number of decode iterations. Finally, the decoded bit sequence is output. The entire sequence is automatically controlled from a single first data signal and requires no user intervention. In addition, all the interleaving operations required in the 3GPP2 specification are handled automatically within the core.

The core expects two’s complement fractional numbers as inputs and also uses this format for the internal calculations. Each fractional input number represents the Log Likelihood Ratio (LLR) divided by 2 for each input bit. This LLR value can be considered to be the confidence level that a particular bit is a one or zero.

The user can trade off accuracy against speed and complexity by selecting the numerical precision that is required. The input data can have 2 to 5 integer bits and between 1 and 4 fractional bits. The precision of the internal calculations can also be controlled with 6 to 9 integer bits and between 1 and 4 fractional bits. The number of internal integer bits must be greater than the number of input integer bits by 3 or more and the number of input fractional bits must be less than or equal to the number of internal fractional bits.
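The bit-width rules and the LLR/2 input format above can be sketched in Python as follows. This is an illustration only, not code supplied with the core: the function names are hypothetical, and the rounding and saturation behavior are assumptions, because the text does not state how out-of-range values should be handled before they reach the core. The sign bit is counted among the integer bits, which matches the 5-bit total quoted for 2 integer and 3 fractional bits.

```python
def check_widths(in_int, in_frac, met_int, met_frac):
    """True if the widths satisfy the ranges and relations stated in the text."""
    return (2 <= in_int <= 5 and 1 <= in_frac <= 4
            and 6 <= met_int <= 9 and 1 <= met_frac <= 4
            and met_int >= in_int + 3      # internal integer bits exceed input by 3 or more
            and in_frac <= met_frac)       # input fraction no wider than internal fraction


def quantize_half_llr(llr, int_bits, frac_bits):
    """Return the two's complement bit pattern for LLR/2.

    Assumptions (not from the data sheet): round to nearest, saturate on overflow.
    """
    total = int_bits + frac_bits
    lowest, highest = -(1 << (total - 1)), (1 << (total - 1)) - 1
    code = round((llr / 2.0) * (1 << frac_bits))   # scale by 2^frac_bits
    code = max(lowest, min(highest, code))         # saturate
    return code & ((1 << total) - 1)               # raw bit pattern


if __name__ == "__main__":
    print(check_widths(2, 3, 6, 3))                  # a 2i3 / 6m3 configuration
    print(format(quantize_half_llr(-1.0, 2, 3), "05b"))
```

A value that saturates loses confidence information, so the input integer width would normally be chosen to cover the largest LLR/2 magnitude expected at the operating point.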

Algorithm Type
The full TCC decoder algorithm is computationally intensive and, therefore, approximations must be made to make the algorithm usable in practice. The approach taken here is to provide the user with three algorithm choices:
1. MAX*. A very good algorithmic approximation used when accuracy, rather than algorithm simplicity, is required. BER performance of this approach is the best of all three algorithms although this increases core complexity and resource requirements. In this algorithm, a small lookup table is used to increase the accuracy of some non-linear operations.
2. MAX. Gives worse BER performance than the MAX* algorithm, but provides the advantage of being less complex and, therefore, requires fewer resources. In this case, the lookup table is not used, which reduces the algorithm accuracy and slightly degrades BER performance (by approximately 0.5 dB compared to the MAX* algorithm).
3. MAX SCALE. Produces BER performance very close to the MAX* (within approximately 0.1 dB to 0.2 dB) but with the complexity of the MAX algorithm. If the small reduction in BER performance is acceptable, this provides the best BER performance/resource requirement trade-off. Reference [3], Improving the MAX Log MAP Turbo Decoder, describes this approach in greater detail.
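The difference between the three choices can be illustrated with the standard max* (Jacobian logarithm) identity, ln(e^a + e^b) = max(a, b) + ln(1 + e^-|a-b|). The sketch below is a floating-point illustration, not the core's implementation. The lookup table size and step, and the scale factor of 0.75, are hypothetical values chosen for the example. Applying the scale to the extrinsic information is how max-log-MAP scaling is commonly described, and it is assumed here rather than taken from this data sheet.

```python
import math


def max_star(a, b):
    """Exact max*: equals ln(exp(a) + exp(b))."""
    return max(a, b) + math.log1p(math.exp(-abs(a - b)))


def max_only(a, b):
    """MAX approximation: the correction term is dropped."""
    return max(a, b)


# Hypothetical 8-entry table of the correction term, sampled every 0.5.
LUT_STEP = 0.5
CORRECTION_LUT = [math.log1p(math.exp(-i * LUT_STEP)) for i in range(8)]


def max_star_lut(a, b):
    """max* with the correction term read from a small lookup table."""
    index = int(abs(a - b) / LUT_STEP)
    correction = CORRECTION_LUT[index] if index < len(CORRECTION_LUT) else 0.0
    return max(a, b) + correction


def scale_extrinsic(extrinsic, scale=0.75):
    """MAX SCALE idea (assumed form): damp the MAX-based extrinsic value."""
    return scale * extrinsic


if __name__ == "__main__":
    a, b = 1.0, 2.0
    print(max_star(a, b), max_star_lut(a, b), max_only(a, b))
```

The correction term never exceeds ln 2 (about 0.69) and decays quickly as |a - b| grows, which is why a small table is enough to recover most of the accuracy that the MAX approximation gives up.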



Sliding Window
A commonly used technique to reduce the resource requirements of the core is the use of a sliding window in the calculations. As the sliding window only stores a subset of the entire data set at any one time, the memory requirements are significantly reduced. Two sliding window sizes can be used with the core: 32 or 64.




Code Rates
The core operates with all the different code rates of the 3GPP2 specification and always assumes that rate 1/5 data is used as input. For different code rates, the appropriate parity bits in the sequence are replaced by zeros, allowing the core to implement any puncturing scheme.

Input/Output Pins
Signal names are shown in Figure 1 and described in Table 1.

Figure 1: Input and Output Ports for TCC Decoder
(Ports shown: FD_IN, ND, ACLR, SCLR, CE, CLK, ITERATIONS, BLOCK_SIZE_SEL, DIN, RFFD, RFD, RDY, DOUT, S_ADDR, P_ADDR, WR_D_OUT, WE, RD_D_IN.)

Table 1: Core Signal Pinout
Pin | Direction | Port Width (bits) | Description
FD_IN | Input | 1 | First Data. When asserted (High) on a valid rising clock edge, the decoding process is started. Qualified by ND when present.
ND | Input | 1 | New Data. When asserted (High) on a valid rising clock edge, a new input value is read from the DIN port. Ignored when RFD deasserted.
ACLR | Input | 1 | Asynchronous Clear. When asserted (High), the decoder asynchronously resets.
SCLR | Input | 1 | Synchronous Clear. When asserted (High) on a valid rising clock edge, the decoder is reset.
CE | Input | 1 | Clock Enable. When this is deasserted (Low), rising clock edges are ignored and the core is held in its current state. A rising clock edge is only valid when CE is asserted (High).
CLK | Input | 1 | Clock. All synchronous operations occur on the rising edge of the clock signal.
ITERATIONS | Input | 4 | Iterations. The number of iterations that the core must implement. Read on a valid FD_IN assertion (High).
BLOCK_SIZE_SEL | Input | 4 | Block Size Select. The block size of the current decode operation. Read on a valid FD_IN assertion (High).
DIN | Input | Varies (see later) | Data Input. Consisting of the systematic and parity data input.
RD_D_IN | Input | Varies (see later) | Read Data In. Only used with an external memory interface. The systematic and parity data are read into the core from external memory using this port.
RFFD | Output | 1 | Ready For First Data. When asserted (High), the core is ready to start another decoder operation.
RFD | Output | 1 | Ready For Data. When asserted (High), the core is ready to accept data on the DIN port.
RDY | Output | 1 | Ready. When asserted, the data on the DOUT port is valid. This indicates that the entire decode operation is complete.
S_ADDR | Output | 14 | Systematic Address. Only used with an external memory interface. External RAM address to control the reading and writing of the systematic data.
P_ADDR | Output | 14 | Parity Address. Only used with an external memory interface. External RAM address to control the reading and writing of the parity data.
DOUT | Output | 1 | Data Out. Decoded output data from the core. Qualified by RDY.
WR_D_OUT | Output | Varies (see later) | Write Data Out. Only used with an external memory interface. The systematic and parity data are written out to external memory from the core using this port.
WE | Output | 1 | Write Enable. Only used with an external memory interface. Indicates that the data on the WR_D_OUT port is valid and needs to be written to external memory.

Functional Description

Clock: CLK
All operations of the core are synchronized to the rising edge of the CLK signal. If the optional CE pin is present, a rising clock edge is only valid when CE is High; if CE is Low, the core is held in its current state.

Clock Enable: CE
CE is an optional pin used to indicate if the next rising clock edge is valid. When CE is High, rising clock edges are valid and allow the decoding process to continue. If CE is Low, the core operations are suspended and the core remains in its current state. All synchronous signals are ignored when CE is Low.

First Data: FD_IN
FD_IN is used to start the decode operation. When FD_IN is High on a valid rising clock edge, then the first data is read from the DIN port. Simultaneously, the same clock edge is used to read the values of the BLOCK_SIZE_SEL and the ITERATIONS ports which define the block size and the number of iterations to be implemented for this decode operation. If optional pin ND is present then FD_IN is only valid when ND is High for the same valid clock edge. FD_IN should only be held High for a single valid clock cycle.

New Data: ND
The ND signal is optional and is used to indicate that there is new input data to be read from the DIN port. For example, if ND is required and the input block size is 122, then 122+6 (tail bits) active High ND assertions are required to load in the complete block before the decoding operation commences. After all the expected input data has been read into the core, the ND signal is ignored until the next decoding block is started.

Asynchronous Clear: ACLR
The ACLR signal is optional and when it is asserted High, the core is reset to its initial state, that is, the core is ready to process a new block. Following the initial configuration of the FPGA, the core is automatically in the reset state, so no further ACLR is required before a decoding operation can take place. The highly pipelined nature of the decoder core means that any ACLR signal actually creates some SCLR signals internal to the core. For this reason, the SCLR signal is recommended for use in this core.

Synchronous Clear: SCLR
The SCLR signal is optional and when it is asserted High on a valid rising clock edge, the core is reset to its initial state, that is, the core is ready to process a new block. Following the initial configuration of the FPGA, the core is automatically in the reset state so no further SCLR is required before a decoding operation can take place. If the optional pin CE is present then SCLR is ignored if CE is Low.

Iterations: ITERATIONS
The 4-bit input port represents the number of iterations with valid values from 0001-1111 (binary) or 1-15 (decimal). The ITERATIONS port is read when a valid FD_IN occurs. The value read defines the number of iterations to be implemented for that block's decode operation.

Block Size Select: BLOCK_SIZE_SEL
The 4-bit input port represents the 3GPP2 block sizes as detailed in Table 2. Like the ITERATIONS port, the BLOCK_SIZE_SEL port is read when a valid FD_IN occurs. The value read defines the size of block to be processed for this decode operation. All other values on BLOCK_SIZE_SEL that are not covered in Table 2 are invalid. (Note that block size values do not include tail bits.)

Table 2: Valid BLOCK_SIZE_SEL values
Block Size | BLOCK_SIZE_SEL (binary)
122 | 0000
250 | 0001
506 | 0010
762 | 0011
1,018 | 0100
1,530 | 0101
2,042 | 0110
3,066 | 0111
4,090 | 1000
5,114 | 1001
6,138 | 1010
8,186 | 1011
12,282 | 1100

Data In: DIN
The DIN port, a single-input bus, accepts the five input channels of the systematic and parity data. For example, if 2 integer and 3 fractional bits are required for each soft input value, a total of 5 bits is required for each of the five channels. The total bit width of the DIN port is therefore 5 x 5 = 25 bits. In this case, the 25-bit input port is represented by DIN[24:0], indicating that bit 24 is the MSB and bit 0 the LSB. The arrangement of the DIN port for this example is shown in Figure 2. Figure 2 shows how the user must map each of the systematic and parity bits to each of the DIN bits. The two parity values from the non-interleaved data are represented by RSC1_0 and RSC1_1, while the two parity values from the interleaved data are represented by RSC2_0 and RSC2_1, respectively.

The data input to the decoder core is assumed to be encoded using a corresponding Turbo Encoder core, such as the Xilinx 3GPP2 Turbo Encoder v2.0. A basic description of the Turbo Encoder core is provided, see "Turbo Encoder".

Figure 2: Arrangement of the DIN Port Example
DIN[24:20] = RSC1_Systematic[4:0]
DIN[19:15] = RSC1_1[4:0]
DIN[14:10] = RSC1_0[4:0]
DIN[9:5] = RSC2_1[4:0]
DIN[4:0] = RSC2_0[4:0]

Read Data In: RD_D_IN
This port is only used where an external memory interface is required. This port is always the same width as the DIN port because it is used to read the input data from external memory. It is constructed in the same way as the DIN port in that it consists of the five data channels, that is, the systematic input and four parity data inputs (Figure 2). Note that systematic and parity data is read in different orders during the decoding process, which means that one external memory block is required for the systematic data and a separate block is required for the parity data. These two memory areas are addressed by the S_ADDR and P_ADDR signals, respectively.
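The DIN arrangement of Figure 2 can be sketched as a pair of packing helpers. This is a test-bench style illustration under the assumption of the 5-bit-per-channel example; `pack_din` and `unpack_din` are hypothetical names, not part of the core deliverables. Each channel value is a signed integer code that is already quantized to the chosen width.

```python
# Channel order from MSB to LSB, following Figure 2.
CHANNELS = ("RSC1_Systematic", "RSC1_1", "RSC1_0", "RSC2_1", "RSC2_0")


def pack_din(values, width=5):
    """Pack five signed channel codes into one DIN word (25 bits for width=5)."""
    mask = (1 << width) - 1
    word = 0
    for name in CHANNELS:
        word = (word << width) | (values[name] & mask)   # two's complement slice
    return word


def unpack_din(word, width=5):
    """Split a DIN word back into signed channel codes."""
    mask = (1 << width) - 1
    sign = 1 << (width - 1)
    values = {}
    for position, name in enumerate(reversed(CHANNELS)):
        raw = (word >> (position * width)) & mask
        values[name] = raw - (1 << width) if raw & sign else raw
    return values


if __name__ == "__main__":
    sample = {"RSC1_Systematic": -4, "RSC1_1": 3, "RSC1_0": 0,
              "RSC2_1": -1, "RSC2_0": 7}
    print(format(pack_din(sample), "025b"))
```

Because RD_D_IN and WR_D_OUT are described as having the same width and configuration as DIN, the same helpers would apply to words stored in external memory.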

Ready For First Data: RFFD
When this signal is asserted, it indicates that the core is ready to accept an FD_IN signal to start a new decode operation. Immediately after a valid FD_IN signal is detected, the RFFD signal will go Low and remain Low until it is able to start another block.

Ready For Data: RFD
When this signal is asserted, it indicates that the core is ready or in the process of accepting a new block of input data. This signal remains High until a block is initiated via a valid FD_IN signal and the corresponding block of data including tail bits has been input.

Ready: RDY
This signal is asserted after completing the number of iterations defined by the ITERATIONS port on the valid FD_IN signal. RDY is asserted High to indicate that the data on the DOUT port is now valid and forms the final result of the decode operation. RDY is always asserted for a number of valid clock cycles equal to the size of the current block being decoded.

Data Out: DOUT
This is the hard-decision ("hard-coded") output from the decoding process. Each block of data is output on a serial basis with RDY indicating that the data is valid.

Systematic Address: S_ADDR
This 14-bit address port is only used where an external memory interface is required. This port provides the address to the systematic data memory area.

Parity Address: P_ADDR
This is similar to the S_ADDR port except that it provides the address for the parity memory block. Note that there are four soft input parity channels and only a single soft input systematic channel. The parity memory block, therefore, will always be four times larger than the systematic memory block.

Write Data Out: WR_D_OUT
This port is only used where an external memory interface is required. Data provided by the user on the DIN port appears on the WR_D_OUT port ready for storage in external memory. This port will always be the same in width and configuration as the DIN and RD_D_IN ports (Figure 2).

Write Enable: WE
This port is only used where an external memory interface is required. When WE is asserted High, there is valid data on the WR_D_OUT port that must be written to memory. When the WE signal is High, data is written to this memory and when WE is Low, data is read.

Turbo Encoder
The data into the DIN port of the Turbo Decoder core must be generated by a TCC Encoder core that provides the correct data format, such as the 3GPP2 Turbo Encoder v2.0 supplied by Xilinx. A brief description of the data output requirements of the Turbo Encoder core is provided here for the purpose of identifying the input requirements of the decoder core.

Figure 3 shows the basic structure of the Turbo encoder. It consists of two identical Recursive Systematic Convolution (RSC) encoders: one processes the original input data, and the other processes an interleaved version of the original input data. As a general rule, the original input is delayed by the latency of the interleaver, so that the first and successive outputs from both RSCs are synchronized and occur on the same clock cycle. The output from each RSC consists of the original input bit or systematic bit and two parity bits that are created by the circuit shown in Figure 4. Some of the RSC bits are not transmitted depending on the selected encoding rate. For example, in rate 1/5 for every one input bit, five output bits are generated. See the 3GPP2 specification for more details.

Figure 4 shows that there is a control at the RSC input that switches between new data input and a feedback input. When block_size values have been output from the encoder, these control switches are switched over to create tail bits, which are used to force the RSCs to a known state. Each of the two RSCs creates three sets of soft values during the tail bit generation. The output of the Turbo Encoder (and the input to the Turbo Decoder) always consists of block_size+6 sets of soft values.

Figure 3: Turbo Convolution Encoder
(Blocks shown: data_in feeds RSC1 through a Delay block and RSC2 through the Interleaver. RSC1 produces RSC1_systematic, RSC1_parity0, and RSC1_parity1; RSC2 produces RSC2_systematic, RSC2_parity0, and RSC2_parity1. The outputs pass through a Puncture block to the output y.)

Figure 4: Basic Recursive Systematic Convolution (RSC) Encoder

Data Input Sequence
Figure 5 shows the data input sequence for an example case where block_size = 122. The input data consists of the output of the two RSC encoders. The convention used here is that RSC1_systematic(0) to RSC1_systematic(121) represent the 122 values generated during the encoding process. The same convention also applies to the parity channels. (Note that the RSC2_systematic data is never transmitted except during the RSC2 tail-bit period. Refer to the 3GPP2 specification for more details.)

Figure 5 shows the transition between data input at the end of the block and the tail bit input period. During the RSC1 tail bit input, the RSC2 parity bits are not used and any value on these ports will be ignored by the core at this time. The RSC1 inputs are also ignored during the RSC2 tail bit period, except for the RSC1_systematic port which is used to input the RSC2_systematic values. This reduces the number of input bits required by the core.

The example in Figure 5 assumes a rate of 1/5 where all five of the encoder outputs are used. Where different puncture rates are used, a number of the decoder input values will not be transmitted and these must be set to zero during the decoder input stage. This is straightforward and is left to the user to implement, if required.
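The zero insertion for punctured code rates can be sketched as below. This is a minimal illustration, not a Xilinx-supplied routine: `depuncture` is a hypothetical helper, and the puncturing pattern in the example is illustrative only, since the real patterns for each rate are defined in the 3GPP2 specification. Each row is ordered (RSC1_systematic, RSC1_P0, RSC1_P1, RSC2_P0, RSC2_P1), and the tail-bit rows, which follow the layout of Figure 5, are not handled here.

```python
from itertools import cycle


def depuncture(received, pattern, n_rows):
    """Rebuild rate 1/5 decoder input rows from a punctured symbol stream.

    received: soft values in transmission order.
    pattern:  list of 5-tuples of 0/1 flags, repeated cyclically;
              1 = the channel was transmitted, 0 = punctured (zero is inserted).
    n_rows:   number of data rows to build.
    """
    stream = iter(received)
    rows = []
    for _, mask in zip(range(n_rows), cycle(pattern)):
        rows.append(tuple(next(stream) if keep else 0 for keep in mask))
    return rows


if __name__ == "__main__":
    # Illustrative pattern: systematic, RSC1_P0 and RSC2_P0 sent for every bit.
    example_pattern = [(1, 1, 0, 1, 0)]
    print(depuncture([1, 2, 3, 4, 5, 6], example_pattern, n_rows=2))
```

A zero input corresponds to an LLR of zero, that is, no information about that bit, which is why untransmitted positions are filled with zeros rather than any other value.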

Figure 5: Data and Tail Bit Input Sequence into the Decoder Core
(Columns are successive clock cycles: the last two input data values, then the tail bits from RSC1, then the tail bits from RSC2.)

DIN Port | Input data | Input data | RSC1 tail | RSC1 tail | RSC1 tail | RSC2 tail | RSC2 tail | RSC2 tail
RSC1_systematic | RSC1_syst(120) | RSC1_syst(121) | RSC1_syst(T0) | RSC1_syst(T1) | RSC1_syst(T2) | RSC2_syst(T0) | RSC2_syst(T1) | RSC2_syst(T2)
RSC1_P0 | RSC1_P0(120) | RSC1_P0(121) | RSC1_P0(T0) | RSC1_P0(T1) | RSC1_P0(T2) | not used | not used | not used
RSC1_P1 | RSC1_P1(120) | RSC1_P1(121) | RSC1_P1(T0) | RSC1_P1(T1) | RSC1_P1(T2) | not used | not used | not used
RSC2_P0 | RSC2_P0(120) | RSC2_P0(121) | not used | not used | not used | RSC2_P0(T0) | RSC2_P0(T1) | RSC2_P0(T2)
RSC2_P1 | RSC2_P1(120) | RSC2_P1(121) | not used | not used | not used | RSC2_P1(T0) | RSC2_P1(T1) | RSC2_P1(T2)

RSC1_syst(120) = RSC1 systematic value for input number 120
RSC1_P1(T0) = RSC1 parity 1 input value for Tail bit zero

Data Input Format
The input to the DIN port of the decoder is proportional to the Log Likelihood Ratio (LLR) of the received data, x.

$$\mathrm{LLR}(x) = \ln\left(\frac{\Pr(x = 1)}{\Pr(x = 0)}\right)$$

This is the natural logarithm of the probability that a received symbol is a one or zero. The actual required input of the Turbo Decoder core is LLR(x)/2. Knowledge of the input data format and the noise characteristics is required to calculate LLR(x) accurately. The Turbo Decoder still functions if an estimate of the LLR is made, but best performance is obtained with an accurate calculation of the LLR.

Assume that there is a Binary Phase Shift Keying (BPSK) input signal, where a value of 1.0 represents a transmitted logic 1, and -1.0 represents a transmitted logic 0. Also, assuming that the input has been corrupted by Random Gaussian noise with a mean μ and a variance σ², the LLR(x) can be calculated from

$$\mathrm{LLR}(x) = \ln\left(\frac{\dfrac{1}{\sqrt{2\pi\sigma^2}}\exp\left\{-\dfrac{(x-\mu)^2}{2\sigma^2}\right\}}{\dfrac{1}{\sqrt{2\pi\sigma^2}}\exp\left\{-\dfrac{(x+\mu)^2}{2\sigma^2}\right\}}\right)$$

This produces:

$$\mathrm{LLR}(x) = \frac{2\mu x}{\sigma^2}$$

meaning that the required input into the DIN port of the decoder is given by:

$$\mathrm{DIN} = \frac{\mathrm{LLR}(x)}{2} = \frac{\mu x}{\sigma^2}$$

For a mean of 1, the input to the DIN port of the decoder is simply the values output from the encoder divided by the variance. If for example, the mean of the input signals is 3 instead of 1, the input values are simply divided by 3 so that they are rescaled to have a mean of 1. This ensures optimal operation of the Turbo Decoder.

Data Input Format in Relation to Eb/No
It is common practice to measure the error correction performance of different algorithms using the standard measurement:

$$\frac{E_b}{N_o} = \frac{\text{Energy per bit}}{\text{Noise Density}}$$

It can be shown that for white Gaussian noise, Eb/No can be related to the noise input by

$$\frac{E_b}{N_o} = 10 \times \log_{10}\left(\frac{1}{2 \times \mathrm{rate} \times \sigma^2}\right)\ \mathrm{dB}$$

Plotting the noise variance against Eb/No produces the results shown in Figure 6. This figure shows that with the noise variance set to 1, the decoder only operates optimally at Eb/No values of 0 dB, 1.8 dB, and 3 dB for rates 1/2, 1/3, and 1/4, respectively.
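The scaling relations in this section can be checked numerically with a short sketch. The function names are hypothetical, the logarithm is taken as base 10 (the usual convention for decibels), and the inverse relation is simply the Eb/No formula rearranged for the variance.

```python
import math


def din_value(x, mean=1.0, variance=1.0):
    """Decoder input LLR(x)/2 = mu * x / sigma^2 for a BPSK symbol x."""
    return mean * x / variance


def ebno_db(rate, variance):
    """Eb/No in dB for a given code rate and noise variance."""
    return 10.0 * math.log10(1.0 / (2.0 * rate * variance))


def variance_for_ebno(rate, ebno_db_value):
    """Noise variance that corresponds to a target Eb/No (formula rearranged)."""
    return 1.0 / (2.0 * rate * 10.0 ** (ebno_db_value / 10.0))


if __name__ == "__main__":
    for rate in (1 / 2, 1 / 3, 1 / 4):
        print(f"rate {rate:.3f}: Eb/No at unit variance = {ebno_db(rate, 1.0):.1f} dB")
```

With the variance fixed at 1, the three rates give approximately 0 dB, 1.8 dB, and 3 dB, matching the operating points read from Figure 6.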

If the user requires the decoder to operate at different Eb/No values, the user can either calculate the variance value in real time or use Figure 6 to provide an estimate of the variance. For example, if the decoder is expected to operate when Eb/No is around 2 dB, then appropriate variance values are 0.63, 0.95, and 1.25 for rates 1/2, 1/3, and 1/4, respectively. The less accurate the noise variance of the input, the greater the degradation in BER performance. For the purposes of this data sheet, when Bit Error Rate (BER) figures are quoted as using scaled values, it is assumed that the noise variance is accurately known.

Figure 6: Plot of Noise Variance against Eb/No for Different Rates

External Memory Interface
The raw input data from the DIN port is used during each decoding iteration, so a RAM must be provided to store the data. This data store can be internal or external to the core. If the data memory is internal to the core no user action is required. The core handles all storage and addressing functions internally. If the memory storage is external to the core, the external memory ports are used to create the memory interface. These ports are defined in a previous section, but additional detail is included in this section.

Assuming that a 25 bit wide DIN port has been selected (as shown in Figure 2), Figure 7 shows the detail of how the external memory ports are connected. The systematic and parity memory blocks are assumed to be generic memory blocks. If these are external to the FPGA, then it is likely that there is a single bi-directional data port on the memories, rather than the separate ports shown in Figure 7. These simple changes are left for the user to implement, as they will be specific to the design.

Figure 7: Example of External Memory Interface
(Connections shown: the Systematic RAM takes ADDR from S_ADDR, WRITE_DATA from WR_D_OUT[24:20], and WE from WE, and drives READ_DATA to RD_D_IN[24:20]. The Parity RAM takes ADDR from P_ADDR, WRITE_DATA from the parity bits of WR_D_OUT, and WE from WE, and drives READ_DATA to the parity bits of RD_D_IN.)

Note that the systematic memory is only 5 bits wide in this example, compared to the parity memory which is 20 bits wide. The parity memory is always 4 times as wide as the systematic memory due to the fact that there are four parity input channels and only one systematic. The maximum block size in the 3GPP2 specification is 12288 (including 6 tail bits), giving a total memory requirement for this example as follows:
• Systematic Memory requirement = 5 bits x 12288 = 61,440 bits
• Parity Memory = 4 channels x 5 bits x 12288 = 245,760 bits

It is important to note that the systematic memory is read in a linear and an interleaved sequence. This memory block must therefore be capable of true random access at the full system clock rate. Parity memory is always addressed as an increasing count from zero to block size + tail bits.

Signal Timing
The Turbo Decoder core is a synchronous core operating on the rising edge of the clock. All input signals are read and all output signals can be changed on the rising edge of the clock. The only exception to this is the asynchronous clear signal, ACLR. When an optional CE signal is used, the core state does not change when CE is Low, all input signals are ignored and the core outputs remain the same. If the optional CE signal is not used, the core operates as though CE is permanently High (enabled).

Figure 8 shows the input timing for the decoder. The data input process is started when FD_IN, CE and ND are all High on a rising clock edge. The first input data, d0, is read from the DIN port on the same clock edge as the valid FD_IN pulse is detected (Figure 8). At the same time, the ITERATIONS and BLOCK_SIZE_SEL inputs are read to determine the size of block to be processed and the number of decode iterations to be implemented. On receiving a valid FD_IN pulse, the RFFD signal goes Low to indicate that the core is no longer ready to receive a first data pulse. RFFD remains Low until the core is ready to process another block of data. During the data input process the RFD signal remains High to indicate that the core is ready to accept further input data. The core will read the next input data values on successive rising edges of the clock unless CE or ND is Low, in which case the input data is ignored.

Figure 8: Start of Input Timing
(Signals shown: CLK, FD_IN, RFFD, DIN, ND, CE; DIN carries d0, d1, d2, d3, d4.)

As shown in Figure 9, at the end of the input cycle, after BLOCK_SIZE+6 data values have been input, the RFD signal will go Low to indicate that all input data has been read. Once RFD is Low further ND signal changes are ignored. The RFD signal going Low also indicates that the core is moving from its input to its decoding phase.

Figure 9: End Of Input Timing
(Signals shown: CLK, RFD, DIN, ND, CE; DIN carries dn-3, dn-2, dn-1, dn.)

After the decoder has performed the required number of iterations, the RDY signal is driven High to indicate that there is valid data on the DOUT port (Figure 10).

Figure 10: Output Timing
(Signals shown: CLK, RDY, DOUT; DOUT carries d0, d1, ..., dn-1, dn.)

When the core approaches the end of the data output phase, it takes RFD and RFFD High to indicate that the core is ready to accept a new block. These signals go High before the last data has been output to maximize throughput.
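The input handshake described in this section can be modeled at cycle level as a simple behavioral sketch. It is not a timing-accurate model of the core: `simulate_input` is a hypothetical helper, outputs are updated on the same edge that samples the inputs, and the decode and output phases are not modeled. When the optional ND or CE pins are absent, the corresponding sequence would be all ones.

```python
def simulate_input(block_size, fd_in, nd, ce):
    """Behavioral sketch of the input phase.

    fd_in, nd, ce: per-cycle sequences of 0/1 values sampled at each rising edge.
    Returns (accepted_cycles, rffd, rfd) after the last simulated edge.
    """
    words_needed = block_size + 6          # data words plus tail-bit words
    rffd, rfd = True, True
    started = False
    accepted = []                          # cycles on which DIN is read
    for cycle, (fd, new_data, enable) in enumerate(zip(fd_in, nd, ce)):
        if not enable:
            continue                       # CE Low: edge ignored, state held
        if not started:
            if fd and new_data:            # valid FD_IN reads d0 on this edge
                started = True
                rffd = False
                accepted.append(cycle)
        elif rfd and new_data:             # subsequent words need ND High
            accepted.append(cycle)
        if started and len(accepted) == words_needed:
            rfd = False                    # all input read; ND ignored from here
    return accepted, rffd, rfd


if __name__ == "__main__":
    cycles = 12
    fd = [0, 1] + [0] * (cycles - 2)
    nd = [1, 1, 0] + [1] * (cycles - 3)    # one cycle without new data
    ce = [1, 1, 1, 0] + [1] * (cycles - 4) # one disabled clock edge
    print(simulate_input(2, fd, nd, ce))   # block_size of 2 keeps the trace short
```

The artificially small block size in the usage example is only there to keep the trace readable; valid block sizes are those listed in Table 2.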

Performance and Resource Usage
The core has been extensively tested to optimize performance. Information is provided on both performance and resource use to allow the user to make informed design decisions based on the requirements of their application.

BER Performance
The effect of different block sizes, rates and other parameters on the BER performance of the core has been measured and plotted against Eb/No. This allows the designer to determine the optimum trade off between core performance and complexity. Table 3 indicates which parameter is varied in each of the BER Performance Plots shown in Figure 11 through Figure 15.

Table 3: BER Performance Plots
Plot BER vs. Eb/No | Parameter Being Varied
See Figure 11 | Block size
See Figure 12 | Code rates
See Figure 13 | Algorithms
See Figure 14 | Iterations
See Figure 15 | Window size

The core configuration, code rate, block size used, and number of iterations implemented for each trace in the BER plots are identified in the legend. For example:

1r3.2i3.6m3.scale.w32.bs122.i5

Where:
1r3 = code rate 1/3
2i3 = 2 input integer bits and 3 fractional input bits
6m3 = 6 metric integer bits and 3 fractional metric bits
scale = max scale algorithm (alternatively, star = max star algorithm)
w32 = window size of 32 (alternatively, w64 = window size of 64)
bs122 = block size of 122 excluding tail bits
i5 = 5 iterations

These results have been generated in hardware using a Virtex-4 device. The device was configured with a setup consisting of an encoder, noise channel, and the decoder. Additional logic in the FPGA was used to record both the throughput and the bits in error between the input data to the encoder and the output data from the decoder. The input data shown in the plots has been scaled as described in the Data Input Format section.
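The legend convention above is regular enough to parse mechanically, which can be useful when post-processing BER results. The sketch below is illustrative: `parse_legend` and the dictionary keys are hypothetical names, and the field separator is a parameter because the separator used in the original plots may differ from the period that appears in this copy.

```python
import re

# (key, pattern, converter) for each legend field described in the text.
FIELD_PATTERNS = [
    ("code_rate",   r"(\d+)r(\d+)", lambda m: (int(m[1]), int(m[2]))),
    ("input_bits",  r"(\d+)i(\d+)", lambda m: (int(m[1]), int(m[2]))),
    ("metric_bits", r"(\d+)m(\d+)", lambda m: (int(m[1]), int(m[2]))),
    ("algorithm",   r"(scale|star)", lambda m: m[1]),
    ("window_size", r"w(\d+)",      lambda m: int(m[1])),
    ("block_size",  r"bs(\d+)",     lambda m: int(m[1])),
    ("iterations",  r"i(\d+)",      lambda m: int(m[1])),
]


def parse_legend(label, sep="."):
    """Turn a plot legend such as '1r3.2i3.6m3.scale.w32.bs122.i5' into a dict."""
    config = {}
    for field in label.split(sep):
        for key, pattern, convert in FIELD_PATTERNS:
            match = re.fullmatch(pattern, field)
            if match:
                config[key] = convert(match)
                break
        else:
            raise ValueError(f"unrecognized legend field: {field!r}")
    return config


if __name__ == "__main__":
    print(parse_legend("1r3.2i3.6m3.scale.w32.bs122.i5"))
```

Pairs such as `(2, 3)` hold the integer and fractional bit counts, and the code rate pair `(1, 3)` stands for rate 1/3.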

Resource Requirements and Core Performance

The resource requirements and associated core performance have been given for Virtex-4 and Virtex-5 FPGAs in Table 4 and Table 5, respectively. A total of four core configurations have been considered for both devices. Each case is identified below, with the bit widths written in the same notation as the BER plot legends.

• Case A: 2i3, 6m3, max-scale algorithm, no optional ports
• Case B: 2i3, 6m3, max-scale algorithm, with optional ports CE, ND and SCLR
• Case C: 2i3, 6m3, max-star algorithm, no optional ports
• Case D: 5i4, 9m4, max-scale algorithm, no optional ports

A window size of 32 and internal data memory has been used for all of the above cases. Note that both resource use and the maximum achievable clock rate are irrespective of the block size, code rate used, or number of iterations implemented.

Table 4: Performance and Resource Requirements for Virtex-4 XC4VSX25 (speed grade -10)

Case    Max. Clock Rate    Hardware Multipliers
A       130 MHz            4
B       126 MHz            4
C       110 MHz            4
D       122 MHz            4

Table 5: Performance and Resource Requirements for Virtex-5 XC5VSX35 (speed grade -1)

Case    Max. Clock Rate    Hardware Multipliers
A       150 MHz            4
B       148 MHz            4
C       122 MHz            4
D       144 MHz            4

[The LUT and flip-flop columns of Table 4 and Table 5 are not recoverable from this copy. The two block RAM rows read 27, 27, 27, 43 and 15, 15, 15, 23 for Cases A to D; which table each row belongs to is not recoverable.]

Notes (Table 4 and Table 5):
1. Area and maximum clock frequencies are provided as a guide. They may vary with new releases of Xilinx implementation tools, etc.
2. Clock frequency does not take clock jitter into account and should be derated by an amount appropriate to the clock source jitter specification.

The core latency in clock cycles is given using the following formula:

Core Latency = (window_size+3)+2*iterations*(2*window_size+19+block_size)

The core latency is defined as the number of clock cycles measured from the first input to the core (that is, when FD_IN is active) to the first decoded data output from the core (when RDY is asserted). RFFD is asserted prior to RDY being deasserted, indicating that another block's decode operation can be initiated before the previous block's data has been completely output.
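As a worked illustration of the core latency formula, the sketch below evaluates it in Python. The function name core_latency and the argument checks are assumptions made for this example; the accepted ranges (window size 32 or 64, 1 to 15 iterations, block sizes 122 to 12282) are taken from the feature list of this data sheet.

```python
def core_latency(window_size: int, iterations: int, block_size: int) -> int:
    """Clock cycles from the first input (FD_IN active) to the first output (RDY asserted)."""
    if window_size not in (32, 64):
        raise ValueError("window_size must be 32 or 64")
    if not 1 <= iterations <= 15:
        raise ValueError("iterations must be in the range 1 to 15")
    if not 122 <= block_size <= 12282:
        raise ValueError("block_size must be in the range 122 to 12282")
    # Core Latency = (window_size+3) + 2*iterations*(2*window_size+19+block_size)
    return (window_size + 3) + 2 * iterations * (2 * window_size + 19 + block_size)


if __name__ == "__main__":
    cycles = core_latency(window_size=32, iterations=3, block_size=122)
    # 35 + 6 * (64 + 19 + 122) = 1265 cycles
    print(cycles, "cycles;", round(cycles / 150e6 * 1e6, 2), "us at 150 MHz")
```

The latency grows almost linearly with both the number of iterations and the block size, because the block_size term inside the parentheses dominates for all but the smallest blocks.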

The number of clock cycles between successive decode operations is given by the formula:

Clock cycles between blocks = Core Latency + block_size - 109 - 3*(window_size-32)

Given the number of clock cycles defined in the above equation and the maximum clock frequency of the core, it is possible to calculate the throughput achievable. Table 6 gives some typical values for throughput of the core for a selection of block sizes and numbers of iterations. These assume a window size of 32 and a performance equal to that of a Virtex-5 XC5VSX35 device implementing Case A (150 MHz from Table 5).

Table 6: Throughput Rates (Mbits/s)

Block Size    3 Iterations    5 Iterations    7 Iterations    9 Iterations    11 Iterations
122           14.3            8.7             6.3             4.9             4.0
506           19.1            12.0            8.7             6.9             5.7
762           19.8            12.5            9.1             7.2             5.9
1,530         20.6            13.0            9.5             7.5             6.2
3,066         21.0            13.3            9.8             7.7             6.4
6,138         21.2            13.5            9.9             7.8             6.4
12,282        21.3            13.6            9.9             7.8             6.5

Figure 11: Measured BER Performance of the MAX_SCALE Algorithm
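The throughput figures in Table 6 can be approximated from the two cycle-count formulas. The sketch below is illustrative only: the function names are hypothetical, the clock-cycle expression is taken as Core Latency + block_size - 109 - 3*(window_size-32), and throughput is assumed to be block_size decoded bits per block interval at the stated clock rate.

```python
def cycles_between_blocks(block_size: int, iterations: int, window_size: int = 32) -> int:
    """Clock cycles between the starts of successive decode operations."""
    # Core latency, as defined in the latency formula of this data sheet.
    latency = (window_size + 3) + 2 * iterations * (2 * window_size + 19 + block_size)
    return latency + block_size - 109 - 3 * (window_size - 32)


def throughput_mbps(block_size: int, iterations: int,
                    window_size: int = 32, clock_mhz: float = 150.0) -> float:
    """Decoded information bits per second, in Mbits/s (assumes back-to-back blocks)."""
    cycles = cycles_between_blocks(block_size, iterations, window_size)
    return block_size * clock_mhz / cycles


if __name__ == "__main__":
    # Window size 32 and 150 MHz, the assumptions stated for Table 6.
    for block_size in (122, 506, 762, 1530, 3066, 6138, 12282):
        row = [round(throughput_mbps(block_size, it), 1) for it in (3, 5, 7, 9, 11)]
        print(block_size, row)
```

Under these assumptions the computed values agree with Table 6 to within about 0.1 Mbits/s; a couple of cells differ in the last printed digit, which suggests the published figures were generated with slightly different rounding or inputs.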

Figure 12: Measured BER Performance for Different Code Rates

Figure 13: Comparison of MAX* and MAX_SCALE Algorithms

Figure 14: Measured BER Performance for Different Numbers of Iterations

Figure 15: Measured BER Performance for Window Size of 32 and 64

References

1. 3GPP2 C.S0024-B CDMA2000, High Rate Packet Data Air Interface Specification Version 1.0.
2. C. Berrou, A. Glavieux, and P. Thitimajshima, Near Shannon Limit Error-correcting Coding and Decoding: Turbo Codes, Proceedings of the IEEE International Conference on Communications, 1993, pp. 1064-1070.
3. J. Vogt and A. Finger, Improving the MAX Log MAP Turbo Decoder, Electronics Letters, 9th November 2000, Volume 36 No. 23, pp. 1937-1939.

Ordering Information

This Xilinx LogiCORE™ product is provided under the terms of the SignOnce IP Site License. For additional information about the core and how to obtain a license for the core, please see the TCC Decoder product page. For pricing and availability of Xilinx LogiCORE products and software, please contact your local Xilinx sales representative or visit the Xilinx Silicon Xpresso Cafe.

France Telecom, for itself and certain other parties, claims certain intellectual property rights covering Turbo Codes technology, and has decided to license these rights under a licensing program called the Turbo Codes Licensing Program. Supply of this IP core does not convey a license nor imply any right to use any Turbo Codes patents owned by France Telecom, TDF, or GET. Please contact France Telecom for information about its Turbo Codes Licensing Program at the following address:

France Telecom R&D
VAT/TURBOCODES
38, rue du Général Leclerc
92794 Issy Moulineaux Cedex 9

Revision History

The following table shows the revision history for this document.

Date        Version    Revision
12/11/03    1.0        Initial Xilinx release.
04/22/04    1.1        Added further performance data.
04/28/05    2.0        Added support for Spartan-3E.
02/15/07    2.1        Updated for version 2.1.
