
**Implement OFDMA, MIMO for WiMAX, LTE**

By Sam Jenkins, Principal Engineer, picoChip. E-mail: sam.jenkins@picochip.com

EE Times-Asia | March 16-31, 2008 | eetasia.com

The wireless industry is in transition. All of the new 4G air interfaces (WiMAX, Long Term Evolution (LTE), Ultra Mobile Broadband, 802.20, Wireless Broadband, next-generation PHS and the like) share some things in common: all are based on OFDM Access (OFDMA); all exploit MIMO; all have a "flattened architecture"; and all are Internet Protocol-based. This article discusses WiMAX and LTE: in particular, how to implement the core DSP algorithm of OFDMA, then the novel variant used by LTE for the uplink, and finally a brief discussion of MIMO for both WiMAX and LTE. It describes these in the context of a software-defined flexible architecture.

OFDM uses a large number of closely spaced orthogonal subcarriers. Each subcarrier is modulated with a conventional modulation scheme (such as quadrature amplitude modulation) at a low symbol rate, maintaining data rates similar to conventional single-carrier modulation schemes in the same bandwidth. OFDMA is the enhancement that enables multiple users to share the channel by allocating them particular tones.

**Channel equalization**

The primary advantage of OFDM over single-carrier schemes is its ability to cope with severe channel conditions (for example, attenuation of high frequencies in a long copper wire, narrowband interference and frequency-selective fading due to multipath) without complex equalization filters. Channel equalization is simplified because OFDM may be viewed as using many slowly modulated narrowband signals rather than one rapidly modulated wideband signal. The low symbol rate makes the use of a guard interval between symbols affordable, making it possible to handle time spreading and eliminate inter-symbol interference.

In most systems to date (Wi-Fi, both 16d and 16e WiMAX, and the LTE downlink) the core algorithm has been the FFT. However, the uplink of LTE adds an innovation, which requires use of a more complex discrete Fourier transform (DFT). All these systems demand high-speed FFT processing, and require flexibility. Frequent market pressure can drive vendors to release products that comply with early versions of a standard, but which must have the flexibility to be upgraded to the final version using only a simple software upgrade. It is also desirable to have one system supporting different modes or different standards (e.g. a common platform for LTE and WiMAX).

It is possible to adopt a programmable platform that allows the efficient implementation of hardware-oriented algorithms on a flexible software-based engine. An example is the high-performance PC102 from picoChip, which combines the time-to-market and abstraction benefits of a software development environment with the performance benefits gained by exploiting parallelism within an algorithm.

Figure 1: (a) The elements of the FFT implementation are shown. (b) It is possible to combine the 'building block' FFT to achieve higher throughput.

Figure 2: The implementation of the SC-FDMA uplink shows the additional steps compared to standard OFDMA.

The FFT is simply an efficient implementation of the DFT. For an N-point DFT, a direct implementation requires of the order of $N^2$ complex multiply-and-add operations, but as a perfect example of how a clever algorithm can deliver incredible efficiency gains, the classic FFT only requires of the order of $N \log_2 N$ operations. Two approaches exist for reducing the DFT into a series of simpler calculations. One is to perform decimation in frequency and the other is to perform decimation in time. Both approaches
require the same number of complex multiplications and additions. The key difference between the two is that decimation in time takes digit-reversed inputs and generates normal-order outputs, whereas decimation in frequency takes normal-order inputs and generates digit-reversed outputs. The manipulation of inputs and outputs is carried out by so-called butterfly stages.

**FFT implementation**

The picoChip PC102 is a multi-core DSP, optimized for wireless. It integrates over 300 individual processors of slightly different kinds (or "array elements") onto a single die, each of which is a conventional 16bit, Harvard architecture DSP with local memory. This is shown in Table 1.

A hardware-orientated approach strives to minimize the cost or area of silicon by minimizing the number of complex multipliers and storage needed. That allows more elements to be computed in parallel for a given area. For example, each radix-4 butterfly stage involves 4 complex multiplies (note that the fourth butterfly consists only of complex additions).

The picoArray programming model makes it easy to assemble pipelined structures, and this is the method used to implement the FFT. A pipeline FFT is characterized by the real-time, non-stop processing of a sequential input stream. Each array element takes its input from the internal bus, processes it and then provides an output to the next DSP in the pipeline. As the overall throughput is limited by the slowest array element, each loop on each array element should, ideally, take the same number of cycles for optimum performance. For example, if each array element processes each sample in eight cycles, then the maximum throughput is 20Msample/s at 160MHz.

The example FFT is based on the standard radix-4 decimation in frequency algorithm. Figure 1a shows the elements within this. The FFT algorithm involves the temporal separation of data, a task that is performed by the butterfly stage. Because samples will be taken from different points in the input stream, pipeline FFTs need a way of buffering and reordering data. Various architectures exist for this. The use of each butterfly stage involves multiplying an input by a complex twiddle factor, $e^{-j2\pi n/N}$, with factors specified by the 'Controller'.

The implementation of the FFT takes 16+j16, left aligned, in-order inputs and provides 16+j16 outputs, also in-order. The bit growth that occurs in each butterfly stage, 2bits for the additions and 16bits for the complex multiply, is easily managed in the 40bit accumulators of STAN2 array elements (AEs) using a round-to-nearest policy. This scheme maintains the best possible precision of intermediate values, resulting in a high SNR of the output data.

Figure 1b shows how it is possible to combine the "building block" FFT to achieve higher throughputs; clearly a parallel architecture is well suited to this. A performance summary of a 256-point FFT on the PC102 is given in Table 2, showing the resources that are required for a 256-point FFT at complex sample rates between 10MSps and 80MSps, along with the maximum number of FFTs that can be performed on a PC102 at each of these rates. This shows that a single 10MSps FFT requires about 1.5 percent of resources.

Figure 3: The logical structure of the LTE iDFT is shown. Each step represents an iDFT engine, and is mapped to a separate processor.

Figure 4: The MIMO downlink system shows two separate burst chains. This system also includes beam forming, for eight antennas.

While most standards to date have used OFDM (Wi-Fi, 16d WiMAX, Flash OFDM) or OFDMA (16e WiMAX), the uplink transmission scheme selected for LTE is a new variant: single carrier FDMA (SC-FDMA), also known as DFT-spread OFDM. The advantage over conventional OFDMA is that the signal has lower peak-to-average power ratio (PAPR) because of its inherent single carrier structure. This is especially important in the uplink, where lower PAPR greatly benefits the mobile terminal in terms of transmit power efficiency, and so improves battery life. As such, some claim it offers the "best of both worlds": combining the low PAPR of single carrier with the robustness of multicarrier. Of course, there is no such thing as a free lunch, and these advantages come at the cost of increased complexity in digital processing.
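
To make the FFT and SC-FDMA descriptions above more concrete, here is a minimal sketch in plain Python (standard library only). It is not the picoArray implementation: it is a floating-point, single-threaded illustration of a radix-4 decimation-in-frequency FFT whose butterflies leave the outputs in digit-reversed order, reused to compare the average PAPR of plain OFDMA symbols with DFT-spread (SC-FDMA) symbols. The function names, QPSK modulation, localized subcarrier mapping and the 64-point precoder feeding a 256-point modulator are all assumptions chosen for illustration (a 64-point precoder is used only because it is a power of 4).

```python
import cmath
import math
import random


def digit_reverse_base4(index, num_digits):
    """Reverse the order of the base-4 digits of index."""
    reversed_index = 0
    for _ in range(num_digits):
        reversed_index = (reversed_index << 2) | (index & 3)
        index >>= 2
    return reversed_index


def fft_radix4_dif(samples):
    """Radix-4 decimation-in-frequency FFT for a power-of-4 length.

    Inputs are consumed in normal order. The in-place butterflies leave the
    results in digit-reversed order, so the final line reorders them.
    """
    n = len(samples)
    num_stages, size = 0, 1
    while size < n:
        size *= 4
        num_stages += 1
    if size != n:
        raise ValueError("length must be a power of 4")
    data = list(samples)
    span = n  # length of the sub-transform handled by the current stage
    while span >= 4:
        quarter = span // 4
        for start in range(0, n, span):
            for k in range(quarter):
                i0 = start + k
                a, b = data[i0], data[i0 + quarter]
                c, d = data[i0 + 2 * quarter], data[i0 + 3 * quarter]
                # 4-point DFT: only additions and multiplications by +/-j.
                t0, t1 = a + c, a - c
                t2, t3 = b + d, -1j * (b - d)
                # Twiddle factor e^(-j*2*pi*k/span) and its powers.
                w = cmath.exp(-2j * cmath.pi * k / span)
                data[i0] = t0 + t2
                data[i0 + quarter] = (t1 + t3) * w
                data[i0 + 2 * quarter] = (t0 - t2) * w * w
                data[i0 + 3 * quarter] = (t1 - t3) * w * w * w
        span = quarter
    return [data[digit_reverse_base4(i, num_stages)] for i in range(n)]


def ifft_radix4(bins):
    """Inverse FFT built from the forward FFT via conjugation."""
    n = len(bins)
    spectrum = fft_radix4_dif([value.conjugate() for value in bins])
    return [value.conjugate() / n for value in spectrum]


def papr_db(signal):
    """Peak-to-average power ratio of a complex signal, in dB."""
    powers = [abs(value) ** 2 for value in signal]
    return 10 * math.log10(max(powers) / (sum(powers) / len(powers)))


def mean_papr_db(dft_spread, num_symbols=200, m=64, n=256, seed=1):
    """Average PAPR with (SC-FDMA) or without (OFDMA) DFT precoding.

    m QPSK symbols occupy m contiguous subcarriers of an n-point modulator.
    """
    rng = random.Random(seed)
    qpsk = [complex(i, q) for i in (-1, 1) for q in (-1, 1)]
    total = 0.0
    for _ in range(num_symbols):
        symbols = [rng.choice(qpsk) for _ in range(m)]
        tones = fft_radix4_dif(symbols) if dft_spread else symbols
        grid = [0j] * n
        grid[:m] = tones  # localized mapping onto adjacent subcarriers
        total += papr_db(ifft_radix4(grid))
    return total / num_symbols


if __name__ == "__main__":
    print("OFDMA   mean PAPR (dB):", round(mean_papr_db(False), 2))
    print("SC-FDMA mean PAPR (dB):", round(mean_papr_db(True), 2))
```

Both runs use the same seed, so they modulate identical data; the only difference is the extra DFT in front of the OFDM modulator. A fixed-point pipeline version would additionally need the scaling and rounding policy described in the text, which this sketch does not model.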

Table 1: The picoChip PC102 integrates over 300 individual processors of different kinds, each of which is a conventional 16bit Harvard architecture DSP with local memory.

Table 2: The performance summary of a 256-point FFT on the PC102 shows that a single 10MSps FFT requires about 1.5 percent of resources.

An implementation of the SC-FDMA uplink is illustrated in Figure 2, where a DFT precedes an OFDM modulator. This shows the additional steps compared to standard OFDMA.

The size of the DFT precoder in LTE depends on the number of subchannels allocated to the uplink data transmission for a given user:

$N = 12 \cdot 2^a 3^b 5^c \le 1{,}320$

where N is the number of subcarriers and a, b, and c are all ≥ 0, with the condition that N ≤ 1,320 (for a 20MHz bandwidth). For a given user, N can range from 12 tones (a, b, c = 0, so N = 1 * 12, i.e. a single resource block) through to 1,296, in all 35 different choices; these tones are then modulated together to form the single carrier uplink. However, that is at the handset transmitter: because the base station receiver is dealing with many users, each of which will be selecting from those choices, the total number of allowed permutations for all possible frame configurations is more than 531 million. This flexibility obviously complicates the receive iDFT.

It is known that efficient implementation of a DFT is possible if the size of the transform can be factorized into a small number of prime numbers; for example, the classic FFT uses a single prime factor of 2. The fewer prime numbers in the factorization, the simpler will be the implementation. The technique used to break down the iDFT is "divide and conquer". Of course, the principle is similar to the familiar FFT, but the long list of iDFTs cannot be broken down into a single prime factor. Instead, each can be broken down into three short iDFTs of length 2, 3 and 5. These are the iDFT "engines." In this implementation, some iDFTs have been broken down into factors other than their prime factors (i.e. 4, 8 and 9) to reduce the maximum number of pipeline stages to 3.

Figure 3 shows the logical structure of the LTE iDFT to realize a 20MHz LTE eNodeB on the PC102/PC20x, as described here. The pipeline of stages must be able to implement all 35 possible iDFTs functionally, as well as dynamically reconfigure and avoid any pipeline hazards caused by different-length iDFTs flowing through at the same time. The simplest architecture is for reorder + stage buffer pairs A, B and C all to be instantiations of the same functional block that implements all 6 of the iDFT engines (7 if a 1-point iDFT is counted, i.e. pass through unchanged). A more optimal solution would acknowledge that only one stage need implement a 9-point engine, another one an 8-point engine and the third a 4-point engine, plus the 2, 3 and 5 engines, as never more than one 9, 8 or 4 is needed for any iDFT length.

One complication is that LTE is a scalable bandwidth system (simplistically, 1.25-20MHz with both TDD/FDD options). Table 3 shows how the implementation varies for different modes. While the flexibility is at a cost compared to the FFT (qv Table 2), it is notable how the architecture is still extremely efficient at implementing these configurations: even for 20MHz+20MHz FDD (worst case), resources needed are still only about 10 percent of a PC102.
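
The size rule and the "divide and conquer" idea above can be sketched in a few lines of plain Python. This is an illustrative assumption, not the three-stage picoArray pipeline: `lte_dft_sizes` enumerates every N = 12·2^a·3^b·5^c up to 1,320, and `mixed_radix_dft` is a generic recursive Cooley-Tukey transform that splits on the prime factors 2, 3 and 5 only, whereas the implementation described in the text also uses 4-, 8- and 9-point engines to keep the pipeline short. All function names are hypothetical.

```python
import cmath


def lte_dft_sizes(limit=1320):
    """All N = 12 * 2^a * 3^b * 5^c with a, b, c >= 0 and N <= limit."""
    sizes = []
    p2 = 1
    while 12 * p2 <= limit:
        p3 = 1
        while 12 * p2 * p3 <= limit:
            p5 = 1
            while 12 * p2 * p3 * p5 <= limit:
                sizes.append(12 * p2 * p3 * p5)
                p5 *= 5
            p3 *= 3
        p2 *= 2
    return sorted(sizes)


def smallest_engine(n):
    """Smallest of the prime radices 2, 3, 5 dividing n (or n itself)."""
    for radix in (2, 3, 5):
        if n % radix == 0:
            return radix
    return n


def mixed_radix_dft(x, sign=-1):
    """Recursive divide-and-conquer DFT (sign=-1) or unscaled iDFT (sign=+1).

    The input is split into r interleaved sub-sequences, each is transformed
    by a shorter DFT, and the results are recombined with twiddle factors.
    """
    n = len(x)
    r = smallest_engine(n)
    if r == n:
        # Short "engine": direct evaluation of an r-point transform.
        return [sum(x[t] * cmath.exp(sign * 2j * cmath.pi * t * k / n)
                    for t in range(n)) for k in range(n)]
    m = n // r
    subs = [mixed_radix_dft(x[q::r], sign) for q in range(r)]
    return [sum(subs[q][k % m] * cmath.exp(sign * 2j * cmath.pi * q * k / n)
                for q in range(r)) for k in range(n)]


def mixed_radix_idft(spectrum):
    """Inverse transform, including the 1/N scaling."""
    n = len(spectrum)
    return [value / n for value in mixed_radix_dft(spectrum, sign=+1)]


if __name__ == "__main__":
    sizes = lte_dft_sizes()
    print(len(sizes), "sizes from", sizes[0], "to", sizes[-1])
    tone = [1j ** t for t in range(60)]  # a single tone at bin 15 of 60
    peak = max(range(60), key=lambda k: abs(mixed_radix_dft(tone)[k]))
    print("peak bin:", peak)
```

With the default limit the enumeration yields the 35 sizes between 12 and 1,296 quoted in the text. The recursion evaluates twiddle factors on the fly for clarity; a real-time design would precompute them and fix the stage order, as the pipeline description above implies.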

Table 3: The resource usage for scalable iDFT on picoArray for different modes is shown.

**Multiple antennas**

MIMO is the use of multiple antennas at both the transmitter and receiver to improve communication performance, and is a feature of all the 4G systems. It offers significant increases in data throughput and link range without additional bandwidth or transmit power, with a combination of higher spectral efficiency (more bits per second per Hertz of bandwidth) and link reliability or diversity (reduced fading).

With m antennas at the TX and n at the RX this is an m x n MIMO, and the number of channels at once is the total of all combinations: for example, a 2 x 2 MIMO could have 4 "channels" (1-1, 1-2, 2-1, 2-2). Of the 4 "channels" you can only send 2x the information, as you need to be able to 'solve' the channel matrix to extract the information, and performance could be double the Shannon limit of a Single Input Single Output (SISO) system. In reality, the channels are not wholly independent (there is some correlation) and so the advantage drops. Indeed, paradoxically, the worse the channel (more multi-path etc) the more MIMO can help, as the channels are indeed less correlated; in free space the 4 channels would all be so similar that the benefits would be limited.

There are several different forms in which MIMO can be used. To take WiMAX downlink as an example, there are two standard MIMO modes: Matrix A and Matrix B. The former, otherwise called Space Time Coding (STC), sends the same signal in two different forms out of the two transmit antennas. Because the same symbol is sent, the data rate doesn't increase over SISO, but because the two forms (s and -s*) are different, the receiver has much better chances of recovering that data, and hence robustness and range (for a given data rate) are improved. Matrix B, by contrast, transmits two different symbols to get double data rate, rather than duplicating it. In practice, real systems support both and select Matrix A or B on a per user basis: sending data faster to those with better conditions, and using STC to benefit those at the cell edge.

Implementing this in the downlink is relatively straightforward. As shown in Figure 4, there are two burst chains (for the two antennas). In Matrix A, the symbol rate block is not affected (only the one symbol is sent), but there are now two burst chains to feed two antennas with the differently modulated forms of the information. In Matrix B, each burst chain operates on separate symbols, and the symbol rate section will be designed rather faster and then send output alternately to the two TX branches.

This particular diagram is actually slightly more complex: in reality, many systems combine MIMO with spatial techniques such as beam forming, "null steering" or Space Division Multiple Access. This particular design has eight antennas, configured as four per MIMO branch, each with an individual weight for steering.

This aligns well to a multicore architecture, with two independent burst chains feeding two antennas: the same architecture is simply instantiated twice, which is very simple for the engineer. However, at the receiver the signal processing is considerably more complex, both because of the higher peak data rates of Matrix B and because of the extra complexity of the receiver to distinguish the different signals.

Air interfaces are becoming more sophisticated, and rely on ever more complex algorithms to optimize performance, efficiency and range. OFDMA, based around an FFT, has become the standard for next-generation wireless. Indeed, the newest techniques, such as LTE, seek to improve on this with more complex techniques such as SC-FDMA and the consequent requirement for flexible DFT techniques. Extending this architecture to support MIMO as well, an appropriate architecture can implement, from one common platform, a whole range of standards such as WiMAX 16d and 16e, LTE, as well as next generation PHS and UMB.

It is possible to use software-programmable architectures to emulate the advantages and flexibility of hardware-oriented tradeoffs, enabling systems manufacturers an early entrance to markets that call for these algorithms, such as WiMAX and LTE. They can thus bring products to market ahead of competitors yet still be able to ensure compliance with standards as they are ratified.
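
The Matrix A (STC) idea described in the MIMO section, sending the same symbols in two different forms from two antennas, corresponds to the Alamouti space-time block code. The sketch below is a hedged illustration in plain Python, not the burst-chain implementation of Figure 4: it assumes a single receive antenna, a flat channel that stays constant over two symbol periods, and perfect channel knowledge at the receiver. All function names are hypothetical.

```python
def alamouti_encode(s1, s2):
    """Return the two-symbol-period sequences for antenna 1 and antenna 2.

    Period 1 sends (s1, s2); period 2 sends (-conj(s2), conj(s1)), so each
    symbol leaves both antennas, once directly and once in conjugated form.
    """
    antenna1 = [s1, -s2.conjugate()]
    antenna2 = [s2, s1.conjugate()]
    return antenna1, antenna2


def alamouti_receive(antenna1, antenna2, h1, h2, noise=(0j, 0j)):
    """Signal at one receive antenna for channel gains h1 and h2."""
    r1 = h1 * antenna1[0] + h2 * antenna2[0] + noise[0]
    r2 = h1 * antenna1[1] + h2 * antenna2[1] + noise[1]
    return r1, r2


def alamouti_combine(r1, r2, h1, h2):
    """Linear combining that separates s1 and s2 again.

    The cross terms cancel, and each estimate is scaled by the total channel
    power |h1|^2 + |h2|^2, which is where the diversity benefit comes from.
    """
    gain = abs(h1) ** 2 + abs(h2) ** 2
    s1_hat = (h1.conjugate() * r1 + h2 * r2.conjugate()) / gain
    s2_hat = (h2.conjugate() * r1 - h1 * r2.conjugate()) / gain
    return s1_hat, s2_hat


if __name__ == "__main__":
    s1, s2 = complex(1, 1), complex(-1, 1)           # two QPSK symbols
    h1, h2 = complex(0.8, -0.3), complex(-0.2, 0.9)  # assumed channel gains
    tx1, tx2 = alamouti_encode(s1, s2)
    r1, r2 = alamouti_receive(tx1, tx2, h1, h2)
    print(alamouti_combine(r1, r2, h1, h2))
```

Two symbols still take two symbol periods, which matches the statement that the data rate does not increase over SISO; the gain is in robustness, because a deep fade on one of the two paths no longer wipes out the symbol.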
