
# A Tutorial on MIMO

Craig Wilson

EE381K-11: Wireless Communications

Spring 2009

May 9, 2009

Contents

1 Introduction
2 Benefits of MIMO
2.1 Diversity
2.1.1 Union Bound on Probability of Error
2.1.2 Outage Probability
2.2 Spatial Multiplexing
3 Basic Schemes for Multiple Antennas
3.1 Channel Models
3.2 Scalar Rayleigh Channel
3.3 Maximal Ratio Combining
3.4 Selection Combining
3.5 Equal Gain Combining
3.6 Transmit Maximal Ratio Combining
3.7 Alamouti Code
4 MIMO Channel Modeling and Capacity
4.1 Narrowband MIMO Channel
4.1.1 Narrowband MIMO Channel Capacity
4.1.2 Rank and Condition Number
4.2 Physical Modeling of MIMO Channels
4.2.1 LOS SIMO and MISO Channel
4.2.2 LOS MIMO
4.2.3 Geographically Separated MIMO
4.2.4 Two-Ray MIMO
4.3 Statistical Modeling of MIMO Channels
4.3.1 Frequency Selective MIMO Channel
5 Diversity-Multiplexing Tradeoff
5.1 Scalar Rayleigh Channel
5.1.1 QAM over the Scalar Rayleigh Channel
5.2 MISO Rayleigh Channel
5.3 MIMO Rayleigh Channel
6 Space-Time Coding over Narrowband Channels
6.1 Error Motivated Design
6.2 Space-Time Block Codes
6.2.1 Linear STBCs
6.3 Bell Labs Space Time Architectures
6.3.1 V-BLAST
6.3.2 D-BLAST
6.4 Space-Time Trellis Codes
6.4.1 Trellis Representation
6.4.2 Delay-Diversity Scheme
7 Space-Time Coding for Frequency Selective Channels
7.1 Single Carrier
7.2 MIMO-OFDM
7.2.1 OFDM
7.2.2 Extension to MIMO-OFDM
7.2.3 Space-Frequency Coded MIMO-OFDM
7.2.4 Space-Time Coded MIMO-OFDM
7.2.5 Space-Time Frequency Coded MIMO-OFDM
8 Multiuser MIMO
8.1 Precoding
8.1.1 Linear Precoding
8.1.2 Nonlinear Precoding
8.2 Scheduling
8.3 Working with Partial CSIT
9 MIMO in Wireless Standards
9.1 3GPP LTE
9.2 WiMAX
9.3 802.11n
10 Conclusion
A Math Review
A.1 Rank
A.2 Eigenvalues and Eigenvectors
A.2.1 Diagonalization
A.2.2 Connection To The Determinant and Trace
A.3 Inner Product Space
A.4 Singular Value Decomposition
A.4.1 Pseudoinverse
A.4.2 Condition Number
A.5 Lagrange Multipliers
References

1 Introduction

Wireless systems face several challenges, including demands for higher data rates, better quality of service, and increased network capacity, all while working with limited amounts of spectrum. Multiple Input Multiple Output (MIMO) wireless communication systems have become a hot research topic because they promise to address all of these issues by providing both increased resilience to fading and increased capacity without using more bandwidth or power.

Methods to take advantage of multiple antennas at the receiver or the transmitter have been known since the 1950s. Early methods provided spatial diversity to improve error performance and beamforming to increase SNR by focusing the energy from an antenna into a desired direction. In the 1990s MIMO systems with multiple antennas at both the transmitter and receiver were proposed. Instead of just using diversity to combat fading, MIMO systems actively take advantage of multipath. One of the early seminal works in MIMO was Telatar's paper, which demonstrated the potential for improved capacity with no extra spectrum [Tel95]. Around the same time Bell Labs developed the BLAST architectures, which achieved high spectral efficiency on the order of 10-20 bits/s/Hz [Fos96]. Also around the same time the first space-time coding methods were proposed [TSC98]. In the 2000s MIMO has continued to be developed, and there are now plans to implement MIMO in several new wireless standards such as 802.11n, WiMAX, and LTE.

This tutorial paper focuses on the following major topics in MIMO:

1. MIMO Channel Modeling and Capacity

2. Diversity-Multiplexing Tradeoﬀ

3. Space-Time Coding and Architectures

4. Space-Time Coding in Frequency Selective Channels

5. Multi-User MIMO and Applications

6. MIMO in Wireless Standards

2 Beneﬁts of MIMO

The two major benefits of MIMO are diversity gain, which is increased resilience to fading in the form of better error performance, and multiplexing gain, which is an increased rate of transmission obtained by exploiting the additional degrees of freedom offered by the spatial MIMO channel. The figure below shows a simple MIMO setup with $n_t$ transmit antennas and $n_r$ receive antennas.

Figure 1: MIMO System Concept [Gold05]

2.1 Diversity

Diversity is an attempt to exploit redundancies in the way information is sent to achieve better error performance by cleverly using multiple copies of the same signal. Three fundamental types of diversity are time, frequency, and antenna diversity. Time diversity involves averaging the fading effects of the channel over time. The simplest example is the repetition code, which transmits the same symbol multiple times with the transmissions separated by more than the coherence time of the channel. The receiver decodes each symbol independently and estimates the transmitted symbol by majority rule. Frequency diversity exploits the variations in a frequency selective channel. For example, orthogonal frequency division multiplexing (OFDM) can adapt the modulation order on each subcarrier to the quality of that subchannel. Finally, there are several different types of antenna diversity. The most obvious type is to simply use multiple antennas; this type is one of the main focuses of this paper. The second type is to use multiple antennas with different polarizations. The third type is to use multiple antennas with different non-overlapping beam patterns.

It is generally of interest to quantify exactly how much diversity a given scheme provides. This can be done by calculating either the average probability of error or the outage probability. At high SNR both of these quantities can generally be approximated as

$$\mathrm{SNR}^{-L} \tag{1}$$

where $L$ is the diversity gain. The diversity gain can be defined more rigorously as

$$L = -\lim_{\mathrm{SNR}\to\infty} \frac{\log(P_e)}{\log(\mathrm{SNR})}$$

This is just a formalization of the intuition above that replaces "at high SNR" with a limit. For $n_t \times n_r$ narrowband MIMO the maximum possible diversity gain is $n_t n_r$, which is the maximum number of independent copies of the same signal that the receiver sees.

2.1.1 Union Bound on Probability of Error

It can be difficult to calculate an exact expression for the probability of error for an arbitrary modulation, so it is useful to calculate an upper bound instead. Consider an arbitrary constellation $\mathcal{C}$ containing $M$ points, written as

$$\mathcal{C} = \{c_1, c_2, \ldots, c_M\} \tag{2}$$

Let $P_e$ be the probability of symbol error, and let $P_{e|c_i}$ be the probability of symbol error given that $c_i \in \mathcal{C}$ was sent. Assuming all symbols are equally likely,

$$P_e = \frac{1}{M}\sum_{m=1}^{M} P_{e|c_m} \tag{3}$$

The conditional probability of symbol error can be expanded as

$$\begin{aligned} P_{e|c_m} &= P[c_m \text{ is detected incorrectly} \mid c_m \text{ was transmitted}] \\ &= \sum_{\substack{l=1 \\ l \neq m}}^{M} P[c_m \text{ is estimated as } c_l \mid c_m \text{ was transmitted}] \end{aligned}$$

Computing each of these probabilities is difficult and requires integration over a possibly complicated Voronoi region specific to each type of modulation. A simplifying approximation is the pairwise error probability (PEP), in which it is assumed for the purposes of calculation that only $c_m$ and $c_l$ are in the constellation. This is denoted $P[c_m \to c_l]$. For complex AWGN

$$P[c_m \to c_l] = Q\left(\sqrt{\frac{E_x}{N_o}\,\frac{\|c_m - c_l\|_2^2}{2}}\right) \tag{4}$$

The PEP overestimates the probability of decoding $c_m$ as $c_l$, so, writing the symbol energy $E_x$ as the transmit power $P$ from here on,

$$P_{e|c_m} \le \sum_{\substack{l=1 \\ l \neq m}}^{M} Q\left(\sqrt{\frac{P}{N_o}\,\frac{\|c_m - c_l\|_2^2}{2}}\right) \tag{5}$$

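Let $d_{\min}^2$ be the square of the minimum distance between points in the constellation $\mathcal{C}$. Then

$$\begin{aligned} P_e &\le \frac{1}{M}\sum_{m=1}^{M}\sum_{\substack{l=1 \\ l \neq m}}^{M} Q\left(\sqrt{\frac{P}{N_o}\,\frac{\|c_m - c_l\|_2^2}{2}}\right) \\ &\le \frac{1}{M}\sum_{m=1}^{M}\sum_{\substack{l=1 \\ l \neq m}}^{M} Q\left(\sqrt{\frac{P}{N_o}\,\frac{d_{\min}^2}{2}}\right) \\ &= \frac{1}{M}\sum_{m=1}^{M}(M-1)\,Q\left(\sqrt{\frac{P}{N_o}\,\frac{d_{\min}^2}{2}}\right) \\ &= (M-1)\,Q\left(\sqrt{\frac{P}{N_o}\,\frac{d_{\min}^2}{2}}\right) \end{aligned} \tag{6}$$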

The Chernoff bound on the Q-function is

$$Q(x) \le \frac{1}{2}e^{-x^2/2} \tag{7}$$

so the probability of error can be bounded as

$$P_e \le \frac{M-1}{2}\,e^{-\frac{P}{N_o}\frac{d_{\min}^2}{4}} \tag{8}$$

This bound is very useful in calculating the diversity gain for simple multiple antenna systems.
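As a rough numerical illustration of bounds (6) and (8), the following sketch evaluates both for a few SNR values. The constellation parameters (unit-energy QPSK with $M = 4$ and $d_{\min}^2 = 2$) and the function names are assumptions chosen for the example, not values taken from this paper.

```python
import math

def q_function(x):
    """Gaussian tail probability Q(x) = P[N(0,1) > x]."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))

def union_bound(M, snr, d_min_sq):
    """Bound (6): (M - 1) * Q(sqrt(snr * d_min^2 / 2)), with snr = P / N_o."""
    return (M - 1) * q_function(math.sqrt(snr * d_min_sq / 2.0))

def chernoff_bound(M, snr, d_min_sq):
    """Looser exponential bound (8): (M - 1)/2 * exp(-snr * d_min^2 / 4)."""
    return (M - 1) / 2.0 * math.exp(-snr * d_min_sq / 4.0)

# Assumed example: unit-energy QPSK has M = 4 and d_min^2 = 2.
for snr_db in (0, 5, 10):
    snr = 10 ** (snr_db / 10)
    print(snr_db, union_bound(4, snr, 2.0), chernoff_bound(4, snr, 2.0))
```

The exponential form is looser than the Q-function form at every SNR, but it is the one that can be averaged over the fading distribution in closed form.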

2.1.2 Outage Probability

Formally, the channel is in outage if the rate of transmission exceeds the channel capacity. The outage probability is the probability that this situation occurs: $P[C < R]$. From expressions for the outage probability one can also find the diversity gain in a manner similar to the average probability of error method. For the typical Gaussian memoryless channel the channel capacity is $C = B\log_2(1 + \mathrm{SNR})$. Thus

$$P[B\log_2(1 + \mathrm{SNR}) < R] = P\left[\mathrm{SNR} < 2^{R/B} - 1\right]$$

For other channels the outage condition generally also reduces to the SNR being below a certain threshold, so the diversity gain can generally be found from the probability

$$P[\mathrm{SNR} < \gamma]$$
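To make the link between outage probability and diversity gain concrete, here is a minimal sketch for one assumed case: a scalar Rayleigh channel, where the instantaneous SNR is exponentially distributed around its average. The function names and the numerical values are illustrative assumptions.

```python
import math

def outage_probability(rate_per_hz, avg_snr):
    """P[SNR < 2^(R/B) - 1] when SNR is exponential with mean avg_snr (assumed Rayleigh fading)."""
    threshold = 2.0 ** rate_per_hz - 1.0
    # 1 - exp(-x), computed with expm1 for accuracy at small x
    return -math.expm1(-threshold / avg_snr)

def diversity_estimate(rate_per_hz, snr_lo, snr_hi):
    """Negative slope of log(P_out) against log(SNR) between two average SNR values."""
    p_lo = outage_probability(rate_per_hz, snr_lo)
    p_hi = outage_probability(rate_per_hz, snr_hi)
    return -(math.log(p_hi) - math.log(p_lo)) / (math.log(snr_hi) - math.log(snr_lo))

print(diversity_estimate(1.0, 1e3, 1e4))  # close to 1 for a single Rayleigh branch
```

Evaluating the slope between two large SNR values is a finite-difference stand-in for the limit in the definition of $L$.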

2.2 Spatial Multiplexing

Besides providing diversity gain and improved error performance, MIMO can also provide increased data rates and spectral efficiency through spatial multiplexing. To see how MIMO achieves this, first consider QAM. The transmitted signal can be expressed as

$$x_n(t) = a_n(t)\cos(2\pi f_c t) - b_n(t)\sin(2\pi f_c t) \tag{9}$$

assuming the appropriate normalizations have been made. This system has two real degrees of freedom (one complex degree of freedom) because independent streams of bits could be transmitted on the cosine and sine terms. In practice, however, the two independent streams usually come from one original stream, which is demultiplexed by the symbol mapping operation into two streams that are placed on the cosine and sine terms.

Fundamentally, MIMO provides increased rates in a similar way by providing even more degrees of freedom. The degrees of freedom in MIMO come from the multiple antennas transmitting independent streams. In the $4 \times 4$ MIMO case, for example, each transmit antenna can send an independent stream, which will be received by all four receive antennas simultaneously, so there are four degrees of freedom. The maximum possible number of complex degrees of freedom for MIMO is $\min\{n_t, n_r\}$.

3 Basic Schemes for Multiple Antennas

Now consider a few basic multiple antenna schemes that can provide diversity. The major assumptions underlying these schemes are whether the receiver has channel state information (CSIR) or the transmitter has channel state information (CSIT). CSIR is a common assumption and can be achieved through several estimation methods. CSIT is trickier to achieve: in FDD the receiver must estimate the channel and feed the estimate back to the transmitter through a feedback channel, while in TDD the transmitter must assume that it sees the same channel as the receiver. Feedback entails a cost in terms of lost capacity and bandwidth.

3.1 Channel Models

The channel model for these basic MIMO schemes is a simple extension of the scalar Rayleigh channel. The channels are now modeled as complex Gaussian vectors with a $\mathcal{CN}(0, I)$ distribution. This channel model is justified in terms of physical propagation models in the next section.

3.2 Scalar Rayleigh Channel

For comparison consider the scalar Rayleigh channel

$$y[n] = h\,x[n] + v[n] \tag{10}$$

with $h \sim \mathcal{CN}(0, 1)$ and $v[n] \sim \mathcal{CN}(0, N_o)$.

For a complex Gaussian vector $\mathbf{h}$ of length $m$ with correlation matrix $R$, the pdf is given by

$$f_{\mathbf{h}} = \frac{1}{\pi^m\,|\det R|}\,e^{-\mathbf{h}^* R^{-1}\mathbf{h}} \tag{11}$$

Also, by the definition of the pdf,

$$\int\!\!\int\!\cdots\!\int \frac{1}{\pi^m\,|\det R|}\,e^{-\mathbf{h}^* R^{-1}\mathbf{h}}\,d\mathbf{h} = 1 \tag{12}$$

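Then the average probability of error for the Rayleigh channel is

$$\begin{aligned} P_e &\le E\left[\frac{M-1}{2}\,e^{-h^*h\frac{P}{N_o}\frac{d_{\min}^2}{4}}\right] \\ &= \frac{M-1}{2}\int \frac{1}{\pi}\,e^{-h^*h\frac{P}{N_o}\frac{d_{\min}^2}{4}}\,e^{-h^*h}\,dh \\ &= \frac{M-1}{2}\int \frac{1}{\pi}\,e^{-h^*\left[\left(1+\frac{P}{N_o}\frac{d_{\min}^2}{4}\right)^{-1}\right]^{-1}h}\,dh \\ &= \frac{M-1}{2}\left(1+\frac{P}{N_o}\frac{d_{\min}^2}{4}\right)^{-1}\int \frac{1}{\pi\left(1+\frac{P}{N_o}\frac{d_{\min}^2}{4}\right)^{-1}}\,e^{-h^*\left[\left(1+\frac{P}{N_o}\frac{d_{\min}^2}{4}\right)^{-1}\right]^{-1}h}\,dh \\ &= \frac{M-1}{2\left(1+\frac{P}{N_o}\frac{d_{\min}^2}{4}\right)} \end{aligned} \tag{13}$$

The last step uses (12) with $m = 1$ and $R = \left(1+\frac{P}{N_o}\frac{d_{\min}^2}{4}\right)^{-1}$, so the remaining integral equals 1. At high SNR

$$P_e \approx \mathrm{SNR}^{-1} \tag{14}$$

This corresponds to a diversity gain of 1, which is to be expected as there is only one copy of the signal.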

3.3 Maximal Ratio Combining

Consider a system with a single transmit antenna and $n_r$ receive antennas: the Single-In Multiple-Out (SIMO) case. This system can be modeled as

$$\mathbf{y}[n] = \mathbf{h}\,x[n] + \mathbf{v}[n] \tag{15}$$

This model is basically an extension of the scalar Rayleigh channel to the vector i.i.d. Rayleigh channel. The receiver must take the received parallel signal and estimate the transmitted symbol. In maximal ratio combining (MRC) this is done with a weighted summation of the received branches performed by a complex vector $\mathbf{q}$:

$$z[n] = \mathbf{q}\mathbf{h}\,x[n] + \mathbf{q}\mathbf{v}[n] \tag{16}$$

In this case the SNR can be calculated and bounded with the Cauchy-Schwarz inequality:

$$\mathrm{SNR} = \frac{|\mathbf{q}\mathbf{h}|^2 P}{\|\mathbf{q}\|^2 N_o} \le \frac{\|\mathbf{q}\|^2\|\mathbf{h}\|^2 P}{\|\mathbf{q}\|^2 N_o} = \frac{\|\mathbf{h}\|^2 P}{N_o} \tag{17}$$

Thus the optimal choice for $\mathbf{q}$ is $\mathbf{q} = \mathbf{h}^*$, which achieves the maximum SNR [Kah54]. This is effectively a matched filter. Multiplying by the conjugate of the channel co-phases the signals and then weights the branches by the channel amplitude [Rapp02]. This kind of action is similar to the RAKE receiver for CDMA. Now, to calculate the diversity gain of MRC through the average probability of error with the union upper bound, consider:

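$$P_e \le E\left[\frac{M-1}{2}\,e^{-\|\mathbf{h}\|^2\frac{P}{N_o}\frac{d_{\min}^2}{4}}\right] = E\left[\frac{M-1}{2}\,e^{-\mathbf{h}^*\mathbf{h}\frac{P}{N_o}\frac{d_{\min}^2}{4}}\right]$$

Then, since $R = I$,

$$\begin{aligned} P_e &\le \frac{M-1}{2}\int\!\!\int\!\cdots\!\int \frac{1}{\pi^{n_r}}\,e^{-\mathbf{h}^*\mathbf{h}\frac{P}{N_o}\frac{d_{\min}^2}{4}}\,e^{-\mathbf{h}^*\mathbf{h}}\,d\mathbf{h} \\ &= \frac{M-1}{2}\left(\frac{P}{4N_o}d_{\min}^2+1\right)^{-n_r}\int\!\!\int\!\cdots\!\int \frac{1}{\pi^{n_r}\left(\frac{P}{4N_o}d_{\min}^2+1\right)^{-n_r}}\,e^{-\mathbf{h}^*\left[\left(\frac{P}{4N_o}d_{\min}^2+1\right)^{-1}I\right]^{-1}\mathbf{h}}\,d\mathbf{h} \\ &= \frac{M-1}{2\left(\frac{P}{4N_o}d_{\min}^2+1\right)^{n_r}} \end{aligned} \tag{18}$$

Then at high SNR

$$P_e \approx \mathrm{SNR}^{-n_r} \tag{19}$$

From this calculation it is evident that the diversity gain is $n_r$, the number of receive antennas and also the number of copies of the symbol that the receiver sees.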
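The closed form in (18) can be sanity-checked numerically. The sketch below is a Monte Carlo estimate under the same i.i.d. Rayleigh assumption, using the fact that $|h_i|^2$ is exponentially distributed with mean 1 when $h_i \sim \mathcal{CN}(0,1)$. The shorthand `a` for $\frac{P}{4N_o}d_{\min}^2$, the function names, the trial count, and the seed are all illustrative choices.

```python
import math
import random

def average_chernoff_factor(n_r, a, trials=20000, seed=0):
    """Monte Carlo estimate of E[exp(-a * ||h||^2)] for h ~ CN(0, I) of length n_r."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        # each |h_i|^2 is exponential with mean 1 when h_i ~ CN(0, 1)
        norm_sq = sum(rng.expovariate(1.0) for _ in range(n_r))
        total += math.exp(-a * norm_sq)
    return total / trials

def closed_form(n_r, a):
    """The factor (a + 1)^(-n_r) that appears in (18)."""
    return (1.0 + a) ** (-n_r)

for n_r in (1, 2, 4):
    print(n_r, average_chernoff_factor(n_r, 1.0), closed_form(n_r, 1.0))
```

With `n_r = 1` the same check covers the scalar Rayleigh result (13); increasing `n_r` steepens the decay with SNR, which is the diversity gain.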

3.4 Selection Combining

This method has the same general setup as MRC, but the receiver selects the single receive antenna with the largest $|h_i|$ instead of combining the signals from all antennas [Jak71]. This method can achieve the same diversity gain as MRC, namely $n_r$. Assuming each branch has amplitude $s_k$, then as outlined in [SA00]

$$P[\max\{s_1, s_2, \ldots, s_{n_r}\} < S] = \left(1 - e^{-S^2}\right)^{n_r} \tag{20}$$

So the pdf of $s_{\max}$ is given by

$$p_{s_{\max}}(S) = n_r\,2S\,e^{-S^2}\left(1 - e^{-S^2}\right)^{n_r-1} \tag{21}$$

Then the average received SNR is

$$\overline{\mathrm{SNR}} = P\sum_{i=1}^{n_r}\frac{1}{i} \tag{22}$$

This equation shows that increasing the number of receive antennas provides a diminishing return: the $i$th antenna adds only $1/i$ to the sum. Most of the gain comes from going from one receive antenna to two and three receive antennas.
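The diminishing return in (22) is easy to tabulate. The short sketch below reads the sum in (22) as the average SNR gain relative to a single branch; the function name and the range of antenna counts are illustrative.

```python
import math

def selection_gain(n_r):
    """Average SNR gain of selection combining over one branch: sum_{i=1}^{n_r} 1/i."""
    return sum(1.0 / i for i in range(1, n_r + 1))

for n_r in range(1, 7):
    gain = selection_gain(n_r)
    # columns: antennas, linear gain, gain in dB
    print(n_r, round(gain, 3), round(10 * math.log10(gain), 2))
```

Each additional antenna contributes a smaller increment than the previous one, which matches the observation above.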

3.5 Equal Gain Combining

The branches from each antenna are first co-phased to cancel out the phase rotation of the channel and are then simply added together to produce the output. If the channel tap is $h_i = \alpha_i e^{j\theta_i}$, then the co-phasing operation is simply a multiplication of each branch by $e^{-j\theta_i}$. Equal gain combining produces performance similar to MRC and achieves the full diversity gain, as demonstrated in [Yac93], but with a 1-3 dB penalty depending on the exact setup and number of antennas.

3.6 Transmit Maximal Ratio Combining

We have considered the SIMO case, and now it is time to consider the Multiple-In Single-Out (MISO) case with $n_t$ transmit antennas. With multiple transmit antennas it is important to keep the total transmit power $P$ constant to allow a fair comparison to the cases with only one transmit antenna. The question now is whether any diversity gain can be achieved and, if so, how. A first attempt to achieve diversity gain with multiple transmit antennas is to simply transmit the same symbol on each branch. However, this method does not achieve any diversity gain. For an intuitive explanation, consider the effective channel that the receiver sees:

$$\frac{h_1 + h_2 + \cdots + h_{n_t}}{\sqrt{n_t}} \sim \mathcal{CN}(0, 1) \tag{23}$$

The factor $\frac{1}{\sqrt{n_t}}$ normalizes the transmit power. This effective channel behaves like a scalar Rayleigh channel, so it provides no diversity gain beyond the scalar Rayleigh channel.

An approach that does work is Transmit Maximal Ratio Combining (TMRC), which requires CSIT and is a close analog of MRC. The system can be modeled as

$$y = \mathbf{h}\mathbf{x} + v \tag{24}$$

A weighting vector $\mathbf{q}$ sends a weighted version of the current symbol $x$ to each antenna, so that

$$y = \mathbf{h}\mathbf{q}\,x + v \tag{25}$$

Then by a derivation similar to MRC it can be shown that the optimal choice for $\mathbf{q}$ is $\mathbf{q} = \mathbf{h}^*$, up to the scaling that keeps the total transmit power at $P$ [God97]. The diversity gain for this scheme is $n_t$.

3.7 Alamouti Code

Using TMRC requires CSIT, which entails a host of other problems including delay issues and channel estimation accuracy issues. However, Alamouti showed in [Ala98] that in the $2 \times n_r$ case it is possible to achieve the full diversity, $2n_r$, without CSIT using a clever transmit scheme with minimal drawbacks. To get the idea of the Alamouti code consider the $2 \times 1$ case. For a narrowband channel the system can be modeled as

$$y[n] = h_1 x_1[n] + h_2 x_2[n] + v[n] \tag{26}$$

with $h_1$ and $h_2$ the channel coefficients. To transmit two symbols $u_1$ and $u_2$, do the following over two symbol times:

1. During the first symbol time send $x_1[n] = u_1$ and $x_2[n] = u_2$.

2. During the second symbol time send $x_1[n+1] = -u_2^*$ and $x_2[n+1] = u_1^*$.

The system can then be written in matrix form as

$$\begin{bmatrix} y[n] & y[n+1] \end{bmatrix} = \begin{bmatrix} h_1 & h_2 \end{bmatrix}\begin{bmatrix} u_1 & -u_2^* \\ u_2 & u_1^* \end{bmatrix} + \begin{bmatrix} v[n] & v[n+1] \end{bmatrix} \tag{27}$$

The receiver is trying to detect $u_1$ and $u_2$, so it is more convenient to write the system in the following form, obtained by conjugating $y[n+1]$:

$$\begin{bmatrix} y[n] \\ y^*[n+1] \end{bmatrix} = \begin{bmatrix} h_1 & h_2 \\ h_2^* & -h_1^* \end{bmatrix}\begin{bmatrix} u_1 \\ u_2 \end{bmatrix} + \begin{bmatrix} v[n] \\ v^*[n+1] \end{bmatrix} \tag{28}$$

The two columns of the square matrix are orthogonal:

$$\begin{bmatrix} h_1^* & h_2 \\ h_2^* & -h_1 \end{bmatrix}\begin{bmatrix} h_1 & h_2 \\ h_2^* & -h_1^* \end{bmatrix} = \begin{bmatrix} |h_1|^2 + |h_2|^2 & 0 \\ 0 & |h_1|^2 + |h_2|^2 \end{bmatrix}$$

Thus this detection problem can be decomposed into simple scalar detection problems by projecting the received vector $\mathbf{y}$ onto each (normalized) column of the $H$ matrix. The received signal used to detect each symbol is then

$$r_i = \|\mathbf{h}\|_2\,u_i + \tilde{v}_i \tag{29}$$

with $\tilde{v}_i \sim \mathcal{CN}(0, N_o)$. In detection the vector channel is decomposed into a scalar channel for each symbol. The Alamouti code is representative of a larger class of codes called orthogonal space-time block codes (O-STBCs) that also have easy detection due to orthogonality.

It can be shown that the diversity gain is 2. Since each symbol is transmitted twice, the transmit power of each antenna must be reduced by 3 dB compared to the single antenna case to normalize the total power. This power loss hurts detection but not so much as to make the Alamouti code useless. In fact, there are some advantages to using antennas transmitting with lower power: at lower power it is easier to find cheap amplifiers that can operate in the linear region. Also, the Alamouti code transmits two symbols over two symbol periods, so its effective rate of transmission is the same as the original symbol rate.

Finally, the Alamouti code can be extended to the full $2 \times n_r$ case by using the same transmission scheme as the $2 \times 1$ case together with MRC at the receiver. This method provides the full $2n_r$ diversity gain.
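The $2 \times 1$ Alamouti scheme described above can be traced end to end in a few lines. This is a minimal sketch with hypothetical function names: it encodes two symbols as in the two-step transmission rule, applies the channel model (26), and combines using the orthogonal columns of the matrix in (28). Dividing by $|h_1|^2 + |h_2|^2$ is a convenience of this sketch so that the noiseless outputs equal the transmitted symbols.

```python
def alamouti_encode(u1, u2):
    """Symbols sent on (antenna 1, antenna 2) during the first and second symbol times."""
    return (u1, u2), (-u2.conjugate(), u1.conjugate())

def alamouti_receive(h1, h2, u1, u2, v0=0j, v1=0j):
    """Narrowband 2x1 channel (26) applied to both symbol times, with optional noise."""
    (x1a, x2a), (x1b, x2b) = alamouti_encode(u1, u2)
    return h1 * x1a + h2 * x2a + v0, h1 * x1b + h2 * x2b + v1

def alamouti_combine(h1, h2, y0, y1):
    """Project [y[n], y*[n+1]] onto the two columns of the matrix in (28)."""
    gain = abs(h1) ** 2 + abs(h2) ** 2
    r1 = h1.conjugate() * y0 + h2 * y1.conjugate()
    r2 = h2.conjugate() * y0 - h1 * y1.conjugate()
    return r1 / gain, r2 / gain

# Illustrative channel taps and QPSK-like symbols (assumed values).
h1, h2 = 0.3 + 0.4j, -1.2 + 0.5j
y0, y1 = alamouti_receive(h1, h2, 1 + 1j, -1 + 1j)
print(alamouti_combine(h1, h2, y0, y1))
```

Because the cross terms cancel exactly, each symbol is detected independently with effective channel gain $|h_1|^2 + |h_2|^2$ and no knowledge of the channel is needed at the transmitter.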

Figure 2: Comparison of Alamouti and MRC Error Performance [OC06]

4 MIMO Channel Modeling and Capacity

In this section we will consider several MIMO channels and the physical meaning behind these

channels. Of particular interest is how the structure of a MIMO channel suggests the gains of

MIMO. For all MIMO channels we will assume the rate of transmission is high enough that the

channel will be slow fading, which is a reasonable assumption in any modern high speed wireless

system.


4.1 Narrowband MIMO Channel

First, consider the narrowband MIMO channel in which the channel is modeled as a single complex coefficient $h_{ij}$ between the $j$th transmit antenna and the $i$th receive antenna [Gold05]. In this case the system can be modeled with matrices as $\mathbf{y} = H\mathbf{x} + \mathbf{v}$. This can be written as

$$\begin{bmatrix} y_1 \\ \vdots \\ y_{n_r} \end{bmatrix} = \begin{bmatrix} h_{11} & \cdots & h_{1n_t} \\ \vdots & \ddots & \vdots \\ h_{n_r 1} & \cdots & h_{n_r n_t} \end{bmatrix}\begin{bmatrix} x_1 \\ \vdots \\ x_{n_t} \end{bmatrix} + \begin{bmatrix} v_1 \\ \vdots \\ v_{n_r} \end{bmatrix} \tag{30}$$

with $v_i \sim \mathcal{CN}(0, N_o)$. This is a nice mathematical formulation, but it offers little insight into what constitutes desirable properties for $H$. The singular value decomposition (SVD), however, can provide the desired insight. The SVD of $H$ is

$$H = U\Sigma V^* \tag{31}$$

with $U \in \mathbb{C}^{n_r \times n_r}$, $V \in \mathbb{C}^{n_t \times n_t}$, and $\Sigma \in \mathbb{R}^{n_r \times n_t}$. Both $U$ and $V$ are unitary, which means

$$UU^* = U^*U = I_{n_r} \tag{32}$$

$$VV^* = V^*V = I_{n_t} \tag{33}$$

Then the system becomes

$$\mathbf{y} = (U\Sigma V^*)\mathbf{x} + \mathbf{v} \tag{34}$$

Define $\tilde{\mathbf{y}} = U^*\mathbf{y}$, $\tilde{\mathbf{x}} = V^*\mathbf{x}$, and $\tilde{\mathbf{v}} = U^*\mathbf{v}$. Then

$$\tilde{\mathbf{y}} = \Sigma\tilde{\mathbf{x}} + \tilde{\mathbf{v}} \tag{35}$$

with $\tilde{\mathbf{v}} \sim \mathcal{CN}(0, N_o I_{n_r})$, since $U^*$ is unitary. Let $n_{\min} = \min\{n_r, n_t\}$. The matrix $\Sigma$ is zero except on the diagonal, where $\Sigma_{ii} = \sigma_i$ is the $i$th singular value of $H$. In addition, by convention, $\sigma_1 \ge \sigma_2 \ge \cdots \ge \sigma_{n_{\min}}$. This coordinate change transforms the complicated system described by $H$ into the simple system with independent parallel channels described by $\Sigma$.

4.1.1 Narrowband MIMO Channel Capacity

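Since the MIMO channel has been decomposed into several parallel channels, the capacity is easy to compute. Assuming CSIR and CSIT, the capacity that a MIMO system can support in this case is

$$C_{\text{sum}} = \sum_{i=1}^{n_{\min}} B\log_2\left(1 + \frac{P_i\sigma_i^2}{N_o}\right) \tag{36}$$

as demonstrated in [CT91, CKT98, Tel95]. The power allocation $P_i$ is chosen to maximize $C_{\text{sum}}$ subject to the constraint $\sum_{i=1}^{n_{\min}} P_i = P$. Lagrange multipliers can be used to compute the optimal power allocation:

$$\begin{aligned} \frac{\partial}{\partial P_i}\left[\sum_{j=1}^{n_{\min}} B\log_2\left(1 + \frac{P_j\sigma_j^2}{N_o}\right)\right] &= \lambda\,\frac{\partial}{\partial P_i}\left[\sum_{j=1}^{n_{\min}} P_j\right] \\ \frac{B\sigma_i^2}{(P_i\sigma_i^2 + N_o)\ln 2} &= \lambda \\ P_i &= \frac{B}{\lambda\ln 2} - \frac{N_o}{\sigma_i^2} \end{aligned} \tag{37}$$

with $\lambda$ chosen such that $\sum_{i=1}^{n_{\min}} P_i = P$; any channel for which this expression is negative is given zero power. This power allocation method is known as the waterfilling power allocation. The term $\frac{B}{\lambda\ln 2}$ represents the surface of the water, and the term $\frac{N_o}{\sigma_i^2}$ represents the height of the floor beneath it for the $i$th singular value, so $P_i$ is the depth of the water above that floor.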
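The waterfilling rule in (37) can be turned into a short procedure. The sketch below is one common way to find the water level, offered as an illustration rather than the method of any cited reference: it tries giving power to all channels and drops the weakest one until every active channel receives positive power. The function name and argument names are hypothetical, and the singular values are assumed to be strictly positive.

```python
def waterfill(sigma_sq, total_power, noise_power=1.0):
    """Return powers P_i = max(level - N_o / sigma_i^2, 0) that sum to total_power."""
    floors = [noise_power / s for s in sigma_sq]            # N_o / sigma_i^2
    order = sorted(range(len(floors)), key=lambda i: floors[i])
    active = len(order)
    level = 0.0
    while active > 0:
        # water level if only the `active` strongest channels receive power
        level = (total_power + sum(floors[i] for i in order[:active])) / active
        if level > floors[order[active - 1]]:
            break
        active -= 1
    return [max(level - f, 0.0) for f in floors]

print(waterfill([1.0, 0.5], 4.0))   # both channels are used
print(waterfill([1.0, 0.1], 2.0))   # the weak channel is switched off
```

Here `level` plays the role of $\frac{B}{\lambda\ln 2}$ in (37), so the multiplier $\lambda$ never has to be computed explicitly.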

4.1.2 Rank and Condition Number

Let $k$ be the number of nonzero singular values of $H$, which is also the rank of $H$. At high SNR the waterfilling allocation is close to the uniform power allocation, so, with $\mathrm{SNR} = P/N_o$,

$$C \approx \sum_{i=1}^{k} B\log_2\left(1 + \frac{P\sigma_i^2}{kN_o}\right) \approx B\left[k\log_2(\mathrm{SNR}) + \sum_{i=1}^{k}\log_2\left(\frac{\sigma_i^2}{k}\right)\right] \tag{38}$$

$k$ is thus the parameter that controls the number of spatial degrees of freedom and hence the number of independent streams that can be multiplexed [TV05]. We therefore want $k$ as large as possible, which is at most $n_{\min}$ in the case that $H$ has full rank. That the channel capacity increases linearly in $n_{\min}$ at high SNR is one of the most attractive features of MIMO. Jensen's inequality gives more information about the behavior of the capacity with respect to $H$:

$$\sum_{i=1}^{k} B\log_2\left(1 + \frac{P\sigma_i^2}{kN_o}\right) \le kB\log_2\left(1 + \frac{P}{kN_o}\cdot\frac{1}{k}\sum_{i=1}^{k}\sigma_i^2\right) \tag{39}$$

For a fixed total channel gain $\sum_{i=1}^{k}\sigma_i^2$, the bound is met with equality precisely when all the singular values are equal. In other words, among channels with the same total gain the capacity is largest when $\frac{\sigma_{\max}}{\sigma_{\min}} \approx 1$. In matrix theory this ratio is the condition number, $\kappa(H)$, and a matrix with $\kappa(H) \approx 1$ is said to be well-conditioned. Thus $H$ should be well-conditioned to ensure a large capacity.
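A small numerical comparison illustrates why conditioning matters. The sketch below evaluates the uniform-power capacity expression from (38) for two hypothetical sets of squared singular values with the same sum, one balanced and one skewed, and compares both with the Jensen upper bound evaluated at their common mean. All numbers and function names are assumptions made for the example.

```python
import math

def capacity_uniform(sigma_sq, snr, bandwidth=1.0):
    """Sum of B*log2(1 + snr * sigma_i^2 / k) over the k nonzero modes, snr = P / N_o."""
    k = len(sigma_sq)
    return sum(bandwidth * math.log2(1.0 + snr * s / k) for s in sigma_sq)

def jensen_bound(sigma_sq, snr, bandwidth=1.0):
    """Upper bound k*B*log2(1 + (snr / k) * mean(sigma_i^2))."""
    k = len(sigma_sq)
    mean_gain = sum(sigma_sq) / k
    return k * bandwidth * math.log2(1.0 + snr * mean_gain / k)

balanced = [2.0, 2.0]   # condition number 1
skewed = [3.9, 0.1]     # same total gain, condition number sqrt(39)
for name, modes in (("balanced", balanced), ("skewed", skewed)):
    print(name, capacity_uniform(modes, 100.0), jensen_bound(modes, 100.0))
```

The balanced channel meets the bound, while the skewed channel with the same total gain falls well short of it.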

4.2 Physical Modeling of MIMO Channels

The major goal of this section is to see how MIMO’s ability to spatially multiplex depends on

the actual propagation environment. Also, this section will examine what must be true of the

propagation to ensure that the rank and condition number criteria are satisﬁed. All antenna

arrays in this section are assumed to be linear and uniformly spaced.


4.2.1 LOS SIMO and MISO Channel

Suppose the antennas are uniformly and linearly spaced by $\Delta_r\lambda_c$, where $\Delta_r$ represents the spacing as a fraction of the carrier wavelength $\lambda_c$. This normalization eliminates many $\lambda$s from subsequent equations.

Figure 3: LOS MISO and SIMO [TV05]


The impulse responses between the transmit antenna and each receive antenna are

$$h_i(\tau) = a\,\delta(\tau - d_i/c) \tag{40}$$

Here $a$ models the path loss of the propagating wave and the $d_i/c$ term models the time it takes for a propagating EM wave to reach the $i$th receive antenna [SMB01]. At baseband the channel gain is given by

$$h_i = a\,e^{-j2\pi d_i/\lambda_c} \tag{41}$$

So the channel can be modeled with AWGN as

$$\mathbf{y} = \mathbf{h}x + \mathbf{w} \tag{42}$$

with $\mathbf{h} = [h_1, h_2, \ldots, h_{n_r}]$ and $\mathbf{w} \sim \mathcal{CN}(0, N_o I_{n_r})$. For large $d$

$$d_i \approx d + (i-1)\Delta_r\lambda_c\cos(\phi) \tag{43}$$

Define $\Omega = \cos(\phi)$. Define the following quantity from [Fle00]:

$$\hat{\mathbf{a}}_r(\Omega) = \frac{1}{\sqrt{n_r}}\begin{bmatrix} 1 \\ e^{-j2\pi\Delta_r\Omega} \\ \vdots \\ e^{-j2\pi(n_r-1)\Delta_r\Omega} \end{bmatrix} \tag{44}$$

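Then the following important identity holds:

$$\begin{aligned} \hat{\mathbf{a}}_r^*(\Omega)\hat{\mathbf{a}}_r(\Omega) &= \frac{1}{n_r}\begin{bmatrix} 1 & e^{j2\pi\Delta_r\Omega} & \cdots & e^{j2\pi(n_r-1)\Delta_r\Omega} \end{bmatrix}\begin{bmatrix} 1 \\ e^{-j2\pi\Delta_r\Omega} \\ \vdots \\ e^{-j2\pi(n_r-1)\Delta_r\Omega} \end{bmatrix} \\ &= \frac{1}{n_r}\left[(1)(1) + \left(e^{j2\pi\Delta_r\Omega}\right)\left(e^{-j2\pi\Delta_r\Omega}\right) + \cdots + \left(e^{j2\pi(n_r-1)\Delta_r\Omega}\right)\left(e^{-j2\pi(n_r-1)\Delta_r\Omega}\right)\right] \\ \hat{\mathbf{a}}_r^*(\Omega)\hat{\mathbf{a}}_r(\Omega) &= 1 \end{aligned} \tag{45}$$

Then the channel $\mathbf{h}$ can be written as

$$\mathbf{h} = a\,e^{-j2\pi d/\lambda_c}\sqrt{n_r}\,\hat{\mathbf{a}}_r(\Omega) \tag{46}$$

as demonstrated in [SMB01]. The channel capacity is

$$C = B\log_2\left(1 + \frac{P\|\mathbf{h}\|^2}{N_o}\right) = B\log_2\left(1 + \frac{Pa^2 n_r}{N_o}\right) \tag{47}$$

as given in [TV05]. Thus there is a power gain, and potentially increased capacity, but no degree of freedom gain, and so no spatial multiplexing is possible.

The MISO case is similar and involves the use of

$$\hat{\mathbf{a}}_t(\Omega) = \frac{1}{\sqrt{n_t}}\begin{bmatrix} 1 \\ e^{-j2\pi\Delta_t\Omega} \\ \vdots \\ e^{-j2\pi(n_t-1)\Delta_t\Omega} \end{bmatrix} \tag{48}$$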

4.2.2 LOS MIMO

Similarly to the SIMO case, the baseband equivalent channel is

$$h_{ij} = a\,e^{-j2\pi d_{ij}/\lambda_c} \tag{49}$$

If $d$ is large then

$$d_{ij} \approx d + (i-1)\Delta_r\lambda_c\cos(\phi_r) - (j-1)\Delta_t\lambda_c\cos(\phi_t) \tag{50}$$

as shown in [TV05]. Define $\Omega_r = \cos(\phi_r)$ and $\Omega_t = \cos(\phi_t)$. Then the channel matrix is given by

$$H = a\sqrt{n_t n_r}\,e^{-j2\pi d/\lambda_c}\,\hat{\mathbf{a}}_r(\Omega_r)\,\hat{\mathbf{a}}_t^*(\Omega_t) \tag{51}$$

In this case $H$ has rank 1 and the only nonzero singular value is $a\sqrt{n_t n_r}$. Then the capacity is

$$C = B\log_2\left(1 + \frac{Pa^2 n_t n_r}{N_o}\right) \tag{52}$$

This is the same result as the SIMO/MISO case: no degree of freedom gain.
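The rank-one structure of (51) can be checked directly. The sketch below builds $H$ from the two spatial signatures for assumed array parameters and verifies that every $2 \times 2$ minor vanishes, which is equivalent to rank one, and that the squared Frobenius norm equals $a^2 n_t n_r$, the square of the single nonzero singular value. Function names and numerical values are illustrative.

```python
import cmath
import math

def steering(omega, n, delta):
    """Unit-norm spatial signature with n elements and normalized spacing delta."""
    return [cmath.exp(-2j * math.pi * i * delta * omega) / math.sqrt(n) for i in range(n)]

def los_channel(a, n_r, n_t, delta_r, delta_t, omega_r, omega_t, phase=0.0):
    """H = a*sqrt(n_t*n_r)*exp(-j*phase) * a_r(omega_r) * a_t(omega_t)^*, as in (51)."""
    scale = a * math.sqrt(n_t * n_r) * cmath.exp(-1j * phase)
    a_r = steering(omega_r, n_r, delta_r)
    a_t = steering(omega_t, n_t, delta_t)
    return [[scale * r * t.conjugate() for t in a_t] for r in a_r]

def max_minor(H):
    """Largest magnitude among all 2x2 minors; zero for a rank-one matrix."""
    rows, cols = len(H), len(H[0])
    return max(abs(H[i][j] * H[k][l] - H[i][l] * H[k][j])
               for i in range(rows) for k in range(i + 1, rows)
               for j in range(cols) for l in range(j + 1, cols))

def frobenius_sq(H):
    """Sum of squared magnitudes, equal to the sum of squared singular values."""
    return sum(abs(x) ** 2 for row in H for x in row)

H = los_channel(0.5, 3, 2, 0.5, 0.5, 0.3, -0.2)
print(max_minor(H), frobenius_sq(H))
```

Adding antennas at either end raises the single singular value, which is the power gain in (52), but never creates a second spatial mode.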

4.2.3 Geographically Separated MIMO

Still consider LOS propagation and the narrowband case.

Figure 4: Geographically Distributed Antenna Arrays [TV05]


Then the channel between the kth transmit antenna and all the receive antennas is

h

k

= a

k

√

n

r

e

−j2πd

k

/λc

ˆa

r

(Ω

rk

) (53)

with d

k

the distance between the kth transmit antenna and the ﬁrst receive antenna [PNG03,

Her04]. ˆa

r

(Ω) is periodic with period 1/∆

r

. Also, the function ˆa

r

(Ω) doesn’t take on the same

value twice in one period, so ˆa

r

(Ω

r1

) and ˆa

r

(Ω

r2

) are linearly independent as long as Ω

r1

−Ω

r2

is

not an integer multiple of 1/∆

r

. In the 2 n

r

case as long as the two angles are not a multiple

of 1/∆

r

the two rows of H are linearly independent and thus H has full rank. Thus in this

case spatial multiplexing is possible. Now what remains to be considered is whether H is well-

conditioned. To determine this consider the angle θ between the two columns of H associated

with the two transmit antennas. This angle satisﬁes

[ cos(θ)[ = [ˆa

∗

r

(Ω

r1

)ˆa

r

(Ω

r2

)[ (54)

=

¸

¸

¸

¸

sin(πL

r

Ω

r

)

n

r

sin(πL

r

Ω

r

/n

r

)

¸

¸

¸

¸

(55)

with L

r

= n

r

∆

r

. Then the two singular values are

λ

1

=

_

a

2

n

r

(1 +[ cos θ[), λ

2

=

_

a

2

n

r

(1 −[ cos θ[) (56)

Thus

$$\kappa(H) = \sqrt{\frac{1 + |\cos\theta|}{1 - |\cos\theta|}} \tag{57}$$

Thus the matrix is ill-conditioned whenever $|\cos\theta| \approx 1$, which occurs when

$$\left|\Omega_r - \frac{m}{\Delta_r}\right| \ll \frac{1}{L_r} \tag{58}$$

for some integer $m$. So when the directional cosines of two paths differ by much less than $1/L_r$ (modulo $1/\Delta_r$), the receiver can't distinguish between the two paths. This is similar to the case of frequency selective channels, in which the bandwidth of the system controls which multipath delays can be resolved.
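As a rough numerical illustration of (55) and (57), the following Python sketch evaluates $|\cos\theta|$ and $\kappa(H)$ for a uniform linear receive array. The function names and the example values ($n_r = 4$, $\Delta_r = 1/2$) are assumptions chosen here for illustration, and equal path gains are assumed as in (56).

```python
import math

def abs_cos_theta(omega_r, n_r, delta_r):
    """|cos(theta)| from (55); omega_r is the difference of directional cosines."""
    L_r = n_r * delta_r                      # normalized array length
    den = n_r * math.sin(math.pi * L_r * omega_r / n_r)
    if abs(den) < 1e-12:
        # omega_r is a multiple of 1/delta_r: the two signatures coincide
        return 1.0
    return abs(math.sin(math.pi * L_r * omega_r) / den)

def condition_number(omega_r, n_r, delta_r):
    """kappa(H) from (57), assuming equal path gains."""
    c = abs_cos_theta(omega_r, n_r, delta_r)
    if c >= 1.0:
        return math.inf                      # rank-deficient channel
    return math.sqrt((1.0 + c) / (1.0 - c))

# Hypothetical example: 4 antennas at half-wavelength spacing, so 1/L_r = 0.5
for omega in (0.01, 0.1, 0.25, 0.5):
    print(omega, condition_number(omega, n_r=4, delta_r=0.5))
```

In this example the condition number is large for separations well below $1/L_r = 0.5$ and drops to 1 when the separation equals $1/L_r$, where the two signatures are orthogonal.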

4.2.4 Two-Ray MIMO

Consider the full MIMO case with antenna arrays at both the transmitter and receiver. Let $d^{(i)}$ be the distance between transmit antenna 1 and receive antenna 1 along path $i$.

Define, following the notation of [TV05],

$$a_i^b = a_i\sqrt{n_t n_r}\; e^{-j2\pi d^{(i)}/\lambda_c} \tag{59}$$

Then the channel matrix can be expressed as

$$H = a_1^b\,\hat{a}_r(\Omega_{r1})\,\hat{a}_t^*(\Omega_{t1}) + a_2^b\,\hat{a}_r(\Omega_{r2})\,\hat{a}_t^*(\Omega_{t2}) \tag{60}$$

as in [PNG03, Her04].

Figure 5: Two-Ray MIMO [TV05]

This expression for the channel can be put in matrix form as

$$H = \begin{bmatrix} a_1^b\,\hat{a}_r(\Omega_{r1}) & a_2^b\,\hat{a}_r(\Omega_{r2}) \end{bmatrix} \begin{bmatrix} \hat{a}_t^*(\Omega_{t1}) \\ \hat{a}_t^*(\Omega_{t2}) \end{bmatrix} \tag{61}$$

To ensure $H$ has rank 2 the following two conditions must hold:

$$\Omega_{t1} \neq \Omega_{t2} \mod \frac{1}{\Delta_t} \tag{62}$$

$$\Omega_{r1} \neq \Omega_{r2} \mod \frac{1}{\Delta_r} \tag{63}$$

Under these conditions $H$ has rank 2, so spatial multiplexing is possible. To ensure that $H$ is well-conditioned it is necessary that $|\Omega_{r2} - \Omega_{r1}| \geq 1/L_r$ and $|\Omega_{t2} - \Omega_{t1}| \geq 1/L_t$; that is to say, there must be sufficient angular separation at the transmitter and receiver to ensure that the paths can be resolved.

4.3 Statistical Modeling of MIMO Channels

In the case of a frequency selective channel the channel can be modeled as an FIR filter with taps $\{h[n]\}$. In this case not all individual multipath components can be resolved but only multipath components that differ in delay by a sufficient amount related to the system bandwidth. In modeling a MIMO channel the interest is not in time resolution of multipath but angular resolution at the transmitter and receiver [Par00].

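Suppose the transmit and receive antenna array lengths are $L_t$ and $L_r$. Paths whose directional cosines $\Omega$ differ by less than $1/L_t$ at the transmitter or $1/L_r$ at the receiver cannot be resolved. The term $h_{ij}$ is the aggregation of all paths within an angular window of width $1/L_t$ about $j/L_t$ at the transmitter and of width $1/L_r$ about $i/L_r$ at the receiver. If there are an arbitrary number of paths then the channel is given by

$$H = \sum_i a_i^b\,\hat{a}_r(\Omega_{ri})\,\hat{a}_t^*(\Omega_{ti}) \tag{64}$$

where $a_i^b$ is the complex gain of path $i$, including the array-gain and phase factors of (59).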

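The received and transmitted signals can always be expressed in terms of the following pair of bases:

$$\mathcal{S}_r = \left\{\hat{a}_r(0),\ \hat{a}_r\!\left(\tfrac{1}{L_r}\right),\ \ldots,\ \hat{a}_r\!\left(\tfrac{n_r - 1}{L_r}\right)\right\} \tag{65}$$

$$\mathcal{S}_t = \left\{\hat{a}_t(0),\ \hat{a}_t\!\left(\tfrac{1}{L_t}\right),\ \ldots,\ \hat{a}_t\!\left(\tfrac{n_t - 1}{L_t}\right)\right\} \tag{66}$$

which represent the angular bins.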

Figure 6: Angular Domain MIMO [TV05]

Each basis can be used to represent transmitted and received signals in the angular domain in terms of the directional cosine $\Omega$. Let $U_t$ be the $n_t \times n_t$ matrix with columns from $\mathcal{S}_t$. If $x$ is a vector transmitted by the antennas, then $x$ and its angular domain representation $x^a$ are related by

$$x = U_t x^a, \qquad x^a = U_t^* x \tag{67}$$

By examining the matrix $U_t$ it can be seen that $x^a$ is the IDFT of $x$. Similarly, let $U_r$ be the $n_r \times n_r$ matrix with columns from $\mathcal{S}_r$ and define $y^a = U_r^* y$. In this coordinate system

$$y^a = U_r^* H U_t x^a + v^a = H^a x^a + v^a \tag{68}$$

Each element $h^a_{ij}$ can be reasonably modeled as an independent circularly symmetric complex Gaussian r.v., as in the Rayleigh channel. The validity of this assumption rests on two key factors:

• Amount of scattering and reflection in the multipath environment - this model needs several multipath components in each angular bin

• The lengths $L_t$ and $L_r$ - Short antenna arrays lump many multipath components into the same angular bin. A longer antenna array results in better angular resolution of paths and more non-zero entries in $H^a$.

Since $U_t$ and $U_r$ are unitary and

$$H = U_r H^a U_t^* \tag{69}$$

$H$ has the same i.i.d. Gaussian distribution [CT91]. Thus in the narrowband case the MIMO channel is basically an extension of the scalar Rayleigh channel where each coefficient of the channel matrix is a complex Gaussian random variable. In addition, results from random matrix theory show that $H$ with this distribution has full rank with probability 1. Thus the channel in this model can support spatial multiplexing.
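To make the angular-domain change of basis concrete, the sketch below computes $H^a = U_r^* H U_t$ entry by entry for a channel built from discrete paths. It assumes the usual unit-norm uniform-linear-array signature with entries $e^{-j2\pi k\Delta\Omega}/\sqrt{n}$; the function names and the example path are hypothetical, and the path gain stands in for the full complex gain of each path.

```python
import cmath
import math

def signature(omega, n, delta):
    """Unit-norm ULA spatial signature for directional cosine omega (assumed form)."""
    return [cmath.exp(-2j * math.pi * k * delta * omega) / math.sqrt(n)
            for k in range(n)]

def inner(u, v):
    """Hermitian inner product u^* v."""
    return sum(a.conjugate() * b for a, b in zip(u, v))

def angular_channel(paths, n_t, n_r, delta_t, delta_r):
    """H^a[k][l] = e_r(k/L_r)^* H e_t(l/L_t) for H = sum_i a_i e_r(Om_ri) e_t(Om_ti)^*."""
    L_t, L_r = n_t * delta_t, n_r * delta_r
    Ha = [[0j] * n_t for _ in range(n_r)]
    for k in range(n_r):
        e_rk = signature(k / L_r, n_r, delta_r)       # k-th receive basis vector
        for l in range(n_t):
            e_tl = signature(l / L_t, n_t, delta_t)   # l-th transmit basis vector
            for a, om_r, om_t in paths:
                Ha[k][l] += (a * inner(e_rk, signature(om_r, n_r, delta_r))
                               * inner(signature(om_t, n_t, delta_t), e_tl))
    return Ha

# Hypothetical example: one path lying exactly on receive bin 1 and transmit bin 2
n_t = n_r = 4
delta_t = delta_r = 0.5                 # half-wavelength spacing, L_t = L_r = 2
paths = [(1.0, 1 / 2, 2 / 2)]           # (gain, Omega_r, Omega_t)
Ha = angular_channel(paths, n_t, n_r, delta_t, delta_r)
for row in Ha:
    print([round(abs(x), 3) for x in row])
```

A path that falls exactly on one receive bin and one transmit bin contributes to a single entry of $H^a$; paths between bins would leak into neighboring entries, which is why several unresolved paths per bin are needed for the Gaussian model.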

If there is a strong line-of-sight component, then the fading is not Rayleigh but Ricean.

Antenna Spacing The assumption that the coefficients of $H$ are independent or at least uncorrelated depends heavily on the antenna spacing. As a rule of thumb an antenna spacing of at least $\lambda/2$ is desirable and results in approximately uncorrelated coefficients [FG98]. As the antenna spacing decreases below this there is still a diversity gain, but it is not quite as large as if the antennas were spaced further apart. As the antenna spacing decreases towards $\lambda/4$ the channel coefficients become strongly correlated. The exact amount of correlation depends on the angular spread seen at the antennas. For antennas with small angular spread at separations on the order of $\lambda/4$ or smaller the coefficients are highly correlated. Since the coefficients are highly correlated the receiver does not see as many independent copies of the transmitted signal, so the achievable diversity gain is reduced. In practice the channel coefficients are never completely uncorrelated, but as a simplifying assumption to make analysis tractable we assume they are uncorrelated and independent.

4.3.1 Frequency Selective MIMO Channel

The extension of the preceding flat MIMO channel model to the frequency selective MIMO channel model is fairly straightforward. The channel in this case can be modeled as

$$y[n] = \sum_{l=1}^{N} H_l\, x[n - l] + v[n] \tag{70}$$

as in [TV05]. In this model the channel between any transmit-receive antenna pair is modeled as a scalar frequency selective channel in which the output is a convolution of the input and the channel taps. The justification for this model is a straightforward extension of the angular model outlined in the previous sections.


5 Diversity-Multiplexing Tradeoﬀ

A MIMO system can transmit one symbol on all the transmit antennas and use the right processing to obtain the full diversity gain $n_t n_r$. On the other hand a MIMO system can transmit $n_{min}$ independent streams to provide the maximum possible rate with the minimum error protection. The diversity-multiplexing tradeoff involves investigating what happens between these two extremes and in particular what constitutes the optimal tradeoff: when transmitting at a given rate, what is the maximum possible diversity gain? This kind of analysis leads to a curve relating the transmit rate and the optimal diversity gain. Of great interest is whether a given space-time code or modulation can achieve this frontier and thus be optimal.

This tradeoff curve is difficult to compute, but some methods have been proposed to simplify the study of this tradeoff. Tse and Zheng proposed in [ZT03] studying this tradeoff by making assumptions on the possible rates of transmission and letting the SNR approach infinity. At high SNR the MIMO capacity is

$$C \approx n_{min}\log_2(\mathrm{SNR}) \tag{71}$$

for a channel with full rank. Tse and Zheng assumed that only rates $R = r\log(\mathrm{SNR})$ are possible with $r = 0, 1, \ldots, n_{min}$. The optimal diversity gain, $d^*(r)$, is the exponent in the outage probability, so

$$p_{out} \approx \mathrm{SNR}^{-d^*(r)} \tag{72}$$

Thus it makes sense to define

$$d^*(r) = -\lim_{\mathrm{SNR}\to\infty} \frac{\log p_{out}(r\log\mathrm{SNR})}{\log\mathrm{SNR}} \tag{73}$$

Alternatively $d^*(r)$ can be defined in terms of the probability of error

$$d^*(r) = -\lim_{\mathrm{SNR}\to\infty} \frac{\log P_e(r\log\mathrm{SNR})}{\log\mathrm{SNR}} \tag{74}$$

Before tackling the full MIMO channel it is useful to consider the diversity-multiplexing tradeoff in scalar and SIMO/MISO channels.

5.1 Scalar Rayleigh Channel

The scalar channel is in outage if the capacity it supports falls below the rate of transmission. So $p_{out}$ is given by

$$p_{out} = P\left[\log\left(1 + |h|^2\,\mathrm{SNR}\right) < r\log\mathrm{SNR}\right] = P\left[|h|^2 < \frac{\mathrm{SNR}^r - 1}{\mathrm{SNR}}\right]$$

$|h|^2$ is chi-squared distributed. For sufficiently small $\epsilon$, $P[|h|^2 < \epsilon] \approx \epsilon$. Thus the outage probability is approximately

$$p_{out} \approx \frac{1}{\mathrm{SNR}^{1-r}} \tag{75}$$

Thus $d^*(r) = 1 - r$ is the optimal tradeoff.
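The approximation in (75) can be checked numerically. The sketch below assumes the common normalization $h \sim \mathcal{CN}(0,1)$, so that $|h|^2$ is exponential with unit mean and $P[|h|^2 < \epsilon] = 1 - e^{-\epsilon}$ exactly; the function name is hypothetical and $0 < r < 1$ is assumed.

```python
import math

def outage_exponent(r, snr):
    """-log(p_out)/log(SNR) for the scalar Rayleigh channel, assuming |h|^2 ~ Exp(1)."""
    eps = (snr ** r - 1.0) / snr        # threshold on |h|^2 from the outage event
    p_out = -math.expm1(-eps)           # exact P[|h|^2 < eps] = 1 - exp(-eps)
    return -math.log(p_out) / math.log(snr)

# The finite-SNR exponent approaches 1 - r as the SNR grows
for snr_db in (20, 40, 80):
    snr = 10 ** (snr_db / 10)
    print(snr_db, [round(outage_exponent(r, snr), 3) for r in (0.25, 0.5, 0.75)])
```

The convergence is slow at moderate SNR, which is a reminder that the tradeoff curve is an asymptotic statement.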


5.1.1 QAM over the Scalar Rayleigh Channel

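It can be demonstrated that for QAM $P_e \approx 2^R/\mathrm{SNR}$. Then

$$\begin{aligned}
d(r) &= -\lim_{\mathrm{SNR}\to\infty} \frac{\log P_e}{\log\mathrm{SNR}} \\
&= -\lim_{\mathrm{SNR}\to\infty} \frac{\log\left(2^{r\log\mathrm{SNR}}/\mathrm{SNR}\right)}{\log\mathrm{SNR}} \\
&= -\lim_{\mathrm{SNR}\to\infty} \frac{r\log\mathrm{SNR} - \log\mathrm{SNR}}{\log\mathrm{SNR}} = 1 - r
\end{aligned} \tag{76}$$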

Thus QAM achieves the optimal diversity-multiplexing tradeoﬀ of the scalar Rayleigh channel.

5.2 MISO Rayleigh Channel

In this case the system can be modeled as

y[n] = hx[n] + w[n] (77)

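Taking the rate $R = r\log\mathrm{SNR}$ as usual the outage probability is

$$p_{out} = P\left[\log\left(1 + \|h\|^2\,\frac{\mathrm{SNR}}{n_t}\right) < r\log\mathrm{SNR}\right] \tag{78}$$

$\|h\|^2$ is $\chi^2_{2n_t}$ distributed, so for sufficiently small $\epsilon$ the approximation $P[\|h\|^2 < \epsilon] \approx \epsilon^{n_t}$ can be used. Then the outage probability is roughly

$$p_{out} \approx \mathrm{SNR}^{-n_t(1-r)} \tag{79}$$

So it is apparent that the optimal tradeoff is $d^*(r) = n_t(1 - r)$.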

The Alamouti code effectively decomposes the MISO channel with two transmit antennas into parallel scalar Rayleigh channels. It can be easily demonstrated that the optimal tradeoff curve for this parallel Rayleigh channel is $d^*(r) = 2(1 - r)$. So if QAM is used on each of the scalar channels along with the Alamouti code, then the resulting system is tradeoff optimal for the MISO channel.

5.3 MIMO Rayleigh Channel

The outage probability is given by

$$p_{out} = \min_{K_x:\ \mathrm{Tr}[K_x] \le \mathrm{SNR}} P\left[\log\det\left(I_{n_r} + H K_x H^*\right) < r\log\mathrm{SNR}\right] \tag{80}$$

The matrix $K_x$ is the covariance matrix of the input and basically represents a power allocation. The power allocation at the transmitter directly affects the SNR at the receiver. This scheme makes a specific assumption about the rate $R$ at a given SNR, so the input covariance matrix must be chosen to satisfy the power constraint $\mathrm{Tr}[K_x] \le \mathrm{SNR}$. At high SNR the minimizing covariance matrix $K_x$ is approximately $\frac{\mathrm{SNR}}{n_t} I_{n_t}$, so

$$p_{out} = P\left[\log\det\left(I_{n_r} + \frac{\mathrm{SNR}}{n_t} H H^*\right) < r\log\mathrm{SNR}\right] \tag{81}$$

This outage probability can be written in terms of the singular values of $H$ as

$$p_{out} = P\left[\sum_{i=1}^{n_{min}} \log\left(1 + \frac{\mathrm{SNR}}{n_t}\sigma_i^2\right) < r\log\mathrm{SNR}\right] \tag{82}$$

There are no neat approximations to evaluate this outage probability, but there is a neat geometric argument that does the job [TV05, ZT03]. First consider $r$ close to 0. Outage occurs when $H$ is close to 0. Closeness can be measured in terms of the Frobenius norm

$$\|H - 0\|_F^2 = \|H\|_F^2 = \sum_{i=1}^{n_{min}} \sigma_i^2 = \sum_{i,j} |h_{ij}|^2$$

Thus the magnitude of each channel coefficient $|h_{ij}|$ must be close to 0 for the channel to be in outage.

Now if $r$ is an integer greater than 0 the situation becomes considerably more complicated, since there are more ways to choose bad singular values $\sigma_i$ to put the channel in outage. The situation seems hopeless, but it has been shown by Tse and Zheng that although there are many ways for the channel to be in outage, the most common way is for $r$ eigenchannels to be good and the remainder to be bad. In this case $H$ has rank $r$ and $H$ is in the space $\mathcal{V}_r$ of rank $r$ matrices in the space $\mathbb{C}^{n_t \times n_r}$. So the question of whether $H$ puts the channel in outage is the question of whether $H$ is close to $\mathcal{V}_r$ in the appropriate sense.

This question is tractable but also a little tricky, since $\mathcal{V}_r$ is not a linear space. The following paragraph is very technical but the fundamental result is simple: $\mathcal{V}_r$ can be considered to be a linear space in a sufficiently small neighborhood. To see that $\mathcal{V}_r$ is not linear consider that if $\mathcal{V}_r$ were a linear space, then $0 \in \mathcal{V}_r$. But clearly 0 has rank 0, so $0 \notin \mathcal{V}_r$. Thus $\mathcal{V}_r$ is not a linear subspace of $\mathbb{C}^{n_t \times n_r}$. However, although $\mathcal{V}_r$ is not a linear subspace, it is a manifold embedded in $\mathbb{C}^{n_t \times n_r}$. A manifold is a space with the property that small neighborhoods of a point look like linear subspaces of $\mathbb{R}^k$ or $\mathbb{C}^k$. For example, the surface of Earth is a manifold since a small neighborhood looks like a portion of $\mathbb{R}^2$ even though the overall space is clearly not linear. The question of interest is what happens when $H$ is close to $\mathcal{V}_r$, so it is sufficient to consider a small neighborhood $N$ of a point of $\mathcal{V}_r$ containing $H$. $N$ looks like a linear subspace of $\mathbb{C}^{n_t \times n_r}$. For the remainder of this argument restrict our consideration to $N$.

Since $\mathcal{V}_r$ can be considered locally linear, the notion of orthogonality can be used. Then $H$ can be decomposed into a portion in $\mathcal{V}_r$ and a portion in the space $\mathcal{V}_r^\perp$, which is orthogonal to $\mathcal{V}_r$. If the portion of $H$ in $\mathcal{V}_r^\perp$ vanishes, then $H$ is basically in $\mathcal{V}_r$, $H$ has rank $r$, and so the channel is in outage as discussed before. The probability that the portion of $H$ in $\mathcal{V}_r^\perp$ vanishes (the outage probability) is approximately $\mathrm{SNR}^{-d}$, where $d$ is the dimension of $\mathcal{V}_r^\perp$. If $H$ is of rank $r$, then $r$ rows of length $n_t$ can be chosen freely and the remaining $n_r - r$ rows can be written as linear combinations of the first $r$ rows. From this it follows that $\dim\mathcal{V}_r = n_t r + (n_r - r)r$. Since $\mathcal{V}_r$ and $\mathcal{V}_r^\perp$ decompose the $n_t n_r$-dimensional space,

$$n_t n_r = \dim\mathbb{C}^{n_t \times n_r} = \dim\mathcal{V}_r + \dim\mathcal{V}_r^\perp$$

Thus

$$\dim\mathcal{V}_r^\perp = n_t n_r - \left(n_t r + (n_r - r)r\right) = (n_t - r)(n_r - r) \tag{83}$$

Thus $p_{out} \approx \mathrm{SNR}^{-(n_t - r)(n_r - r)}$ and so the optimal tradeoff is given by $d^*(r) = (n_t - r)(n_r - r)$ for $r = 0, 1, \ldots, n_{min}$.
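The corner points of the tradeoff curve follow directly from (83). The sketch below lists them and, as an additional assumption taken from the result in [ZT03] rather than from the argument above, interpolates linearly between consecutive integer values of $r$. The function names are hypothetical.

```python
def dmt_corner_points(n_t, n_r):
    """Points (r, d*(r)) with d*(r) = (n_t - r)(n_r - r) for integer r."""
    return [(r, (n_t - r) * (n_r - r)) for r in range(min(n_t, n_r) + 1)]

def dmt(r, n_t, n_r):
    """Piecewise-linear curve through the corner points (interpolation assumed from [ZT03])."""
    n_min = min(n_t, n_r)
    if not 0 <= r <= n_min:
        raise ValueError("multiplexing gain must lie in [0, n_min]")
    k = min(int(r), n_min - 1)                    # left corner of the segment containing r
    d_left = (n_t - k) * (n_r - k)
    d_right = (n_t - k - 1) * (n_r - k - 1)
    return d_left + (r - k) * (d_right - d_left)

print(dmt_corner_points(2, 2))
print(dmt_corner_points(4, 2))
print(dmt(0.5, 2, 2))
```

For example, a 2 by 2 system trades a diversity gain of 4 at $r = 0$ against a diversity gain of 0 at $r = 2$.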

Figure 7: Diversity-Multiplexing Tradeoﬀ For MIMO [TV05]

6 Space-Time Coding over Narrowband Channels

There are two major types of space-time codes: block codes and trellis codes. Their names imply their structures, which are derived from the similar structures in the single antenna case. The basic idea of a space-time block code is to map $Q$ symbols into a block of transmitted symbols of size $n_t \times T$ for some integer $T$. A trellis code is a convolutional code in which the current output depends on a block of input bits and the previous input bits represented by the state of the trellis code.

One general assumption on almost all space-time codes is the quasi-static assumption, which assumes that the channel remains constant over the duration of a codeword. The rate at which the channel changes is related to the coherence time, which is in turn related to the Doppler spread. The system must be designed to ensure that the duration of a codeword is less than the coherence time. The channel can change between codewords, but not in the middle of codewords.

Figure 8: Space-Time Encoder Structure

6.1 Error Motivated Design

It is important and interesting to find conditions that will guarantee a good error performance for a space-time code. One approach to finding these conditions for the slow fading MIMO channel is to consider what factors affect ML decoding of the codewords. The optimal way to detect a codeword is ML detection, given by

$$\hat{C} = \arg\min_{C \in \mathcal{C}} \|Y - HC\|^2 \tag{84}$$

where $Y$ is the received block and $\mathcal{C}$ is the set of codewords. The operation of this detector is limited mainly by the closest pair of codewords. If two codewords are close together, then noise can lead to incorrect estimation of a codeword as another codeword. Then the error probability of interest is the pairwise error probability (PEP) that a codeword $C$ is incorrectly decoded as $E$. Conditioning on the channel matrix $H$ the PEP is [Pro01]

$$P[C \to E \mid H] = Q\left(\sqrt{\frac{\mathrm{SNR}}{2}\sum_{k=0}^{T} \|H(c_k - e_k)\|_F^2}\right) \tag{85}$$

Averaging over all channel realizations gives the average PEP: $P[C \to E]$. In a way similar to the diversity-multiplexing tradeoff the diversity gain, $d_g$, can be defined in terms of the PEP as

$$d_g = -\lim_{\mathrm{SNR}\to\infty} \frac{\log P[C \to E]}{\log\mathrm{SNR}} \tag{86}$$

Generally at high SNR the PEP is of the form $(c\,\mathrm{SNR})^{-d_g}$. A larger value of the quantity $c$ improves performance, and $c$ is called the coding gain. A good space-time code should then achieve a high diversity gain and a high coding gain.

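The relevant question now is how to achieve diversity and coding gains. The covariance of two codewords $C$ and $E$ is the matrix $\tilde{E} = (E - C)(E - C)^*$. Then the PEP is given by [SA00, Sim01],

$$\begin{aligned}
P[C \to E] &= \frac{1}{\pi}\int_0^{\pi/2} \left[\det\left(I_{n_t} + \frac{\mathrm{SNR}}{4\sin^2\beta}\,\tilde{E}\right)\right]^{-n_r} d\beta \\
&= \frac{1}{\pi}\int_0^{\pi/2} \prod_{i=1}^{\mathrm{rank}(\tilde{E})} \left(1 + \frac{\mathrm{SNR}}{4\sin^2\beta}\,\lambda_i\right)^{-n_r} d\beta \\
&\le \prod_{i=1}^{\mathrm{rank}(\tilde{E})} \left(1 + \frac{\mathrm{SNR}}{4}\,\lambda_i\right)^{-n_r}
\end{aligned}$$

with the second line due to expressing the determinant in terms of the eigenvalues $\lambda_i(\tilde{E})$ and the last line valid at high SNR. This expression can be further bounded to yield

$$P[C \to E] \le \left(\frac{\mathrm{SNR}}{4}\right)^{-n_r\,\mathrm{rank}(\tilde{E})} \left(\prod_{i=1}^{\mathrm{rank}(\tilde{E})} \lambda_i\right)^{-n_r} \tag{87}$$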

Thus the diversity gain is $n_r\,\mathrm{rank}(\tilde{E})$ and the coding gain is $\prod_{i=1}^{\mathrm{rank}(\tilde{E})} \lambda_i$. Given these two gains there are two criteria for a good space-time code at high SNR [TSC98]:

• Rank Criterion - Maximize the minimum rank of the codeword difference matrix to achieve a good diversity gain always:

$$\max\left(\min_{\substack{C, E \in \mathcal{C} \\ C \neq E}} \mathrm{rank}(\tilde{E})\right) \tag{88}$$

• Determinant Criterion - Maximize the minimum product of the nonzero eigenvalues to achieve coding gain, that is, maximize

$$d_\lambda = \min_{\substack{C, E \in \mathcal{C} \\ C \neq E}} \left(\prod_{i=1}^{\mathrm{rank}(\tilde{E})} \lambda_i\right) \tag{89}$$

In the case where the codeword difference matrix always has full rank this becomes: maximize

$$d_\lambda = \min_{\substack{C, E \in \mathcal{C} \\ C \neq E}} \det\tilde{E} \tag{90}$$

These criteria guarantee good codes at high SNR.
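Both criteria can be evaluated by brute force for a small code. The sketch below does this for the 2 by 2 Alamouti codeword matrix with a unit-energy QPSK alphabet; the alphabet, the helper names, and the rank test specialized to 2 by 2 Hermitian matrices are assumptions made for this illustration.

```python
import itertools
import math

def alamouti(c1, c2):
    """2x2 Alamouti codeword matrix for symbols c1, c2."""
    return [[c1, -c2.conjugate()], [c2, c1.conjugate()]]

def difference_gram(C, E):
    """E_tilde = (E - C)(E - C)^* for 2x2 codewords."""
    D = [[E[i][j] - C[i][j] for j in range(2)] for i in range(2)]
    return [[sum(D[i][k] * D[j][k].conjugate() for k in range(2))
             for j in range(2)] for i in range(2)]

def rank_and_det(G, tol=1e-12):
    """Rank and determinant of a 2x2 Hermitian positive semidefinite matrix."""
    det = (G[0][0] * G[1][1] - G[0][1] * G[1][0]).real
    trace = (G[0][0] + G[1][1]).real
    rank = 2 if det > tol else (1 if trace > tol else 0)
    return rank, det

# Assumed alphabet: unit-energy QPSK
QPSK = [complex(a, b) / math.sqrt(2) for a in (1, -1) for b in (1, -1)]
codebook = [alamouti(c1, c2) for c1 in QPSK for c2 in QPSK]

results = [rank_and_det(difference_gram(C, E))
           for C, E in itertools.combinations(codebook, 2)]
min_rank = min(rank for rank, _ in results)
min_det = min(det for _, det in results)
print("minimum rank:", min_rank, "minimum determinant:", min_det)
```

Every pair of distinct codewords gives a full-rank difference matrix here, so the rank criterion is met and the determinant form (90) applies.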

6.2 Space-Time Block Codes

A space-time block code (STBC) maps a block of $Q$ input symbols into a block of symbols of size $n_t \times T$ to be transmitted on the antennas. A quantity of interest is the effective symbol rate of the code:

$$r_s = \frac{Q}{T} \tag{91}$$

For $r_s = 1$ the system effectively transmits one symbol per symbol period. For $r_s < 1$ the system on average transmits less than one symbol per symbol period. Codes with $r_s < 1$ effectively reduce the rate of transmission.

6.2.1 Linear STBCs

There are many different classes of space-time block codes, but one of the most common is the linear block code. The codeword matrix of a linear block code can be expressed as a linear function of complex $n_t \times T$ basis matrices $\phi_q$ and input symbols $c_1, c_2, \ldots, c_Q$ as follows [HH01]:

$$C = \sum_{q=1}^{Q} \phi_q\,\Re\{c_q\} + \phi_{q+Q}\,\Im\{c_q\} \tag{92}$$

It may seem a little odd to break up the real and imaginary components of the symbols, but the

advantage of this approach is that conjugation of symbols can be used in linear STBCs. The

following example with the Alamouti code shows that this is possible.

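Example: Alamouti code The two complex symbols $c_1$ and $c_2$ are mapped into the following matrix, which represents the Alamouti code:

$$\begin{bmatrix} c_1 & -c_2^* \\ c_2 & c_1^* \end{bmatrix} \tag{93}$$

Then the code can be represented with basis matrices as:

$$\phi_1 = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \quad \phi_2 = \begin{bmatrix} 0 & -1 \\ 1 & 0 \end{bmatrix} \quad \phi_3 = \begin{bmatrix} j & 0 \\ 0 & -j \end{bmatrix} \quad \phi_4 = \begin{bmatrix} 0 & j \\ j & 0 \end{bmatrix} \tag{94}$$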
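The expansion (92) can be checked directly for the Alamouti code. The sketch below uses the basis matrices with the factor $j$ on $\phi_3$ and $\phi_4$ that is needed for the real and imaginary parts to recombine into $c_q$ and $c_q^*$; the function names and the test symbols are hypothetical.

```python
# Basis matrices indexed as in (92): phi_q multiplies Re{c_q}, phi_{q+Q} multiplies Im{c_q}
PHI = {
    1: [[1, 0], [0, 1]],
    2: [[0, -1], [1, 0]],
    3: [[1j, 0], [0, -1j]],
    4: [[0, 1j], [1j, 0]],
}

def linear_stbc(symbols, basis):
    """Codeword C = sum_q basis[q] * Re(c_q) + basis[q + Q] * Im(c_q)."""
    Q = len(symbols)
    rows, cols = len(basis[1]), len(basis[1][0])
    C = [[0j] * cols for _ in range(rows)]
    for q, c in enumerate(symbols, start=1):
        for i in range(rows):
            for j in range(cols):
                C[i][j] += basis[q][i][j] * c.real + basis[q + Q][i][j] * c.imag
    return C

def alamouti(c1, c2):
    """Alamouti codeword matrix (93)."""
    return [[c1, -c2.conjugate()], [c2, c1.conjugate()]]

c1, c2 = 1 + 2j, -3 + 0.5j               # arbitrary test symbols
C_linear = linear_stbc([c1, c2], PHI)
C_direct = alamouti(c1, c2)
print(C_linear)
print(C_direct)
```

The two matrices agree entry by entry, which shows how splitting each symbol into real and imaginary parts lets a linear map produce conjugated entries.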

Code Design Criteria for Linear STBCs As we saw in the previous section minimizing the worst PEP is a good strategy to develop a good space-time code. In the case of linear STBCs with unitary basis matrices, meaning $\phi_q^*\phi_q = I_T$ if $T \le n_t$ (tall matrix) or $\phi_q\phi_q^* = I_{n_t}$ if $T \ge n_t$ (wide matrix), the condition for minimizing the worst-case PEP is

$$\phi_q\phi_p^* + \phi_p\phi_q^* = 0, \quad q \neq p \quad \text{(Wide)} \tag{95}$$

$$\phi_q^*\phi_p + \phi_p^*\phi_q = 0, \quad q \neq p \quad \text{(Tall)} \tag{96}$$

Orthogonal STBCs There is a special class of linear STBCs with an orthogonality property that leads to easy decoding [TJC99]. An orthogonal STBC (O-STBC) has codewords $C$ that satisfy the following key property

$$CC^* = \frac{T}{Q n_t}\left(\sum_{q=1}^{Q} |c_q|^2\right) I_{n_t} \tag{97}$$

This property is very nice because it implies that easy decoding is possible due to the orthogonality. The key example of an O-STBC is the Alamouti code, which works on complex constellations. It is clear in the case of Alamouti that it takes two symbol times to transmit two symbols, so the transmit rate $r_s = 1$. However, it turns out that the Alamouti code is the only O-STBC for complex symbols that achieves a transmit rate $r_s$ of one symbol per symbol period. For more than two transmit antennas, $r_s < 1$ always. If a rate of $r_s \le 1/2$ is acceptable, then it is always possible to find an O-STBC that achieves good diversity. For a purely real constellation it is always possible to find a real O-STBC for any $n_t$ that achieves $r_s = 1$. However, this is not very useful as many constellations such as QAM are complex.

The diversity-multiplexing tradeoff for O-STBCs is given by [OC06] as

$$d^*(g_s) = n_t n_r\left(1 - \frac{g_s}{r_s}\right), \qquad g_s \in [0, r_s] \tag{98}$$

for QAM constellations.

Quasi Orthogonal STBCs O-STBCs achieve full diversity but at the expense of any spatial multiplexing. Quasi Orthogonal STBCs (QO-STBCs) attempt to achieve some of the benefits of O-STBCs while also providing for some spatial multiplexing by using smaller O-STBCs as building blocks. For example a QO-STBC could be

$$Q(c_1, \ldots, c_{2Q}) = \begin{bmatrix} O(c_1, \ldots, c_Q) & O(c_{Q+1}, \ldots, c_{2Q}) \\ O(c_{Q+1}, \ldots, c_{2Q}) & O(c_1, \ldots, c_Q) \end{bmatrix} \tag{99}$$

where each $O$ is a codeword matrix for a smaller O-STBC on only $Q$ input symbols [TBH00]. If the $O$ represent Alamouti codewords, then the codeword matrix is

$$Q(c_1, c_2, c_3, c_4) = \frac{1}{2}\begin{bmatrix} c_1 & -c_2^* & c_3 & -c_4^* \\ c_2 & c_1^* & c_4 & c_3^* \\ c_3 & -c_4^* & c_1 & -c_2^* \\ c_4 & c_3^* & c_2 & c_1^* \end{bmatrix} \tag{100}$$

Then during decoding the codeword matrix is multiplied by its conjugate transpose, which yields

$$QQ^* = \frac{1}{4}\begin{bmatrix} a & 0 & b & 0 \\ 0 & a & 0 & b \\ b & 0 & a & 0 \\ 0 & b & 0 & a \end{bmatrix} \tag{101}$$

where

$$a = \sum_{q=1}^{4} |c_q|^2, \qquad b = c_1 c_3^* + c_3 c_1^* + c_2 c_4^* + c_4 c_2^*$$
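The structure of $QQ^*$ can be confirmed numerically for the codeword matrix in (100). The sketch below builds the matrix from two Alamouti blocks with arbitrary test symbols and forms the product with its conjugate transpose; the helper names and the symbol values are hypothetical, and the off-diagonal term is evaluated as twice the real part of $c_1 c_3^* + c_2 c_4^*$, which is what the block layout of (100) produces.

```python
def alamouti(c1, c2):
    """2x2 Alamouti block."""
    return [[c1, -c2.conjugate()], [c2, c1.conjugate()]]

def qo_stbc(c1, c2, c3, c4):
    """4x4 quasi-orthogonal codeword (1/2) [[A, B], [B, A]] as in (100)."""
    A, B = alamouti(c1, c2), alamouti(c3, c4)
    rows = [A[i] + B[i] for i in range(2)] + [B[i] + A[i] for i in range(2)]
    return [[0.5 * x for x in row] for row in rows]

def gram(M):
    """M M^* for a square complex matrix given as nested lists."""
    n = len(M)
    return [[sum(M[i][k] * M[j][k].conjugate() for k in range(n))
             for j in range(n)] for i in range(n)]

c = (1 + 1j, 1 - 2j, -2 + 0.5j, 0.5 + 3j)   # arbitrary test symbols
G = gram(qo_stbc(*c))

a = sum(abs(x) ** 2 for x in c)
b = 2 * (c[0] * c[2].conjugate()).real + 2 * (c[1] * c[3].conjugate()).real
expected = [[a, 0, b, 0], [0, a, 0, b], [b, 0, a, 0], [0, b, 0, a]]
for row in G:
    print([round(x.real, 3) for x in row])
```

The zero entries are what allow the symbol pairs $(c_1, c_3)$ and $(c_2, c_4)$ to be decoded separately.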


The codeword matrix doesn't nicely decouple like in the case of O-STBC, but at least the first/third and second/fourth columns can be decoded separately, which greatly reduces complexity. Other combinations of O-STBCs have been proposed including the following Alamouti-like scheme [Jaf01]

$$Q(c_1, \ldots, c_{2Q}) = \begin{bmatrix} O(c_1, \ldots, c_Q) & -O(c_{Q+1}, \ldots, c_{2Q})^* \\ O(c_{Q+1}, \ldots, c_{2Q}) & O(c_1, \ldots, c_Q)^* \end{bmatrix} \tag{102}$$

Decoding with this scheme has complexity similar to the previous case of QO-STBCs.

Rotated QO-STBCs Because of the way quasi-orthogonal matrices are constructed, when all symbols are drawn from the same constellation there are pairs of codewords $E$ and $C$ for which $\det(\tilde{E}) = 0$, which means the QO-STBC fails the rank condition. This implies that in some cases QO-STBCs will have bad diversity gain. A way to improve on this is to use rotated variations of the base constellation to prevent rank deficiencies and achieve good diversity gain [SP03, SX04, WX05, XL05].

Linear Dispersion Codes The BLAST architecture achieves high multiplexing gain at the expense of diversity gain. O-STBCs in contrast achieve high diversity gain at the expense of multiplexing gain. Linear dispersion codes (LDCs) try to achieve a little of both. LDCs are derived through numerical optimization to determine which basis matrices are optimal relative to some criterion that balances diversity and multiplexing gain. There have been several LDCs proposed including

1. Hassibi and Hochwald LDCs [HH01]

2. Heath and Sandhu LDCs [Hea01, San02]

Algebraic STBCs The Alamouti code works by transmitting two symbols and then their conjugates arranged in the appropriate way. Algebraic codes also transmit a symbol twice, but instead of transmitting a conjugate they transmit a rotated version of the first set of symbols. In terms of the codeword matrix this can be written as

$$C = \begin{bmatrix} u_1 & \phi^{1/2} v_1 \\ \phi^{1/2} v_2 & u_2 \end{bmatrix} \tag{103}$$

with

$$\begin{bmatrix} u_1 \\ u_2 \end{bmatrix} = M_1 \begin{bmatrix} c_1 \\ c_2 \end{bmatrix}, \qquad \begin{bmatrix} v_1 \\ v_2 \end{bmatrix} = M_2 \begin{bmatrix} c_3 \\ c_4 \end{bmatrix} \tag{104}$$

$M_1$ and $M_2$ are unitary matrices that represent the rotations, and the constellation points come from QAM. Fundamentally designing an algebraic code comes down to choosing the appropriate matrices $M_1$, $M_2$, and $\phi$.


$B_{2,\phi}$ code In this code [DTB02]

$$M_1 = M_2 = \frac{1}{2}\begin{bmatrix} 1 & e^{j\omega} \\ 1 & e^{-j\omega} \end{bmatrix} \tag{105}$$

and $\omega$ is chosen by numerical optimization to fit the given constellation. Finally, $\phi = e^{j\omega}$.

Threaded Algebraic Space-Time Code (TAST) This code [GD03] is similar to the $B_{2,\phi}$ code

$$M_1 = M_2 = \frac{1}{2}\begin{bmatrix} 1 & e^{j\pi/4} \\ 1 & e^{-j\pi/4} \end{bmatrix} \tag{106}$$

Tilted QAM This code [YW03] is given by

$$M_i = \frac{1}{\sqrt{2}}\begin{bmatrix} \cos\omega_i & \sin\omega_i \\ -\sin\omega_i & \cos\omega_i \end{bmatrix} \tag{107}$$

This choice of $M_i$ is literally a rotation matrix that rotates points about the origin by $\omega_i$ radians. Optimization methods can be used to find $\phi$.

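Golden Code This code [BRV05] is given by

$$M_1 = \frac{1}{\sqrt{10}}\begin{bmatrix} \alpha & \alpha\theta \\ \bar{\alpha} & \bar{\alpha}\bar{\theta} \end{bmatrix} \tag{108}$$

$$M_2 = \frac{1}{\sqrt{10}}\begin{bmatrix} 1 & 0 \\ 0 & j \end{bmatrix} \tag{109}$$

$\alpha$ and $\theta$ are chosen in terms of the golden ratio $\frac{1 + \sqrt{5}}{2}$ and the constellation.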

The ﬁgure below shows how these space-time codes compare to the optimal diversity-multiplexing

tradeoﬀ:


Figure 9: Diversity-Multiplexing Tradeoﬀ For Several Techniques [OC06]

The ﬁgure below shows the error performance of several space-time codes:

Figure 10: Error Performance For Several Techniques [OC06]

6.3 Bell Labs Space Time Architectures

The sections on the MIMO channel have demonstrated that MIMO can provide both a degree of freedom gain (increased capacity) and a diversity gain (better error performance). The Diagonal and Vertical Bell Labs Space Time Architectures (D-BLAST/V-BLAST) suggest general architectures to achieve the gains of MIMO. The general idea of the BLAST architectures is to multiplex several streams of symbols (possibly demultiplexed from one original stream) onto the multiple antennas and then receive and decode the streams. Historically, G. Foschini suggested the D-BLAST architecture first and then V-BLAST was developed later as a simplification. However, it makes more sense to present V-BLAST first and then discuss how D-BLAST is logically an extension of V-BLAST.

6.3.1 V-BLAST

The general architecture of V-BLAST is described in the ﬁgure below [GFVW99]

Figure 11: VBLAST Architecture

The independent streams are multiplexed by the matrix Q onto the transmit antennas. At the

receiver the streams are decoded jointly or individually. In V-BLAST there is a large degree of

freedom in choosing the exact receiver structure. The choice of receiver structure aﬀects error

rates, capacity, and the complexity of decoding. The design of eﬃcient V-BLAST receivers is

an active area of research.

There are two natural choices for $Q$ depending on whether there is CSIT or not. If there is CSIT, then the matrix $V$ from the SVD $H = U\Sigma V^*$ can be used. At the receiver the received vector $y$ is multiplied by the matrix $U^*$ from the SVD of $H$. These actions create an equivalent channel model:

$$\tilde{y} = \Sigma\tilde{x} + \tilde{v} \tag{110}$$

The complex MIMO channel is reduced to several parallel scalar channels with each subchannel carrying one stream. The action of $Q$ is to rotate the input streams, so that the action of the channel can be expressed in a simple form. Sometimes a system provides a codebook of $Q$ matrices that the transmitter can use. The feedback from the receiver is then just an index into the codebook that tells the transmitter which $Q$ to use. This form of feedback massively reduces the required bandwidth in the feedback channel.

If there is no CSIT, then the situation is considerably more complicated and interesting. In this case the best choice for $Q$ is simply the identity matrix $I_{n_t}$. The choice of receiver is then an interesting problem, and there are many choices, all with different tradeoffs.

V-BLAST Receiver Structures There are two general steps in the V-BLAST receiver. The first is demodulation in which the receiver estimates what symbol was sent and hence which bits were sent. The next step is decoding in which any codes that were applied to individual streams are decoded. Basically any convolutional or block code can be applied to an individual stream, so we are primarily interested in different architectures for demodulation. The optimal V-BLAST receiver is the ML receiver that jointly decodes the streams. The ML receiver estimates the transmitted streams by the rule [TV05]

$$\hat{s} = \arg\min_{s \in \mathcal{C}} \|y - Hs\|^2 \tag{111}$$

Practically what this method does is pick the closest point to the received vector in the lattice of

points formed by Hs, where s is a point in the original constellation. This problem is known as

the integer least squares problem. Although this method is optimal it is computationally complex

(NP-hard) as it must be performed over all possible transmit vectors. This computational

complexity generally makes it infeasible to use an ML detector.
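The exhaustive search behind (111) is short to write down, which makes its cost easy to see. The sketch below assumes a real-valued channel and a BPSK alphabet purely for illustration; the function name and the example numbers are hypothetical.

```python
import itertools

def ml_detect(y, H, alphabet):
    """Brute-force ML detection: try every transmit vector s and keep the closest H s."""
    n_r, n_t = len(H), len(H[0])
    best, best_metric = None, float("inf")
    for s in itertools.product(alphabet, repeat=n_t):     # |alphabet|^n_t candidates
        metric = sum(abs(y[i] - sum(H[i][j] * s[j] for j in range(n_t))) ** 2
                     for i in range(n_r))
        if metric < best_metric:
            best, best_metric = s, metric
    return best, best_metric

# Hypothetical 2x2 example with BPSK symbols and a slightly noisy observation of s = (1, -1)
H = [[1.0, 0.5], [0.3, 1.2]]
y = [0.6, -0.8]
s_hat, metric = ml_detect(y, H, alphabet=(-1, 1))
print(s_hat, metric)
```

The loop visits every one of the $|\mathcal{A}|^{n_t}$ candidate vectors for an alphabet $\mathcal{A}$, so the cost grows exponentially with the number of transmit antennas.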

Sphere Decoding Although the ML detector is computationally infeasible in many practical systems, there has been considerable interest in algorithms that are similar to ML detection in methodology and performance but with considerably less complexity. In addition, these ML-like algorithms can feed soft decisions to the decoders to improve their performance. Sphere decoding is one such algorithm [VB99].

The basic idea behind sphere decoding is to look only at points within a sphere of radius d about

the received vector and then choose the closest point inside the sphere [HV05]. If the sphere

actually contains any points, then obviously it must contain the closest point, which is what

ML detection would pick as an estimate of the transmitted lattice point. In this case sphere

decoding agrees with ML detection.

Figure 12: Idea Behind Sphere Decoding [HV05]

This process reduces the search space and necessary number of computations. In addition, since

the transmitted vector is corrupted by AWGN, the actual transmitted lattice point is likely to

be close by the received vector and in the sphere. Then there are two key problems that sphere

decoding has to deal with [HV05].

1. How to ﬁnd lattice points inside the sphere?

The detector can not compare the received vector to every point in the lattice to ﬁnd

the points inside the sphere or it would be performing an exhaustive search oﬀering no

advantage over normal ML-detection.


2. How to choose the sphere radius?

If $d$ is too large, then the detector considers too many points. If $d$ is too small, then the sphere may contain no points. One way to choose the sphere radius is to compute the Babai estimate $\hat{s}_B$ for the transmitted symbol. This estimate is obtained by first finding the least squares solution (not constrained to the lattice)

$$\hat{s}_{LS} = \arg\min_{s} \|y - Hs\|^2 \tag{112}$$

and then rounding each coordinate of $\hat{s}_{LS}$ to the nearest lattice value to obtain $\hat{s}_B$. Then choose $d = \|y - H\hat{s}_B\|$, which guarantees that the sphere contains at least one lattice point, namely $\hat{s}_B$ itself. There are other heuristic methods to choose $d$.

A solution to problem number one above is based on a simple observation: the problem is difficult in general but easy in one dimension. In one dimension the sphere is simply an interval, so the problem reduces to finding the lattice points inside this interval. Now the algorithm proceeds inductively by assuming that all $k$-dimensional points within the sphere of radius $d$ have been found. Then, for each such point, the admissible values of the $(k+1)$th coordinate that keep the point within radius $d$ form an interval, which is the easy one-dimensional problem. This process continues until the full dimension of the search space is reached. This process is usually visualized as a tree where the $k$th level of the tree corresponds to the points of dimension $k$ inside the sphere of radius $d$.

Figure 13: Tree for Sphere Decoding [HV05]

To see exactly how sphere decoding works suppose the lattice we are working on is the integer lattice $\mathbb{Z}^m$ [HV05]. Fix a sphere radius $d$. Suppose the channel matrix $H \in \mathbb{R}^{n \times m}$ and that $n \ge m$. The goal is to find the points $s \in \mathbb{Z}^m$ such that

$$\|y - Hs\|^2 \le d^2 \tag{113}$$

where $y$ is the received vector. The algorithm proceeds first by calculating the QR factorization of the matrix $H$:

$$H = Q\begin{bmatrix} R \\ 0_{(n-m)\times m} \end{bmatrix} \tag{114}$$


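where $Q$ is an $n \times n$ orthogonal matrix and $R$ is an $m \times m$ upper triangular matrix. This decomposition will make later calculations simpler. Expand the orthogonal matrix $Q$ as

$$Q = \begin{bmatrix} Q_1 & Q_2 \end{bmatrix} \tag{115}$$

with $Q_1 \in \mathbb{R}^{n \times m}$ and $Q_2 \in \mathbb{R}^{n \times (n-m)}$. Then the points inside the sphere satisfy:

$$\begin{aligned}
d^2 &\ge \left\|y - \begin{bmatrix} Q_1 & Q_2 \end{bmatrix}\begin{bmatrix} R \\ 0_{(n-m)\times m} \end{bmatrix} s\right\|^2 \\
&= \left\|\begin{bmatrix} Q_1^* \\ Q_2^* \end{bmatrix} y - \begin{bmatrix} R \\ 0 \end{bmatrix} s\right\|^2 \\
&= \|Q_1^* y - Rs\|^2 + \|Q_2^* y\|^2
\end{aligned}$$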

This expression can be rearranged to the condition:

$$d^2 - \|Q_2^* y\|^2 \ge \|Q_1^* y - Rs\|^2 \tag{116}$$

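Define $\tilde{d}^2 = d^2 - \|Q_2^* y\|^2$ and $\tilde{y} = Q_1^* y$. Then the condition to be in the sphere is given by

$$\begin{align}
\tilde{d}^2 &\ge \|\tilde{y} - Rs\|^2 \tag{117} \\
&= \sum_{i=1}^{m} \left( \tilde{y}_i - \sum_{j=i}^{m} R_{ij} s_j \right)^2 \tag{118}
\end{align}$$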

The sum can be written term by term as

$$\tilde{d}^2 \ge (\tilde{y}_m - R_{mm} s_m)^2 + (\tilde{y}_{m-1} - R_{m-1,m} s_m - R_{m-1,m-1} s_{m-1})^2 + \cdots \tag{119}$$

We observe that the first term depends only on $\{s_m\}$, the second term depends only on $\{s_{m-1}, s_m\}$, and so on. Then the following is a necessary condition for any point s to be in the sphere:

$$\tilde{d}^2 \ge (\tilde{y}_m - R_{mm} s_m)^2 \tag{120}$$

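Basically, $R_{mm} s_m$ must be within $\tilde{d}$ of $\tilde{y}_m$. Finding the integers that satisfy this necessary condition is easy; assuming $R_{mm} > 0$, they are simply the integers

$$\left\lceil \frac{-\tilde{d} + \tilde{y}_m}{R_{mm}} \right\rceil \le s_m \le \left\lfloor \frac{\tilde{d} + \tilde{y}_m}{R_{mm}} \right\rfloor \tag{121}$$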

The key step in this process is how to proceed from finding the $s_m$ in the sphere to finding which $\{s_{m-1}, s_m\}$ are in the sphere. This is done by ensuring the first two terms in equation 119 are less than $\tilde{d}^2$:

$$\tilde{d}^2 \ge (\tilde{y}_m - R_{mm} s_m)^2 + (\tilde{y}_{m-1} - R_{m-1,m} s_m - R_{m-1,m-1} s_{m-1})^2 \tag{122}$$

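To make use of this condition proceed as follows: for each $s_m$ define

$$\tilde{d}_{m-1}^2 = \tilde{d}^2 - (\tilde{y}_m - R_{mm} s_m)^2 \tag{123}$$

Then we can obtain a condition that $s_{m-1}$ must satisfy to be in the sphere:

$$\left\lceil \frac{-\tilde{d}_{m-1} + \tilde{y}_{m-1} - R_{m-1,m} s_m}{R_{m-1,m-1}} \right\rceil \le s_{m-1} \le \left\lfloor \frac{\tilde{d}_{m-1} + \tilde{y}_{m-1} - R_{m-1,m} s_m}{R_{m-1,m-1}} \right\rfloor \tag{124}$$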

By applying this method to each $s_m$, the points $\{s_{m-1}, s_m\}$ inside the sphere of radius d can be found. This process can be continued until the full m-dimensional problem has been solved. It is clear why a tree is an appropriate structure to represent the operation of sphere decoding: each node gives rise to some number of children (possibly zero) in the next iteration, all of which are inside the sphere, as one more dimension of the problem is solved. It is also clear that if the radius is chosen too small, one of the conditions like equation 124 may not be satisfied by any integer, and thus no points are in the sphere. If the sphere radius is too large, then too many points may satisfy equation 124, making the search for the closest point expensive.
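The recursion in equations 117 to 124 can be sketched in a few lines of Python with NumPy. This is only an illustrative sketch under stated assumptions: H is real with full column rank, the lattice is the integer lattice, and the function names `sphere_decode` and `closest_point` are hypothetical. Because a numerical QR routine may return negative diagonal entries in R, the sketch sorts the two interval endpoints instead of assuming $R_{kk} > 0$.

```python
import math
import numpy as np

def sphere_decode(H, y, d):
    """List every integer vector s with ||y - H s||^2 <= d^2.

    Assumes H is real, n x m with n >= m, and has full column rank.
    """
    H = np.asarray(H, dtype=float)
    y = np.asarray(y, dtype=float)
    n, m = H.shape
    Q, R_full = np.linalg.qr(H, mode="complete")  # Q is n x n, R_full is n x m
    R = R_full[:m, :]                              # upper triangular m x m block
    z = Q.T @ y
    y_t = z[:m]                                    # y~ = Q1^T y
    d2 = d * d - float(np.sum(z[m:] ** 2))         # d~^2 = d^2 - ||Q2^T y||^2
    found = []
    s = np.zeros(m)

    def search(k, budget):
        # budget is the squared radius left for coordinates k, k-1, ..., 0
        if budget < 0:
            return
        # subtract the contribution of the coordinates that are already fixed
        c = y_t[k] - float(np.dot(R[k, k + 1:], s[k + 1:]))
        r = math.sqrt(budget)
        lo, hi = sorted(((c - r) / R[k, k], (c + r) / R[k, k]))
        for v in range(math.ceil(lo), math.floor(hi) + 1):
            s[k] = v
            if k == 0:
                found.append([int(t) for t in s])
            else:
                search(k - 1, budget - (c - R[k, k] * v) ** 2)

    search(m - 1, d2)
    return found

def closest_point(H, y, d):
    """Return the lattice point in the sphere closest to y, or None if the sphere is empty."""
    pts = sphere_decode(H, y, d)
    if not pts:
        return None
    H = np.asarray(H, dtype=float)
    y = np.asarray(y, dtype=float)
    return min(pts, key=lambda p: np.linalg.norm(y - H @ np.array(p)))
```

Each call to `search` corresponds to one level of the tree: the loop over `v` enumerates the children of a node, and a node with an empty interval simply has no children.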

Non-Joint Detection Besides joint detection there is a wealth of detectors that detect individual streams from the received signal and do not attempt to decode all the streams simultaneously. Consider trying to decode one stream $x_k$. The system in this case can be modeled as

$$y[n] = h_k x_k[n] + \sum_{i \ne k} h_i x_i[n] + v[n] \tag{125}$$

where $h_i$ is the ith column of the channel matrix H. In this system there is a stream of interest, plus several interfering streams represented by the sum, plus a noise term. To successfully decode the stream of interest the receiver must deal with the interference term and the noise term.

Zero Forcing Nulling At high SNR performance will be interference limited not noise limited

[GFVW99]. ZF-Nulling attempts to remove all the interfering terms in the sum to leave only the

stream of interest. This can be done linearly with a single vector multiplication. The weighting vector $q_k$ to decode the kth stream satisfies

$$q_k^T h_j = \delta_{kj} \tag{126}$$

where $\delta_{kj}$ is the Kronecker delta, which is 1 when k = j and 0 otherwise. Then

$$\begin{aligned}
q_k^T y[n] &= q_k^T h_k x_k[n] + \sum_{i \ne k} q_k^T h_i x_i[n] + q_k^T v[n] \\
&= \delta_{kk} x_k[n] + \sum_{i \ne k} \delta_{ki} x_i[n] + q_k^T v[n] \\
&= x_k[n] + q_k^T v[n]
\end{aligned}$$

This weighting vector has an obvious geometric interpretation; the weighting vector projects the received vector y onto a subspace orthogonal to $h_1, \ldots, h_{k-1}, h_{k+1}, \ldots, h_{n_t}$.

Figure 14: Zero Forcing Nulling MIMO

The weighting vectors $q_k^T$ are just the rows of the pseudoinverse of H, given by $H^\dagger = (H^* H)^{-1} H^*$, so it is not too difficult to compute the appropriate weighting vectors given the channel matrix H. The output SNR for each stream follows directly from the weighting vectors as

$$\mathrm{SNR}_k = \frac{P}{\|q_k\|^2 N_o} \tag{127}$$
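A short NumPy sketch may make the nulling vectors and equation 127 concrete. It assumes H has full column rank and that every stream has the same power P; the function names are hypothetical and chosen only for this illustration.

```python
import numpy as np

def zf_nulling_vectors(H):
    """Row k of the result is q_k^T; assumes H (n_r x n_t) has full column rank."""
    H = np.asarray(H, dtype=complex)
    return np.linalg.inv(H.conj().T @ H) @ H.conj().T   # (H* H)^{-1} H*

def zf_output_snr(H, P, N0):
    """Per-stream output SNR from equation 127, with equal power P on every stream."""
    Q = zf_nulling_vectors(H)
    return P / (np.sum(np.abs(Q) ** 2, axis=1) * N0)
```

Multiplying the matrix returned by `zf_nulling_vectors` with H gives the identity, which is condition 126 written for all streams at once.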

ZF-Nulling with Successive Interference Cancellation The SNR has an inverse relation to $\|q_k\|^2$, so if $\|q_k\|^2$ can be reduced the SNR will be increased. Results from linear algebra indicate that the higher the dimension of the space that $q_k$ must be orthogonal to, the larger $\|q_k\|^2$ is. So if $q_k$ must be orthogonal to fewer vectors, then $\|q_k\|^2$ will be reduced. Successive interference cancellation (SIC) can reduce the dimension and increase the SNR. The diagram below shows the operation of SIC.

Figure 15: Successive Interference Cancellation [TV05]

With this scheme, as each stream is decoded it is subtracted from the received vector. As a result the subtracted stream does not interfere with any subsequent streams. So then $q_k$ must be orthogonal only to $h_{k+1}, \ldots, h_{n_t}$. The reduced number of vectors means $\|q_k\|^2$ is reduced and $\mathrm{SNR}_k$ is increased.

One practical issue when implementing SIC is the order of cancellation. The last decoded stream has the least interference and achieves the best performance. It has been demonstrated that a greedy choice of order is optimal relative to the maximin criterion [GFVW99]. This means that the kth stream to be decoded should be chosen from the remaining streams as the one that will achieve the highest SNR if it is decoded now. The maximin criterion means that the smallest $\mathrm{SNR}_k$ is maximized by choosing the optimal order.

The major drawback to SIC is error propagation. Mistakes at the beginning of the decoding chain

can introduce mistakes later on. So if one stream is inaccurately decoded, then all subsequent

streams will likely be decoded inaccurately.
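The decode, subtract, and reorder loop described above can be sketched as follows. This is a minimal illustration, not a reference receiver: it assumes uncoded unit-energy QPSK symbols, perfect channel knowledge, and equal stream powers, and the names `qpsk_slicer` and `zf_sic` are hypothetical.

```python
import numpy as np

def qpsk_slicer(z):
    """Map a soft value to the nearest unit-energy QPSK point."""
    return (np.sign(z.real) + 1j * np.sign(z.imag)) / np.sqrt(2)

def zf_sic(H, y, slicer=qpsk_slicer):
    """ZF-Nulling with SIC, using the greedy ordering (highest SNR first)."""
    H = np.asarray(H, dtype=complex)
    y = np.array(y, dtype=complex)            # copy, since y is modified below
    remaining = list(range(H.shape[1]))
    x_hat = np.zeros(H.shape[1], dtype=complex)
    while remaining:
        # rows of Q null only the streams that have not been cancelled yet
        Q = np.linalg.pinv(H[:, remaining])
        # smallest ||q||^2 means highest post-nulling SNR (equation 127)
        best = int(np.argmin(np.sum(np.abs(Q) ** 2, axis=1)))
        k = remaining.pop(best)
        x_hat[k] = slicer(Q[best] @ y)
        y = y - H[:, k] * x_hat[k]            # cancel the decoded stream
    return x_hat
```

Because a wrong hard decision is subtracted as if it were correct, this sketch also exhibits the error propagation discussed above.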

Matched Filter At very low SNR noise is the problem, so a matched filter can be used to deal with the noise. In the MIMO case the matched filter for each stream is simply maximal ratio combining (MRC) performed on the appropriate column of H.

MMSE Receiver The matched ﬁlter performs well at low SNR and ZF-nulling performs well

at high SNR. But at high SNR the matched ﬁlter has bad performance and at low SNR ZF-

nulling has bad performance. So naturally one may wonder if there is a receiver that operates

well at both low and high SNR. The MMSE receiver is such a receiver [TV05].

To understand how the MMSE receiver works, consider the following SIMO system modeled as

$$y = hx + z \tag{128}$$

with z colored noise having invertible correlation matrix $K_z$. The first operation is to whiten the noise by multiplying by $K_z^{-1/2}$. Then the system becomes

$$K_z^{-1/2} y = K_z^{-1/2} h x + K_z^{-1/2} z \tag{129}$$

Then apply a matched filter $(K_z^{-1/2} h)^*$ to yield the system

$$h^* K_z^{-1} y = (h^* K_z^{-1} h) x + h^* K_z^{-1} z \tag{130}$$

Thus the receiver simply multiplies the received signal by $h^* K_z^{-1}$ and performs normal demodulation. This is the MMSE receiver, which maximizes the SNR while minimizing the mean squared error between the estimate of x and x itself.

For V-BLAST the corrupting non-white noise is the interference terms plus the additive noise. The covariance matrix for this noise is given by

$$K_{z_k} = N_o I_{n_r} + \sum_{i \ne k} P_i h_i h_i^* \tag{131}$$

A similar derivation shows that the weighting vector of the MMSE receiver in this case is

$$q_k = \left( N_o I_{n_r} + \sum_{i \ne k} P_i h_i h_i^* \right)^{-1} h_k \tag{132}$$

It is fairly easy to see that the MMSE receiver is a tradeoff between the matched filter and ZF-Nulling. At low SNR

$$K_{z_k} \approx N_o I_{n_r} \tag{133}$$

so the receiver is proportional to $h_k$, which is a matched filter. At high SNR

$$K_{z_k} \approx \sum_{i \ne k} P_i h_i h_i^* \tag{134}$$

and it can be seen that $q_k$ approaches, up to scaling, the ZF-Nulling weighting vector for the kth stream obtained from the pseudoinverse of H. Thus the MMSE receiver is like ZF-Nulling at high SNR. In addition, the MMSE receiver has good performance in the region between high and low SNR. If SIC is used in conjunction with MMSE, then MMSE-SIC can achieve the channel capacity.
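The two limiting cases can be checked numerically with a small NumPy sketch of equation 132. The sketch assumes, for simplicity, that every stream has the same power P (the text allows a separate $P_i$ per stream), and the function name `mmse_weight` is hypothetical. The receiver is taken to apply $q_k^*$ to the received vector, as in equation 130.

```python
import numpy as np

def mmse_weight(H, k, P, N0):
    """Weighting vector q_k of equation 132, with equal power P on every stream."""
    H = np.asarray(H, dtype=complex)
    n_r, n_t = H.shape
    K = N0 * np.eye(n_r, dtype=complex)       # additive noise part of K_{z_k}
    for i in range(n_t):
        if i != k:
            K += P * np.outer(H[:, i], H[:, i].conj())   # interference from stream i
    return np.linalg.solve(K, H[:, k])        # K^{-1} h_k without forming the inverse
```

With a very large `N0` the returned vector is nearly parallel to `H[:, k]` (matched filter), while with a very small `N0` its inner products with the interfering columns become negligible compared with its inner product with `H[:, k]` (nulling).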

6.3.2 D-BLAST

Consider the kth stream. It is transmitted by one antenna and received by all $n_r$ receive antennas. Thus the maximum possible diversity gain for any individual stream is $n_r$, and there is a limit to how much MIMO diversity techniques can protect a stream [Fos96]. If a SIC structure is used with either the MMSE receiver or ZF-nulling and one stream is incorrectly decoded, then subsequent streams will likely be incorrectly decoded. The main reason for this problem is that no coding is performed spatially across the multiple streams. Coding across streams is used to ensure each stream is reliably decoded, but in V-BLAST each stream must already be decoded in order to decode the spatial code across the streams. The solution to this problem is to alter the way the streams are transmitted.

Consider the case with two transmit antennas. Suppose that there are two separate streams each consisting of two blocks. Denote these by $a^{(i)}$ and $b^{(i)}$ for i = 1, 2. Then the D-BLAST codeword is

$$\mathcal{C} = \begin{bmatrix} a^{(1)} & b^{(1)} & \\ & a^{(2)} & b^{(2)} \end{bmatrix} \tag{135}$$

From this codeword matrix it is obvious where D-BLAST gets its name, since the layers are now diagonal. The receiver works as follows:

1. First receive $a^{(1)}$ with MRC.
2. Next receive $a^{(2)}$ with MMSE or ZF-nulling, while ignoring $b^{(1)}$.
3. Next decode the spatial code across the first layer $[a^{(1)} \; a^{(2)}]$.
4. Now $a^{(2)}$ has been reliably decoded, so it can be cancelled out and $b^{(1)}$ can be received.
5. Finally, $b^{(2)}$ can be received with MRC. Then the second layer $[b^{(1)} \; b^{(2)}]$ can be decoded.

Now both streams have been decoded reliably. The key observation is that for a single layer, if one of the blocks for one stream is initially decoded incorrectly, there is still a chance to fix the error with the code applied across the layer. The major price to pay for using D-BLAST is the lost capacity during the startup process due to the blank spots in the codeword. For example, during the first block the second transmit antenna transmits nothing, and so some capacity is lost. Finally, there is also the cost in implementation complexity of applying coding and decoding across streams.

6.4 Space-Time Trellis Codes

A space-time trellis code (STTC) is an extension of normal convolutional codes to multiple

antennas [TSC98]. The key idea behind a STTC is to make the output of the encoder a function

of the input bits and the state of the encoder, which is in turn a function of the previous inputs.

Trellis codes provide better error performance compared to block codes and coding gain at the

expense of implementation complexity.

6.4.1 Trellis Representation

Suppose B bits are input into the encoder, which has $2^\nu$ states. A trellis diagram is a way of representing the action of a STTC [OC06]. The diagram below shows a trellis. The number of nodes is the number of states in the code. The left column represents the current state of the code and the right column represents the next state. The possible outputs are listed on the left hand side of the trellis. If the output is 02, for example, then the 0th symbol is sent on the first antenna and the 2nd symbol is sent on the second antenna. The transition arrows are driven by the input bits. There are $2^B$ arrows from each state on the left to states on the right, one for each possible combination of inputs.

Figure 16: Trellis Coding [OC06]

The decoder’s job is to estimate which sequence, that is, which path through the trellis, was sent. One way to do this is with a Maximum Likelihood Sequence Estimator (MLSE). There is a well-known algorithm, the Viterbi algorithm, that efficiently estimates which sequence was sent.

Trellis Complexity There is a fundamental lower bound to the complexity of a STTC. A STTC with B input bits and minimum rank $r_{\min}$ has at least $2^{B(r_{\min}-1)}$ states. As the number of states increases the complexity of decoding increases, so this lower bound on the number of states puts a lower bound on the achievable complexity.

6.4.2 Delay-Diversity Scheme

This is one of the simplest trellis codes to achieve diversity [Wit93, SW94]. The codeword for T transmitted symbols is given by

$$\mathcal{C} = \frac{1}{\sqrt{2}} \begin{bmatrix} c_1 & c_2 & \cdots & c_T & 0 \\ 0 & c_1 & \cdots & c_{T-1} & c_T \end{bmatrix} \tag{136}$$

The trellis diagram below represents this code

The effect of this code is to convert spatial diversity to frequency diversity. Consider a $2 \times 1$ MIMO system. When $c_1$ is transmitted during the first symbol period, it sees the channel $h_1$. When $c_1$ is transmitted during the second symbol period, it sees channel $h_2$. This is equivalent to passing $c_1$ through a frequency selective channel with two taps, $h_1$ and $h_2$. So spatial diversity becomes frequency diversity by applying this code.
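This equivalence is easy to verify numerically. The sketch below builds the codeword of equation 136 and checks, for a noiseless $2 \times 1$ link, that the received sequence equals the symbol sequence convolved with the two-tap channel $[h_1, h_2]$ (scaled by $1/\sqrt{2}$). The function names are hypothetical and the channel is assumed constant over the codeword.

```python
import numpy as np

def delay_diversity_codeword(c):
    """2 x (T+1) codeword of equation 136: antenna 2 repeats antenna 1 one symbol later."""
    c = np.asarray(c, dtype=complex)
    row1 = np.append(c, 0)        # c_1 ... c_T 0
    row2 = np.insert(c, 0, 0)     # 0 c_1 ... c_T
    return np.vstack([row1, row2]) / np.sqrt(2)

def received_2x1(c, h1, h2):
    """Noiseless output of a 2 x 1 flat-fading channel with gains h1 and h2."""
    return np.array([h1, h2]) @ delay_diversity_codeword(c)
```

Comparing `received_2x1(c, h1, h2)` with `np.convolve(c, [h1, h2]) / np.sqrt(2)` shows that the single receive antenna cannot distinguish the two-antenna scheme from a single antenna sending through a two-tap channel.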

7 Space-Time Coding for Frequency Selective Channels

There are two basic approaches to MIMO over frequency selective channels as in normal SISO

frequency selective channels:

1. Single carrier

2. Multicarrier - OFDM

Many modern wireless standards that use MIMO also use OFDM, so MIMO-OFDM is of particular interest.

7.1 Single Carrier

In this case the system can be modeled as

$$y_k = \sum_{l=0}^{L-1} H[l] c_{k-l} + v_k \tag{137}$$

This complicated system involving a summation can be expressed as a simple system of the form

$$y_k = \begin{bmatrix} H[0] & \cdots & H[L-1] \end{bmatrix} \begin{bmatrix} c_k^T & \cdots & c_{k-L+1}^T \end{bmatrix}^T + v_k \tag{138}$$

which is similar to the narrowband MIMO case.

7.2 MIMO-OFDM

MIMO-OFDM is an extension of normal OFDM to the MIMO case where there are multiple

antennas.

7.2.1 OFDM

OFDM uses the FFT and IFFT to decompose the wideband frequency selective channel into

several smaller narrowband frequency ﬂat channels. The cyclic preﬁx is added to prevent ISI.

Figure 17: OFDM System Model [OC06]

The system model can be expressed in the DFT frequency domain as

Y [k] = H[k]X[k] + V [k] (139)

with V[k] the corrupting noise. The product H[k]X[k] corresponds to a circular convolution in the time domain, which can be expressed in matrix form as

$$\begin{bmatrix} y[0] \\ y[1] \\ \vdots \\ y[N-1] \end{bmatrix} = \begin{bmatrix} h[0] & h[N-1] & h[N-2] & \cdots & h[1] \\ h[1] & h[0] & h[N-1] & \cdots & h[2] \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ h[N-1] & h[N-2] & h[N-3] & \cdots & h[0] \end{bmatrix} \begin{bmatrix} x[0] \\ x[1] \\ \vdots \\ x[N-1] \end{bmatrix} + \begin{bmatrix} v[0] \\ v[1] \\ \vdots \\ v[N-1] \end{bmatrix}$$

The singular value decomposition of H is $T \Lambda T^*$, where T is the matrix that performs the DFT. The matrix Λ is a diagonal matrix specific to each H,

$$\Lambda = \begin{bmatrix} \lambda_1 & & & 0 \\ & \lambda_2 & & \\ & & \ddots & \\ 0 & & & \lambda_N \end{bmatrix} \tag{140}$$

but T is not specific to each H.
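The fact that the DFT diagonalizes any circulant channel matrix can be checked with a few lines of NumPy. The sketch below assumes the circulant matrix is built from its first column, as in the matrix equation above, and uses the unitary DFT matrix; the helper names `circulant` and `dft_matrix` are hypothetical, and which of the two factors is called the DFT or the IDFT depends on the convention chosen.

```python
import numpy as np

def circulant(h):
    """N x N circulant matrix whose first column is h, so that C @ x is a circular convolution."""
    h = np.asarray(h, dtype=complex)
    N = len(h)
    return np.array([[h[(i - j) % N] for j in range(N)] for i in range(N)])

def dft_matrix(N):
    """Unitary DFT matrix F, with F @ x equal to fft(x) / sqrt(N)."""
    return np.fft.fft(np.eye(N)) / np.sqrt(N)
```

With these definitions, `dft_matrix(N) @ circulant(h) @ dft_matrix(N).conj().T` is a diagonal matrix whose diagonal is `np.fft.fft(h)`, which is the per-subcarrier gain H[k] in equation 139.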


7.2.2 Extension to MIMO-OFDM

A MIMO-OFDM system can be modeled like a SISO OFDM system with the channel taps

replaced by channel matrices [OC06]. First, start with the frequency selective MIMO channel

$$y_k = \sum_{l=0}^{L-1} H[l] x_{k-l} + v_k \tag{141}$$

Then append a cyclic prefix, $X_g$, to prevent ISI, producing the modified system

$$\tilde{y} = H_g \begin{bmatrix} X_g & X \end{bmatrix} + v \tag{142}$$

with the channel matrix given by

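$$H_g = \begin{bmatrix} H[L-1] & \cdots & H[1] & H[0] & 0 & \cdots & 0 \\ 0 & H[L-1] & \cdots & H[1] & H[0] & \cdots & 0 \\ \vdots & & \ddots & & & \ddots & \vdots \\ 0 & \cdots & 0 & H[L-1] & H[L-2] & \cdots & H[0] \end{bmatrix} \tag{143}$$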

As in SISO OFDM the cyclic prefix, which is necessary for practical implementation, can be removed from the analytical model. A large blockwise circulant matrix can represent the effective channel seen by the whole MIMO-OFDM codeword.

$$H_{cp} = \begin{bmatrix} H[0] & 0 & \cdots & 0 & H[L-1] & \cdots & H[1] \\ \vdots & & & \ddots & & & \vdots \\ 0 & \cdots & 0 & H[L-1] & \cdots & H[1] & H[0] \end{bmatrix} \tag{144}$$

Since $H_{cp}$ is blockwise circulant, the SVD of $H_{cp}$ is given by

$$H_{cp} = T^* \Lambda_{cp} T \tag{145}$$

where T is the IDFT matrix as usual. Thus the complicated MIMO-OFDM channel can be regarded as a diagonal channel with the appropriate coordinate change given by the DFT.

Given

$$H_k = \sum_{l=0}^{L-1} H[l] e^{-j 2\pi k l / T} \tag{146}$$

then ML detection is given by

$$\hat{X} = \arg\min_{C} \sum_{k=0}^{T-1} \|y_k - H_k c_k\|^2 \tag{147}$$

Like OFDM, MIMO-OFDM has issues with PAPR and frequency oﬀset estimation. A block

diagram for MIMO-OFDM follows below:

Figure 18: MIMO OFDM [OC06]

7.2.3 Space-Frequency Coded MIMO-OFDM

For normal OFDM the frequency domain channel coeﬃcients H[k] can be viewed as the channel

coeﬃcients in a narrowband fast fading time channel. The frequency index k can be reinterpreted

as a time domain index. Thus codes designed for fast fading time channels can be applied across

the subcarriers. In MIMO-OFDM the same idea can be used to code across the subcarriers

[TV05].

7.2.4 Space-Time Coded MIMO-OFDM

This is the simplest MIMO-OFDM system with no coding across the subcarriers. Instead the

OFDM part of the system chops the frequency selective channel into frequency ﬂat channels on

which normal space-time coding techniques can be applied. For example, in the $2 \times n_r$ case the Alamouti code can be used on each subcarrier through the following process:

1. Transmit $[c_1 \; c_2]^T$ on a given tone during the first OFDM symbol
2. Transmit $[-c_2^* \; c_1^*]^T$ on the same tone during the second OFDM symbol
3. Perform normal Alamouti decoding

This idea certainly works, but it limits the system, since the channel has to remain static for the duration of two OFDM symbols. Depending on system parameters this may not be a reasonable assumption. In general all space-time codes discussed before assume the channel is static over the duration of a codeword, so this is a general problem in Space-Time Coded MIMO-OFDM.

7.2.5 Space-Time Frequency Coded MIMO-OFDM

In a Space-Time Frequency Coded MIMO-OFDM system coding is performed over all three

available dimensions: space, time, and frequency. Below are several examples of this idea.

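Generalized Delay Diversity This code [GSP02] has matrix form

$$C = \frac{1}{\sqrt{2}} \begin{bmatrix} c_1 & c_2 & \cdots & c_T & 0 & 0 \\ 0 & c_1 & c_2 & \cdots & c_T & 0 \\ 0 & c_1 & c_2 & \cdots & c_T & 0 \\ 0 & 0 & c_1 & c_2 & \cdots & c_T \end{bmatrix} \tag{148}$$

This code provides a diversity gain of 3.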

Lindskog-Paulraj Scheme This code [LP00] basically extends Alamouti in a natural way to MIMO-OFDM. The fundamental units of transmission are two blocks of length T: $c_1[k]$ and $c_2[k]$. The scheme is then

1. Send $[c_1[k] \; c_2[k]]^T$
2. Send $[-c_2^*[k] \; c_1^*[k]]^T$
3. Decode like Alamouti except use two independent MLSE estimators. This can be accomplished with two parallel copies of the Viterbi algorithm.

8 Multiuser MIMO

Historically MIMO was developed for use in point to point situations. However, MIMO can also

be used as a multiple access technique to allow multiple users to seamlessly share the spatial

channel. This type of MIMO is called Multiuser MIMO (MU-MIMO). The typical application of

MU-MIMO is in a cellular system with multiple antennas at the base station and only one or two

antennas at each mobile [GKHCS07]. The collection of antennas at all the mobile users in a cell

is regarded as one big antenna array. One of the key advantages of having a distributed array

comprised of all the mobiles is that the channel matrix rarely suﬀers from rank deﬁciencies, so

spatial multiplexing is almost always possible. However, in order to actually get the beneﬁts

of MU-MIMO the base station needs CSIT or at least partial CSIT, which entails increased

complexity.


For a MU-MIMO system having N transmit antennas at the base station and U users, each with $M_k$ antennas, the downlink (broadcast channel) for each user can be modeled as

$$y_k = h_k \sum_{l=1}^{N} x_l + v_k \tag{149}$$

The uplink (MAC channel) can be modeled as

$$y = \sum_{i=1}^{U} h_i x_i + v \tag{150}$$

8.1 Precoding

Information theoretic results have shown that by using a type of coding called dirty paper coding (DPC) at the transmitter, N user streams can be multiplexed and transmitted [SB07, GC80]. Effectively, the coding at the transmitter pre-cancels interference at the receivers, much as ZF-nulling cancels interference in BLAST.

8.1.1 Linear Precoding

The downlink channel can be written in a simple form that makes explicit how other users’ streams produce interference.

$$y_k = H_k s_k + H_k \sum_{l=1, l \ne k}^{N} s_l + v_k \tag{151}$$

The simplest form of precoding is to multiply the transmit symbols by a matrix, $W_k$, that will cancel out the interferers [SSH04].

$$y_k = H_k W_k s_k + H_k \sum_{l=1, l \ne k}^{N} W_l s_l + v_k \tag{152}$$

In the case when each user has one receive antenna this problem is identical to canceling interference in BLAST. So the proper choice for $W_k$ is the kth column of the pseudoinverse of the effective channel matrix $H = \begin{bmatrix} h_1 & h_2 & \cdots & h_N \end{bmatrix}$.
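A zero-forcing precoder for single-antenna users can be sketched as follows. The convention used here is an assumption made for the illustration: the effective channel matrix is stored with one row per user (row k is user k's channel to the N base station antennas), so that the precoding vectors are the columns of its pseudoinverse. The function name `zf_precoder` is hypothetical, and no transmit power normalization is applied.

```python
import numpy as np

def zf_precoder(H):
    """Return W (N x U) with H @ W = I, for H of shape U x N with U <= N and full row rank."""
    H = np.asarray(H, dtype=complex)
    return H.conj().T @ np.linalg.inv(H @ H.conj().T)   # right pseudoinverse of H
```

With `W = zf_precoder(H)`, transmitting `W @ s` gives the noiseless received vector `H @ W @ s`, which equals `s`: user k sees only its own symbol, because column k of W is orthogonal to every other user's channel.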

8.1.2 Nonlinear Precoding

Nonlinear precoding is more like DPC than linear precoding and can produce better results at the cost of increased complexity. Well known nonlinear precoding methods include perturbation methods and Tomlinson-Harashima codes [PHS05, HPS05].

8.2 Scheduling

If the number of users U is greater than the number of transmit antennas N, then the base

station can’t transmit to all the users simultaneously. So at any given time the base station

must choose some subset of the users to transmit to [GKHCS07]. The optimal scheduling

algorithm is to simply perform an exhaustive search over all possible combinations of users.

This is not computationally feasible though, so heuristic methods must be used to choose a

subset of users. A simple choice is a greedy algorithm, which selects the N users with the best

channels.
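The greedy choice can be sketched in plain Python. The sketch assumes that "best channel" means largest channel gain $\|h_k\|^2$, which is one plausible reading and not the only possible metric; the function name `greedy_select` and the dictionary layout are hypothetical.

```python
def greedy_select(channels, N):
    """Pick the N users with the largest channel gain ||h||^2.

    channels maps a user id to that user's list of complex channel coefficients.
    """
    gains = {user: sum(abs(x) ** 2 for x in h) for user, h in channels.items()}
    return sorted(gains, key=gains.get, reverse=True)[:N]
```

This costs one sort over U users, compared with the exhaustive search over every size-N subset. It ignores how correlated the selected users' channels are, which is one reason it is only a heuristic.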

8.3 Working with Partial CSIT

To achieve CSIT each user must feed back its channel estimate to the base station, which is tricky

and reduces capacity [GA04]. To combat this problem some research has been performed into

MU-MIMO systems with only partial CSIT. Basic results have demonstrated that the gains of

MU-MIMO can still be achieved with only partial CSIT, which entails less system complexity.

9 MIMO in Wireless Standards

Many emerging wireless standards provide for MIMO to provide both diversity and multiplexing

gain as needed. This section examines three prominent new wireless standards that employ

MIMO.

9.1 3GPP LTE

The Third Generation Partnership Project Long-Term Evolution (3GPP LTE) is the emerging

4G standard that is currently being implemented and tested. The major features of LTE are

outlined below [3GPP07, 3GPPRel8, 3GPPRel9]:

• High data rates - 100 Mbps in the downlink using 2×2 MIMO and 50 Mbps in the uplink using no MIMO

• Mobility - Best performance for 0-15 km/hr and good performance of 15-120 km/hr.

• Spectrum - No ﬁxed spectrum size. Allowed sizes are 1.25, 1.6, 2.5, 5, 10, 15, and 20 MHz.

• OFDM - LTE uses OFDM with a variable number of subcarriers.

• IP Network - No circuit switched domain but all IP based network.

The downlink in LTE provides several options for using MIMO. The basic option for the downlink

is two antennas at the base station and two at the mobile station. Extensions to LTE allow 4×2 and 4×4 MIMO. If the base station has CSIT, then there are two methods it can apply:

1. Pre-coding SDM - Since the base station knows the channel at the receiver, it can pre-code the transmitted symbols to prevent interference using the V matrix from the SVD of H.

2. Beamforming - Use some form of beamforming such as TMRC.

Without CSIT the base station can use Space-Frequency Block Coding by using the Alamouti

code for each tone.

In the uplink MU-MIMO can be used with the proper scheduling. The baseline case assumes 1×2 and the extension is 1×4.

9.2 WiMAX

WiMAX was originally developed to address the last mile connection to the internet. It has

evolved to provide high data rate mobile data. The key features of WiMAX are outlined below

[IS04, IS05]:

• High data rates - 75 Mbps in 802.16d and 30 Mbps in 802.16e.

• Mobility - Introduced in 802.16e. Range up to 30 miles in 802.16e.

• Spectrum - No ﬁxed spectrum size. Allowed sizes are 1.25, 2.5, 5, 10, and 20 MHz.

• OFDM - WiMAX uses OFDM with a variable number of subcarriers.

The general structure of a WiMAX transmitter is demonstrated in the ﬁgure below:

Figure 19: WiMAX Transmitter [AGM05]


There are several diﬀerent MIMO methods that can be employed in WiMAX. As we have seen

before the methods employed depend on whether the transmitter has channel state information

or not.

Open Loop (No CSIT) The 802.16 standard deﬁnes several options for space-time codes

for 2-4 antennas. However, the two most common codes for space-time coding are:

$$\begin{bmatrix} S_1 \\ S_2 \end{bmatrix} \qquad \begin{bmatrix} S_1 & -S_2^* \\ S_2 & S_1^* \end{bmatrix} \tag{153}$$

where $S_1$ and $S_2$ are OFDM symbols. 802.16 also provides for space-frequency coding called the

Frequency Hopping Diversity Code (FHDC) based on the Alamouti code. The OFDM symbols

are uncoded in time and coded in the frequency domain. The ﬁgure below shows how FHDC

works:

Figure 20: WiMAX Frequency Hopping Diversity [AGM05]

Closed Loop (CSIT) With CSIT the transmitter can make better decisions. One of the

common methods used in feedback is codebook based feedback. The codebook is basically a

predetermined set of choices for the Q matrix in BLAST. The feedback is an index into the codebook that tells the transmitter which matrix to use. Another alternative is to use a feedback

channel and have the receiver transmit a quantized version of the channel. The transmitter can

then design the optimal precoding matrix.

9.3 802.11n

802.11n is the next generation 802.11 LAN that seeks to provide very high data rates. In 802.11

a/b/g high data rates were achieved by using high order modulation like 64-QAM. However,

the cost of this approach is a loss in range because higher SNR is necessary to successfully

demodulate 64-QAM. The way 802.11n seeks to overcome this problem and provide both high data rates and better range is through MIMO-OFDM. 802.11n transmits multiple data streams using multiple transmit and receive antennas. Thus 802.11n achieves higher data rates without using more bandwidth or larger constellations by increasing spectral efficiency.

The key features of 802.11n are outlined below [IWG04, ?, ?]:

• High Data Rate - 130 Mbps typically

• Spectrum - 20 MHz (Optionally 40 MHz)

• OFDM - Uses OFDM

The basic case for 802.11n is 2×2 but the 802.11n standard provides for up to 4×4 MIMO. A

block diagram demonstrating the operation of 802.11n follows:

Figure 21: 802.11n Transmitter [OC06]

The 802.11n transmitter sends every other group of bits to each OFDM branch. Each branch

performs normal OFDM with spatial subcarrier mapping. Then each branch transmits on one

antenna. The receiver architecture is symmetric and is manufacturer speciﬁc.

10 Conclusion

MIMO has become a popular technology for emerging wireless standards because it can provide

better error performance in the form of diversity gain and better data rates in the form of

multiplexing gain without using more bandwidth. In addition, MIMO works well with OFDM,

which has become a ubiquitous feature of modern wireless standards. MIMO continues to be an

active research area with multiuser-MIMO as a new area of great interest for future development.

MIMO is an exciting ﬁeld that looks to be a major part of research and standards in wireless

communications for many years to come.


A Math Review

This section reviews a few common mathematical tools used in MIMO. In particular, this section

covers some important linear algebra topics and Lagrange multipliers for optimization [HJ95].

A.1 Rank

Let $A \in \mathbb{C}^{m \times n}$. Then A can be written in terms of column vectors as $A = [a_1, a_2, \ldots, a_n]$. The column space of A, denoted col(A), is given by

$$\mathrm{col}(A) = \mathrm{span}(a_1, a_2, \ldots, a_n) \tag{154}$$

Then the rank of A, denoted rank(A), is defined to be dim col(A). So the rank of A is the largest number of columns of A that constitute a linearly independent set.

A can be written in terms of rows as $[a_1^T, a_2^T, \ldots, a_m^T]^T$. Then the row space of A, denoted row(A), is given by

$$\mathrm{row}(A) = \mathrm{span}(a_1, a_2, \ldots, a_m) \tag{155}$$

With these definitions it can be demonstrated that rank(A) = dim row(A). Listed below are several useful facts about rank:

1. $\mathrm{rank}(A^T) = \mathrm{rank}(A^*) = \mathrm{rank}(A)$
2. $\mathrm{rank}(A) \le \min\{m, n\}$
3. $\mathrm{rank}(A^* A) = \mathrm{rank}(A)$

4. If A is square, then A is invertible if and only if rank(A) = n.

A.2 Eigenvalues and Eigenvectors

Let A be an $n \times n$ matrix over the complex numbers. A complex number λ and a complex vector $x \ne 0$ are said to be an eigenvalue and its associated eigenvector if

Ax = λx (156)

By simple rearranging this expression can be written as

(A−λI) x = 0 (157)

This equation has non-trivial solutions ($x \ne 0$) only if $A - \lambda I$ is not invertible. This is true precisely when $\det(A - \lambda I) = 0$. When the determinant is expanded and evaluated it becomes an nth degree polynomial in λ. By the Fundamental Theorem of Algebra this polynomial has n complex roots counting multiplicity. Thus A has n eigenvalues including multiplicity.

A.2.1 Diagonalization

Sometimes A can be related to a diagonal matrix D by $A = S^{-1} D S$, where S is an $n \times n$ invertible matrix. If this is possible, then A is said to be diagonalizable. The matrix S can be interpreted as a change of basis that allows the matrix A to be described as a diagonal matrix. This representation is particularly nice in an $n \times n$ MIMO system, since the channel becomes n independent parallel channels. This transformation makes capacity calculation much easier.

So the main point of interest is determining when A is diagonalizable. The following two

conditions are suﬃcient to guarantee that a square matrix is diagonalizable:

1. A has n distinct eigenvalues

2. A has n linearly independent eigenvectors

A.2.2 Connection To The Determinant and Trace

The following formulas are a useful connection between the eigenvalues of a matrix and its determinant and trace:

$$\det(A) = \lambda_1 \lambda_2 \cdots \lambda_n \tag{158}$$

$$\mathrm{tr}(A) = \sum_{i=1}^{n} a_{ii} = \lambda_1 + \lambda_2 + \cdots + \lambda_n \tag{159}$$

A.3 Inner Product Space

$\mathbb{C}^{n \times 1}$ is an inner product space with the inner product given by:

$$\langle x, y \rangle = y^* x \tag{160}$$

There are two important points of interest in regarding $\mathbb{C}^{n \times 1}$ as an inner product space. First, it is possible to perform orthogonal projections of a vector onto a space spanned by several other vectors. This can be used in V-BLAST for the ZF-Nulling receiver. The second point is the Cauchy-Schwarz inequality

$$|\langle x, y \rangle| \le \sqrt{\langle x, x \rangle} \sqrt{\langle y, y \rangle} \tag{161}$$

with equality for y = Kx for any constant K. The Cauchy-Schwarz inequality can be used to derive the optimal receive combining vector for MRC.

A.4 Singular Value Decomposition

It is only possible to diagonalize a square matrix, but sometimes it is desirable to decompose a matrix with arbitrary dimensions into another matrix that is almost diagonal. The singular value decomposition achieves this and is defined for all matrices in $\mathbb{C}^{m \times n}$. Specifically, the singular value decomposition of a matrix $A \in \mathbb{C}^{m \times n}$ is

$$A = U \Sigma V^* \tag{162}$$

where $U \in \mathbb{C}^{m \times m}$ and $V \in \mathbb{C}^{n \times n}$ are unitary matrices, which means

$$U U^* = U^* U = I_m \tag{163}$$

$$V V^* = V^* V = I_n \tag{164}$$

$\Sigma \in \mathbb{C}^{m \times n}$ has non-zero entries only on the diagonal, $\Sigma_{ii}$. The entries $\Sigma_{ii}$ are called the singular values of A. Then it is clear that there are $n_{\min} = \min\{m, n\}$ singular values. Typically the matrix Σ is constructed such that

$$\Sigma_{11} \ge \Sigma_{22} \ge \cdots \ge \Sigma_{n_{\min} n_{\min}} \tag{165}$$

Intuitively what the SVD does is use the matrix V to rotate an input vector to a coordinate

system in which the action of the matrix can be described by a simple matrix Σ. Then the

output of this simple matrix is rotated back to the original coordinate system by U to produce

the output of A.

The concept of the SVD can be viewed as a generalization of eigenvalues. For a column $u \in \mathbb{C}^m$ of U, the corresponding column $v \in \mathbb{C}^n$ of V, and the corresponding singular value $\sigma \ge 0$,

$$A v = \sigma u \tag{166}$$

This equation is similar to the equation that deﬁnes an eigenvalue and eigenvector, which led to

u being called a left singular vector and v a right singular vector. One of the most important

properties of the SVD is that the number of non-zero singular values is precisely the rank of the

matrix A.
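Both statements can be checked on a 2×2 case. The sketch below is an illustration under stated assumptions. It obtains the singular values as the square roots of the eigenvalues of AᵀA (a standard identity, used here in place of a general SVD routine), and the singular vectors u1 and v1 were worked out by hand for the example matrix. The helper names and the tolerance are hypothetical.

```python
import math


def singular_values_2x2(A):
    """Singular values of a real 2x2 matrix, largest first, computed as the
    square roots of the eigenvalues of A^T A."""
    (a, b), (c, d) = A
    # Entries of the symmetric matrix A^T A = [[p, q], [q, r]].
    p, q, r = a * a + c * c, a * b + c * d, b * b + d * d
    mean, dev = (p + r) / 2, math.hypot((p - r) / 2, q)
    return math.sqrt(mean + dev), math.sqrt(max(mean - dev, 0.0))


def numerical_rank(A, tol=1e-6):
    """Number of singular values that exceed a small tolerance."""
    return sum(s > tol for s in singular_values_2x2(A))


A = [[4.0, 0.0], [3.0, -5.0]]
sigma1, sigma2 = singular_values_2x2(A)

# Right and left singular vectors for sigma1, worked out by hand for this A.
v1 = [1 / math.sqrt(2), -1 / math.sqrt(2)]
u1 = [1 / math.sqrt(5), 2 / math.sqrt(5)]
Av1 = [A[0][0] * v1[0] + A[0][1] * v1[1],
       A[1][0] * v1[0] + A[1][1] * v1[1]]
residual = max(abs(Av1[i] - sigma1 * u1[i]) for i in range(2))
print(residual < 1e-9)  # A v = sigma u holds for this pair

print(numerical_rank(A))                         # 2
print(numerical_rank([[1.0, 2.0], [2.0, 4.0]]))  # 1: second row is twice the first
```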

A.4.1 Pseudoinverse

The inverse of a matrix is only defined for square matrices, but there is a way to define a special matrix, called the pseudoinverse, that behaves like an inverse but is defined for arbitrary $m \times n$ matrices. Let $A \in \mathbb{C}^{m \times n}$ have an SVD $U \Sigma V^{*}$. Then the pseudoinverse is defined to be $A^{\dagger} = V \Sigma^{\dagger} U^{*}$, where $\Sigma^{\dagger}$ is the transpose of $\Sigma$ with the non-zero singular values inverted. There are four key properties of the pseudoinverse that define its behavior:

1. $A A^{\dagger} A = A$. Note that $A A^{\dagger}$ is not in general the identity matrix, but the combination of three matrices produces the desired effect.

2. $A^{\dagger} A A^{\dagger} = A^{\dagger}$

3. $(A A^{\dagger})^{*} = A A^{\dagger}$

4. $(A^{\dagger} A)^{*} = A^{\dagger} A$
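The four properties can be verified directly from the SVD-based definition. This is a minimal sketch in plain Python, assuming illustrative real factors U, Σ, and V (so the conjugate transpose reduces to the transpose) with a single non-zero singular value. The helper names are hypothetical.

```python
def matmul(X, Y):
    """Matrix product of two matrices stored as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]


def transpose(X):
    return [list(col) for col in zip(*X)]


def close(X, Y, tol=1e-9):
    """Entrywise comparison of two matrices of the same shape."""
    return all(abs(a - b) < tol for rx, ry in zip(X, Y) for a, b in zip(rx, ry))


def sigma_pinv(Sigma, tol=1e-12):
    """Transpose Sigma and invert its non-zero singular values."""
    return [[1.0 / s if abs(s) > tol else 0.0 for s in row]
            for row in transpose(Sigma)]


# Illustrative real orthogonal factors and a rank-1 Sigma (3 x 2).
U = [[0.6, -0.8, 0.0],
     [0.8, 0.6, 0.0],
     [0.0, 0.0, 1.0]]
Sigma = [[2.0, 0.0],
         [0.0, 0.0],
         [0.0, 0.0]]
V = [[0.8, -0.6],
     [0.6, 0.8]]

A = matmul(matmul(U, Sigma), transpose(V))                    # 3 x 2
A_pinv = matmul(matmul(V, sigma_pinv(Sigma)), transpose(U))   # 2 x 3

P = matmul(A, A_pinv)   # 3 x 3
Q = matmul(A_pinv, A)   # 2 x 2
identity3 = [[1.0 if i == j else 0.0 for j in range(3)] for i in range(3)]

checks = [
    close(matmul(P, A), A),            # 1. A A+ A = A
    close(matmul(Q, A_pinv), A_pinv),  # 2. A+ A A+ = A+
    close(transpose(P), P),            # 3. (A A+)* = A A+
    close(transpose(Q), Q),            # 4. (A+ A)* = A+ A
]
print(checks)
print(close(P, identity3))  # False: A A+ is a projection here, not the identity
```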


A.4.2 Condition Number

Consider solving the linear system Ax = b. The condition number measures how sensitive the solution x is to small changes in b. For the perturbed system $A(x + \Delta x) = b + e$, the change in the solution is $\Delta x = A^{-1} e$, and the condition number is the largest possible ratio of the relative change in x to the relative change in b:

$$\kappa(A) = \max_{b,\, e \neq 0} \frac{\|A^{-1} e\| / \|A^{-1} b\|}{\|e\| / \|b\|} \tag{167}$$

For the Euclidean norm, this quantity is related to the singular values of A by

$$\kappa(A) = \frac{\sigma_{\max}}{\sigma_{\min}} \tag{168}$$
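The relation between the perturbation ratio and the singular values can be seen on a small symmetric matrix. The sketch below is an illustration only. The matrix, the right-hand sides, and the helper names are assumptions for this example, and the 2×2 singular values are obtained from the eigenvalues of AᵀA.

```python
import math


def singular_values_2x2(A):
    """Singular values of a real 2x2 matrix (largest first)."""
    (a, b), (c, d) = A
    p, q, r = a * a + c * c, a * b + c * d, b * b + d * d
    mean, dev = (p + r) / 2, math.hypot((p - r) / 2, q)
    return math.sqrt(mean + dev), math.sqrt(max(mean - dev, 0.0))


def solve_2x2(A, b):
    """Solve A x = b for a 2x2 matrix with Cramer's rule."""
    (a11, a12), (a21, a22) = A
    det = a11 * a22 - a12 * a21
    return [(b[0] * a22 - a12 * b[1]) / det,
            (a11 * b[1] - a21 * b[0]) / det]


def norm(v):
    return math.hypot(v[0], v[1])


def amplification(A, b, e):
    """Relative change in x divided by the relative change in b."""
    x = solve_2x2(A, b)
    dx = solve_2x2(A, e)   # A is linear, so the change in x is A^{-1} e
    return (norm(dx) / norm(x)) / (norm(e) / norm(b))


A = [[2.0, 1.0], [1.0, 2.0]]        # singular values 3 and 1
s_max, s_min = singular_values_2x2(A)
kappa = s_max / s_min

# b along the strongest direction of A, e along the weakest: worst case.
worst = amplification(A, b=[1.0, 1.0], e=[1e-3, -1e-3])
# Roles swapped: the perturbation is attenuated instead.
mild = amplification(A, b=[1.0, -1.0], e=[1e-3, 1e-3])

print(kappa, worst, mild)  # approximately 3, 3, and 1/3
```

The worst case occurs when b lies along the direction with the largest singular value and the perturbation lies along the direction with the smallest one.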

A.5 Lagrange Multipliers

Consider the following optimization problem:

$$\begin{aligned} \text{Maximize:} \quad & f(x_1, x_2, \ldots, x_n) \\ \text{Subject to:} \quad & g(x_1, x_2, \ldots, x_n) = C \end{aligned}$$

The optimal solution can be found by solving the following system of equations given by the gradient

$$\nabla f = \lambda \nabla g, \qquad g = C \tag{169}$$

These equations can be expressed in terms of partial derivatives as

$$\frac{\partial f}{\partial x_i} = \lambda \frac{\partial g}{\partial x_i}, \quad i = 1, 2, \ldots, n, \qquad g = C \tag{170}$$

Lagrange multipliers can be used to find the optimal power allocation to maximize capacity.
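As an illustration of the power allocation use case, the sketch below assumes a sum-rate objective of the form Σ log₂(1 + Pᵢ gᵢ) over parallel channels with gains gᵢ = σᵢ²/N₀ and a total power constraint Σ Pᵢ = P. Setting ∂f/∂Pᵢ = λ ∂g/∂Pᵢ gives Pᵢ = μ − 1/gᵢ for a common level μ. The additional requirement Pᵢ ≥ 0 goes slightly beyond the equality-constrained setting above and is handled by switching off the weakest channels (the water-filling solution). The function names and the numbers are hypothetical.

```python
import math


def water_filling(gains, total_power):
    """Allocate power to maximize sum(log2(1 + P_i * g_i)) subject to
    sum(P_i) = total_power and P_i >= 0. All gains must be positive."""
    order = sorted(range(len(gains)), key=lambda i: gains[i], reverse=True)
    powers = [0.0] * len(gains)
    # Try the k strongest channels, dropping the weakest until all P_i > 0.
    for k in range(len(gains), 0, -1):
        active = order[:k]
        # Stationarity gives P_i = mu - 1/g_i; mu follows from sum(P_i) = P.
        mu = (total_power + sum(1.0 / gains[i] for i in active)) / k
        if mu - 1.0 / gains[active[-1]] > 0:
            for i in active:
                powers[i] = mu - 1.0 / gains[i]
            break
    return powers


def sum_rate(gains, powers):
    return sum(math.log2(1.0 + p * g) for g, p in zip(gains, powers))


gains = [2.0, 1.0, 0.1]   # illustrative channel gains sigma_i^2 / N0
P = 1.0                   # illustrative total power

p_opt = water_filling(gains, P)
p_equal = [P / len(gains)] * len(gains)

print(p_opt)  # [0.75, 0.25, 0.0]: the weakest channel gets no power
print(sum_rate(gains, p_opt) > sum_rate(gains, p_equal))  # True
```

Stronger channels receive more power, and channels whose inverse gain exceeds the level μ are left unused.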

References

[3GPPRel8] 3GPP, “Technical Speciﬁcation Group Radio Access Network Requirements for

Further Advancements for E-UTRA (LTE-Advanced) (Release 8)”

[3GPPRel9] 3GPP. “Overview of 3GPP Release 9 V.0.0.4 (2009-01)”

[3GPP07] 3GPP. “Physical Channels and Modulation (Release 8)”. September 2007.

[AG05] J. Akhtar and D. Gesbert. “Spatial Multiplexing over correlated MIMO channels with

a closed-form precoder”. IEEE Trans. Wireless Commun., 4(5):2400-2409, September 2005.


[AGM05] J. Andrews, A. Ghosh, and R. Muhamed. Fundamentals of WiMAX

[Ala98] S.M. Alamouti. “A simple transmit diversity technique for wireless communications”.

IEEE J. Select. Areas Commun., 16(10):1451-1458, October 1998.

[BRV05] J.C. Belfiore, G. Rekaya, and E. Viterbo. “The golden code: a 2×2 full-rate space-time code with non-vanishing determinants”. IEEE Trans. Inform. Theory, 51(4):1432-1436, April 2005.

[CKT98] C.N. Chuah, J.M. Kahn, and D. Tse. “Capacity of multi-antenna array systems in

indoor wireless environment”. In Proc. Globecom 1998 - IEEE Global Telecommunications

Conf., volume 4, pages 1894-1899, Sydney, Australia, 1998.

[CT91] T. Cover and J. Thomas. Elements of Information Theory. Wiley, New York, NY, 1991.

[DTB02] M.O. Damen, A. Tewfik, and J.-C. Belfiore. “A construction of a space-time code based on number theory”. IEEE Trans. Inform. Theory, 48(3):753-760, March 2002.

[Fle00] B.H. Fleury. “First- and second-order characterization of direction dispersion and space

selectivity in the radio channel”. IEEE Trans. Inform. Theory, 46(6):2027-2044, June 2000.

[FG98] G.J. Foschini and M.J. Gans. “On Limits of Wireless Communications in a Fading Environment when Using Multiple Antennas”. Wireless Personal Communications, 6(3), March 1998.

[Fos96] G.J. Foschini. “Layered space-time architecture for wireless communication in a fading

environment when using multi-element antennas”. Bell Labs Tech. J., pages 41-59, Autumn

1996.

[GA04] D. Gesbert and M.-S. Alouini, “How much feedback is multi-user diversity really

worth?”, Proc. IEEE Int. Conf. on Comm. (ICC), Paris, France, June 2004, pp. 234-238

[GC80] A. El Gamal and T.M. Cover, “Multiple user information theory”, Proc. IEEE, vol. 68,

no. 12, pp. 1466-1483, Dec. 1980

[GD03] H.E. Gamal and M.O. Damen. “Universal space-time coding”. IEEE Trans. Inform.

Theory, 49(5):1097-1119, May 2003.

[GKHCS07] D. Gesbert, M. Kountouris, R. W. Heath, Jr., C.-B. Chae, and T. Salzer, “Shifting the MIMO Paradigm: From Single User to Multiuser Communications”, IEEE Signal Processing Magazine, vol. 24, no. 5, pp. 36-46, Oct. 2007

[Gold05] Andrea Goldsmith. Wireless Communications. Cambridge University Press, 2005.

[GFVW99] G.D. Golden, G.J. Foschini, R.A. Valenzuela, and P.W. Wolniansky. “Detection algorithm and initial laboratory results using the V-BLAST space-time communication architecture”. Elect. Lett., 35(1):14-15, January 1999.


[God97] L.C. Godara. “Applications of antenna arrays to mobile communications, part II:

Beamforming and direction-of-arrival considerations”. Proceedings IEEE, 85(8):1195-1245,

August 1997.

[GSP02] D. Gore, S. Sandhu, and A. Paulraj. “Delay diversity codes for frequency selective channels”. In Proc. ICC 2002 - IEEE Int. Conf. Commun., pages 1949-1953, New York, May 2002.

[Hea01] R. Heath. “Space-Time Signaling in Multi-Antenna Systems”. PhD thesis, Stanford

University, November 2001.

[Her04] M. Herdin. “Non-stationary indoor MIMO radio channels”. PhD thesis, Technische Universität Wien, August 2004.

[HH01] B. Hassibi and B. Hochwald. “High-rate linear space-time codes”. In Proc. ICASSP

2001 - IEEE Int. Conf. Acoust. Speech and Signal Processing, volume 4, pages 2461-2464,

Salt Lake City, UT, May 2001.

[HJ95] R.A. Horn and C.R. Johnson. Topics in matrix analysis. Cambridge University Press,

Cambridge, UK, 1995.

[HPS05] B. M. Hochwald, C. B. Peel, and A. L. Swindlehurst, “A vector-perturbation technique

for near capacity multiantenna multiuser communication - part II: perturbation”, IEEE

Trans. Comm., vol. 53, no. 3, pp. 537-544, March 2005

[HV05] B. Hassibi and H. Vikalo, “On the sphere decoding algorithm: Part I, the expected

complexity”, IEEE Transactions on Signal Processing, vol. 53, pp. 2806 - 2818, Aug. 2005.

[Jaf01] H. Jafarkhani. “A quasi orthogonal space-time block code”. IEEE Trans. Commun.,

49(1):1-4, January 2001.

[Jak71] W.C. Jakes. “A Comparison of Specific Space Diversity Techniques for the Reduction of Fast Fading in UHF Mobile Radio Systems,” IEEE Trans. on Veh. Techn., Vol. VT-20, No. 4, pp. 81-93, Nov. 1971.

[Kah54] L. Kahn, “Ratio Squarer,” Proc. of IRE(Corr.), Vol. 42, pp. 1074, November 1954.

[IS04] IEEE Standard 802.16-2004

[IS05] IEEE Standard 802.16e-2005, Amendment to IEEE Standard for Local and Metropolitan

Area Networks - 6 Part 16: Air Interface for Fixed Broadband Wireless Access System

[IWG04] IEEE 802.16 Working Group, Part 16: Air Interface for Fixed and Mobile Broadband

Wireless Access Systems Amendment 2: Physical and Medium Access Control Layers for

Combined Fixed and Mobile Operation in Licensed Bands, 2004

[IWG206] IEEE 802.16 Working Group, Part 16: Air Interface for Fixed and Mobile Broadband

Wireless Access Systems Amendment 2: Physical and Medium Access Control Layers for

Combined Fixed and Mobile Operation in Licensed Bands, 2006


[IWG1106] IEEE 802.11 Working Group, IEEE P802.11n/D1.0 Draft Amendment to Standard for Information Technology - Telecommunications and Information Exchange Between Systems - Local and Metropolitan Networks - Specific Requirements - Part 11: Wireless LAN Medium Access Control (MAC) and Physical Layer (PHY) Specifications: Enhancements for Higher Throughput, March 2006.

[LP00] E. Lindskog and A.J. Paulraj. “A transmit diversity scheme for channels with intersymbol interference”. In Proc. ICC 2000 - IEEE Int. Conf. Commun., volume 1, pages 307-311, New Orleans, June 2000.

[OC06] Claude Oestges and Bruno Clerckx. MIMO Wireless Communications. Artech House

Press.

[Par00] J.D. Parsons. The mobile radio propagation channel. 2nd ed., Wiley, London, UK, 2000.

[PHS05] C. B. Peel, B. M. Hochwald, and A. L. Swindlehurst, “A vector-perturbation technique

for near capacity multiantenna multiuser communication - part I: channel inversion and

regularization”, IEEE Trans. Comm., vol. 53, no. 1, pp. 195-202, Jan. 2005

[PNG03] A. Paulraj, R. Nabar, and D. Gore. Introduction to Space-Time Wireless Communications. Cambridge University Press, Cambridge, UK, 2003.

[Pro01] J.G. Proakis. Digital communications. 4th ed., McGraw-Hill, New York, NY, 2001.

[Rapp02] T.S. Rappaport. Wireless Communications: Principles and Practice. Prentice Hall.

[SA00] M.K. Simon and M.-S. Alouini. “Digital communications over fading channels: a unified approach to performance analysis”. Wiley, New York, NY, 2000.

[San02] S. Sandhu. “Signal Design for Multiple-Input Multiple-Output Wireless: A Uniﬁed

Perspective”. PhD thesis, Stanford University, August 2002.

[SB07] M. Sharif and B. Hassibi, “A comparison of time-sharing, DPC, and beamforming for

MIMO broadcast channels with many users”, IEEE Trans. Comm., vol. 55, no. 1, pp. 11-15,

Jan. 2007.

[Sim01] M.K. Simon. “Evaluation of average bit error probability for space-time coding based

on a simpler exact evaluation of pairwise error probability”. Journal of Communications and

Networks, 3(3):257-264, September 2001.

[SMB01] M. Steinbauer, A.F. Molisch, and E. Bonek. “The double-directional radio channel”.

IEEE Antennas Propagat. Mag., 43(4):51-63, August 2001.

[SP03] N. Sharma and C.B. Papadias. “Improved quasi-orthogonal codes through constellation

rotation”. IEEE Trans. Commun., 51(3):332-335, March 2003.

[SSH04] Q. Spencer, A. L. Swindlehurst, and M. Haardt, “Zero-forcing methods for downlink

spatial multiplexing in multiuser MIMO channels”, IEEE Trans. Sig. Proc., vol. 52, no. 2,

pp. 462-471, Feb. 2004.


[SSRS03] B.A. Sethuraman, B. Sundar Rajan, and V. Shashidhar. “Full-diversity, high-rate,

space-time block codes from division algebras”. IEEE Trans. Inform. Theory, 49(10):2596-

2616, October 2003.

[SW94] N. Seshadri and J.H. Winters. “Two signaling schemes for improving the error performance of frequency-division-duplex (FDD) transmission systems using transmitter antenna diversity”. Int. J. Wireless Information Networks, 1:49-60, 1994.

[SX04] W. Su and X. Xia. “Signal constellations for quasi-orthogonal space-time block codes with full diversity”. IEEE Trans. Inform. Theory, 50(10):2331-2347, October 2004.

[TBH00] O. Tirkkonen, A. Boariu, and A. Hottinen. “Minimal non-orthogonality rate 1 space-

time block code for 3+Tx antennas”. In IEEE 6th Int. Symp. on Spread-Spectrum Tech. and

Appl. (ISSSTA 2000), pages 429-432, September 2000.

[Tel95] E. Telatar. “Capacity of multiantenna Gaussian channels”. Tech. Rep., AT&T Bell Labs, 1995.

[TJC99] V. Tarokh, H. Jafarkhani, and A.R. Calderbank. “Space-time block codes from orthogonal designs”. IEEE Trans. Inform. Theory, 45(7):1456-1467, July 1999.

[TSC98] V. Tarokh, N. Seshadri, and A.R. Calderbank. “Space-time codes for high data rate

wireless communication: Performance criterion and code construction”. IEEE Trans. Inform.

Theory, 44(3):744-765, March 1998.

[TV05] D. Tse and P. Viswanath. Fundamentals of wireless communication. Cambridge University Press, Cambridge, UK, 2005.

[VB99] E. Viterbo and J. Boutros. “A universal lattice code decoder for fading channels”. IEEE

Trans. Inform. Theory, 45(5):1639-1642, July 1999.

[Wit93] A. Wittneben. “A new bandwidth eﬃcient transmit antenna modulation diversity

scheme for linear digital modulation”. In Proc. ICC 1993 - IEEE Int. Conf. Commun.,

pages 1630-1633, 1993.

[WX05] D. Wang and X.G. Xia. “Optimal diversity product rotations for quasiorthogonal STBC with MPSK symbols”. IEEE Commun. Lett., 9(5):420-422, May 2005.

[XL05] L. Xian and H. Liu. “Optimal rotation angles for quasi-orthogonal space-time codes with PSK modulation”. IEEE Commun. Lett., 9(8):676-678, August 2005.

[Yac93] M.D. Yacoub. Foundations of mobile radio engineering. CRC Press, Boca Raton, FL, 1993.

[YW03] H. Yao and G.W. Wornell. “Structured space-time block codes with optimal diversity-multiplexing tradeoff and minimum delay”. In Proc. Globecom 2003 - IEEE Global Telecommunications Conf., volume 4, pages 1941-1945, San Francisco, CA, December 2003.

[ZT03] L. Zheng and D. Tse. “Diversity and multiplexing: a fundamental tradeoﬀ in multiple-

antenna channels”. IEEE Trans. Inform. Theory, 49(5):1073-1096, May 2003.



1 . MIMO in Wireless Standards 2 Beneﬁts of MIMO The two major beneﬁts of MIMO are diversity gain. Multiple Input and Multiple Output (MIMO) wireless communication systems have become a hot research topic because they promise to deal with all of these issues by providing both increased resilience to fading and increased capacity without using more bandwidth or power. The ﬁgure below shows a simple MIMO setup with nt transmit antennas and nr receive antennas. In the 1990’s MIMO systems with multiple antennas at both the transmitter and receiver were proposed. Multi-User MIMO and Applications 6. better quality of service.1 Introduction Wireless systems face several challenges including demands for higher data rates. which achieved high spectral eﬃciency on the order of 10-20 bits/s/Hz [Fos96]. Instead of just using diversity to combat fading MIMO systems actively take advantage of multipath to work. increased rate of transmission by exploiting the increased degrees of freedom oﬀered by the spatial MIMO channel. Space-Time Coding and Architectures 4. and increased network capacity while working with limited amounts of spectrum. In the 2000’s MIMO has continued to be developed and there are now plans to implement MIMO in several new wireless standards such as 802. One of the early seminal works in MIMO was Telatar’s paper. Early methods provided for spatial diversity to improve error performance and beamforming to increase SNR by focusing the energy from an antenna into a desired direction. Around the same time Bell Labs developed the BLAST architectures.11n. WiMAX. which demonstrated the potential for improved capacity with no extra spectrum [Tel95]. increased resilience to fading in the form of better error performance. and multiplexing gain. Diversity-Multiplexing Tradeoﬀ 3. and LTE. This tutorial paper focuses on the following major topics in MIMO: 1. MIMO Channel Modeling and Capacity 2. 
Also around the same time the ﬁrst space-time coding methods were proposed [TSC98]. Methods to take advantage of multiple antennas at the receiver or the transmitter were known from the 1950’s onward. Space-Time Coding in Frequency Selective Channels 5.

1 Diversity Diversity is an attempt to exploit redundancies in the way information is sent to achieve better error performance by cleverly using multiple copies of the same signal. orthogonal frequency division multiplexing(OFDM) can apply modulation order adaption to each subcarrier depending on the quality of a given subchannel. Finally.Figure 1: MIMO System Concept [Gold05] 2. It is generally of interest to quantify exactly how much diversity a given scheme provides. The simplest example is the repetition code. Frequency diversity exploits the variations in a frequency selective channel. L is the diversity gain. The second type of antenna diversity is to use multiple antennas with diﬀerent polarizations. This type of antenna diversity is one of the main focuses of this paper. Time diversity involves averaging the fading eﬀects of the channel over time. For the case of nt × nr narrowband MIMO the maximum possible diversity gain is nt nr . there are several diﬀerent types of antenna diversity. The most obvious type is to simply use multiple antennas. which transmits the same symbol multiple times with the transmissions separated by more than the coherence time of the channel. The receiver decodes each symbol independently and estimates the transmitted symbol by majority rule. Three fundamental types of diversity are time. For example. and antenna diversity. which is the maximum number of independent copies of the same signal that the receiver sees. frequency. 2 . This can be done through calculating either the average probability of error or the outage probability. The third type of antenna diversity is to use multiple antennas with diﬀerent non-overlapping beam patterns. Both of these expressions can generally be approximated as SNR−L at high SNR. This diversity gain can be more rigorously deﬁned as L = − lim log(Pe ) SNR→∞ log(SNR) (1) This is just a formalization of the intuition above that replaces “at high SNR” with a limit.

Consider an arbitrary constellation C containing M points.1 Union Bound on Probability of Error It can be diﬃcult to calculate an exact expression for the probability of error for an arbitrary modulation. Write the constellation as C = {c1 . . cM } (2) Let Pe be the probability of symbol error. This is denoted P [cm → cl ]. . c2 . A simplifying approximation is the pairwise error probability(PEP) in which it is assumed for the purposes of calculation that only cm and cl are in the constellation. so M 2 P ||cm − cl || Pe|cm ≤ Q No 2 l=1 l=m (5) 3 .1. Assuming all symbols are equally likely then Pe = 1 M M Pe|cm m=1 (3) The conditional probability of symbol of error can be expanded as Pe|cm = P [cm is detected incorrectly | cm was transmitted] M = l=1 l=m P [cm is estimated as cl | cm was transmitted] Computing each of these probabilities is diﬃcult and requires integration over a possibly complicated Voronoi region speciﬁc to each type of modulation.2. so it is useful to calculate an upper bound on the probability of error. Let Pe|ci be the probability of symbol error given ci ∈ C was sent. . For complex AWGN 2 Ex ||cm − cl || P [cm → cl ] = Q (4) No 2 PEP overestimates the probability of decoding cm as cl . .

For the typical Gaussian memoryless channel the channel capacity is C = B log2 (1 + SNR). Thus P [B log2 (1 + SNR) < R] = P SNR < 2R/B − 1 Generally for other channels the outage condition reduces to the SNR being below a certain threshold.1.2 Outage Probability Formally the channel is in outage if the rate of transmission exceeds the channel capacity. 2. The outage probability is the probability that this situation occurs: P [C < R]. min Then M M 1 P ||cm − cl ||2 Pe ≤ Q M m=1 l=1 No 2 l=m ≤ 1 M 1 M M M Q m=1 l=1 l=m M P No 2 d2 min = (M − 1)Q m=1 = (M − 1)Q The Chernoﬀ bound on the Q-function is Q(x) ≤ P d2 min No 2 P No 2 d2 min (6) 1 −x2 /2 e 2 (7) So then the probability of error can be approximated as Pe ≤ P M − 1 − N d24 min e o 2 (8) This bound is very useful in calculating the diversity gain for simple multiple antenna systems. Thus the diversity gain can generally be found from the probability: P [SNR < γ] 4 . From expressions for outage probability one can also ﬁnd the diversity gain in a manner similar to the average probability of error method.Then let d2 be the square of the minimum distance between points in the constellation C.

which are placed on the cosine and sine terms. The major assumptions underlying these schemes are whether the receiver has channel information(CSIR) or the transmitter has channel information(CSIT).1 Channel Models The channel model for these basic MIMO schemes is a simple extension of the scalar Rayleigh channel. which is demultiplexed by the symbol mapping operation into two streams. however. CSIT is trickier to achieve as the receiver must estimate the channel and feed the estimate back to the transmitter through a feedback channel in FDD or the transmitter must assume that it sees the same channel as the receiver in TDD. This channel model is justiﬁed in terms of physical propagation models in the next section. This system has two real degrees of freedom (1 complex degree of freedom) because independent streams of bits could be transmitted on the cosine and sine terms. 3 Basic Schemes for Multiple Antennas Now consider a few basic multiple antennas schemes that can provide diversity. To see how MIMO achieves this ﬁrst consider QAM.2. the two independent streams usually come from one original stream. Feedback entails a cost in terms of lost capacity and bandwidth. so there are four degrees of freedom. Fundamentally MIMO provides increased rates in a similar way by providing even more degrees of freedom. each transmit antenna can send an independent stream.2 Spatial Multiplexing Besides providing diversity gain and improved error performance MIMO can also provide increased data rates and spectral eﬃciency through spatial multiplexing. In the 4 × 4 MIMO case. The channels are now modeled as complex Gaussian vectors with CN (0. nr }. I) distribution. 3. 5 . In practice. The transmitted signal can be expressed as xn (t) = an (t)cos(2πfc t) − bn (t)sin(2πfc t) (9) assuming the appropriate normalizations have been made. which will be received by all four antennas simultaneously. The maximum possible complex degrees of freedom for MIMO is min{nt . 
CSIR is a pretty common assumption and can be achieved through several estimation methods. for example. The degrees of freedom in MIMO come from the multiple antennas transmitting independent streams.

No ).3. For a complex gaussian vector of length m with correlation matrix R the pdf for the vector h is given by 1 ∗ −1 e−h R h (11) fh = m π | det R| Also by the deﬁnition of the pdf ··· 1 π m | det R| e−h ∗ R−1 h (10) dh = 1 (12) Then the average probability of error for the Rayleigh channel is Pe M − 1 − h∗ hP d24 min ≤ E e No 2 M −1 1 − h∗ hP d24 −h∗ h min e dh = e No 2 π M −1 = 2 M −1 = 2 = 1 −h∗ e π 1+ „ P 1+ N o d2 min 4 «−1 h dh d2 min 4 o 2 P dmin No 4 2 P dmin No 4 −1 −1 e „ P −h∗ 1+ N «−1 h dh π 1+ 2 P dmin No 4 M −1 2 1+ (13) At high SNR P e ≈ SNR−1 (14) This corresponds to a diversity gain of 1.the Single-In Multiple-Out (SIMO) case. which is to be expected as there is only one copy of the signal. This system can be modeled as y[n] = hx[n] + v 6 (15) .3 Maximal Ratio Combining Consider a system with a single transmit antenna and nr receive antennas .2 Scalar Rayleigh Channel For comparison consider the scalar Rayleigh channel y[n] = hx[n] + v[n] with h¬ CN (0. 3. 1) and v[n]¬ CN (0.

In MRC this is done with a weighted summation of the received branches performed by a complex vector q. Multiplying by the conjugate of the channel co-phases the signals and then weights the branches by the channel amplitude [Rapp02]. z[n] = qhx[n] + qv (16) In this case the SNR can be calculated and bounded with the Cauchy-Schwarz inequality.4 Selection Combining This method has the same general setup as MRC but the receiver selects the best receive antennas with largest |hi | as opposed to combining the signal from all antennas [Jak71]. SNR = |qh|2 P |q|2 No |q|2 |h|2 P ≤ |q|2 No |h|2 P ≤ No (17) Thus the optimal choice for q is q = h∗ . This is eﬀectively a matched ﬁlter.This model is basically an extension of the scalar Rayleigh channel to the vector iid Rayleigh channel. which achieves the maximum SNR [Kah54]. Now to calculate the diversity gain of MRC through the average probability of error with the union upper bound consider: Pe ≤ E M − 1 − ||h||2 P d24 min e No 2 M − 1 − h∗ hP d24 min = E e No 2 Then since R = I.the number of receive antennas and also the number of copies of the symbol that the receiver sees. 3. The receiver must take the received parallel signal and estimate the transmitted symbol. Pe ≤ M −1 2 ··· 1 − h∗ hP d24 −h∗ h min e No e dh nr π P d2 4No min −nr M −1 = 2 = 2 Then at high SNR +1 −nr ··· π nr nr e −h∗ “ P ( 4No d2 +1) min −1 ” I h dh P d2 4No min +1 (18) M −1 P d2 4No min +1 P e ≈ SNR−nr (19) From this calculation it is evident that the diversity gain is nr . This 7 . This kind of action is similar to the RAKE receiver for CDMA.

3. However.5 Equal Gain Combining The branches from each antenna are ﬁrst co-phased to cancel out the eﬀects of the channel and then they are simply added together to produce the output. If the channel tap hi = αi ejθi . To get an intuitive explanation for this one can consider the eﬀective channel that the receiver sees to be h1 + h2 + · · · + hnt ¬ CN (0. . This eﬀective channel behaves like a scalar Rayleigh n channel. but with a 1-3 dB penalty depending on the exact setup and number of antennas. Assuming each branch has amplitude sk . With multiple transmit antennas it is important to keep the total transmit power P constant to allow a fair comparison to the cases with only one transmit antenna. Most of the gain comes from going from one receive antenna to two and three receive antennas. . 1) (23) √ nt The √1 t normalizes the transmit power. this method does not achieve any diversity gain. then the co-phasing operation is simply a multiplication of each branch by e−jθi . snr } < S] = 1 − e−S So the pdf of smax is given by psmax (S) = nr 2Se−S Then the average received SNR is nr 2 2 nr (20) 1 − e−S 2 nr −1 (21) SNR = P i=1 1 i (22) It is obvious from this equation that increasing the number of receive antennas provides a diminishing return. . 3. which requires CSIT and is a close analog of MRC.6 Transmit Maximal Ratio Combining We have considered the SIMO case and now it is time to consider the Multiple-In Single-Out (MISO) case with nt transmit antennas. An approach that does work is Transmit Maximal Ratio Combining (TMRC). The question now is if any diversity gain can be achieved and if so how? A ﬁrst attempt to achieve diversity gain with multiple transmit antennas is to simply transmit the same symbol on each branch. then as outlined in [SA00] P [max{s1 .nr . The system can be modeled as y = hx + v 8 (24) . . s2 . which provides no diversity gain beyond the scalar Rayleigh channel. 
Equal gain combining produces performance similar to MRC and achieves the full diversity gain as demonstrated in [Yac93].method can achieve the same diversity gain as MRC .

A weighting vector $\mathbf{q}$ sends a weighted version of the current symbol $x$ to each antenna. So then

$$y = \mathbf{h}\mathbf{q}x + v \tag{25}$$

Then by a derivation similar to MRC it can be shown that the optimal choice for $\mathbf{q}$ is $\mathbf{q} = \mathbf{h}^*$ [God97]. The diversity gain for this scheme is $n_t$.

3.7 Alamouti Code

Using TMRC requires CSIT, which entails a host of other problems including delay issues and channel estimation accuracy issues. However, Alamouti in [Ala98] showed that in the $2 \times n_r$ case it is possible to achieve the full diversity, $2n_r$, without CSIT using a clever transmit scheme with minimal drawbacks. To get the idea of the Alamouti code consider the $2 \times 1$ case. For a narrowband channel the system can be modeled as

$$y[n] = h_1 x_1[n] + h_2 x_2[n] + v[n] \tag{26}$$

with $h_1$ and $h_2$ the channel coefficients. To transmit two symbols $u_1$ and $u_2$ do the following over two symbol times:

1. During the first symbol time send $x_1[n] = u_1$ and $x_2[n] = u_2$.
2. During the second symbol time send $x_1[n+1] = -u_2^*$ and $x_2[n+1] = u_1^*$.

The system can then be written in matrix form as

$$\begin{bmatrix} y[n] & y[n+1] \end{bmatrix} = \begin{bmatrix} h_1 & h_2 \end{bmatrix} \begin{bmatrix} u_1 & -u_2^* \\ u_2 & u_1^* \end{bmatrix} + \begin{bmatrix} v[n] & v[n+1] \end{bmatrix} \tag{27}$$

The receiver is trying to detect $u_1$ and $u_2$, so it is more convenient to write the system in the following form obtained by conjugating $y[n+1]$:

$$\begin{bmatrix} y[n] \\ y^*[n+1] \end{bmatrix} = \begin{bmatrix} h_1 & h_2 \\ h_2^* & -h_1^* \end{bmatrix} \begin{bmatrix} u_1 \\ u_2 \end{bmatrix} + \begin{bmatrix} v[n] \\ v^*[n+1] \end{bmatrix} \tag{28}$$

The two columns of the square matrix are orthogonal:

$$\begin{bmatrix} h_1^* & h_2 \\ h_2^* & -h_1 \end{bmatrix} \begin{bmatrix} h_1 & h_2 \\ h_2^* & -h_1^* \end{bmatrix} = \begin{bmatrix} |h_1|^2 + |h_2|^2 & 0 \\ 0 & |h_1|^2 + |h_2|^2 \end{bmatrix}$$

Thus this detection problem can be decomposed into simple scalar detection problems by projecting the received vector onto each (normalized) column of this matrix. Then the received signal for each symbol that is used for detection is

$$r_i = \|\mathbf{h}\| u_i + \tilde{v}_i \tag{29}$$

with $\tilde{v}_i \sim \mathcal{CN}(0, N_0)$. In detection the vector channel is decomposed into a scalar Rayleigh channel for each symbol. It can be shown that the diversity gain is 2. In fact, the Alamouti code can be extended to the full $2 \times n_r$ case by using the same transmission scheme as the $2 \times 1$ case and MRC. This method provides the full $2n_r$ diversity gain.

Also the Alamouti code transmits two symbols over two symbol periods, so its effective rate of transmission is the same as the original symbol rate. Since each symbol is transmitted twice, the transmit power of each antenna must be reduced by 3 dB compared to the single antenna case to normalize the total power. This power loss hurts detection but not so much as to make the Alamouti code useless. Finally, there are some advantages to using antennas transmitting with lower power. At lower power it is easier to find cheap amplifiers that can operate in the linear region. The Alamouti code is representative of a larger class of codes called orthogonal space-time block codes (O-STBCs) that also have easy detection due to orthogonality.

Figure 2: Comparison of Alamouti and MRC Error Performance [OC06]

4 MIMO Channel Modeling and Capacity

In this section we will consider several MIMO channels and the physical meaning behind these channels. Of particular interest is how the structure of a MIMO channel suggests the gains of MIMO. For all MIMO channels we will assume the rate of transmission is high enough that the channel will be slow fading, which is a reasonable assumption in any modern high speed wireless system.
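Returning briefly to the Alamouti scheme of Section 3.7, the following is a minimal sketch of the $2 \times 1$ encoding and the projection-based detection, written in plain Python. The function names (`alamouti_encode`, `alamouti_receive`, `alamouti_decode`) are illustrative and not taken from the tutorial or from any library, and the example assumes a noiseless channel by default so that the decoupling can be checked exactly.

```python
import math

def alamouti_encode(u1, u2):
    """Return the antenna pairs (x1, x2) for symbol times n and n+1."""
    return [(u1, u2), (-u2.conjugate(), u1.conjugate())]

def alamouti_receive(h1, h2, tx, noise=(0j, 0j)):
    """Narrowband 2x1 channel y = h1*x1 + h2*x2 + v, applied to both symbol times."""
    return [h1 * x1 + h2 * x2 + v for (x1, x2), v in zip(tx, noise)]

def alamouti_decode(h1, h2, y):
    """Project [y[n], conj(y[n+1])] onto the normalized columns
    [h1, conj(h2)] and [h2, -conj(h1)] of the effective channel matrix."""
    y0, y1c = y[0], y[1].conjugate()
    norm = math.sqrt(abs(h1) ** 2 + abs(h2) ** 2)
    r1 = (h1.conjugate() * y0 + h2 * y1c) / norm
    r2 = (h2.conjugate() * y0 - h1 * y1c) / norm
    return r1, r2, norm  # without noise, r_i = norm * u_i

if __name__ == "__main__":
    h1, h2 = 0.3 - 0.8j, -1.1 + 0.2j      # arbitrary example channel taps
    u1, u2 = 1 + 1j, -1 + 1j              # two QPSK-like symbols
    y = alamouti_receive(h1, h2, alamouti_encode(u1, u2))
    r1, r2, gain = alamouti_decode(h1, h2, y)
    print(r1 / gain, r2 / gain)           # recovers u1 and u2 up to rounding
```

Each decision statistic depends on only one symbol and is scaled by $\|\mathbf{h}\|$, which is the source of the diversity gain of 2.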

No Inr ). vnr (30) with vi ¬ CN (0. . + .1 Narrowband MIMO Channel First.1. nt }. . The singular value decomposition (SVD). Both U and V are unitary. . which means UU∗ = U∗ U = Inr VV∗ = V∗ V = Int Then the system becomes y = (UΣV∗ )x + v Deﬁne y = U∗ y. Then the matrix Σ is zero ˜ except on the diagonals where Σii = σi is the ith singular value of H. 4. No ). the capacity is easy to compute. In this case the system can be modeled with matrices as y = Hx + v.4. and v = U∗ v. In addition by convention. since U∗ is unitary. This coordinate change transforms the complicated system described by H into the simple system with independent parallel channels described by Σ. and Σ ∈ Rnr ×nt . This is a nice mathematical formulation. however. Let nmin = min{nr . . The capacity that a MIMO system can support in this case assuming CSIR and CSIT is n 2 Pi σi (36) Csum = B log2 1 + N0 i=1 11 . σ1 ≥ σ2 ≥ · · · ≥ σnmin . This can be written as y1 h11 · · · h1nt x1 . ynr hnr 1 · · · hnr nt xnt v1 . The SVD of H is H = UΣV∗ with U ∈ Cnr ×nr . .1 Narrowband MIMO Channel Capacity Since the MIMO channel has been decomposed into several parallel channels. = . Then ˜ ˜ ˜ y = Σ˜ + v ˜ x ˜ (35) (34) (32) (33) (31) with v ¬ CN (0. . . . consider the narrowband MIMO channel in which the channel is modeled as a single complex coeﬃcient hij between the jth transmit antenna and the ith receive antenna [Gold05].. . x = V∗ y. V ∈ Cnt ×nt . can provide the desired insight. . . but it oﬀers little insight into what constitutes desirable properties for H. . .

The power allocation $P_i$ can be chosen by trying to maximize $C_{\mathrm{sum}}$ subject to the constraint $\sum_{i=1}^{n_{\min}} P_i = P$. Lagrange multipliers can be used in this case to compute the optimal power allocation:

$$\frac{\partial}{\partial P_i} \sum_{i=1}^{n_{\min}} B \log_2\left(1 + \frac{P_i \sigma_i^2}{N_0}\right) = \lambda \frac{\partial}{\partial P_i} \sum_{i=1}^{n_{\min}} P_i$$

$$\frac{B \sigma_i^2}{(P_i \sigma_i^2 + N_0)\log(2)} = \lambda$$

$$P_i = \frac{B}{\lambda \log(2)} - \frac{N_0}{\sigma_i^2} \tag{37}$$

with $\lambda$ chosen such that $\sum_{i=1}^{n_{\min}} P_i = P$ and with $P_i$ set to zero wherever the expression is negative. This power allocation method is known as the waterfilling power allocation, as demonstrated in [CT91, CKT98, Tel95]. The term $\frac{B}{\lambda \log(2)}$ represents the surface of the water and the $\frac{N_0}{\sigma_i^2}$ term represents the level of the vessel bottom for each singular value, so the allocated power $P_i$ is the depth of the water above it.

4.1.2 Rank and Condition Number

At high SNR the waterfilling allocation is close to the uniform power allocation. Let $k$ be the number of nonzero singular values of $\mathbf{H}$, which is also the rank of $\mathbf{H}$, so with $\mathrm{SNR} = P/N_0$

$$C \approx \sum_{i=1}^{k} B \log_2\left(1 + \frac{P \sigma_i^2}{k N_0}\right) \approx B\left[k \log_2(\mathrm{SNR}) + \sum_{i=1}^{k} \log_2\left(\frac{\sigma_i^2}{k}\right)\right] \tag{38}$$

$k$ is thus the parameter that controls the number of spatial degrees of freedom and hence the number of independent streams that can be multiplexed [TV05]. So we want $k$ as large as possible, which is $n_{\min}$ at most in the case that $\mathbf{H}$ has full rank. That the channel capacity increases linearly in $n_{\min}$ at high SNR is one of the most attractive features of MIMO.

Jensen's inequality can give more information about the behavior of the capacity with respect to $\mathbf{H}$:

$$\sum_{i=1}^{k} B \log_2\left(1 + \frac{P \sigma_i^2}{k N_0}\right) \le k B \log_2\left(1 + \frac{P}{k N_0} \cdot \frac{1}{k}\sum_{i=1}^{k} \sigma_i^2\right) \tag{39}$$

This suggests that the quantity $\sum_{i=1}^{k} \sigma_i^2$ should be maximized. For a fixed value of this sum, the bound is met with equality precisely when all the singular values are equal, so the capacity is largest when the singular values are roughly equal. In other words $\frac{\sigma_{\max}}{\sigma_{\min}} \approx 1$. In matrix theory this quantity is the condition number, $\kappa(\mathbf{H})$, and a matrix with $\kappa(\mathbf{H}) \approx 1$ is said to be well-conditioned. Thus $\mathbf{H}$ should be well conditioned to ensure a large capacity.

4.2 Physical Modeling of MIMO Channels

The major goal of this section is to see how MIMO's ability to spatially multiplex depends on the actual propagation environment. Also, this section will examine what must be true of the propagation to ensure that the rank and condition number criteria are satisfied. All antenna arrays in this section are assumed to be linear and uniformly spaced.
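To make the waterfilling allocation of Section 4.1.1 concrete, here is a small sketch in plain Python. The function names `waterfill` and `capacity` are hypothetical, the bandwidth defaults to $B = 1$, and the water level $\mu$ plays the role of $\frac{B}{\lambda \log 2}$; the routine assumes at least one nonzero singular value and a positive power budget.

```python
import math

def waterfill(sigmas, total_power, noise=1.0):
    """Return powers P_i = max(mu - N0/sigma_i^2, 0) that sum to total_power."""
    floors = sorted(noise / (s * s) for s in sigmas if s > 0)
    # Try the k strongest channels, dropping the weakest until the
    # water level mu sits above every floor that is still in use.
    for k in range(len(floors), 0, -1):
        mu = (total_power + sum(floors[:k])) / k
        if mu > floors[k - 1]:
            break
    return [max(mu - noise / (s * s), 0.0) if s > 0 else 0.0 for s in sigmas]

def capacity(sigmas, powers, noise=1.0, bandwidth=1.0):
    """Sum capacity of the parallel channels for a given power allocation."""
    return sum(bandwidth * math.log2(1 + p * s * s / noise)
               for s, p in zip(sigmas, powers))

if __name__ == "__main__":
    sigmas = [2.0, 1.0, 0.1]
    powers = waterfill(sigmas, total_power=1.0)
    uniform = [1.0 / len(sigmas)] * len(sigmas)
    print(powers)                                    # weakest channel gets no power
    print(capacity(sigmas, powers), capacity(sigmas, uniform))
```

At low SNR the weakest eigenchannel is switched off entirely, while for a large power budget the allocation approaches the uniform one, which is the regime used in Section 4.1.2.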

4.2.1 LOS SIMO and MISO Channel

Suppose the antennas are uniformly and linearly spaced by $\Delta_r \lambda_c$, where $\Delta_r$ represents the spacing as a fraction of the carrier wavelength $\lambda_c$. This normalization eliminates many $\lambda_c$ factors from subsequent equations.

Figure 3: LOS MISO and SIMO [TV05]

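The impulse responses between the transmit antenna and each receive antenna are

$$h_i(\tau) = a\,\delta(\tau - d_i/c) \tag{40}$$

Here $a$ models the path loss of the propagating wave and the $d_i/c$ term models the time it takes for a propagating EM wave to reach the $i$th receive antenna [SMB01]. At baseband the channel gain is given by

$$h_i = a\,e^{-j2\pi d_i/\lambda_c} \tag{41}$$

So then the channel can be modeled with AWGN as

$$\mathbf{y} = \mathbf{h}x + \mathbf{n} \tag{42}$$

with $\mathbf{h} = [h_1, h_2, \ldots, h_{n_r}]^T$ and $\mathbf{n} \sim \mathcal{CN}(0, N_0 \mathbf{I}_{n_r})$. For large $d$

$$d_i \approx d + (i - 1)\Delta_r \lambda_c \cos(\phi) \tag{43}$$

Define $\Omega = \cos(\phi)$. Define the following quantity from [Fle00]:

$$\hat{\mathbf{a}}_r(\Omega) = \frac{1}{\sqrt{n_r}} \begin{bmatrix} 1 \\ e^{-j2\pi\Delta_r\Omega} \\ \vdots \\ e^{-j2\pi(n_r-1)\Delta_r\Omega} \end{bmatrix} \tag{44}$$

Then the following important identity holds:

$$\hat{\mathbf{a}}_r^*(\Omega)\hat{\mathbf{a}}_r(\Omega) = \frac{1}{n_r} \begin{bmatrix} 1 & e^{j2\pi\Delta_r\Omega} & \cdots & e^{j2\pi(n_r-1)\Delta_r\Omega} \end{bmatrix} \begin{bmatrix} 1 \\ e^{-j2\pi\Delta_r\Omega} \\ \vdots \\ e^{-j2\pi(n_r-1)\Delta_r\Omega} \end{bmatrix}$$

$$= \frac{1}{n_r}\left(1 \cdot 1 + e^{j2\pi\Delta_r\Omega}e^{-j2\pi\Delta_r\Omega} + \cdots + e^{j2\pi(n_r-1)\Delta_r\Omega}e^{-j2\pi(n_r-1)\Delta_r\Omega}\right) = 1 \tag{45}$$

Then the channel $\mathbf{h}$ can be written as

$$\mathbf{h} = a\,e^{-j2\pi d/\lambda_c}\sqrt{n_r}\,\hat{\mathbf{a}}_r(\Omega) \tag{46}$$

as demonstrated in [SMB01]. The channel capacity is

$$C = B \log_2\left(1 + \frac{P\|\mathbf{h}\|^2}{N_0}\right) = B \log_2\left(1 + \frac{P a^2 n_r}{N_0}\right) \tag{47}$$

as given in [TV05]. Thus there is a power gain and potentially increased capacity, but no degree of freedom gain, and so no spatial multiplexing is possible. The MISO case is similar and involves the use of

$$\hat{\mathbf{a}}_t(\Omega) = \frac{1}{\sqrt{n_t}} \begin{bmatrix} 1 \\ e^{-j2\pi\Delta_t\Omega} \\ \vdots \\ e^{-j2\pi(n_t-1)\Delta_t\Omega} \end{bmatrix} \tag{48}$$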
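As a quick numerical companion to the unit spatial signature and the LOS SIMO capacity formula, the sketch below builds the signature vector, checks that it has unit norm, and evaluates the capacity. The helper names (`array_response`, `inner`, `los_simo_capacity`) and the example parameter values are assumptions made for illustration only.

```python
import cmath
import math

def array_response(n, delta, omega):
    """Unit spatial signature: entries exp(-j*2*pi*k*delta*omega)/sqrt(n), k = 0..n-1."""
    return [cmath.exp(-2j * math.pi * k * delta * omega) / math.sqrt(n)
            for k in range(n)]

def inner(u, v):
    """Hermitian inner product u^* v of two equal-length vectors."""
    return sum(a.conjugate() * b for a, b in zip(u, v))

def los_simo_capacity(a, n_r, power, noise, bandwidth=1.0):
    """B*log2(1 + P*a^2*n_r/N0): a power gain of n_r, but a single stream."""
    return bandwidth * math.log2(1 + power * a * a * n_r / noise)

if __name__ == "__main__":
    e = array_response(n=4, delta=0.5, omega=0.3)
    print(abs(inner(e, e)))                      # unit norm up to rounding
    for n_r in (1, 2, 4, 8):
        print(n_r, los_simo_capacity(1.0, n_r, 1.0, 1.0))
```

The capacity grows only logarithmically with $n_r$, in line with the statement that a single line-of-sight path gives a power gain but no additional degrees of freedom.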

4.2.2 LOS MIMO

Similarly to the SIMO case the baseband equivalent channel is

$$h_{ij} = a\,e^{-j2\pi d_{ij}/\lambda_c} \tag{49}$$

If $d$ is large then

$$d_{ij} \approx d + (i - 1)\Delta_r \lambda_c \cos(\phi_r) - (j - 1)\Delta_t \lambda_c \cos(\phi_t) \tag{50}$$

as shown in [TV05]. Define $\Omega_r = \cos(\phi_r)$ and $\Omega_t = \cos(\phi_t)$. Then the channel matrix is given by

$$\mathbf{H} = a\sqrt{n_t n_r}\,e^{-j2\pi d/\lambda_c}\,\hat{\mathbf{a}}_r(\Omega_r)\hat{\mathbf{a}}_t^*(\Omega_t) \tag{51}$$

In this case $\mathbf{H}$ has rank 1 and the only nonzero singular value is $a\sqrt{n_t n_r}$. Then the capacity is

$$C = B \log_2\left(1 + \frac{P a^2 n_t n_r}{N_0}\right) \tag{52}$$

This is the same result as the SIMO/MISO case: no degree of freedom gain.

4.2.3 Geographically Separated MIMO

Still consider LOS propagation and the narrowband case.

Figure 4: Geographically Distributed Antenna Arrays [TV05]

Then the channel between the $k$th transmit antenna and all the receive antennas is

$$\mathbf{h}_k = a_k\sqrt{n_r}\,e^{-j2\pi d_k/\lambda_c}\,\hat{\mathbf{a}}_r(\Omega_{rk}) \tag{53}$$

with $d_k$ the distance between the $k$th transmit antenna and the first receive antenna [PNG03, Her04]. The function $\hat{\mathbf{a}}_r(\Omega)$ is periodic with period $1/\Delta_r$. Also, $\hat{\mathbf{a}}_r(\Omega)$ does not take on the same value twice in one period, so $\hat{\mathbf{a}}_r(\Omega_{r1})$ and $\hat{\mathbf{a}}_r(\Omega_{r2})$ are linearly independent as long as $\Omega_{r1} - \Omega_{r2}$ is not an integer multiple of $1/\Delta_r$. In the $2 \times n_r$ case, as long as the difference of the two directional cosines is not a multiple of $1/\Delta_r$, the two columns of $\mathbf{H}$ are linearly independent and thus $\mathbf{H}$ has full rank. Thus in this case spatial multiplexing is possible.

Now what remains to be considered is whether $\mathbf{H}$ is well-conditioned. To determine this consider the angle $\theta$ between the two columns of $\mathbf{H}$ associated with the two transmit antennas. With $\Omega_r = \Omega_{r2} - \Omega_{r1}$, this angle satisfies

$$|\cos(\theta)| = |\hat{\mathbf{a}}_r^*(\Omega_{r1})\hat{\mathbf{a}}_r(\Omega_{r2})| \tag{54}$$

$$= \left|\frac{\sin(\pi L_r \Omega_r)}{n_r \sin(\pi L_r \Omega_r / n_r)}\right| \tag{55}$$

with $L_r = n_r \Delta_r$. Then, for equal path gains $a_1 = a_2 = a$, the two singular values are

$$\lambda_1 = \sqrt{a^2 n_r (1 + |\cos\theta|)}, \qquad \lambda_2 = \sqrt{a^2 n_r (1 - |\cos\theta|)} \tag{56}$$

Thus

$$\kappa(\mathbf{H}) = \sqrt{\frac{1 + |\cos\theta|}{1 - |\cos\theta|}} \tag{57}$$

Thus the matrix is ill conditioned whenever $|\cos(\theta)| \approx 1$, which occurs when

$$\left|\Omega_r - \frac{m}{\Delta_r}\right| \ll \frac{1}{L_r} \tag{58}$$

for some integer $m$. So basically when the difference between the directional cosines of two angular paths is within $\frac{1}{L_r}$, the receiver cannot distinguish between the two paths. This is similar to the case in frequency selective channels in which the bandwidth of the system controls which multipath delays can be resolved.

4.2.4 Two-Ray MIMO

Consider the full MIMO case with antenna arrays at both the transmitter and receiver. Let $d^{(i)}$ be the distance between transmit antenna 1 and receive antenna 1 along path $i$. Define

$$a_i^b = a_i\sqrt{n_t n_r}\,e^{-j2\pi d^{(i)}/\lambda_c} \tag{59}$$

Then the channel matrix can be expressed as

$$\mathbf{H} = a_1^b\,\hat{\mathbf{a}}_r(\Omega_{r1})\hat{\mathbf{a}}_t^*(\Omega_{t1}) + a_2^b\,\hat{\mathbf{a}}_r(\Omega_{r2})\hat{\mathbf{a}}_t^*(\Omega_{t2}) \tag{60}$$
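The conditioning argument for geographically separated transmit antennas (the angle $\theta$ between the two receive signatures and the resulting condition number) can be checked numerically. The sketch below compares the direct inner product with the closed-form sine ratio; all function names are illustrative, and equal path gains are assumed so that the condition number depends on $|\cos\theta|$ alone.

```python
import cmath
import math

def array_response(n, delta, omega):
    """Unit spatial signature with entries exp(-j*2*pi*k*delta*omega)/sqrt(n)."""
    return [cmath.exp(-2j * math.pi * k * delta * omega) / math.sqrt(n)
            for k in range(n)]

def cos_theta_direct(n_r, delta_r, omega1, omega2):
    """|a_r(omega1)^* a_r(omega2)| computed from the vectors themselves."""
    e1 = array_response(n_r, delta_r, omega1)
    e2 = array_response(n_r, delta_r, omega2)
    return abs(sum(a.conjugate() * b for a, b in zip(e1, e2)))

def cos_theta_closed(n_r, delta_r, omega):
    """|sin(pi*L*omega) / (n_r*sin(pi*L*omega/n_r))| with L = n_r*delta_r."""
    length = n_r * delta_r
    den = n_r * math.sin(math.pi * length * omega / n_r)
    if abs(den) < 1e-15:            # omega is a multiple of 1/delta_r
        return 1.0
    return abs(math.sin(math.pi * length * omega) / den)

def condition_number(c):
    """sqrt((1 + c)/(1 - c)) for c = |cos(theta)|; infinite when c = 1."""
    return math.inf if c >= 1.0 else math.sqrt((1 + c) / (1 - c))

if __name__ == "__main__":
    n_r, delta_r = 4, 0.5           # array length L_r = 2 wavelengths
    for sep in (0.05, 0.25, 0.5):   # separation of the directional cosines
        c = cos_theta_closed(n_r, delta_r, sep)
        print(sep, round(c, 4), round(condition_number(c), 3))
```

Separations much smaller than $1/L_r$ give $|\cos\theta|$ close to 1 and a large condition number, while a separation of exactly $1/L_r$ makes the two signatures orthogonal.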

as in [PNG03, Her04].

Figure 5: Two-Ray MIMO [TV05]

This expression for the channel can be put in matrix form as

$$\mathbf{H} = \begin{bmatrix} a_1^b\,\hat{\mathbf{a}}_r(\Omega_{r1}) & a_2^b\,\hat{\mathbf{a}}_r(\Omega_{r2}) \end{bmatrix} \begin{bmatrix} \hat{\mathbf{a}}_t^*(\Omega_{t1}) \\ \hat{\mathbf{a}}_t^*(\Omega_{t2}) \end{bmatrix} \tag{61}$$

To ensure $\mathbf{H}$ has rank 2 the following two conditions must hold:

$$\Omega_{t1} \ne \Omega_{t2} \mod \frac{1}{\Delta_t} \tag{62}$$

$$\Omega_{r1} \ne \Omega_{r2} \mod \frac{1}{\Delta_r} \tag{63}$$

Then $\mathbf{H}$ has rank 2, so spatial multiplexing is possible. To ensure that $\mathbf{H}$ is well conditioned it is necessary that $\Omega_{r2} - \Omega_{r1} \ge \frac{1}{L_r}$ and $\Omega_{t2} - \Omega_{t1} \ge \frac{1}{L_t}$; that is to say, there must be sufficient angular separation at the transmitter and receiver to ensure that the paths can be resolved.

4.3 Statistical Modeling of MIMO Channels

In the case of a frequency selective channel the channel can be modeled as an FIR filter with taps $\{h[n]\}$. In this case not all individual multipath components can be resolved, but only multipath components that differ in delay by a sufficient amount related to the system bandwidth. In modeling a MIMO channel the interest is not in time resolution of multipath but in angular resolution at the transmitter and receiver [Par00]. Suppose the transmit and receive antenna array lengths are $L_t$ and $L_r$. Paths that have $\Omega$s that differ by less than $\frac{1}{L_t}$ at the transmitter or $\frac{1}{L_r}$ at the receiver cannot be resolved. The term $h_{ij}$ is the aggregation of all paths of angular spacing $\frac{1}{L_t}$ about $\frac{j}{L_t}$ and angular spacing $\frac{1}{L_r}$ about $\frac{i}{L_r}$. If there are an arbitrary number of paths then the channel is given by

$$\mathbf{H} = \sum_i a_i^b\,\hat{\mathbf{a}}_r(\Omega_{ri})\hat{\mathbf{a}}_t^*(\Omega_{ti}) \tag{64}$$

The received and transmitted signals can always be expressed in terms of the following pair of bases:

$$S_r = \left\{\hat{\mathbf{a}}_r(0),\ \hat{\mathbf{a}}_r\left(\tfrac{1}{L_r}\right),\ \ldots,\ \hat{\mathbf{a}}_r\left(\tfrac{n_r - 1}{L_r}\right)\right\} \tag{65}$$

$$S_t = \left\{\hat{\mathbf{a}}_t(0),\ \hat{\mathbf{a}}_t\left(\tfrac{1}{L_t}\right),\ \ldots,\ \hat{\mathbf{a}}_t\left(\tfrac{n_t - 1}{L_t}\right)\right\} \tag{66}$$

which represent the angular bins.

Figure 6: Angular Domain MIMO [TV05]

Each basis can be used to represent transmitted and received signals in the angular domain in terms of the directional cosine $\Omega$. Let $\mathbf{U}_t$ be the $n_t \times n_t$ matrix with columns from $S_t$, and let $\mathbf{U}_r$ be the $n_r \times n_r$ matrix with columns from $S_r$. If $\mathbf{x}$ is a vector transmitted by the antennas, then $\mathbf{x}$ and its angular domain representation $\mathbf{x}^a$ are related by

$$\mathbf{x} = \mathbf{U}_t \mathbf{x}^a, \qquad \mathbf{x}^a = \mathbf{U}_t^* \mathbf{x} \tag{67}$$

By examining the matrix $\mathbf{U}_t$ it can be seen that $\mathbf{x}^a$ is the IDFT of $\mathbf{x}$. Then define $\mathbf{y}^a = \mathbf{U}_r^* \mathbf{y}$. In this coordinate system

$$\mathbf{y}^a = \mathbf{U}_r^* \mathbf{H} \mathbf{U}_t \mathbf{x}^a + \mathbf{v}^a = \mathbf{H}^a \mathbf{x}^a + \mathbf{v}^a \tag{68}$$

Each element $h_{ij}^a$ can be reasonably modeled as an independent circularly symmetric complex Gaussian r.v., like the Rayleigh channel. The validity of this assumption rests on two key factors:

• Amount of scattering and reflection in the multipath environment - this model needs several multipath components in each angular bin.

• The lengths $L_t$ and $L_r$ - short antenna arrays lump many multipath components into the same angular bin. A longer antenna array results in better angular resolution of paths and more non-zero entries in $\mathbf{H}^a$.

Since $\mathbf{U}_t$ and $\mathbf{U}_r$ are unitary and

$$\mathbf{H} = \mathbf{U}_r \mathbf{H}^a \mathbf{U}_t^* \tag{69}$$

$\mathbf{H}$ has the same iid Gaussian distribution [CT91]. Thus in the narrowband case the MIMO channel is basically an extension of the scalar Rayleigh channel where each coefficient of the channel matrix is a complex Gaussian random variable. If there is a strong line-of-sight component, then the fading is not Rayleigh but Ricean. In addition, results from random matrix theory show that $\mathbf{H}$ with this distribution has full rank with probability 1. Thus the channel in this model can support spatial multiplexing.

Antenna Spacing

The assumption that the coefficients of $\mathbf{H}$ are independent or at least uncorrelated depends heavily on the antenna spacing. In practice the channel coefficients are never completely uncorrelated, but as a simplifying assumption to make analysis tractable we assume they are uncorrelated and independent. The exact amount of correlation depends on the angular spread of the antennas. For antennas with small angular spread at separations on the order of $\frac{\lambda}{4}$ or smaller the coefficients are highly correlated. Since the coefficients are highly correlated the receiver does not see as many independent copies of the transmitted signal, so the achievable diversity gain is reduced. As the antenna spacing decreases towards $\frac{\lambda}{4}$ the channel coefficients become strongly correlated. At a spacing of about $\frac{\lambda}{2}$ there is still a diversity gain, but it is not quite as large as if the antennas were spaced further apart. As a rule of thumb, antenna spacing of at least $\lambda$ is desirable and results in uncorrelated coefficients [FG98].

4.3.1 Frequency Selective MIMO Channel

The extension of the preceding flat MIMO channel model to the frequency selective MIMO channel model is fairly straightforward. In this model the channel between any pair of antennas is modeled as a scalar frequency selective channel in which the output is a convolution of the input and the channel taps. The channel in this case can be modeled as

$$\mathbf{y}[n] = \sum_{l=1}^{N} \mathbf{H}_l\,\mathbf{x}[n - l] + \mathbf{v}[n] \tag{70}$$

as in [TV05]. The justification for this model is a straightforward extension of the angular model outlined in the previous sections.

5 Diversity-Multiplexing Tradeoff

A MIMO system can transmit one symbol on all the transmit antennas and use the right processing to obtain the full diversity gain $n_t n_r$. On the other hand a MIMO system can transmit $n_{\min}$ independent streams to provide the maximum possible rate with the minimum error protection. The diversity-multiplexing tradeoff involves investigating what happens between these two extremes and in particular what constitutes the optimal tradeoff. In particular, when transmitting at a given rate, what is the maximum possible diversity gain? This kind of analysis leads to a curve relating the transmit rate and the optimal diversity gain. Of great interest is whether a given space-time code or modulation can achieve this frontier and thus be optimal. This tradeoff curve is difficult to compute, but some methods have been proposed to simplify the study of this tradeoff.

Tse and Zheng proposed in [ZT03] studying this tradeoff by making assumptions on the possible rates of transmission and letting the SNR approach infinity. At high SNR the MIMO capacity is

$$C \approx n_{\min} \log_2(\mathrm{SNR}) \tag{71}$$

for a channel with full rank. Tse and Zheng assumed that only rates $R = r \log(\mathrm{SNR})$ are possible with $r = 0, 1, \ldots, n_{\min}$. The optimal diversity gain, $d^*(r)$, is the exponent in the outage probability, so

$$p_{\mathrm{out}} \approx \mathrm{SNR}^{-d^*(r)} \tag{72}$$

Thus it makes sense to define

$$d^*(r) = -\lim_{\mathrm{SNR}\to\infty} \frac{\log p_{\mathrm{out}}(r \log \mathrm{SNR})}{\log \mathrm{SNR}} \tag{73}$$

Alternatively $d^*(r)$ can be defined in terms of the probability of error

$$d^*(r) = -\lim_{\mathrm{SNR}\to\infty} \frac{\log P_e(r \log \mathrm{SNR})}{\log \mathrm{SNR}} \tag{74}$$

Before tackling the full MIMO channel it is useful to consider the diversity-multiplexing tradeoff in scalar and SIMO/MISO channels.

5.1 Scalar Rayleigh Channel

The scalar channel is in outage if the capacity it supports falls below the rate of transmission. So $p_{\mathrm{out}}$ is given by

$$p_{\mathrm{out}} = P\left[\log\left(1 + |h|^2\,\mathrm{SNR}\right) < r \log \mathrm{SNR}\right] = P\left[|h|^2 < \frac{\mathrm{SNR}^r - 1}{\mathrm{SNR}}\right]$$

$|h|^2$ is chi-squared distributed. For sufficiently small $\epsilon$, $P[|h|^2 < \epsilon] \approx \epsilon$. Thus the outage probability is approximately

$$p_{\mathrm{out}} \approx \frac{1}{\mathrm{SNR}^{1-r}} \tag{75}$$

Thus $d^*(r) = 1 - r$ is the optimal tradeoff.
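The scalar tradeoff $d^*(r) = 1 - r$ can be checked numerically without simulation. The sketch below assumes $h \sim \mathcal{CN}(0, 1)$, so that $|h|^2$ is exponentially distributed with unit mean and the outage probability has a closed form; the function names are illustrative, and the finite SNR value stands in for the limit in the definition of $d^*(r)$.

```python
import math

def outage_scalar_rayleigh(snr, r):
    """Exact P[|h|^2 < (snr^r - 1)/snr] when |h|^2 ~ Exp(1); requires r > 0."""
    eps = (snr ** r - 1.0) / snr
    return -math.expm1(-eps)        # 1 - exp(-eps), accurate for tiny eps

def diversity_estimate(snr, r):
    """Finite-SNR estimate of the exponent: -log(p_out)/log(snr)."""
    return -math.log(outage_scalar_rayleigh(snr, r)) / math.log(snr)

if __name__ == "__main__":
    snr = 1e12                      # large SNR as a stand-in for the limit
    for r in (0.25, 0.5, 0.75):
        print(r, round(diversity_estimate(snr, r), 4), "vs", 1 - r)
```

The estimates approach $1 - r$ as the SNR grows, with a small residual from the $-1$ in the numerator of the outage threshold.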

5.1.1 QAM over the Scalar Rayleigh Channel

It can be demonstrated that for QAM $P_e \approx \frac{2^R}{\mathrm{SNR}}$. Then, taking logarithms to base 2,

$$d(r) = -\lim_{\mathrm{SNR}\to\infty} \frac{\log P_e}{\log \mathrm{SNR}} = -\lim_{\mathrm{SNR}\to\infty} \frac{\log\left(2^{r \log \mathrm{SNR}}/\mathrm{SNR}\right)}{\log \mathrm{SNR}} = -\lim_{\mathrm{SNR}\to\infty} \frac{r \log \mathrm{SNR} - \log \mathrm{SNR}}{\log \mathrm{SNR}} = 1 - r \tag{76}$$

Thus QAM achieves the optimal diversity-multiplexing tradeoff of the scalar Rayleigh channel.

5.2 MISO Rayleigh Channel

In this case the system can be modeled as

$$y[n] = \mathbf{h}\mathbf{x}[n] + w[n] \tag{77}$$

Taking the rate $R = r \log \mathrm{SNR}$ as usual the outage probability is

$$p_{\mathrm{out}} = P\left[\log\left(1 + \|\mathbf{h}\|^2 \frac{\mathrm{SNR}}{n_t}\right) < r \log \mathrm{SNR}\right] \tag{78}$$

$\|\mathbf{h}\|^2$ is $\chi^2_{2n_t}$ distributed so the approximation $P[\|\mathbf{h}\|^2 < \epsilon] \approx \epsilon^{n_t}$ can be used. Then the outage probability is roughly

$$p_{\mathrm{out}} \approx \mathrm{SNR}^{-n_t(1-r)} \tag{79}$$

So it is apparent that the optimal tradeoff is $d^*(r) = n_t(1 - r)$.

The Alamouti code effectively decomposes the MISO channel into parallel Rayleigh channels. It can be easily demonstrated that the optimal tradeoff curve for this parallel Rayleigh channel is $d^*(r) = 2(1 - r)$. So if QAM is used on each of the scalar channels along with the Alamouti code, then the resulting system is tradeoff optimal for the MISO channel.

5.3 MIMO Rayleigh Channel

The outage probability is given by

$$p_{\mathrm{out}} = \min_{\mathbf{K}_x : \mathrm{Tr}[\mathbf{K}_x] \le \mathrm{SNR}} P\left[\log\det\left(\mathbf{I}_{n_r} + \mathbf{H}\mathbf{K}_x\mathbf{H}^*\right) < r \log \mathrm{SNR}\right] \tag{80}$$

The matrix $\mathbf{K}_x$ is the covariance matrix of the input and basically represents a power allocation. This scheme makes a specific assumption about the rate $R$ at a given SNR. The power allocation at the transmitter directly affects the SNR at the receiver, so the input covariance matrix must be chosen not to exceed the limit. The worst covariance matrix $\mathbf{K}_x$ is approximately $\frac{\mathrm{SNR}}{n_t}\mathbf{I}_{n_t}$, so

$$p_{\mathrm{out}} = P\left[\log\det\left(\mathbf{I}_{n_r} + \frac{\mathrm{SNR}}{n_t}\mathbf{H}\mathbf{H}^*\right) < r \log \mathrm{SNR}\right] \tag{81}$$

This outage probability can be written in terms of the singular values of $\mathbf{H}$ as

$$p_{\mathrm{out}} = P\left[\sum_{i=1}^{n_{\min}} \log\left(1 + \frac{\mathrm{SNR}}{n_t}\sigma_i^2\right) < r \log \mathrm{SNR}\right] \tag{82}$$

There are no neat approximations to evaluate this outage probability, but there is a neat geometric argument to evaluate it [TV05, ZT03]. First consider $r$ close to 0. Outage occurs when $\mathbf{H}$ is close to $\mathbf{0}$. Closeness can be evaluated in terms of the Frobenius norm

$$\|\mathbf{H} - \mathbf{0}\|_F^2 = \|\mathbf{H}\|_F^2 = \sum_{i=1}^{n_{\min}} \sigma_i^2 = \sum_{i,j} |h_{ij}|^2$$

Thus the magnitude of each channel coefficient $|h_{ij}|$ must be close to 0 for the channel to be in outage.

Now if $r$ is an integer greater than 0 the situation becomes considerably more complicated, since there are more ways to choose bad $\lambda_i$ to put the channel in outage. The situation seems hopeless, but it has been shown by Tse and Zheng that although there are many ways for the channel to be in outage, the most common way is for $r$ eigenchannels to be good and the remainder to be bad. In this case $\mathbf{H}$ has rank $r$ and $\mathbf{H}$ is in the space $V_r$ of rank $r$ matrices in the space $\mathbb{C}^{n_t \times n_r}$. So the question of whether $\mathbf{H}$ puts the channel in outage is the question of whether $\mathbf{H}$ is close to $V_r$ in the appropriate sense. This question is tractable but also a little tricky, since $V_r$ is not a linear space. To see that $V_r$ is not linear consider that if $V_r$ were a linear space, then $\mathbf{0} \in V_r$. But clearly $\mathbf{0}$ has rank 0, so $\mathbf{0} \notin V_r$. Thus $V_r$ is not a linear subspace of $\mathbb{C}^{n_t \times n_r}$.

The following paragraph is very technical but the fundamental result is simple: $V_r$ can be considered to be a linear space in a sufficiently small neighborhood. Although $V_r$ is not a linear subspace, it is a manifold embedded in $\mathbb{C}^{n_t \times n_r}$. A manifold is a space with the property that small neighborhoods of a point look like linear subspaces of $\mathbb{R}^k$ or $\mathbb{C}^k$. For example, the surface of Earth is a manifold since a small neighborhood looks like a portion of $\mathbb{R}^2$ even though the overall space is clearly not linear. The question of interest is what happens when $\mathbf{H}$ is close to $V_r$, so it is sufficient to consider a small neighborhood $N$ of a point of $V_r$ containing $\mathbf{H}$. $N$ looks like a linear subspace of $\mathbb{C}^{n_t \times n_r}$. For the remainder of this argument restrict our consideration to $N$. Since $V_r$ can be considered locally linear, the notion of orthogonality can be used. Then $\mathbf{H}$ can be decomposed into a portion in $V_r$ and a portion in the space $V_r^\perp$, which is orthogonal to $V_r$. If the portion of $\mathbf{H}$ in $V_r^\perp$ vanishes, then $\mathbf{H}$ is basically in $V_r$, $\mathbf{H}$ has rank $r$, and so the channel is in outage as discussed before. The probability that the portion of $\mathbf{H}$ in $V_r^\perp$ vanishes (the outage probability) is $\mathrm{SNR}^{-d}$, where $d$ is the dimension of $V_r^\perp$. If $\mathbf{H}$ is of rank $r$, then $r$ rows of length $n_t$ can be chosen and the remaining $n_r - r$ rows can be written as linear combinations of the first $r$ rows. From this it follows that $\dim V_r = n_t r + (n_r - r) r$. Since $V_r$ and $V_r^\perp$ decompose the $n_t \times n_r$ space,

$$n_t n_r = \dim \mathbb{C}^{n_t \times n_r} = \dim V_r + \dim V_r^\perp$$

Thus

$$\dim V_r^\perp = n_t n_r - \left(n_t r + (n_r - r) r\right) = (n_t - r)(n_r - r) \tag{83}$$

Thus $p_{\mathrm{out}} \approx \mathrm{SNR}^{-(n_t - r)(n_r - r)}$ and so the optimal tradeoff is given by $d^*(r) = (n_t - r)(n_r - r)$ for $r = 0, 1, \ldots, n_{\min}$.

Figure 7: Diversity-Multiplexing Tradeoff For MIMO [TV05]

6 Space-Time Coding over Narrowband Channels

There are two major types of space-time codes: block codes and trellis codes. Their names imply their structures, which are derived from the similar structures in the single antenna case. The basic idea of a space-time block code is to map $Q$ symbols into a block of transmitted symbols of size $n_t \times T$ for some integer $T$. A trellis code is a convolutional code in which the current output depends on a block of input bits and the previous input bits represented by the state of the trellis code. One general assumption on almost all space-time codes is the quasi-static assumption, which assumes that the channel remains constant over the duration of a codeword. The rate at which the channel changes is related to the coherence time, which is in turn related to the Doppler spread. The system must be designed to ensure that the duration of a codeword is less than the coherence time. The channel can change between codewords, but not in the middle of codewords.

Figure 8: Space-Time Encoder Structure

6.1 Error Motivated Design

It is important and interesting to find conditions that will guarantee a good error performance for a space-time code. One approach to finding these conditions for the slow fading MIMO channel is to consider what factors affect ML decoding of the codewords. The optimal way to detect a codeword is ML detection, given by

$$\hat{\mathbf{C}} = \arg\min_{\mathbf{C} \in \mathcal{C}} \|\mathbf{Y} - \mathbf{H}\mathbf{C}\|^2 \tag{84}$$

The operation of this detector is limited mainly by the closest pair of codewords. If two codewords are close together, then noise can lead to incorrect estimation of a codeword as another codeword. Then the error probability of interest is the pairwise error probability (PEP) that a codeword $\mathbf{C}$ is incorrectly decoded as $\mathbf{E}$. Conditioning on the channel matrix $\mathbf{H}$ the PEP is [Pro01]

$$P[\mathbf{C} \to \mathbf{E} \mid \mathbf{H}] = Q\left(\sqrt{\frac{\mathrm{SNR}}{2} \sum_{k=0}^{T} \|\mathbf{H}(\mathbf{c}_k - \mathbf{e}_k)\|_F^2}\right) \tag{85}$$

Averaging over all channel realizations gives the average PEP, $P[\mathbf{C} \to \mathbf{E}]$. In a way similar to the diversity-multiplexing tradeoff, the diversity gain $d_g$ can be defined in terms of the PEP as

$$d_g = -\lim_{\mathrm{SNR}\to\infty} \frac{\log P[\mathbf{C} \to \mathbf{E}]}{\log \mathrm{SNR}} \tag{86}$$

Generally at high SNR the PEP is of the form $(c \times \mathrm{SNR})^{-d_g}$. The quantity $c$ improves performance and is called the coding gain. A good space-time code should then achieve a high diversity gain and a high coding gain.

The relevant question now is how to achieve diversity and coding gains. The covariance of two codewords $\mathbf{C}$ and $\mathbf{E}$ is the matrix $\tilde{\mathbf{E}} = (\mathbf{E} - \mathbf{C})(\mathbf{E} - \mathbf{C})^*$. Then the PEP is given by [SA00, Sim01]

$$P[\mathbf{C} \to \mathbf{E}] = \frac{1}{\pi}\int_0^{\pi/2} \det\left(\mathbf{I}_{n_t} + \frac{\mathrm{SNR}}{4\sin^2\beta}\tilde{\mathbf{E}}\right)^{-n_r} d\beta$$

$$= \frac{1}{\pi}\int_0^{\pi/2} \prod_{i=1}^{\mathrm{rank}(\tilde{\mathbf{E}})} \left(1 + \frac{\mathrm{SNR}}{4\sin^2\beta}\lambda_i\right)^{-n_r} d\beta$$

$$\le \prod_{i=1}^{\mathrm{rank}(\tilde{\mathbf{E}})} \left(1 + \frac{\mathrm{SNR}}{4}\lambda_i\right)^{-n_r}$$

with the second expression due to expressing the determinant in terms of the eigenvalues $\lambda_i(\tilde{\mathbf{E}})$ and the last expression following from bounding $\sin^2\beta$ by 1. At high SNR this expression can be further bounded to yield

$$P[\mathbf{C} \to \mathbf{E}] \le \left(\frac{\mathrm{SNR}}{4}\right)^{-n_r\,\mathrm{rank}(\tilde{\mathbf{E}})} \left(\prod_{i=1}^{\mathrm{rank}(\tilde{\mathbf{E}})} \lambda_i\right)^{-n_r} \tag{87}$$

Thus the diversity gain is $n_r\,\mathrm{rank}(\tilde{\mathbf{E}})$ and the coding gain is $\prod_{i=1}^{\mathrm{rank}(\tilde{\mathbf{E}})} \lambda_i$. Given these two gains, the two criteria for a good space-time code at high SNR are as follows [TSC98]:

• Rank Criterion - Maximize the minimum rank of the codeword difference matrix to achieve a good diversity gain always:

$$\max \min_{\substack{\mathbf{C}, \mathbf{E} \in \mathcal{C} \\ \mathbf{C} \ne \mathbf{E}}} \mathrm{rank}(\tilde{\mathbf{E}}) \tag{88}$$

• Determinant Criterion - Maximize the product of the nonzero eigenvalues to achieve coding gain:

$$d_\lambda = \min_{\substack{\mathbf{C}, \mathbf{E} \in \mathcal{C} \\ \mathbf{C} \ne \mathbf{E}}} \prod_{i=1}^{\mathrm{rank}(\tilde{\mathbf{E}})} \lambda_i \tag{89}$$

In the case where the codeword difference matrix always has full rank this becomes: maximize

$$d_\lambda = \min_{\substack{\mathbf{C}, \mathbf{E} \in \mathcal{C} \\ \mathbf{C} \ne \mathbf{E}}} \det \tilde{\mathbf{E}} \tag{90}$$

These criteria guarantee good codes at high SNR.

6.2 Space-Time Block Codes

A space-time block code (STBC) maps a block of $Q$ input symbols into a block of symbols of size $n_t \times T$ to be transmitted on the antennas. A quantity of interest is the effective symbol rate of the code:

$$r_s = \frac{Q}{T} \tag{91}$$
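Before moving on, the rank and determinant criteria of Section 6.1 can be illustrated on $2 \times 2$ codewords. The following sketch forms $\tilde{\mathbf{E}} = (\mathbf{E} - \mathbf{C})(\mathbf{E} - \mathbf{C})^*$ for a pair of Alamouti codewords and for a naive code that sends the same symbol on both antennas. The naive code and all function names are introduced here purely for illustration and do not come from the tutorial.

```python
def difference_matrix(c, e):
    """E~ = (E - C)(E - C)^* for 2x2 codewords given as nested lists."""
    d = [[e[i][j] - c[i][j] for j in range(2)] for i in range(2)]
    return [[sum(d[i][k] * d[j][k].conjugate() for k in range(2))
             for j in range(2)] for i in range(2)]

def det2(m):
    """Determinant of a 2x2 matrix."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

def alamouti(c1, c2):
    """Rows are antennas, columns are symbol times."""
    return [[c1, -c2.conjugate()], [c2, c1.conjugate()]]

def same_on_both_antennas(c1, c2):
    """Illustrative naive code: both antennas send c1, then c2."""
    return [[c1, c2], [c1, c2]]

if __name__ == "__main__":
    a, b = (1 + 1j, 1 - 1j), (-1 + 1j, 1 - 1j)   # symbol pairs differing in c1
    for code in (alamouti, same_on_both_antennas):
        e_tilde = difference_matrix(code(*a), code(*b))
        print(code.__name__, abs(det2(e_tilde)))
```

For the Alamouti pair the determinant equals $(|\Delta_1|^2 + |\Delta_2|^2)^2$, which is nonzero for any two distinct codewords (full rank), whereas the naive code always yields a rank-one difference matrix with zero determinant and hence no transmit diversity.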

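For $r_s = 1$ the system effectively transmits one symbol per symbol period. For $r_s < 1$ the system on average transmits less than one symbol per symbol period. Codes with $r_s < 1$ effectively reduce the rate of transmission.

6.2.1 Linear STBCs

There are many different classes of space-time block codes, but one of the most common is the linear block code. The codeword of a linear block code can be expressed as a linear function of complex $n_t \times T$ basis matrices $\boldsymbol{\phi}_q$ and input symbols $c_1, c_2, \ldots, c_Q$ as follows [HH01]:

$$\mathbf{C} = \sum_{q=1}^{Q} \boldsymbol{\phi}_q \Re\{c_q\} + \boldsymbol{\phi}_{q+Q} \Im\{c_q\} \tag{92}$$

It may seem a little odd to break up the real and imaginary components of the symbols, but the advantage of this approach is that conjugation of symbols can be used in linear STBCs. The following example with the Alamouti code shows that this is possible.

Example: Alamouti code

The two complex symbols $c_1$ and $c_2$ are mapped into the following matrix, which represents the Alamouti code:

$$\begin{bmatrix} c_1 & -c_2^* \\ c_2 & c_1^* \end{bmatrix} \tag{93}$$

Then the code can be represented with basis matrices as:

$$\boldsymbol{\phi}_1 = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}, \quad \boldsymbol{\phi}_2 = \begin{bmatrix} 0 & -1 \\ 1 & 0 \end{bmatrix}, \quad \boldsymbol{\phi}_3 = j\begin{bmatrix} 1 & 0 \\ 0 & -1 \end{bmatrix}, \quad \boldsymbol{\phi}_4 = j\begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix} \tag{94}$$

The factor $j$ on $\boldsymbol{\phi}_3$ and $\boldsymbol{\phi}_4$ is what turns the imaginary parts back into complex entries, so that for instance the top-left entry is $\Re\{c_1\} + j\Im\{c_1\} = c_1$.

Code Design Criteria for Linear STBCs

As we saw in the previous section, minimizing the worst PEP is a good strategy to develop a good space-time code. In the case of linear STBCs, if the basis matrices are unitary, meaning $\boldsymbol{\phi}^*\boldsymbol{\phi} = \mathbf{I}_T$ if $T \le n_t$ (tall matrix) or $\boldsymbol{\phi}\boldsymbol{\phi}^* = \mathbf{I}_{n_t}$ if $T \ge n_t$ (wide matrix), then the PEP condition is

$$\boldsymbol{\phi}_q\boldsymbol{\phi}_p^* + \boldsymbol{\phi}_p\boldsymbol{\phi}_q^* = \mathbf{0}, \quad q \ne p \quad \text{(Wide)} \tag{95}$$

$$\boldsymbol{\phi}_q^*\boldsymbol{\phi}_p + \boldsymbol{\phi}_p^*\boldsymbol{\phi}_q = \mathbf{0}, \quad q \ne p \quad \text{(Tall)} \tag{96}$$

Orthogonal STBCs

There is a special class of linear STBCs with an orthogonality property that leads to easy decoding [TJC99]. An orthogonal STBC has codewords $\mathbf{C}$ that satisfy the following key property:

$$\mathbf{C}\mathbf{C}^* = \frac{T}{Q n_t}\left[\sum_{q=1}^{Q} |c_q|^2\right]\mathbf{I}_{n_t} \tag{97}$$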
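The linear STBC expansion can be exercised directly on the Alamouti example. The sketch below assumes that the two basis matrices multiplying the imaginary parts carry a factor $j$, which is needed for the expansion to reproduce $c_1$, $c_2$ and their conjugates; it then checks the orthogonality of the resulting codeword. Unlike the normalized property above, the code uses unnormalized symbols, so $\mathbf{C}\mathbf{C}^*$ comes out as $(|c_1|^2 + |c_2|^2)\mathbf{I}$. The names `PHI`, `linear_stbc` and `gram` are illustrative.

```python
# Basis matrices for the 2x2 Alamouti code (assumed convention: the first Q
# matrices multiply real parts, the last Q multiply imaginary parts).
PHI = [
    [[1, 0], [0, 1]],       # phi_1, multiplies Re{c1}
    [[0, -1], [1, 0]],      # phi_2, multiplies Re{c2}
    [[1j, 0], [0, -1j]],    # phi_3, multiplies Im{c1}
    [[0, 1j], [1j, 0]],     # phi_4, multiplies Im{c2}
]

def linear_stbc(symbols, basis):
    """C = sum_q basis[q]*Re{c_q} + basis[q+Q]*Im{c_q} for 2x2 basis matrices."""
    coeffs = [c.real for c in symbols] + [c.imag for c in symbols]
    return [[sum(coeffs[k] * basis[k][i][j] for k in range(len(coeffs)))
             for j in range(2)] for i in range(2)]

def gram(m):
    """Return M M^* for a matrix given as nested lists."""
    n, t = len(m), len(m[0])
    return [[sum(m[i][k] * m[j][k].conjugate() for k in range(t))
             for j in range(n)] for i in range(n)]

if __name__ == "__main__":
    c1, c2 = 1 + 2j, 3 - 1j
    codeword = linear_stbc([c1, c2], PHI)
    print(codeword)          # matches [[c1, -conj(c2)], [c2, conj(c1)]]
    print(gram(codeword))    # (|c1|^2 + |c2|^2) times the identity
```

The diagonal Gram matrix is what allows each symbol to be detected separately.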

This property is very nice because it implies that easy decoding is possible due to the orthogonality. The key example of an O-STBC is the Alamouti code, which works on complex constellations. It is clear in the case of Alamouti that it takes two symbol times to transmit two symbols, so the transmit rate is $r_s = 1$. For a purely real constellation it is always possible to find a real O-STBC for any $n_t$ that achieves $r_s = 1$. However, this is not very useful as many constellations such as QAM are complex. It turns out that the Alamouti code is the only O-STBC that works on complex symbols and achieves a transmit rate $r_s$ of one symbol per symbol period. For more than two transmit antennas, $r_s < 1$ always. If $r_s \le \frac{1}{2}$ then it is always possible to find an O-STBC that achieves good diversity. The diversity-multiplexing tradeoff for O-STBCs is given by [OC06] as

$$d^*(g_s) = n_t n_r\left(1 - \frac{g_s}{r_s}\right), \quad g_s \in [0, r_s] \tag{98}$$

for QAM constellations.

Quasi Orthogonal STBCs

O-STBCs achieve full diversity but at the expense of any spatial multiplexing. Quasi Orthogonal STBCs (QO-STBCs) attempt to achieve some of the benefits of O-STBCs while also providing for some spatial multiplexing by using smaller O-STBCs as building blocks. For example a QO-STBC could be

$$\mathcal{Q}(c_1, \ldots, c_{2Q}) = \begin{bmatrix} \mathcal{O}(c_1, \ldots, c_Q) & \mathcal{O}(c_{Q+1}, \ldots, c_{2Q}) \\ \mathcal{O}(c_{Q+1}, \ldots, c_{2Q}) & \mathcal{O}(c_1, \ldots, c_Q) \end{bmatrix} \tag{99}$$

where each $\mathcal{O}$ is a codeword matrix for a smaller O-STBC on only $Q$ input symbols [TBH00]. If the $\mathcal{O}$ represent Alamouti codewords, then the codeword matrix is

$$\mathcal{Q}(c_1, c_2, c_3, c_4) = \frac{1}{2}\begin{bmatrix} c_1 & -c_2^* & c_3 & -c_4^* \\ c_2 & c_1^* & c_4 & c_3^* \\ c_3 & -c_4^* & c_1 & -c_2^* \\ c_4 & c_3^* & c_2 & c_1^* \end{bmatrix} \tag{100}$$

Then during decoding the codeword matrix is multiplied by its conjugate transpose, which yields

$$\mathcal{Q}\mathcal{Q}^* = \frac{1}{4}\begin{bmatrix} a & 0 & b & 0 \\ 0 & a & 0 & b \\ b & 0 & a & 0 \\ 0 & b & 0 & a \end{bmatrix} \tag{101}$$

where

$$a = \sum_{q=1}^{4} |c_q|^2, \qquad b = c_1 c_3^* + c_3 c_1^* + c_2 c_4^* + c_4 c_2^*$$
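The block structure of $\mathcal{Q}\mathcal{Q}^*$ for the four-antenna QO-STBC built from two Alamouti blocks can be verified numerically. The sketch below assumes the codeword layout with the two Alamouti blocks repeated on the anti-diagonal and the $\frac{1}{2}$ normalization; the function names and the example symbols are illustrative.

```python
def qo_stbc(c1, c2, c3, c4):
    """4x4 quasi-orthogonal codeword [[A, B], [B, A]] / 2 with Alamouti blocks
    A = A(c1, c2) and B = A(c3, c4)."""
    cj = lambda z: z.conjugate()
    rows = [
        [c1, -cj(c2), c3, -cj(c4)],
        [c2, cj(c1), c4, cj(c3)],
        [c3, -cj(c4), c1, -cj(c2)],
        [c4, cj(c3), c2, cj(c1)],
    ]
    return [[0.5 * x for x in row] for row in rows]

def gram(m):
    """Return M M^* for a matrix given as nested lists."""
    n, t = len(m), len(m[0])
    return [[sum(m[i][k] * m[j][k].conjugate() for k in range(t))
             for j in range(n)] for i in range(n)]

if __name__ == "__main__":
    c = (1 + 1j, 1 - 1j, 1 + 1j, 1 + 1j)
    g = gram(qo_stbc(*c))
    for row in g:
        print([round(abs(x), 6) for x in row])
```

Only the (first, third) and (second, fourth) pairs of rows are coupled, through the cross term $b$; every other off-diagonal entry vanishes, which is what permits pairwise decoding.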

The codeword matrix does not decouple as nicely as in the case of O-STBCs, but at least the first/third and second/fourth columns can be decoded separately, which greatly reduces complexity. Other combinations of O-STBCs have been proposed, including the following Alamouti-like scheme [Jaf01]:

$$\mathcal{Q}(c_1, \ldots, c_{2Q}) = \begin{bmatrix} \mathcal{O}(c_1, \ldots, c_Q) & \mathcal{O}(c_{Q+1}, \ldots, c_{2Q}) \\ -\mathcal{O}(c_{Q+1}, \ldots, c_{2Q})^* & \mathcal{O}(c_1, \ldots, c_Q)^* \end{bmatrix} \tag{102}$$

Decoding with this scheme has complexity similar to the previous case of QO-STBCs.

Rotated QO-STBCs

Because of the way quasi orthogonal matrices are constructed, if two codewords $\mathbf{E}$ and $\mathbf{C}$ each contain one point from the constellation, then $\det(\tilde{\mathbf{E}}) = 0$, which means the QO-STBC fails the rank condition. This implies that in some cases QO-STBCs will have bad diversity gain. A way to improve on this is to use rotated variations of the base constellation to prevent rank deficiencies and achieve good diversity gain [SP03, SX04, WX05, XL05].

Linear Dispersion Codes

The BLAST architecture achieves high multiplexing gain at the expense of diversity gain. O-STBCs in contrast achieve high diversity gain at the expense of multiplexing gain. Linear dispersion codes (LDCs) try to achieve a little of both. LDCs are derived through numerical optimization to determine which basis matrices are optimal relative to some criterion that balances diversity and multiplexing gain. There have been several LDCs proposed including

1. Hassibi and Hochwald LDCs [HH01]
2. Heath and Sandhu LDCs [Hea01, San02]

Algebraic STBCs

The Alamouti code works by transmitting two symbols and then their conjugates arranged in the appropriate way. Algebraic codes also transmit a symbol twice, but instead of transmitting a conjugate they transmit a rotated version of the first set of symbols. In terms of the codeword matrix this can be written as

$$\mathbf{C} = \begin{bmatrix} u_1 & \phi^{1/2} v_1 \\ \phi^{1/2} v_2 & u_2 \end{bmatrix} \tag{103}$$

with

$$\begin{bmatrix} u_1 \\ u_2 \end{bmatrix} = \mathbf{M}_1 \begin{bmatrix} c_1 \\ c_2 \end{bmatrix}, \qquad \begin{bmatrix} v_1 \\ v_2 \end{bmatrix} = \mathbf{M}_2 \begin{bmatrix} c_3 \\ c_4 \end{bmatrix} \tag{104}$$

$\mathbf{M}_1$ and $\mathbf{M}_2$ are unitary matrices that represent the rotations, and the constellation points come from QAM. Fundamentally, designing an algebraic code comes down to choosing the appropriate matrices $\mathbf{M}_1$, $\mathbf{M}_2$, and $\phi$.

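B2,φ code

In this code [DTB02]

$$\mathbf{M}_1 = \mathbf{M}_2 = \frac{1}{\sqrt{2}}\begin{bmatrix} 1 & e^{j\omega} \\ 1 & -e^{j\omega} \end{bmatrix} \tag{105}$$

and $\omega$ is chosen by numerical optimization to fit the given constellation. Finally, $\phi = e^{j\omega}$.

Threaded Algebraic Space-Time Code (TAST)

This code [GD03] is similar to the B2,φ code:

$$\mathbf{M}_1 = \mathbf{M}_2 = \frac{1}{\sqrt{2}}\begin{bmatrix} 1 & e^{j\pi/4} \\ 1 & -e^{j\pi/4} \end{bmatrix} \tag{106}$$

Optimization methods can be used to find $\phi$.

Tilted QAM

This code [YW03] is given by

$$\mathbf{M}_i = \frac{1}{\sqrt{2}}\begin{bmatrix} \cos\omega_i & \sin\omega_i \\ -\sin\omega_i & \cos\omega_i \end{bmatrix} \tag{107}$$

This choice of $\mathbf{M}_i$ is literally a (scaled) rotation matrix that rotates points about the origin by $\omega_i$ radians.

Golden Code

This code [BRV05] is given by

$$\mathbf{M}_1 = \frac{1}{\sqrt{10}}\begin{bmatrix} \alpha & \alpha\theta \\ \bar{\alpha} & \bar{\alpha}\bar{\theta} \end{bmatrix} \tag{108}$$

$$\mathbf{M}_2 = \frac{1}{\sqrt{10}}\begin{bmatrix} 1 & 0 \\ 0 & j \end{bmatrix}\begin{bmatrix} \alpha & \alpha\theta \\ \bar{\alpha} & \bar{\alpha}\bar{\theta} \end{bmatrix} \tag{109}$$

$\alpha$ and $\theta$ are chosen in terms of the golden ratio $\frac{1 + \sqrt{5}}{2}$ and the constellation.

The figure below shows how these space-time codes compare to the optimal diversity-multiplexing tradeoff: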

Figure 9: Diversity-Multiplexing Tradeoff For Several Techniques [OC06]

The figure below shows the error performance of several space-time codes:

Figure 10: Error Performance For Several Techniques [OC06]

6.3 Bell Labs Space Time Architectures
The sections on the MIMO channel have demonstrated that MIMO can provide both a degree of freedom gain (increased capacity) and a diversity gain (better error performance). The Diagonal and Vertical Bell Labs Space Time Architectures (D-BLAST/V-BLAST) suggest general architectures to achieve the gains of MIMO. The general idea of the BLAST architectures is to multiplex several streams of symbols (possibly demultiplexed from one original stream) onto the multiple antennas and then receive and decode the streams. Historically, G. Foschini suggested the D-BLAST architecture first, and V-BLAST was developed later as a simplification. However, it makes more sense to present V-BLAST first and then discuss how D-BLAST is logically an extension of V-BLAST.

6.3.1 V-BLAST
The general architecture of V-BLAST is described in the figure below [GFVW99].

Figure 11: VBLAST Architecture

The independent streams are multiplexed by the matrix $Q$ onto the transmit antennas. At the receiver the streams are decoded jointly or individually. There are two natural choices for $Q$ depending on whether there is CSIT or not.

If there is CSIT, then the matrix $V$ from the SVD $H = U \Sigma V^*$ can be used. The action of $Q$ is to rotate the input streams, so that the action of the channel can be expressed in a simple form. At the receiver the received vector $y$ is multiplied by $U^*$, the conjugate transpose of the matrix $U$ from the SVD of $H$. These actions create an equivalent channel model:

$$\tilde{y} = \Sigma \tilde{x} + \tilde{v} \tag{110}$$

The complex MIMO channel is reduced to several parallel scalar channels, with each subchannel carrying one stream. Sometimes a system provides a codebook of $Q$ matrices that the transmitter can use. The feedback from the receiver is then just an index into the codebook that tells the transmitter which $Q$ to use. This form of feedback massively reduces the required bandwidth in the feedback channel.

If there is no CSIT, then the situation is considerably more complicated and interesting. In this case the best choice for $Q$ is simply the identity matrix $I_{n_t}$. The choice of receiver is then an interesting problem, and there are many choices, all with different tradeoffs.

V-BLAST Receiver Structures
In V-BLAST there is a large degree of freedom in choosing the exact receiver structure. The choice of receiver structure affects error rates, capacity, and the complexity of decoding. The design of efficient V-BLAST receivers is an active area of research. There are two general steps in the V-BLAST receiver. The first is demodulation, in which the receiver estimates what symbol was sent and hence which bits were sent. The next step is decoding, in which any codes that were applied to individual streams are decoded. Basically any convolutional or block code can be applied to an individual stream, so we are primarily interested in different architectures for demodulation.

The optimal V-BLAST receiver is the ML receiver that jointly decodes the streams. The ML receiver estimates the transmitted streams by the rule [TV05]

$$\hat{s} = \arg\min_{s \in \mathcal{C}} \|y - Hs\|^2 \tag{111}$$

Practically, what this method does is pick the closest point to the received vector in the lattice of points formed by $Hs$, where $s$ is a point in the original constellation. Although this method is optimal, it is computationally complex (NP-hard), as it must be performed over all possible transmit vectors. This problem is known as the integer least squares problem. This computational complexity generally makes it infeasible to use an ML detector.

Sphere Decoding
Because the ML detector is computationally infeasible in many practical systems, there has been considerable interest in algorithms that are similar to ML detection in methodology and performance but with considerably less complexity. In addition, these ML-like algorithms can feed soft decisions to the decoders to improve their performance. Sphere decoding is one such algorithm [VB99]. The basic idea behind sphere decoding is to look only at points within a sphere of radius $d$ about the received vector and then choose the closest point inside the sphere [HV05]. Since the transmitted vector is corrupted by AWGN, the actual transmitted lattice point is likely to be close to the received vector and inside the sphere.

Figure 12: Idea Behind Sphere Decoding [HV05]

This process reduces the search space and the necessary number of computations. If the sphere actually contains any points, then it must contain the closest point, which is what ML detection would pick as an estimate of the transmitted lattice point. In this case sphere decoding agrees with ML detection. There are two key problems that sphere decoding has to deal with [HV05].

1. How to find lattice points inside the sphere? The detector cannot compare the received vector to every point in the lattice to find the points inside the sphere, or it would be performing an exhaustive search offering no advantage over normal ML detection.

2. How to choose the sphere radius? If $d$ is too large, then the detector considers too many points. If $d$ is too small, then the sphere may contain no points. One way to choose the sphere radius is to compute the Babai estimate $\hat{s}_B$ of the transmitted symbol. The starting point is the least squares solution, which is not constrained to the lattice:

$$\hat{s} = \arg\min_{s} \|y - Hs\|^2 \tag{112}$$

Rounding each entry of this solution to the nearest lattice point gives $\hat{s}_B$. Then choose $d = \|y - H\hat{s}_B\|$, which guarantees that the sphere contains at least one lattice point. There are other heuristic methods to choose $d$.

A solution to problem number one above is based on a simple observation: the problem is difficult in general but easy in one dimension. In one dimension the sphere is simply an interval. The algorithm proceeds inductively by assuming that all $k$-dimensional points within the sphere of radius $d$ have been found. Then the set of $(k+1)$-dimensional points that lie within radius $d$ is an interval, so the problem reduces to finding the lattice points inside this interval, which is the easy one-dimensional problem. This process continues until the full dimension of the search space is reached. It is usually visualized as a tree where the $k$th level of the tree corresponds to the points of dimension $k$ inside the sphere of radius $d$.

Figure 13: Tree for Sphere Decoding [HV05]

To see exactly how sphere decoding works, suppose the lattice we are working on is the integer lattice $\mathbb{Z}^m$ [HV05]. Suppose the channel matrix $H \in \mathbb{R}^{n \times m}$ and that $n \geq m$. Fix a sphere radius $d$. The goal is to find the points $s \in \mathbb{Z}^m$ such that

$$\|y - Hs\|^2 \leq d^2 \tag{113}$$

where $y$ is the received vector. The algorithm proceeds first by calculating the QR factorization of the matrix $H$:

$$H = Q \begin{bmatrix} R \\ 0_{(n-m) \times m} \end{bmatrix} \tag{114}$$

where $Q$ is an $n \times n$ orthogonal matrix and $R$ is an $m \times m$ upper triangular matrix. This decomposition will make later calculations simpler. Expand the orthogonal matrix $Q$ as

$$Q = \begin{bmatrix} Q_1 & Q_2 \end{bmatrix} \tag{115}$$

with $Q_1 \in \mathbb{R}^{n \times m}$ and $Q_2 \in \mathbb{R}^{n \times (n-m)}$. Then the points inside the sphere satisfy:

$$d^2 \geq \left\| y - \begin{bmatrix} Q_1 & Q_2 \end{bmatrix} \begin{bmatrix} R \\ 0 \end{bmatrix} s \right\|^2 = \left\| \begin{bmatrix} Q_1^* \\ Q_2^* \end{bmatrix} y - \begin{bmatrix} R \\ 0 \end{bmatrix} s \right\|^2 = \|Q_1^* y - Rs\|^2 + \|Q_2^* y\|^2$$

This expression can be rearranged to the condition:

$$d^2 - \|Q_2^* y\|^2 \geq \|Q_1^* y - Rs\|^2 \tag{116}$$

Define $\tilde{d}^2 = d^2 - \|Q_2^* y\|^2$ and $\tilde{y} = Q_1^* y$. Then the condition to be in the sphere is given by

$$\tilde{d}^2 \geq \|\tilde{y} - Rs\|^2 \tag{117}$$

$$= \sum_{i=1}^{m} \Big( \tilde{y}_i - \sum_{j=i}^{m} R_{ij} s_j \Big)^2 \tag{118}$$

The sum can be written term by term as

$$\tilde{d}^2 \geq (\tilde{y}_m - R_{mm} s_m)^2 + (\tilde{y}_{m-1} - R_{m-1,m} s_m - R_{m-1,m-1} s_{m-1})^2 + \cdots \tag{119}$$

We observe that the first term depends only on $\{s_m\}$, the second term depends only on $\{s_{m-1}, s_m\}$, and so on. Then the following is a necessary condition for any point $s$ to be in the sphere:

$$\tilde{d}^2 \geq (\tilde{y}_m - R_{mm} s_m)^2 \tag{120}$$

Basically, the last coordinate of $Rs$ must be within $\tilde{d}$ of the last coordinate of $\tilde{y}$. Finding the integers that satisfy this necessary condition is easy; they are simply the integers in the interval

$$\frac{-\tilde{d} + \tilde{y}_m}{R_{mm}} \leq s_m \leq \frac{\tilde{d} + \tilde{y}_m}{R_{mm}} \tag{121}$$

The key step in this process is how to proceed from finding the $s_m$ in the sphere to finding which $\{s_{m-1}, s_m\}$ are in the sphere. This is done by ensuring the sum of the first two terms in equation 119 is at most $\tilde{d}^2$:

$$\tilde{d}^2 \geq (\tilde{y}_m - R_{mm} s_m)^2 + (\tilde{y}_{m-1} - R_{m-1,m} s_m - R_{m-1,m-1} s_{m-1})^2 \tag{122}$$

To make use of this condition proceed as follows. For each $s_m$ define

$$\tilde{d}_{m-1}^2 = \tilde{d}^2 - (\tilde{y}_m - R_{mm} s_m)^2 \tag{123}$$

Then we can obtain a condition that $s_{m-1}$ must satisfy to be in the sphere:

$$\frac{-\tilde{d}_{m-1} + \tilde{y}_{m-1} - R_{m-1,m} s_m}{R_{m-1,m-1}} \leq s_{m-1} \leq \frac{\tilde{d}_{m-1} + \tilde{y}_{m-1} - R_{m-1,m} s_m}{R_{m-1,m-1}} \tag{124}$$
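The interval tests in equations 121 and 124 extend naturally to a depth-first search over the tree, one coordinate at a time. The sketch below is one minimal way to write that search in Python with NumPy, assuming a real-valued $H$ with full column rank and the integer lattice. The function name `sphere_decode` and its arguments are illustrative choices, not names from [HV05].

```python
import numpy as np

def sphere_decode(H, y, d):
    """Return the integer vector s closest to y among those with
    ||y - H s|| <= d, or None if the sphere holds no lattice point.
    Assumes H is real, n x m with n >= m, and has full column rank."""
    n, m = H.shape
    Q, R = np.linalg.qr(H, mode="complete")       # H = Q [R; 0]
    R = R[:m, :]                                  # m x m upper triangular block
    y_t = Q[:, :m].T @ y                          # y~ = Q1^T y
    d2_t = d**2 - np.sum((Q[:, m:].T @ y) ** 2)   # d~^2 = d^2 - ||Q2^T y||^2
    best = {"s": None, "dist2": np.inf}
    s = np.zeros(m)

    def search(k, budget):
        # budget: squared radius still available for coordinates k, ..., 0
        if budget < 0:
            return
        # Row k residual once coordinates k+1..m-1 are fixed
        r = y_t[k] - R[k, k + 1:] @ s[k + 1:]
        root = np.sqrt(budget)
        # Interval of equation 124; sorting handles a negative R[k, k]
        lo, hi = sorted(((r - root) / R[k, k], (r + root) / R[k, k]))
        for cand in range(int(np.ceil(lo)), int(np.floor(hi)) + 1):
            s[k] = cand
            left = budget - (r - R[k, k] * cand) ** 2
            if k == 0:
                dist2 = d2_t - left               # equals ||y~ - R s||^2
                if dist2 < best["dist2"]:
                    best["s"], best["dist2"] = s.astype(int).copy(), dist2
            else:
                search(k - 1, left)

    search(m - 1, d2_t)
    return best["s"]
```

With a radius that is too small, the function returns `None`, which corresponds to an empty tree; a caller would then typically enlarge $d$ and search again.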

By applying this method to each $s_m$, the points $\{s_{m-1}, s_m\}$ inside the sphere of radius $d$ can be found. This process can be continued until the full $m$-dimensional problem has been solved. It is clear why a tree is an appropriate structure to represent the operation of sphere decoding, since each leaf gives rise to some number of children (possibly zero) in the next iteration, all of which are inside the sphere, as one more dimension of the problem is solved. It is also clear that if we choose the radius to be too small, one of the conditions like equation 124 may not be satisfied by any integer, and thus no points are in the sphere. If the sphere radius is too large, then too many points may satisfy equation 124, making it expensive to compute the closest point.

Non-Joint Detection
Besides joint detection there is a wealth of detectors that work on detecting individual streams from the received signal and don't attempt to decode all the streams simultaneously. Consider trying to decode one stream $x_k$. The system in this case can be modeled as

$$y[n] = h_k x_k[n] + \sum_{i \neq k} h_i x_i[n] + v[n] \tag{125}$$

where $h_i$ is the $i$th column of the channel matrix $H$. In this system there is a stream of interest plus several interfering streams, represented by the sum, plus a noise term. To successfully decode the stream of interest the receiver must deal with the interference term and the noise term.

Zero Forcing Nulling
At high SNR performance will be interference limited, not noise limited [GFVW99]. ZF-Nulling attempts to remove all the interfering terms in the sum to leave only the stream of interest. This can be done linearly with a single vector multiplication. The weighting vector $q_k$ to decode the $k$th stream satisfies $q_k^T h_j = \delta_{kj}$, where $\delta_{kj}$ is the Kronecker delta, which is 1 when $k = j$ and 0 otherwise. Then

$$\begin{aligned} q_k^T y[n] &= q_k^T h_k x_k[n] + \sum_{i \neq k} q_k^T h_i x_i[n] + q_k^T v[n] \\ &= \delta_{kk} x_k[n] + \sum_{i \neq k} \delta_{ki} x_i[n] + q_k^T v[n] \\ &= x_k[n] + q_k^T v[n] \end{aligned} \tag{126}$$

This weighting vector has an obvious geometric interpretation; the weighting vector projects the received vector y onto a subspace orthogonal to h1 , . . . , hk−1 , hk+1 , . . . , hnt .


Figure 14: Zero Forcing Nulling MIMO

The weighting vectors $q_k^T$ are just the rows of the pseudoinverse of $H$ given by $H^\dagger = (H^* H)^{-1} H^*$, since $H^\dagger H = I$ is exactly the condition $q_k^T h_j = \delta_{kj}$. So it is not too difficult to compute the appropriate weighting vectors given the channel matrix $H$. It is easy to calculate the output SNR for each stream using the weighting vectors as

$$\mathrm{SNR}_k = \frac{P}{\|q_k\|^2 N_o} \tag{127}$$
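As a small numerical illustration of ZF-Nulling, the sketch below computes the weighting vectors from the pseudoinverse and the per-stream SNR of equation 127. It assumes the weighting vectors are taken as the rows of $H^\dagger$ (which is what $q_k^T h_j = \delta_{kj}$ requires) and equal power $P$ on every stream. The helper names `zf_weights` and `zf_snr` and the example channel are made up for illustration.

```python
import numpy as np

def zf_weights(H):
    """Pseudoinverse of H; row k is q_k^T and nulls every stream except k."""
    return np.linalg.pinv(H)

def zf_snr(H, P, N0):
    """Per-stream output SNR P / (||q_k||^2 N0), assuming equal power P per stream."""
    W = zf_weights(H)
    return P / (N0 * np.sum(np.abs(W) ** 2, axis=1))

# Illustrative 3 x 2 channel (3 receive antennas, 2 streams)
H = np.array([[1.0 + 1.0j, 0.5],
              [0.2j, 1.0 - 0.5j],
              [0.3, 0.4 + 0.1j]])
W = zf_weights(H)
interference_free = np.allclose(W @ H, np.eye(2))   # q_k^T h_j = delta_kj
```

For a channel with orthogonal columns the nulling costs nothing, and `zf_snr` reduces to the matched filter SNR $P \|h_k\|^2 / N_o$.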

ZF-Nulling with Successive Interference Cancellation
The SNR has an inverse relation to $\|q_k\|^2$, so if $\|q_k\|^2$ can be reduced the SNR will be increased. Results from linear algebra indicate that the higher the dimension of the subspace that $q_k$ must be orthogonal to, the larger $\|q_k\|^2$ is. So if $q_k$ must be orthogonal to fewer vectors, then $\|q_k\|^2$ will be reduced. Successive interference cancellation (SIC) can reduce the dimension and increase the SNR. The diagram below shows the operation of SIC.

Figure 15: Successive Interference Cancellation [TV05]

With this scheme, as each stream is decoded it is subtracted from the received vector. As a result the subtracted stream does not interfere with any subsequent streams. So $q_k$ only has to be orthogonal to $h_{k+1}, \ldots, h_{n_t}$. The reduced number of vectors means $\|q_k\|^2$ is reduced and $\mathrm{SNR}_k$ is increased.
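The decode-and-subtract loop can be sketched in a few lines. The version below is a simplified illustration that detects the streams in the fixed order 1, 2, ..., $n_t$ with hard decisions and no channel coding; the function name `zf_sic` and the QPSK alphabet are assumptions made for the example.

```python
import numpy as np

def zf_sic(H, y, constellation):
    """ZF-Nulling with successive cancellation in the fixed order 0, 1, ..., nt-1."""
    nt = H.shape[1]
    y = y.astype(complex).copy()
    x_hat = np.zeros(nt, dtype=complex)
    for k in range(nt):
        # Streams 0..k-1 are already cancelled, so only k+1..nt-1 must be nulled
        W = np.linalg.pinv(H[:, k:])
        z = W[0] @ y                                   # soft estimate of stream k
        x_hat[k] = constellation[np.argmin(np.abs(constellation - z))]
        y -= H[:, k] * x_hat[k]                        # subtract the detected stream
    return x_hat

qpsk = np.array([1 + 1j, 1 - 1j, -1 + 1j, -1 - 1j])
H = np.array([[1.0 + 1.0j, 0.5],
              [0.2j, 1.0 - 0.5j],
              [0.3, 0.4 + 0.1j]])
x = np.array([1 + 1j, -1 - 1j])
x_detected = zf_sic(H, H @ x, qpsk)                    # noiseless example
```

A wrong hard decision in an early iteration is subtracted along with everything else, so the sketch also exhibits the error propagation that SIC is known for.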


One practical issue when implementing SIC is the order of cancellation. The last decoded stream has the least interference and achieves the best performance. It has been demonstrated that a greedy choice of order is optimal relative to the maximin criterion [GFVW99]. This means that the $k$th stream to be decoded should be chosen from the remaining streams as the one that will achieve the highest SNR if it is decoded now. The maximin criterion means that the smallest $\mathrm{SNR}_k$ is maximized by choosing the optimal order. The major drawback to SIC is error propagation. Mistakes at the beginning of the decoding chain can introduce mistakes later on. So if one stream is inaccurately decoded, then all subsequent streams will likely be decoded inaccurately.

Matched Filter
At very low SNR noise is the problem, so a matched filter can be used to deal with the noise. In the MIMO case the matched filter for each stream is simply maximum ratio combining (MRC) performed on the appropriate column of $H$.

MMSE Receiver
The matched filter performs well at low SNR and ZF-Nulling performs well at high SNR. But at high SNR the matched filter has bad performance, and at low SNR ZF-Nulling has bad performance. So naturally one may wonder if there is a receiver that operates well at both low and high SNR. The MMSE receiver is such a receiver [TV05]. To understand how the MMSE receiver works consider the following SIMO system modeled as

$$y = hx + z \tag{128}$$

with $z$ colored noise having invertible correlation matrix $K_z$. The first operation is to whiten the noise by multiplying by $K_z^{-1/2}$. Then the system becomes

$$K_z^{-1/2} y = K_z^{-1/2} h x + K_z^{-1/2} z \tag{129}$$

Then apply a matched filter $(K_z^{-1/2} h)^*$ to yield the system

$$h^* K_z^{-1} y = (h^* K_z^{-1} h) x + h^* K_z^{-1} z \tag{130}$$

Thus the receiver simply multiplies the received signal by $h^* K_z^{-1}$ and performs normal demodulation. This is the MMSE receiver, which maximizes the SNR while minimizing the mean squared error between the estimate of $x$ and $x$ itself. For V-BLAST the corrupting non-white noise is the interference terms plus the additive noise. The covariance matrix for this noise is given by

$$K_{z_k} = N_o I_{n_r} + \sum_{i \neq k} P_i h_i h_i^* \tag{131}$$

A similar derivation shows that the MMSE weighting vector in this case is

$$q_k = \Big( N_o I_{n_r} + \sum_{i \neq k} P_i h_i h_i^* \Big)^{-1} h_k \tag{132}$$
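Equation 132 is straightforward to evaluate numerically. The sketch below builds the interference-plus-noise covariance of equation 131 and solves for $q_k$; the function name `mmse_weight`, the power list `P`, and the example channel are illustrative assumptions. The two checks at the end look at the behavior for very large and very small $N_o$.

```python
import numpy as np

def mmse_weight(H, k, P, N0):
    """q_k = (N0 I + sum_{i != k} P_i h_i h_i^*)^{-1} h_k  (equation 132)."""
    nr, nt = H.shape
    K = N0 * np.eye(nr, dtype=complex)
    for i in range(nt):
        if i != k:
            h = H[:, [i]]                      # column i as an nr x 1 matrix
            K += P[i] * (h @ h.conj().T)
    return np.linalg.solve(K, H[:, k])

H = np.array([[1.0 + 1.0j, 0.5],
              [0.2j, 1.0 - 0.5j],
              [0.3, 0.4 + 0.1j]])
P = [1.0, 1.0]

# Very noisy channel: q_0 is close to h_0 / N0, i.e. a scaled matched filter
q_low = mmse_weight(H, 0, P, 1e6)
# Nearly noiseless channel: the interferer h_1 is almost completely nulled
q_high = mmse_weight(H, 0, P, 1e-6)
leak = abs(q_high.conj() @ H[:, 1]) / abs(q_high.conj() @ H[:, 0])
```

Using `np.linalg.solve` instead of forming the inverse explicitly is a numerical convenience and does not change the result.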

It is pretty easy to see that the MMSE receiver is a tradeoff between the matched filter and ZF-Nulling. At low SNR

$$K_{z_k} \approx N_o I_{n_r} \tag{133}$$

so the receiver is given by $h_k$, which is a matched filter. At high SNR

$$K_{z_k} \approx \sum_{i \neq k} P_i h_i h_i^* \tag{134}$$

and it can be seen that $q_k$ approaches, up to scaling, the ZF-Nulling vector obtained from the pseudoinverse of $H$. Thus the MMSE receiver is like ZF-Nulling at high SNR. In addition, the MMSE receiver has good performance in the region between high and low SNR. If SIC is used in conjunction with MMSE, then MMSE-SIC can achieve the channel capacity.

6.3.2 D-BLAST
Consider the $k$th stream. It is transmitted by one antenna and received by all $n_r$ receive antennas. Thus the maximum possible diversity gain for any individual stream is $n_r$, and there is a limit to how much MIMO diversity techniques can protect a stream [Fos96]. If a SIC structure is used with either the MMSE receiver or ZF-Nulling, then if one stream is incorrectly decoded, subsequent streams will likely be incorrectly decoded. The main reason for this problem is that no coding is performed spatially across the multiple streams. Coding across streams could be used to ensure each stream is reliably decoded, but in V-BLAST each stream must already be decoded before the spatial code across the streams can be decoded. The solution to this problem is to alter the way the streams are transmitted.

Consider the case with two transmit antennas. Suppose that there are two separate streams, each consisting of two blocks. Denote these by $a^{(i)}$ and $b^{(i)}$ for $i = 1, 2$. Then the D-BLAST codeword is

$$C = \begin{bmatrix} a^{(1)} & b^{(1)} & \\ & a^{(2)} & b^{(2)} \end{bmatrix} \tag{135}$$

From this codeword matrix it is obvious where D-BLAST gets its name, since the layers are now diagonal. The receiver works as follows:
1. First receive $a^{(1)}$ with MRC.
2. Next receive $a^{(2)}$ with MMSE or ZF-Nulling, while ignoring $b^{(1)}$.
3. Next decode the spatial code across the first layer $[a^{(1)} \; a^{(2)}]$.
4. Now $a^{(2)}$ has been reliably decoded, so it can be cancelled out and $b^{(1)}$ can be received.
5. $b^{(2)}$ can be received with MRC.
6. Finally, the second layer $[b^{(1)} \; b^{(2)}]$ can be decoded.

Now both streams have been decoded reliably. The key observation is that, for a single layer, if one of the blocks for one stream is initially decoded incorrectly, there is still a chance to fix the error with the code applied across the layer. The major price to pay for using D-BLAST is the lost capacity during the startup process due to the blank spots in the codeword. For example, during the first block the second transmit antenna transmits nothing, and so some capacity is lost. Finally, there is also the cost in implementation complexity of applying coding and decoding across streams.

6.4 Space-Time Trellis Codes
A space-time trellis code (STTC) is an extension of normal convolutional codes to multiple antennas [TSC98]. The key idea behind a STTC is to make the output of the encoder a function of the input bits and the state of the encoder, which is in turn a function of the previous inputs. Trellis codes provide better error performance and coding gain compared to block codes, at the expense of implementation complexity.

6.4.1 Trellis Representation
A trellis diagram is a way of representing the action of a STTC [OC06]. Suppose $B$ bits are input into the encoder, which has $2^\nu$ states. The diagram below shows a trellis. The number of nodes is the number of states in the code. The left column represents the current state of the code and the right column represents the next state. The transition arrows are driven by the input bits. There are $2^B$ arrows from each state on the left to states on the right, one for each possible combination of inputs. The possible outputs are listed on the left hand side of the trellis. If the output is 02, for example, then the 0th symbol is sent on the first antenna and the 2nd symbol is sent on the second antenna.

Figure 16: Trellis Coding [OC06]

The decoder's job is to estimate which sequence, that is, which path through the trellis, was sent. One way to do this is with a Maximum Likelihood Sequence Estimator (MLSE). There is a well known algorithm, the Viterbi algorithm, to efficiently estimate which sequence was sent.

Trellis Complexity
There is a fundamental lower bound to the complexity of a STTC. A STTC with $B$ input bits and minimum rank $r_{min}$ has at least $2^{B(r_{min}-1)}$ states. As the number of states increases the complexity of decoding increases, so this lower bound on states puts a lower bound on the possible complexity.

6.4.2 Delay-Diversity Scheme
This is one of the simplest trellis codes to achieve diversity [Wit93, SW94]. Consider a $2 \times 1$ MIMO system. The codeword for $T$ transmitted symbols is given by

$$C = \frac{1}{\sqrt{2}} \begin{bmatrix} c_1 & c_2 & \cdots & c_T & 0 \\ 0 & c_1 & \cdots & c_{T-1} & c_T \end{bmatrix} \tag{136}$$

The trellis diagram below represents this code.

The effect of this code is to convert spatial diversity to frequency diversity. When $c_1$ is transmitted during the first symbol period, it sees the channel $h_1$. When $c_1$ is transmitted during the second symbol period, it sees channel $h_2$. This is equivalent to passing $c_1$ through a frequency selective channel with two taps: $h_1$ and $h_2$. So spatial diversity becomes frequency diversity by applying this code.

7 Space-Time Coding for Frequency Selective Channels
There are two basic approaches to MIMO over frequency selective channels, as in normal SISO frequency selective channels:
1. Single carrier
2. Multicarrier - OFDM

Many modern wireless standards that use MIMO also use OFDM, so MIMO-OFDM is of particular interest.

7.1 Single Carrier
In this case the system can be modeled as

$$y_k = \sum_{l=0}^{L-1} H[l] c_{k-l} + v_k \tag{137}$$

This system involving a summation can be expressed as a simple system of the form

$$y_k = \begin{bmatrix} H[0] & \cdots & H[L-1] \end{bmatrix} \begin{bmatrix} c_k^T & \cdots & c_{k-L+1}^T \end{bmatrix}^T + v_k \tag{138}$$

which is similar to the narrowband MIMO case.

7.2 MIMO-OFDM
MIMO-OFDM is an extension of normal OFDM to the MIMO case where there are multiple antennas.

7.2.1 OFDM
OFDM uses the FFT and IFFT to decompose the wideband frequency selective channel into several smaller narrowband frequency flat channels. The cyclic prefix is added to prevent ISI.

Figure 17: OFDM System Model [OC06]

The system model can be expressed in the DFT frequency domain as

$$Y[k] = H[k] X[k] + V[k] \tag{139}$$

with $V[k]$ the corrupting noise. The product $H[k]X[k]$ results in a circular convolution in the time domain, which can be expressed in matrix form as

$$\begin{bmatrix} y[0] \\ y[1] \\ \vdots \\ y[N-1] \end{bmatrix} = \begin{bmatrix} h[0] & h[N-1] & h[N-2] & \cdots & h[1] \\ h[1] & h[0] & h[N-1] & \cdots & h[2] \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ h[N-1] & h[N-2] & h[N-3] & \cdots & h[0] \end{bmatrix} \begin{bmatrix} x[0] \\ x[1] \\ \vdots \\ x[N-1] \end{bmatrix} + \begin{bmatrix} v[0] \\ v[1] \\ \vdots \\ v[N-1] \end{bmatrix}$$

Because this channel matrix $H$ is circulant, it can be diagonalized as $H = D^* \Lambda D$, where $D$ is the unitary matrix that performs the DFT. The matrix $\Lambda$ is a diagonal matrix specific to each $H$,

$$\Lambda = \begin{bmatrix} \lambda_1 & & & 0 \\ & \lambda_2 & & \\ & & \ddots & \\ 0 & & & \lambda_N \end{bmatrix} \tag{140}$$

but $D$ is not specific to each $H$.
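The claim that the DFT diagonalizes the circular-convolution channel matrix is easy to check numerically. In the sketch below, `F` denotes the unitary DFT matrix, `C` is the circulant matrix built from a short, made-up set of channel taps, and the diagonal of `F C F^*` is compared against the FFT of the zero-padded taps. The helper `circulant` and the tap values are illustrative assumptions.

```python
import numpy as np

def circulant(h, N):
    """N x N circulant matrix whose first column is h zero-padded to length N."""
    col = np.zeros(N, dtype=complex)
    col[:len(h)] = h
    return np.column_stack([np.roll(col, k) for k in range(N)])

N = 8
h = np.array([1.0, 0.5, 0.25j])              # illustrative channel taps
C = circulant(h, N)
F = np.fft.fft(np.eye(N)) / np.sqrt(N)       # unitary DFT matrix
Lam = F @ C @ F.conj().T                     # diagonal up to rounding error

x = np.arange(N) + 1j * np.arange(N)[::-1]   # arbitrary test block
y_time = C @ x                               # circular convolution in time
y_freq = np.fft.ifft(np.fft.fft(h, N) * np.fft.fft(x))   # per-tone multiplication
```

The diagonal entries of `Lam` are the frequency-domain channel coefficients $H[k]$, which is why each tone sees a flat scalar channel.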

7.2.2 Extension to MIMO-OFDM
A MIMO-OFDM system can be modeled like a SISO OFDM system with the channel taps replaced by channel matrices [OC06]. First, start with the frequency selective MIMO channel

$$y_k = \sum_{l=0}^{L-1} H[l] x_{k-l} + v_k \tag{141}$$

Then append a cyclic prefix, $X_g$, to prevent ISI to produce the modified system

$$y = H_g \begin{bmatrix} X_g & X \end{bmatrix} + \tilde{v} \tag{142}$$

with the channel matrix given by

$$H_g = \begin{bmatrix} H[L-1] & \cdots & H[1] & H[0] & 0 & \cdots & 0 \\ 0 & H[L-1] & \cdots & H[1] & H[0] & \cdots & 0 \\ \vdots & & \ddots & & & \ddots & \vdots \\ 0 & \cdots & 0 & H[L-1] & \cdots & H[1] & H[0] \end{bmatrix} \tag{143}$$

As in SISO OFDM the cyclic prefix, which is necessary for practical implementation, can be removed from the analytical model. A large blockwise circulant matrix can represent the effective channel seen by the whole MIMO-OFDM codeword:

$$H_{cp} = \begin{bmatrix} H[0] & 0 & \cdots & 0 & H[L-1] & \cdots & H[1] \\ H[1] & H[0] & 0 & \cdots & 0 & \cdots & H[2] \\ \vdots & & \ddots & & & & \vdots \\ 0 & \cdots & 0 & H[L-1] & H[L-2] & \cdots & H[0] \end{bmatrix} \tag{144}$$

Since $H_{cp}$ is blockwise circulant, it can be decomposed as

$$H_{cp} = D^* \Lambda_{cp} D \tag{145}$$

where $D$ is the blockwise DFT matrix. Thus the complicated MIMO-OFDM channel can be regarded as a diagonal channel with the appropriate coordinate change given by the DFT. Given

$$H_k = \sum_{l=0}^{L-1} H[l] e^{-j 2\pi k l / T} \tag{146}$$

then ML detection is given by

$$\hat{X} = \arg\min_{C} \sum_{k=0}^{T-1} \|y_k - H_k c_k\|^2 \tag{147}$$

Like OFDM, MIMO-OFDM has issues with PAPR and frequency offset estimation. A block diagram for MIMO-OFDM follows below:

Figure 18: MIMO OFDM [OC06]

7.2.3 Space-Frequency Coded MIMO-OFDM
For normal OFDM the frequency domain channel coefficients $H[k]$ can be viewed as the channel coefficients in a narrowband fast fading time channel. The frequency index $k$ can be reinterpreted as a time domain index. Thus codes designed for fast fading time channels can be applied across the subcarriers. In MIMO-OFDM the same idea can be used to code across the subcarriers [TV05].

7.2.4 Space-Time Coded MIMO-OFDM
This is the simplest MIMO-OFDM system, with no coding across the subcarriers. Instead the OFDM part of the system chops the frequency selective channel into frequency flat channels on which normal space-time coding techniques can be applied. For example, in the $2 \times n_r$ case the Alamouti code can be used on each subcarrier through the following process:
1. Transmit $[c_1 \; c_2]^T$ on a given tone during the first OFDM symbol
2. Transmit $[-c_2^* \; c_1^*]^T$ on the same tone during the second OFDM symbol
3. Perform normal Alamouti decoding

This idea certainly works, but it limits the system, since the channel has to remain static for the duration of two OFDM symbols. Depending on system parameters this may not be a reasonable assumption. In general all space-time codes discussed before assume the channel is static over the duration of a codeword, so this is a general problem in Space-Time Coded MIMO-OFDM.

7.2.5 Space-Time Frequency Coded MIMO-OFDM
In a Space-Time Frequency Coded MIMO-OFDM system coding is performed over all three available dimensions: space, time, and frequency. Below are several examples of this idea.

Generalized Delay Diversity
This code [GSP02] has matrix form

$$C = \frac{1}{\sqrt{2}} \begin{bmatrix} c_1 & c_2 & \cdots & c_T & 0 & 0 \\ 0 & c_1 & c_2 & \cdots & c_T & 0 \\ 0 & c_1 & c_2 & \cdots & c_T & 0 \\ 0 & 0 & c_1 & c_2 & \cdots & c_T \end{bmatrix} \tag{148}$$

This code provides a diversity gain of 3.

Lindskog-Paulraj Scheme
This code [LP00] basically extends Alamouti in a natural way to MIMO-OFDM. The fundamental units of transmission are two blocks of length $T$: $c_1[k]$ and $c_2[k]$. The scheme is then
1. Send $[c_1[k] \; c_2[k]]^T$
2. Send $[-c_2^*[k] \; c_1^*[k]]^T$
3. Decode like Alamouti except use two independent MLSE estimators. This can be accomplished with two parallel copies of the Viterbi algorithm, which entails increased complexity.

8 Multiuser MIMO
Historically MIMO was developed for use in point to point situations. However, MIMO can also be used as a multiple access technique to allow multiple users to seamlessly share the spatial channel. This type of MIMO is called Multiuser MIMO (MU-MIMO). The typical application of MU-MIMO is in a cellular system with multiple antennas at the base station and only one or two antennas at each mobile [GKHCS07]. The collection of antennas at all the mobile users in a cell is regarded as one big antenna array. One of the key advantages of having a distributed array comprised of all the mobiles is that the channel matrix rarely suffers from rank deficiencies, so spatial multiplexing is almost always possible. However, in order to actually get the benefits of MU-MIMO the base station needs CSIT or at least partial CSIT.

For a MU-MIMO system having $N$ transmit antennas at the base station and $U$ users each with $M_k$ antennas, the downlink, or broadcast channel, for each user can be modeled as

$$y_k = h_k \sum_{l=1}^{N} x_l + v_k \tag{149}$$

The uplink, or MAC channel, can be modeled as

$$y = \sum_{i=1}^{U} h_i x_i + v \tag{150}$$

8.1 Precoding
Information theoretic results have shown that by using a type of coding called dirty paper coding (DPC) at the transmitter, $N$ users' streams can be multiplexed and transmitted [SB07, GC80]. Effectively, what the coding at the transmitter does is pre-cancel interference at the receivers, like ZF-Nulling does in BLAST.

8.1.1 Linear Precoding
The downlink channel can be written in a simple form making explicit how other users' streams produce interference:

$$y_k = H_k s_k + H_k \sum_{l=1, l \neq k}^{N} s_l + v_k \tag{151}$$

The simplest form of precoding is to multiply the transmit symbols by a matrix, $W_k$, that will cancel out the interferers [SSH04]:

$$y_k = H_k W_k s_k + H_k \sum_{l=1, l \neq k}^{N} W_l s_l + v_k \tag{152}$$

In the case when each user has one receive antenna this problem is identical to canceling interference in BLAST. So the proper choice for $W_k$ is the $k$th column of the pseudoinverse of the effective channel matrix $H = [h_1 \; h_2 \; \cdots \; h_N]$.

8.1.2 Nonlinear Precoding
Nonlinear precoding is more like DPC than linear precoding and can produce better results at the cost of increased complexity. Well known nonlinear precoding methods include perturbation methods and Tomlinson-Harashima codes [PHS05, HPS05].
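For the linear precoding of Section 8.1.1 with single-antenna users, the pseudoinverse choice can be illustrated with a short numerical sketch. It assumes, for the example only, that the effective channel matrix stacks each user's channel as one row, so that the columns of its pseudoinverse are the precoding vectors $W_k$; the function name `zf_precoder` and the channel values are made up.

```python
import numpy as np

def zf_precoder(H):
    """Right pseudoinverse of H; column k is the precoding vector for user k.
    Assumes row k of H is the channel from the base station to user k."""
    return np.linalg.pinv(H)

# Illustrative downlink: 2 single-antenna users, 3 base-station antennas
H = np.array([[1.0, 0.5j, 0.2],
              [0.3, 1.0, -0.4j]])
W = zf_precoder(H)

s = np.array([1 + 1j, -1 + 1j])   # one symbol per user
y = H @ (W @ s)                   # noiseless received symbols, one per user
```

Each user receives only its own symbol because $HW = I$; in practice the columns of `W` would also be scaled to meet the transmit power constraint, which this sketch leaves out.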

5. • Spectrum . 9. 9 MIMO in Wireless Standards Many emerging wireless standards provide for MIMO to provide both diversity and multiplexing gain as needed.2 Scheduling If the number of users U is greater than the number of transmit antennas N . which selects the N users with the best channels. The basic option for the downlink is two antennas at the base station and two at the mobile station. 5. Allowed sizes are 1. Basic results have demonstrated that the gains of MU-MIMO can still be achieved with only partial CSIT.3 Working with Partial CSIT To achieve CSIT each user must feedback its channel estimate to the base station. If the base station has CSIT. which is tricky and reduces capacity [GA04]. The major features of LTE are outlined below [3GPP07. 8. • IP Network .No ﬁxed spectrum size. which entails less system complexity. So at any given time the base station must choose some subset of the users to transmit to [GKHCS07]. then there are two methods it can apply: 46 . then the base station can’t transmit to all the users simultaneously. 2.8. 15. 3GPPRel9]: • High data rates .6. so heurestic methods must be used to choose a subset of users.LTE uses OFDM with a variable number of subcarriers.100 Mbps in the downlink using 2 × 2 MIMO and 50 Mbps in the uplink using no MIMO • Mobility . This section examines three prominent new wireless standards that employ MIMO. The optimal scheduling algorithm is to simply perform an exhaustive search over all possible combinations of users. and 20 MHz.No circuit switched domain but all IP based network.1 3GPP LTE The Third Generation Partnership Project Long-Term Evolution (3GPP LTE) is the emerging 4G standard that is currently being implemented and tested. A simple choice is a greedy algorithm. This is not computationally feasible though. Extensions to LTE allow 4 × 2 and 4 × 4 MIMO.Best performance for 0-15 km/hr and good performance of 15-120 km/hr. 1. The downlink in LTE provides several options for using MIMO. • OFDM . 
3GPPRel8.25. 10. To combat this problem some research has been performed into MU-MIMO systems with only partial CSIT.

Since the base station knows the channel at the receiver. Pre-coding SDM .No ﬁxed spectrum size. and 20 MHz. In the uplink MU-MIMO can be used with the proper scheduling. 9.16e.1. IS05]: • High data rates . it can pre-code the transmitted symbols to present interference using the V matrix from the SVD of H.WiMAX uses OFDM with a variable number of subcarriers.75 Mbps in 802. • Spectrum .2 WiMAX WiMAX was originally developed to address the last mile connection to the internet. 2. Without CSIT the base station can use Space-Frequency Block Coding by using the Alamouti code for each tone.Introduced in 802. 10. The baseline case assumes 1 × 2 and the extension is 1 × 4.Use some form of beamforming such as TMRC. • Mobility . It has evolved to provide high data rate mobile data. Range up to 30 miles in 802.16e. 5. 2. Allowed sizes are 1.16e. • OFDM .16d and 30 Mbps in 802. Beamforming .5. The general structure of a WiMAX transmitter is demonstrated in the ﬁgure below: Figure 19: WiMAX Transmitter [AGM05] 47 . The key features of WiMAX are outlined below [IS04.25.

There are several different MIMO methods that can be employed in WiMAX. As we have seen before, the methods employed depend on whether the transmitter has channel state information or not.

Open Loop (No CSIT)
The 802.16 standard defines several options for space-time codes for 2-4 antennas. However, the two most common codes for space-time coding are:

$$\begin{bmatrix} S_1 & -S_2^* \\ S_2 & S_1^* \end{bmatrix} \qquad \begin{bmatrix} S_1 \\ S_2 \end{bmatrix} \tag{153}$$

where $S_1$ and $S_2$ are OFDM symbols. 802.16 also provides for space-frequency coding called the Frequency Hopping Diversity Code (FHDC), based on the Alamouti code. The OFDM symbols are uncoded in time and coded in the frequency domain. The figure below shows how FHDC works:

Figure 20: WiMAX Frequency Hopping Diversity [AGM05]

Closed Loop (CSIT)
With CSIT the transmitter can make better decisions. One of the common methods used in feedback is codebook based feedback. The codebook is basically a predetermined set of choices for the $Q$ matrix in BLAST. The feedback is an index into the codebook that tells the transmitter which matrix to use. Another alternative is to use a feedback channel and have the receiver transmit a quantized version of the channel. The transmitter can then design the optimal precoding matrix.

9.3 802.11n
802.11n is the next generation 802.11 LAN that seeks to provide very high data rates. In 802.11 a/b/g high data rates were achieved by using high order modulation like 64-QAM. However, the cost of this approach is a loss in range, because higher SNR is necessary to successfully demodulate 64-QAM. The way 802.11n seeks to overcome this problem and provide both high data rates and better range is through MIMO-OFDM. 802.11n transmits multiple data streams from the multiple transmit and receive antennas. Thus 802.11n achieves higher data rates without using more bandwidth or larger constellations by increasing spectral efficiency. The key features of 802.11n are outlined below [IWG04, ?, ?]:
• High Data Rate - 130 Mbps typically
• Spectrum - 20 MHz (Optionally 40 MHz)
• OFDM - Uses OFDM

The basic case for 802.11n is $2 \times 2$, but the 802.11n standard provides for up to $4 \times 4$ MIMO. A block diagram demonstrating the operation of 802.11n follows:

Figure 21: 802.11n Transmitter [OC06]

The 802.11n transmitter sends every other group of bits to each OFDM branch. Each branch performs normal OFDM with spatial subcarrier mapping. Then each branch transmits on one antenna. The receiver architecture is symmetric and is manufacturer specific.

10 Conclusion
MIMO has become a popular technology for emerging wireless standards because it can provide better error performance in the form of diversity gain and better data rates in the form of multiplexing gain without using more bandwidth. In addition, MIMO works well with OFDM, which has become a ubiquitous feature of modern wireless standards. MIMO continues to be an active research area, with multiuser MIMO as a new area of great interest for future development. MIMO is an exciting field that looks to be a major part of research and standards in wireless communications for many years to come.

A Math Review

This section reviews a few common mathematical tools used in MIMO. In particular, this section covers some important linear algebra topics and Lagrange multipliers for optimization [HJ95].

A.1 Rank

Let $A \in \mathbb{C}^{m \times n}$. Then A can be written in terms of column vectors as $A = [a_1, a_2, \ldots, a_n]$. The column space of A, denoted col(A), is given by

$$
\mathrm{col}(A) = \mathrm{span}(a_1, a_2, \ldots, a_n) \tag{154}
$$

Then the rank of A, denoted rank(A), is defined to be dim col(A). So the rank of A is the largest number of columns of A that constitute a linearly independent set. A can also be written in terms of rows as $A = [a_1^T, a_2^T, \ldots, a_m^T]^T$, where each $a_i^T$ is now a row of A. Then the row space of A, denoted row(A), is given by

$$
\mathrm{row}(A) = \mathrm{span}(a_1, a_2, \ldots, a_m) \tag{155}
$$

With these definitions it can be demonstrated that rank(A) = dim row(A). Listed below are several useful facts about rank:

1. $\mathrm{rank}(A^T) = \mathrm{rank}(A^*) = \mathrm{rank}(A)$
2. $\mathrm{rank}(A) \le \min\{m, n\}$
3. $\mathrm{rank}(A^* A) = \mathrm{rank}(A)$
4. If A is square (m = n), then A is invertible if and only if rank(A) = n.

A.2 Eigenvalues and Eigenvectors

Let A be an n × n matrix over the complex numbers. A complex number λ and a complex vector $x \neq 0$ are said to be an eigenvalue and its associated eigenvector if

$$
Ax = \lambda x \tag{156}
$$

By simple rearranging this expression can be written as

$$
(A - \lambda I)x = 0 \tag{157}
$$

This equation has non-trivial solutions ($x \neq 0$) only if $A - \lambda I$ is not invertible. This is true precisely when $\det(A - \lambda I) = 0$. When the determinant is expanded and evaluated it becomes an nth degree polynomial in λ. By the Fundamental Theorem of Algebra this polynomial has n complex roots counting multiplicity. Thus A has n eigenvalues including multiplicity.
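The rank facts and the characteristic-polynomial argument above can be checked numerically on small matrices. The following is a minimal sketch using only the Python standard library; the helper names (`rank`, `conj_transpose`, `matmul`, `eigenvalues_2x2`) and the example matrices `G` and `H` are illustrative assumptions, not part of the tutorial. The rank is computed by Gaussian elimination with a tolerance, and the 2 × 2 eigenvalues are obtained as the roots of $\det(A - \lambda I) = \lambda^2 - \mathrm{tr}(A)\lambda + \det(A)$.

```python
import cmath


def rank(matrix, tol=1e-10):
    """Rank of a complex matrix via Gaussian elimination with partial pivoting."""
    rows = [[complex(v) for v in row] for row in matrix]
    m, n = len(rows), len(rows[0])
    r = 0  # number of pivots (independent rows) found so far
    for col in range(n):
        # Choose the remaining row with the largest entry in this column.
        pivot = max(range(r, m), key=lambda i: abs(rows[i][col]), default=None)
        if pivot is None or abs(rows[pivot][col]) < tol:
            continue  # no usable pivot in this column
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, m):
            factor = rows[i][col] / rows[r][col]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r


def conj_transpose(matrix):
    """Conjugate transpose A^* of a matrix stored as a list of rows."""
    return [[matrix[i][j].conjugate() for i in range(len(matrix))]
            for j in range(len(matrix[0]))]


def matmul(a, b):
    """Plain matrix product of two lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def eigenvalues_2x2(a):
    """Roots of lambda^2 - tr(A) lambda + det(A) for a 2 x 2 matrix."""
    trace = a[0][0] + a[1][1]
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    disc = cmath.sqrt(trace * trace - 4 * det)
    return (trace + disc) / 2, (trace - disc) / 2


# Illustrative 3 x 2 matrices: G has independent columns,
# while the second column of H is 1j times its first column.
G = [[1, 0], [0, 1j], [1, 1]]
H = [[1, 1j], [1j, -1], [2, 2j]]

gram = matmul(conj_transpose(G), G)  # G^* G is a 2 x 2 square matrix
print("rank(G) =", rank(G), " rank(H) =", rank(H))
print("rank(G^* G) =", rank(gram))
print("eigenvalues of G^* G:", eigenvalues_2x2(gram))
```

Because `G` has full column rank, $G^*G$ has rank 2 and is therefore invertible, which matches facts 3 and 4 in the list above.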

A.2.1 Diagonalization

Sometimes A can be related to a diagonal matrix D by $A = S^{-1} D S$, where S is an n × n invertible matrix. If this is possible, then A is said to be diagonalizable. The matrix S can be interpreted as a change of basis that allows the matrix A to be described as a diagonal matrix. This representation is particularly nice in an n × n MIMO system, since the channel becomes n independent parallel channels. This transformation makes capacity calculation much easier. So the main point of interest is determining when A is diagonalizable. Either of the following two conditions is sufficient to guarantee that a square matrix is diagonalizable:

1. A has n distinct eigenvalues
2. A has n linearly independent eigenvectors

A.2.2 Connection To The Determinant and Trace

The following formulas are a useful connection between the eigenvalues of a matrix and its determinant and trace:

$$
\det(A) = \lambda_1 \lambda_2 \cdots \lambda_n \tag{158}
$$

$$
\mathrm{tr}(A) = \sum_{i=1}^{n} a_{ii} = \lambda_1 + \lambda_2 + \cdots + \lambda_n \tag{159}
$$

A.3 Inner Product Space

$\mathbb{C}^{n \times 1}$ is an inner product space with the inner product given by:

$$
\langle x, y \rangle = y^* x \tag{160}
$$

There are two important points of interest in regarding $\mathbb{C}^{n \times 1}$ as an inner product space. First, it is possible to perform orthogonal projections of a vector onto a space spanned by several other vectors. This can be used in V-BLAST for the ZF-Nulling receiver. The second point is the Cauchy-Schwarz inequality

$$
|\langle x, y \rangle| \le \sqrt{\langle x, x \rangle} \sqrt{\langle y, y \rangle} \tag{161}
$$

with equality for y = Kx for any constant K. The Cauchy-Schwarz inequality can be used to derive the optimal receive combining vector for MRC.

A.4 Singular Value Decomposition

It is only possible to diagonalize a square matrix, but sometimes it is desirable to decompose a matrix with arbitrary dimensions into another matrix that is almost diagonal. The singular value decomposition (SVD) achieves this and is defined for all matrices in $\mathbb{C}^{m \times n}$. Specifically, the singular value decomposition of a matrix $A \in \mathbb{C}^{m \times n}$ is

$$
A = U \Sigma V^* \tag{162}
$$

where $U \in \mathbb{C}^{m \times m}$ and $V \in \mathbb{C}^{n \times n}$ are unitary matrices, which means

$$
U U^* = U^* U = I_m \tag{163}
$$

$$
V V^* = V^* V = I_n \tag{164}
$$

$\Sigma \in \mathbb{C}^{m \times n}$ has non-zero entries only for the entries on the diagonal, $\Sigma_{ii}$. The entries $\Sigma_{ii}$ are called the singular values of A. Then it is clear that there are $n_{min} = \min\{m, n\}$ singular values. Typically the matrix Σ is constructed such that

$$
\Sigma_{11} \ge \Sigma_{22} \ge \cdots \ge \Sigma_{n_{min} n_{min}} \tag{165}
$$

Intuitively, what the SVD does is use the matrix $V^*$ to rotate an input vector to a coordinate system in which the action of the matrix can be described by a simple matrix Σ. Then the output of this simple matrix is rotated back to the original coordinate system by U to produce the output of A. One of the most important properties of the SVD is that the number of non-zero singular values is precisely the rank of the matrix A.

The concept of the SVD can be viewed as a generalization of eigenvalues. For a column of U, $u \in \mathbb{C}^m$, a corresponding column of V, $v \in \mathbb{C}^n$, and the corresponding singular value $\sigma \ge 0$,

$$
Av = \sigma u \tag{166}
$$

This equation is similar to the equation that defines an eigenvalue and eigenvector, which led to u being called a left singular vector and v a right singular vector.

A.4.1 Pseudoinverse

The inverse of a matrix is only defined for square matrices, but there is a way to define a special matrix, called the pseudoinverse, that is like an inverse but defined for arbitrary m × n matrices. Let $A \in \mathbb{C}^{m \times n}$ have an SVD $U \Sigma V^*$. Then the pseudoinverse is defined to be

$$
A^\dagger = V \Sigma^\dagger U^*
$$

where $\Sigma^\dagger$ is the transpose of Σ with the non-zero singular values inverted. There are four key properties of the pseudoinverse that define its behavior:

1. $A A^\dagger A = A$
2. $A^\dagger A A^\dagger = A^\dagger$
3. $(A A^\dagger)^* = A A^\dagger$
4. $(A^\dagger A)^* = A^\dagger A$

Note that $A A^\dagger$ is not in general the identity matrix, but the combination of three matrices produces the desired effect.
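To make the pseudoinverse construction concrete, the sketch below builds a small 3 × 2 matrix from hand-picked SVD factors and then forms $A^\dagger = V \Sigma^\dagger U^*$. It does not compute an SVD; the factors `U`, `Sigma`, and `V`, and all helper names, are assumptions chosen for illustration so that the example runs with the Python standard library alone. The four defining properties are then checked numerically.

```python
import math


def matmul(a, b):
    """Plain matrix product of two lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def conj_transpose(matrix):
    """Conjugate transpose of a matrix stored as a list of rows."""
    return [[matrix[i][j].conjugate() for i in range(len(matrix))]
            for j in range(len(matrix[0]))]


def close(a, b, tol=1e-12):
    """True if two equally sized matrices agree entry by entry within tol."""
    return all(abs(x - y) < tol
               for row_a, row_b in zip(a, b) for x, y in zip(row_a, row_b))


def sigma_pinv(sigma, tol=1e-12):
    """Transpose Sigma and invert its non-zero diagonal entries."""
    m, n = len(sigma), len(sigma[0])
    return [[1 / sigma[i][j] if i == j and abs(sigma[i][j]) > tol else 0
             for i in range(m)] for j in range(n)]


s = 1 / math.sqrt(2)
U = [[1j, 0, 0], [0, 1, 0], [0, 0, 1]]  # 3 x 3 unitary (a phase on one axis)
V = [[s, s], [s, -s]]                   # 2 x 2 unitary
Sigma = [[3, 0], [0, 1], [0, 0]]        # 3 x 2 with singular values 3 >= 1

A = matmul(matmul(U, Sigma), conj_transpose(V))                  # A = U Sigma V^*
A_pinv = matmul(matmul(V, sigma_pinv(Sigma)), conj_transpose(U)) # V Sigma^+ U^*

A_Apinv = matmul(A, A_pinv)  # 3 x 3, not the identity in general
Apinv_A = matmul(A_pinv, A)  # 2 x 2

property_1 = close(matmul(A_Apinv, A), A)            # A A+ A = A
property_2 = close(matmul(Apinv_A, A_pinv), A_pinv)  # A+ A A+ = A+
property_3 = close(conj_transpose(A_Apinv), A_Apinv) # (A A+)^* = A A+
property_4 = close(conj_transpose(Apinv_A), Apinv_A) # (A+ A)^* = A+ A

# The number of non-zero singular values equals the rank of A.
rank_from_sigma = sum(1 for i in range(min(len(Sigma), len(Sigma[0])))
                      if abs(Sigma[i][i]) > 1e-12)

print(property_1, property_2, property_3, property_4, rank_from_sigma)
```

In this example $A^\dagger A$ equals $I_2$ because A has full column rank, while $A A^\dagger$ has a zero in its last diagonal entry, which illustrates the note above that $A A^\dagger$ is not in general the identity.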

A.4.2 Condition Number

Consider solving the linear system Ax = b. The condition number is a measure of how well this system behaves for small changes in b. Specifically, the condition number measures how small changes in b change x. So for the perturbed system Ax = b + e, the condition number is given by the worst-case ratio of the relative change in x to the relative change in b:

$$
\kappa(A) = \max_{e,\, b \neq 0} \frac{\|A^{-1} e\| / \|A^{-1} b\|}{\|e\| / \|b\|} \tag{167}
$$

For the Euclidean norm, this quantity can be related to the singular values of A by

$$
\kappa(A) = \frac{\sigma_{max}}{\sigma_{min}} \tag{168}
$$

A.5 Lagrange Multipliers

Consider the following optimization problem:

Maximize: $f(x_1, x_2, \ldots, x_n)$
Subject to: $g(x_1, x_2, \ldots, x_n) = C$

The optimal solution can be found among the solutions of the following system of equations given by the gradient

$$
\nabla f = \lambda \nabla g, \qquad g = C \tag{169}
$$

These equations can be expressed in terms of partial derivatives as

$$
\frac{\partial f}{\partial x_i} = \lambda \frac{\partial g}{\partial x_i}, \quad i = 1, 2, \ldots, n, \qquad g = C \tag{170}
$$

Lagrange multipliers can be used to find the optimal power allocation to maximize capacity.

References

[3GPPRel8] 3GPP, “Technical Specification Group Radio Access Network Requirements for Further Advancements for E-UTRA (LTE-Advanced) (Release 8)”.
[3GPPRel9] 3GPP, “Overview of 3GPP Release 9 V.0.0.4 (2009-01)”.
[3GPP07] 3GPP, “Physical Channels and Modulation (Release 8)”, September 2007.
[AG05] J. Akhtar and D. Gesbert, “Spatial Multiplexing over correlated MIMO channels with a closed-form precoder”, IEEE Trans. Wireless Commun., 4(5):2400-2409, September 2005.

NY.J. on Comm. Gesbert. 234-238 [GC80] A. “Shifting the MIMO Paradigm: From Single User to Multiuser Communications”. Cover and T. January 1999.A. Wiley. March 2002. “How much feedback is multi-user diversity really worth?”. “On Limits of Wireless Communications in a Fading Environment when Using Multiple Antennas. IEEE Trans. IEEE Trans. Belﬁore. El Gamal and T. “Layered space-time architecture for wireless communication in a fading environment when using multi-element antennas”. pp. Muhamed. IEEE J. A.-S.M. 24. Gesbert and M. Alamouti. Vol 6(3). Australia. no. IEEE. Sydney. [DTB02] M. Foschini. G...J. R. Golden. 51(4):1432-1436. R. and E. Dec.Viterbo. [FG98] G. Proc.IEEE Global Telecommunications Conf. Rekaya.N. Gans. [GKHCS07] D. Theory. and R. Inform. “First. 46(6):2027-2044.E. 54 . Foschini. May 2003. Fundamentals of WiMAX [Ala98] S. Autumn 1996. Cambridge Press. Paris. France.-B.-C. Elect. 12. Bell Labs Tech.Tewﬁk.O. 1998. Theory. Cover. Theory.time communication architecture”. Chuah.W. Fleury. Proc.A. 1980 [GD03] H. 49(5):1097-1119.M. “A construction of a space-time code based on number theory”.O. Theory. 5. pages 1894-1899. Select. IEEE Int. [GFVW99] G. October 1998. “Detection algorithm and initial laboratory results using the V-BLAST space. [CKT98] C. pages 41-59. Globecom 1998 .. pp. 1466-1483. Conf.” Wireless Personal Communications. 16(10):1451-1458.H. April 2005. Ghosh. vol. “Capacity of multi-antenna array systems in indoor wireless environment”. [Fle00] B. “A simple transmit diversity technique for wireless communications”. Jr. Wireless Communications. and T. J. Goschini and M. Salzer. Andrews. vol. Belﬁore. 2005. Damen. In Proc. C.M. NewYork. “Multiple user information theory”. Elements of Information Theory. 36-46. March 1998. Chae. Kahn. Inform. 48(3):753-760.C.[AGM05] J. Tse. (ICC). 2007 [Gold05] Andrea Goldsmith.Valenzuela. Kountouris. “Universal space-time coding”. Areas Commun. Thomas. Oct. W.J. volume 4. Damen. Lett. and D. 35(1):14-15.. 
“The golden code: a 22 full-rate space-time code with non-vanishing determinants”. 68. Gamal and M. [Fos96] G. IEEE Trans. G. [GA04] D. and J.J. [BRV05] J. Heath.and second-order characterization of direction dispersion and space selectivity in the radio channel”. June 2000.. IEEE Trans. Inform.D. no. J. pp. Inform. Alouini. and P. [CT91] T.Wolniansky. June 2004. M. IEEE Signal Processing Magazine. 1991.

[God97] L. C. Godara, “Applications of antenna arrays to mobile communications, part II: Beamforming and direction-of-arrival considerations”, Proceedings IEEE, 85(8):1195-1245, August 1997.
[GSP02] D. Gore, S. Sandhu, and A. Paulraj, “Delay diversity codes for frequency selective channels”, In Proc. ICC 2002 - IEEE Int. Conf. Commun., pages 1949-1953, New York, May 2002.
[HH01] B. Hassibi and B. Hochwald, “High-rate linear space-time codes”, In Proc. ICASSP 2001 - IEEE Int. Conf. Acoust. Speech and Signal Processing, volume 4, pages 2461-2464, Salt Lake City, UT, May 2001.
[Hea01] R. Heath, “Space-Time Signaling in Multi-Antenna Systems”, PhD thesis, Stanford University, November 2001.
[Her04] M. Herdin, “Non-stationary indoor MIMO radio channels”, PhD thesis, Technische Universitat Wien, August 2004.
[HJ95] R. Horn and C. Johnson, Topics in matrix analysis, Cambridge University Press, Cambridge, UK, 1995.
[HPS05] B. M. Hochwald, C. B. Peel, and A. L. Swindlehurst, “A vector-perturbation technique for near capacity multiantenna multiuser communication - part II: perturbation”, IEEE Trans. Comm., vol. 53, no. 3, pp. 537-544, March 2005.
[HV05] B. Hassibi and H. Vikalo, “On the sphere decoding algorithm: Part I, the expected complexity”, IEEE Transactions on Signal Processing, vol. 53, pp. 2806-2818, Aug. 2005.
[IS04] IEEE Standard 802.16-2004.
[IS05] IEEE Standard 802.16e-2005, Amendment to IEEE Standard for Local and Metropolitan Area Networks - Part 16: Air Interface for Fixed Broadband Wireless Access System.
[IWG04] IEEE 802.16 Working Group, Part 16: Air Interface for Fixed and Mobile Broadband Wireless Access Systems Amendment 2: Physical and Medium Access Control Layers for Combined Fixed and Mobile Operation in Licensed Bands, 2004.
[IWG206] IEEE 802.16 Working Group, Part 16: Air Interface for Fixed and Mobile Broadband Wireless Access Systems Amendment 2: Physical and Medium Access Control Layers for Combined Fixed and Mobile Operation in Licensed Bands, 2006.
[Jaf01] H. Jafarkhani, “A quasi orthogonal space-time block code”, IEEE Trans. Commun., 49(1):1-4, January 2001.
[Jak71] W. C. Jakes, “A Comparison of Specific Space Diversity Techniques for the Reduction of Fast Fading in UHF Mobile Radio Systems”, IEEE Trans. on Veh. Techn., Vol. VT-20, No. 4, pp. 81-93, Nov. 1971.
[Kah54] L. Kahn, “Ratio Squarer”, Proc. of IRE (Corr.), Vol. 42, pp. 1074, November 1954.

[IWG1106] IEEE 802.11 Working Group, IEEE P802.11n/D1.0 Draft Amendment to Standard for Information Technology - Telecommunications and Information Exchange Between Systems - Local and Metropolitan Networks - Specific Requirements - Part 11: Wireless LAN Medium Access Control (MAC) and Physical Layer (PHY) Specifications: Enhancements for Higher Throughput, March 2006.
[LP00] E. Lindskog and A. Paulraj, “A transmit diversity scheme for channels with intersymbol interference”, In Proc. ICC 2000 - IEEE Int. Conf. Commun., volume 1, pages 307-311, New Orleans, June 2000.
[OC06] Claude Oestges and Bruno Clerckx, MIMO Wireless Communications, Artech House Press.
[Par00] J. D. Parsons, The mobile radio propagation channel, 2nd ed., Wiley, London, UK, 2000.
[PHS05] C. B. Peel, B. M. Hochwald, and A. L. Swindlehurst, “A vector-perturbation technique for near capacity multiantenna multiuser communication - part I: channel inversion and regularization”, IEEE Trans. Comm., vol. 53, no. 1, pp. 195-202, Jan. 2005.
[PNG03] A. Paulraj, R. Nabar, and D. Gore, Introduction to Space-Time Wireless Communications, Cambridge University Press, Cambridge, UK, 2003.
[Pro01] J. G. Proakis, Digital communications, 4th ed., McGraw-Hill, New York, NY, 2001.
[Rapp02] T. S. Rappaport, Wireless Communications: Principles and Practices, Prentice Hall.
[SA00] M. K. Simon and M.-S. Alouini, “Digital communications over fading channels: a unified approach to performance analysis”, Wiley, New York, NY, 2000.
[San02] S. Sandhu, “Signal Design for Multiple-Input Multiple-Output Wireless: A Unified Perspective”, PhD thesis, Stanford University, August 2002.
[SB07] M. Sharif and B. Hassibi, “A comparison of time-sharing, DPC, and beamforming for MIMO broadcast channels with many users”, IEEE Trans. Comm., vol. 55, no. 1, pp. 11-15, Jan. 2007.
[Sim01] M. K. Simon, “Evaluation of average bit error probability for space-time coding based on a simpler exact evaluation of pairwise error probability”, Journal of Communications and Networks, 3(3):257-264, September 2001.
[SMB01] M. Steinbauer, A. F. Molisch, and E. Bonek, “The double-directional radio channel”, IEEE Antennas Propagat. Mag., 43(4):51-63, August 2001.
[SP03] N. Sharma and C. Papadias, “Improved quasi-orthogonal codes through constellation rotation”, IEEE Trans. Commun., 51(3):332-335, March 2003.
[SSH04] Q. H. Spencer, A. L. Swindlehurst, and M. Haardt, “Zero-forcing methods for downlink spatial multiplexing in multiuser MIMO channels”, IEEE Trans. Sig. Proc., vol. 52, no. 2, pp. 462-471, Feb. 2004.

[SSRS03] B. A. Sethuraman, B. Sundar Rajan, and V. Shashidhar, “Full-diversity, high-rate, space-time block codes from division algebras”, IEEE Trans. Inform. Theory, 49(10):2596-2616, October 2003.
[SW94] N. Seshadri and J. H. Winters, “Two signaling schemes for improving the error performance of frequency-division-duplex (FDD) transmission systems using transmitter antenna diversity”, Int. J. Wireless Information Networks, 1:49-60, 1994.
[SX04] W. Su and X.-G. Xia, “Signal constellations for quasi-orthogonal space-time block codes with full diversity”, IEEE Trans. Inform. Theory, 50(10):2331-2347, October 2004.
[TBH00] O. Tirkkonen, A. Boariu, and A. Hottinen, “Minimal non-orthogonality rate 1 space-time block code for 3+ Tx antennas”, In IEEE 6th Int. Symp. on Spread-Spectrum Tech. and Appl. (ISSSTA 2000), pages 429-432, September 2000.
[Tel95] E. Telatar, “Capacity of multiantenna Gaussian channels”, Tech. Rep., AT&T Bell Labs, 1995.
[TJC99] V. Tarokh, H. Jafarkhani, and A. R. Calderbank, “Space-time block codes from orthogonal designs”, IEEE Trans. Inform. Theory, 45(7):1456-1467, July 1999.
[TSC98] V. Tarokh, N. Seshadri, and A. R. Calderbank, “Space-time codes for high data rate wireless communication: Performance criterion and code construction”, IEEE Trans. Inform. Theory, 44(3):744-765, March 1998.
[TV05] D. Tse and P. Viswanath, Fundamentals of wireless communication, Cambridge University Press, Cambridge, UK, 2005.
[VB99] E. Viterbo and J. Boutros, “A universal lattice code decoder for fading channels”, IEEE Trans. Inform. Theory, 45(5):1639-1642, July 1999.
[Wit93] A. Wittneben, “A new bandwidth efficient transmit antenna modulation diversity scheme for linear digital modulation”, In Proc. ICC 1993 - IEEE Int. Conf. Commun., pages 1630-1633, 1993.
[WX05] D. Wang and X.-G. Xia, “Optimal diversity product rotations for quasi-orthogonal STBC with MPSK symbols”, IEEE Commun. Lett., 9(5):420-422, May 2005.
[XL05] L. Xian and H. Liu, “Optimal rotation angles for quasi-orthogonal space-time codes with PSK modulation”, IEEE Commun. Lett., 9(8):676-678, August 2005.
[Yac93] M. D. Yacoub, Foundation of mobile radio engineering, CRC Press, Boca Raton, FL, 1993.
[YW03] H. Yao and G. W. Wornell, “Structured space-time block codes with optimal diversity-multiplexing tradeoff and minimum delay”, In Proc. Globecom 2003 - IEEE Global Telecommunications Conf., volume 4, pages 1941-1945, San Francisco, CA, December 2003.
[ZT03] L. Zheng and D. Tse, “Diversity and multiplexing: a fundamental tradeoff in multiple-antenna channels”, IEEE Trans. Inform. Theory, 49(5):1073-1096, May 2003.