A Tutorial on MIMO

Craig Wilson
EE381K-11: Wireless Communications
Spring 2009
May 9, 2009
Contents
1 Introduction
2 Benefits of MIMO
2.1 Diversity
2.1.1 Union Bound on Probability of Error
2.1.2 Outage Probability
2.2 Spatial Multiplexing
3 Basic Schemes for Multiple Antennas
3.1 Channel Models
3.2 Scalar Rayleigh Channel
3.3 Maximal Ratio Combining
3.4 Selection Combining
3.5 Equal Gain Combining
3.6 Transmit Maximal Ratio Combining
3.7 Alamouti Code
4 MIMO Channel Modeling and Capacity
4.1 Narrowband MIMO Channel
4.1.1 Narrowband MIMO Channel Capacity
4.1.2 Rank and Condition Number
4.2 Physical Modeling of MIMO Channels
4.2.1 LOS SIMO and MISO Channel
4.2.2 LOS MIMO
4.2.3 Geographically Separated MIMO
4.2.4 Two-Ray MIMO
4.3 Statistical Modeling of MIMO Channels
4.3.1 Frequency Selective MIMO Channel
5 Diversity-Multiplexing Tradeoff
5.1 Scalar Rayleigh Channel
5.1.1 QAM over the Scalar Rayleigh Channel
5.2 MISO Rayleigh Channel
5.3 MIMO Rayleigh Channel
6 Space-Time Coding over Narrowband Channels
6.1 Error Motivated Design
6.2 Space-Time Block Codes
6.2.1 Linear STBCs
6.3 Bell Labs Space Time Architectures
6.3.1 V-BLAST
6.3.2 D-BLAST
6.4 Space-Time Trellis Codes
6.4.1 Trellis Representation
6.4.2 Delay-Diversity Scheme
7 Space-Time Coding for Frequency Selective Channels
7.1 Single Carrier
7.2 MIMO-OFDM
7.2.1 OFDM
7.2.2 Extension to MIMO-OFDM
7.2.3 Space-Frequency Coded MIMO-OFDM
7.2.4 Space-Time Coded MIMO-OFDM
7.2.5 Space-Time Frequency Coded MIMO-OFDM
8 Multiuser MIMO
8.1 Precoding
8.1.1 Linear Precoding
8.1.2 Nonlinear Precoding
8.2 Scheduling
8.3 Working with Partial CSIT
9 MIMO in Wireless Standards
9.1 3GPP LTE
9.2 WiMAX
9.3 802.11n
10 Conclusion
A Math Review
A.1 Rank
A.2 Eigenvalues and Eigenvectors
A.2.1 Diagonalization
A.2.2 Connection To The Determinant and Trace
A.3 Inner Product Space
A.4 Singular Value Decomposition
A.4.1 Pseudoinverse
A.4.2 Condition Number
A.5 Lagrange Multipliers
References
1 Introduction
Wireless systems face several challenges including demands for higher data rates, better quality
of service, and increased network capacity while working with limited amounts of spectrum.
Multiple Input and Multiple Output (MIMO) wireless communication systems have become
a hot research topic because they promise to deal with all of these issues by providing both
increased resilience to fading and increased capacity without using more bandwidth or power.
Methods to take advantage of multiple antennas at the receiver or the transmitter have been known
since the 1950s. Early methods provided spatial diversity to improve error performance
and beamforming to increase SNR by focusing the energy from an antenna into a
desired direction. In the 1990s MIMO systems with multiple antennas at both the transmitter
and receiver were proposed. Instead of using diversity only to combat fading, MIMO systems
actively exploit multipath. One of the early seminal works in MIMO was
Telatar's paper, which demonstrated the potential for improved capacity with no extra spectrum
[Tel95]. Around the same time Bell Labs developed the BLAST architectures, which achieved
high spectral efficiency on the order of 10-20 bits/s/Hz [Fos96], and the
first space-time coding methods were proposed [TSC98]. In the 2000s MIMO has continued to
be developed, and there are now plans to implement MIMO in several new wireless standards
such as 802.11n, WiMAX, and LTE.
This tutorial paper focuses on the following major topics in MIMO:
1. MIMO Channel Modeling and Capacity
2. Diversity-Multiplexing Tradeoff
3. Space-Time Coding and Architectures
4. Space-Time Coding in Frequency Selective Channels
5. Multi-User MIMO and Applications
6. MIMO in Wireless Standards
2 Benefits of MIMO
The two major benefits of MIMO are diversity gain, increased resilience to fading in the form
of better error performance, and multiplexing gain, increased rate of transmission by exploiting
the increased degrees of freedom offered by the spatial MIMO channel. The figure below shows
a simple MIMO setup with $n_t$ transmit antennas and $n_r$ receive antennas.
Figure 1: MIMO System Concept [Gold05]
2.1 Diversity
Diversity is an attempt to exploit redundancies in the way information is sent to achieve better
error performance by cleverly using multiple copies of the same signal. Three fundamental
types of diversity are time, frequency, and antenna diversity. Time diversity involves averaging
the fading effects of the channel over time. The simplest example is the repetition code, which
transmits the same symbol multiple times with the transmissions separated by more than the coherence
time of the channel. The receiver decodes each symbol independently and estimates the
transmitted symbol by majority rule. Frequency diversity exploits the variations in a frequency
selective channel. For example, orthogonal frequency division multiplexing (OFDM) can adapt
the modulation order of each subcarrier to the quality of the corresponding subchannel.
Finally, there are several different types of antenna diversity. The most obvious type is to simply
use multiple antennas. This type of antenna diversity is one of the main focuses of this paper.
The second type of antenna diversity is to use multiple antennas with different polarizations.
The third type of antenna diversity is to use multiple antennas with different non-overlapping
beam patterns.
It is generally of interest to quantify exactly how much diversity a given scheme provides. This
can be done through calculating either the average probability of error or the outage probability.
Both of these expressions can generally be approximated as
$$\mathrm{SNR}^{-L} \tag{1}$$
at high SNR. $L$ is the diversity gain. This diversity gain can be more rigorously defined as
$$L = -\lim_{\mathrm{SNR} \to \infty} \frac{\log(P_e)}{\log(\mathrm{SNR})}$$
This is just a formalization of the intuition above that replaces "at high SNR" with a limit. For
the case of $n_t \times n_r$ narrowband MIMO the maximum possible diversity gain is $n_t n_r$, which is
the maximum number of independent copies of the same signal that the receiver sees.
2.1.1 Union Bound on Probability of Error
It can be difficult to calculate an exact expression for the probability of error for an arbitrary
modulation, so it is useful to calculate an upper bound on the probability of error. Consider an
arbitrary constellation $\mathcal{C}$ containing $M$ points. Write the constellation as
$$\mathcal{C} = \{c_1, c_2, \ldots, c_M\} \tag{2}$$
Let $P_e$ be the probability of symbol error. Let $P_{e|c_i}$ be the probability of symbol error given
that $c_i \in \mathcal{C}$ was sent. Assuming all symbols are equally likely,
$$P_e = \frac{1}{M} \sum_{m=1}^{M} P_{e|c_m} \tag{3}$$
The conditional probability of symbol error can be expanded as
$$\begin{aligned}
P_{e|c_m} &= P[c_m \text{ is detected incorrectly} \mid c_m \text{ was transmitted}] \\
&= \sum_{\substack{l=1 \\ l \neq m}}^{M} P[c_m \text{ is estimated as } c_l \mid c_m \text{ was transmitted}]
\end{aligned}$$
Computing each of these probabilities is difficult and requires integration over a possibly complicated
Voronoi region specific to each type of modulation. A simplifying approximation is the
pairwise error probability (PEP), in which it is assumed for the purposes of calculation that only
$c_m$ and $c_l$ are in the constellation. This is denoted $P[c_m \to c_l]$. For complex AWGN
$$P[c_m \to c_l] = Q\left(\sqrt{\frac{E_x}{N_o} \frac{\|c_m - c_l\|^2}{2}}\right) \tag{4}$$
The equations that follow write the transmit power $P$ in place of $E_x$. The PEP overestimates
the probability of decoding $c_m$ as $c_l$, so
$$P_{e|c_m} \leq \sum_{\substack{l=1 \\ l \neq m}}^{M} Q\left(\sqrt{\frac{P}{N_o} \frac{\|c_m - c_l\|^2}{2}}\right) \tag{5}$$
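Then let $d_{\min}^2$ be the square of the minimum distance between points in the constellation $\mathcal{C}$.
Then
$$\begin{aligned}
P_e &\leq \frac{1}{M} \sum_{m=1}^{M} \sum_{\substack{l=1 \\ l \neq m}}^{M} Q\left(\sqrt{\frac{P}{N_o} \frac{\|c_m - c_l\|^2}{2}}\right) \\
&\leq \frac{1}{M} \sum_{m=1}^{M} \sum_{\substack{l=1 \\ l \neq m}}^{M} Q\left(\sqrt{\frac{P}{N_o} \frac{d_{\min}^2}{2}}\right) \\
&= \frac{1}{M} \sum_{m=1}^{M} (M-1)\, Q\left(\sqrt{\frac{P}{N_o} \frac{d_{\min}^2}{2}}\right) \\
&= (M-1)\, Q\left(\sqrt{\frac{P}{N_o} \frac{d_{\min}^2}{2}}\right)
\end{aligned} \tag{6}$$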
The Chernoff bound on the Q-function is
$$Q(x) \leq \frac{1}{2} e^{-x^2/2} \tag{7}$$
So then the probability of error can be bounded as
$$P_e \leq \frac{M-1}{2}\, e^{-\frac{P}{N_o} \frac{d_{\min}^2}{4}} \tag{8}$$
This bound is very useful in calculating the diversity gain for simple multiple antenna systems.
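The chain of bounds in (6) through (8) can be checked numerically. The following Python sketch is only an illustration: the unit-energy QPSK constellation, the value of $P/N_o$, and all function names are assumptions chosen for the example and do not come from the text.

```python
import math

def q_function(x):
    """Gaussian tail probability Q(x) for a standard normal variable."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))

def min_distance_squared(points):
    """Squared minimum distance d_min^2 between distinct constellation points."""
    return min(abs(a - b) ** 2
               for i, a in enumerate(points)
               for j, b in enumerate(points) if i != j)

def union_bound(points, snr):
    """First line of (6): average over m of the summed pairwise error probabilities."""
    total = 0.0
    for m, c_m in enumerate(points):
        for l, c_l in enumerate(points):
            if l != m:
                total += q_function(math.sqrt(snr * abs(c_m - c_l) ** 2 / 2.0))
    return total / len(points)

def nearest_neighbour_bound(points, snr):
    """Last line of (6): (M - 1) Q(sqrt(snr * d_min^2 / 2))."""
    d2 = min_distance_squared(points)
    return (len(points) - 1) * q_function(math.sqrt(snr * d2 / 2.0))

def chernoff_bound(points, snr):
    """Equation (8): (M - 1)/2 * exp(-snr * d_min^2 / 4)."""
    d2 = min_distance_squared(points)
    return (len(points) - 1) / 2.0 * math.exp(-snr * d2 / 4.0)

# Assumed example: unit-energy QPSK with P/N_o = 10 (that is, 10 dB).
qpsk = [complex(re, im) / math.sqrt(2.0) for re in (1, -1) for im in (1, -1)]
snr = 10.0
print(union_bound(qpsk, snr),
      nearest_neighbour_bound(qpsk, snr),
      chernoff_bound(qpsk, snr))
```

Each printed value should be no larger than the next, mirroring the order in which the bounds are loosened in the derivation.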
2.1.2 Outage Probability
Formally the channel is in outage if the rate of transmission exceeds the channel capacity. The
outage probability is the probability that this situation occurs: P [C < R]. From expressions
for outage probability one can also find the diversity gain in a manner similar to the average
probability of error method. For the typical Gaussian memoryless channel the channel capacity
is $C = B \log_2(1 + \mathrm{SNR})$. Thus
$$P[B \log_2(1 + \mathrm{SNR}) < R] = P\left[\mathrm{SNR} < 2^{R/B} - 1\right]$$
Generally for other channels the outage condition reduces to the SNR being below a certain
threshold $\gamma$. Thus the diversity gain can generally be found from the probability:
$$P[\mathrm{SNR} < \gamma]$$
2.2 Spatial Multiplexing
Besides providing diversity gain and improved error performance, MIMO can also provide increased
data rates and spectral efficiency through spatial multiplexing. To see how MIMO
achieves this, first consider QAM. The transmitted signal can be expressed as
$$x_n(t) = a_n(t) \cos(2\pi f_c t) - b_n(t) \sin(2\pi f_c t) \tag{9}$$
assuming the appropriate normalizations have been made. This system has two real degrees of
freedom (1 complex degree of freedom) because independent streams of bits could be transmitted
on the cosine and sine terms. In practice, however, the two independent streams usually come
from one original stream, which is demultiplexed by the symbol mapping operation into two
streams, which are placed on the cosine and sine terms.
Fundamentally MIMO provides increased rates in a similar way by providing even more degrees
of freedom. The degrees of freedom in MIMO come from the multiple antennas transmitting
independent streams. In the $4 \times 4$ MIMO case, for example, each transmit antenna can send
an independent stream, which will be received by all four antennas simultaneously, so there
are four degrees of freedom. The maximum possible number of complex degrees of freedom for MIMO is
$\min\{n_t, n_r\}$.
3 Basic Schemes for Multiple Antennas
Now consider a few basic multiple-antenna schemes that can provide diversity. The major
assumption underlying each scheme is whether the receiver has channel information (CSIR)
or the transmitter has channel information (CSIT). CSIR is a common assumption and
can be achieved through several estimation methods. CSIT is trickier to achieve: in FDD the receiver
must estimate the channel and feed the estimate back to the transmitter through a feedback
channel, while in TDD the transmitter must assume that it sees the same channel as the receiver.
Feedback entails a cost in terms of lost capacity and bandwidth.
3.1 Channel Models
The channel model for these basic MIMO schemes is a simple extension of the scalar Rayleigh
channel. The channels are now modeled as complex Gaussian vectors with $\mathcal{CN}(0, \mathbf{I})$ distribution.
This channel model is justified in terms of physical propagation models in the next section.
3.2 Scalar Rayleigh Channel
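For comparison consider the scalar Rayleigh channel
$$y[n] = h x[n] + v[n] \tag{10}$$
with $h \sim \mathcal{CN}(0, 1)$ and $v[n] \sim \mathcal{CN}(0, N_o)$.
For a complex Gaussian vector $\mathbf{h}$ of length $m$ with correlation matrix $\mathbf{R}$ the pdf
is given by
$$f_{\mathbf{h}}(\mathbf{h}) = \frac{1}{\pi^m |\det \mathbf{R}|}\, e^{-\mathbf{h}^* \mathbf{R}^{-1} \mathbf{h}} \tag{11}$$
Also by the definition of the pdf
$$\int \cdots \int \frac{1}{\pi^m |\det \mathbf{R}|}\, e^{-\mathbf{h}^* \mathbf{R}^{-1} \mathbf{h}}\, d\mathbf{h} = 1 \tag{12}$$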
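Then the average probability of error for the Rayleigh channel is
$$\begin{aligned}
P_e &\leq E\left[\frac{M-1}{2}\, e^{-h^* h \frac{P}{N_o} \frac{d_{\min}^2}{4}}\right] \\
&= \frac{M-1}{2} \int \frac{1}{\pi}\, e^{-h^* h \frac{P}{N_o} \frac{d_{\min}^2}{4}}\, e^{-h^* h}\, dh \\
&= \frac{M-1}{2} \int \frac{1}{\pi}\, e^{-h^* \left[\left(1 + \frac{P}{N_o} \frac{d_{\min}^2}{4}\right)^{-1}\right]^{-1} h}\, dh \\
&= \frac{M-1}{2} \left(1 + \frac{P}{N_o} \frac{d_{\min}^2}{4}\right)^{-1} \int \frac{1}{\pi \left(1 + \frac{P}{N_o} \frac{d_{\min}^2}{4}\right)^{-1}}\, e^{-h^* \left[\left(1 + \frac{P}{N_o} \frac{d_{\min}^2}{4}\right)^{-1}\right]^{-1} h}\, dh \\
&= \frac{M-1}{2 \left(1 + \frac{P}{N_o} \frac{d_{\min}^2}{4}\right)}
\end{aligned} \tag{13}$$
The last step uses (12) with $m = 1$ and $R = \left(1 + \frac{P}{N_o} \frac{d_{\min}^2}{4}\right)^{-1}$, so the remaining integral equals 1.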
At high SNR
$$P_e \approx \mathrm{SNR}^{-1} \tag{14}$$
This corresponds to a diversity gain of 1, which is to be expected as there is only one copy of
the signal.
3.3 Maximal Ratio Combining
Consider a system with a single transmit antenna and $n_r$ receive antennas - the Single-In
Multiple-Out (SIMO) case. This system can be modeled as
$$\mathbf{y}[n] = \mathbf{h} x[n] + \mathbf{v}[n] \tag{15}$$
This model is basically an extension of the scalar Rayleigh channel to the vector iid Rayleigh
channel. The receiver must take the received parallel signals and estimate the transmitted symbol.
In MRC this is done with a weighted summation of the received branches performed by a complex
row vector $\mathbf{q}$.
$$z[n] = \mathbf{q} \mathbf{h} x[n] + \mathbf{q} \mathbf{v}[n] \tag{16}$$
In this case the SNR can be calculated and bounded with the Cauchy-Schwarz inequality.
$$\mathrm{SNR} = \frac{|\mathbf{q} \mathbf{h}|^2 P}{\|\mathbf{q}\|^2 N_o} \leq \frac{\|\mathbf{q}\|^2 \|\mathbf{h}\|^2 P}{\|\mathbf{q}\|^2 N_o} = \frac{\|\mathbf{h}\|^2 P}{N_o} \tag{17}$$
Equality holds when $\mathbf{q}$ is proportional to $\mathbf{h}^*$, so the optimal choice for $\mathbf{q}$ is $\mathbf{q} = \mathbf{h}^*$,
which achieves the maximum SNR [Kah54]. This is
effectively a matched filter. Multiplying by the conjugate of the channel co-phases the signals
and then weights the branches by the channel amplitude [Rapp02]. This kind of action is
similar to the RAKE receiver for CDMA. Now to calculate the diversity gain of MRC through
the average probability of error with the union upper bound consider:
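$$P_e \leq E\left[\frac{M-1}{2}\, e^{-\|\mathbf{h}\|^2 \frac{P}{N_o} \frac{d_{\min}^2}{4}}\right] = E\left[\frac{M-1}{2}\, e^{-\mathbf{h}^* \mathbf{h} \frac{P}{N_o} \frac{d_{\min}^2}{4}}\right]$$
Then since $\mathbf{R} = \mathbf{I}$,
$$\begin{aligned}
P_e &\leq \frac{M-1}{2} \int \cdots \int \frac{1}{\pi^{n_r}}\, e^{-\mathbf{h}^* \mathbf{h} \frac{P}{N_o} \frac{d_{\min}^2}{4}}\, e^{-\mathbf{h}^* \mathbf{h}}\, d\mathbf{h} \\
&= \frac{M-1}{2} \int \cdots \int \frac{\left(\frac{P}{4 N_o} d_{\min}^2 + 1\right)^{-n_r}}{\pi^{n_r} \left(\frac{P}{4 N_o} d_{\min}^2 + 1\right)^{-n_r}}\, e^{-\mathbf{h}^* \left[\left(\frac{P}{4 N_o} d_{\min}^2 + 1\right)^{-1} \mathbf{I}\right]^{-1} \mathbf{h}}\, d\mathbf{h} \\
&= \frac{M-1}{2 \left(\frac{P}{4 N_o} d_{\min}^2 + 1\right)^{n_r}}
\end{aligned} \tag{18}$$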
Then at high SNR
$$P_e \approx \mathrm{SNR}^{-n_r} \tag{19}$$
From this calculation it is evident that the diversity gain is $n_r$ - the number of receive antennas
and also the number of copies of the symbol that the receiver sees.
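The expectation evaluated in (18) can be checked by simulation. The sketch below is illustrative only: it writes `a` as a hypothetical shorthand for $\frac{P}{4 N_o} d_{\min}^2$, drops the constant factor $\frac{M-1}{2}$, and compares a Monte Carlo average of $e^{-a \|\mathbf{h}\|^2}$ over iid $\mathcal{CN}(0, 1)$ channels with the closed form $(1 + a)^{-n_r}$. The function names, sample sizes, and seed are assumptions made for the example.

```python
import numpy as np

def mrc_average_monte_carlo(n_r, a, trials=200_000, seed=0):
    """Monte Carlo estimate of E[exp(-a * ||h||^2)] for h with iid CN(0, 1) entries."""
    rng = np.random.default_rng(seed)
    h = (rng.standard_normal((trials, n_r))
         + 1j * rng.standard_normal((trials, n_r))) / np.sqrt(2.0)
    gain = np.sum(np.abs(h) ** 2, axis=1)      # ||h||^2 for each channel draw
    return float(np.mean(np.exp(-a * gain)))

def mrc_average_closed_form(n_r, a):
    """Closed form behind (18), without the (M - 1)/2 factor: (1 + a)^(-n_r)."""
    return (1.0 + a) ** (-n_r)

def diversity_slope(n_r, a_low=1e3, a_high=1e4):
    """Negative log-log slope of the closed form between two large values of a."""
    num = np.log(mrc_average_closed_form(n_r, a_high) / mrc_average_closed_form(n_r, a_low))
    return float(-num / np.log(a_high / a_low))

for n_r in (1, 2, 4):
    print(n_r,
          mrc_average_monte_carlo(n_r, a=2.0),
          mrc_average_closed_form(n_r, a=2.0),
          diversity_slope(n_r))
```

Since `a` is proportional to the SNR, the slope printed in the last column approaches $n_r$, which is the diversity gain read off from (19).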
3.4 Selection Combining
This method has the same general setup as MRC, but the receiver selects the receive
antenna with the largest $|h_i|$ as opposed to combining the signals from all antennas [Jak71]. This
method can achieve the same diversity gain as MRC - $n_r$. Assuming each branch has amplitude
$s_k$, then as outlined in [SA00]
$$P[\max\{s_1, s_2, \ldots, s_{n_r}\} < S] = \left(1 - e^{-S^2}\right)^{n_r} \tag{20}$$
So the pdf of $s_{\max}$ is given by
$$p_{s_{\max}}(S) = n_r\, 2 S e^{-S^2} \left(1 - e^{-S^2}\right)^{n_r - 1} \tag{21}$$
Then the average received SNR is
$$\mathrm{SNR} = P \sum_{i=1}^{n_r} \frac{1}{i} \tag{22}$$
It is obvious from this equation that increasing the number of receive antennas provides a
diminishing return. Most of the gain comes from going from one receive antenna to two and
three receive antennas.
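The harmonic sum in (22) and its diminishing return can be illustrated with a short simulation. This is a sketch under stated assumptions: each branch is an independent $\mathcal{CN}(0, 1)$ tap, so $s_k^2 = |h_k|^2$ has unit mean as implied by (20), and the result is normalised so that a single branch gives a value of 1. The function names and sample sizes are hypothetical.

```python
import numpy as np

def harmonic_sum(n_r):
    """Sum_{i=1}^{n_r} 1/i, the factor multiplying P in (22)."""
    return sum(1.0 / i for i in range(1, n_r + 1))

def simulated_selection_gain(n_r, trials=200_000, seed=0):
    """Average of max_k |h_k|^2 over independent CN(0, 1) branches."""
    rng = np.random.default_rng(seed)
    h = (rng.standard_normal((trials, n_r))
         + 1j * rng.standard_normal((trials, n_r))) / np.sqrt(2.0)
    best = np.max(np.abs(h) ** 2, axis=1)      # power of the selected branch
    return float(np.mean(best))

for n_r in (1, 2, 3, 4, 8):
    print(n_r, harmonic_sum(n_r), simulated_selection_gain(n_r))
```

Going from one to two branches adds 1/2 to the normalised average SNR, while going from seven to eight adds only 1/8.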
3.5 Equal Gain Combining
The branches from each antenna are first co-phased to cancel out the phase of the channel and
then they are simply added together to produce the output. If the channel tap is $h_i = \alpha_i e^{j\theta_i}$, then
the co-phasing operation is simply a multiplication of each branch by $e^{-j\theta_i}$. Equal gain combining
produces performance similar to MRC and achieves the full diversity gain as demonstrated in
[Yac93], but with a 1-3 dB penalty depending on the exact setup and number of antennas.
3.6 Transmit Maximal Ratio Combining
We have considered the SIMO case and now it is time to consider the Multiple-In Single-Out
(MISO) case with n
t
transmit antennas. With multiple transmit antennas it is important to
keep the total transmit power P constant to allow a fair comparison to the cases with only one
transmit antenna. The question now is if any diversity gain can be achieved and if so how? A
first attempt to achieve diversity gain with multiple transmit antennas is to simply transmit the
same symbol on each branch. However, this method does not achieve any diversity gain. To get
an intuitive explanation for this one can consider the effective channel that the receiver sees to
be
$$\frac{h_1 + h_2 + \cdots + h_{n_t}}{\sqrt{n_t}} \sim \mathcal{CN}(0, 1) \tag{23}$$
The $\frac{1}{\sqrt{n_t}}$ factor normalizes the transmit power. This effective channel behaves like a scalar Rayleigh
channel, so it provides no diversity gain beyond the scalar Rayleigh channel.
An approach that does work is Transmit Maximal Ratio Combining (TMRC), which requires
CSIT and is a close analog of MRC. The system can be modeled as
$$y = \mathbf{h} \mathbf{x} + v \tag{24}$$
A weighting vector $\mathbf{q}$ sends a weighted version of the current symbol $x$ to each antenna. So then
$$y = \mathbf{h} \mathbf{q} x + v \tag{25}$$
Then by a derivation similar to MRC it can be shown that the optimal choice for $\mathbf{q}$ is $\mathbf{q} = \mathbf{h}^*$
[God97]. The diversity gain for this scheme is $n_t$.
3.7 Alamouti Code
Using TMRC requires CSIT, which entails a host of other problems including delay issues and
channel estimation accuracy issues. However, Alamouti showed in [Ala98] that in the $2 \times n_r$
case it is possible to achieve the full diversity, $2 n_r$, without CSIT using a clever transmit scheme
with minimal drawbacks. To get the idea of the Alamouti code consider the $2 \times 1$ case. For a
narrowband channel the system can be modeled as
$$y[n] = h_1 x_1[n] + h_2 x_2[n] + v[n] \tag{26}$$
with $h_1$ and $h_2$ the channel coefficients. To transmit two symbols $u_1$ and $u_2$ do the following
over two symbol times:
1. During the first symbol time send $x_1[n] = u_1$ and $x_2[n] = u_2$.
2. During the second symbol time send $x_1[n+1] = -u_2^*$ and $x_2[n+1] = u_1^*$.
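The system can then be written in matrix form as
$$\begin{bmatrix} y[n] & y[n+1] \end{bmatrix} = \begin{bmatrix} h_1 & h_2 \end{bmatrix} \begin{bmatrix} u_1 & -u_2^* \\ u_2 & u_1^* \end{bmatrix} + \begin{bmatrix} v[n] & v[n+1] \end{bmatrix} \tag{27}$$
The receiver is trying to detect $u_1$ and $u_2$, so it is more convenient to write the system in the
following form obtained by conjugating $y[n+1]$:
$$\begin{bmatrix} y[n] \\ y^*[n+1] \end{bmatrix} = \begin{bmatrix} h_1 & h_2 \\ h_2^* & -h_1^* \end{bmatrix} \begin{bmatrix} u_1 \\ u_2 \end{bmatrix} + \begin{bmatrix} v[n] \\ v^*[n+1] \end{bmatrix} \tag{28}$$
The two columns of the square matrix are orthogonal:
$$\begin{bmatrix} h_1^* & h_2 \\ h_2^* & -h_1 \end{bmatrix} \begin{bmatrix} h_1 & h_2 \\ h_2^* & -h_1^* \end{bmatrix} = \begin{bmatrix} |h_1|^2 + |h_2|^2 & 0 \\ 0 & |h_1|^2 + |h_2|^2 \end{bmatrix}$$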
Thus this detection problem can be decomposed into simple scalar detection problems by projecting
the received vector onto each column of the square matrix in (28). Then the received signal for
each symbol that is used for detection is
$$r_i = \|\mathbf{h}\|^2 u_i + \tilde{v}_i \tag{29}$$
with $\tilde{v}_i \sim \mathcal{CN}(0, \|\mathbf{h}\|^2 N_o)$ for a given channel realization. In detection the vector channel is decomposed into a scalar
channel for each symbol. The Alamouti code is representative of a larger class of codes called
orthogonal space-time block codes (O-STBCs) that also have easy detection due to orthogonality.
It can be shown that the diversity gain is 2. Since each symbol is transmitted twice, the
transmit power of each antenna must be reduced by 3 dB compared to the single antenna case
to normalize the total power. This power loss hurts detection but not so much as to make the
Alamouti code useless. In fact, there are some advantages to using antennas transmitting with
lower power. At lower power it is easier to find cheap amplifiers that can operate in the linear
region. Also the Alamouti code transmits two symbols over two symbol periods, so its effective
rate of transmission is the same as the original symbol rate.
Finally, the Alamouti code can be extended to the full $2 \times n_r$ case by using the same transmission
scheme as the $2 \times 1$ case and MRC. This method provides the full $2 n_r$ diversity gain.
Figure 2: Comparison of Alamouti and MRC Error Performance [OC06]
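The encoding in (27) and the orthogonal projection that leads to (29) can be sketched in a few lines of Python. This is a minimal noise-free illustration for the $2 \times 1$ case, not a full receiver: the function names, the example channel taps, and the QPSK symbols are assumptions made for the example.

```python
import numpy as np

def alamouti_encode(u1, u2):
    """Code matrix of (27): rows are transmit antennas, columns are symbol times."""
    return np.array([[u1, -np.conj(u2)],
                     [u2,  np.conj(u1)]])

def alamouti_channel(h, X, v=None):
    """Received samples [y[n], y[n+1]] = [h1 h2] X + [v[n], v[n+1]]."""
    y = np.asarray(h) @ X
    return y if v is None else y + np.asarray(v)

def alamouti_decode(h, y):
    """Project the vector of (28) onto the orthogonal columns and rescale by ||h||^2."""
    h1, h2 = h
    H_eff = np.array([[h1, h2],
                      [np.conj(h2), -np.conj(h1)]])
    y_eff = np.array([y[0], np.conj(y[1])])    # conjugate the second received sample
    r = H_eff.conj().T @ y_eff                 # r_i = ||h||^2 u_i + noise term
    return r / (abs(h1) ** 2 + abs(h2) ** 2)

# Assumed example values: arbitrary channel taps and two QPSK symbols.
h = np.array([0.8 + 0.3j, -0.5 + 1.1j])
u = np.array([1 + 1j, -1 + 1j]) / np.sqrt(2.0)
y = alamouti_channel(h, alamouti_encode(u[0], u[1]))
print(alamouti_decode(h, y))                   # recovers u when there is no noise
```

With noise added, each entry of the decoder output would be passed to an ordinary scalar detector for the constellation in use.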
4 MIMO Channel Modeling and Capacity
In this section we will consider several MIMO channels and the physical meaning behind these
channels. Of particular interest is how the structure of a MIMO channel suggests the gains of
MIMO. For all MIMO channels we will assume the rate of transmission is high enough that the
channel will be slow fading, which is a reasonable assumption in any modern high speed wireless
system.
4.1 Narrowband MIMO Channel
First, consider the narrowband MIMO channel in which the channel is modeled as a single
complex coefficient $h_{ij}$ between the $j$th transmit antenna and the $i$th receive antenna [Gold05].
In this case the system can be modeled with matrices as $\mathbf{y} = \mathbf{H} \mathbf{x} + \mathbf{v}$. This can be written as
$$\begin{bmatrix} y_1 \\ \vdots \\ y_{n_r} \end{bmatrix} = \begin{bmatrix} h_{11} & \cdots & h_{1 n_t} \\ \vdots & \ddots & \vdots \\ h_{n_r 1} & \cdots & h_{n_r n_t} \end{bmatrix} \begin{bmatrix} x_1 \\ \vdots \\ x_{n_t} \end{bmatrix} + \begin{bmatrix} v_1 \\ \vdots \\ v_{n_r} \end{bmatrix} \tag{30}$$
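with $v_i \sim \mathcal{CN}(0, N_o)$. This is a nice mathematical formulation, but it offers little insight into
what constitutes desirable properties for $\mathbf{H}$. The singular value decomposition (SVD), however,
can provide the desired insight. The SVD of $\mathbf{H}$ is
$$\mathbf{H} = \mathbf{U} \mathbf{\Sigma} \mathbf{V}^* \tag{31}$$
with $\mathbf{U} \in \mathbb{C}^{n_r \times n_r}$, $\mathbf{V} \in \mathbb{C}^{n_t \times n_t}$, and $\mathbf{\Sigma} \in \mathbb{R}^{n_r \times n_t}$. Both $\mathbf{U}$ and $\mathbf{V}$ are unitary, which means
$$\mathbf{U} \mathbf{U}^* = \mathbf{U}^* \mathbf{U} = \mathbf{I}_{n_r} \tag{32}$$
$$\mathbf{V} \mathbf{V}^* = \mathbf{V}^* \mathbf{V} = \mathbf{I}_{n_t} \tag{33}$$
Then the system becomes
$$\mathbf{y} = (\mathbf{U} \mathbf{\Sigma} \mathbf{V}^*) \mathbf{x} + \mathbf{v} \tag{34}$$
Define $\tilde{\mathbf{y}} = \mathbf{U}^* \mathbf{y}$, $\tilde{\mathbf{x}} = \mathbf{V}^* \mathbf{x}$, and $\tilde{\mathbf{v}} = \mathbf{U}^* \mathbf{v}$. Then
$$\tilde{\mathbf{y}} = \mathbf{\Sigma} \tilde{\mathbf{x}} + \tilde{\mathbf{v}} \tag{35}$$
with $\tilde{\mathbf{v}} \sim \mathcal{CN}(0, N_o \mathbf{I}_{n_r})$, since $\mathbf{U}^*$ is unitary. Let $n_{\min} = \min\{n_r, n_t\}$. Then the matrix $\mathbf{\Sigma}$ is zero
except on the diagonal, where $\Sigma_{ii} = \sigma_i$ is the $i$th singular value of $\mathbf{H}$. In addition, by convention,
$\sigma_1 \geq \sigma_2 \geq \cdots \geq \sigma_{n_{\min}}$. This coordinate change transforms the complicated system described by
$\mathbf{H}$ into the simple system with independent parallel channels described by $\mathbf{\Sigma}$.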
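The change of coordinates in (31) through (35) can be verified numerically. The sketch below is illustrative: it draws an arbitrary $3 \times 2$ complex channel, applies `numpy.linalg.svd`, and checks that the rotated signals satisfy the parallel-channel relation. The helper names, dimensions, and seed are assumptions made for the example.

```python
import numpy as np

def svd_parallel_channels(H, x, v):
    """Rotate y = Hx + v into the coordinates of (35) and return all the pieces."""
    n_r, n_t = H.shape
    U, s, Vh = np.linalg.svd(H)          # H = U @ Sigma @ Vh, where Vh is V^*
    k = min(n_r, n_t)
    Sigma = np.zeros((n_r, n_t))
    Sigma[:k, :k] = np.diag(s)           # singular values in descending order
    y = H @ x + v
    y_t = U.conj().T @ y                 # y~ = U^* y
    x_t = Vh @ x                         # x~ = V^* x
    v_t = U.conj().T @ v                 # v~ = U^* v
    return y_t, Sigma, x_t, v_t, s

def complex_gaussian(rng, *shape):
    """Samples with iid CN(0, 1) entries."""
    return (rng.standard_normal(shape) + 1j * rng.standard_normal(shape)) / np.sqrt(2.0)

rng = np.random.default_rng(1)
H = complex_gaussian(rng, 3, 2)          # n_r = 3 receive, n_t = 2 transmit antennas
x = complex_gaussian(rng, 2)
v = 0.1 * complex_gaussian(rng, 3)
y_t, Sigma, x_t, v_t, s = svd_parallel_channels(H, x, v)
print(np.allclose(y_t, Sigma @ x_t + v_t), s)
```

Only the first $n_{\min}$ entries of the rotated output carry signal; the remaining entries contain noise alone.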
4.1.1 Narrowband MIMO Channel Capacity
Since the MIMO channel has been decomposed into several parallel channels, the capacity is
easy to compute. The capacity that a MIMO system can support in this case, assuming CSIR
and CSIT, is
$$C_{sum} = \sum_{i=1}^{n_{\min}} B \log_2\left(1 + \frac{P_i \sigma_i^2}{N_o}\right) \tag{36}$$
as demonstrated in [CT91, CKT98, Tel95]. The power allocation $P_i$ can be chosen by
maximizing $C_{sum}$ subject to the constraint $\sum_{i=1}^{n_{\min}} P_i = P$. Lagrange multipliers can be used in
this case to compute the optimal power allocation.
$$\begin{aligned}
\frac{\partial}{\partial P_i} \left(\sum_{i=1}^{n_{\min}} B \log_2\left(1 + \frac{P_i \sigma_i^2}{N_o}\right)\right) &= \lambda \frac{\partial}{\partial P_i} \left(\sum_{i=1}^{n_{\min}} P_i\right) \\
\frac{B \sigma_i^2}{(P_i \sigma_i^2 + N_o) \log(2)} &= \lambda \\
P_i &= \frac{B}{\lambda \log(2)} - \frac{N_o}{\sigma_i^2}
\end{aligned} \tag{37}$$
with $\lambda$ chosen such that $\sum_{i=1}^{n_{\min}} P_i = P$. This power allocation method is known as the waterfilling
power allocation. The term $\frac{B}{\lambda \log(2)}$ represents the surface of the water and the $\frac{N_o}{\sigma_i^2}$ term represents
the height of the vessel floor for the $i$th singular value, so the depth of the water above that floor is the allocated power $P_i$.
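The allocation in (37) can be sketched as follows. This is an illustrative implementation under stated assumptions, not the only way to solve for $\lambda$: it assumes all singular values are strictly positive, sets $P_i = 0$ for any channel whose floor $N_o / \sigma_i^2$ lies above the water surface (where (37) would otherwise give a negative power), and finds the surface level by dropping the worst channels one at a time. The function names and the example numbers are hypothetical.

```python
import numpy as np

def waterfilling(sigma, total_power, noise_power):
    """Return (powers, level) with powers_i = max(level - N_o / sigma_i^2, 0)."""
    sigma = np.asarray(sigma, dtype=float)
    floors = noise_power / sigma ** 2           # N_o / sigma_i^2 for each channel
    sorted_floors = np.sort(floors)
    level = None
    # Try using the k best channels, starting from all of them.
    for k in range(len(sorted_floors), 0, -1):
        level = (total_power + sorted_floors[:k].sum()) / k
        if level > sorted_floors[k - 1]:        # worst active channel is still under water
            break
    powers = np.maximum(level - floors, 0.0)
    return powers, level

def sum_capacity(sigma, powers, noise_power, bandwidth=1.0):
    """Equation (36) for a given power allocation."""
    sigma = np.asarray(sigma, dtype=float)
    return float(bandwidth * np.sum(np.log2(1.0 + powers * sigma ** 2 / noise_power)))

# Assumed example: three parallel channels, P = 1 and N_o = 1.
sigma = [2.0, 1.0, 0.5]
powers, level = waterfilling(sigma, total_power=1.0, noise_power=1.0)
uniform = np.full(3, 1.0 / 3.0)
print(powers, level)
print(sum_capacity(sigma, powers, 1.0), sum_capacity(sigma, uniform, 1.0))
```

In this sketch `level` plays the role of $\frac{B}{\lambda \log(2)}$; in the example the weakest channel receives no power at all.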
4.1.2 Rank and Condition Number
Let $k$ be the number of nonzero singular values of $\mathbf{H}$, which is also the rank of $\mathbf{H}$. At high SNR
the waterfilling allocation is close to the uniform power allocation, so
$$C \approx \sum_{i=1}^{k} B \log_2\left(1 + \frac{P \sigma_i^2}{k N_o}\right) \approx k B \log_2(\mathrm{SNR}) + B \sum_{i=1}^{k} \log_2\left(\frac{\sigma_i^2}{k}\right) \tag{38}$$
with $\mathrm{SNR} = P / N_o$. $k$ is thus the parameter that controls the number of spatial degrees of freedom and hence the
number of independent streams that can be multiplexed [TV05]. So we want $k$ as
large as possible, which is $n_{\min}$ at most in the case that $\mathbf{H}$ has full rank. That the channel
capacity increases linearly in $n_{\min}$ at high SNR is one of the most attractive features of MIMO.
Jensen's inequality can give more information about the behavior of the capacity with respect to
$\mathbf{H}$.
$$\sum_{i=1}^{k} B \log_2\left(1 + \frac{P \sigma_i^2}{k N_o}\right) \leq k B \log_2\left(1 + \frac{P}{k N_o} \cdot \frac{1}{k} \sum_{i=1}^{k} \sigma_i^2\right) \tag{39}$$
For a fixed total channel gain $\sum_{i=1}^{k} \sigma_i^2$, the left side is largest, with equality in (39), precisely when
all the singular values are equal. In other words $\frac{\sigma_{\max}}{\sigma_{\min}} \approx 1$. In matrix theory this ratio
is the condition number, $\kappa(\mathbf{H})$, and a matrix with $\kappa(\mathbf{H}) \approx 1$ is said to be well-conditioned. Thus
$\mathbf{H}$ should be well conditioned to ensure a large capacity.
4.2 Physical Modeling of MIMO Channels
The major goal of this section is to see how MIMO’s ability to spatially multiplex depends on
the actual propagation environment. Also, this section will examine what must be true of the
propagation to ensure that the rank and condition number criteria are satisfied. All antenna
arrays in this section are assumed to be linear and uniformly spaced.
4.2.1 LOS SIMO and MISO Channel
Suppose the antennas are uniformly and linearly spaced by $\Delta_r \lambda_c$, where $\Delta_r$ represents the spacing
as a fraction of the wavelength $\lambda_c$. This normalization eliminates many $\lambda_c$ factors from subsequent
equations.
Figure 3: LOS MISO and SIMO [TV05]
The impulse responses between the transmit antenna and each receive antenna are
$$h_i(\tau) = a\, \delta(\tau - d_i / c) \tag{40}$$
Here $a$ models the path loss of the propagating wave and the $d_i / c$ term models the time it takes for a
propagating EM wave to reach the $i$th receive antenna [SMB01]. At baseband the channel gain
is given by
$$h_i = a\, e^{-j 2\pi d_i / \lambda_c} \tag{41}$$
So then the channel can be modeled with AWGN as
$$\mathbf{y} = \mathbf{h} x + \mathbf{w} \tag{42}$$
with $\mathbf{h} = [h_1, h_2, \ldots, h_{n_r}]$ and $\mathbf{w} \sim \mathcal{CN}(0, N_o \mathbf{I}_{n_r})$. For large $d$, the distance from the transmit antenna to the first receive antenna,
$$d_i \approx d + (i - 1) \Delta_r \lambda_c \cos(\phi) \tag{43}$$
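Define $\Omega = \cos(\phi)$. Define the following quantity from [Fle00]:
$$\hat{\mathbf{a}}_r(\Omega) = \frac{1}{\sqrt{n_r}} \begin{bmatrix} 1 \\ e^{-j 2\pi \Delta_r \Omega} \\ \vdots \\ e^{-j 2\pi (n_r - 1) \Delta_r \Omega} \end{bmatrix} \tag{44}$$
Then the following important identity holds:
$$\begin{aligned}
\hat{\mathbf{a}}_r^*(\Omega) \hat{\mathbf{a}}_r(\Omega) &= \frac{1}{n_r} \begin{bmatrix} 1 & e^{j 2\pi \Delta_r \Omega} & \cdots & e^{j 2\pi (n_r - 1) \Delta_r \Omega} \end{bmatrix} \begin{bmatrix} 1 \\ e^{-j 2\pi \Delta_r \Omega} \\ \vdots \\ e^{-j 2\pi (n_r - 1) \Delta_r \Omega} \end{bmatrix} \\
&= \frac{1}{n_r} \left[(1)(1) + \left(e^{j 2\pi \Delta_r \Omega}\right)\left(e^{-j 2\pi \Delta_r \Omega}\right) + \cdots + \left(e^{j 2\pi (n_r - 1) \Delta_r \Omega}\right)\left(e^{-j 2\pi (n_r - 1) \Delta_r \Omega}\right)\right] \\
&= 1
\end{aligned} \tag{45}$$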
Then the channel $\mathbf{h}$ can be written as
$$\mathbf{h} = a\, e^{-j 2\pi d / \lambda_c} \sqrt{n_r}\, \hat{\mathbf{a}}_r(\Omega) \tag{46}$$
as demonstrated in [SMB01]. The channel capacity is
$$C = B \log_2\left(1 + \frac{P \|\mathbf{h}\|^2}{N_o}\right) = B \log_2\left(1 + \frac{P a^2 n_r}{N_o}\right) \tag{47}$$
as given in [TV05]. Thus there is a power gain, and hence potentially increased capacity, but no degree
of freedom gain, so no spatial multiplexing is possible.
The MISO case is similar and involves the use of
$$\hat{\mathbf{a}}_t(\Omega) = \frac{1}{\sqrt{n_t}} \begin{bmatrix} 1 \\ e^{-j 2\pi \Delta_t \Omega} \\ \vdots \\ e^{-j 2\pi (n_t - 1) \Delta_t \Omega} \end{bmatrix} \tag{48}$$
4.2.2 LOS MIMO
Similarly to the SIMO case the baseband equivalent channel is
$$h_{ij} = a\, e^{-j 2\pi d_{ij} / \lambda_c} \tag{49}$$
If $d$ is large then
$$d_{ij} \approx d + (i - 1) \Delta_r \lambda_c \cos(\phi_r) - (j - 1) \Delta_t \lambda_c \cos(\phi_t) \tag{50}$$
as shown in [TV05]. Define $\Omega_r = \cos(\phi_r)$ and $\Omega_t = \cos(\phi_t)$. Then the channel matrix is given
by
$$\mathbf{H} = a \sqrt{n_t n_r}\, e^{-j 2\pi d / \lambda_c}\, \hat{\mathbf{a}}_r(\Omega_r)\, \hat{\mathbf{a}}_t^*(\Omega_t) \tag{51}$$
In this case $\mathbf{H}$ has rank 1 and the only nonzero singular value is $a \sqrt{n_t n_r}$. Then the capacity is
$$C = B \log_2\left(1 + \frac{P a^2 n_t n_r}{N_o}\right) \tag{52}$$
This is the same result as the SIMO/MISO case: no degree of freedom gain.
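The rank-1 structure of (51) is easy to confirm numerically. The following sketch builds the LOS channel matrix from the steering vectors in (44) and (48) and inspects its singular values. All numerical values (array sizes, spacings, directional cosines, path gain) and the function names are assumptions chosen only for illustration.

```python
import numpy as np

def steering_vector(n, spacing, omega):
    """Unit-norm spatial signature of (44)/(48) for n antennas with normalized spacing."""
    k = np.arange(n)
    return np.exp(-2j * np.pi * k * spacing * omega) / np.sqrt(n)

def los_mimo_channel(n_t, n_r, a, d_over_lambda, delta_t, delta_r, omega_t, omega_r):
    """Channel matrix of (51): a sqrt(n_t n_r) e^{-j 2 pi d / lambda_c} a_r(Omega_r) a_t^*(Omega_t)."""
    e_r = steering_vector(n_r, delta_r, omega_r)
    e_t = steering_vector(n_t, delta_t, omega_t)
    phase = np.exp(-2j * np.pi * d_over_lambda)
    return a * np.sqrt(n_t * n_r) * phase * np.outer(e_r, e_t.conj())

# Assumed example: 3 transmit and 4 receive antennas at half-wavelength spacing.
H = los_mimo_channel(n_t=3, n_r=4, a=0.5, d_over_lambda=1000.25,
                     delta_t=0.5, delta_r=0.5, omega_t=0.3, omega_r=-0.6)
singular_values = np.linalg.svd(H, compute_uv=False)
print(singular_values)                          # one dominant value, the rest near zero
print(np.linalg.matrix_rank(H, tol=1e-9))
```

The dominant singular value matches $a \sqrt{n_t n_r}$, so adding antennas in pure LOS raises the power gain in (52) but not the number of parallel channels.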
4.2.3 Geographically Separated MIMO
Still consider LOS propagation and the narrowband case.
Figure 4: Geographically Distributed Antenna Arrays [TV05]
Then the channel between the $k$th transmit antenna and all the receive antennas is
$$\mathbf{h}_k = a_k \sqrt{n_r}\, e^{-j 2\pi d_k / \lambda_c}\, \hat{\mathbf{a}}_r(\Omega_{rk}) \tag{53}$$
with $d_k$ the distance between the $k$th transmit antenna and the first receive antenna [PNG03,
Her04]. $\hat{\mathbf{a}}_r(\Omega)$ is periodic with period $1/\Delta_r$. Also, the function $\hat{\mathbf{a}}_r(\Omega)$ doesn't take on the same
value twice in one period, so $\hat{\mathbf{a}}_r(\Omega_{r1})$ and $\hat{\mathbf{a}}_r(\Omega_{r2})$ are linearly independent as long as $\Omega_{r1} - \Omega_{r2}$ is
not an integer multiple of $1/\Delta_r$. In the $2 \times n_r$ case, as long as the difference between the two directional cosines is not an integer multiple
of $1/\Delta_r$, the two columns of $\mathbf{H}$ are linearly independent and thus $\mathbf{H}$ has full rank. Thus in this
case spatial multiplexing is possible. Now what remains to be considered is whether $\mathbf{H}$ is well-conditioned.
To determine this consider the angle $\theta$ between the two columns of $\mathbf{H}$ associated
with the two transmit antennas. This angle satisfies
$$|\cos(\theta)| = |\hat{\mathbf{a}}_r^*(\Omega_{r1})\, \hat{\mathbf{a}}_r(\Omega_{r2})| \tag{54}$$
$$= \left|\frac{\sin(\pi L_r \Omega_r)}{n_r \sin(\pi L_r \Omega_r / n_r)}\right| \tag{55}$$
with $\Omega_r = \Omega_{r2} - \Omega_{r1}$ and $L_r = n_r \Delta_r$. Then the two singular values are
$$\lambda_1 = \sqrt{a^2 n_r (1 + |\cos\theta|)}, \quad \lambda_2 = \sqrt{a^2 n_r (1 - |\cos\theta|)} \tag{56}$$
Thus
$$\kappa(\mathbf{H}) = \sqrt{\frac{1 + |\cos\theta|}{1 - |\cos\theta|}} \tag{57}$$
Thus the matrix is ill-conditioned whenever $|\cos(\theta)| \approx 1$, which occurs when
$$\left|\Omega_r - \frac{m}{\Delta_r}\right| \ll \frac{1}{L_r} \tag{58}$$
for some integer $m$. So when the directional cosines of two
paths differ by much less than $\frac{1}{L_r}$, the receiver can't distinguish between the two paths. This is similar
to the case in frequency selective channels in which the bandwidth of the system controls which
multipath delays can be resolved.
4.2.4 Two-Ray MIMO
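Consider the full MIMO case with antenna arrays at both the transmitter and receiver. Let $d^{(i)}$
be the distance between transmit antenna 1 and receive antenna 1 along path $i$.
Define
$$a_i^b = a_i \sqrt{n_t n_r}\, e^{-j 2\pi d^{(i)} / \lambda_c} \tag{59}$$
Then the channel matrix can be expressed as
$$\mathbf{H} = a_1^b\, \hat{\mathbf{a}}_r(\Omega_{r1})\, \hat{\mathbf{a}}_t^*(\Omega_{t1}) + a_2^b\, \hat{\mathbf{a}}_r(\Omega_{r2})\, \hat{\mathbf{a}}_t^*(\Omega_{t2}) \tag{60}$$
as in [PNG03, Her04].
Figure 5: Two-Ray MIMO [TV05]
This expression for the channel can be put in matrix form as
$$\mathbf{H} = \begin{bmatrix} a_1^b\, \hat{\mathbf{a}}_r(\Omega_{r1}) & a_2^b\, \hat{\mathbf{a}}_r(\Omega_{r2}) \end{bmatrix} \begin{bmatrix} \hat{\mathbf{a}}_t^*(\Omega_{t1}) \\ \hat{\mathbf{a}}_t^*(\Omega_{t2}) \end{bmatrix} \tag{61}$$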
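To ensure $\mathbf{H}$ has rank 2 the following two conditions must hold:
$$\Omega_{t1} \neq \Omega_{t2} \mod \frac{1}{\Delta_t} \tag{62}$$
$$\Omega_{r1} \neq \Omega_{r2} \mod \frac{1}{\Delta_r} \tag{63}$$
When both hold, $\mathbf{H}$ has rank 2, so spatial multiplexing is possible. To ensure that $\mathbf{H}$ is well conditioned it is
necessary that $|\Omega_{r2} - \Omega_{r1}| \geq \frac{1}{L_r}$ and $|\Omega_{t2} - \Omega_{t1}| \geq \frac{1}{L_t}$; that is to say, there must be sufficient angular
separation at the transmitter and receiver to ensure that the paths can be resolved.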
4.3 Statistical Modeling of MIMO Channels
In the case of a frequency selective channel the channel can be modeled as an FIR filter with
taps $\{h[n]\}$. In this case not all individual multipath components can be resolved, but only multipath
components that differ in delay by a sufficient amount related to the system bandwidth.
In modeling a MIMO channel the interest is not in time resolution of multipath but in angular
resolution at the transmitter and receiver [Par00].
Suppose the transmit and receive antenna array lengths are $L_t$ and $L_r$. Paths whose directional cosines $\Omega$ differ by less than $1/L_t$ at the transmitter or $1/L_r$ at the receiver cannot be resolved. The term $h_{ij}$ is the aggregation of all paths within an angular window of width $1/L_t$ about $j/L_t$ and an angular window of width $1/L_r$ about $i/L_r$.
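If there is an arbitrary number of paths then the channel is given by
$$H = \sum_i a_i^b\, \hat{a}_r(\Omega_{ri})\, \hat{a}_t^{*}(\Omega_{ti})$$ (64)
The received and transmitted signals can always be expressed in terms of the following pair of bases:
$$\mathcal{S}_r = \left\{ \hat{a}_r(0),\ \hat{a}_r\!\left(\tfrac{1}{L_r}\right),\ \ldots,\ \hat{a}_r\!\left(\tfrac{n_r - 1}{L_r}\right) \right\}$$ (65)
$$\mathcal{S}_t = \left\{ \hat{a}_t(0),\ \hat{a}_t\!\left(\tfrac{1}{L_t}\right),\ \ldots,\ \hat{a}_t\!\left(\tfrac{n_t - 1}{L_t}\right) \right\}$$ (66)
which represent the angular bins.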
Figure 6: Angular Domain MIMO [TV05]
Each basis can be used to represent transmitted and received signals in the angular domain in terms of the directional cosine $\Omega$. Let $U_t$ be the $n_t \times n_t$ matrix with columns from $\mathcal{S}_t$, and let $U_r$ be the corresponding $n_r \times n_r$ matrix with columns from $\mathcal{S}_r$. If $x$ is a vector transmitted by the antennas, then $x$ and its angular domain representation $x^a$ are related by
$$x = U_t x^a, \qquad x^a = U_t^{*} x$$ (67)
By examining the matrix $U_t$ it can be seen that $x^a$ is the IDFT of $x$. Then define $y^a = U_r^{*} y$. In this coordinate system
$$y^a = U_r^{*} H U_t x^a + v^a = H^a x^a + v^a$$ (68)
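As a small check of (67), the sketch below builds $U_t$ from the angular basis and confirms that $x^a = U_t^{*}x$ matches a unitary IDFT and that $U_t x^a$ recovers $x$. It assumes the uniform linear array signature $\hat{a}_t(\Omega)$ with entries $e^{-j2\pi k\Delta_t\Omega}/\sqrt{n_t}$ and $L_t = n_t\Delta_t$; the array size, spacing, and test vector are arbitrary illustrative values.

```python
import cmath
import math

def angular_basis(n, delta):
    """Matrix U with column k equal to the signature a(k/L), where L = n*delta.
    The signature form is an assumption (uniform linear array)."""
    L = n * delta
    return [[cmath.exp(-2j * math.pi * row * delta * (k / L)) / math.sqrt(n)
             for k in range(n)] for row in range(n)]

def apply_hermitian(U, x):
    """Compute U^* x (antenna domain to angular domain)."""
    n = len(x)
    return [sum(U[row][k].conjugate() * x[row] for row in range(n))
            for k in range(n)]

def apply_matrix(U, xa):
    """Compute U x^a (angular domain back to antenna domain)."""
    n = len(xa)
    return [sum(U[row][k] * xa[k] for k in range(n)) for row in range(n)]

n_t, delta_t = 4, 0.5                       # hypothetical array parameters
U_t = angular_basis(n_t, delta_t)
x = [1 + 0j, 2j, -1 + 1j, 0.5 + 0j]         # arbitrary transmit vector
x_a = apply_hermitian(U_t, x)
x_back = apply_matrix(U_t, x_a)

# Unitary IDFT of x for comparison.
idft = [sum(x[l] * cmath.exp(2j * math.pi * k * l / n_t) for l in range(n_t))
        / math.sqrt(n_t) for k in range(n_t)]

err_roundtrip = max(abs(u - v) for u, v in zip(x_back, x))
err_idft = max(abs(u - v) for u, v in zip(x_a, idft))
print(err_roundtrip, err_idft)
```

Both errors should be at the level of floating point rounding, which is consistent with $U_t$ being unitary.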
Each element $h^a_{ij}$ can reasonably be modeled as an independent circularly symmetric complex Gaussian random variable, as in the Rayleigh channel. The validity of this assumption rests on two key factors:
• Amount of scattering and reflection in the multipath environment - this model needs several multipath components in each angular bin
• The lengths $L_t$ and $L_r$ - Short antenna arrays lump many multipath components into the same angular bin. A longer antenna array results in better angular resolution of paths and more non-zero entries in $H^a$.
Since $U_t$ and $U_r$ are unitary and
$$H = U_r H^a U_t^{*}$$ (69)
H has the same iid Gaussian distribution [CT91]. Thus in the narrowband case the MIMO
channel is basically an extension of the scalar Rayleigh channel where each coefficient of the
channel matrix is a complex Gaussian random variable. In addition, results from random matrix
theory show that H with this distribution has full rank with probability 1. Thus the channel in
this model can support spatial multiplexing.
If there is a strong line-of-sight component, then the fading is not Rayleigh but Ricean.
Antenna Spacing The assumption that the coefficients of $H$ are independent, or at least uncorrelated, depends heavily on the antenna spacing. As a rule of thumb an antenna spacing of at least $\lambda/2$ is desirable and results in approximately uncorrelated coefficients [FG98]. As the antenna spacing decreases below this there is still a diversity gain, but it is not quite as large as if the antennas were spaced further apart. As the antenna spacing decreases towards $\lambda/4$ the channel coefficients become strongly correlated. The exact amount of correlation depends on the angular spread seen by the antennas. For antennas with small angular spread at separations on the order of $\lambda/4$ or smaller the coefficients are highly correlated. In that case the receiver does not see as many independent copies of the transmitted signal, so the achievable diversity gain is reduced. In practice the channel coefficients are never completely uncorrelated, but as a simplifying assumption to make analysis tractable we assume they are uncorrelated and independent.
4.3.1 Frequency Selective MIMO Channel
The extension of the preceding flat MIMO channel model to the frequency selective MIMO channel model is fairly straightforward. The channel in this case can be modeled as
$$y[n] = \sum_{l=1}^{N} H_l\, x[n-l] + v[n]$$ (70)
as in [TV05]. In this model the channel between each pair of transmit and receive antennas is modeled as a scalar frequency selective channel in which the output is a convolution of the input and the channel taps. The justification for this model is a straightforward extension of the angular model outlined in the previous sections.
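The matrix convolution in (70) can be sketched directly. The example below is noise free and assumes $x[k] = 0$ outside the transmitted block; the function name and the tap and input values are made up for illustration.

```python
def mimo_fir(taps, x):
    """Noise-free output y[n] = sum_{l=1..N} H_l x[n-l] of equation (70).

    taps: list of N matrices H_1..H_N, each n_r x n_t (lists of rows).
    x:    list of input vectors of length n_t; x[k] is taken as 0 outside the list.
    """
    n_r, n_t = len(taps[0]), len(taps[0][0])
    y = []
    for n in range(len(x) + len(taps)):
        out = [0.0] * n_r
        for l, H in enumerate(taps, start=1):
            k = n - l
            if 0 <= k < len(x):
                for i in range(n_r):
                    out[i] += sum(H[i][j] * x[k][j] for j in range(n_t))
        y.append(out)
    return y

# Hypothetical 2x2 channel with N = 2 taps.
H1 = [[1.0, 0.0], [0.0, 1.0]]
H2 = [[0.0, 0.5], [0.5, 0.0]]
x = [[1.0, 0.0], [0.0, 2.0]]
y = mimo_fir([H1, H2], x)
print(y)
```

Each receive antenna sees a sum of scalar convolutions, one per transmit antenna, which is exactly the per-pair frequency selective model described above.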
5 Diversity-Multiplexing Tradeoff
A MIMO system can transmit one symbol on all the transmit antennas and use the right processing to obtain the full diversity gain $n_t n_r$. On the other hand a MIMO system can transmit $n_{\min}$ independent streams to provide the maximum possible rate with the minimum error protection. The diversity-multiplexing tradeoff investigates what happens between these two extremes and, in particular, what constitutes the optimal tradeoff: when transmitting at a given rate, what is the maximum possible diversity gain? This kind of analysis leads to a curve relating the transmit rate and the optimal diversity gain. Of great interest is whether a given space-time code or modulation can achieve this frontier and thus be optimal.
This tradeoff curve is difficult to compute, but some methods have been proposed to simplify the study of this tradeoff. Tse and Zheng proposed in [ZT03] studying this tradeoff by making assumptions on the possible rates of transmission and letting the SNR approach infinity. At high SNR the MIMO capacity is
$$C \approx n_{\min} \log_2(\mathrm{SNR})$$ (71)
for a channel with full rank. Tse and Zheng assumed that only rates $R = r \log(\mathrm{SNR})$ are possible with $r = 0, 1, \ldots, n_{\min}$. The optimal diversity gain, $d^{*}(r)$, is the exponent in the outage probability, so
$$p_{\mathrm{out}} \approx \mathrm{SNR}^{-d^{*}(r)}$$ (72)
Thus it makes sense to define
$$d^{*}(r) = -\lim_{\mathrm{SNR}\to\infty} \frac{\log p_{\mathrm{out}}(r \log \mathrm{SNR})}{\log \mathrm{SNR}}$$ (73)
Alternatively $d^{*}(r)$ can be defined in terms of the probability of error
$$d^{*}(r) = -\lim_{\mathrm{SNR}\to\infty} \frac{\log P_e(r \log \mathrm{SNR})}{\log \mathrm{SNR}}$$ (74)
Before tackling the full MIMO channel it is useful to consider the diversity-multiplexing tradeoff
in scalar and SIMO/MISO channels.
5.1 Scalar Rayleigh Channel
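The scalar channel is in outage if the capacity it supports falls below the rate of transmission. So $p_{\mathrm{out}}$ is given by
$$p_{\mathrm{out}} = P\left[\log\left(1 + |h|^2\, \mathrm{SNR}\right) < r \log \mathrm{SNR}\right] = P\left[|h|^2 < \frac{\mathrm{SNR}^{r} - 1}{\mathrm{SNR}}\right]$$
$|h|^2$ is chi-squared distributed with two degrees of freedom (that is, exponential). For sufficiently small $\epsilon$, $P[|h|^2 < \epsilon] \approx \epsilon$. Thus the outage probability is approximately
$$p_{\mathrm{out}} \approx \frac{1}{\mathrm{SNR}^{1-r}}$$ (75)
Thus $d^{*}(r) = 1 - r$ is the optimal tradeoff.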
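The exponent in (75) can be checked numerically without simulation. The sketch below assumes $h \sim \mathcal{CN}(0,1)$, so that $|h|^2$ is exponential with unit mean and the outage probability has the closed form $1 - e^{-\epsilon}$ with $\epsilon = (\mathrm{SNR}^r - 1)/\mathrm{SNR}$; the function names and the 80 dB evaluation point are arbitrary choices for illustration.

```python
import math

def outage_scalar(snr, r):
    """Exact outage probability of the scalar Rayleigh channel at multiplexing
    gain r, assuming |h|^2 is exponential with unit mean."""
    eps = (snr ** r - 1.0) / snr
    return -math.expm1(-eps)          # 1 - exp(-eps), accurate for small eps

def diversity_estimate(r, snr_db=80.0):
    """Finite-SNR estimate of d(r) = -log(p_out) / log(SNR)."""
    snr = 10.0 ** (snr_db / 10.0)
    return -math.log(outage_scalar(snr, r)) / math.log(snr)

for r in (0.25, 0.5, 0.75):
    # The estimate should be close to 1 - r at high SNR.
    print(r, diversity_estimate(r), 1.0 - r)
```

The ratio only converges as the SNR grows, so at any finite SNR the estimate differs slightly from $1 - r$.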
5.1.1 QAM over the Scalar Rayleigh Channel
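It can be demonstrated that for QAM $P_e \approx 2^{R}/\mathrm{SNR}$. Then, with $R = r \log \mathrm{SNR}$,
$$d(r) = -\lim_{\mathrm{SNR}\to\infty} \frac{\log P_e}{\log \mathrm{SNR}} = -\lim_{\mathrm{SNR}\to\infty} \frac{\log\left(2^{r \log \mathrm{SNR}}/\mathrm{SNR}\right)}{\log \mathrm{SNR}} = -\lim_{\mathrm{SNR}\to\infty} \frac{r \log \mathrm{SNR} - \log \mathrm{SNR}}{\log \mathrm{SNR}} = 1 - r$$ (76)
Thus QAM achieves the optimal diversity-multiplexing tradeoff of the scalar Rayleigh channel.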
5.2 MISO Rayleigh Channel
In this case the system can be modeled as
y[n] = hx[n] + w[n] (77)
Taking the rate $R = r \log \mathrm{SNR}$ as usual the outage probability is
$$p_{\mathrm{out}} = P\left[\log\left(1 + \|h\|^2 \frac{\mathrm{SNR}}{n_t}\right) < r \log \mathrm{SNR}\right]$$ (78)
$\|h\|^2$ is $\chi^2_{2n_t}$ distributed so the approximation $P[\|h\|^2 < \epsilon] \approx \epsilon^{n_t}$ can be used. Then the outage probability is roughly
$$p_{\mathrm{out}} \approx \mathrm{SNR}^{-n_t(1-r)}$$ (79)
So it is apparent that the optimal tradeoff is $d^{*}(r) = n_t(1 - r)$.
The Alamouti code effectively decomposes the two-antenna MISO channel into parallel scalar Rayleigh channels. It can be easily demonstrated that the optimal tradeoff curve for this parallel Rayleigh channel is $d^{*}(r) = 2(1 - r)$. So if QAM is used on each of the scalar channels along with the Alamouti code, then the resulting system is tradeoff optimal for the MISO channel.
5.3 MIMO Rayleigh Channel
The outage probability is given by
$$p_{\mathrm{out}} = \min_{K_x :\, \mathrm{Tr}[K_x] \le \mathrm{SNR}} P\left[\log\det\left(I_{n_r} + H K_x H^{*}\right) < r \log \mathrm{SNR}\right]$$ (80)
The matrix $K_x$ is the covariance matrix of the input and basically represents a power allocation. The power allocation at the transmitter directly affects the SNR at the receiver. This scheme makes a specific assumption about the rate $R$ at a given SNR, so the input covariance matrix must be chosen not to exceed the power limit. The worst covariance matrix $K_x$ is approximately $\frac{\mathrm{SNR}}{n_t} I_{n_t}$, so
$$p_{\mathrm{out}} = P\left[\log\det\left(I_{n_r} + \frac{\mathrm{SNR}}{n_t} H H^{*}\right) < r \log \mathrm{SNR}\right]$$ (81)
This outage probability can be written in terms of the singular values of $H$ as
$$p_{\mathrm{out}} = P\left[\sum_{i=1}^{n_{\min}} \log\left(1 + \frac{\mathrm{SNR}}{n_t} \sigma_i^2\right) < r \log \mathrm{SNR}\right]$$ (82)
There are no neat approximations to evaluate this outage probability, but there is a neat geometric argument for it [TV05, ZT03]. First consider $r$ close to 0. Outage occurs when $H$ is close to 0. Closeness can be measured in terms of the Frobenius norm
$$\|H - 0\|_F^2 = \|H\|_F^2 = \sum_{i=1}^{n_{\min}} \sigma_i^2 = \sum_{i,j} |h_{ij}|^2$$
Thus the magnitude of each channel coefficient $|h_{ij}|$ must be close to 0 for the channel to be in outage.
Now if $r$ is an integer greater than 0 the situation becomes considerably more complicated, since there are more ways to choose bad singular values $\sigma_i$ to put the channel in outage. The situation seems hopeless, but it has been shown by Tse and Zheng that although there are many ways for the channel to be in outage, the most common way is for $r$ eigenchannels to be good and the remainder to be bad. In this case $H$ effectively has rank $r$ and lies in the set $\mathcal{V}_r$ of rank $r$ matrices in the space $\mathbb{C}^{n_r \times n_t}$. So the question of whether $H$ puts the channel in outage is the question of whether $H$ is close to $\mathcal{V}_r$ in the appropriate sense.
This question is tractable but also a little tricky, since $\mathcal{V}_r$ is not a linear space. The following paragraph is very technical but the fundamental result is simple: $\mathcal{V}_r$ can be considered to be a linear space in a sufficiently small neighborhood. To see that $\mathcal{V}_r$ is not linear consider that if $\mathcal{V}_r$ were a linear space, then $0 \in \mathcal{V}_r$. But clearly 0 has rank 0, so $0 \notin \mathcal{V}_r$. Thus $\mathcal{V}_r$ is not a linear subspace of $\mathbb{C}^{n_r \times n_t}$. However, although $\mathcal{V}_r$ is not a linear subspace, it is a manifold embedded in $\mathbb{C}^{n_r \times n_t}$. A manifold is a space with the property that small neighborhoods of a point look like linear subspaces of $\mathbb{R}^k$ or $\mathbb{C}^k$. For example, the surface of the Earth is a manifold since a small neighborhood looks like a portion of $\mathbb{R}^2$ even though the overall space is clearly not linear. The question of interest is what happens when $H$ is close to $\mathcal{V}_r$, so it is sufficient to consider a small neighborhood $N$ of a point of $\mathcal{V}_r$ containing $H$. $N$ looks like a linear subspace of $\mathbb{C}^{n_r \times n_t}$. For the remainder of this argument restrict consideration to $N$.
Since $\mathcal{V}_r$ can be considered locally linear, the notion of orthogonality can be used. Then $H$ can be decomposed into a portion in $\mathcal{V}_r$ and a portion in the space $\mathcal{V}_r^{\perp}$, which is orthogonal to $\mathcal{V}_r$. If the portion of $H$ in $\mathcal{V}_r^{\perp}$ vanishes, then $H$ is basically in $\mathcal{V}_r$, $H$ has rank $r$, and so the channel is in outage as discussed before. The probability that the portion of $H$ in $\mathcal{V}_r^{\perp}$ vanishes (the outage probability) is approximately $\mathrm{SNR}^{-d}$, where $d$ is the dimension of $\mathcal{V}_r^{\perp}$. If $H$ is of rank $r$, then $r$ rows of length $n_t$ can be chosen and the remaining $n_r - r$ rows can be written as linear combinations of the first $r$ rows. From this it follows that $\dim \mathcal{V}_r = n_t r + (n_r - r) r$. Since $\mathcal{V}_r$ and $\mathcal{V}_r^{\perp}$ decompose the $n_t n_r$-dimensional space,
$$n_t n_r = \dim \mathbb{C}^{n_r \times n_t} = \dim \mathcal{V}_r + \dim \mathcal{V}_r^{\perp}$$
Thus
$$\dim \mathcal{V}_r^{\perp} = n_t n_r - \left(n_t r + (n_r - r) r\right) = (n_t - r)(n_r - r)$$ (83)
Thus $p_{\mathrm{out}} \approx \mathrm{SNR}^{-(n_t - r)(n_r - r)}$ and so the optimal tradeoff is given by $d^{*}(r) = (n_t - r)(n_r - r)$ for $r = 0, 1, \ldots, n_{\min}$.
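The integer points of the optimal tradeoff are easy to tabulate. The short sketch below evaluates $d^{*}(r) = (n_t - r)(n_r - r)$ for $r = 0, 1, \ldots, n_{\min}$; the function names are illustrative, and only the integer values of $r$ stated in the text are handled.

```python
def optimal_diversity(r, n_t, n_r):
    """d*(r) = (n_t - r)(n_r - r) at an integer multiplexing gain r."""
    if not 0 <= r <= min(n_t, n_r):
        raise ValueError("r must lie between 0 and min(n_t, n_r)")
    return (n_t - r) * (n_r - r)

def tradeoff_points(n_t, n_r):
    """All (r, d*(r)) pairs for r = 0, 1, ..., n_min."""
    return [(r, optimal_diversity(r, n_t, n_r))
            for r in range(min(n_t, n_r) + 1)]

print(tradeoff_points(2, 2))
print(tradeoff_points(4, 3))
```

The endpoints recover the two extremes discussed at the start of this section: full diversity $n_t n_r$ at $r = 0$ and zero diversity at $r = n_{\min}$.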
Figure 7: Diversity-Multiplexing Tradeoff For MIMO [TV05]
6 Space-Time Coding over Narrowband Channels
There are two major types of space-time codes: block codes and trellis codes. Their names imply their structures, which are derived from the similar structures in the single antenna case. The basic idea of a space-time block code is to map $Q$ symbols into a block of transmitted symbols of size $n_t \times T$ for some integer $T$. A trellis code is a convolutional code in which the current output depends on a block of input bits and the previous input bits represented by the state of the trellis code.
One general assumption on almost all space-time codes is the quasi-static assumption, which assumes that the channel remains constant over the duration of a codeword. The rate at which the channel changes is related to the coherence time, which is in turn related to the Doppler spread. The system must be designed to ensure that the duration of a codeword is less than the coherence time. The channel can change between codewords, but not in the middle of a codeword.
Figure 8: Space-Time Encoder Structure
6.1 Error Motivated Design
It is important and interesting to find conditions that will guarantee good error performance for a space-time code. One approach to finding these conditions for the slow fading MIMO channel is to consider what factors affect ML decoding of the codewords. The optimal way to detect a codeword is ML detection, given by
$$\hat{\mathbf{C}} = \arg\min_{\mathbf{C} \in \mathcal{C}} \|\mathbf{Y} - H\mathbf{C}\|^2$$ (84)
where $\mathbf{Y}$ is the received block. The operation of this detector is limited mainly by the closest pair of codewords. If two codewords are close together, then noise can cause one codeword to be incorrectly estimated as the other. The error probability of interest is then the pairwise error probability (PEP) that a codeword $\mathbf{C}$ is incorrectly decoded as $\mathbf{E}$. Conditioning on the channel matrix $H$ the PEP is [Pro01]
$$P[\mathbf{C} \to \mathbf{E} \mid H] = Q\left(\sqrt{\frac{\mathrm{SNR}}{2} \sum_{k=0}^{T} \|H(c_k - e_k)\|_F^2}\right)$$ (85)
where $c_k$ and $e_k$ are the $k$th columns of $\mathbf{C}$ and $\mathbf{E}$. Averaging over all channel realizations gives the average PEP: $P[\mathbf{C} \to \mathbf{E}]$. In a way similar to the diversity-multiplexing tradeoff the diversity gain $d_g$ can be defined in terms of the PEP as
$$d_g = -\lim_{\mathrm{SNR}\to\infty} \frac{\log P[\mathbf{C} \to \mathbf{E}]}{\log \mathrm{SNR}}$$ (86)
Generally at high SNR the PEP is of the form $(c \cdot \mathrm{SNR})^{-d_g}$. The quantity $c$ improves performance and is called the coding gain. A good space-time code should then achieve a high diversity gain and a high coding gain.
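The relevant question now is how to achieve diversity and coding gains. For two codewords $\mathbf{C}$ and $\mathbf{E}$ define the matrix $\tilde{E} = (\mathbf{E} - \mathbf{C})(\mathbf{E} - \mathbf{C})^{*}$. Then the PEP is given by [SA00, Sim01],
$$P[\mathbf{C} \to \mathbf{E}] = \frac{1}{\pi} \int_0^{\pi/2} \left[\det\left(I_{n_t} + \frac{\mathrm{SNR}}{4 \sin^2\beta} \tilde{E}\right)\right]^{-n_r} d\beta$$
$$= \frac{1}{\pi} \int_0^{\pi/2} \prod_{i=1}^{\mathrm{rank}(\tilde{E})} \left(1 + \frac{\mathrm{SNR}}{4 \sin^2\beta} \lambda_i\right)^{-n_r} d\beta$$
$$\le \prod_{i=1}^{\mathrm{rank}(\tilde{E})} \left(1 + \frac{\mathrm{SNR}}{4} \lambda_i\right)^{-n_r}$$
with the second line obtained by expressing the determinant in terms of the eigenvalues $\lambda_i(\tilde{E})$ and the last line valid at high SNR. This expression can be further bounded to yield
$$P[\mathbf{C} \to \mathbf{E}] \le \left(\frac{\mathrm{SNR}}{4}\right)^{-n_r\, \mathrm{rank}(\tilde{E})} \left(\prod_{i=1}^{\mathrm{rank}(\tilde{E})} \lambda_i\right)^{-n_r}$$ (87)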
Thus the diversity gain is $n_r\, \mathrm{rank}(\tilde{E})$ and the coding gain is $\prod_{i=1}^{\mathrm{rank}(\tilde{E})} \lambda_i$. Given these two gains, the two criteria for a good space-time code at high SNR are as follows [TSC98]:
• Rank Criterion - Maximize the minimum rank of the codeword difference matrix to achieve a good diversity gain:
$$\max \left[ \min_{\mathbf{C}, \mathbf{E} \in \mathcal{C},\ \mathbf{C} \neq \mathbf{E}} \mathrm{rank}(\tilde{E}) \right]$$ (88)
• Determinant Criterion - Maximize the minimum product of the nonzero eigenvalues to achieve coding gain:
$$d_\lambda = \min_{\mathbf{C}, \mathbf{E} \in \mathcal{C},\ \mathbf{C} \neq \mathbf{E}} \left[ \prod_{i=1}^{\mathrm{rank}(\tilde{E})} \lambda_i \right]$$ (89)
In the case where the codeword difference matrix always has full rank this becomes: maximize
$$d_\lambda = \min_{\mathbf{C}, \mathbf{E} \in \mathcal{C},\ \mathbf{C} \neq \mathbf{E}} \det \tilde{E}$$ (90)
These criteria guarantee good codes at high SNR.
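As an illustration of the rank and determinant criteria, the sketch below enumerates every pair of distinct Alamouti codewords over a unit-energy QPSK constellation and evaluates $\det \tilde{E}$. The constellation, the codeword layout $[c_1, -c_2^{*};\ c_2, c_1^{*}]$, and the helper names are assumptions chosen for the example.

```python
import itertools
import math

# Unit-energy QPSK (an assumed constellation for this example).
QPSK = [complex(a, b) / math.sqrt(2) for a in (1, -1) for b in (1, -1)]

def alamouti(c1, c2):
    """2x2 Alamouti codeword: rows are antennas, columns are symbol times."""
    return [[c1, -c2.conjugate()], [c2, c1.conjugate()]]

def difference_gram(C, E):
    """E_tilde = (E - C)(E - C)^* for a pair of 2x2 codewords."""
    D = [[E[i][j] - C[i][j] for j in range(2)] for i in range(2)]
    return [[sum(D[i][k] * D[j][k].conjugate() for k in range(2))
             for j in range(2)] for i in range(2)]

def det2(M):
    """Determinant of a 2x2 Hermitian matrix (real valued)."""
    return (M[0][0] * M[1][1] - M[0][1] * M[1][0]).real

codebook = [alamouti(c1, c2) for c1, c2 in itertools.product(QPSK, repeat=2)]
dets = [det2(difference_gram(C, E))
        for C, E in itertools.combinations(codebook, 2)]

min_det = min(dets)              # determinant criterion value d_lambda
full_rank = min_det > 1e-12      # a 2x2 matrix has rank 2 iff det != 0
print(full_rank, min_det)
```

Because every difference matrix has a nonzero determinant, the minimum rank is 2, and with $n_t = 2$ the diversity gain is $2 n_r$ for this codebook.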
6.2 Space-Time Block Codes
A space-time block code (STBC) maps a block of $Q$ input symbols into a block of symbols of size $n_t \times T$ to be transmitted on the antennas. A quantity of interest is the effective symbol rate of the code:
$$r_s = \frac{Q}{T}$$ (91)
For $r_s = 1$ the system effectively transmits one symbol per symbol period. For $r_s < 1$ the system on average transmits less than one symbol per symbol period. Codes with $r_s < 1$ effectively reduce the rate of transmission.
6.2.1 Linear STBCs
There are many different classes of space-time block codes, but one of the most common is the linear block code. The codeword matrix of a linear STBC can be expressed as a linear function of complex $n_t \times T$ basis matrices $\phi_q$ and input symbols $c_1, c_2, \ldots, c_Q$ as follows [HH01]:
$$\mathbf{C} = \sum_{q=1}^{Q} \phi_q\, \Re\{c_q\} + \phi_{q+Q}\, \Im\{c_q\}$$ (92)
It may seem a little odd to break up the real and imaginary components of the symbols, but the advantage of this approach is that conjugation of symbols can be used in linear STBCs. The following example with the Alamouti code shows that this is possible.
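Example: Alamouti code The two complex symbols $c_1$ and $c_2$ are mapped into the following matrix, which represents the Alamouti code:
$$\begin{bmatrix} c_1 & -c_2^{*} \\ c_2 & c_1^{*} \end{bmatrix}$$ (93)
Then the code can be represented with basis matrices as:
$$\phi_1 = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \quad \phi_2 = \begin{bmatrix} 0 & -1 \\ 1 & 0 \end{bmatrix} \quad \phi_3 = \begin{bmatrix} j & 0 \\ 0 & -j \end{bmatrix} \quad \phi_4 = \begin{bmatrix} 0 & j \\ j & 0 \end{bmatrix}$$ (94)
Here $\phi_3$ and $\phi_4$ carry the factor $j$ that multiplies the imaginary parts in (92).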
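The basis representation can be verified by direct expansion. The sketch below rebuilds the Alamouti codeword from (92), assuming that the two basis matrices multiplying the imaginary parts carry a factor of $j$, and also evaluates the wide-matrix condition $\phi_q \phi_p^{*} + \phi_p \phi_q^{*} = 0$ for every pair $q \neq p$. The helper names and the test symbols are illustrative.

```python
# Basis matrices for the Alamouti code. The factor 1j on the last two is an
# assumption needed for the expansion (92) to reproduce the codeword (93).
PHI = [
    [[1, 0], [0, 1]],        # phi_1, weights Re{c_1}
    [[0, -1], [1, 0]],       # phi_2, weights Re{c_2}
    [[1j, 0], [0, -1j]],     # phi_3, weights Im{c_1}
    [[0, 1j], [1j, 0]],      # phi_4, weights Im{c_2}
]

def linear_stbc(symbols, basis):
    """Codeword sum_q phi_q Re{c_q} + phi_{q+Q} Im{c_q}, as in (92)."""
    Q = len(symbols)
    rows, cols = len(basis[0]), len(basis[0][0])
    C = [[0j] * cols for _ in range(rows)]
    for q, c in enumerate(symbols):
        for i in range(rows):
            for j in range(cols):
                C[i][j] += basis[q][i][j] * c.real + basis[q + Q][i][j] * c.imag
    return C

def alamouti(c1, c2):
    """Direct construction of the Alamouti codeword (93)."""
    return [[c1, -c2.conjugate()], [c2, c1.conjugate()]]

def pair_condition(A, B):
    """Largest entry magnitude of A B^* + B A^* for 2x2 matrices."""
    def mul_herm(X, Y):
        return [[sum(complex(X[i][k]) * complex(Y[j][k]).conjugate()
                     for k in range(2)) for j in range(2)] for i in range(2)]
    P, R = mul_herm(A, B), mul_herm(B, A)
    return max(abs(P[i][j] + R[i][j]) for i in range(2) for j in range(2))

c1, c2 = 0.3 - 1.2j, -0.7 + 0.4j
built = linear_stbc([c1, c2], PHI)
direct = alamouti(c1, c2)
mismatch = max(abs(built[i][j] - direct[i][j]) for i in range(2) for j in range(2))
worst_pair = max(pair_condition(PHI[q], PHI[p])
                 for q in range(4) for p in range(4) if q != p)
print(mismatch, worst_pair)
```

Both printed values should be zero up to rounding, which is what allows conjugated symbols to appear in a code that is linear in the real and imaginary parts.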
Code Design Criteria for Linear STBCs As we saw in the previous section, minimizing the worst-case PEP is a good strategy for developing a good space-time code. In the case of linear STBCs, if the basis matrices are unitary, meaning $\phi^{*}\phi = I_T$ if $T \le n_t$ (tall matrix) or $\phi\phi^{*} = I_{n_t}$ if $T \ge n_t$ (wide matrix), then the PEP condition is
$$\phi_q \phi_p^{*} + \phi_p \phi_q^{*} = 0, \quad q \neq p \quad \text{(Wide)}$$ (95)
$$\phi_q^{*} \phi_p + \phi_p^{*} \phi_q = 0, \quad q \neq p \quad \text{(Tall)}$$ (96)
Orthogonal STBCs There is a special class of linear STBCs with an orthogonality property that leads to easy decoding [TJC99]. An orthogonal STBC has codewords $\mathbf{C}$ that satisfy the following key property
$$\mathbf{C}\mathbf{C}^{*} = \frac{T}{Q n_t} \left[ \sum_{q=1}^{Q} |c_q|^2 \right] I_{n_t}$$ (97)
This property is very nice because it implies that easy decoding is possible due to the orthogonality. The key example of an O-STBC is the Alamouti code, which works on complex constellations. It is clear in the case of Alamouti that it takes two symbol times to transmit two symbols, so the transmit rate is $r_s = 1$. However, it turns out that the Alamouti code is the only O-STBC on complex symbols that achieves a transmit rate $r_s$ of one symbol per symbol period. For more than two transmit antennas, $r_s < 1$ always. If $r_s < \frac{1}{2}$ then it is always possible to find an O-STBC that achieves good diversity. For a purely real constellation it is always possible to find a real O-STBC for any $n_t$ that achieves $r_s = 1$. However, this is not very useful as many constellations such as QAM are complex.
The diversity-multiplexing tradeoff for O-STBCs is given by [OC06] as
$$d^{*}(g_s) = n_t n_r \left(1 - \frac{g_s}{r_s}\right), \quad g_s \in [0, r_s]$$ (98)
for QAM constellations.
Quasi Orthogonal STBCs O-STBCs achieve full diversity but at the expense of any spatial multiplexing. Quasi Orthogonal STBCs (QO-STBCs) attempt to achieve some of the benefits of O-STBCs while also providing for some spatial multiplexing by using smaller O-STBCs as building blocks. For example a QO-STBC could be
$$\mathcal{Q}(c_1, \ldots, c_{2Q}) = \begin{bmatrix} \mathcal{O}(c_1, \ldots, c_Q) & \mathcal{O}(c_{Q+1}, \ldots, c_{2Q}) \\ \mathcal{O}(c_{Q+1}, \ldots, c_{2Q}) & \mathcal{O}(c_1, \ldots, c_Q) \end{bmatrix}$$ (99)
where each $\mathcal{O}$ is a codeword matrix for a smaller O-STBC on only $Q$ input symbols [TBH00]. If the $\mathcal{O}$ represent Alamouti codewords, then the codeword matrix is
$$\mathcal{Q}(c_1, c_2, c_3, c_4) = \frac{1}{2} \begin{bmatrix} c_1 & -c_2^{*} & c_3 & -c_4^{*} \\ c_2 & c_1^{*} & c_4 & c_3^{*} \\ c_3 & -c_4^{*} & c_1 & -c_2^{*} \\ c_4 & c_3^{*} & c_2 & c_1^{*} \end{bmatrix}$$ (100)
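Then during decoding the codeword matrix is multiplied by its conjugate transpose, which yields
$$\mathcal{Q}\mathcal{Q}^{*} = \frac{1}{4} \begin{bmatrix} a & 0 & b & 0 \\ 0 & a & 0 & b \\ b & 0 & a & 0 \\ 0 & b & 0 & a \end{bmatrix}$$ (101)
where
$$a = \sum_{q=1}^{4} |c_q|^2, \qquad b = c_1 c_3^{*} + c_3 c_1^{*} + c_2 c_4^{*} + c_4 c_2^{*}$$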
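The structure of $\mathcal{Q}\mathcal{Q}^{*}$ can be checked numerically for the Alamouti-based codeword (100). The sketch below builds the $4 \times 4$ codeword from two Alamouti blocks, forms its Gram matrix, and prints $4\,\mathcal{Q}\mathcal{Q}^{*}$ so that the diagonal term and the coupling term can be read off directly rather than assumed. The helper names and the test symbols are arbitrary.

```python
def alamouti(c1, c2):
    """2x2 Alamouti block with rows [c1, -c2*] and [c2, c1*]."""
    return [[c1, -c2.conjugate()], [c2, c1.conjugate()]]

def qo_stbc(c1, c2, c3, c4):
    """Codeword (100): blocks [A B; B A] with A = O(c1, c2), B = O(c3, c4)."""
    A, B = alamouti(c1, c2), alamouti(c3, c4)
    rows = [A[0] + B[0], A[1] + B[1], B[0] + A[0], B[1] + A[1]]
    return [[0.5 * v for v in row] for row in rows]

def gram(Q):
    """Q Q^* for a matrix given as a list of rows."""
    n = len(Q)
    return [[sum(Q[i][k] * Q[j][k].conjugate() for k in range(len(Q[0])))
             for j in range(n)] for i in range(n)]

symbols = (1 + 2j, 0.5 - 1j, -1 + 0.5j, 2 - 1j)   # arbitrary test symbols
G = gram(qo_stbc(*symbols))
a = sum(abs(c) ** 2 for c in symbols)             # total symbol energy
coupling = (4 * G[0][2]).real                     # off-diagonal coupling term

for row in G:
    print(["%.3f" % (4 * v).real for v in row])
print(a, coupling)
```

The printed matrix shows the energy term on the diagonal and a single real coupling term linking positions one and three and positions two and four, which is why those pairs must be decoded jointly.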
The codeword matrix does not decouple as nicely as in the case of O-STBCs, but at least the first/third and second/fourth columns can be decoded separately, which greatly reduces complexity. Other combinations of O-STBCs have been proposed, including the following Alamouti-like scheme [Jaf01]
$$\mathcal{Q}(c_1, \ldots, c_{2Q}) = \begin{bmatrix} \mathcal{O}(c_1, \ldots, c_Q) & -\mathcal{O}(c_{Q+1}, \ldots, c_{2Q})^{*} \\ \mathcal{O}(c_{Q+1}, \ldots, c_{2Q}) & \mathcal{O}(c_1, \ldots, c_Q)^{*} \end{bmatrix}$$ (102)
Decoding with this scheme has complexity similar to the previous case of QO-STBCs.
Rotated QO-STBCs Because of the way quasi orthogonal matrices are constructed, if two codewords $\mathbf{E}$ and $\mathbf{C}$ each contain one point from the constellation, then $\det(\tilde{E}) = 0$, which means the QO-STBC fails the rank criterion. This implies that in some cases QO-STBCs will have bad diversity gain. A way to improve on this is to use rotated variations of the base constellation to prevent rank deficiencies and achieve good diversity gain [SP03, SX04, WX05, XL05].
Linear Dispersion Codes The BLAST architecture achieves high multiplexing gain at the expense of diversity gain. O-STBCs in contrast achieve high diversity gain at the expense of multiplexing gain. Linear dispersion codes (LDCs) try to achieve a little of both. LDCs are derived through numerical optimization to determine which basis matrices are optimal relative to some criterion that balances diversity and multiplexing gain. Several LDCs have been proposed, including
1. Hassibi and Hochwald LDCs [HH01]
2. Heath and Sandhu LDCs [Hea01, San02]
Algebraic STBCs The Alamouti code works by transmitting two symbols and then their conjugates arranged in the appropriate way. Algebraic codes also transmit each symbol twice, but instead of transmitting a conjugate they transmit a rotated version of the first set of symbols. In terms of the codeword matrix this can be written as
$$\mathbf{C} = \begin{bmatrix} u_1 & \phi^{1/2} v_1 \\ \phi^{1/2} v_2 & u_2 \end{bmatrix}$$ (103)
with
$$\begin{bmatrix} u_1 \\ u_2 \end{bmatrix} = M_1 \begin{bmatrix} c_1 \\ c_2 \end{bmatrix}, \qquad \begin{bmatrix} v_1 \\ v_2 \end{bmatrix} = M_2 \begin{bmatrix} c_3 \\ c_4 \end{bmatrix}$$ (104)
$M_1$ and $M_2$ are unitary matrices that represent the rotations, and the constellation points come from QAM. Fundamentally, designing an algebraic code comes down to choosing the appropriate matrices $M_1$, $M_2$, and $\phi$.
$B_{2,\phi}$ code In this code [DTB02]
M
1
= M
2
=
1
2
_
1 e

1 e
−jω
_
(105)
and ω is chosen by numerical optimization to fit the given constellation. Finally, φ = e

.
Threaded Algebraic Space-Time Code (TAST) This code [GD03] is similar to the $B_{2,\phi}$ code, with
$$M_1 = M_2 = \frac{1}{2} \begin{bmatrix} 1 & e^{j\pi/4} \\ 1 & e^{-j\pi/4} \end{bmatrix}$$ (106)
Tilted QAM This code [YW03] is given by
$$M_i = \frac{1}{\sqrt{2}} \begin{bmatrix} \cos\omega_i & \sin\omega_i \\ -\sin\omega_i & \cos\omega_i \end{bmatrix}$$ (107)
This choice of $M_i$ is literally a rotation matrix that rotates points about the origin by $\omega_i$ radians. Optimization methods can be used to find $\phi$.
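Golden Code This code [BRV05] is given by
$$M_1 = \frac{1}{\sqrt{10}} \begin{bmatrix} \alpha & \alpha\theta \\ \bar{\alpha} & \bar{\alpha}\bar{\theta} \end{bmatrix}$$ (108)
$$M_2 = \frac{1}{\sqrt{10}} \begin{bmatrix} 1 & 0 \\ 0 & j \end{bmatrix}$$ (109)
$\alpha$ and $\theta$, together with their conjugates $\bar{\alpha}$ and $\bar{\theta}$, are chosen in terms of the golden ratio $\frac{1+\sqrt{5}}{2}$ and the constellation.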
The figure below shows how these space-time codes compare to the optimal diversity-multiplexing
tradeoff:
Figure 9: Diversity-Multiplexing Tradeoff For Several Techniques [OC06]
The figure below shows the error performance of several space-time codes:
Figure 10: Error Performance For Several Techniques [OC06]
6.3 Bell Labs Space Time Architectures
The sections on the MIMO channel have demonstrated that MIMO can provide both a degree of freedom gain (increased capacity) and a diversity gain (better error performance). The Diagonal and Vertical Bell Labs Space Time Architectures (D-BLAST/V-BLAST) suggest general architectures to achieve the gains of MIMO. The general idea of the BLAST architectures is to multiplex several streams of symbols (possibly demultiplexed from one original stream) onto the multiple antennas and then receive and decode the streams. Historically, G. Foschini suggested the D-BLAST architecture first and V-BLAST was developed later as a simplification. However, it makes more sense to present V-BLAST first and then discuss how D-BLAST extends it.
6.3.1 V-BLAST
The general architecture of V-BLAST is described in the figure below [GFVW99]
Figure 11: VBLAST Architecture
The independent streams are multiplexed by the matrix Q onto the transmit antennas. At the
receiver the streams are decoded jointly or individually. In V-BLAST there is a large degree of
freedom in choosing the exact receiver structure. The choice of receiver structure affects error
rates, capacity, and the complexity of decoding. The design of efficient V-BLAST receivers is
an active area of research.
There are two natural choices for $Q$ depending on whether there is CSIT or not. If there is CSIT, then the matrix $V$ from the SVD $H = U \Sigma V^{*}$ can be used. At the receiver the received vector $y$ is multiplied by $U^{*}$, where $U$ is also taken from the SVD of $H$. These actions create an equivalent channel model:
$$\tilde{y} = \Sigma \tilde{x} + \tilde{v}$$ (110)
The complex MIMO channel is reduced to several parallel scalar channels with each subchannel carrying one stream. The action of $Q$ is to rotate the input streams, so that the action of the channel can be expressed in a simple form. Sometimes a system provides a codebook of $Q$ matrices that the transmitter can use. The feedback from the receiver is then just an index into the codebook that tells the transmitter which $Q$ to use. This form of feedback massively reduces the required bandwidth in the feedback channel.
If there is no CSIT, then the situation is considerably more complicated and interesting. In this case the best choice for $Q$ is simply the identity matrix $I_{n_t}$. The choice of receiver then becomes an interesting problem, and there are many options, each with different tradeoffs.
V-BLAST Receiver Structures There are two general steps in the V-BLAST receiver. The first is demodulation, in which the receiver estimates what symbol was sent and hence which bits were sent. The next step is decoding, in which any codes that were applied to individual streams are decoded. Basically any convolutional or block code can be applied to an individual stream, so we are primarily interested in different architectures for demodulation. The optimal V-BLAST receiver is the ML receiver that jointly detects the streams. The ML receiver estimates the transmitted streams by the rule [TV05]
$$\hat{s} = \arg\min_{s \in \mathcal{C}} \|y - Hs\|^2$$ (111)
Practically what this method does is pick the closest point to the received vector in the lattice of points formed by $Hs$, where $s$ is a point in the original constellation. This problem is known as the integer least squares problem. Although this method is optimal it is computationally complex (NP-hard), as the search must be performed over all possible transmit vectors. This computational complexity generally makes it infeasible to use an ML detector.
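To make the cost of rule (111) concrete, the sketch below performs the exhaustive search for a small real-valued example. The channel, alphabet, and noise values are invented for illustration; the loop visits every one of the $|\text{alphabet}|^{n_t}$ candidate vectors, which is the exponential growth that makes the ML detector impractical for large systems.

```python
import itertools

def ml_detect(y, H, alphabet):
    """Exhaustive ML detection: minimize ||y - H s||^2 over all s in alphabet^n_t."""
    n_r, n_t = len(H), len(H[0])
    best, best_metric = None, float("inf")
    for s in itertools.product(alphabet, repeat=n_t):
        metric = sum((y[i] - sum(H[i][j] * s[j] for j in range(n_t))) ** 2
                     for i in range(n_r))
        if metric < best_metric:
            best, best_metric = list(s), metric
    return best, best_metric

# Hypothetical 2x2 real channel with BPSK streams.
H = [[1.0, 0.4], [0.3, 0.9]]
s_true = [1, -1]
noise = [0.1, -0.2]
y = [sum(H[i][j] * s_true[j] for j in range(2)) + noise[i] for i in range(2)]

detected, metric = ml_detect(y, H, alphabet=(-1, 1))
print(detected, metric)
```

With two BPSK streams there are only four candidates, but with 16-QAM on four antennas the same loop would already visit $16^4 = 65536$ vectors per received sample.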
Sphere Decoding Although the ML detector is computationally infeasible in many practical systems, there has been considerable interest in algorithms that are similar to ML detection in methodology and performance but with considerably less complexity. In addition, these ML-like algorithms can feed soft decisions to the decoders to improve their performance. Sphere decoding is one such algorithm [VB99].
The basic idea behind sphere decoding is to look only at points within a sphere of radius d about
the received vector and then choose the closest point inside the sphere [HV05]. If the sphere
actually contains any points, then obviously it must contain the closest point, which is what
ML detection would pick as an estimate of the transmitted lattice point. In this case sphere
decoding agrees with ML detection.
Figure 12: Idea Behind Sphere Decoding [HV05]
This process reduces the search space and necessary number of computations. In addition, since
the transmitted vector is corrupted by AWGN, the actual transmitted lattice point is likely to
be close by the received vector and in the sphere. Then there are two key problems that sphere
decoding has to deal with [HV05].
1. How to find lattice points inside the sphere?
The detector cannot compare the received vector to every point in the lattice to find the points inside the sphere, or it would be performing an exhaustive search offering no advantage over normal ML detection.
2. How to choose the sphere radius?
If $d$ is too large, then the detector considers too many points. If $d$ is too small, then the sphere may contain no points. One way to choose the sphere radius is to compute the Babai estimate $\hat{s}_B$ for the transmitted symbol. This estimate starts from the least squares solution that is not constrained to the lattice,
$$\arg\min_{s} \|y - Hs\|^2$$ (112)
and rounds each coordinate to the nearest lattice point. Then choose $d = \|y - H\hat{s}_B\|$, which guarantees that the sphere contains at least one lattice point. There are other heuristic methods to choose $d$.
A solution to problem number one above is based on a simple observation: the problem is
difficult in general but easy in one dimension. In one dimension the sphere is simply an interval,
so the problem reduces to finding the lattice points inside this interval. Now the algorithm
proceeds inductively by assuming that all k-dimensional points within the sphere of radius d
have been found. Then the set of k+1-dimensional points that lies within radius d is an interval,
which is the easy one-dimensional problem. This process continues until the full dimension of
the search space is reached. This process is usually visualized as a tree where the kth level of
the tree corresponds to the points of dimension k inside the sphere of radius d.
Figure 13: Tree for Sphere Decoding [HV05]
To see exactly how sphere decoding works suppose the lattice we are working on is the integer lattice $\mathbb{Z}^m$ [HV05]. Fix a sphere radius $d$. Suppose the channel matrix $H \in \mathbb{R}^{n \times m}$ and that $n \ge m$. The goal is to find the points $s \in \mathbb{Z}^m$ such that
$$\|y - Hs\|^2 \le d^2$$ (113)
where $y$ is the received vector. The algorithm proceeds first by calculating the QR factorization of the matrix $H$:
$$H = Q \begin{bmatrix} R \\ 0_{(n-m) \times m} \end{bmatrix}$$ (114)
where $Q$ is an $n \times n$ orthogonal matrix and $R$ is an $m \times m$ upper triangular matrix. This decomposition will make later calculations simpler. Partition the orthogonal matrix $Q$ as
$$Q = \begin{bmatrix} Q_1 & Q_2 \end{bmatrix}$$ (115)
with $Q_1 \in \mathbb{R}^{n \times m}$ and $Q_2 \in \mathbb{R}^{n \times (n-m)}$. Then the points inside the sphere satisfy:
$$d^2 \ge \left\| y - \begin{bmatrix} Q_1 & Q_2 \end{bmatrix} \begin{bmatrix} R \\ 0_{(n-m) \times m} \end{bmatrix} s \right\|^2 = \left\| \begin{bmatrix} Q_1^{*} \\ Q_2^{*} \end{bmatrix} y - \begin{bmatrix} R \\ 0 \end{bmatrix} s \right\|^2 = \|Q_1^{*} y - Rs\|^2 + \|Q_2^{*} y\|^2$$
This expression can be rearranged to the condition:
$$d^2 - \|Q_2^{*} y\|^2 \ge \|Q_1^{*} y - Rs\|^2$$ (116)
Define $\tilde{d}^2 = d^2 - \|Q_2^{*} y\|^2$ and $\tilde{y} = Q_1^{*} y$. Then the condition to be in the sphere is given by
$$\tilde{d}^2 \ge \|\tilde{y} - Rs\|^2$$ (117)
$$= \sum_{i=1}^{m} \left( \tilde{y}_i - \sum_{j=i}^{m} R_{ij} s_j \right)^2$$ (118)
The sum can be written term by term as
$$\tilde{d}^2 \ge (\tilde{y}_m - R_{mm} s_m)^2 + (\tilde{y}_{m-1} - R_{m-1,m} s_m - R_{m-1,m-1} s_{m-1})^2 + \cdots$$ (119)
We observe that the first term depends on only $\{s_m\}$, the second term depends on only $\{s_{m-1}, s_m\}$, and so on. Then the following is a necessary condition for any point $s$ to be in the sphere:
$$\tilde{d}^2 \ge (\tilde{y}_m - R_{mm} s_m)^2$$ (120)
Basically $R_{mm} s_m$ must be within $\tilde{d}$ of $\tilde{y}_m$. Finding the integers that satisfy this necessary condition is easy; taking $R_{mm} > 0$, they are simply the integers
$$\left\lceil \frac{-\tilde{d} + \tilde{y}_m}{R_{mm}} \right\rceil \le s_m \le \left\lfloor \frac{\tilde{d} + \tilde{y}_m}{R_{mm}} \right\rfloor$$ (121)
The key step in this process is how to proceed from finding the $s_m$ in the sphere to finding which $\{s_{m-1}, s_m\}$ are in the sphere. This is done by ensuring that the first two terms in equation 119 sum to at most $\tilde{d}^2$:
$$\tilde{d}^2 \ge (\tilde{y}_m - R_{mm} s_m)^2 + (\tilde{y}_{m-1} - R_{m-1,m} s_m - R_{m-1,m-1} s_{m-1})^2$$ (122)
To make use of this condition proceed as follows: For each $s_m$ define
$$\tilde{d}_{m-1}^2 = \tilde{d}^2 - (\tilde{y}_m - R_{mm} s_m)^2$$ (123)
Then we can obtain a condition that $s_{m-1}$ must satisfy to be in the sphere:
$$\left\lceil \frac{-\tilde{d}_{m-1} + \tilde{y}_{m-1} - R_{m-1,m} s_m}{R_{m-1,m-1}} \right\rceil \le s_{m-1} \le \left\lfloor \frac{\tilde{d}_{m-1} + \tilde{y}_{m-1} - R_{m-1,m} s_m}{R_{m-1,m-1}} \right\rfloor$$ (124)
By applying this method to each $s_m$ the points $\{s_{m-1}, s_m\}$ inside the sphere of radius $d$ can be found. This process can be continued until the full $m$-dimensional problem has been solved. It is clear why a tree is an appropriate structure to represent the operation of sphere decoding, since each node gives rise to some number of children (possibly zero) in the next iteration, all of which are inside the sphere as one more dimension of the problem is solved. It is also clear that if we choose the radius to be too small one of the conditions like equation 124 may not be satisfied by any integer and thus no points are in the sphere. If the sphere radius is too large, then too many points may satisfy equation 124, making the search for the closest point expensive.
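The recursion in equations (120) through (124) can be sketched as a depth-first tree search. The code below assumes the QR step has already been done, so it takes the upper triangular $R$ and the rotated vector $\tilde{y}$ directly; the function name, the example matrix, and the radius are illustrative. Sorting the two interval endpoints lets the same bounds work when a diagonal entry of $R$ is negative.

```python
import math

def sphere_points(R, y_t, radius_sq):
    """Enumerate all integer vectors s with ||y_t - R s||^2 <= radius_sq.

    R is m x m upper triangular with nonzero diagonal; y_t plays the role of
    the rotated received vector. Returns a list of (s, squared distance).
    """
    m = len(R)
    found = []

    def search(level, s, remaining):
        # Residual of row `level` once coordinates level+1..m-1 are fixed.
        resid = y_t[level] - sum(R[level][j] * s[j] for j in range(level + 1, m))
        bound = math.sqrt(remaining)
        lo, hi = sorted(((resid - bound) / R[level][level],
                         (resid + bound) / R[level][level]))
        for cand in range(math.ceil(lo), math.floor(hi) + 1):
            s[level] = cand
            left = remaining - (resid - R[level][level] * cand) ** 2
            if left < 0:          # guard against rounding at the boundary
                continue
            if level == 0:
                found.append((list(s), radius_sq - left))
            else:
                search(level - 1, s, left)

    search(m - 1, [0] * m, radius_sq)
    return found

R = [[2.0, 0.5], [0.0, 1.5]]      # hypothetical upper triangular factor
y_t = [1.2, -2.8]                 # hypothetical rotated received vector
inside = sphere_points(R, y_t, radius_sq=9.0)
closest = min(inside, key=lambda item: item[1])
print(len(inside), closest)
```

Each call to `search` corresponds to one level of the tree: the last coordinate is fixed first, the remaining squared radius shrinks as in (123), and branches whose interval contains no integer simply end.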
Non-Joint Detection Besides joint detection there is a wealth of detectors that detect individual streams from the received signal and do not attempt to decode all the streams simultaneously. Consider trying to decode one stream $x_k$. The system in this case can be modeled as
$$y[n] = h_k x_k[n] + \sum_{i \neq k} h_i x_i[n] + v[n]$$ (125)
where $h_i$ is the $i$th column of the channel matrix $H$. In this system there is a stream of interest plus several interfering streams, represented by the sum, plus a noise term. To successfully decode the stream of interest the receiver must deal with both the interference term and the noise term.
Zero Forcing Nulling At high SNR performance will be interference limited, not noise limited [GFVW99]. ZF-Nulling attempts to remove all the interfering terms in the sum to leave only the stream of interest. This can be done linearly with a single vector multiplication. The weighting vector $\mathbf{q}_k$ to decode the $k$th stream satisfies
$$\mathbf{q}_k^T \mathbf{h}_j = \delta_{kj} \tag{126}$$
where $\delta_{kj}$ is the Kronecker delta, which is 1 when $k = j$ and 0 otherwise. Then
$$\begin{aligned}
\mathbf{q}_k^T \mathbf{y}[n] &= \mathbf{q}_k^T \mathbf{h}_k x_k[n] + \sum_{i \neq k} \mathbf{q}_k^T \mathbf{h}_i x_i[n] + \mathbf{q}_k^T \mathbf{v}[n] \\
&= \delta_{kk} x_k[n] + \sum_{i \neq k} \delta_{ki} x_i[n] + \mathbf{q}_k^T \mathbf{v}[n] \\
&= x_k[n] + \mathbf{q}_k^T \mathbf{v}[n]
\end{aligned}$$
This weighting vector has an obvious geometric interpretation; the weighting vector projects the received vector $\mathbf{y}$ onto a subspace orthogonal to $\mathbf{h}_1, \ldots, \mathbf{h}_{k-1}, \mathbf{h}_{k+1}, \ldots, \mathbf{h}_{n_t}$.
Figure 14: Zero Forcing Nulling MIMO
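The weighting vectors are obtained from the pseudoinverse of $\mathbf{H}$, given by $\mathbf{H}^\dagger = (\mathbf{H}^* \mathbf{H})^{-1} \mathbf{H}^*$. Since $\mathbf{H}^\dagger \mathbf{H} = \mathbf{I}$, the $k$th row of $\mathbf{H}^\dagger$ is $\mathbf{q}_k^T$, so it is not too difficult to compute the appropriate weighting vectors given the channel matrix $\mathbf{H}$. The output SNR for each stream using these weighting vectors is
$$\mathrm{SNR}_k = \frac{P}{\|\mathbf{q}_k\|^2 N_o} \tag{127}$$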
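The nulling condition of equation 126 and the SNR of equation 127 can be checked numerically. The following is a minimal sketch, assuming a square 2 × 2 channel (so that the pseudoinverse reduces to the ordinary inverse) and a noiseless received vector; the channel entries, the symbols, and the values of P and N_o are illustrative assumptions, not values from the text.

```python
# Zero-forcing nulling for a 2x2 channel (values are illustrative, not from the text).

def inverse_2x2(M):
    """Inverse of a 2x2 complex matrix given as nested lists."""
    (a, b), (c, d) = M
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def weighted_sum(q, y):
    """Compute q^T y (plain transpose, as in equation 126)."""
    return sum(qi * yi for qi, yi in zip(q, y))

H = [[1 + 1j, 0.5 + 0j], [0.2j, 1 - 0.5j]]   # columns are h_1 and h_2
x = [1 + 0j, -1 + 0j]                         # transmitted symbols
y = [H[r][0] * x[0] + H[r][1] * x[1] for r in range(2)]  # noiseless y = Hx

# For a square invertible H the pseudoinverse equals the inverse;
# row k of it is the weighting vector q_k^T.
Q = inverse_2x2(H)
x_hat = [weighted_sum(Q[k], y) for k in range(2)]

# Output SNR of equation 127 for assumed values of P and N_o.
P, N_o = 1.0, 0.1
snr = [P / (sum(abs(q) ** 2 for q in Q[k]) * N_o) for k in range(2)]

print(x_hat)
print(snr)
```

With noise added to y, each estimate would be perturbed by the term q_k^T v[n], whose power grows with the squared norm of q_k; this is what equation 127 captures.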
ZF-Nulling with Successive Interference Cancellation The SNR is inversely proportional to $\|\mathbf{q}_k\|^2$, so if $\|\mathbf{q}_k\|^2$ can be reduced the SNR will be increased. Results from linear algebra indicate that the higher the dimension of the space that $\mathbf{q}_k$ must be orthogonal to, the larger $\|\mathbf{q}_k\|^2$ is. So if $\mathbf{q}_k$ must be orthogonal to fewer vectors, then $\|\mathbf{q}_k\|^2$ will be reduced. Successive interference cancellation (SIC) can reduce the dimension and increase the SNR. The diagram below shows the operation of SIC.
Figure 15: Successive Interference Cancellation [TV05]
With this scheme, as each stream is decoded it is subtracted from the received vector. As a result the subtracted stream does not interfere with any subsequent streams. So $\mathbf{q}_k$ must only be orthogonal to $\mathbf{h}_{k+1}, \ldots, \mathbf{h}_{n_t}$. The reduced number of vectors means $\|\mathbf{q}_k\|^2$ is reduced and $\mathrm{SNR}_k$ is increased.
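The reduction in the norm of the weighting vector can be illustrated for two streams. This is a sketch under stated assumptions: a 2 × 2 channel with illustrative entries, stream 1 decoded first and cancelled perfectly, so that the weighting vector for stream 2 no longer has to be orthogonal to anything and only has to satisfy q_2^T h_2 = 1.

```python
# Effect of SIC on the weighting-vector norm for stream 2 (illustrative 2x2 channel).

def inverse_2x2(M):
    """Inverse of a 2x2 complex matrix given as nested lists."""
    (a, b), (c, d) = M
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def norm_sq(v):
    """Squared Euclidean norm of a complex vector."""
    return sum(abs(z) ** 2 for z in v)

H = [[1 + 1j, 0.5 + 0j], [0.2j, 1 - 0.5j]]   # columns are h_1 and h_2
h1 = [H[0][0], H[1][0]]
h2 = [H[0][1], H[1][1]]

# Without SIC: q_2^T is row 2 of the inverse, so it is orthogonal to h_1.
q2_zf = inverse_2x2(H)[1]

# With SIC: stream 1 has been cancelled, so q_2 only needs q_2^T h_2 = 1.
# The minimum-norm choice is the conjugate of h_2 scaled by 1 / ||h_2||^2.
q2_sic = [z.conjugate() / norm_sq(h2) for z in h2]

# By equation 127 the SNR scales as 1 / ||q_2||^2, so this ratio is the SNR gain.
gain = norm_sq(q2_zf) / norm_sq(q2_sic)
print(gain)
```

The gain is never below 1, by the Cauchy-Schwarz inequality, and equals 1 only when the two channel columns are already orthogonal.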
One practical issue when implementing SIC is the order of cancellation. The last decoded stream has the least interference and achieves the best performance. It has been demonstrated that a greedy choice of order is optimal relative to the maximin criterion [GFVW99]. This means that the $k$th stream to be decoded should be chosen from the remaining streams as the one that will achieve the highest SNR if it is decoded now. The maximin criterion means that the smallest $\mathrm{SNR}_k$ is maximized by choosing the optimal order.
The major drawback to SIC is error propagation. Mistakes at the beginning of the decoding chain
can introduce mistakes later on. So if one stream is inaccurately decoded, then all subsequent
streams will likely be decoded inaccurately.
Matched Filter At very low SNR noise is the dominant impairment, so a matched filter can be used to deal with the noise. In the MIMO case the matched filter for each stream is simply maximal ratio combining (MRC) performed on the appropriate column of $\mathbf{H}$.
MMSE Receiver The matched filter performs well at low SNR and ZF-nulling performs well
at high SNR. But at high SNR the matched filter has bad performance and at low SNR ZF-
nulling has bad performance. So naturally one may wonder if there is a receiver that operates
well at both low and high SNR. The MMSE receiver is such a receiver [TV05].
To understand how the MMSE receiver works, consider the following SIMO system modeled as
$$\mathbf{y} = \mathbf{h}x + \mathbf{z} \tag{128}$$
with $\mathbf{z}$ colored noise having invertible correlation matrix $\mathbf{K}_z$. The first operation is to whiten the noise by multiplying by $\mathbf{K}_z^{-1/2}$. Then the system becomes
$$\mathbf{K}_z^{-1/2}\mathbf{y} = \mathbf{K}_z^{-1/2}\mathbf{h}x + \mathbf{K}_z^{-1/2}\mathbf{z} \tag{129}$$
Then apply a matched filter $(\mathbf{K}_z^{-1/2}\mathbf{h})^*$ to yield the system
$$\mathbf{h}^*\mathbf{K}_z^{-1}\mathbf{y} = (\mathbf{h}^*\mathbf{K}_z^{-1}\mathbf{h})x + \mathbf{h}^*\mathbf{K}_z^{-1}\mathbf{z} \tag{130}$$
Thus the receiver simply multiplies the received signal by $\mathbf{h}^*\mathbf{K}_z^{-1}$ and performs normal demodulation. This is the MMSE receiver, which maximizes the output SNR while minimizing the mean squared error between the estimate of $x$ and $x$ itself.
For V-BLAST the corrupting non-white noise is the interference terms plus the additive noise. The covariance matrix for this noise is given by
$$\mathbf{K}_{z_k} = N_o \mathbf{I}_{n_r} + \sum_{i \neq k} P_i \mathbf{h}_i \mathbf{h}_i^* \tag{131}$$
A similar derivation shows that the MMSE weighting vector in this case is
$$\mathbf{q}_k = \left( N_o \mathbf{I}_{n_r} + \sum_{i \neq k} P_i \mathbf{h}_i \mathbf{h}_i^* \right)^{-1} \mathbf{h}_k \tag{132}$$
The MMSE receiver is a tradeoff between the matched filter and ZF-Nulling. At low SNR
$$\mathbf{K}_{z_k} \approx N_o \mathbf{I}_{n_r} \tag{133}$$
so the weighting vector is proportional to $\mathbf{h}_k$, which is a matched filter. At high SNR
$$\mathbf{K}_{z_k} \approx \sum_{i \neq k} P_i \mathbf{h}_i \mathbf{h}_i^* \tag{134}$$
and $\mathbf{q}_k$ approaches the ZF-Nulling weighting vector for the $k$th stream, which is obtained from the pseudoinverse of $\mathbf{H}$. Thus the MMSE receiver is like ZF-Nulling at high SNR. In addition, the MMSE receiver has good performance in the region between high and low SNR. If SIC is used in conjunction with MMSE, then MMSE-SIC can achieve the channel capacity.
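The two limiting regimes of equation 132 can be observed numerically. The sketch below assumes two receive antennas and a single interferer, so the covariance matrix is 2 × 2 and can be inverted in closed form; the channel vectors, the interferer power, and the two noise levels are illustrative assumptions.

```python
# MMSE weighting vector of equation 132 for n_r = 2 and one interferer (illustrative values).

def mmse_weight(h_k, h_i, P_i, N_o):
    """q_k = (N_o I + P_i h_i h_i^*)^{-1} h_k for 2-element channel vectors."""
    K = [[(N_o if r == c else 0.0) + P_i * h_i[r] * h_i[c].conjugate()
          for c in range(2)] for r in range(2)]
    det = K[0][0] * K[1][1] - K[0][1] * K[1][0]
    K_inv = [[K[1][1] / det, -K[0][1] / det],
             [-K[1][0] / det, K[0][0] / det]]
    return [sum(K_inv[r][c] * h_k[c] for c in range(2)) for r in range(2)]

def inner(a, b):
    """Inner product a^* b."""
    return sum(ai.conjugate() * bi for ai, bi in zip(a, b))

def cos_angle(a, b):
    """|a^* b| / (||a|| ||b||): 1 means parallel, 0 means orthogonal."""
    return abs(inner(a, b)) / (abs(inner(a, a)) ** 0.5 * abs(inner(b, b)) ** 0.5)

h1 = [1 + 0j, 0.5j]          # stream of interest
h2 = [0.8 + 0j, 1 - 0.3j]    # interfering stream

q_low = mmse_weight(h1, h2, P_i=1.0, N_o=1e4)    # noise dominated (low SNR)
q_high = mmse_weight(h1, h2, P_i=1.0, N_o=1e-6)  # interference dominated (high SNR)

align_low = cos_angle(q_low, h1)    # near 1: behaves like the matched filter
null_high = cos_angle(q_high, h2)   # near 0: nulls the interferer like ZF
print(align_low, null_high)
```

Here the receiver applies q_k^* to the received vector, following the form h^* K_z^{-1} of equation 130.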
6.3.2 D-BLAST
Consider the $k$th stream. It is transmitted by one antenna and received by all $n_r$ receive antennas. Thus the maximum possible diversity gain for any individual stream is $n_r$, and there is a limit to how much MIMO diversity techniques can protect a stream [Fos96]. If a SIC structure is used with either the MMSE receiver or ZF-nulling and one stream is incorrectly decoded, then subsequent streams will likely be incorrectly decoded as well. The main reason for this problem is that no coding is performed spatially across the multiple streams. Coding across streams would ensure that each stream is reliably decoded, but in V-BLAST each stream must already be decoded before a spatial code across the streams could be decoded. The solution to this problem is to alter the way the streams are transmitted.
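Consider the case with two transmit antennas. Suppose that there are two separate streams, each consisting of two blocks. Denote the blocks by $a^{(i)}$ and $b^{(i)}$ for $i = 1, 2$. Then the D-BLAST codeword is
$$\mathcal{C} = \begin{bmatrix} a^{(1)} & b^{(1)} & \\ & a^{(2)} & b^{(2)} \end{bmatrix} \tag{135}$$
where each row corresponds to a transmit antenna, each column to a block period, and an empty entry means that nothing is transmitted.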
From this codeword matrix it is obvious where D-BLAST gets its name from, since the layers are now diagonal. The receiver works as follows:
1. First receive $a^{(1)}$ with MRC.
2. Next receive $a^{(2)}$ with MMSE or ZF-nulling, while ignoring $b^{(1)}$.
3. Next decode the spatial code across the first layer $[a^{(1)}\ a^{(2)}]$.
4. Now $a^{(2)}$ has been reliably decoded, so it can be cancelled out and $b^{(1)}$ can be received.
5. Finally, $b^{(2)}$ can be received with MRC. Then the second layer $[b^{(1)}\ b^{(2)}]$ can be decoded.
Now both streams have been decoded reliably. The key observation is that for a single layer
if one of the blocks for one stream is initially decoded incorrectly, there is still a chance to fix
the error with the code applied across the layer. The major price to pay for using D-BLAST
is the lost capacity during the startup process due to the blank spots in the codeword. For
example, during the first block the second transmit antenna transmits nothing, and so some
capacity is lost. Finally, there is also the cost in implementation complexity of applying coding
and decoding across streams.
6.4 Space-Time Trellis Codes
A space-time trellis code (STTC) is an extension of normal convolutional codes to multiple
antennas [TSC98]. The key idea behind a STTC is to make the output of the encoder a function
of the input bits and the state of the encoder, which is in turn a function of the previous inputs.
Trellis codes provide better error performance compared to block codes and coding gain at the
expense of implementation complexity.
6.4.1 Trellis Representation
Suppose $B$ bits are input into the encoder, which has $2^{\nu}$ states. A trellis diagram is a way of representing the action of a STTC [OC06]. The diagram below shows a trellis. The number of nodes in each column is the number of states in the code. The left column represents the current state of the code and the right column represents the next state. The possible outputs are listed on the left-hand side of the trellis. If the output is 02, for example, then the 0th symbol is sent on the first antenna and the 2nd symbol is sent on the second antenna. The transition arrows are driven by the input bits. There are $2^{B}$ arrows from each state on the left to states on the right, one for each possible combination of inputs.
Figure 16: Trellis Coding [OC06]
The decoder’s job is to estimate which sequence, that is, which path through the trellis, was sent. One way to do this is with a Maximum Likelihood Sequence Estimator (MLSE). A well-known algorithm, the Viterbi algorithm, efficiently estimates which sequence was sent.
Trellis Complexity There is a fundamental lower bound on the complexity of a STTC. A STTC with $B$ input bits and minimum rank $r_{\min}$ has at least $2^{B(r_{\min}-1)}$ states. As the number of states increases the complexity of decoding increases, so this lower bound on the number of states puts a lower bound on the possible complexity.
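As a quick check of this bound, take illustrative values that are not from the text: $B = 2$ input bits per transition and a code designed for minimum rank $r_{\min} = 2$ or $r_{\min} = 3$.

```latex
\begin{aligned}
r_{\min} = 2:&\quad 2^{B(r_{\min}-1)} = 2^{2(2-1)} = 4 \text{ states} \\
r_{\min} = 3:&\quad 2^{B(r_{\min}-1)} = 2^{2(3-1)} = 16 \text{ states}
\end{aligned}
```

So under these assumed parameters, raising the minimum rank by one multiplies the minimum number of states by $2^{B} = 4$.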
6.4.2 Delay-Diversity Scheme
This is one of the simplest trellis codes to achieve diversity [Wit93, SW94]. The codeword for $T$ transmitted symbols is given by
$$\mathcal{C} = \frac{1}{\sqrt{2}} \begin{bmatrix} c_1 & c_2 & \cdots & c_T & 0 \\ 0 & c_1 & \cdots & c_{T-1} & c_T \end{bmatrix} \tag{136}$$
The trellis diagram below represents this code
The effect of this code is to convert spatial diversity to frequency diversity. Consider a 2 × 1 MIMO system. When $c_1$ is transmitted during the first symbol period, it sees the channel $h_1$. When $c_1$ is transmitted during the second symbol period, it sees the channel $h_2$. This is equivalent to passing $c_1$ through a frequency selective channel with two taps in the delay domain: $h_1$ and $h_2$. So spatial diversity becomes frequency diversity by applying this code.
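The equivalence between delay diversity and a two-tap channel can be verified directly. The sketch below assumes a noiseless 2 × 1 system with flat channel gains h1 and h2; the function names, symbol values, and channel gains are hypothetical choices made for illustration.

```python
from math import sqrt

def delay_diversity_codeword(c):
    """2 x (T+1) codeword of equation 136: row 2 is row 1 delayed by one symbol."""
    scale = 1 / sqrt(2)
    row1 = [scale * s for s in c] + [0]
    row2 = [0] + [scale * s for s in c]
    return [row1, row2]

def miso_output(codeword, h1, h2):
    """Noiseless 2x1 output: y[t] = h1 * C[0][t] + h2 * C[1][t]."""
    return [h1 * a + h2 * b for a, b in zip(codeword[0], codeword[1])]

def two_tap_output(c, h1, h2):
    """Output of a single-antenna channel with delay taps (h1, h2) driven by c / sqrt(2)."""
    scale = 1 / sqrt(2)
    out = [0] * (len(c) + 1)
    for t, s in enumerate(c):
        out[t] += h1 * scale * s       # tap at delay 0
        out[t + 1] += h2 * scale * s   # tap at delay 1
    return out

c = [1, -1, 1j, -1j]                    # assumed symbols
h1, h2 = 0.7 + 0.2j, -0.4 + 0.9j        # assumed flat channel gains

y_miso = miso_output(delay_diversity_codeword(c), h1, h2)
y_siso = two_tap_output(c, h1, h2)
max_diff = max(abs(a - b) for a, b in zip(y_miso, y_siso))
print(max_diff)
```

Because the two outputs coincide, any equalizer or sequence estimator designed for a two-tap frequency selective channel can be reused at the receiver.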
7 Space-Time Coding for Frequency Selective Channels
There are two basic approaches to MIMO over frequency selective channels as in normal SISO
frequency selective channels:
1. Single carrier
2. Multicarrier - OFDM
Many modern wireless standards that use MIMO also use OFDM, so MIMO-OFDM is of par-
ticular interest.
7.1 Single Carrier
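In this case the system can be modeled as
$$\mathbf{y}_k = \sum_{l=0}^{L-1} \mathbf{H}[l]\,\mathbf{c}_{k-l} + \mathbf{v}_k \tag{137}$$
This complicated system involving a summation can be expressed as a simple system of the form
$$\mathbf{y}_k = \begin{bmatrix} \mathbf{H}[0] & \cdots & \mathbf{H}[L-1] \end{bmatrix} \begin{bmatrix} \mathbf{c}_k^T & \cdots & \mathbf{c}_{k-L+1}^T \end{bmatrix}^T + \mathbf{v}_k \tag{138}$$
which is similar to the narrowband MIMO case.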
7.2 MIMO-OFDM
MIMO-OFDM is an extension of normal OFDM to the MIMO case where there are multiple
antennas.
7.2.1 OFDM
OFDM uses the FFT and IFFT to decompose the wideband frequency selective channel into
several smaller narrowband frequency flat channels. The cyclic prefix is added to prevent ISI.
Figure 17: OFDM System Model [OC06]
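The system model can be expressed in the DFT frequency domain as
$$Y[k] = H[k]X[k] + V[k] \tag{139}$$
with $V[k]$ the corrupting noise. The product $H[k]X[k]$ corresponds to a circular convolution in the time domain, which can be expressed in matrix form as
$$\begin{bmatrix} y[0] \\ y[1] \\ \vdots \\ y[N-1] \end{bmatrix} = \begin{bmatrix} h[0] & h[N-1] & h[N-2] & \cdots & h[1] \\ h[1] & h[0] & h[N-1] & \cdots & h[2] \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ h[N-1] & h[N-2] & h[N-3] & \cdots & h[0] \end{bmatrix} \begin{bmatrix} x[0] \\ x[1] \\ \vdots \\ x[N-1] \end{bmatrix} + \begin{bmatrix} v[0] \\ v[1] \\ \vdots \\ v[N-1] \end{bmatrix}$$
Denote the circulant channel matrix above by $\mathbf{H}$. It can be diagonalized as $\mathbf{H} = \mathbf{T}\boldsymbol{\Lambda}\mathbf{T}^*$, where $\mathbf{T}$ is the matrix that performs the DFT. The matrix $\boldsymbol{\Lambda}$ is a diagonal matrix specific to each $\mathbf{H}$
$$\boldsymbol{\Lambda} = \begin{bmatrix} \lambda_1 & & & 0 \\ & \lambda_2 & & \\ & & \ddots & \\ 0 & & & \lambda_N \end{bmatrix} \tag{140}$$
but $\mathbf{T}$ is not specific to each $\mathbf{H}$.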
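The claim that the DFT diagonalizes the circulant channel matrix can be checked on a small example. This is a minimal sketch assuming N = 4, no noise, and illustrative tap and symbol values; it builds the circulant matrix explicitly, applies it to a block, and compares the DFT of the result with the per-subcarrier product of equation 139.

```python
import cmath

def dft(x):
    """Unnormalized N-point DFT of a sequence."""
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * cmath.pi * k * n / N) for n in range(N))
            for k in range(N)]

def circulant(h):
    """N x N circulant matrix with entry (r, c) equal to h[(r - c) mod N]."""
    N = len(h)
    return [[h[(r - c) % N] for c in range(N)] for r in range(N)]

def matvec(A, x):
    """Matrix-vector product for nested-list matrices."""
    return [sum(a * b for a, b in zip(row, x)) for row in A]

h = [1.0, 0.5, 0.25, 0.0]    # channel taps zero-padded to N = 4 (assumed values)
x = [1, -1, 1j, -1j]         # one block of time-domain samples (assumed values)

y = matvec(circulant(h), x)  # circular convolution in the time domain

Y, H_freq, X = dft(y), dft(h), dft(x)
# Noiseless version of equation 139: Y[k] = H[k] X[k] on every subcarrier.
max_err = max(abs(Y[k] - H_freq[k] * X[k]) for k in range(len(x)))
print(max_err)
```

The DFT of the taps plays the role of the diagonal entries of Λ, which is why only Λ, and not the transform, depends on the channel.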
7.2.2 Extension to MIMO-OFDM
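A MIMO-OFDM system can be modeled like a SISO OFDM system with the channel taps replaced by channel matrices [OC06]. First, start with the frequency selective MIMO channel
$$\mathbf{y}_k = \sum_{l=0}^{L-1} \mathbf{H}[l]\,\mathbf{x}_{k-l} + \mathbf{v}_k \tag{141}$$
Then append a cyclic prefix, $\mathbf{X}_g$, to prevent ISI to produce the modified system
$$\tilde{\mathbf{y}} = \mathbf{H}_g \begin{bmatrix} \mathbf{X}_g & \mathbf{X} \end{bmatrix} + \mathbf{v} \tag{142}$$
with the channel matrix given by
$$\mathbf{H}_g = \begin{bmatrix} \mathbf{H}[L-1] & \cdots & \mathbf{H}[1] & \mathbf{H}[0] & 0 & \cdots & 0 \\ 0 & \mathbf{H}[L-1] & \cdots & \mathbf{H}[1] & \mathbf{H}[0] & \cdots & 0 \\ \vdots & & \ddots & & & \ddots & \vdots \\ 0 & \cdots & 0 & \mathbf{H}[L-1] & \mathbf{H}[L-2] & \cdots & \mathbf{H}[0] \end{bmatrix} \tag{143}$$
As in SISO OFDM the cyclic prefix, which is necessary for practical implementation, can be removed from the analytical model. A large blockwise circulant matrix can represent the effective channel seen by the whole MIMO-OFDM codeword.
$$\mathbf{H}_{cp} = \begin{bmatrix} \mathbf{H}[0] & 0 & \cdots & 0 & \mathbf{H}[L-1] & \cdots & \mathbf{H}[1] \\ \vdots & & & & & & \vdots \\ 0 & \cdots & 0 & \mathbf{H}[L-1] & \cdots & \mathbf{H}[1] & \mathbf{H}[0] \end{bmatrix} \tag{144}$$
Since $\mathbf{H}_{cp}$ is blockwise circulant, it can be decomposed as
$$\mathbf{H}_{cp} = \mathbf{T}^* \boldsymbol{\Lambda}_{cp} \mathbf{T} \tag{145}$$
where $\mathbf{T}$ is the blockwise IDFT matrix and $\boldsymbol{\Lambda}_{cp}$ is block diagonal. Thus the complicated MIMO-OFDM channel can be regarded as a block-diagonal channel, with one narrowband MIMO channel per subcarrier, under the coordinate change given by the DFT. Given
$$\mathbf{H}_k = \sum_{l=0}^{L-1} \mathbf{H}[l]\, e^{-j 2\pi k l / T} \tag{146}$$
then ML detection is given by
$$\hat{\mathbf{X}} = \arg\min_{\mathbf{C}} \sum_{k=0}^{T-1} \| \mathbf{y}_k - \mathbf{H}_k \mathbf{c}_k \|^2 \tag{147}$$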
Like OFDM, MIMO-OFDM has issues with PAPR and frequency offset estimation. A block
diagram for MIMO-OFDM follows below:
Figure 18: MIMO OFDM [OC06]
7.2.3 Space-Frequency Coded MIMO-OFDM
For normal OFDM the frequency domain channel coefficients H[k] can be viewed as the channel
coefficients in a narrowband fast fading time channel. The frequency index k can be reinterpreted
as a time domain index. Thus codes designed for fast fading time channels can be applied across
the subcarriers. In MIMO-OFDM the same idea can be used to code across the subcarriers
[TV05].
7.2.4 Space-Time Coded MIMO-OFDM
This is the simplest MIMO-OFDM system with no coding across the subcarriers. Instead the OFDM part of the system chops the frequency selective channel into frequency flat channels on which normal space-time coding techniques can be applied. For example, in the $2 \times n_r$ case the Alamouti code can be used on each subcarrier through the following process:
1. Transmit $[c_1\ c_2]^T$ on a given tone during the first OFDM symbol
2. Transmit $[-c_2^*\ c_1^*]^T$ on the same tone during the second OFDM symbol
3. Perform normal Alamouti decoding
This idea certainly works, but it limits the system, since the channel has to remain static for the duration of two OFDM symbols. Depending on system parameters this may not be a reasonable assumption. In general all space-time codes discussed before assume the channel is static over the duration of a codeword, so this is a general problem in Space-Time Coded MIMO-OFDM.
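The three steps above can be sketched for a single receive antenna. The code assumes a noiseless channel whose per-tone gains stay constant over both OFDM symbols, which is exactly the static-channel assumption discussed in the text; the gains, the symbols, and the function names are illustrative.

```python
# Alamouti coding applied independently on each OFDM tone (2 transmit, 1 receive antenna).

def alamouti_encode(c1, c2):
    """Antenna vectors for the first and second OFDM symbol on one tone."""
    return (c1, c2), (-c2.conjugate(), c1.conjugate())

def alamouti_decode(r1, r2, h1, h2):
    """Linear combining that separates c1 and c2 when the channel is static."""
    gain = abs(h1) ** 2 + abs(h2) ** 2
    c1_hat = (h1.conjugate() * r1 + h2 * r2.conjugate()) / gain
    c2_hat = (h2.conjugate() * r1 - h1 * r2.conjugate()) / gain
    return c1_hat, c2_hat

# Per-tone channel gains and symbol pairs (assumed values).
tones = [
    {"h": (0.9 + 0.2j, -0.3 + 0.7j), "c": (1 + 1j, 1 - 1j)},
    {"h": (0.1 - 1.1j, 0.6 + 0.4j), "c": (-1 + 1j, -1 - 1j)},
]

decoded = []
for tone in tones:
    h1, h2 = tone["h"]
    first, second = alamouti_encode(*tone["c"])
    r1 = h1 * first[0] + h2 * first[1]      # received on this tone, first OFDM symbol
    r2 = h1 * second[0] + h2 * second[1]    # received on this tone, second OFDM symbol
    decoded.append(alamouti_decode(r1, r2, h1, h2))

print(decoded)
```

If the gains changed between the two OFDM symbols, the cross terms in the combiner would no longer cancel, which is the limitation noted above.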
7.2.5 Space-Time Frequency Coded MIMO-OFDM
In a Space-Time Frequency Coded MIMO-OFDM system coding is performed over all three
available dimensions: space, time, and frequency. Below are several examples of this idea.
Generalized Delay Diversity This code [GSP02] has matrix form
C =
1

2
_
¸
¸
_
c
1
c
2
c
T
0 0
0 c
1
c
2
c
T
0
0 c
1
c
2
c
T
0
0 0 c
1
c
2
c
T
_
¸
¸
_
(148)
This code provides a diversity gain of 3.
Lindskog-Paulraj Scheme This code [LP00] basically extends Alamouti in a natural way to MIMO-OFDM. The fundamental units of transmission are two blocks of length $T$: $c_1[k]$ and $c_2[k]$. The scheme is then
1. Send $[c_1[k]\ c_2[k]]^T$
2. Send $[-c_2^*[k]\ c_1^*[k]]^T$
3. Decode like Alamouti except use two independent MLSE estimators. This can be accomplished with two parallel copies of the Viterbi algorithm.
8 Multiuser MIMO
Historically MIMO was developed for use in point to point situations. However, MIMO can also
be used as a multiple access technique to allow multiple users to seamlessly share the spatial
channel. This type of MIMO is called Multiuser MIMO (MU-MIMO). The typical application of
MU-MIMO is in a cellular system with multiple antennas at the base station and only one or two
antennas at each mobile [GKHCS07]. The collection of antennas at all the mobile users in a cell
is regarded as one big antenna array. One of the key advantages of having a distributed array
comprised of all the mobiles is that the channel matrix rarely suffers from rank deficiencies, so
spatial multiplexing is almost always possible. However, in order to actually get the benefits
of MU-MIMO the base station needs CSIT or at least partial CSIT, which entails increased
complexity.
For a MU-MIMO system having $N$ transmit antennas at the base station and $U$ users, the $k$th user having $M_k$ antennas, the downlink (broadcast channel) for each user can be modeled as
$$\mathbf{y}_k = \mathbf{h}_k \sum_{l=1}^{N} x_l + \mathbf{v}_k \tag{149}$$
The uplink (multiple access channel, MAC) can be modeled as
$$\mathbf{y} = \sum_{k=1}^{U} \mathbf{h}_k x_k + \mathbf{v} \tag{150}$$
8.1 Precoding
Information theoretic results have shown that, using a type of coding called dirty paper coding (DPC) at the transmitter, $N$ user streams can be multiplexed and transmitted [SB07, GC80]. Effectively, the coding at the transmitter pre-cancels interference at the receivers, much as ZF-nulling removes interference in BLAST.
8.1.1 Linear Precoding
The downlink channel can be written in a simple form that makes explicit how other users’ streams produce interference.
$$\mathbf{y}_k = \mathbf{H}_k \mathbf{s}_k + \mathbf{H}_k \sum_{l=1,\, l \neq k}^{N} \mathbf{s}_l + \mathbf{v}_k \tag{151}$$
The simplest form of precoding is to multiply the transmit symbols by a matrix, $\mathbf{W}_k$, that will cancel out the interferers [SSH04].
$$\mathbf{y}_k = \mathbf{H}_k \mathbf{W}_k \mathbf{s}_k + \mathbf{H}_k \sum_{l=1,\, l \neq k}^{N} \mathbf{W}_l \mathbf{s}_l + \mathbf{v}_k \tag{152}$$
In the case when each user has one receive antenna this problem is identical to canceling interference in BLAST. So the proper choice for $\mathbf{W}_k$ is the $k$th column of the pseudoinverse of the effective channel matrix $\mathbf{H} = \begin{bmatrix} \mathbf{h}_1 & \mathbf{h}_2 & \cdots & \mathbf{h}_N \end{bmatrix}$.
8.1.2 Nonlinear Precoding
Nonlinear precoding is more like DPC than linear precoding and can produce better results at the cost of increased complexity. Well known nonlinear precoding methods include perturbation methods and Tomlinson-Harashima codes [PHS05, HPS05].
8.2 Scheduling
If the number of users $U$ is greater than the number of transmit antennas $N$, then the base station can’t transmit to all the users simultaneously. So at any given time the base station must choose some subset of the users to transmit to [GKHCS07]. The optimal scheduling algorithm is an exhaustive search over all possible combinations of users. This is not computationally feasible in general, so heuristic methods must be used to choose a subset of users. A simple choice is a greedy algorithm, which selects the $N$ users with the best channels.
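The greedy rule can be sketched in a few lines. The text does not say how the best channel is measured, so the example below assumes the squared norm of each user's channel vector as the metric; the user labels and channel values are hypothetical.

```python
def greedy_schedule(channels, num_antennas):
    """Pick the num_antennas users with the largest squared channel norm.

    The squared norm is an assumed quality metric, not one given in the text.
    """
    strength = {user: sum(abs(g) ** 2 for g in h) for user, h in channels.items()}
    ranked = sorted(strength, key=strength.get, reverse=True)
    return ranked[:num_antennas]

# Hypothetical channel vectors for U = 4 users and N = 2 base station antennas.
channels = {
    "u1": [0.1 + 0.2j, 0.3],
    "u2": [1.0, -0.5j],
    "u3": [0.7, 0.7],
    "u4": [0.2, 0.1j],
}

scheduled = greedy_schedule(channels, num_antennas=2)
print(scheduled)
```

This ranking ignores how correlated the selected users' channels are with each other, which an exhaustive search over subsets would take into account.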
8.3 Working with Partial CSIT
To achieve CSIT each user must feed back its channel estimate to the base station, which is difficult in practice and reduces capacity [GA04]. To combat this problem some research has been performed into MU-MIMO systems with only partial CSIT. Basic results have demonstrated that the gains of MU-MIMO can still be achieved with only partial CSIT, which entails less system complexity.
9 MIMO in Wireless Standards
Many emerging wireless standards provide for MIMO to provide both diversity and multiplexing
gain as needed. This section examines three prominent new wireless standards that employ
MIMO.
9.1 3GPP LTE
The Third Generation Partnership Project Long-Term Evolution (3GPP LTE) is the emerging
4G standard that is currently being implemented and tested. The major features of LTE are
outlined below [3GPP07, 3GPPRel8, 3GPPRel9]:
• High data rates - 100 Mbps in the downlink using 2 × 2 MIMO and 50 Mbps in the uplink using no MIMO
• Mobility - Best performance for 0-15 km/hr and good performance of 15-120 km/hr.
• Spectrum - No fixed spectrum size. Allowed sizes are 1.25, 1.6, 2.5, 5, 10, 15, and 20 MHz.
• OFDM - LTE uses OFDM with a variable number of subcarriers.
• IP Network - No circuit switched domain but all IP based network.
The downlink in LTE provides several options for using MIMO. The basic option for the downlink is two antennas at the base station and two at the mobile station. Extensions to LTE allow 4 × 2 and 4 × 4 MIMO. If the base station has CSIT, then there are two methods it can apply:
1. Pre-coding SDM - Since the base station knows the channel at the receiver, it can pre-code the transmitted symbols to prevent interference using the V matrix from the SVD of H.
2. Beamforming - Use some form of beamforming such as TMRC.
Without CSIT the base station can use Space-Frequency Block Coding by using the Alamouti code for each tone.
In the uplink MU-MIMO can be used with the proper scheduling. The baseline case assumes 1 × 2 and the extension is 1 × 4.
9.2 WiMAX
WiMAX was originally developed to address the last mile connection to the internet. It has
evolved to provide high data rate mobile data. The key features of WiMAX are outlined below
[IS04, IS05]:
• High data rates - 75 Mbps in 802.16d and 30 Mbps in 802.16e.
• Mobility - Introduced in 802.16e. Range up to 30 miles in 802.16e.
• Spectrum - No fixed spectrum size. Allowed sizes are 1.25, 2.5, 5, 10, and 20 MHz.
• OFDM - WiMAX uses OFDM with a variable number of subcarriers.
The general structure of a WiMAX transmitter is demonstrated in the figure below:
Figure 19: WiMAX Transmitter [AGM05]
There are several different MIMO methods that can be employed in WiMAX. As we have seen
before the methods employed depend on whether the transmitter has channel state information
or not.
Open Loop (No CSIT) The 802.16 standard defines several options for space-time codes for 2-4 antennas. However, the two most common codes for space-time coding are:
$$\begin{bmatrix} S_1 \\ S_2 \end{bmatrix} \qquad \begin{bmatrix} S_1 & -S_2^* \\ S_2 & S_1^* \end{bmatrix} \tag{153}$$
where $S_1$ and $S_2$ are OFDM symbols. 802.16 also provides for space-frequency coding called the Frequency Hopping Diversity Code (FHDC) based on the Alamouti code. The OFDM symbols are uncoded in time and coded in the frequency domain. The figure below shows how FHDC works:
Figure 20: WiMAX Frequency Hopping Diversity [AGM05]
Closed Loop (CSIT) With CSIT the transmitter can make better decisions. One of the common methods used in feedback is codebook based feedback. The codebook is basically a predetermined set of choices for the Q matrix in BLAST. The feedback is an index into the codebook that tells the transmitter which matrix to use. Another alternative is to use a feedback channel and have the receiver transmit a quantized version of the channel. The transmitter can then design the optimal precoding matrix.
9.3 802.11n
802.11n is the next generation 802.11 LAN that seeks to provide very high data rates. In 802.11
a/b/g high data rates were achieved by using high order modulation like 64-QAM. However,
the cost of this approach is a loss in range because higher SNR is necessary to successfully
demodulate 64-QAM. The way 802.11n seeks to overcome this problem and provide both high
data rates and better range is through MIMO-OFDM. 802.11n transmits multiple data streams
from the multiple transmit and receive antennas. Thus 802.11n achieves higher data rates
without using more bandwidth or larger constellations by increasing spectral efficiency.
The key features of 802.11n are outlined below [IWG04]:
• High Data Rate - 130 Mbps typically
• Spectrum - 20 MHz (Optionally 40 MHz)
• OFDM - Uses OFDM
The basic case for 802.11n is 2 × 2 but the 802.11n standard provides for up to 4 × 4 MIMO. A block diagram demonstrating the operation of 802.11n follows:
Figure 21: 802.11n Transmitter [OC06]
The 802.11n transmitter sends every other group of bits to each OFDM branch. Each branch
performs normal OFDM with spatial subcarrier mapping. Then each branch transmits on one
antenna. The receiver architecture is symmetric and is manufacturer specific.
10 Conclusion
MIMO has become a popular technology for emerging wireless standards because it can provide
better error performance in the form of diversity gain and better data rates in the form of
multiplexing gain without using more bandwidth. In addition, MIMO works well with OFDM,
which has become a ubiquitous feature of modern wireless standards. MIMO continues to be an
active research area with multiuser-MIMO as a new area of great interest for future development.
MIMO is an exciting field that looks to be a major part of research and standards in wireless
communications for many years to come.
A Math Review
This section reviews a few common mathematical tools used in MIMO. In particular, this section
covers some important linear algebra topics and Lagrange multipliers for optimization [HJ95].
A.1 Rank
Let $A \in \mathbb{C}^{m \times n}$. Then $A$ can be written in terms of column vectors as $A = [a_1, a_2, \ldots, a_n]$. The column space of $A$, denoted col($A$), is given by
$$\mathrm{col}(A) = \mathrm{span}(a_1, a_2, \ldots, a_n) \tag{154}$$
Then the rank of $A$, denoted rank($A$), is defined to be dim col($A$). So the rank of $A$ is the largest number of columns of $A$ that constitute a linearly independent set.
$A$ can also be written in terms of its $m$ rows as $[a_1^T, a_2^T, \ldots, a_m^T]^T$, where $a_i^T$ now denotes the $i$th row. Then the row space of $A$, denoted row($A$), is given by
$$\mathrm{row}(A) = \mathrm{span}(a_1^T, a_2^T, \ldots, a_m^T) \tag{155}$$
With these definitions it can be demonstrated that rank($A$) = dim row($A$). Listed below are several useful facts about rank:
1. $\mathrm{rank}(A^T) = \mathrm{rank}(A^*) = \mathrm{rank}(A)$
2. $\mathrm{rank}(A) \leq \min\{m, n\}$
3. $\mathrm{rank}(A^* A) = \mathrm{rank}(A)$
4. If $A$ is square, then $A$ is invertible if and only if $\mathrm{rank}(A) = n$.
A.2 Eigenvalues and Eigenvectors
Let $A$ be an $n \times n$ matrix over the complex numbers. A complex number $\lambda$ and a complex vector $x \neq 0$ are said to be an eigenvalue and its associated eigenvector if
$$Ax = \lambda x \tag{156}$$
By simple rearranging this expression can be written as
$$(A - \lambda I)x = 0 \tag{157}$$
This equation has non-trivial solutions ($x \neq 0$) only if $A - \lambda I$ is not invertible. This is true precisely when $\det(A - \lambda I) = 0$. When the determinant is expanded and evaluated it becomes an $n$th degree polynomial in $\lambda$. By the Fundamental Theorem of Algebra this polynomial has $n$ complex roots counting multiplicity. Thus $A$ has $n$ eigenvalues including multiplicity.
A.2.1 Diagonalization
Sometimes $A$ can be related to a diagonal matrix $D$ by $A = S^{-1}DS$ where $S$ is an $n \times n$ invertible matrix. If this is possible, then $A$ is said to be diagonalizable. The matrix $S$ can be interpreted as a change of basis that allows the matrix $A$ to be described as a diagonal matrix. This representation is particularly nice in an $n \times n$ MIMO system, since the channel becomes $n$ independent parallel channels. This transformation makes capacity calculation much easier.
So the main point of interest is determining when $A$ is diagonalizable. Either of the following two conditions is sufficient to guarantee that a square matrix is diagonalizable:
1. $A$ has $n$ distinct eigenvalues
2. $A$ has $n$ linearly independent eigenvectors
A.2.2 Connection To The Determinant and Trace
The following formulas are useful connections between the eigenvalues of a matrix and its determinant and trace:
$$\det(A) = \lambda_1 \lambda_2 \cdots \lambda_n \tag{158}$$
$$\mathrm{tr}(A) = \sum_{i=1}^{n} a_{ii} = \lambda_1 + \lambda_2 + \cdots + \lambda_n \tag{159}$$
A.3 Inner Product Space
$\mathbb{C}^{n \times 1}$ is an inner product space with the inner product given by:
$$\langle x, y \rangle = y^* x \tag{160}$$
There are two important points of interest in regarding $\mathbb{C}^{n \times 1}$ as an inner product space. First, it is possible to perform orthogonal projections of a vector onto a space spanned by several other vectors. This can be used in V-BLAST for the ZF-Nulling receiver. The second point is the Cauchy-Schwarz inequality
$$|\langle x, y \rangle| \leq \sqrt{\langle x, x \rangle}\,\sqrt{\langle y, y \rangle} \tag{161}$$
with equality when $y = Kx$ for some constant $K$. The Cauchy-Schwarz inequality can be used to derive the optimal receive combining vector for MRC.
A.4 Singular Value Decomposition
It is only possible to diagonalize a square matrix, but sometimes it is desirable to decompose a matrix with arbitrary dimensions into another matrix that is almost diagonal. The singular value decomposition achieves this and is defined for all matrices in $\mathbb{C}^{m \times n}$. Specifically the singular value decomposition of a matrix $A \in \mathbb{C}^{m \times n}$ is
$$A = U \Sigma V^* \tag{162}$$
where $U \in \mathbb{C}^{m \times m}$ and $V \in \mathbb{C}^{n \times n}$ are unitary matrices, which means
$$UU^* = U^*U = I_m \tag{163}$$
$$VV^* = V^*V = I_n \tag{164}$$
$\Sigma \in \mathbb{R}^{m \times n}$ has non-zero entries only on the diagonal, $\Sigma_{ii}$, and these entries are real and non-negative. The entries $\Sigma_{ii}$ are called the singular values of $A$. There are $n_{\min} = \min\{m, n\}$ singular values. Typically the matrix $\Sigma$ is constructed such that
$$\Sigma_{11} \geq \Sigma_{22} \geq \cdots \geq \Sigma_{n_{\min} n_{\min}} \tag{165}$$
Intuitively what the SVD does is use the matrix V to rotate an input vector to a coordinate
system in which the action of the matrix can be described by a simple matrix Σ. Then the
output of this simple matrix is rotated back to the original coordinate system by U to produce
the output of A.
The concept of the SVD can be viewed as a generalization of eigenvalues. For a column $u \in \mathbb{C}^m$ of $U$, the corresponding column $v \in \mathbb{C}^n$ of $V$, and the corresponding singular value $\sigma \geq 0$
$$Av = \sigma u \tag{166}$$
This equation is similar to the equation that defines an eigenvalue and eigenvector, which led to $u$ being called a left singular vector and $v$ a right singular vector. One of the most important properties of the SVD is that the number of non-zero singular values is precisely the rank of the matrix $A$.
A.4.1 Pseudoinverse
The inverse of a matrix is only defined for square matrices, but there is a way to define a special matrix, called the pseudoinverse, that is like an inverse but defined for arbitrary $m \times n$ matrices. Let $A \in \mathbb{C}^{m \times n}$ have the SVD $U \Sigma V^*$. Then the pseudoinverse is defined to be $A^\dagger = V \Sigma^\dagger U^*$, where $\Sigma^\dagger$ is the transpose of $\Sigma$ with the non-zero singular values inverted. There are four key properties of the pseudoinverse that define its behavior:
1. $A A^\dagger A = A$ - Note that $A A^\dagger$ is not in general the identity matrix, but the combination of three matrices produces the desired effect.
2. $A^\dagger A A^\dagger = A^\dagger$
3. $(A A^\dagger)^* = A A^\dagger$
4. $(A^\dagger A)^* = A^\dagger A$
A.4.2 Condition Number
Consider solving the linear system $Ax = b$. The condition number is a measure of how well this system behaves for small changes in $b$. Specifically the condition number measures how much small changes in $b$ change $x$. So for the perturbed system $Ax = b + e$ the condition number is the largest possible ratio of the relative change in $x$ to the relative change in $b$,
$$\kappa(A) = \max_{b,\, e \neq 0} \frac{\|A^{-1}e\| / \|A^{-1}b\|}{\|e\| / \|b\|} \tag{167}$$
This quantity can be related to the singular values of $A$ by
$$\kappa(A) = \frac{\sigma_{\max}}{\sigma_{\min}} \tag{168}$$
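A small worked case shows how the two expressions agree. The matrix and vectors below are illustrative choices, not taken from the text: $b$ is aligned with the strongest direction of $A$ and the perturbation $e$ with the weakest, which is the worst case.

```latex
\begin{aligned}
A &= \begin{bmatrix} 1 & 0 \\ 0 & 0.01 \end{bmatrix}, \qquad
b = \begin{bmatrix} 1 \\ 0 \end{bmatrix}, \qquad
e = \begin{bmatrix} 0 \\ \epsilon \end{bmatrix} \\
A^{-1}b &= \begin{bmatrix} 1 \\ 0 \end{bmatrix}, \qquad
A^{-1}e = \begin{bmatrix} 0 \\ 100\,\epsilon \end{bmatrix} \\
\frac{\|A^{-1}e\| / \|A^{-1}b\|}{\|e\| / \|b\|}
  &= \frac{100\,\epsilon / 1}{\epsilon / 1} = 100
  = \frac{\sigma_{\max}}{\sigma_{\min}} = \frac{1}{0.01}
\end{aligned}
```

A relative error of $\epsilon$ in $b$ thus becomes a relative error of $100\,\epsilon$ in $x$, which is why a channel matrix with a large condition number is poorly suited to inversion-based receivers.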
A.5 Lagrange Multipliers
Consider the following optimization problem:
$$\begin{aligned} \text{Maximize:}\quad & f(x_1, x_2, \ldots, x_n) \\ \text{Subject to:}\quad & g(x_1, x_2, \ldots, x_n) = C \end{aligned}$$
The optimal solution can be found by solving the following system of equations given by the gradient
$$\nabla f = \lambda \nabla g, \qquad g = C \tag{169}$$
These equations can be expressed in terms of partial derivatives as
$$\frac{\partial f}{\partial x_i} = \lambda \frac{\partial g}{\partial x_i}, \quad i = 1, 2, \ldots, n, \qquad g = C \tag{170}$$
Lagrange multipliers can be used to find the optimal power allocation to maximize capacity.
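As a sketch of that use, consider $n$ parallel channels with gains $\lambda_i$, noise power $N_o$, and total power $P$. These symbols, and the multiplier $\mu$ (written so as not to clash with the gains $\lambda_i$), are assumptions introduced for this illustration.

```latex
\begin{aligned}
\text{Maximize:}\quad & f(P_1, \ldots, P_n) = \sum_{i=1}^{n} \log_2\!\left(1 + \frac{P_i \lambda_i}{N_o}\right) \\
\text{Subject to:}\quad & g(P_1, \ldots, P_n) = \sum_{i=1}^{n} P_i = P \\[4pt]
\frac{\partial f}{\partial P_i} &= \frac{1}{\ln 2}\,\frac{\lambda_i / N_o}{1 + P_i \lambda_i / N_o}
  = \mu\,\frac{\partial g}{\partial P_i} = \mu \\[4pt]
\Rightarrow\quad P_i &= \frac{1}{\mu \ln 2} - \frac{N_o}{\lambda_i}
\end{aligned}
```

The constant $1/(\mu \ln 2)$ is then fixed by the power constraint. The equality-constrained form above does not enforce $P_i \geq 0$; in practice channels for which the expression is negative are given zero power and the constant is recomputed over the remaining channels.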
References
[3GPPRel8] 3GPP, “Technical Specification Group Radio Access Network Requirements for
Further Advancements for E-UTRA (LTE-Advanced) (Release 8)”
[3GPPRel9] 3GPP. “Overview of 3GPP Release 9 V.0.0.4 (2009-01)”
[3GPP07] 3GPP. “Physical Channels and Modulation (Release 8)”. September 2007.
[AG05] J. Akhtar and D. Gesbert. “Spatial Multiplexing over correlated MIMO channels with
a closed-form precoder”. IEEE Trans. Wireless Commun., 4(5):2400-2409, September 2005.
[AGM05] J. Andrews, A. Ghosh, and R. Muhamed. Fundamentals of WiMAX
[Ala98] S.M. Alamouti. “A simple transmit diversity technique for wireless communications”.
IEEE J. Select. Areas Commun., 16(10):1451-1458, October 1998.
[BRV05] J.C. Belfiore, G. Rekaya, and E. Viterbo. “The golden code: a 2 × 2 full-rate space-time code with non-vanishing determinants”. IEEE Trans. Inform. Theory, 51(4):1432-1436, April 2005.
[CKT98] C.N. Chuah, J.M. Kahn, and D. Tse. “Capacity of multi-antenna array systems in
indoor wireless environment”. In Proc. Globecom 1998 - IEEE Global Telecommunications
Conf., volume 4, pages 1894-1899, Sydney, Australia, 1998.
[CT91] T. Cover and T. Thomas. Elements of Information Theory. Wiley, New York, NY, 1991.
[DTB02] M.O. Damen, A. Tewfik, and J.-C. Belfiore. “A construction of a space-time code based on number theory”. IEEE Trans. Inform. Theory, 48(3):753-760, March 2002.
[Fle00] B.H. Fleury. “First- and second-order characterization of direction dispersion and space
selectivity in the radio channel”. IEEE Trans. Inform. Theory, 46(6):2027-2044, June 2000.
[FG98] G.J. Foschini and M.J. Gans. “On Limits of Wireless Communications in a Fading Environment when Using Multiple Antennas,” Wireless Personal Communications. Vol 6(3), March 1998.
[Fos96] G.J. Foschini. “Layered space-time architecture for wireless communication in a fading
environment when using multi-element antennas”. Bell Labs Tech. J., pages 41-59, Autumn
1996.
[GA04] D. Gesbert and M.-S. Alouini, “How much feedback is multi-user diversity really
worth?”, Proc. IEEE Int. Conf. on Comm. (ICC), Paris, France, June 2004, pp. 234-238
[GC80] A. El Gamal and T.M. Cover, “Multiple user information theory”, Proc. IEEE, vol. 68,
no. 12, pp. 1466-1483, Dec. 1980
[GD03] H.E. Gamal and M.O. Damen. “Universal space-time coding”. IEEE Trans. Inform.
Theory, 49(5):1097-1119, May 2003.
[GKHCS07] D. Gesbert, M. Kountouris, R. W. Heath, Jr., C.-B. Chae, and T. Salzer, “Shifting the MIMO Paradigm: From Single User to Multiuser Communications”, IEEE Signal Processing Magazine, vol. 24, no. 5, pp. 36-46, Oct. 2007
[Gold05] Andrea Goldsmith. Wireless Communications. Cambridge Press. 2005.
[GFVW99] G.D. Golden, G.J. Foschini, R.A. Valenzuela, and P.W. Wolniansky. “Detection algorithm and initial laboratory results using the V-BLAST space-time communication architecture”. Elect. Lett., 35(1):14-15, January 1999.
[God97] L.C. Godara. “Applications of antenna arrays to mobile communications, part II:
Beamforming and direction-of-arrival considerations”. Proceedings IEEE, 85(8):1195-1245,
August 1997.
[GSP02] D. Gore, S. Sandhu, and A. Paulraj. “Delay diversity codes for frequency selective
channels”. In Proc. ICC 2002 - IEEE Int. Conf. Commun., pages 1949-1953, New York, May
2002.
[Hea01] R. Heath. “Space-Time Signaling in Multi-Antenna Systems”. PhD thesis, Stanford
University, November 2001.
[Her04] M. Herdin. “Non-stationary indoor MIMO radio channels”. PhD thesis, Technische
Universität Wien, August 2004.
[HH01] B. Hassibi and B. Hochwald. “High-rate linear space-time codes”. In Proc. ICASSP
2001 - IEEE Int. Conf. Acoust. Speech and Signal Processing, volume 4, pages 2461-2464,
Salt Lake City, UT, May 2001.
[HJ95] R.A. Horn and C.R. Johnson. Topics in matrix analysis. Cambridge University Press,
Cambridge, UK, 1995.
[HPS05] B. M. Hochwald, C. B. Peel, and A. L. Swindlehurst, “A vector-perturbation technique
for near capacity multiantenna multiuser communication - part II: perturbation”, IEEE
Trans. Comm., vol. 53, no. 3, pp. 537-544, March 2005
[HV05] B. Hassibi and H. Vikalo, “On the sphere decoding algorithm: Part I, the expected
complexity”, IEEE Transactions on Signal Processing, vol. 53, pp. 2806 - 2818, Aug. 2005.
[Jaf01] H. Jafarkhani. “A quasi orthogonal space-time block code”. IEEE Trans. Commun.,
49(1):1-4, January 2001.
[Jak71] W.C. Jakes. “A Comparison of Specific Space Diversity Techniques for the Reduction
of Fast Fading in UHF Mobile Radio Systems,” IEEE Trans. on Veh. Techn., Vol. VT-20,
No. 4, pp. 81-93, Nov. 1971.
[Kah54] L. Kahn, “Ratio Squarer,” Proc. of IRE(Corr.), Vol. 42, pp. 1074, November 1954.
[IS04] IEEE Standard 802.16-2004
[IS05] IEEE Standard 802.16e-2005, Amendment to IEEE Standard for Local and Metropolitan
Area Networks - 6 Part 16: Air Interface for Fixed Broadband Wireless Access System
[IWG04] IEEE 802.16 Working Group, Part 16: Air Interface for Fixed and Mobile Broadband
Wireless Access Systems Amendment 2: Physical and Medium Access Control Layers for
Combined Fixed and Mobile Operation in Licensed Bands, 2004
[IWG206] IEEE 802.16 Working Group, Part 16: Air Interface for Fixed and Mobile Broadband
Wireless Access Systems Amendment 2: Physical and Medium Access Control Layers for
Combined Fixed and Mobile Operation in Licensed Bands, 2006
[IWG1106] IEEE 802.11 Working Group, IEEE P802.11n/D1.0 Draft Amendment to Standard
for Information Technology - Telecommunications and Information Exchange Between
Systems-Local and Metropolitan Networks-Specific Requirements-Part 11: Wireless LAN
Medium Access Control (MAC) and Physical Layer (PHY) Specifications: Enhancements
for Higher Throughput, March 2006.
[LP00] E. Lindskog and A.J. Paulraj. “A transmit diversity scheme for channels with intersymbol
interference”. In Proc. ICC 2000 - IEEE Int. Conf. Commun., volume 1, pages 307-311,
New Orleans, June 2000.
[OC06] Claude Oestges and Bruno Clerckx. MIMO Wireless Communications. Artech House
Press.
[Par00] J.D. Parsons. The mobile radio propagation channel. 2nd ed., Wiley, London, UK, 2000.
[PHS05] C. B. Peel, B. M. Hochwald, and A. L. Swindlehurst, “A vector-perturbation technique
for near capacity multiantenna multiuser communication - part I: channel inversion and
regularization”, IEEE Trans. Comm., vol. 53, no. 1, pp. 195-202, Jan. 2005
[PNG03] A. Paulraj, R. Nabar, and D. Gore. Introduction to Space-Time Wireless Communications.
Cambridge University Press, Cambridge, UK, 2003.
[Pro01] J.G. Proakis. Digital communications. 4th ed., McGraw-Hill, New York, NY, 2001.
[Rapp02] T.S. Rappaport, Wireless Communications: Principles and Practices. Prentice Hall.
[SA00] M.K. Simon and M.-S. Alouini. “Digital communications over fading channels: a unified
approach to performance analysis”. Wiley, New York, NY, 2000.
[San02] S. Sandhu. “Signal Design for Multiple-Input Multiple-Output Wireless: A Unified
Perspective”. PhD thesis, Stanford University, August 2002.
[SB07] M. Sharif and B. Hassibi, “A comparison of time-sharing, DPC, and beamforming for
MIMO broadcast channels with many users”, IEEE Trans. Comm., vol. 55, no. 1, pp. 11-15,
Jan. 2007.
[Sim01] M.K. Simon. “Evaluation of average bit error probability for space-time coding based
on a simpler exact evaluation of pairwise error probability”. Journal of Communications and
Networks, 3(3):257-264, September 2001.
[SMB01] M. Steinbauer, A.F. Molisch, and E. Bonek. “The double-directional radio channel”.
IEEE Antennas Propagat. Mag., 43(4):51-63, August 2001.
[SP03] N. Sharma and C.B. Papadias. “Improved quasi-orthogonal codes through constellation
rotation”. IEEE Trans. Commun., 51(3):332-335, March 2003.
[SSH04] Q. Spencer, A. L. Swindlehurst, and M. Haardt, “Zero-forcing methods for downlink
spatial multiplexing in multiuser MIMO channels”, IEEE Trans. Sig. Proc., vol. 52, no. 2,
pp. 462-471, Feb. 2004.
[SSRS03] B.A. Sethuraman, B. Sundar Rajan, and V. Shashidhar. “Full-diversity, high-rate,
space-time block codes from division algebras”. IEEE Trans. Inform. Theory, 49(10):2596-
2616, October 2003.
[SW94] N. Seshadri and J.H. Winters. “Two signaling schemes for improving the error performance
of frequency-division-duplex (FDD) transmission systems using transmitter antenna
diversity”. Int. J. Wireless Information Networks, 1:49-60, 1994.
[SX04] W. Su and X. Xia. “Signal constellations for quasi-orthogonal space-time block codes
with full diversity”. IEEE Trans. Inform. Theory, 50(10):2331-2347, October 2004.
[TBH00] O. Tirkkonen, A. Boariu, and A. Hottinen. “Minimal non-orthogonality rate 1 space-
time block code for 3+Tx antennas”. In IEEE 6th Int. Symp. on Spread-Spectrum Tech. and
Appl. (ISSSTA 2000), pages 429-432, September 2000.
[Tel95] E. Telatar. “Capacity of multiantenna Gaussian channels”. Tech. Rep., AT&T Bell Labs,
1995.
[TJC99] V. Tarokh, H. Jafarkhani, and A.R. Calderbank. “Space-time block codes from orthogonal
designs”. IEEE Trans. Inform. Theory, 45(7):1456-1467, July 1999.
[TSC98] V. Tarokh, N. Seshadri, and A.R. Calderbank. “Space-time codes for high data rate
wireless communication: Performance criterion and code construction”. IEEE Trans. Inform.
Theory, 44(3):744-765, March 1998.
[TV05] D. Tse and P. Viswanath. Fundamentals of wireless communication. Cambridge University
Press, Cambridge, UK, 2005.
[VB99] E. Viterbo and J. Boutros. “A universal lattice code decoder for fading channels”. IEEE
Trans. Inform. Theory, 45(5):1639-1642, July 1999.
[Wit93] A. Wittneben. “A new bandwidth efficient transmit antenna modulation diversity
scheme for linear digital modulation”. In Proc. ICC 1993 - IEEE Int. Conf. Commun.,
pages 1630-1633, 1993.
[WX05] D. Wang and X.G. Xia. “Optimal diversity product rotations for quasiorthogonal STBC
with MPSK symbols”. IEEE Commun. Lett., 9(5):420-422, May 2005.
[XL05] L. Xian and H. Liu. “Optimal rotation angles for quasi-orthogonal space-time codes
with PSK modulation”. IEEE Commun. Lett., 9(8):676-678, August 2005.
[Yac93] M.D. Yacoub. Foundation of mobile radio engineering. CRC Press, Boca Raton, FL,
1993.
[YW03] H. Yao and G.W. Wornell. “Structured space-time block codes with optimal diversity-
multiplexing tradeoff and minimum delay”. In Proc. Globecom 2003 - IEEE Global
Telecommunications Conf., volume 4, pages 1941-1945, San Francisco, CA, December 2003.
[ZT03] L. Zheng and D. Tse. “Diversity and multiplexing: a fundamental tradeoff in multiple-
antenna channels”. IEEE Trans. Inform. Theory, 49(5):1073-1096, May 2003.


1 Introduction

Wireless systems face several challenges including demands for higher data rates, better quality of service, and increased network capacity while working with limited amounts of spectrum. Multiple Input and Multiple Output (MIMO) wireless communication systems have become a hot research topic because they promise to deal with all of these issues by providing both increased resilience to fading and increased capacity without using more bandwidth or power. Instead of just using diversity to combat fading, MIMO systems actively take advantage of multipath to work. The figure below shows a simple MIMO setup with $n_t$ transmit antennas and $n_r$ receive antennas.

Methods to take advantage of multiple antennas at the receiver or the transmitter were known from the 1950’s onward. Early methods provided for spatial diversity to improve error performance and beamforming to increase SNR by focusing the energy from an antenna into a desired direction. In the 1990’s MIMO systems with multiple antennas at both the transmitter and receiver were proposed. One of the early seminal works in MIMO was Telatar’s paper, which demonstrated the potential for improved capacity with no extra spectrum [Tel95]. Around the same time Bell Labs developed the BLAST architectures, which achieved high spectral efficiency on the order of 10-20 bits/s/Hz [Fos96]. Also around the same time the first space-time coding methods were proposed [TSC98]. In the 2000’s MIMO has continued to be developed and there are now plans to implement MIMO in several new wireless standards such as 802.11n, WiMAX, and LTE.

This tutorial paper focuses on the following major topics in MIMO:
1. MIMO Channel Modeling and Capacity
2. Diversity-Multiplexing Tradeoff
3. Space-Time Coding and Architectures
4. Space-Time Coding in Frequency Selective Channels
5. Multi-User MIMO and Applications
6. MIMO in Wireless Standards

2 Benefits of MIMO

The two major benefits of MIMO are diversity gain (increased resilience to fading in the form of better error performance) and multiplexing gain (increased rate of transmission by exploiting the increased degrees of freedom offered by the spatial MIMO channel).

Figure 1: MIMO System Concept [Gold05]

2.1 Diversity

Diversity is an attempt to exploit redundancies in the way information is sent to achieve better error performance by cleverly using multiple copies of the same signal. Three fundamental types of diversity are time, frequency, and antenna diversity. Time diversity involves averaging the fading effects of the channel over time. The simplest example is the repetition code, which transmits the same symbol multiple times with the transmissions separated by more than the coherence time of the channel. The receiver decodes each symbol independently and estimates the transmitted symbol by majority rule. Frequency diversity exploits the variations in a frequency selective channel. For example, orthogonal frequency division multiplexing (OFDM) can apply modulation order adaptation to each subcarrier depending on the quality of a given subchannel. Finally, there are several different types of antenna diversity. The most obvious type is to simply use multiple antennas. This type of antenna diversity is one of the main focuses of this paper. The second type of antenna diversity is to use multiple antennas with different polarizations. The third type of antenna diversity is to use multiple antennas with different non-overlapping beam patterns.

It is generally of interest to quantify exactly how much diversity a given scheme provides. This can be done through calculating either the average probability of error or the outage probability. Both of these expressions can generally be approximated as $\mathrm{SNR}^{-L}$ at high SNR. $L$ is the diversity gain, which is the maximum number of independent copies of the same signal that the receiver sees. For the case of $n_t \times n_r$ narrowband MIMO the maximum possible diversity gain is $n_t n_r$. This diversity gain can be more rigorously defined as

$$L = -\lim_{\mathrm{SNR}\to\infty} \frac{\log(P_e)}{\log(\mathrm{SNR})} \tag{1}$$

This is just a formalization of the intuition above that replaces “at high SNR” with a limit.
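The limit in (1) can be approximated numerically as the slope of $\log P_e$ against $\log \mathrm{SNR}$ between two high-SNR points. The following Python sketch does this for a hypothetical error curve; the function names, the two SNR points, and the example curve $0.5\,(1+\mathrm{SNR}/4)^{-3}$ are assumptions chosen for illustration and are not taken from the paper.

```python
import math


def diversity_order(pe, snr_db_lo=40.0, snr_db_hi=60.0):
    """Approximate L = -lim log(Pe)/log(SNR) by a finite slope at high SNR."""
    snr_lo = 10.0 ** (snr_db_lo / 10.0)
    snr_hi = 10.0 ** (snr_db_hi / 10.0)
    slope = (math.log(pe(snr_hi)) - math.log(pe(snr_lo))) / (
        math.log(snr_hi) - math.log(snr_lo))
    return -slope


def example_pe(snr, branches=3):
    # Hypothetical error curve that decays like SNR^(-branches) at high SNR.
    return 0.5 * (1.0 + snr / 4.0) ** (-branches)


estimate = diversity_order(example_pe)  # close to 3 for this curve
```

Because the slope is taken at finite SNR, the estimate only approaches the true diversity gain; moving both SNR points higher tightens it.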

2.1.1 Union Bound on Probability of Error

It can be difficult to calculate an exact expression for the probability of error for an arbitrary modulation, so it is useful to calculate an upper bound on the probability of error. Consider an arbitrary constellation $\mathcal{C}$ containing $M$ points. Write the constellation as

$$\mathcal{C} = \{c_1, c_2, \ldots, c_M\} \tag{2}$$

Let $P_e$ be the probability of symbol error. Let $P_{e|c_i}$ be the probability of symbol error given $c_i \in \mathcal{C}$ was sent. Assuming all symbols are equally likely then

$$P_e = \frac{1}{M}\sum_{m=1}^{M} P_{e|c_m} \tag{3}$$

The conditional probability of symbol error can be expanded as

$$\begin{aligned}
P_{e|c_m} &= P[c_m \text{ is detected incorrectly} \mid c_m \text{ was transmitted}] \\
&= \sum_{\substack{l=1 \\ l \neq m}}^{M} P[c_m \text{ is estimated as } c_l \mid c_m \text{ was transmitted}]
\end{aligned}$$

Computing each of these probabilities is difficult and requires integration over a possibly complicated Voronoi region specific to each type of modulation. A simplifying approximation is the pairwise error probability (PEP), in which it is assumed for the purposes of calculation that only $c_m$ and $c_l$ are in the constellation. This is denoted $P[c_m \to c_l]$. For complex AWGN

$$P[c_m \to c_l] = Q\!\left(\sqrt{\frac{E_x}{N_o}\,\frac{\|c_m - c_l\|^2}{2}}\right) \tag{4}$$

PEP overestimates the probability of decoding $c_m$ as $c_l$, so

$$P_{e|c_m} \le \sum_{\substack{l=1 \\ l \neq m}}^{M} Q\!\left(\sqrt{\frac{P}{N_o}\,\frac{\|c_m - c_l\|^2}{2}}\right) \tag{5}$$
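As a numerical illustration of (3) to (5), the Python sketch below evaluates the union bound for a small constellation. The helper names and the unit-energy QPSK example are assumptions made for this sketch; the Q-function is computed from the complementary error function in the standard library.

```python
import math


def q_function(x):
    """Gaussian tail probability Q(x) = P[N(0, 1) > x]."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def pairwise_error(c_m, c_l, p_over_no):
    """PEP of equation (4): Q(sqrt((P/No) * |c_m - c_l|^2 / 2))."""
    return q_function(math.sqrt(p_over_no * abs(c_m - c_l) ** 2 / 2.0))


def union_bound(constellation, p_over_no):
    """Average over equally likely symbols (3) of the per-symbol bound (5)."""
    total = 0.0
    for m, c_m in enumerate(constellation):
        for l, c_l in enumerate(constellation):
            if l != m:
                total += pairwise_error(c_m, c_l, p_over_no)
    return total / len(constellation)


# Unit-energy QPSK, used here only as an illustrative constellation.
qpsk = [complex(a, b) / math.sqrt(2.0) for a in (1, -1) for b in (1, -1)]
bound_10db = union_bound(qpsk, 10.0)  # P/No = 10, i.e. 10 dB
```

For a two-point constellation the double sum collapses to a single pairwise term, so the bound coincides with the exact error probability.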

Then let $d_{\min}^2$ be the square of the minimum distance between points in the constellation $\mathcal{C}$. Then

$$\begin{aligned}
P_e &\le \frac{1}{M}\sum_{m=1}^{M}\sum_{\substack{l=1 \\ l \neq m}}^{M} Q\!\left(\sqrt{\frac{P}{N_o}\,\frac{\|c_m - c_l\|^2}{2}}\right) \\
&\le \frac{1}{M}\sum_{m=1}^{M}\sum_{\substack{l=1 \\ l \neq m}}^{M} Q\!\left(\sqrt{\frac{P}{N_o}\,\frac{d_{\min}^2}{2}}\right) \\
&= \frac{1}{M}\sum_{m=1}^{M} (M-1)\,Q\!\left(\sqrt{\frac{P}{N_o}\,\frac{d_{\min}^2}{2}}\right) \\
&= (M-1)\,Q\!\left(\sqrt{\frac{P}{N_o}\,\frac{d_{\min}^2}{2}}\right)
\end{aligned} \tag{6}$$

The Chernoff bound on the Q-function is

$$Q(x) \le \frac{1}{2} e^{-x^2/2} \tag{7}$$

So then the probability of error can be approximated as

$$P_e \le \frac{M-1}{2}\, e^{-\frac{P}{N_o}\frac{d_{\min}^2}{4}} \tag{8}$$

This bound is very useful in calculating the diversity gain for simple multiple antenna systems.

2.1.2 Outage Probability

Formally the channel is in outage if the rate of transmission exceeds the channel capacity. The outage probability is the probability that this situation occurs: $P[C < R]$. For the typical Gaussian memoryless channel the channel capacity is $C = B\log_2(1 + \mathrm{SNR})$. Thus

$$P[B\log_2(1 + \mathrm{SNR}) < R] = P\!\left[\mathrm{SNR} < 2^{R/B} - 1\right]$$

Generally for other channels the outage condition reduces to the SNR being below a certain threshold. From expressions for outage probability one can also find the diversity gain in a manner similar to the average probability of error method. Thus the diversity gain can generally be found from the probability

$$P[\mathrm{SNR} < \gamma]$$
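The outage condition above reduces to comparing the SNR with the threshold $\gamma = 2^{R/B} - 1$. The Python sketch below computes that threshold and, as an illustrative assumption that goes beyond this subsection, evaluates $P[\mathrm{SNR} < \gamma]$ when the SNR is exponentially distributed, as it is for a scalar Rayleigh channel. The function names and the numerical values are hypothetical.

```python
import math


def outage_threshold(rate_bps, bandwidth_hz):
    """SNR threshold gamma = 2^(R/B) - 1 below which B log2(1 + SNR) < R."""
    return 2.0 ** (rate_bps / bandwidth_hz) - 1.0


def rayleigh_outage(rate_bps, bandwidth_hz, mean_snr):
    """P[SNR < gamma] for an exponentially distributed SNR with the given mean.

    The exponential law is an assumption of this sketch (scalar Rayleigh
    fading); other channels need their own SNR distribution.
    """
    gamma = outage_threshold(rate_bps, bandwidth_hz)
    return 1.0 - math.exp(-gamma / mean_snr)


gamma = outage_threshold(2e6, 1e6)        # R/B = 2 bits/s/Hz
p_out = rayleigh_outage(2e6, 1e6, 100.0)  # mean SNR of 20 dB
```

Under this assumption a tenfold increase in mean SNR lowers the outage probability by roughly a factor of ten at high SNR, which is the diversity gain of 1 expected from a single copy of the signal.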

2.2 Spatial Multiplexing

Besides providing diversity gain and improved error performance, MIMO can also provide increased data rates and spectral efficiency through spatial multiplexing. To see how MIMO achieves this, first consider QAM. The transmitted signal can be expressed as

$$x_n(t) = a_n(t)\cos(2\pi f_c t) - b_n(t)\sin(2\pi f_c t) \tag{9}$$

assuming the appropriate normalizations have been made. This system has two real degrees of freedom (1 complex degree of freedom) because independent streams of bits could be transmitted on the cosine and sine terms. In practice, however, the two independent streams usually come from one original stream, which is demultiplexed by the symbol mapping operation into two streams, which are placed on the cosine and sine terms. Fundamentally MIMO provides increased rates in a similar way by providing even more degrees of freedom. The degrees of freedom in MIMO come from the multiple antennas transmitting independent streams. In the $4 \times 4$ MIMO case, for example, each transmit antenna can send an independent stream, which will be received by all four antennas simultaneously, so there are four degrees of freedom. The maximum possible complex degrees of freedom for MIMO is $\min\{n_t, n_r\}$.

3 Basic Schemes for Multiple Antennas

Now consider a few basic multiple antenna schemes that can provide diversity. The major assumptions underlying these schemes are whether the receiver has channel information (CSIR) or the transmitter has channel information (CSIT). CSIR is a pretty common assumption and can be achieved through several estimation methods. CSIT is trickier to achieve, as the receiver must estimate the channel and feed the estimate back to the transmitter through a feedback channel in FDD, or the transmitter must assume that it sees the same channel as the receiver in TDD. Feedback entails a cost in terms of lost capacity and bandwidth.

3.1 Channel Models

The channel model for these basic MIMO schemes is a simple extension of the scalar Rayleigh channel. The channels are now modeled as complex Gaussian vectors with $\mathcal{CN}(0, \mathbf{I})$ distribution. This channel model is justified in terms of physical propagation models in the next section.
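For simulation purposes, a $\mathcal{CN}(0, \mathbf{I})$ channel vector can be drawn by giving the real and imaginary parts of each entry independent zero-mean Gaussian values of variance 1/2. The Python sketch below is a minimal illustration; the function name, the seed, and the sample sizes are arbitrary choices made for this example.

```python
import math
import random


def sample_cn_vector(length, rng):
    """Draw a vector with i.i.d. CN(0, 1) entries, i.e. a CN(0, I) vector."""
    sigma = math.sqrt(0.5)  # each real dimension carries half the variance
    return [complex(rng.gauss(0.0, sigma), rng.gauss(0.0, sigma))
            for _ in range(length)]


rng = random.Random(0)  # fixed seed so the sketch is repeatable
samples = [sample_cn_vector(4, rng) for _ in range(20000)]

# Empirical average of |h_i|^2 over all entries; it should be close to 1.
mean_power = sum(abs(x) ** 2 for h in samples for x in h) / (4 * len(samples))
```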

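3.2 Scalar Rayleigh Channel

For comparison consider the scalar Rayleigh channel

$$y[n] = h\,x[n] + v[n] \tag{10}$$

with $h \sim \mathcal{CN}(0, 1)$ and $v[n] \sim \mathcal{CN}(0, N_o)$. For a complex Gaussian vector of length $m$ with correlation matrix $\mathbf{R}$ the pdf for the vector $\mathbf{h}$ is given by

$$f_{\mathbf{h}} = \frac{1}{\pi^m |\det \mathbf{R}|}\, e^{-\mathbf{h}^* \mathbf{R}^{-1} \mathbf{h}} \tag{11}$$

Also by the definition of the pdf

$$\int \cdots \int \frac{1}{\pi^m |\det \mathbf{R}|}\, e^{-\mathbf{h}^* \mathbf{R}^{-1} \mathbf{h}}\, d\mathbf{h} = 1 \tag{12}$$

Then the average probability of error for the Rayleigh channel is

$$\begin{aligned}
\bar{P}_e &\le E\!\left[\frac{M-1}{2}\, e^{-h^* h \frac{P}{N_o}\frac{d_{\min}^2}{4}}\right] \\
&= \frac{M-1}{2}\int \frac{1}{\pi}\, e^{-h^* h \frac{P}{N_o}\frac{d_{\min}^2}{4}}\, e^{-h^* h}\, dh \\
&= \frac{M-1}{2}\int \frac{1}{\pi}\, e^{-h^* \left[\left(1 + \frac{P}{N_o}\frac{d_{\min}^2}{4}\right)^{-1}\right]^{-1} h}\, dh \\
&= \frac{M-1}{2}\left(1 + \frac{P}{N_o}\frac{d_{\min}^2}{4}\right)^{-1} \int \frac{1}{\pi \left(1 + \frac{P}{N_o}\frac{d_{\min}^2}{4}\right)^{-1}}\, e^{-h^* \left[\left(1 + \frac{P}{N_o}\frac{d_{\min}^2}{4}\right)^{-1}\right]^{-1} h}\, dh \\
&= \frac{M-1}{2}\left(1 + \frac{P}{N_o}\frac{d_{\min}^2}{4}\right)^{-1}
\end{aligned} \tag{13}$$

where the last integral equals 1 by (12) with $m = 1$ and $R = \left(1 + \frac{P}{N_o}\frac{d_{\min}^2}{4}\right)^{-1}$. At high SNR

$$\bar{P}_e \approx \mathrm{SNR}^{-1} \tag{14}$$

This corresponds to a diversity gain of 1, which is to be expected as there is only one copy of the signal.

3.3 Maximal Ratio Combining

Consider a system with a single transmit antenna and $n_r$ receive antennas - the Single-In Multiple-Out (SIMO) case. This system can be modeled as

$$\mathbf{y}[n] = \mathbf{h}\,x[n] + \mathbf{v} \tag{15}$$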

This model is basically an extension of the scalar Rayleigh channel to the vector iid Rayleigh channel. The receiver must take the received parallel signal and estimate the transmitted symbol. In MRC this is done with a weighted summation of the received branches performed by a complex vector $\mathbf{q}$.

$$z[n] = \mathbf{q}\mathbf{h}\,x[n] + \mathbf{q}\mathbf{v} \tag{16}$$

In this case the SNR can be calculated and bounded with the Cauchy-Schwarz inequality.

$$\mathrm{SNR} = \frac{|\mathbf{q}\mathbf{h}|^2 P}{|\mathbf{q}|^2 N_o} \le \frac{|\mathbf{q}|^2 |\mathbf{h}|^2 P}{|\mathbf{q}|^2 N_o} = \frac{|\mathbf{h}|^2 P}{N_o} \tag{17}$$

Thus the optimal choice for $\mathbf{q}$ is $\mathbf{q} = \mathbf{h}^*$, which achieves the maximum SNR [Kah54]. This is effectively a matched filter. Multiplying by the conjugate of the channel co-phases the signals and then weights the branches by the channel amplitude [Rapp02]. This kind of action is similar to the RAKE receiver for CDMA. Now to calculate the diversity gain of MRC through the average probability of error with the union upper bound consider:

$$\bar{P}_e \le E\!\left[\frac{M-1}{2}\, e^{-\|\mathbf{h}\|^2 \frac{P}{N_o}\frac{d_{\min}^2}{4}}\right] = E\!\left[\frac{M-1}{2}\, e^{-\mathbf{h}^*\mathbf{h} \frac{P}{N_o}\frac{d_{\min}^2}{4}}\right]$$

Then since $\mathbf{R} = \mathbf{I}$,

$$\begin{aligned}
\bar{P}_e &\le \frac{M-1}{2}\int \cdots \int \frac{1}{\pi^{n_r}}\, e^{-\mathbf{h}^*\mathbf{h} \frac{P}{N_o}\frac{d_{\min}^2}{4}}\, e^{-\mathbf{h}^*\mathbf{h}}\, d\mathbf{h} \\
&= \frac{M-1}{2}\left(\frac{P d_{\min}^2}{4 N_o} + 1\right)^{-n_r} \int \cdots \int \frac{1}{\pi^{n_r}\left(\frac{P d_{\min}^2}{4 N_o} + 1\right)^{-n_r}}\, e^{-\mathbf{h}^* \left[\left(\frac{P d_{\min}^2}{4 N_o} + 1\right)^{-1} \mathbf{I}\right]^{-1} \mathbf{h}}\, d\mathbf{h} \\
&= \frac{M-1}{2}\left(\frac{P d_{\min}^2}{4 N_o} + 1\right)^{-n_r}
\end{aligned} \tag{18}$$

Then at high SNR

$$\bar{P}_e \approx \mathrm{SNR}^{-n_r} \tag{19}$$

From this calculation it is evident that the diversity gain is $n_r$ - the number of receive antennas and also the number of copies of the symbol that the receiver sees.

3.4 Selection Combining

This method has the same general setup as MRC, but the receiver selects the receive antenna with the largest $|h_i|$ as opposed to combining the signals from all antennas [Jak71].
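As a numerical aside on the two receive-combining rules introduced so far, the Python sketch below evaluates the combiner SNR of (17) for the MRC weights $\mathbf{q} = \mathbf{h}^*$ and for a selection vector that keeps only the strongest branch. The function names, the seed, and the value $P/N_o = 10$ are assumptions made for this illustration.

```python
import math
import random


def combiner_snr(q, h, p_over_no):
    """Output SNR |q h|^2 P / (|q|^2 No) of equation (17) for a row vector q."""
    inner = sum(qi * hi for qi, hi in zip(q, h))
    q_norm_sq = sum(abs(qi) ** 2 for qi in q)
    return abs(inner) ** 2 * p_over_no / q_norm_sq


def mrc_weights(h):
    """MRC choice q = h*: conjugate each channel coefficient."""
    return [hi.conjugate() for hi in h]


def selection_weights(h):
    """Selection combining: keep only the branch with the largest |h_i|."""
    best = max(range(len(h)), key=lambda i: abs(h[i]))
    return [1.0 if i == best else 0.0 for i in range(len(h))]


rng = random.Random(1)  # fixed seed so the sketch is repeatable
sigma = math.sqrt(0.5)
h = [complex(rng.gauss(0.0, sigma), rng.gauss(0.0, sigma)) for _ in range(4)]

snr_mrc = combiner_snr(mrc_weights(h), h, 10.0)        # equals 10 * ||h||^2
snr_sel = combiner_snr(selection_weights(h), h, 10.0)  # equals 10 * max |h_i|^2
```

By the Cauchy-Schwarz step in (17), no other weight vector, including the selection vector, can exceed the MRC value for the same channel realization.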

Selection combining can achieve the same diversity gain as MRC, namely $n_r$. Assuming each branch has amplitude $s_k$, then as outlined in [SA00]

$$P[\max\{s_1, s_2, \ldots, s_{n_r}\} < S] = \left(1 - e^{-S^2}\right)^{n_r} \tag{20}$$

So the pdf of $s_{\max}$ is given by

$$p_{s_{\max}}(S) = n_r\, 2S\, e^{-S^2} \left(1 - e^{-S^2}\right)^{n_r - 1} \tag{21}$$

Then the average received SNR is

$$\overline{\mathrm{SNR}} = P \sum_{i=1}^{n_r} \frac{1}{i} \tag{22}$$

It is evident from this equation that increasing the number of receive antennas provides a diminishing return. Most of the gain comes from going from one receive antenna to two and three receive antennas.

3.5 Equal Gain Combining

The branches from each antenna are first co-phased to cancel out the effects of the channel and then they are simply added together to produce the output. If the channel tap is $h_i = \alpha_i e^{j\theta_i}$, then the co-phasing operation is simply a multiplication of each branch by $e^{-j\theta_i}$. Equal gain combining produces performance similar to MRC and achieves the full diversity gain as demonstrated in [Yac93], but with a 1-3 dB penalty depending on the exact setup and number of antennas.

3.6 Transmit Maximal Ratio Combining

We have considered the SIMO case and now it is time to consider the Multiple-In Single-Out (MISO) case with $n_t$ transmit antennas. The question now is whether any diversity gain can be achieved and, if so, how. With multiple transmit antennas it is important to keep the total transmit power $P$ constant to allow a fair comparison to the cases with only one transmit antenna. A first attempt to achieve diversity gain with multiple transmit antennas is to simply transmit the same symbol on each branch. However, this method does not achieve any diversity gain. To get an intuitive explanation for this one can consider the effective channel that the receiver sees to be

$$\frac{h_1 + h_2 + \cdots + h_{n_t}}{\sqrt{n_t}} \sim \mathcal{CN}(0, 1) \tag{23}$$

The $\frac{1}{\sqrt{n_t}}$ normalizes the transmit power. This effective channel behaves like a scalar Rayleigh channel, so it provides no diversity gain beyond the scalar Rayleigh channel. An approach that does work is Transmit Maximal Ratio Combining (TMRC), which requires CSIT and is a close analog of MRC. The system can be modeled as

$$y = \mathbf{h}\mathbf{x} + v \tag{24}$$

A weighting vector $\mathbf{q}$ sends a weighted version of the current symbol $x$ to each antenna. So then

$$y = \mathbf{h}\mathbf{q}\,x + v \tag{25}$$

Then by a derivation similar to MRC it can be shown that the optimal choice for $\mathbf{q}$ is $\mathbf{q} = \mathbf{h}^*$ [God97]. The diversity gain for this scheme is $n_t$.

3.7 Alamouti Code

Using TMRC requires CSIT, which entails a host of other problems including delay issues and channel estimation accuracy issues. However, Alamouti in [Ala98] showed that in the $2 \times n_r$ case it is possible to achieve the full diversity, $2n_r$, without CSIT using a clever transmit scheme with minimal drawbacks. To get the idea of the Alamouti code consider the $2 \times 1$ case. For a narrowband channel the system can be modeled as

$$y[n] = h_1 x_1[n] + h_2 x_2[n] + v[n] \tag{26}$$

with $h_1$ and $h_2$ the channel coefficients. To transmit two symbols $u_1$ and $u_2$ do the following over two symbol times:
1. During the first symbol time send $x_1[n] = u_1$ and $x_2[n] = u_2$.
2. During the second symbol time send $x_1[n+1] = -u_2^*$ and $x_2[n+1] = u_1^*$.

The system can then be written in matrix form as

$$\begin{bmatrix} y[n] & y[n+1] \end{bmatrix} = \begin{bmatrix} h_1 & h_2 \end{bmatrix} \begin{bmatrix} u_1 & -u_2^* \\ u_2 & u_1^* \end{bmatrix} + \begin{bmatrix} v[n] & v[n+1] \end{bmatrix} \tag{27}$$

The receiver is trying to detect $u_1$ and $u_2$, so it is more convenient to write the system in the following form obtained by conjugating $y[n+1]$:

$$\begin{bmatrix} y[n] \\ y^*[n+1] \end{bmatrix} = \begin{bmatrix} h_1 & h_2 \\ h_2^* & -h_1^* \end{bmatrix} \begin{bmatrix} u_1 \\ u_2 \end{bmatrix} + \begin{bmatrix} v[n] \\ v^*[n+1] \end{bmatrix} \tag{28}$$

The two columns of the square matrix are orthogonal:

$$\begin{bmatrix} h_1^* & h_2 \\ h_2^* & -h_1 \end{bmatrix} \begin{bmatrix} h_1 & h_2 \\ h_2^* & -h_1^* \end{bmatrix} = \begin{bmatrix} |h_1|^2 + |h_2|^2 & 0 \\ 0 & |h_1|^2 + |h_2|^2 \end{bmatrix}$$

Thus this detection problem can be decomposed into simple scalar detection problems by projecting the received vector $\mathbf{y}$ onto each column of the $\mathbf{H}$ matrix. Then the received signal for each symbol that is used for detection is

$$r_i = \|\mathbf{h}\|\, u_i + \tilde{v}_i \tag{29}$$

with $\tilde{v}_i \sim \mathcal{CN}(0, N_o)$. In detection the vector channel is decomposed into a scalar Rayleigh channel for each symbol. It can be shown that the diversity gain is 2. Also the Alamouti code transmits two symbols over two symbol periods, so its effective rate of transmission is the same as the original symbol rate. Since each symbol is transmitted twice, the transmit power of each antenna must be reduced by 3 dB compared to the single antenna case to normalize the total power. This power loss hurts detection but not so much as to make the Alamouti code useless. In fact, there are some advantages to using antennas transmitting with lower power. At lower power it is easier to find cheap amplifiers that can operate in the linear region. Finally, the Alamouti code can be extended to the full $2 \times n_r$ case by using the same transmission scheme as the $2 \times 1$ case and MRC. This method provides the full $2n_r$ diversity gain. The Alamouti code is representative of a larger class of codes called orthogonal space-time block codes (O-STBCs) that also have easy detection due to orthogonality.

Figure 2: Comparison of Alamouti and MRC Error Performance [OC06]

4 MIMO Channel Modeling and Capacity

In this section we will consider several MIMO channels and the physical meaning behind these channels. Of particular interest is how the structure of a MIMO channel suggests the gains of MIMO. For all MIMO channels we will assume the rate of transmission is high enough that the channel will be slow fading, which is a reasonable assumption in any modern high speed wireless system.
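To make the Alamouti scheme of Section 3.7 concrete, the Python sketch below encodes two symbols over two symbol times, passes them through a $2 \times 1$ channel as in (26), and recovers them by projecting onto the orthogonal columns of the matrix in (28). The function names and numerical values are assumptions chosen for illustration, and the combiner divides by $|h_1|^2 + |h_2|^2$ so that the noiseless output equals the transmitted symbols.

```python
def alamouti_encode(u1, u2):
    """Return ((x1[n], x2[n]), (x1[n+1], x2[n+1])) for the 2 x 1 scheme."""
    return (u1, u2), (-u2.conjugate(), u1.conjugate())


def alamouti_receive(h1, h2, slot_a, slot_b, noise=(0j, 0j)):
    """Received samples y[n] and y[n+1], each following equation (26)."""
    y_a = h1 * slot_a[0] + h2 * slot_a[1] + noise[0]
    y_b = h1 * slot_b[0] + h2 * slot_b[1] + noise[1]
    return y_a, y_b


def alamouti_combine(h1, h2, y_a, y_b):
    """Project (y[n], y*[n+1]) onto the two columns of the matrix in (28)."""
    gain = abs(h1) ** 2 + abs(h2) ** 2
    y_b_conj = y_b.conjugate()
    u1_hat = (h1.conjugate() * y_a + h2 * y_b_conj) / gain
    u2_hat = (h2.conjugate() * y_a - h1 * y_b_conj) / gain
    return u1_hat, u2_hat


# Illustrative channel coefficients and symbols (not taken from the paper).
h1, h2 = 0.3 + 0.4j, -0.5 + 0.2j
u1, u2 = 1 + 1j, -1 + 1j
slot_a, slot_b = alamouti_encode(u1, u2)
y_a, y_b = alamouti_receive(h1, h2, slot_a, slot_b)
u1_hat, u2_hat = alamouti_combine(h1, h2, y_a, y_b)
```

Only the receiver uses `h1` and `h2`; the encoder needs no channel knowledge, which is the point of the scheme.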

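4.1 Narrowband MIMO Channel

First, consider the narrowband MIMO channel in which the channel is modeled as a single complex coefficient $h_{ij}$ between the $j$th transmit antenna and the $i$th receive antenna [Gold05]. This can be written as

$$\begin{bmatrix} y_1 \\ \vdots \\ y_{n_r} \end{bmatrix} = \begin{bmatrix} h_{11} & \cdots & h_{1 n_t} \\ \vdots & \ddots & \vdots \\ h_{n_r 1} & \cdots & h_{n_r n_t} \end{bmatrix} \begin{bmatrix} x_1 \\ \vdots \\ x_{n_t} \end{bmatrix} + \begin{bmatrix} v_1 \\ \vdots \\ v_{n_r} \end{bmatrix} \tag{30}$$

with $v_i \sim \mathcal{CN}(0, N_o)$. In this case the system can be modeled with matrices as

$$\mathbf{y} = \mathbf{H}\mathbf{x} + \mathbf{v} \tag{31}$$

This is a nice mathematical formulation, but it offers little insight into what constitutes desirable properties for $\mathbf{H}$. The singular value decomposition (SVD) can provide the desired insight. The SVD of $\mathbf{H}$ is

$$\mathbf{H} = \mathbf{U}\boldsymbol{\Sigma}\mathbf{V}^* \tag{32}$$

with $\mathbf{U} \in \mathbb{C}^{n_r \times n_r}$, $\mathbf{V} \in \mathbb{C}^{n_t \times n_t}$, and $\boldsymbol{\Sigma} \in \mathbb{R}^{n_r \times n_t}$. Both $\mathbf{U}$ and $\mathbf{V}$ are unitary, which means

$$\mathbf{U}\mathbf{U}^* = \mathbf{U}^*\mathbf{U} = \mathbf{I}_{n_r}, \qquad \mathbf{V}\mathbf{V}^* = \mathbf{V}^*\mathbf{V} = \mathbf{I}_{n_t} \tag{33}$$

Let $n_{\min} = \min\{n_r, n_t\}$. Then the matrix $\boldsymbol{\Sigma}$ is zero except on the diagonal, where $\Sigma_{ii} = \sigma_i$ is the $i$th singular value of $\mathbf{H}$. In addition, by convention, $\sigma_1 \ge \sigma_2 \ge \cdots \ge \sigma_{n_{\min}}$. Then the system becomes

$$\mathbf{y} = (\mathbf{U}\boldsymbol{\Sigma}\mathbf{V}^*)\mathbf{x} + \mathbf{v} \tag{34}$$

Define $\tilde{\mathbf{y}} = \mathbf{U}^*\mathbf{y}$, $\tilde{\mathbf{x}} = \mathbf{V}^*\mathbf{x}$, and $\tilde{\mathbf{v}} = \mathbf{U}^*\mathbf{v}$. Then

$$\tilde{\mathbf{y}} = \boldsymbol{\Sigma}\tilde{\mathbf{x}} + \tilde{\mathbf{v}} \tag{35}$$

with $\tilde{\mathbf{v}} \sim \mathcal{CN}(0, N_o \mathbf{I}_{n_r})$, since $\mathbf{U}^*$ is unitary. This coordinate change transforms the complicated system described by $\mathbf{H}$ into the simple system with independent parallel channels described by $\boldsymbol{\Sigma}$.

4.1.1 Narrowband MIMO Channel Capacity

Since the MIMO channel has been decomposed into several parallel channels, the capacity is easy to compute. The capacity that a MIMO system can support in this case assuming CSIR and CSIT is

$$C_{\mathrm{sum}} = \sum_{i=1}^{n_{\min}} B \log_2\!\left(1 + \frac{P_i \sigma_i^2}{N_o}\right) \tag{36}$$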

as demonstrated in [CT91, CKT98, Tel95]. The power allocation $P_i$ can be chosen by trying to maximize $C_{\mathrm{sum}}$ subject to the constraint $\sum_{i=1}^{n_{\min}} P_i = P$. Lagrange multipliers can be used in this case to compute the optimal power allocation.

$$\begin{aligned}
\frac{\partial}{\partial P_i} \sum_{i=1}^{n_{\min}} B \log_2\!\left(1 + \frac{P_i \sigma_i^2}{N_o}\right) &= \lambda \frac{\partial}{\partial P_i} \sum_{i=1}^{n_{\min}} P_i \\
\frac{B \sigma_i^2}{(P_i \sigma_i^2 + N_o)\log(2)} &= \lambda \\
P_i &= \frac{B}{\lambda \log(2)} - \frac{N_o}{\sigma_i^2}
\end{aligned} \tag{37}$$

with $\lambda$ chosen such that $\sum_{i=1}^{n_{\min}} P_i = P$. This power allocation method is known as the waterfilling power allocation. The term $\frac{B}{\lambda \log(2)}$ represents the surface of the water and the $\frac{N_o}{\sigma_i^2}$ term represents the depth of the water for any singular value.

4.1.2 Rank and Condition Number

Let $k$ be the number of nonzero singular values of $\mathbf{H}$, which is also the rank of $\mathbf{H}$. At high SNR the waterfilling allocation is close to the uniform power allocation, so with $\mathrm{SNR} = P/N_o$

$$C \approx \sum_{i=1}^{k} B \log_2\!\left(1 + \frac{P \sigma_i^2}{k N_o}\right) \approx k B \log_2(\mathrm{SNR}) + \sum_{i=1}^{k} B \log_2\!\left(\frac{\sigma_i^2}{k}\right) \tag{38}$$

$k$ is thus the parameter that controls the number of spatial degrees of freedom and hence the number of independent streams that can be multiplexed [TV05]. So we want $k$ as large as possible, which is $n_{\min}$ at most in the case that $\mathbf{H}$ has full rank. That the channel capacity increases linearly in $n_{\min}$ at high SNR is one of the most attractive features of MIMO. Also, Jensen’s inequality can give more information about the behavior of the capacity with respect to $\mathbf{H}$.

$$\frac{1}{k}\sum_{i=1}^{k} B \log_2\!\left(1 + \frac{P \sigma_i^2}{k N_o}\right) \le B \log_2\!\left(1 + \frac{P}{k N_o}\,\frac{1}{k}\sum_{i=1}^{k} \sigma_i^2\right) \tag{39}$$

This suggests that the quantity $\sum_{i=1}^{k} \sigma_i^2$ should be large and that, for a given value of it, the capacity is largest when the bound is met with equality, which happens precisely when all the singular values are equal. In other words $\frac{\sigma_{\max}}{\sigma_{\min}} \approx 1$. In matrix theory this quantity is the condition number, $\kappa(\mathbf{H})$, and a matrix with $\kappa(\mathbf{H}) \approx 1$ is said to be well-conditioned. Thus $\mathbf{H}$ should be well conditioned to ensure a large capacity.

4.2 Physical Modeling of MIMO Channels

The major goal of this section is to see how MIMO’s ability to spatially multiplex depends on the actual propagation environment. Also, this section will examine what must be true of the propagation to ensure that the rank and condition number criteria are satisfied. All antenna arrays in this section are assumed to be linear and uniformly spaced.

4.2.1 LOS SIMO and MISO Channel

Suppose the antennas are uniformly and linearly spaced by $\Delta_r \lambda_c$, where $\Delta_r$ represents the spacing as a fraction of the wavelength. This normalization eliminates many $\lambda$s from subsequent equations.

Figure 3: LOS MISO and SIMO [TV05]

The impulse responses between the transmit antenna and each receive antenna are

$$h_i(\tau) = a\,\delta(\tau - d_i/c) \tag{40}$$

$a$ models the path loss of the propagating wave and the $d_i/c$ term models the time it takes for a propagating EM wave to reach the ith receive antenna [SMB01]. At baseband the channel gain is given by

$$h_i = a\, e^{-j 2\pi d_i / \lambda_c} \tag{41}$$

So then the channel can be modeled with AWGN as

$$y = hx + w \tag{42}$$

with $h = [h_1, h_2, \ldots, h_{n_r}]^T$ and $w \sim \mathcal{CN}(0, N_0 I_{n_r})$. For large $d$

$$d_i \approx d + (i - 1)\Delta_r \lambda_c \cos(\phi) \tag{43}$$

Define $\Omega = \cos(\phi)$. Define the following quantity from [Fle00]:

$$\hat{a}_r(\Omega) = \frac{1}{\sqrt{n_r}} \begin{bmatrix} 1 \\ e^{-j 2\pi \Delta_r \Omega} \\ \vdots \\ e^{-j 2\pi (n_r - 1)\Delta_r \Omega} \end{bmatrix} \tag{44}$$

Then the following important identity holds:

$$\hat{a}_r^*(\Omega)\,\hat{a}_r(\Omega) = \frac{1}{n_r}\left[(1)(1) + e^{j 2\pi \Delta_r \Omega} e^{-j 2\pi \Delta_r \Omega} + \cdots + e^{j 2\pi (n_r - 1)\Delta_r \Omega} e^{-j 2\pi (n_r - 1)\Delta_r \Omega}\right] = 1 \tag{45}$$

Then the channel $h$ can be written as

$$h = a\, e^{-j 2\pi d / \lambda_c} \sqrt{n_r}\, \hat{a}_r(\Omega) \tag{46}$$

as demonstrated in [SMB01]. The channel capacity is

$$C = B \log_2\left(1 + \frac{P \|h\|^2}{N_0}\right) = B \log_2\left(1 + \frac{P a^2 n_r}{N_0}\right) \tag{47}$$

as given in [TV05]. Thus there is a power gain and potentially increased capacity, but no degree of freedom gain, and so no spatial multiplexing is possible. The MISO case is similar and involves the use of

$$\hat{a}_t(\Omega) = \frac{1}{\sqrt{n_t}} \begin{bmatrix} 1 \\ e^{-j 2\pi \Delta_t \Omega} \\ \vdots \\ e^{-j 2\pi (n_t - 1)\Delta_t \Omega} \end{bmatrix} \tag{48}$$
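The LOS SIMO model above can be checked numerically. The following is a small sketch, assuming the far-field approximation (43) holds exactly; the helper names `steering`, `los_simo_channel`, and `capacity`, and all numerical values, are illustrative assumptions rather than part of the original text.

```python
import cmath
import math

def steering(n, delta, omega):
    """Unit-norm spatial signature, as in (44): n antennas spaced delta wavelengths apart."""
    return [cmath.exp(-2j * math.pi * i * delta * omega) / math.sqrt(n) for i in range(n)]

def los_simo_channel(a, d, wavelength, n_r, delta_r, phi):
    """Channel vector of (46): h = a * exp(-j*2*pi*d/lambda_c) * sqrt(n_r) * a_r(cos(phi))."""
    phase = cmath.exp(-2j * math.pi * d / wavelength)
    return [a * phase * math.sqrt(n_r) * x for x in steering(n_r, delta_r, math.cos(phi))]

def capacity(h, power, noise_power, bandwidth=1.0):
    """Equation (47): B*log2(1 + P*||h||^2/N0)."""
    gain = sum(abs(x) ** 2 for x in h)
    return bandwidth * math.log2(1 + power * gain / noise_power)

if __name__ == "__main__":
    h = los_simo_channel(a=0.5, d=100.0, wavelength=0.1, n_r=4, delta_r=0.5, phi=math.pi / 3)
    print(sum(abs(x) ** 2 for x in h))           # equals a^2 * n_r up to rounding
    print(capacity(h, power=3.0, noise_power=1.0))
```

The squared norm of `h` depends only on $a$ and $n_r$, not on the angle $\phi$, which is the power gain without any degree of freedom gain described above.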

4.2.2 LOS MIMO

Similarly to the SIMO case, the baseband equivalent channel is

$$h_{ij} = a\, e^{-j 2\pi d_{ij} / \lambda_c} \tag{49}$$

If $d$ is large then

$$d_{ij} \approx d + (i - 1)\Delta_r \lambda_c \cos(\phi_r) - (j - 1)\Delta_t \lambda_c \cos(\phi_t) \tag{50}$$

as shown in [TV05]. Define $\Omega_r = \cos(\phi_r)$ and $\Omega_t = \cos(\phi_t)$. Then the channel matrix is given by

$$H = a \sqrt{n_t n_r}\, e^{-j 2\pi d / \lambda_c}\, \hat{a}_r(\Omega_r)\, \hat{a}_t^*(\Omega_t) \tag{51}$$

In this case $H$ has rank 1 and the only nonzero singular value is $a \sqrt{n_t n_r}$. Then the capacity is

$$C = B \log_2\left(1 + \frac{P a^2 n_t n_r}{N_0}\right) \tag{52}$$

This is the same result as the SIMO/MISO case: no degree of freedom gain.

4.2.3 Geographically Separated MIMO

Still consider LOS propagation and the narrowband case.

Figure 4: Geographically Distributed Antenna Arrays [TV05]

Then the channel between the kth transmit antenna and all the receive antennas is

$$h_k = a_k \sqrt{n_r}\, e^{-j 2\pi d_k / \lambda_c}\, \hat{a}_r(\Omega_{rk}) \tag{53}$$

with $d_k$ the distance between the kth transmit antenna and the first receive antenna [PNG03, Her04]. $\hat{a}_r(\Omega)$ is periodic with period $1/\Delta_r$. Also, the function $\hat{a}_r(\Omega)$ doesn't take on the same value twice in one period, so $\hat{a}_r(\Omega_{r1})$ and $\hat{a}_r(\Omega_{r2})$ are linearly independent as long as $\Omega_{r1} - \Omega_{r2}$ is not an integer multiple of $1/\Delta_r$. In the case of two transmit antennas and $n_r$ receive antennas, as long as the two directional cosines do not differ by an integer multiple of $1/\Delta_r$, the two columns of $H$ are linearly independent and thus $H$ has full rank. Thus in this case spatial multiplexing is possible.

Now what remains to be considered is whether $H$ is well-conditioned. To determine this, consider the angle $\theta$ between the two columns of $H$ associated with the two transmit antennas. This angle satisfies

$$|\cos(\theta)| = |\hat{a}_r^*(\Omega_{r1})\, \hat{a}_r(\Omega_{r2})| \tag{54}$$

$$= \left| \frac{\sin(\pi L_r \Omega_r)}{n_r \sin(\pi L_r \Omega_r / n_r)} \right| \tag{55}$$

with $L_r = n_r \Delta_r$ and $\Omega_r = \Omega_{r2} - \Omega_{r1}$. Then, taking equal path gains $a_1 = a_2 = a$, the two singular values are

$$\lambda_1 = \sqrt{a^2 n_r (1 + |\cos\theta|)}, \qquad \lambda_2 = \sqrt{a^2 n_r (1 - |\cos\theta|)} \tag{56}$$

Thus

$$\kappa(H) = \sqrt{\frac{1 + |\cos\theta|}{1 - |\cos\theta|}} \tag{57}$$

Thus the matrix is ill conditioned whenever $|\cos(\theta)| \approx 1$, which occurs when

$$\left| \Omega_r - \frac{m}{\Delta_r} \right| \ll \frac{1}{L_r} \tag{58}$$

for some integer $m$. So when the directional cosines of two angular paths differ by less than $\frac{1}{L_r}$, the receiver can't distinguish between the two paths. This is similar to the case in frequency selective channels, in which the bandwidth of the system controls which multipath delays can be resolved.

4.2.4 Two-Ray MIMO

Consider the full MIMO case with antenna arrays at both the transmitter and receiver. Let $d^{(i)}$ be the distance between transmit antenna 1 and receive antenna 1 along path $i$. Define

$$a_i^b = a_i \sqrt{n_t n_r}\, e^{-j 2\pi d^{(i)} / \lambda_c} \tag{59}$$

Then the channel matrix can be expressed as

$$H = a_1^b\, \hat{a}_r(\Omega_{r1})\, \hat{a}_t^*(\Omega_{t1}) + a_2^b\, \hat{a}_r(\Omega_{r2})\, \hat{a}_t^*(\Omega_{t2}) \tag{60}$$

as in [PNG03, Her04].

Figure 5: Two-Ray MIMO [TV05]

This expression for the channel can be put in matrix form as

$$H = \begin{bmatrix} a_1^b\, \hat{a}_r(\Omega_{r1}) & a_2^b\, \hat{a}_r(\Omega_{r2}) \end{bmatrix} \begin{bmatrix} \hat{a}_t^*(\Omega_{t1}) \\ \hat{a}_t^*(\Omega_{t2}) \end{bmatrix} \tag{61}$$

To ensure $H$ has rank 2 the following two conditions must hold:

$$\Omega_{t1} \neq \Omega_{t2} \mod \frac{1}{\Delta_t} \tag{62}$$

$$\Omega_{r1} \neq \Omega_{r2} \mod \frac{1}{\Delta_r} \tag{63}$$

Then $H$ has rank 2, so spatial multiplexing is possible. To ensure that $H$ is well conditioned it is necessary that $|\Omega_{r2} - \Omega_{r1}| \ge \frac{1}{L_r}$ and $|\Omega_{t2} - \Omega_{t1}| \ge \frac{1}{L_t}$; that is to say, there must be sufficient angular separation at the transmitter and receiver to ensure that the paths can be resolved. If there are an arbitrary number of paths then the channel is given by

$$H = \sum_i a_i^b\, \hat{a}_r(\Omega_{ri})\, \hat{a}_t^*(\Omega_{ti}) \tag{64}$$

4.3 Statistical Modeling of MIMO Channels

In the case of a frequency selective channel, the channel can be modeled as an FIR filter with taps $\{h[n]\}$. In this case not all individual multipath components can be resolved, but only multipath components that differ in delay by a sufficient amount related to the system bandwidth. In modeling a MIMO channel the interest is not in time resolution of multipath but in angular resolution at the transmitter and receiver [Par00]. Suppose the transmit and receive antenna array lengths are $L_t$ and $L_r$. Paths that have $\Omega$s that differ by less than $\frac{1}{L_t}$ at the transmitter or $\frac{1}{L_r}$ at the receiver cannot be resolved. The term $h_{ij}$ is the aggregation of all paths of angular spacing $\frac{1}{L_t}$ about $\frac{j}{L_t}$ and angular spacing $\frac{1}{L_r}$ about $\frac{i}{L_r}$. The received and transmitted signals can always be expressed in terms of the following pair of

bases:

$$S_r = \left\{ \hat{a}_r(0),\ \hat{a}_r\left(\frac{1}{L_r}\right),\ \ldots,\ \hat{a}_r\left(\frac{n_r - 1}{L_r}\right) \right\} \tag{65}$$

$$S_t = \left\{ \hat{a}_t(0),\ \hat{a}_t\left(\frac{1}{L_t}\right),\ \ldots,\ \hat{a}_t\left(\frac{n_t - 1}{L_t}\right) \right\} \tag{66}$$

which represent the angular bins.

Figure 6: Angular Domain MIMO [TV05]

Each basis can be used to represent transmitted and received signals in the angular domain in terms of the directional cosine $\Omega$. Let $U_t$ be the $n_t \times n_t$ matrix with columns from $S_t$. If $x$ is a vector transmitted by the antennas, then $x$ and its angular domain representation $x^a$ are related by

$$x = U_t x^a, \qquad x^a = U_t^* x \tag{67}$$

By examining the matrix $U_t$ it can be seen that $x^a$ is the IDFT of $x$. Similarly, let $U_r$ be the $n_r \times n_r$ matrix with columns from $S_r$ and define $y^a = U_r^* y$. In this coordinate system

$$y^a = U_r^* H U_t x^a + v^a = H^a x^a + v^a \tag{68}$$

Each element $h^a_{ij}$ can be reasonably modeled as an independent circularly symmetric complex Gaussian r.v., like the Rayleigh channel. The validity of this assumption rests on two key factors

• Amount of scattering and reflection in the multipath environment - this model needs several multipath components in each angular bin
• The lengths $L_t$ and $L_r$ - Short antenna arrays lump many multipath components into the same angular bin. A longer antenna array results in better angular resolution of paths and more non-zero entries in $H^a$.

Since $U_t$ and $U_r$ are unitary and

$$H = U_r H^a U_t^* \tag{69}$$

$H$ has the same iid Gaussian distribution [CT91]. Thus in the narrowband case the MIMO channel is basically an extension of the scalar Rayleigh channel where each coefficient of the channel matrix is a complex Gaussian random variable. If there is a strong line-of-sight component, then the fading is not Rayleigh but Ricean. In addition, results from random matrix theory show that $H$ with this distribution has full rank with probability 1. Thus the channel in this model can support spatial multiplexing.

Antenna Spacing

The assumption that the coefficients of $H$ are independent, or at least uncorrelated, depends heavily on the antenna spacing. In practice the channel coefficients are never completely uncorrelated, but as a simplifying assumption to make analysis tractable we assume they are uncorrelated and independent. The exact amount of correlation depends on the angular spread of the antennas. As a rule of thumb, antenna spacing of at least $\frac{\lambda}{2}$ is desirable and results in uncorrelated coefficients [FG98]. For antennas with small angular spread at separations on the order of $\frac{\lambda}{4}$ or smaller the coefficients are highly correlated. As the antenna spacing decreases towards $\frac{\lambda}{4}$ the channel coefficients become strongly correlated. Since the coefficients are highly correlated, the receiver does not see as many independent copies of the transmitted signal, so the achievable diversity gain is reduced. There is still a diversity gain at such spacings, but it is not as large as if the antennas were spaced further apart.

4.3.1 Frequency Selective MIMO Channel

The extension of the preceding flat MIMO channel model to the frequency selective MIMO channel model is fairly straightforward. In this model the channel between any pair of antennas is modeled as a scalar frequency selective channel in which the output is a convolution of the input and the channel taps. The channel in this case can be modeled as

$$y[n] = \sum_{l=1}^{N} H_l\, x[n - l] + v[n] \tag{70}$$

as in [TV05]. The justification for this model is a straightforward extension of the angular model outlined in the previous sections.

5 Diversity-Multiplexing Tradeoff

A MIMO system can transmit one symbol on all the transmit antennas and use the right processing to obtain the full diversity gain $n_t n_r$. On the other hand, a MIMO system can transmit $n_{min}$ independent streams to provide the maximum possible rate with the minimum error protection. The diversity-multiplexing tradeoff involves investigating what happens between these two extremes and in particular what constitutes the optimal tradeoff: when transmitting at a given rate, what is the maximum possible diversity gain? This tradeoff curve is difficult to compute, but some methods have been proposed to simplify the study of this tradeoff. Tse and Zheng proposed in [ZT03] studying this tradeoff by making assumptions on the possible rates of transmission and letting the SNR approach infinity. At high SNR the MIMO capacity is

$$C \approx n_{min} \log_2(\text{SNR}) \tag{71}$$

for a channel with full rank. Tse and Zheng assumed that only rates $R = r \log(\text{SNR})$ are possible with $r = 0, 1, \ldots, n_{min}$. This kind of analysis leads to a curve relating the transmit rate and the optimal diversity gain. Of great interest is whether a given space-time code or modulation can achieve this frontier and thus be optimal. The optimal diversity gain, $d^*(r)$, is the exponent in the outage probability, so

$$p_{out} \approx \text{SNR}^{-d^*(r)} \tag{72}$$

Thus it makes sense to define

$$d^*(r) = -\lim_{\text{SNR} \to \infty} \frac{\log p_{out}(r \log \text{SNR})}{\log \text{SNR}} \tag{73}$$

Alternatively $d^*(r)$ can be defined in terms of the probability of error

$$d^*(r) = -\lim_{\text{SNR} \to \infty} \frac{\log P_e(r \log \text{SNR})}{\log \text{SNR}} \tag{74}$$

Before tackling the full MIMO channel it is useful to consider the diversity-multiplexing tradeoff in scalar and SIMO/MISO channels.

5.1 Scalar Rayleigh Channel

The scalar channel is in outage if the capacity it supports falls below the rate of transmission. So $p_{out}$ is given by

$$p_{out} = P\left[\log\left(1 + |h|^2\, \text{SNR}\right) < r \log \text{SNR}\right] = P\left[|h|^2 < \frac{\text{SNR}^r - 1}{\text{SNR}}\right]$$

$|h|^2$ is chi-squared distributed. For sufficiently small $\epsilon$, $P[|h|^2 < \epsilon] \approx \epsilon$. Thus the outage probability is approximately

$$p_{out} \approx \frac{1}{\text{SNR}^{1 - r}} \tag{75}$$

Thus $d^*(r) = 1 - r$ is the optimal tradeoff.
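The limit defining $d^*(r)$ for the scalar channel can be illustrated numerically. The sketch below assumes $|h|^2$ is exponentially distributed with unit mean (Rayleigh fading normalized so that $E[|h|^2] = 1$), which is an assumption made here for concreteness, and estimates the slope of $\log p_{out}$ against $\log \text{SNR}$ between two large SNR values. The function names are hypothetical.

```python
import math

def outage_probability(snr, r):
    """Exact outage P[|h|^2 < (SNR^r - 1)/SNR] when |h|^2 ~ Exp(1) (assumed normalization)."""
    threshold = (snr ** r - 1.0) / snr
    return -math.expm1(-threshold)        # 1 - exp(-threshold), computed accurately

def diversity_slope(r, snr_lo=1e6, snr_hi=1e8):
    """Finite-SNR estimate of d*(r): negative slope of log p_out versus log SNR."""
    p_lo = outage_probability(snr_lo, r)
    p_hi = outage_probability(snr_hi, r)
    return -(math.log(p_hi) - math.log(p_lo)) / (math.log(snr_hi) - math.log(snr_lo))

if __name__ == "__main__":
    for r in (0.25, 0.5, 0.75):
        # The estimate approaches 1 - r as the two SNR values grow.
        print(r, diversity_slope(r))
```

The estimate is only meaningful for $0 < r < 1$; at $r = 0$ the threshold is zero and the outage probability vanishes under this rate model.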

5.1.1 QAM over the Scalar Rayleigh Channel

It can be demonstrated that for QAM $P_e \approx \frac{2^R}{\text{SNR}}$. Then

$$d(r) = -\lim_{\text{SNR} \to \infty} \frac{\log P_e}{\log \text{SNR}} = -\lim_{\text{SNR} \to \infty} \frac{\log\left(2^{r \log \text{SNR}} / \text{SNR}\right)}{\log \text{SNR}} = -\lim_{\text{SNR} \to \infty} \frac{r \log \text{SNR} - \log \text{SNR}}{\log \text{SNR}} = 1 - r \tag{76}$$

Thus QAM achieves the optimal diversity-multiplexing tradeoff of the scalar Rayleigh channel.

5.2 MISO Rayleigh Channel

In this case the system can be modeled as

$$y[n] = h\, x[n] + w[n] \tag{77}$$

Taking the rate $R = r \log \text{SNR}$ as usual, the outage probability is

$$p_{out} = P\left[\log\left(1 + \|h\|^2 \frac{\text{SNR}}{n_t}\right) < r \log \text{SNR}\right] \tag{78}$$

$\|h\|^2$ is $\chi^2_{2 n_t}$ distributed, so the approximation $P[\|h\|^2 < \epsilon] \approx \epsilon^{n_t}$ can be used. Then the outage probability is roughly

$$p_{out} \approx \text{SNR}^{-n_t (1 - r)} \tag{79}$$

So it is apparent that the optimal tradeoff is $d^*(r) = n_t (1 - r)$. The Alamouti code effectively decomposes the MISO channel into parallel Rayleigh channels. It can be easily demonstrated that the optimal tradeoff curve for this parallel Rayleigh channel is $d^*(r) = 2(1 - r)$. So if QAM is used on each of the scalar channels along with the Alamouti code, then the resulting system is tradeoff optimal for the MISO channel with $n_t = 2$.

5.3 MIMO Rayleigh Channel

The outage probability is given by

$$p_{out} = \min_{K_x :\, \text{Tr}[K_x] \le \text{SNR}} P\left[\log \det\left(I_{n_r} + H K_x H^*\right) < r \log \text{SNR}\right] \tag{80}$$

The matrix $K_x$ is the covariance matrix of the input and basically represents a power allocation. The power allocation at the transmitter directly affects the SNR at the receiver. This scheme

makes a specific assumption about the rate $R$ at a given SNR, so the input covariance matrix must be chosen not to exceed the limit. The worst covariance matrix $K_x$ is approximately $\frac{\text{SNR}}{n_t} I_{n_t}$, so

$$p_{out} = P\left[\log \det\left(I_{n_r} + \frac{\text{SNR}}{n_t} H H^*\right) < r \log \text{SNR}\right] \tag{81}$$

This outage probability can be written in terms of the singular values of $H$ as

$$p_{out} = P\left[\sum_{i=1}^{n_{min}} \log\left(1 + \frac{\text{SNR}}{n_t} \sigma_i^2\right) < r \log \text{SNR}\right] \tag{82}$$

There are no neat approximations to evaluate this outage probability, but there is a neat geometric argument to evaluate it [TV05, ZT03]. First consider $r$ close to 0. Outage occurs when $H$ is close to 0. Closeness can be evaluated in terms of the Frobenius norm

$$\|H - 0\|_F^2 = \|H\|_F^2 = \sum_{i=1}^{n_{min}} \sigma_i^2 = \sum_{i,j} |h_{ij}|^2$$

Thus the magnitude of each channel coefficient $|h_{ij}|$ must be close to 0 for the channel to be in outage.

Now if $r$ is an integer greater than 0 the situation becomes considerably more complicated, since there are more ways to choose bad $\lambda_i$ to put the channel in outage. The situation seems hopeless, but it has been shown by Tse and Zheng that although there are many ways for the channel to be in outage, the most common way is for $r$ eigenchannels to be good and the remainder to be bad. In this case $H$ has rank $r$ and $H$ is in the space $V_r$ of rank $r$ matrices in the space $\mathbb{C}^{n_t \times n_r}$. So the question of whether $H$ puts the channel in outage is the question of whether $H$ is close to $V_r$ in the appropriate sense. This question is tractable but also a little tricky, since $V_r$ is not a linear space. The following paragraph is very technical but the fundamental result is simple: $V_r$ can be considered to be a linear space in a sufficiently small neighborhood.

To see that $V_r$ is not linear, consider that if $V_r$ were a linear space, then $0 \in V_r$. But clearly 0 has rank 0, so $0 \notin V_r$. Thus $V_r$ is not a linear subspace of $\mathbb{C}^{n_t \times n_r}$. However, it is a manifold embedded in $\mathbb{C}^{n_t \times n_r}$. A manifold is a space with the property that small neighborhoods of a point look like linear subspaces of $\mathbb{R}^k$ or $\mathbb{C}^k$. For example, the surface of Earth is a manifold, since a small neighborhood looks like a portion of $\mathbb{R}^2$ even though the overall space is clearly not linear. The question of interest is what happens when $H$ is close to $V_r$, so it is sufficient to consider a small neighborhood $N$ of a point of $V_r$ containing $H$. $N$ looks like a linear subspace of $\mathbb{C}^{n_t \times n_r}$. For the remainder of this argument restrict our consideration to $N$. Since $V_r$ can be considered locally linear, the notion of orthogonality can be used, even though $V_r$ is not globally a linear subspace. Then $H$ can be decomposed into a portion in $V_r$ and a portion in the space $V_r^{\perp}$, which is orthogonal to $V_r$. If the portion of $H$ in $V_r^{\perp}$ vanishes, then $H$ is basically in $V_r$, $H$ has rank $r$, and so the channel is

in outage as discussed before. The probability that the portion of $H$ in $V_r^{\perp}$ vanishes (the outage probability) is $\text{SNR}^{-d}$, where $d$ is the dimension of $V_r^{\perp}$. If $H$ is of rank $r$, then $r$ rows of length $n_t$ can be chosen and the remaining $n_r - r$ rows can be written as linear combinations of the first $r$ rows. From this it follows that $\dim V_r = n_t r + (n_r - r) r$. Since $V_r$ and $V_r^{\perp}$ decompose the $n_t \times n_r$ space,

$$n_t n_r = \dim \mathbb{C}^{n_t \times n_r} = \dim V_r + \dim V_r^{\perp}$$

Thus

$$\dim V_r^{\perp} = n_t n_r - (n_t r + (n_r - r) r) = (n_t - r)(n_r - r) \tag{83}$$

Thus $p_{out} \approx \text{SNR}^{-(n_t - r)(n_r - r)}$ and so the optimal tradeoff is given by $d^*(r) = (n_t - r)(n_r - r)$ for $r = 0, 1, \ldots, n_{min}$.

Figure 7: Diversity-Multiplexing Tradeoff For MIMO [TV05]

6 Space-Time Coding over Narrowband Channels

There are two major types of space-time codes: block codes and trellis codes, which are derived from the similar structures in the single antenna case. Their names imply their structures. The basic idea of a space-time block code is to map $Q$ symbols into a block of transmitted symbols of size $n_t \times T$ for some integer $T$. A trellis code is a convolutional code in which the current output depends on a block of input bits and the previous input bits represented by the state of the trellis code. One general assumption on almost all space-time codes is the quasi-static assumption, which assumes that the channel remains constant over the duration of a codeword. The rate at which the

channel changes is related to the coherence time, which is in turn related to the Doppler spread. The system must be designed to ensure that the duration of a codeword is less than the coherence time. The channel can change between codewords, but not in the middle of codewords.

Figure 8: Space-Time Encoder Structure

6.1 Error Motivated Design

It is important and interesting to find conditions that will guarantee a good error performance for a space-time code. One approach to finding these conditions for the slow fading MIMO channel is to consider what factors affect ML decoding of the codewords. The optimal way to detect a codeword is ML detection, given by

$$\hat{C} = \arg\min_{C \in \mathcal{C}} \|Y - HC\|^2 \tag{84}$$

The operation of this detector is limited mainly by the closest pair of codewords. If two codewords are close together, then noise can lead to incorrect estimation of a codeword as another codeword. Then the error probability of interest is the pairwise error probability (PEP) that a codeword $C$ is incorrectly decoded as $E$. Conditioning on the channel matrix $H$ the PEP is [Pro01]

$$P[C \to E \mid H] = Q\left(\sqrt{\frac{\text{SNR}}{2} \sum_{k=0}^{T} \|H(c_k - e_k)\|_F^2}\right) \tag{85}$$

Averaging over all channel realizations gives the average PEP: $P[C \to E]$. In a way similar to the diversity-multiplexing tradeoff, the diversity gain, $d_g$, can be defined in terms of the PEP as

$$d_g = -\lim_{\text{SNR} \to \infty} \frac{\log P[C \to E]}{\log \text{SNR}} \tag{86}$$

Generally at high SNR the PEP is of the form $(c \times \text{SNR})^{-d_g}$. The quantity $c$ improves performance and is called the coding gain. A good space-time code should then achieve a high diversity gain and a high coding gain. The relevant question now is how to achieve diversity and coding gains. The covariance of two

codewords $C$ and $E$ is the matrix $\tilde{E} = (E - C)(E - C)^*$. Then the PEP is given by [SA00, Sim01]

$$P[C \to E] = \frac{1}{\pi} \int_0^{\pi/2} \det\left(I_{n_t} + \frac{\text{SNR}}{4 \sin^2 \beta} \tilde{E}\right)^{-n_r} d\beta = \frac{1}{\pi} \int_0^{\pi/2} \prod_{i=1}^{\text{rank}(\tilde{E})} \left(1 + \frac{\text{SNR}}{4 \sin^2 \beta} \lambda_i\right)^{-n_r} d\beta \le \prod_{i=1}^{\text{rank}(\tilde{E})} \left(1 + \frac{\text{SNR}}{4} \lambda_i\right)^{-n_r}$$

with the second expansion due to expressing the determinant in terms of the eigenvalues $\lambda_i(\tilde{E})$ and the last expansion valid at high SNR. This expression can be further bounded to yield

$$P[C \to E] \le \left(\frac{\text{SNR}}{4}\right)^{-n_r\, \text{rank}(\tilde{E})} \left(\prod_{i=1}^{\text{rank}(\tilde{E})} \lambda_i\right)^{-n_r} \tag{87}$$

Thus the diversity gain is $n_r\, \text{rank}(\tilde{E})$ and the coding gain is $\prod_{i=1}^{\text{rank}(\tilde{E})} \lambda_i$. Given these two gains, there are two criteria for a good space-time code at high SNR, as follows [TSC98]:

• Rank Criterion - Maximize the minimum rank of the codeword difference matrix to achieve a good diversity gain always:

$$\max \left( \min_{C, E \in \mathcal{C},\ C \neq E} \text{rank}(\tilde{E}) \right) \tag{88}$$

• Determinant Criterion - Maximize the product of the nonzero eigenvalues to achieve coding gain:

$$d_\lambda = \min_{C, E \in \mathcal{C},\ C \neq E} \prod_{i=1}^{\text{rank}(\tilde{E})} \lambda_i \tag{89}$$

In the case where the codeword difference matrix always has full rank this becomes: maximize

$$d_\lambda = \min_{C, E \in \mathcal{C},\ C \neq E} \det \tilde{E} \tag{90}$$

These criteria guarantee good codes at high SNR.

6.2 Space-Time Block Codes

A space-time block code (STBC) maps a block of $Q$ input symbols into a block of symbols of size $n_t \times T$ to be transmitted on the antennas. A quantity of interest is the effective symbol rate of the code:

$$r_s = \frac{Q}{T} \tag{91}$$

For $r_s = 1$ the system effectively transmits one symbol per symbol period. For $r_s < 1$ the system on average transmits less than one symbol per symbol period. Codes with $r_s < 1$ effectively reduce the rate of transmission.

6.2.1 Linear STBCs

There are many different classes of space-time block codes, but one of the most common is the linear block code. The codeword of a linear block code can be expressed as a linear function of complex $n_t \times T$ basis matrices $\phi_q$ and input symbols $c_1, c_2, \ldots, c_Q$ as follows [HH01]:

$$C = \sum_{q=1}^{Q} \phi_q\, \Re\{c_q\} + \phi_{q+Q}\, \Im\{c_q\} \tag{92}$$

It may seem a little odd to break up the real and imaginary components of the symbols, but the advantage of this approach is that conjugation of symbols can be used in linear STBCs. The following example with the Alamouti code shows that this is possible.

Example: Alamouti code

The two complex symbols $c_1$ and $c_2$ are mapped into the following matrix, which represents the Alamouti code:

$$\begin{bmatrix} c_1 & -c_2^* \\ c_2 & c_1^* \end{bmatrix} \tag{93}$$

Then the code can be represented with basis matrices as:

$$\phi_1 = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \quad \phi_2 = \begin{bmatrix} 0 & -1 \\ 1 & 0 \end{bmatrix} \quad \phi_3 = \begin{bmatrix} j & 0 \\ 0 & -j \end{bmatrix} \quad \phi_4 = \begin{bmatrix} 0 & j \\ j & 0 \end{bmatrix} \tag{94}$$

Code Design Criteria for Linear STBCs

As we saw in the previous section, minimizing the worst PEP is a good strategy to develop a good space-time code. In the case of linear STBCs, if the basis matrices are unitary, meaning $\phi^* \phi = I_T$ if $T \le n_t$ (tall matrix) or $\phi \phi^* = I_{n_t}$ if $T \ge n_t$ (wide matrix), then the PEP condition is

$$\phi_q \phi_p^* + \phi_p \phi_q^* = 0, \quad q \neq p \quad \text{(Wide)} \tag{95}$$

$$\phi_q^* \phi_p + \phi_p^* \phi_q = 0, \quad q \neq p \quad \text{(Tall)} \tag{96}$$

Orthogonal STBCs

There is a special class of linear STBCs that have a special orthogonality property that leads to easy decoding [TJC99]. An orthogonal STBC has codewords $C$ that satisfy the following key property:

$$C C^* = \frac{T}{Q n_t} \left[\sum_{q=1}^{Q} |c_q|^2\right] I_{n_t} \tag{97}$$
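The basis-matrix form of the Alamouti code and its orthogonality can be checked with a few lines of code. This is a sketch under two stated assumptions: the factor $j$ is folded into the basis matrices that multiply the imaginary parts, and the codeword is left unnormalized, so $C C^*$ equals $(|c_1|^2 + |c_2|^2) I$ rather than carrying the power scaling in (97). The function names are hypothetical.

```python
def alamouti_from_basis(c1, c2):
    """Build the Alamouti codeword as a real-linear combination of four 2x2 basis matrices."""
    basis = [
        [[1, 0], [0, 1]],        # multiplies Re{c1}
        [[0, -1], [1, 0]],       # multiplies Re{c2}
        [[1j, 0], [0, -1j]],     # multiplies Im{c1} (j folded in: an assumption)
        [[0, 1j], [1j, 0]],      # multiplies Im{c2} (j folded in: an assumption)
    ]
    weights = [c1.real, c2.real, c1.imag, c2.imag]
    return [[sum(w * b[i][k] for w, b in zip(weights, basis)) for k in range(2)]
            for i in range(2)]

def gram(m):
    """Return M M^* (product of M with its conjugate transpose)."""
    rows, cols = len(m), len(m[0])
    return [[sum(m[i][k] * m[j][k].conjugate() for k in range(cols)) for j in range(rows)]
            for i in range(rows)]

if __name__ == "__main__":
    c1, c2 = 1 + 2j, 3 - 1j
    code = alamouti_from_basis(c1, c2)
    print(code)        # [[c1, -conj(c2)], [c2, conj(c1)]]
    print(gram(code))  # (|c1|^2 + |c2|^2) times the identity
```

Because the Gram matrix is a scaled identity, the two symbols decouple at the receiver, which is what makes symbol-by-symbol decoding possible.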

. . However. this is not very useful as many constellations such as QAM are complex. . If the O represent Alamouti codewords. However. c2Q ) O(cQ+1 . then the codeword matrix is   c1 −c∗ c3 −c∗ 2 4 1 c c∗ c4 c∗  1 3  (100) Q(c1 . c4 ) =  2 2  c3 −c∗ c1 −c∗  4 2 c4 c∗ c2 c∗ 1 3 Then during decoding the codeword matrix is multiplied  a 0 b 1 0 a 0  QQ∗ = 4 a 0 b 0 a 0 where 4 by its conjugate. . . . Quasi Orthogonal STBCs O-STBC achieve full diversity but at the expense of any spatial multiplexing. . Quasi Orthogonal STBCs (QO-STBCs) attempt to achieve some of the benefits of O-STBCs while also providing for some spatial multiplexing by using smaller O-STBCs as building blocks. . . . It is clear in the case of Alamouti that it takes two symbol times to transmit two symbols. . which works on complex constellations. . . C2Q ) = O(c1 . . For more than two transmit antennas. . it turns out that the Alamouti code is the only O-STBC that works on complex symbols that achieves a transmit rate rs of one symbol per second. If rs < 1 then it is always 2 possible to find an O-STBC that achieves good diversity. . cQ ) (99) gs ) gs ∈ [0. rs ] rs (98) were each O is a codeword matrix for a smaller O-STBC on only Q input symbols [TBH00]. For example a QO-STBC could be Q(c1 . For a purely real constellation it is always possible to find a real O-STBC for an nt that achives rs = 1. . rs < 1 always.This property is very nice because it implies that easy decoding is possible due to the orthogonality. so the transmit rate rs = 1. The key example of an O-STBC is the Alamouti code. c3 . which yields  0 b   0  b (101) a = b = q=1 c1 c∗ 3 |cq |2 + c3 c∗ − c2 c∗ − c4 c∗ 4 2 1 27 . cQ ) O(cQ+1 . . The diversity multiplexing tradeoff for O-STBCs is given by [OC06] as d∗ (gs ) = nt nr (1 − for QAM constellations. . . c2Q ) O(c1 . c2 .

The codeword matrix doesn't nicely decouple like in the case of O-STBC, but at least the first/third and second/fourth columns can be decoded separately, which greatly reduces complexity. Other combinations of O-STBCs have been proposed, including the following Alamouti-like scheme [Jaf01]:

$$\mathcal{Q}(c_1, \ldots, c_{2Q}) = \begin{bmatrix} \mathcal{O}(c_1, \ldots, c_Q) & -\mathcal{O}(c_{Q+1}, \ldots, c_{2Q})^* \\ \mathcal{O}(c_{Q+1}, \ldots, c_{2Q}) & \mathcal{O}(c_1, \ldots, c_Q)^* \end{bmatrix} \tag{102}$$

Decoding with this scheme has complexity similar to the previous case of QO-STBCs.

Rotated QO-STBCs

Because of the way quasi orthogonal matrices are constructed, if two codewords $E$ and $C$ each contain one point from the constellation, then $\det(\tilde{E}) = 0$, which means the QO-STBC fails the rank condition. This implies that in some cases QO-STBCs will have bad diversity gain. A way to improve on this is to use rotated variations of the base constellation to prevent rank deficiencies and achieve good diversity gain [SP03, SX04, WX05, XL05].

Linear Dispersion Codes

The BLAST architecture achieves high multiplexing gain at the expense of diversity gain. O-STBCs in contrast achieve high diversity gain at the expense of multiplexing gain. Linear dispersion codes (LDCs) try to achieve a little of both. LDCs are derived through numerical optimization to determine which basis matrices are optimal relative to some criterion that balances diversity and multiplexing gain. There have been several LDCs proposed, including

1. Hassibi and Hochwald LDCs [HH01]
2. Heath and Sandhu LDCs [Hea01, San02]

Algebraic STBCs

The Alamouti code works by transmitting two symbols and then their conjugates arranged in the appropriate way. Algebraic codes also transmit a symbol twice, but instead of transmitting a conjugate they transmit a rotated version of the first set of symbols. In terms of the codeword matrix this can be written as

$$C = \begin{bmatrix} u_1 & \phi^{1/2} v_1 \\ \phi^{1/2} v_2 & u_2 \end{bmatrix} \tag{103}$$

with

$$\begin{bmatrix} u_1 \\ u_2 \end{bmatrix} = M_1 \begin{bmatrix} c_1 \\ c_2 \end{bmatrix}, \qquad \begin{bmatrix} v_1 \\ v_2 \end{bmatrix} = M_2 \begin{bmatrix} c_3 \\ c_4 \end{bmatrix} \tag{104}$$

$M_1$ and $M_2$ are unitary matrices that represent the rotations, and the constellation points come from QAM. Fundamentally, designing an algebraic code comes down to choosing the appropriate matrices $M_1$, $M_2$, and $\phi$.

(108) (109) √ 1+ 5 2 The figure below shows how these space-time codes compare to the optimal diversity-multiplexing tradeoff: 29 . Finally. Golden Code This code [BRV05] is given by 1 M1 = √ 10 1 M2 = √ 10 α and θ are chosen in terms of the golden ratio α αθ α αθ 1 0 0 j and the constellation.φ code 1 1 ejπ/4 M1 = M2 = (106) 2 1 e−jπ/4 Tilted QAM This code [YW03] is given by 1 Mi = √ 2 cos ωi sin ωi − sin ωi cos ωi (107) This choice of Mi is literally a rotation matrix that rotates points about the origin by ω i radians. Threaded Algebraic Space-Time Code(TAST) This code [GD03] is similar to the B2.B2. Optimization methods can be used to find φ. φ = ejω .φ code In this code [DTB02] M1 = M2 = 1 2 1 ejω 1 e−jω (105) and ω is chosen by numerical optimization to fit the given constellation.

Figure 9: Diversity-Multiplexing Tradeoff For Several Techniques [OC06]

The figure below shows the error performance of several space-time codes:

Figure 10: Error Performance For Several Techniques [OC06]

6.3 Bell Labs Space Time Architectures

The sections on the MIMO channel have demonstrated that MIMO can provide both a degree of freedom gain (increased capacity) and a diversity gain (better error performance). The Diagonal and Vertical Bell Labs Space Time Architectures (D-BLAST/V-BLAST) suggest general architectures to achieve the gains of MIMO. The general idea of the BLAST architectures

is to multiplex several streams of symbols (possibly demultiplexed from one original stream) onto the multiple antennas and then receive and decode the streams. Historically, G. Foschini suggested the D-BLAST architecture first, and V-BLAST was developed later as a simplification. However, it makes more sense to present V-BLAST first and then discuss how D-BLAST is logically an extension of V-BLAST.

6.3.1 V-BLAST

The general architecture of V-BLAST is described in the figure below [GFVW99].

Figure 11: VBLAST Architecture

The independent streams are multiplexed by the matrix $Q$ onto the transmit antennas. At the receiver the streams are decoded jointly or individually. There are two natural choices for $Q$ depending on whether there is CSIT or not. If there is CSIT, then the matrix $V$ from the SVD of $H$ can be used. The action of $Q$ is to rotate the input streams, so that the action of the channel can be expressed in a simple form. At the receiver the received vector $y$ is multiplied by the matrix $U^*$ from the SVD of $H$. These actions create an equivalent channel model:

$$\tilde{y} = \Sigma \tilde{x} + \tilde{v} \tag{110}$$

The complex MIMO channel is reduced to several parallel scalar channels, with each subchannel carrying one stream. Sometimes a system provides a codebook of $Q$ matrices that the transmitter can use. The feedback from the receiver is then just an index into the codebook that tells the transmitter which $Q$ to use. This form of feedback massively reduces the required bandwidth in the feedback channel. If there is no CSIT, then the situation is considerably more complicated and interesting. In this case the best choice for $Q$ is simply the identity matrix $I_{n_t}$. The choice of receiver is then an interesting problem, and there are many choices, all with different tradeoffs.

V-BLAST Receiver Structures

In V-BLAST there is a large degree of freedom in choosing the exact receiver structure. The choice of receiver structure affects error rates, capacity, and the complexity of decoding. The design of efficient V-BLAST receivers is an active area of research. There are two general steps in the V-BLAST receiver. The first is demodulation, in which the receiver estimates what symbol was sent and hence which bits were sent. The next step is decoding, in which any codes that were applied to individual streams

are decoded. Basically any convolutional or block code can be applied to an individual stream, so we are primarily interested in different architectures for demodulation. The optimal V-BLAST receiver is the ML receiver that jointly decodes the streams. The ML receiver estimates the transmitted streams by the rule [TV05]

$$\hat{s} = \arg\min_{s \in \mathcal{C}} \|y - Hs\|^2 \tag{111}$$

Practically, what this method does is pick the closest point to the received vector in the lattice of points formed by $Hs$, where $s$ is a point in the original constellation. This problem is known as the integer least squares problem. Although this method is optimal, it is computationally complex (NP-hard), as it must be performed over all possible transmit vectors. This computational complexity generally makes it infeasible to use an ML detector.

Sphere Decoding

Although the ML detector is computationally infeasible in many practical systems, there has been considerable interest in algorithms that are similar to ML detection in methodology and performance but with considerably less complexity. In addition, these ML-like algorithms can feed soft decisions to the decoders to improve their performance. Sphere decoding is one such algorithm [VB99]. The basic idea behind sphere decoding is to look only at points within a sphere of radius $d$ about the received vector and then choose the closest point inside the sphere [HV05].

Figure 12: Idea Behind Sphere Decoding [HV05]

This process reduces the search space and the necessary number of computations. In addition, since the transmitted vector is corrupted by AWGN, the actual transmitted lattice point is likely to be close to the received vector and in the sphere. If the sphere actually contains any points, then it must contain the closest point, which is what ML detection would pick as an estimate of the transmitted lattice point. In this case sphere decoding agrees with ML detection. Then there are two key problems that sphere decoding has to deal with [HV05].

1. How to find lattice points inside the sphere? The detector cannot compare the received vector to every point in the lattice to find the points inside the sphere, or it would be performing an exhaustive search offering no advantage over normal ML detection.

2. How to choose the sphere radius? If $d$ is too large, then the detector considers too many points. If $d$ is too small, then the sphere may contain no points. One way to choose the sphere radius is to compute the Babai estimate $\hat{s}_B$ for the transmitted symbol. This estimate starts from the least squares solution (not constrained to the lattice), which is then rounded to the nearest lattice point:

$$\hat{s}_B = \text{round}\left(\arg\min_{s} \|y - Hs\|^2\right) \tag{112}$$

Then choose $d = \|y - H\hat{s}_B\|$, which guarantees that the sphere contains at least one lattice point. There are other heuristic methods to choose $d$.

A solution to problem number one above is based on a simple observation: the problem is difficult in general but easy in one dimension. In one dimension the sphere is simply an interval, so the problem reduces to finding the lattice points inside this interval, which is the easy one-dimensional problem. Now the algorithm proceeds inductively by assuming that all k-dimensional points within the sphere of radius $d$ have been found. Then the set of admissible values for the next coordinate of a (k+1)-dimensional point that lies within radius $d$ is an interval. This process continues until the full dimension of the search space is reached. This process is usually visualized as a tree where the kth level of the tree corresponds to the points of dimension $k$ inside the sphere of radius $d$.

Figure 13: Tree for Sphere Decoding [HV05]

To see exactly how sphere decoding works, suppose the lattice we are working on is the integer lattice $\mathbb{Z}^m$ [HV05]. Suppose the channel matrix $H \in \mathbb{R}^{n \times m}$ and that $n \ge m$. Fix a sphere radius $d$. The goal is to find the points $s \in \mathbb{Z}^m$ such that

$$\|y - Hs\|^2 \le d^2 \tag{113}$$

where $y$ is the received vector. The algorithm proceeds first by calculating the QR factorization of the matrix $H$:

$$H = Q \begin{bmatrix} R \\ 0_{(n-m) \times m} \end{bmatrix} \tag{114}$$

where $Q$ is an $n \times n$ orthogonal matrix and $R$ is an $m \times m$ upper triangular matrix. This decomposition will make later calculations simpler. Expand the orthogonal matrix $Q$ as

$$Q = \begin{bmatrix} Q_1 & Q_2 \end{bmatrix} \tag{115}$$

with $Q_1 \in \mathbb{R}^{n \times m}$ and $Q_2 \in \mathbb{R}^{n \times (n-m)}$. Then the points inside the sphere satisfy:

$$d^2 \ge \left\| y - \begin{bmatrix} Q_1 & Q_2 \end{bmatrix} \begin{bmatrix} R \\ 0_{(n-m) \times m} \end{bmatrix} s \right\|^2 = \left\| \begin{bmatrix} Q_1^* \\ Q_2^* \end{bmatrix} y - \begin{bmatrix} R \\ 0 \end{bmatrix} s \right\|^2 = \|Q_1^* y - Rs\|^2 + \|Q_2^* y\|^2$$

This expression can be rearranged to the condition:

$$d^2 - \|Q_2^* y\|^2 \ge \|Q_1^* y - Rs\|^2 \tag{116}$$

Define $\tilde{d}^2 = d^2 - \|Q_2^* y\|^2$ and $\tilde{y} = Q_1^* y$. Then the condition to be in the sphere is given by

$$\tilde{d}^2 \ge \|\tilde{y} - Rs\|^2 \tag{117}$$

$$= \sum_{i=1}^{m} \Big( \tilde{y}_i - \sum_{j=i}^{m} R_{ij} s_j \Big)^2 \tag{118}$$

The sum can be written term by term as

$$\tilde{d}^2 \ge (\tilde{y}_m - R_{mm} s_m)^2 + (\tilde{y}_{m-1} - R_{m-1,m} s_m - R_{m-1,m-1} s_{m-1})^2 + \cdots \tag{119}$$

We observe that the first term depends on only $\{s_m\}$, the second term depends on only $\{s_{m-1}, s_m\}$, and so on. Then the following is a necessary condition for any point $s$ to be in the sphere:

$$\tilde{d}^2 \ge (\tilde{y}_m - R_{mm} s_m)^2 \tag{120}$$

Basically $R_{mm} s_m$ must be within $\tilde{d}$ of $\tilde{y}_m$. Finding the integers that satisfy this necessary condition is easy; assuming $R_{mm} > 0$, they are simply the integers $s_m$ with

$$\frac{-\tilde{d} + \tilde{y}_m}{R_{mm}} \le s_m \le \frac{\tilde{d} + \tilde{y}_m}{R_{mm}} \tag{121}$$

The key step in this process is how to proceed from finding the $s_m$ in the sphere to finding which $\{s_{m-1}, s_m\}$ are in the sphere. This is done by ensuring the sum of the first two terms in equation 119 is less than $\tilde{d}^2$:

$$\tilde{d}^2 \ge (\tilde{y}_m - R_{mm} s_m)^2 + (\tilde{y}_{m-1} - R_{m-1,m} s_m - R_{m-1,m-1} s_{m-1})^2 \tag{122}$$

To make use of this condition proceed as follows: For each $s_m$ define

$$\tilde{d}_{m-1}^2 = \tilde{d}^2 - (\tilde{y}_m - R_{mm} s_m)^2 \tag{123}$$

Then we can obtain a condition that $s_{m-1}$ must satisfy to be in the sphere:

$$\frac{-\tilde{d}_{m-1} + \tilde{y}_{m-1} - R_{m-1,m} s_m}{R_{m-1,m-1}} \le s_{m-1} \le \frac{\tilde{d}_{m-1} + \tilde{y}_{m-1} - R_{m-1,m} s_m}{R_{m-1,m-1}} \tag{124}$$
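The coordinate-by-coordinate search in equations 120 to 124 can be sketched in Python. This is a minimal illustration rather than the implementation of [HV05]: it assumes a real channel matrix with full column rank, uses NumPy's QR factorization, and the function names `babai_radius` and `sphere_decode` are hypothetical. The radius is taken from the rounded least squares point and inflated slightly so that this point is not lost to floating point rounding.

```python
import math
import numpy as np

def babai_radius(H, y):
    """Radius from the rounded least-squares point (assumed choice of d)."""
    s_ls = np.linalg.lstsq(H, y, rcond=None)[0]   # unconstrained LS solution
    s_b = np.rint(s_ls)                           # nearest lattice point
    # Inflate slightly so the rounded point is not excluded by rounding error.
    return np.linalg.norm(y - H @ s_b) * (1 + 1e-9)

def sphere_decode(H, y, d):
    """Closest integer point s with ||y - H s|| <= d, or None if none exists."""
    n, m = H.shape
    Q, R_full = np.linalg.qr(H, mode="complete")  # Q is n x n, R_full is n x m
    R = R_full[:m, :]                             # upper triangular m x m block
    y_t = Q[:, :m].T @ y                          # y~ = Q1^T y
    d2_t = d**2 - np.sum((Q[:, m:].T @ y) ** 2)   # d~^2 = d^2 - ||Q2^T y||^2
    best = {"s": None, "dist2": math.inf}
    s = np.zeros(m, dtype=int)

    def search(k, budget):
        # budget: squared radius left for coordinates k, k-1, ..., 0
        if budget < 0:
            return
        c = y_t[k] - R[k, k + 1:] @ s[k + 1:]     # remove already fixed coordinates
        r = math.sqrt(budget)
        a, b = (c - r) / R[k, k], (c + r) / R[k, k]
        lo, hi = math.ceil(min(a, b)), math.floor(max(a, b))
        for v in range(lo, hi + 1):               # integers in the interval
            s[k] = v
            rem = budget - (c - R[k, k] * v) ** 2
            if rem < -1e-12:
                continue
            if k == 0:                            # full-dimensional point found
                dist2 = d2_t - rem
                if dist2 < best["dist2"]:
                    best["s"], best["dist2"] = s.copy(), dist2
            else:
                search(k - 1, rem)                # descend one level in the tree

    search(m - 1, d2_t)
    return best["s"]

rng = np.random.default_rng(0)
H = rng.standard_normal((4, 3))
y = H @ np.array([1, -2, 3]) + 0.01 * rng.standard_normal(4)
print(sphere_decode(H, y, babai_radius(H, y)))
```

Each recursive call corresponds to one level of the tree: the loop over `v` enumerates the children of a node, and a node with an empty interval has no children.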

By applying this method to each $s_m$, the points $\{s_{m-1}, s_m\}$ inside the sphere of radius $d$ can be found. This process can be continued until the full $m$-dimensional problem has been solved. It is clear why a tree is an appropriate structure to represent the operation of sphere decoding: each node gives rise to some number of children (possibly zero) in the next iteration, all of which are inside the sphere, as one more dimension of the problem is solved. It is also clear that if the radius is chosen too small, one of the conditions like equation 124 may not be satisfied by any integer, and thus no points are in the sphere. If the sphere radius is too large, then too many points may satisfy equation 124, which makes the search for the closest point expensive.

Non-Joint Detection

Besides joint detection there is a wealth of detectors that detect individual streams from the received signal and do not attempt to decode all the streams simultaneously. Consider trying to decode one stream $x_k$. The system in this case can be modeled as

$$y[n] = h_k x_k[n] + \sum_{i \ne k} h_i x_i[n] + v[n] \tag{125}$$

where $h_i$ is the $i$th column of the channel matrix $H$. In this system there is a stream of interest plus several interfering streams, represented by the sum, plus a noise term. To successfully decode the stream of interest the receiver must deal with the interference term and the noise term.

Zero Forcing Nulling

At high SNR performance will be interference limited, not noise limited [GFVW99]. ZF-Nulling attempts to remove all the interfering terms in the sum to leave only the stream of interest. This can be done linearly with a single vector multiplication. The weighting vector $q_k$ to decode the $k$th stream satisfies $q_k^T h_j = \delta_{kj}$, where $\delta_{kj}$ is the Kronecker delta, which is 1 when $k = j$ and 0 otherwise. Then

$$q_k^T y[n] = q_k^T h_k x_k[n] + \sum_{i \ne k} q_k^T h_i x_i[n] + q_k^T v[n]$$

$$= \delta_{kk} x_k[n] + \sum_{i \ne k} \delta_{ki} x_i[n] + q_k^T v[n]$$

$$= x_k[n] + q_k^T v[n] \tag{126}$$

This weighting vector has an obvious geometric interpretation: the weighting vector projects the received vector $y$ onto a subspace orthogonal to $h_1, \ldots, h_{k-1}, h_{k+1}, \ldots, h_{n_t}$.


Figure 14: Zero Forcing Nulling MIMO

The weighting vectors are obtained from the pseudoinverse of $H$, given by $H^\dagger = (H^* H)^{-1} H^*$: since $H^\dagger H = I$, the $k$th row of $H^\dagger$ is $q_k^T$. So it is not too difficult to compute the appropriate weighting vectors given the channel matrix $H$. It is easy to calculate the output SNR for each stream using the weighting vectors as

$$\mathrm{SNR}_k = \frac{P}{\|q_k\|^2 N_o} \tag{127}$$
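A short numerical check of the nulling condition and of equation 127 is sketched below. It assumes a random complex channel with full column rank and equal power $P$ on every stream; the helper names `zf_weights` and `zf_snr` are hypothetical.

```python
import numpy as np

def zf_weights(H):
    """Pseudoinverse of H; row k is q_k^T and satisfies q_k^T h_j = delta_kj."""
    return np.linalg.pinv(H)

def zf_snr(H, P, No):
    """Per-stream output SNR of ZF nulling, SNR_k = P / (||q_k||^2 No)."""
    W = zf_weights(H)
    return P / (No * np.sum(np.abs(W) ** 2, axis=1))

rng = np.random.default_rng(0)
H = rng.standard_normal((4, 3)) + 1j * rng.standard_normal((4, 3))
W = zf_weights(H)
print(np.round(np.abs(W @ H), 6))   # close to the identity: interference is nulled
print(zf_snr(H, P=1.0, No=0.1))
```

By the Cauchy-Schwarz inequality $\|q_k\| \|h_k\| \ge |q_k^T h_k| = 1$, so the ZF output SNR never exceeds the interference-free value $P \|h_k\|^2 / N_o$; the gap is the price paid for nulling.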

ZF-Nulling with Successive Interference Cancellation

The SNR has an inverse relation to $\|q_k\|^2$, so if $\|q_k\|^2$ can be reduced the SNR will be increased. Results from linear algebra indicate that the higher the dimension of the space that $q_k$ must be orthogonal to, the larger $\|q_k\|^2$ is. So if $q_k$ must be orthogonal to fewer vectors, then $\|q_k\|^2$ will be reduced. Successive interference cancellation (SIC) can reduce the dimension and increase the SNR. The diagram below shows the operation of SIC.

Figure 15: Successive Interference Cancellation [TV05]

With this scheme, as each stream is decoded it is subtracted from the received vector. As a result the subtracted stream does not interfere with any subsequent streams. So then $q_k$ must be orthogonal only to $h_{k+1}, \ldots, h_{n_t}$. The reduced number of vectors means $\|q_k\|^2$ is reduced and $\mathrm{SNR}_k$ is increased.
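The decode-and-subtract loop can be sketched as follows. This is an illustrative sketch only: it detects the streams in their natural order 1, 2, ..., uses hard decisions onto an assumed constellation, and the function names are hypothetical.

```python
import numpy as np

def zf_sic_detect(H, y, constellation):
    """ZF nulling with successive cancellation, streams decoded in index order."""
    nt = H.shape[1]
    y = y.astype(complex)                    # working copy of the received vector
    x_hat = np.zeros(nt, dtype=complex)
    for k in range(nt):
        q = np.linalg.pinv(H[:, k:])[0]      # nulls only the streams not yet decoded
        z = q @ y                            # soft estimate of stream k
        x_hat[k] = constellation[np.argmin(np.abs(constellation - z))]
        y = y - H[:, k] * x_hat[k]           # subtract the decoded stream
    return x_hat

def zf_weight_norms(H):
    """Return (||q_k||^2 with SIC, ||q_k||^2 with plain ZF nulling)."""
    nt = H.shape[1]
    plain = np.sum(np.abs(np.linalg.pinv(H)) ** 2, axis=1)
    sic = np.array([np.sum(np.abs(np.linalg.pinv(H[:, k:])[0]) ** 2)
                    for k in range(nt)])
    return sic, plain

rng = np.random.default_rng(2)
H = rng.standard_normal((4, 3)) + 1j * rng.standard_normal((4, 3))
sic, plain = zf_weight_norms(H)
print(sic, plain)   # each SIC norm is no larger than the plain ZF norm
```

The comparison printed at the end illustrates the claim in the text: with fewer vectors to null, the squared norm of each weighting vector can only shrink, and the last stream is received with pure MRC.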


One practical issue when implementing SIC is the order of cancellation. The last decoded stream has the least interference and achieves the best performance. It has been demonstrated that a greedy choice of order is optimal relative to the maximin criterion [GFVW99]. This means that the $k$th stream to be decoded should be chosen from the remaining streams as the one that will achieve the highest SNR of the remaining streams if it is decoded now. The maximin criterion means that the smallest $\mathrm{SNR}_k$ is maximized by choosing the optimal order. The major drawback to SIC is error propagation. Mistakes at the beginning of the decoding chain can introduce mistakes later on. So if one stream is inaccurately decoded, then all subsequent streams will likely be decoded inaccurately.

Matched Filter

At very low SNR noise is the problem, so a matched filter can be used to deal with the noise. In the MIMO case the matched filter for each stream is simply maximal ratio combining (MRC) performed on the appropriate column of $H$.

MMSE Receiver

The matched filter performs well at low SNR and ZF-nulling performs well at high SNR. But at high SNR the matched filter has bad performance and at low SNR ZF-nulling has bad performance. So naturally one may wonder if there is a receiver that operates well at both low and high SNR. The MMSE receiver is such a receiver [TV05]. To understand how the MMSE receiver works consider the following SIMO system modeled as

$$y = hx + z \tag{128}$$

with $z$ colored noise having invertible correlation matrix $K_z$. The first operation is to whiten the noise by multiplying by $K_z^{-1/2}$. Then the system becomes

$$K_z^{-1/2} y = K_z^{-1/2} h x + K_z^{-1/2} z \tag{129}$$

Then apply a matched filter $(K_z^{-1/2} h)^*$ to yield the system

$$h^* K_z^{-1} y = (h^* K_z^{-1} h) x + h^* K_z^{-1} z \tag{130}$$

Thus the receiver simply multiplies the received signal by $h^* K_z^{-1}$ and performs normal demodulation. This is the MMSE receiver, which maximizes the SNR while minimizing the mean squared error between the estimate of $x$ and $x$ itself. For V-BLAST the corrupting non-white noise is the interference terms plus the additive noise. The covariance matrix for this noise is given by

$$K_{z_k} = N_o I_{n_r} + \sum_{i \ne k} P_i h_i h_i^* \tag{131}$$

A similar derivation shows that in this case the MMSE weighting vector is

$$q_k = \Big( N_o I_{n_r} + \sum_{i \ne k} P_i h_i h_i^* \Big)^{-1} h_k \tag{132}$$
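Equation 132 can be evaluated directly. The sketch below assumes the receiver forms the estimate as $q_k^* y$ and measures the output signal-to-interference-plus-noise ratio (SINR); the function names and the example powers are illustrative assumptions, not part of the original derivation.

```python
import numpy as np

def mmse_weight(H, k, P, No):
    """q_k = (No I + sum_{i != k} P_i h_i h_i^*)^{-1} h_k, as in equation 132."""
    nr, nt = H.shape
    K = No * np.eye(nr, dtype=complex)
    for i in range(nt):
        if i != k:
            K += P[i] * np.outer(H[:, i], H[:, i].conj())
    return np.linalg.solve(K, H[:, k])

def output_sinr(q, H, k, P, No):
    """SINR of stream k when the receiver computes q^* y."""
    nt = H.shape[1]
    signal = P[k] * abs(np.vdot(q, H[:, k])) ** 2
    interference = sum(P[i] * abs(np.vdot(q, H[:, i])) ** 2
                       for i in range(nt) if i != k)
    noise = No * np.linalg.norm(q) ** 2
    return signal / (interference + noise)

rng = np.random.default_rng(3)
H = rng.standard_normal((4, 3)) + 1j * rng.standard_normal((4, 3))
P, No, k = np.array([1.0, 2.0, 0.5]), 0.3, 1
q_mmse = mmse_weight(H, k, P, No)
q_mf = H[:, k]                               # matched filter
q_zf = np.linalg.pinv(H)[k].conj()           # ZF nulling in the q^* y convention
for name, q in [("MMSE", q_mmse), ("MF", q_mf), ("ZF", q_zf)]:
    print(name, output_sinr(q, H, k, P, No))
```

Among linear receivers the MMSE weighting vector attains the largest output SINR, which equals $P_k h_k^* K_{z_k}^{-1} h_k$, so the matched filter and ZF figures printed above are never larger than the MMSE figure.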

It is pretty easy to see that the MMSE receiver is a tradeoff between the matched filter and ZF-Nulling. At low SNR

$$K_{z_k} \approx N_o I_{n_r} \tag{133}$$

so the receiver is given by $h_k$, which is a matched filter. At high SNR

$$K_{z_k} \approx \sum_{i \ne k} P_i h_i h_i^* \tag{134}$$

and it can be seen that $q_k$ becomes, up to a scale factor, the ZF-nulling weighting vector for stream $k$ obtained from the pseudoinverse of $H$. Thus the MMSE receiver is like ZF-Nulling at high SNR. In addition, the MMSE receiver has good performance in the region between high and low SNR. If SIC is used in conjunction with MMSE, then MMSE-SIC can achieve the channel capacity.

6.3.2 D-BLAST

Consider the $k$th stream in V-BLAST. It is transmitted by one antenna and received by all $n_r$ receive antennas. Thus the maximum possible diversity gain for any individual stream is $n_r$ and there is a limit to how much MIMO diversity techniques can protect a stream [Fos96]. If a SIC structure is used with either the MMSE receiver or ZF-nulling, then if one stream is incorrectly decoded, subsequent streams will likely be incorrectly decoded. The main reason for this problem is that no coding is performed spatially across the multiple streams. Coding across streams could be used to ensure each stream is reliably decoded, but in V-BLAST each stream must already be decoded before the spatial code across the streams can be decoded. The solution to this problem is to alter the way the streams are transmitted.

Consider the case with two transmit antennas. Suppose that there are two separate streams each consisting of two blocks. Denote these by $a^{(i)}$ and $b^{(i)}$ for $i = 1, 2$. Then the D-BLAST codeword is

$$C = \begin{bmatrix} a^{(1)} & b^{(1)} & \\ & a^{(2)} & b^{(2)} \end{bmatrix} \tag{135}$$

From this codeword matrix it is obvious where D-BLAST gets its name from, since the layers are now diagonal. The receiver works as follows:

1. First receive $a^{(1)}$ with MRC.
2. Next receive $a^{(2)}$ with MMSE or ZF-nulling, while ignoring $b^{(1)}$.
3. Next decode the spatial code across the first layer $[a^{(1)}\ a^{(2)}]$.
4. Now $a^{(2)}$ has been reliably decoded, so it can be cancelled out and $b^{(1)}$ can be received.
5. Finally, $b^{(2)}$ can be received with MRC.
6. Then the second layer $[b^{(1)}\ b^{(2)}]$ can be decoded.

Now both streams have been decoded reliably. The key observation is that for a single layer, if one of the blocks for one stream is initially decoded incorrectly, there is still a chance to fix the error with the code applied across the layer. The major price to pay for using D-BLAST is the lost capacity during the startup process due to the blank spots in the codeword. For example, during the first block the second transmit antenna transmits nothing, and so some capacity is lost. Finally, there is also the cost in implementation complexity of applying coding and decoding across streams.

6.4 Space-Time Trellis Codes

A space-time trellis code (STTC) is an extension of normal convolutional codes to multiple antennas [TSC98]. The key idea behind a STTC is to make the output of the encoder a function of the input bits and the state of the encoder, which is in turn a function of the previous inputs. Trellis codes provide better error performance and coding gain compared to block codes, at the expense of implementation complexity.

6.4.1 Trellis Representation

Suppose $B$ bits are input into the encoder, which has $2^\nu$ states. A trellis diagram is a way of representing the action of a STTC [OC06]. The diagram below shows a trellis.

Figure 16: Trellis Coding [OC06]

The number of nodes is the number of states in the code. The left column represents the current state of the code and the right column represents the next state. There are $2^B$ arrows from each state on the left to states on the right, one for each possible combination of inputs. The transition arrows are driven by the input bits. The possible outputs are listed on the left hand side of the trellis. If the output is 02, for example, then the 0th symbol is sent on the first antenna and the 2nd symbol is sent on the second antenna.

The decoder's job is to estimate which sequence, that is, which path through the trellis, was sent. One way to do this is with a Maximum Likelihood Sequence Estimator (MLSE). There is a well known algorithm, the Viterbi algorithm, to efficiently estimate which sequence was sent.

Trellis Complexity

There is a fundamental lower bound to the complexity of a STTC. A STTC with $B$ input bits and minimum rank $r_{min}$ has at least $2^{B(r_{min}-1)}$ states. Obviously as the number of states increases the complexity of decoding increases, so this lower bound on states puts a lower bound on the possible complexity.

6.4.2 Delay-Diversity Scheme

This is one of the simplest trellis codes to achieve diversity [Wit93, SW94]. Consider a $2 \times 1$ MIMO system. The codeword for $T$ transmitted symbols is given by

$$C = \frac{1}{\sqrt{2}} \begin{bmatrix} c_1 & c_2 & \cdots & c_T & 0 \\ 0 & c_1 & \cdots & c_{T-1} & c_T \end{bmatrix} \tag{136}$$

The trellis diagram below represents this code.

The effect of this code is to convert spatial diversity to frequency diversity. When $c_1$ is transmitted during the first symbol period, it sees the channel $h_1$. When $c_1$ is transmitted during the second symbol period, it sees channel $h_2$. This is equivalent to passing $c_1$ through a frequency selective channel with two taps: $h_1$ and $h_2$. So spatial diversity becomes frequency diversity by applying this code.

7 Space-Time Coding for Frequency Selective Channels

There are two basic approaches to MIMO over frequency selective channels, as in normal SISO frequency selective channels:

1. Single carrier
2. Multicarrier - OFDM

Many modern wireless standards that use MIMO also use OFDM, so MIMO-OFDM is of particular interest.

7.1 Single Carrier

In this case the system can be modeled as

$$y_k = \sum_{l=0}^{L-1} H[l] c_{k-l} + v_k \tag{137}$$

This complicated system involving a summation can be expressed as a simple system of the form

$$y_k = \begin{bmatrix} H[0] & \cdots & H[L-1] \end{bmatrix} \begin{bmatrix} c_k^T & \cdots & c_{k-L+1}^T \end{bmatrix}^T + v_k \tag{138}$$

which is similar to the narrowband MIMO case.

7.2 MIMO-OFDM

MIMO-OFDM is an extension of normal OFDM to the MIMO case where there are multiple antennas.

7.2.1 OFDM

OFDM uses the FFT and IFFT to decompose the wideband frequency selective channel into several smaller narrowband frequency flat channels.

Figure 17: OFDM System Model [OC06]

The system model can be expressed in the DFT frequency domain as

$$Y[k] = H[k] X[k] + V[k] \tag{139}$$

with $V[k]$ the corrupting noise. The cyclic prefix is added to prevent ISI. The product $H[k]X[k]$ results in a circular convolution in the time domain, which can be expressed in matrix form as

$$\begin{bmatrix} y[0] \\ y[1] \\ \vdots \\ y[N-1] \end{bmatrix} = \begin{bmatrix} h[0] & h[N-1] & h[N-2] & \cdots & h[1] \\ h[1] & h[0] & h[N-1] & \cdots & h[2] \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ h[N-1] & h[N-2] & h[N-3] & \cdots & h[0] \end{bmatrix} \begin{bmatrix} x[0] \\ x[1] \\ \vdots \\ x[N-1] \end{bmatrix} + \begin{bmatrix} v[0] \\ v[1] \\ \vdots \\ v[N-1] \end{bmatrix}$$

The circulant channel matrix $H$ can be diagonalized as $H = D^* \Lambda D$, which plays the role of the singular value decomposition here, where $D$ is the unitary matrix that performs the DFT. The matrix $\Lambda$ is a diagonal matrix specific to each $H$

$$\Lambda = \begin{bmatrix} \lambda_1 & & & 0 \\ & \lambda_2 & & \\ & & \ddots & \\ 0 & & & \lambda_N \end{bmatrix} \tag{140}$$

but $D$ is not specific to each $H$.
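The diagonalization of the circulant channel matrix can be checked numerically. The sketch below fixes one convention as an assumption: $D$ is the unitary DFT matrix and the diagonal of $\Lambda$ is the DFT of the channel taps in natural order, which gives $H = D^* \Lambda D$; other orderings of the diagonal correspond to other conventions. The helper names are hypothetical.

```python
import numpy as np

def circulant(h):
    """Circulant matrix with entry (i, j) equal to h[(i - j) mod N]."""
    N = len(h)
    return np.array([[h[(i - j) % N] for j in range(N)] for i in range(N)])

def dft_matrix(N):
    """Unitary DFT matrix D, so that D @ x equals fft(x) / sqrt(N)."""
    return np.fft.fft(np.eye(N)) / np.sqrt(N)

N = 8
h = np.zeros(N, dtype=complex)
h[:3] = [1.0, 0.5j, -0.25]                # three channel taps, zero padded
C = circulant(h)
D = dft_matrix(N)
Lam = np.diag(np.fft.fft(h))              # eigenvalues: channel frequency response
print(np.allclose(C, D.conj().T @ Lam @ D))

x = np.arange(N) + 0j
# Multiplying by C is a circular convolution, i.e. a product in the DFT domain.
print(np.allclose(C @ x, np.fft.ifft(np.fft.fft(h) * np.fft.fft(x))))
```

Because $D$ does not depend on the channel, the transmitter can apply the IFFT and the receiver the FFT without any channel knowledge; only the per-tone gains on the diagonal of $\Lambda$ change with $h$.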

7.2.2 Extension to MIMO-OFDM

A MIMO-OFDM system can be modeled like a SISO OFDM system with the channel taps replaced by channel matrices [OC06]. First, start with the frequency selective MIMO channel

$$y_k = \sum_{l=0}^{L-1} H[l] x_{k-l} + v_k \tag{141}$$

Then append a cyclic prefix, $X_g$, to prevent ISI to produce the modified system

$$y = H_g \begin{bmatrix} X_g & X \end{bmatrix} + \tilde{v} \tag{142}$$

with the channel matrix given by

$$H_g = \begin{bmatrix} H[L-1] & \cdots & H[1] & H[0] & 0 & \cdots & 0 \\ 0 & H[L-1] & \cdots & H[1] & H[0] & \cdots & 0 \\ \vdots & & \ddots & & \ddots & \ddots & \vdots \\ 0 & \cdots & 0 & H[L-1] & \cdots & H[1] & H[0] \end{bmatrix} \tag{143}$$

As in SISO OFDM the cyclic prefix, which is necessary for practical implementation, can be removed from the analytical model. A large blockwise circulant matrix can represent the effective channel seen by the whole MIMO-OFDM codeword:

$$H_{cp} = \begin{bmatrix} H[0] & 0 & \cdots & 0 & H[L-1] & \cdots & H[1] \\ H[1] & H[0] & 0 & \cdots & 0 & \cdots & H[2] \\ \vdots & & \ddots & & & & \vdots \\ 0 & \cdots & 0 & H[L-1] & \cdots & H[1] & H[0] \end{bmatrix} \tag{144}$$

Since $H_{cp}$ is blockwise circulant, it can be decomposed as

$$H_{cp} = D^* \Lambda_{cp} D \tag{145}$$

where $D$ is the DFT matrix as before, applied blockwise, and $\Lambda_{cp}$ is block diagonal with the per-tone channel matrices $H_k$ on its diagonal. Thus the complicated MIMO-OFDM channel can be regarded as a block diagonal channel with the appropriate coordinate change given by the DFT. Given

$$H_k = \sum_{l=0}^{L-1} H[l] e^{-j 2\pi k l / T} \tag{146}$$

then ML detection is given by

$$\hat{X} = \arg\min_{C} \sum_{k=0}^{T-1} \|y_k - H_k c_k\|^2 \tag{147}$$
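Equations 146 and 147 can be sketched numerically. The code below is a minimal illustration that assumes uncoded symbols, so that the sum in equation 147 decouples and each tone can be searched separately by brute force; with coding across tones the search would have to run over whole codewords. The function names and array layout (taps stored as an array of shape $L \times n_r \times n_t$) are assumptions of this sketch.

```python
from itertools import product
import numpy as np

def tone_channels(H_taps, T):
    """Per-tone matrices H_k = sum_l H[l] exp(-j 2 pi k l / T), shape (T, nr, nt)."""
    L = H_taps.shape[0]
    k = np.arange(T)[:, None]
    l = np.arange(L)[None, :]
    E = np.exp(-2j * np.pi * k * l / T)            # (T, L) DFT kernel
    return np.tensordot(E, H_taps, axes=(1, 0))

def ml_detect_per_tone(Y, Hk, constellation):
    """Brute-force minimization of ||y_k - H_k c_k||^2 on each tone separately."""
    T, nr, nt = Hk.shape
    cands = np.array(list(product(constellation, repeat=nt)))   # all symbol vectors
    X_hat = np.empty((T, nt), dtype=complex)
    for k in range(T):
        err = Y[k][None, :] - cands @ Hk[k].T      # one row per candidate
        X_hat[k] = cands[np.argmin(np.sum(np.abs(err) ** 2, axis=1))]
    return X_hat

rng = np.random.default_rng(4)
L, nr, nt, T = 3, 2, 2, 8
H_taps = rng.standard_normal((L, nr, nt)) + 1j * rng.standard_normal((L, nr, nt))
Hk = tone_channels(H_taps, T)
bpsk = np.array([-1, 1], dtype=complex)
X = bpsk[rng.integers(0, 2, size=(T, nt))]
Y = np.einsum("kij,kj->ki", Hk, X) + 0.05 * rng.standard_normal((T, nr))
print(np.array_equal(ml_detect_per_tone(Y, Hk, bpsk), X))
```

Computing $H_k$ this way is the same as taking a length-$T$ FFT of the tap sequence along the delay axis, which is how it would normally be evaluated in practice.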

Like OFDM, MIMO-OFDM has issues with PAPR and frequency offset estimation. A block diagram for MIMO-OFDM follows below:

Figure 18: MIMO OFDM [OC06]

7.2.3 Space-Frequency Coded MIMO-OFDM

For normal OFDM the frequency domain channel coefficients $H[k]$ can be viewed as the channel coefficients in a narrowband fast fading time channel. The frequency index $k$ can be reinterpreted as a time domain index. Thus codes designed for fast fading time channels can be applied across the subcarriers. In MIMO-OFDM the same idea can be used to code across the subcarriers [TV05].

7.2.4 Space-Time Coded MIMO-OFDM

This is the simplest MIMO-OFDM system with no coding across the subcarriers. Instead the OFDM part of the system chops the frequency selective channel into frequency flat channels on which normal space-time coding techniques can be applied. For example, in the $2 \times n_r$ case the Alamouti code can be used on each subcarrier through the following process:

1. Transmit $[c_1\ c_2]^T$ on a given tone during the first OFDM symbol
2. Transmit $[-c_2^*\ c_1^*]^T$ on the same tone during the second OFDM symbol
3. Perform normal Alamouti decoding

This idea certainly works, but it limits the system, since the channel has to remain static for the duration of two OFDM symbols. Depending on system parameters this may not be a reasonable assumption. In general all space-time codes discussed before assume the channel is static over the duration of a codeword, so this is a general problem in Space-Time Coded MIMO-OFDM.

7.2.5 Space-Time Frequency Coded MIMO-OFDM

In a Space-Time Frequency Coded MIMO-OFDM system coding is performed over all three available dimensions: space, time, and frequency. Below are several examples of this idea.

Generalized Delay Diversity

This code [GSP02] has matrix form

$$C = \frac{1}{\sqrt{2}} \begin{bmatrix} c_1 & c_2 & \cdots & c_T & 0 & 0 \\ 0 & c_1 & c_2 & \cdots & c_T & 0 \\ 0 & c_1 & c_2 & \cdots & c_T & 0 \\ 0 & 0 & c_1 & c_2 & \cdots & c_T \end{bmatrix} \tag{148}$$

This code provides a diversity gain of 3.

Lindskog-Paulraj Scheme

This code [LP00] basically extends Alamouti in a natural way to MIMO-OFDM. The fundamental units of transmission are two blocks of length $T$: $c_1[k]$ and $c_2[k]$. The scheme is then

1. Send $[c_1[k]\ c_2[k]]^T$
2. Send $[-c_2^*[k]\ c_1^*[k]]^T$
3. Decode like Alamouti except use two independent MLSE estimators. This can be accomplished with two parallel copies of the Viterbi algorithm.

8 Multiuser MIMO

Historically MIMO was developed for use in point to point situations. However, MIMO can also be used as a multiple access technique to allow multiple users to seamlessly share the spatial channel. This type of MIMO is called Multiuser MIMO (MU-MIMO). The typical application of MU-MIMO is in a cellular system with multiple antennas at the base station and only one or two antennas at each mobile [GKHCS07]. The collection of antennas at all the mobile users in a cell is regarded as one big antenna array. One of the key advantages of having a distributed array comprised of all the mobiles is that the channel matrix rarely suffers from rank deficiencies, so spatial multiplexing is almost always possible. However, in order to actually get the benefits of MU-MIMO the base station needs CSIT or at least partial CSIT, which entails increased complexity.

For a MU-MIMO system having $N$ transmit antennas at the base station and $U$ users each with $M_k$ antennas, the downlink, or broadcast channel, for each user can be modeled as

$$y_k = h_k \sum_{l=1}^{N} x_l + v_k \tag{149}$$

The uplink, or MAC channel, can be modeled as

$$y = \sum_{i=1}^{U} h_i x_i + v \tag{150}$$

8.1 Precoding

Information theoretic results have shown that by using a type of coding called dirty paper coding (DPC) at the transmitter, $N$ users' streams can be multiplexed and transmitted [SB07, GC80]. Effectively what the coding at the transmitter does is pre-cancel interference at the receivers, like ZF-nulling does in BLAST.

8.1.1 Linear Precoding

The downlink channel can be written in a simple form making explicit how other users' streams produce interference:

$$y_k = H_k s_k + H_k \sum_{l=1,\, l \ne k}^{N} s_l + v_k \tag{151}$$

The simplest form of precoding is to multiply the transmit symbols by a matrix, $W_k$, that will cancel out the interferers [SSH04]:

$$y_k = H_k W_k s_k + H_k \sum_{l=1,\, l \ne k}^{N} W_l s_l + v_k \tag{152}$$

In the case when each user has one receive antenna this problem is identical to canceling interference in BLAST. So the proper choice for $W_k$ is the $k$th column of the pseudoinverse of the effective channel matrix $H = [h_1^T\ h_2^T\ \cdots\ h_N^T]^T$, whose $k$th row is the channel $h_k$ of user $k$.

8.1.2 Nonlinear Precoding

Nonlinear precoding is more like DPC than linear precoding and can produce better results at the cost of increased complexity. Well known nonlinear precoding methods include perturbation methods and Tomlinson-Harashima codes [PHS05, HPS05].
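The zero-forcing linear precoder of Section 8.1.1 can be sketched for single-antenna users. The sketch assumes the base station knows the stacked channel matrix (one row per user) and that there are at least as many transmit antennas as users; the function names and the total-power normalization are illustrative assumptions.

```python
import numpy as np

def zf_precoder(H):
    """Columns W_k of the pseudoinverse of H (users x antennas), so H @ W = I."""
    return np.linalg.pinv(H)

def normalize_power(W, P_total):
    """Scale the precoder so that its squared Frobenius norm equals P_total."""
    return W * np.sqrt(P_total / np.sum(np.abs(W) ** 2))

rng = np.random.default_rng(5)
U, N = 3, 4                                   # users, base station antennas
H = rng.standard_normal((U, N)) + 1j * rng.standard_normal((U, N))
W = zf_precoder(H)
s = np.array([1, -1, 1j])                     # one symbol per user
x = W @ s                                     # precoded transmit vector
print(np.round(H @ x, 6))                     # each user sees only its own symbol
```

After power normalization the effective channel becomes a scaled identity, so every user sees the same gain; how that gain compares with other precoders depends on how well conditioned $H$ is.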

8.2 Scheduling

If the number of users $U$ is greater than the number of transmit antennas $N$, then the base station can't transmit to all the users simultaneously. So at any given time the base station must choose some subset of the users to transmit to [GKHCS07]. The optimal scheduling algorithm is to simply perform an exhaustive search over all possible combinations of users. This is not computationally feasible though, so heuristic methods must be used to choose a subset of users. A simple choice is a greedy algorithm, which selects the $N$ users with the best channels.

8.3 Working with Partial CSIT

To achieve CSIT each user must feed back its channel estimate to the base station, which is tricky and reduces capacity [GA04]. To combat this problem some research has been performed into MU-MIMO systems with only partial CSIT, which entails less system complexity. Basic results have demonstrated that the gains of MU-MIMO can still be achieved with only partial CSIT.

9 MIMO in Wireless Standards

Many emerging wireless standards provide for MIMO to provide both diversity and multiplexing gain as needed. This section examines three prominent new wireless standards that employ MIMO.

9.1 3GPP LTE

The Third Generation Partnership Project Long-Term Evolution (3GPP LTE) is the emerging 4G standard that is currently being implemented and tested. The major features of LTE are outlined below [3GPP07, 3GPPRel8, 3GPPRel9]:

• High data rates - 100 Mbps in the downlink using 2 × 2 MIMO and 50 Mbps in the uplink using no MIMO
• Mobility - Best performance for 0-15 km/hr and good performance for 15-120 km/hr.
• Spectrum - No fixed spectrum size. Allowed sizes are 1.25, 1.6, 2.5, 5, 10, 15, and 20 MHz.
• OFDM - LTE uses OFDM with a variable number of subcarriers.
• IP Network - No circuit switched domain but all IP based network.

The downlink in LTE provides several options for using MIMO. The basic option for the downlink is two antennas at the base station and two at the mobile station. Extensions to LTE allow 4 × 2 and 4 × 4 MIMO. If the base station has CSIT, then there are two methods it can apply:

1. Pre-coding SDM - Since the base station knows the channel at the receiver, it can pre-code the transmitted symbols to prevent interference using the V matrix from the SVD of H.
2. Beamforming - Use some form of beamforming such as TMRC.

Without CSIT the base station can use Space-Frequency Block Coding by using the Alamouti code for each tone. In the uplink MU-MIMO can be used with the proper scheduling. The baseline case assumes 1 × 2 and the extension is 1 × 4.

9.2 WiMAX

WiMAX was originally developed to address the last mile connection to the internet. It has evolved to provide high data rate mobile data. The key features of WiMAX are outlined below [IS04, IS05]:

• High data rates - 75 Mbps in 802.16d and 30 Mbps in 802.16e.
• Mobility - Introduced in 802.16e. Range up to 30 miles in 802.16e.
• Spectrum - No fixed spectrum size. Allowed sizes are 1.25, 2.5, 5, 10, and 20 MHz.
• OFDM - WiMAX uses OFDM with a variable number of subcarriers.

The general structure of a WiMAX transmitter is demonstrated in the figure below:

Figure 19: WiMAX Transmitter [AGM05]

There are several different MIMO methods that can be employed in WiMAX. As we have seen before, the methods employed depend on whether the transmitter has channel state information or not.

Open Loop (No CSIT)

The 802.16 standard defines several options for space-time codes for 2-4 antennas. However, the two most common codes for space-time coding are:

$$\begin{bmatrix} S_1 & -S_2^* \\ S_2 & S_1^* \end{bmatrix} \qquad \begin{bmatrix} S_1 \\ S_2 \end{bmatrix} \tag{153}$$

where $S_1$ and $S_2$ are OFDM symbols. The OFDM symbols are uncoded in time and coded in the frequency domain. 802.16 also provides for space-frequency coding called the Frequency Hopping Diversity Code (FHDC) based on the Alamouti code. The figure below shows how FHDC works:

Figure 20: WiMAX Frequency Hopping Diversity [AGM05]

Closed Loop (CSIT)

With CSIT the transmitter can make better decisions. One of the common methods used in feedback is codebook based feedback. The codebook is basically a predetermined set of choices for the Q matrix in BLAST. The feedback is an index into the codebook that tells the transmitter which matrix to use. Another alternative is to use a feedback channel and have the receiver transmit a quantized version of the channel. The transmitter can then design the optimal precoding matrix.

9.3 802.11n

802.11n is the next generation 802.11 LAN that seeks to provide very high data rates. In 802.11 a/b/g high data rates were achieved by using high order modulation like 64-QAM. However, the cost of this approach is a loss in range because higher SNR is necessary to successfully demodulate 64-QAM. The way 802.11n seeks to overcome this problem and provide both high data rates and better range is through MIMO-OFDM. 802.11n transmits multiple data streams using multiple transmit and receive antennas. Thus 802.11n achieves higher data rates without using more bandwidth or larger constellations by increasing spectral efficiency. The key features of 802.11n are outlined below [IWG04, ?, ?]:

• High Data Rate - 130 Mbps typically
• Spectrum - 20 MHz (Optionally 40 MHz)
• OFDM - Uses OFDM

The basic case for 802.11n is 2 × 2 but the 802.11n standard provides for up to 4 × 4 MIMO. A block diagram demonstrating the operation of 802.11n follows:

Figure 21: 802.11n Transmitter [OC06]

The 802.11n transmitter sends every other group of bits to each OFDM branch. Each branch performs normal OFDM with spatial subcarrier mapping. Then each branch transmits on one antenna. The receiver architecture is symmetric and is manufacturer specific.

10 Conclusion

MIMO has become a popular technology for emerging wireless standards because it can provide better error performance in the form of diversity gain and better data rates in the form of multiplexing gain without using more bandwidth. In addition, MIMO works well with OFDM, which has become a ubiquitous feature of modern wireless standards. MIMO continues to be an active research area with multiuser-MIMO as a new area of great interest for future development. MIMO is an exciting field that looks to be a major part of research and standards in wireless communications for many years to come.

A Math Review

This section reviews a few common mathematical tools used in MIMO. In particular, this section covers some important linear algebra topics and Lagrange multipliers for optimization [HJ95].

A.1 Rank

Let $A \in \mathbb{C}^{m \times n}$. Then $A$ can be written in terms of column vectors as $A = [a_1, a_2, \ldots, a_n]$. The column space of $A$, denoted $\mathrm{col}(A)$, is given by

$$\mathrm{col}(A) = \mathrm{span}(a_1, a_2, \ldots, a_n) \tag{154}$$

Then the rank of $A$, denoted $\mathrm{rank}(A)$, is defined to be $\dim \mathrm{col}(A)$. So the rank of $A$ is the largest number of columns of $A$ that constitute a linearly independent set. $A$ can also be written in terms of rows as $[a_1^T, a_2^T, \ldots, a_m^T]^T$. Then the row space of $A$, denoted $\mathrm{row}(A)$, is given by

$$\mathrm{row}(A) = \mathrm{span}(a_1, a_2, \ldots, a_m) \tag{155}$$

With these definitions it can be demonstrated that $\mathrm{rank}(A) = \dim \mathrm{row}(A)$. Listed below are several useful facts about rank:

1. $\mathrm{rank}(A^T) = \mathrm{rank}(A^*) = \mathrm{rank}(A)$
2. $\mathrm{rank}(A) \le \min\{m, n\}$
3. $\mathrm{rank}(A^* A) = \mathrm{rank}(A)$
4. If $A$ is square ($m = n$), then $A$ is invertible if and only if $\mathrm{rank}(A) = n$.

A.2 Eigenvalues and Eigenvectors

Let $A$ be an $n \times n$ matrix over the complex numbers. A complex number $\lambda$ and a complex vector $x \ne 0$ are said to be an eigenvalue and its associated eigenvector if

$$Ax = \lambda x \tag{156}$$

By simple rearranging this expression can be written as

$$(A - \lambda I) x = 0 \tag{157}$$

This equation has non-trivial solutions ($x \ne 0$) only if $A - \lambda I$ is not invertible. This is true precisely when $\det(A - \lambda I) = 0$. When the determinant is expanded and evaluated it becomes an $n$th degree polynomial. By the Fundamental Theorem of Algebra this polynomial has $n$ complex roots counting multiplicity. Thus $A$ has $n$ eigenvalues including multiplicity.

A.2.1 Diagonalization

Sometimes $A$ can be related to a diagonal matrix $D$ by $A = S^{-1} D S$ where $S$ is an $n \times n$ invertible matrix. If this is possible, then $A$ is said to be diagonalizable. The matrix $S$ can be interpreted as a change of basis that allows the matrix $A$ to be described as a diagonal matrix. This representation is particularly nice in an $n \times n$ MIMO system, since the channel becomes $n$ independent parallel channels. This transformation makes capacity calculation much easier. So the main point of interest is determining when $A$ is diagonalizable. Each of the following two conditions is sufficient to guarantee that a square matrix is diagonalizable:

1. $A$ has $n$ distinct eigenvalues
2. $A$ has $n$ linearly independent eigenvectors

A.2.2 Connection To The Determinant and Trace

The following formulas are a useful connection between the eigenvalues of a matrix and its determinant and trace:

$$\det(A) = \lambda_1 \lambda_2 \cdots \lambda_n \tag{158}$$

$$\mathrm{tr}(A) = \sum_{i=1}^{n} a_{ii} = \lambda_1 + \lambda_2 + \cdots + \lambda_n \tag{159}$$

A.3 Inner Product Space

$\mathbb{C}^{n \times 1}$ is an inner product space with the inner product given by:

$$\langle x, y \rangle = y^* x \tag{160}$$

There are two important points of interest in regarding $\mathbb{C}^{n \times 1}$ as an inner product space. First, it is possible to perform orthogonal projections of a vector onto a space spanned by several other vectors. This can be used in V-BLAST for the ZF-Nulling receiver. The second point is the Cauchy-Schwarz inequality

$$|\langle x, y \rangle| \le \sqrt{\langle x, x \rangle} \sqrt{\langle y, y \rangle} \tag{161}$$

with equality for $y = Kx$ for any constant $K$. The Cauchy-Schwarz inequality can be used to derive the optimal receive combining vector for MRC.

A.4 Singular Value Decomposition

It is only possible to diagonalize a square matrix, but sometimes it is desirable to decompose a matrix with arbitrary dimensions into another matrix that is almost diagonal. The singular value decomposition achieves this and is defined for all matrices in $\mathbb{C}^{m \times n}$. Specifically the singular value decomposition of a matrix $A \in \mathbb{C}^{m \times n}$ is

$$A = U \Sigma V^* \tag{162}$$

where $U \in \mathbb{C}^{m \times m}$ and $V \in \mathbb{C}^{n \times n}$ are unitary matrices, which means

$$U U^* = U^* U = I_m \tag{163}$$

$$V V^* = V^* V = I_n \tag{164}$$

$\Sigma \in \mathbb{R}^{m \times n}$ has non-zero entries only on the diagonal, $\Sigma_{ii}$. The entries $\Sigma_{ii}$ are real and non-negative and are called the singular values of $A$. Then it is clear that there are $n_{min} = \min\{m, n\}$ singular values. Typically the matrix $\Sigma$ is constructed such that

$$\Sigma_{11} \ge \Sigma_{22} \ge \cdots \ge \Sigma_{n_{min} n_{min}} \tag{165}$$

Intuitively what the SVD does is use the matrix $V$ to rotate an input vector to a coordinate system in which the action of the matrix can be described by a simple matrix $\Sigma$. Then the output of this simple matrix is rotated back to the original coordinate system by $U$ to produce the output of $A$. One of the most important properties of the SVD is that the number of non-zero singular values is precisely the rank of the matrix $A$.

The concept of the SVD can be viewed as a generalization of eigenvalues. For a column $u \in \mathbb{C}^m$ of $U$, the corresponding column $v \in \mathbb{C}^n$ of $V$, and the corresponding singular value $\sigma \ge 0$,

$$Av = \sigma u \tag{166}$$

This equation is similar to the equation that defines an eigenvalue and eigenvector, which led to $u$ being called a left singular vector and $v$ a right singular vector.

A.4.1 Pseudoinverse

The inverse of a matrix is only defined for square matrices, but there is a way to define a special matrix, called the pseudoinverse, that is like an inverse but defined for arbitrary $m \times n$ matrices. Let $A \in \mathbb{C}^{m \times n}$ have an SVD $U \Sigma V^*$. Then the pseudoinverse is defined to be $A^\dagger = V \Sigma^\dagger U^*$ where $\Sigma^\dagger$ is the transpose of $\Sigma$ with the non-zero singular values inverted. There are four key properties of the pseudoinverse that define its behavior:

1. $A A^\dagger A = A$
2. $A^\dagger A A^\dagger = A^\dagger$
3. $(A A^\dagger)^* = A A^\dagger$
4. $(A^\dagger A)^* = A^\dagger A$

Note that $A A^\dagger$ is not in general the identity matrix, but the combination of three matrices produces the desired effect.
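The construction of the pseudoinverse from the SVD and its four defining properties can be verified numerically. The sketch below uses NumPy's `np.linalg.svd`; the function names and the relative tolerance used to decide which singular values count as non-zero are assumptions of this illustration.

```python
import numpy as np

def numerical_rank(A, tol=1e-10):
    """Number of singular values above tol times the largest one."""
    s = np.linalg.svd(A, compute_uv=False)
    return int(np.sum(s > tol * s[0])) if s.size and s[0] > 0 else 0

def pinv_from_svd(A, tol=1e-10):
    """A^+ = V Sigma^+ U^*, with the non-zero singular values inverted."""
    m, n = A.shape
    U, s, Vh = np.linalg.svd(A, full_matrices=True)
    r = numerical_rank(A, tol)
    S_pinv = np.zeros((n, m))                      # transpose shape of Sigma
    S_pinv[np.arange(r), np.arange(r)] = 1.0 / s[:r]
    return Vh.conj().T @ S_pinv @ U.conj().T

rng = np.random.default_rng(6)
A = rng.standard_normal((4, 2)) @ rng.standard_normal((2, 3))   # rank 2 by construction
Ap = pinv_from_svd(A)
print(numerical_rank(A))
print(np.allclose(A @ Ap @ A, A), np.allclose(Ap @ A @ Ap, Ap))
print(np.allclose((A @ Ap).conj().T, A @ Ap), np.allclose((Ap @ A).conj().T, Ap @ A))
```

For this rank-deficient example neither $A A^\dagger$ nor $A^\dagger A$ is the identity, yet all four properties hold, and the number of non-zero singular values matches the rank.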

A.4.2 Condition Number

Consider solving the linear system $Ax = b$. The condition number is a measure of how well this system behaves for small changes in $b$. Specifically the condition number measures how small changes in $b$ change $x$. So for the perturbed system $Ax = b + e$ the condition number is given by the largest possible ratio of the relative change in $x$ to the relative change in $b$:

$$\kappa(A) = \max_{e,\, b \ne 0} \frac{\|A^{-1} e\| / \|A^{-1} b\|}{\|e\| / \|b\|} \tag{167}$$

This quantity can be related to the singular values of $A$ by

$$\kappa(A) = \frac{\sigma_{max}}{\sigma_{min}} \tag{168}$$

A.5 Lagrange Multipliers

Consider the following optimization problem:

Maximize: $f(x_1, x_2, \ldots, x_n)$
Subject to: $g(x_1, x_2, \ldots, x_n) = C$

The optimal solution can be found by solving the following system of equations given by the gradient

$$\nabla f = \lambda \nabla g, \qquad g = C \tag{169}$$

These equations can be expressed in terms of partial derivatives as

$$\frac{\partial f}{\partial x_i} = \lambda \frac{\partial g}{\partial x_i}, \quad i = 1, 2, \ldots, n, \qquad g = C \tag{170}$$

Lagrange multipliers can be used to find the optimal power allocation to maximize capacity.

References

[3GPPRel8] 3GPP, “Technical Specification Group Radio Access Network Requirements for Further Advancements for E-UTRA (LTE-Advanced) (Release 8)”.

[3GPPRel9] 3GPP, “Overview of 3GPP Release 9 V.0.0.4 (2009-01)”.

[3GPP07] 3GPP, “Physical Channels and Modulation (Release 8)”, September 2007.

[AG05] J. Akhtar and D. Gesbert, “Spatial Multiplexing over correlated MIMO channels with a closed-form precoder”, IEEE Trans. Wireless Commun., 4(5):2400-2409, September 2005.

