Majorization and
Matrix-Monotone
Functions in Wireless
Communications
Eduard Jorswieck
Department of Electrical Engineering
Royal Institute of Technology
11400 Stockholm, Sweden
eduard.jorswieck@ee.kth.se
Holger Boche
Fraunhofer Institute for Telecommunications
Heinrich-Hertz-Institut
Einsteinufer 37
10587 Berlin, Germany
holger.boche@hhi.fhg.de
Boston – Delft
Foundations and Trends® in Communications and Information Theory
The preferred citation for this publication is E. Jorswieck and H. Boche, Majorization
and Matrix-Monotone Functions in Wireless Communications, Foundations and
Trends® in Communications and Information Theory, vol 3, no 6, pp 553–701, 2006
ISBN: 978-1-60198-040-3
© 2007 E. Jorswieck and H. Boche
All rights reserved. No part of this publication may be reproduced, stored in a retrieval
system, or transmitted in any form or by any means, mechanical, photocopying, recording
or otherwise, without prior written permission of the publishers.
Photocopying. In the USA: This journal is registered at the Copyright Clearance Cen-
ter, Inc., 222 Rosewood Drive, Danvers, MA 01923. Authorization to photocopy items for
internal or personal use, or the internal or personal use of specific clients, is granted by
now Publishers Inc for users registered with the Copyright Clearance Center (CCC). The
‘services’ for users can be found on the internet at: www.copyright.com
For those organizations that have been granted a photocopy license, a separate system
of payment has been arranged. Authorization does not extend to other kinds of copy-
ing, such as that for general distribution, for advertising or promotional purposes, for
creating new collective works, or for resale. In the rest of the world: Permission to pho-
tocopy must be obtained from the copyright owner. Please apply to now Publishers Inc.,
PO Box 1024, Hanover, MA 02339, USA; Tel. +1-781-871-0245; www.nowpublishers.com;
sales@nowpublishers.com
now Publishers Inc. has an exclusive license to publish this material worldwide. Permission
to use this content must be obtained from the copyright license holder. Please apply to now
Publishers, PO Box 179, 2600 AD Delft, The Netherlands, www.nowpublishers.com; e-mail:
sales@nowpublishers.com
Foundations and Trends® in Communications and Information Theory
Volume 3 Issue 6, 2006
Editorial Board
Editor-in-Chief:
Sergio Verdú
Department of Electrical Engineering
Princeton University
Princeton, New Jersey 08544
USA
verdu@princeton.edu
Editors
1 Royal Institute of Technology, Department of Electrical Engineering,
Signal Processing, 11400 Stockholm, Sweden, eduard.jorswieck@ee.kth.se
2 Technical University of Berlin, Department of Electrical Engineering,
Heinrich-Hertz Chair for Mobile Communications, HFT-6 Einsteinufer 25,
10587 Berlin, Germany
3 Fraunhofer German-Sino Lab for Mobile Communications MCI,
Einsteinufer 37, 10587 Berlin, Germany
4 Fraunhofer Institute for Telecommunications, Heinrich-Hertz-Institut,
Einsteinufer 37, 10587 Berlin, Germany, holger.boche@hhi.fhg.de
Abstract
This short tutorial presents two mathematical techniques, namely
Majorization Theory and Matrix-Monotone Functions, reviews their
basic definitions and describes their concepts clearly with many illus-
trative examples. In addition to this tutorial, new results are presented
with respect to Schur-convex functions and regarding the properties of
matrix-monotone functions.
The techniques are applied to solve communication and informa-
tion theoretic problems in wireless communications. The impact of
spatial correlation in multiple antenna systems is characterized for
many important performance measures, e.g., average mutual information,
outage probability, error performance, minimum $E_b/N_0$ and wideband
slope, zero-outage capacity, and capacity region. The impact of
user distribution in cellular systems is characterized for different scenarios
including perfectly informed transmitters and receivers, regarding,
e.g., the average sum rate, the outage sum rate, and the maximum throughput.
Finally, a unified framework for the performance analysis of multiple
antenna systems is developed based on matrix-monotone functions. The
optimization of transmit strategies for multiple antennas is carried out
by optimization of matrix-monotone functions. The results within this
framework resemble and complement the various results on optimal
transmit strategies in single-user and multiple-user multiple-antenna
systems.
Contents
1 Introduction
1.1 Majorization Theory
1.2 Matrix-Monotone Functions
1.3 Classification and Organization
1.4 Notation
2 Majorization Theory
2.1 Definition and Examples
2.2 Basic Results
2.3 Majorization and Optimization
3 Matrix-Monotone Functions
5 Application of Matrix-Monotone Functions in Wireless Communications
5.1 Generalized Multiple Antenna Performance Measures
5.2 Optimization of Matrix-Monotone Functions
6 Appendix
6.1 Linear Algebra
6.2 Convex Optimization
7 Acknowledgments
References
1 Introduction
antenna systems, e.g., sum rate of the multiple access channel (MAC)
with K users and channels α1, . . . , αK as a function of the power
allocation p1, . . . , pK with inverse noise power ρ
$$C(p) = \log\left(1 + \rho \sum_{k=1}^{K} p_k \alpha_k\right).$$
Assume that the sum power is constrained to K, i.e., $\sum_{k=1}^{K} p_k = K$.
Order the components α1 ≥ α2 ≥ · · · ≥ αK ≥ 0 and p1 ≥ p2 ≥ · · · ≥
pK ≥ 0. The function C turns out to be Schur-convex with respect to
p, i.e., monotonically increasing with respect to the majorization order:
if p ≻ q then C(p) ≥ C(q). Therefore, the maximum value is attained
for a power allocation vector with only one non-zero element, i.e.,
C([K, 0, . . . , 0]) ≥ C(p) ≥ C(1).
This monotonic behavior is illustrated for K = 2 with power
allocation p = [2 − p, p] in Figure 1.1. This result implies that TDMA is
optimal, because the complete transmit power is optimally allocated to
one user [80].
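The ordering of the sum rate under majorization can be checked numerically. The following is a minimal sketch, assuming the K = 2 setting of Figure 1.1 (α1 = 2, α2 = 1) and ρ = 1; the helper names `majorizes` and `sum_rate` are illustrative and do not come from the text.

```python
import math

def majorizes(p, q, tol=1e-12):
    """Return True if p majorizes q: equal sums and dominating partial sums."""
    ps, qs = sorted(p, reverse=True), sorted(q, reverse=True)
    if abs(sum(ps) - sum(qs)) > tol:
        return False
    run_p = run_q = 0.0
    for a, b in zip(ps, qs):
        run_p += a
        run_q += b
        if run_p < run_q - tol:
            return False
    return True

def sum_rate(p, alpha, rho=1.0):
    """C(p) = log(1 + rho * sum_k p_k alpha_k), both vectors sorted decreasingly."""
    ps = sorted(p, reverse=True)
    al = sorted(alpha, reverse=True)
    return math.log(1.0 + rho * sum(x * a for x, a in zip(ps, al)))

alpha = [2.0, 1.0]                      # channels as in Figure 1.1
allocations = [[2.0, 0.0], [1.5, 0.5], [1.0, 1.0]]
rates = [sum_rate(p, alpha) for p in allocations]
for p, r in zip(allocations, rates):
    print(p, round(r, 4))
```

Each allocation in the list majorizes the next one, and the computed rates decrease accordingly, with the single-user allocation [2, 0] giving the largest value.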
Fig. 1.1 Sum rate of MAC with channels α1 = 2, α2 = 1 as a function of the power allocation
p = [2 − p, p].
1.2 Matrix-Monotone Functions 3
Most of the basic definitions and basic properties can be found in the
text books [8, 48, 50, 51, 92]. Majorization theory is a valuable tool and
it is successfully applied in many research areas, e.g., in optimization
[39, 168], signal processing and mobile communications [59, 105], and
quantum information theory [101].
X ≥ Y ⇒ f (X) ≥ f (Y )
1.3.2 Organization
The first two chapters present the definitions, properties, and many
examples to explain the foundations and concepts of the two techniques.
The three main topics discussed are
The main goal of these two chapters is to make the reader familiar
with the basic concepts and to enable him or her to apply these
techniques to problems in his or her respective research area. The various
examples illustrate the theoretical concepts and reconnect to practi-
cal problem statements. In “Majorization Theory,” we present novel
results with respect to Schur-convexity and Schur-concavity for the
most general classes of functions and constraints. Later in “Application
of Majorization in Wireless Communications,” these functions obtain
their operational meaning in the context of communication theory.
In “Matrix-Monotone Functions,” we present novel results in terms
of bounds for matrix-monotone functions, optimization of matrix-
monotone functions, and discuss the connection to matrix norms as
well as to connections and means.
In “Application of Majorization in Wireless Communications” and
“Application of Matrix-Monotone Functions in Wireless Communica-
tions,” we apply the learned techniques to concrete problem statements
from wireless communications. The four main application areas are
The main goal of these two chapters is to show under which condi-
tions and assumptions both techniques can be used. Furthermore, it is
shown how to interpret the results carefully to gain engineering insights
into the design of wireless communication systems. In “Application of
Majorization in Wireless Communications,” a measure for spatial corre-
lation in multiple antenna communications is developed. This measure
is exploited for various performance measures and in many scenarios
to analyze the impact of spatial correlation. A measure for the user
distribution in cellular systems is developed and the sum performance
of up- and downlink communication as a function of the user distri-
bution is analyzed. In “Application of Matrix-Monotone Functions in
Wireless Communications,” we develop a generalized performance mea-
sure which unifies mutual information and MMSE criteria. Finally, the
1.4 Notation
Vectors are denoted by boldface small letters a, b, and matrices by
boldface capital letters A, B. $A^T$, $A^H$, and $A^{-1}$ are the transpose, the
conjugate transpose, and the inverse matrix operation, respectively.
The identity matrix is I, and 1 is the vector with all ones. ◦ is the
Schur-product and ⊗ is the Kronecker product. diag(X) is a vector
with the diagonal entries of X. Diag(x) is a diagonal matrix
with the entries of the vector x on its diagonal. Diag(A, B) is a
block-diagonal matrix with matrices A and B on the diagonal. $A^{1/2}$ is the
square root matrix of A and $[A]_{j,k}$ denotes the entry in the jth row
and the kth column of A.
The set of real numbers is denoted by R and the set of complex
numbers by C. The set of positive integers is $N_+$. Denote the set of all
n × n positive semi-definite matrices by $H_n$. The multivariate complex
Gaussian distribution with mean m and covariance matrix Q is denoted
by CN(m, Q). The expectation is denoted by E. The partial order for
vectors x ≻ y says x majorizes y, and x ≺ y means x is
majorized by y. For matrices the order A ≥ B means that A − B is
positive semi-definite. The strict versions of these orders for vectors and
matrices are denoted by ≻, ≺, >, and <. $[a]^+$ denotes the maximum
of a and 0.
2 Majorization Theory
that are monotone with respect to this order are called “Schur-convex”
(monotonic increasing) or “Schur-concave” (monotonic decreasing)
functions. Standard results as well as novel results regarding Schur-
convex functions are reviewed, presented, and discussed. In order to
keep the representation simple and increase readability, many exam-
ples illustrate the definitions and results.
The next theorem connects the partial order majorization and doubly
stochastic matrices.
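Example 2.4. Consider the example from the beginning, i.e.,
$x = [0.8, 0.2]^T \succ [0.6, 0.4]^T = y$. The corresponding doubly stochastic
matrix is given by
$$\begin{pmatrix} 0.6 \\ 0.4 \end{pmatrix} = \begin{pmatrix} \frac{2}{3} & \frac{1}{3} \\ \frac{1}{3} & \frac{2}{3} \end{pmatrix} \begin{pmatrix} 0.8 \\ 0.2 \end{pmatrix}.$$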
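The numbers of Example 2.4 can be verified directly. The sketch below checks that the 2 × 2 matrix with entries 2/3 and 1/3 is doubly stochastic and that it maps [0.8, 0.2] to [0.6, 0.4]; the function names are illustrative only.

```python
def is_doubly_stochastic(M, tol=1e-12):
    """Non-negative entries, every row sum and every column sum equal to one."""
    n = len(M)
    nonneg = all(e >= -tol for row in M for e in row)
    rows = all(abs(sum(row) - 1.0) <= tol for row in M)
    cols = all(abs(sum(M[i][j] for i in range(n)) - 1.0) <= tol for j in range(n))
    return nonneg and rows and cols

def matvec(M, v):
    """Matrix-vector product for lists of lists."""
    return [sum(m * vj for m, vj in zip(row, v)) for row in M]

P = [[2 / 3, 1 / 3],
     [1 / 3, 2 / 3]]
x = [0.8, 0.2]          # the majorizing vector
y = matvec(P, x)        # the majorized vector obtained by averaging
print(is_doubly_stochastic(P), [round(v, 6) for v in y])
```

Multiplying by a doubly stochastic matrix averages the components, which is why the result is less spread out than the input while the sum is preserved.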
Remark 2.2. Note that by theorems of von Neumann the existence
of such a doubly stochastic matrix is assured [92, Thm. 2.C.1]. The
necessary and sufficient conditions for weak (sub and super) majorization
are
$$y \succ_w x \ \text{ if and only if } \ x = P_1 y$$
$$y \succ^w x \ \text{ if and only if } \ x = P_2 y$$
$$x \succ_{\log} y \ \text{ implies } \ x \succ_w y.$$
x ≻ y on A ⇒ Φ(x) ≥ Φ(y).
Example 2.5. Suppose that $x, y \in R^n_+$ are positive real vectors and
the function Φ is defined as the sum of the squared components of
the vectors, i.e., $\Phi_2(x) = \sum_{k=1}^{n} |x_k|^2$. Then, it is easy to show that the
1 A function is called symmetric if the argument vector can be arbitrarily permuted without
changing the value of the function.
In fact, the results derived in the next section for vectors of size
n could be carefully applied to these normalized functions as well.
This can be observed by replacing the sum by an integral in most
equations.
Remark 2.6. Proposition 2.7 can also be stated for a concave function
g, in which case it yields Schur-concavity.
Remark 2.7. Note that the reverse statement is not true. Consider the
following Schur-convex function which is not convex [166, Ex. II.3.15].
Let $\phi : [0, 1]^2 \to R$ be the function
$$\phi(x_1, x_2) = \log\left(\frac{1}{x_1} - 1\right) + \log\left(\frac{1}{x_2} - 1\right).$$
Checking Schur's condition it can be observed that φ is Schur-convex on
$\{x \in [0, 1]^2 : x_1 + x_2 \le 1\}$. However, the function log(1/t − 1) is convex
on (0, 1/2] but not on [1/2, 1).
$$\lambda(H) \succ [H_{ii}]_{i=1}^{n}.$$
Proof. Use Proposition 2.7 with g(x) = log x. g(x) = log(x) is a concave
function and $\lambda \succ [H_{11}, \ldots, H_{nn}]$ by Proposition 2.9.
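A small numerical check of the majorization of the diagonal by the eigenvalues, for a real symmetric 2 × 2 matrix where the eigenvalues are available in closed form. The determinant bound tested at the end (Hadamard's inequality) is assumed to be the statement this proof targets, since the statement itself is not part of this excerpt; the matrix entries are arbitrary illustrative values.

```python
import math

def eig_sym_2x2(a, b, d):
    """Eigenvalues (descending) of the real symmetric matrix [[a, b], [b, d]]."""
    mean = 0.5 * (a + d)
    radius = math.hypot(0.5 * (a - d), b)
    return [mean + radius, mean - radius]

a, b, d = 2.0, 1.0, 1.0                 # illustrative positive definite matrix
lam = eig_sym_2x2(a, b, d)
diag = sorted([a, d], reverse=True)

# Majorization for n = 2: largest eigenvalue dominates, sums (traces) agree.
eigs_majorize_diag = (lam[0] >= diag[0] - 1e-12
                      and abs(sum(lam) - sum(diag)) < 1e-12)

# Sum of logs is Schur-concave, so det(H) <= product of the diagonal entries.
det_h = lam[0] * lam[1]
hadamard_holds = det_h <= a * d + 1e-12
print(lam, eigs_majorize_diag, hadamard_holds)
```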
$$p(w) = \frac{w^{k-1} \exp(-w)}{\Gamma(k)}.$$
Furthermore, the vector µ will have non-negative entries that are
ordered in non-increasing order µ1 ≥ µ2 ≥ · · · ≥ µn ≥ 0. Denote the
cumulative distribution function (cdf) of the maximum as P(x), i.e.,
$$P(x) = \prod_{k=1}^{n} \frac{\gamma\left(n, \frac{x}{\mu_k}\right)}{\Gamma(n)} = \prod_{k=1}^{n} \left(1 - \exp\left(-\frac{x}{\mu_k}\right) \sum_{l=0}^{n-1} \frac{(x/\mu_k)^l}{l!}\right)$$
with the incomplete Gamma function $\gamma(n, x) = \int_0^x t^{n-1} \exp(-t)\,dt$.
be written as
$$\int_0^\infty f(x) P'(x)\,dx = \Big[f(x)\big(P(x) + c_1\big)\Big]_0^\infty - \int_0^\infty f'(x)\big(P(x) + c_2\big)\,dx.$$
The constants $c_1$ and $c_2$ arise because $\frac{d(P(x) + c)}{dx} = P'(x)$. With
assumptions (1) and (2) the first term on the RHS exists. The second term
exists by the third assumption. Note that $c_1$ and $c_2$ are independent of
µ. Define
$$f(\mu) = -\int_0^\infty f'(x) \left(\prod_{k=1}^{n} \frac{\gamma\left(n, \frac{x}{\mu_k}\right)}{\Gamma(n)} + c_2\right) dx.$$
The next example chooses $f(x) = \frac{1}{1+x}$. Again, let $c_1 = c_2 = -1$ to get
$\lim_{x \to 0} \frac{1}{1+x}(P(x) - 1) = -1$ and $\lim_{x \to \infty} \frac{1}{1+x} \exp(-x) = 0$.
The proof can be found in [68]. Interestingly, this result implies that
the optimal p∗ µ. Note that this result cannot be generalized to an
arbitrary concave function f .
The proof can be found in [73]. The special case for random Gaussian
variables was solved in [133]. It relies heavily on the properties of the
distribution of the sum of weighted random variables, especially on the
unimodality of the corresponding pdf. With respect to majorization
there is no clear behavior in the interval f (1) < x < f (2). However, the
minimum of the probability can be characterized in a closed form [73,
Thm. 3].
the vector µ will have non-negative entries that are ordered in non-
increasing order µ1 ≥ µ2 ≥ · · · ≥ µn ≥ 0.
Note that Γ(0) = 0. The first derivative of the function M(t) with
respect to t is smaller than or equal to zero for all s ≥ 0 and all $W_1$,
$W_2$. Therefore, the integral in (2.17) is smaller than or equal to zero,
because the outer integral is over a pdf, which is positive for all s by
definition and has only positive steps. With $T = [R(s) + t\Delta]^{-1}$ it holds
$$\frac{\partial M(t)}{\partial t} = -\mathrm{tr}(\Delta T T \Delta T) - \mathrm{tr}(\Delta T \Delta T T) = -2\,\mathrm{tr}\big(T \underbrace{\Delta T \Delta}_{Q} T\big). \qquad (2.19)$$
Note that the matrix T is positive definite. Finally, the matrix Q can
be written as
$$Q = W_1 T W_1 - W_2 T W_1 - W_1 T W_2 + W_2 T W_2 = \underbrace{\left[W_1 T^{1/2} - W_2 T^{1/2}\right]}_{C} \left[T^{1/2} W_1 - T^{1/2} W_2\right] = C C^H \ge 0. \qquad (2.20)$$
Inequality (2.20) shows that the matrix Q is positive semidefinite and
therefore the first derivative of M(t) with respect to t in (2.19) is
smaller than or equal to zero. Therefore, the function Φ(t) in (2.17) is
smaller than or equal to zero and this verifies Schur's condition and
completes the proof.
Example 2.9. Let $f(x) = \sum_{k=1}^{n} \sqrt{x_k}$ and X = 1. From Proposition
Next, the question arises what happens if the equality constraint in the
optimization problems (2.21) and (2.22) is relaxed, i.e., if the constraint
set is, e.g., |x| ≤ X. The corresponding programming problem can be
stated as
$$\max_{0 \le \xi \le X} \ \max_{x \ge 0,\, |x| = \xi} f(x). \qquad (2.23)$$
For the optimum of (2.23), it follows that $x^* = \frac{\xi^*}{n} \mathbf{1}$ if f is Schur-concave,
where $\cdot^*$ denotes the optimum.
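The two-stage structure of (2.23) can be illustrated with a Schur-concave example. The sketch below assumes $f(x) = \sum_k \sqrt{x_k}$, n = 4 and X = 1 (illustrative choices): the inner problem is solved by the equal vector, which a random search over the simplex cannot beat, and the outer problem is a scalar search over ξ.

```python
import math
import random

def f(x):
    """Schur-concave test function: sum of square roots."""
    return sum(math.sqrt(v) for v in x)

def random_point_on_simplex(n, total, rng):
    """Random non-negative vector whose components sum to `total`."""
    weights = [rng.expovariate(1.0) for _ in range(n)]
    s = sum(weights)
    return [total * w / s for w in weights]

n, X = 4, 1.0
rng = random.Random(0)

# Inner problem for xi = X: the equal allocation should be optimal.
x_equal = [X / n] * n
best_random = max(f(random_point_on_simplex(n, X, rng)) for _ in range(2000))

# Outer problem: scan xi on a grid, using the equal vector for each xi.
grid = [X * i / 100 for i in range(101)]
xi_star = max(grid, key=lambda xi: f([xi / n] * n))
print(f(x_equal), best_random, xi_star)
```

Because this particular f is also increasing in every component, the outer search ends at ξ = X; for a function that is not monotone, the outer scalar search is what selects the sum.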
Example 2.11. $f(x) = \sum_{k=1}^{n} \sqrt{x_k}$ is Schur-concave and monotonic
Example 2.12. Consider $f(x) = |x|(2 - |x|) \sum_{k=1}^{n} \sqrt{x_k}$. This func-
3 Matrix-Monotone Functions
That means, the function φ affects only the eigenvalues of the matrix
A and keeps the eigenvectors unaltered.
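The statement that φ acts only on the eigenvalues can be made concrete for a real symmetric 2 × 2 matrix, where the eigendecomposition is available in closed form. This is a minimal sketch in plain Python; `eigh_2x2` and `matrix_function` are illustrative helper names, and the test matrix is an arbitrary positive definite example.

```python
import math

def eigh_2x2(a, b, d):
    """Eigen-decomposition of [[a, b], [b, d]].

    Returns (l1, l2, c, s) with l1 >= l2 and orthogonal eigenvector
    matrix U = [[c, -s], [s, c]] (columns are the eigenvectors).
    """
    mean = 0.5 * (a + d)
    radius = math.hypot(0.5 * (a - d), b)
    l1, l2 = mean + radius, mean - radius
    if b == 0.0:
        # Already diagonal: eigenvectors are the unit vectors.
        return (l1, l2, 1.0, 0.0) if a >= d else (l1, l2, 0.0, 1.0)
    norm = math.hypot(b, l1 - a)
    return l1, l2, b / norm, (l1 - a) / norm

def matrix_function(phi, A):
    """phi(A) = U diag(phi(l1), phi(l2)) U^T: eigenvectors stay, eigenvalues change."""
    (a, b), (_, d) = A
    l1, l2, c, s = eigh_2x2(a, b, d)
    f1, f2 = phi(l1), phi(l2)
    off = c * s * (f1 - f2)
    return [[c * c * f1 + s * s * f2, off],
            [off, s * s * f1 + c * c * f2]]

A = [[2.0, 1.0], [1.0, 1.0]]
root = matrix_function(math.sqrt, A)
square = matrix_function(lambda t: t * t, A)
print(root, square)
```

Applying the square root and then the square returns A, and squaring through the eigenvalues reproduces the ordinary matrix product A·A.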
for A, B ∈ A.
(1) φ is matrix-convex on A.
(2) For all fixed A and B in A, the function g(α) = φ(αA +
ᾱB) is convex in α ∈ [0, 1] in the sense that ηg(α) + η̄g(β) −
g(ηα + η̄β) is positive semidefinite for all α, β, η ∈ [0, 1].
(3) For all fixed A, B ∈ A, $\frac{d^2 g(\alpha)}{d\alpha^2}$ is positive semidefinite for 0 <
α < 1.
$$\frac{d^2 \log(I + \alpha X)}{d\alpha^2} = -X [I + \alpha X]^{-2} X \le 0.$$
As a result, for g(α) = log(I + αX + ᾱY) it holds
$$\frac{d^2 g(\alpha)}{d\alpha^2} \le 0$$
and matrix-concavity follows by Proposition 3.3. Matrix-monotonicity
follows from Theorem 3.2.
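Matrix-monotonicity is a stronger property than scalar monotonicity, which a 2 × 2 example makes visible. The sketch below does not use the logarithm; it swaps in φ(t) = t/(1+t), which is matrix-monotone and needs only a 2 × 2 inverse, and contrasts it with φ(t) = t², which is increasing on the positive reals but not matrix-monotone. The matrices A and B are a standard illustrative pair, not taken from the text.

```python
def matmul(A, B):
    """Product of two 2x2 matrices."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def sub(A, B):
    return [[A[i][j] - B[i][j] for j in range(2)] for i in range(2)]

def inv2(A):
    """Inverse of a 2x2 matrix via the adjugate."""
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    return [[A[1][1] / det, -A[0][1] / det],
            [-A[1][0] / det, A[0][0] / det]]

def is_psd_2x2(M, tol=1e-12):
    """A symmetric 2x2 matrix is PSD iff its diagonal and determinant are >= 0."""
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    return M[0][0] >= -tol and M[1][1] >= -tol and det >= -tol

def phi(M):
    """phi(t) = t / (1 + t) applied to a matrix: I - (I + M)^{-1}."""
    I = [[1.0, 0.0], [0.0, 1.0]]
    shifted = [[M[i][j] + I[i][j] for j in range(2)] for i in range(2)]
    return sub(I, inv2(shifted))

A = [[2.0, 1.0], [1.0, 1.0]]
B = [[1.0, 1.0], [1.0, 1.0]]

ordered = is_psd_2x2(sub(A, B))                          # A >= B
phi_preserves = is_psd_2x2(sub(phi(A), phi(B)))          # phi(A) >= phi(B)
square_preserves = is_psd_2x2(sub(matmul(A, A), matmul(B, B)))
print(ordered, phi_preserves, square_preserves)
```

Here A − B is positive semidefinite and so is φ(A) − φ(B), whereas A² − B² has a negative determinant and is therefore indefinite.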
Fig. 3.1 Venn-diagram: Matrix-monotone functions are matrix-concave, concave, and mono-
tone.
Dφ(A)(B) = AB + BA.
$D\phi(A)(B) = A^H B + B^H A.$
The derivative is linear, i.e., $D(\phi_1 + \phi_2)(A) = D\phi_1(A) + D\phi_2(A)$.
The composite of two differentiable maps φ and γ is differentiable.
Proof. Since the function is continuous and analytic, the function can
be approximated arbitrarily well by a polynomial. Both sides of (3.4) in
Lemma 3.5 are linear in φ. Therefore, it suffices to prove Equation (3.4)
for the powers φ(t) = tp with p ∈ N+ . It holds
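$$\left.\frac{\partial}{\partial \epsilon} \mathrm{tr}\left[\phi(C + \epsilon D)\right]\right|_{\epsilon = 0} = \mathrm{tr}\left(\sum_{k=1}^{p} C^{k-1} D C^{p-k}\right)$$
$$= \mathrm{tr}\left(D C^{p-1} + C D C^{p-2} + \cdots\right)$$
$$= p \cdot \mathrm{tr}\left(D C^{p-1}\right)$$
$$= \mathrm{tr}\left(\phi'(C) \cdot D\right).$$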
D2 φ(X)(A)(B) = D2 φ(X)(B)(A)
$$D^2\phi(A)(B_1, B_2) = A B_1 B_2 + B_1 A B_2 + B_1 B_2 A + A B_2 B_1 + B_2 A B_1 + B_2 B_1 A.$$
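denote by $\phi^{[1]}$ the function on I × I defined as
$$\phi^{[1]}(\lambda_1, \lambda_2) = \frac{\phi(\lambda_1) - \phi(\lambda_2)}{\lambda_1 - \lambda_2}, \quad \text{if } \lambda_1 \ne \lambda_2,$$
$$\phi^{[1]}(\lambda_1, \lambda_1) = \phi'(\lambda_1).$$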
The expression φ[1] (λ1 , λ2 ) is called the first divided difference of φ at
(λ1 , λ2 ).
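The definition translates directly into code. The following sketch builds the matrix of first divided differences for a list of eigenvalues, assuming φ(t) = log(1 + t) as an illustrative choice; the function names are hypothetical.

```python
import math

def first_divided_difference(phi, dphi, l1, l2, tol=1e-12):
    """phi^[1](l1, l2): difference quotient, or the derivative when l1 == l2."""
    if abs(l1 - l2) <= tol:
        return dphi(l1)
    return (phi(l1) - phi(l2)) / (l1 - l2)

def divided_difference_matrix(phi, dphi, eigenvalues):
    """Matrix with entries phi^[1](l_i, l_j) for all pairs of eigenvalues."""
    return [[first_divided_difference(phi, dphi, li, lj) for lj in eigenvalues]
            for li in eigenvalues]

phi = lambda t: math.log(1.0 + t)       # illustrative scalar function
dphi = lambda t: 1.0 / (1.0 + t)        # its derivative

M = divided_difference_matrix(phi, dphi, [1.0, 3.0])
print(M)
```

The diagonal of the resulting matrix carries the ordinary derivative φ′ evaluated at the eigenvalues, and the matrix is symmetric by construction.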
The next corollary shows directly the relationship between the first
divided difference matrix φ[1] and the simple derivative φ0 of the scalar
function φ(t).
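$$= \mathrm{tr}\Big(\phi^{[1]}(\Lambda) \circ \underbrace{U^H D U}_{Z}\Big)$$
$$= \mathrm{tr}\left(\sum_{k=1}^{n} \left[\phi^{[1]}(\Lambda)\right]_{k,k} Z_{k,k}\right)$$
$$= \mathrm{tr}\left(\sum_{k=1}^{n} \left[\phi'(\Lambda)\right]_{k,k} Z_{k,k}\right)$$
$$= \mathrm{tr}\left(\phi'(\Lambda) U^H D U\right)$$
$$= \mathrm{tr}\left(\phi'(A) \cdot D\right). \qquad (3.6)$$
In (3.6), we used the fact that the diagonals of $\phi^{[1]}$ and $\phi'$ are identical and
that the trace of AB is equal to the trace of BA.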
Remark 3.5. There are different approaches and proofs of this fun-
damental theorem in the literature [34, 83, 88, 156].
Example 3.7. The list of the following functions provides only a small
number of representatives. However, these functions will be of certain
importance in applications later.
$$\phi(t) = -1 + \int_1^\infty \frac{st}{s + t} \frac{1}{s^3}\,ds = -\frac{\log(1 + t)}{t}.$$
(3) MM = (0, 0, δ(s − 1)) with the delta function δ(s) leads to
$$\phi(t) = \int_0^\infty \frac{st}{s + t}\,\delta(s - 1)\,ds = \frac{t}{1 + t}.$$
$$\phi(t) = \frac{\sin(r\pi)}{\pi} \int_0^\infty \frac{st}{s + t}\,s^{r-2}\,ds = t^r.$$
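The first integral representation in this example can be checked by numerical quadrature. The sketch assumes the form φ(t) = −1 + ∫₁^∞ st/(s+t) · s⁻³ ds and maps the infinite range to [0, 1] with the substitution u = 1/s, which turns the integrand into tu/(1 + tu); Simpson's rule is an illustrative choice of quadrature.

```python
import math

def simpson(g, a, b, n=2000):
    """Composite Simpson rule on [a, b] with an even number n of sub-intervals."""
    h = (b - a) / n
    total = g(a) + g(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(a + i * h)
    return total * h / 3.0

def phi_integral(t):
    """-1 + int_1^inf st/(s+t) s^-3 ds, evaluated after substituting u = 1/s."""
    return -1.0 + simpson(lambda u: t * u / (1.0 + t * u), 0.0, 1.0)

def phi_closed_form(t):
    return -math.log(1.0 + t) / t

for t in (0.5, 2.0, 10.0):
    print(t, phi_integral(t), phi_closed_form(t))

# The delta measure at s = 1 simply evaluates the kernel st/(s+t) at s = 1.
kernel_at_one = lambda t: 1.0 * t / (1.0 + t)
```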
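Proof. The directional derivative is defined as
$D\phi(A)(B) = \left.\frac{\partial \phi(A + \epsilon B)}{\partial \epsilon}\right|_{\epsilon = 0}$.
$$\phi(A + \epsilon B) = aI + b(A + \epsilon B) + \int_0^\infty s(A + \epsilon B)\left[sI + A + \epsilon B\right]^{-1} d\mu(s).$$
At point ε = 0 we obtain
$$\left.\frac{\partial \phi(A + \epsilon B)}{\partial \epsilon}\right|_{\epsilon = 0} = bB + \int_0^\infty \left[I - A[sI + A]^{-1}\right] sB\,[sI + A]^{-1}\,d\mu(s)$$
$$= bB + \int_0^\infty [sI + A - A][sI + A]^{-1}\,sB \cdot [sI + A]^{-1}\,d\mu(s)$$
which is equal to (3.11).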
for all 1 ≤ k ≤ n.
Remark 3.9. The Theorem says that in order to have ||A|| ≤ ||B||
it is necessary and sufficient to check the Ky Fan k-norms for 1 ≤ k ≤ n.
Note that this condition corresponds exactly to the statement that x
is majorized by y, i.e., x ≺ y. In other words, the symmetric gauge
function is Schur-convex with respect to x as stated in Lemma 2.6.
Note that the Löwner order and the Majorization of the eigenvalue
vector are two possible partial orders of positive semidefinite matrices.
If a matrix is greater than or equal to another matrix with respect
to the Löwner order, i.e., A ≥ B, then the eigenvalues of matrix A
also weakly majorize the eigenvalues of matrix B, i.e., $\lambda(A) \succ_w \lambda(B)$.
In particular, if A ≥ B, then |||A||| ≥ |||B|||.
A map Φ that maps from the set of positive semidefinite matrices to the
set of positive semidefinite matrices is called a permutation operator if
for all A the entries of Φ(A) are one fixed rearrangement of those of A.
By the duality theorem [51, Thm. 5.5.14], it follows $(\|\cdot\|^D)^D = \|\cdot\|$.
with permutation π.
Proof. The function $\phi(B^{1/2} A B^{1/2})$ can be rewritten using the
eigenvalue decompositions $A = U_A \Lambda_A U_A^H$, $B = U_B \Lambda_B U_B^H$, and
$U = U_B^H U_A$. Without loss of generality, we assume that both A and B have
full rank. The function is
$$\phi(B^{1/2} A B^{1/2}) = \phi(\Lambda_B^{1/2} U A U^H \Lambda_B^{1/2}).$$
$$= \mathrm{tr}\Big(\Big[A_1 B_0^{1/2} \phi'(B_0^{1/2} A_1 B_0^{1/2}) B_0^{1/2} - B_0^{1/2} \phi'(B_0^{1/2} A_1 B_0^{1/2}) B_0^{1/2} A_1\Big] S\Big) = \mathrm{tr}\,S S^H > 0. \qquad (3.18)$$
The same approach can be used to show that for another choice of S
the derivative of φ() at the point is negative definite.
The dimensions of the three matrices in the SVD are given by: $U_H$ is
m × µ, $\Lambda_H$ is µ × µ, and $V_H^H$ is µ × n with µ = min(m, n).
Proof. The value of the two optimization problems in (3.19) does not
depend on the left or right eigenvectors of H, because $|||\phi(U A U^H)||| =
|||\phi(A)|||$ for unitary U and because $|||U Q U^H||| = |||Q|||$. Denote
the rank of the n × m matrix H by ν. Furthermore, the value of the
matrix-monotone function φ at point zero is equal to zero, i.e., φ(0) = 0.
Write the unitary invariant norm as its symmetric gauge function, i.e.,
Φ(λ(A)) = |||A|||. As a result, the LHS of (3.19) is
$$\phi(H Q H^H) = \phi(U_H \Lambda_H V_H^H Q V_H \Lambda_H U_H^H)$$
$$= \phi(\Lambda_H V_H^H Q V_H \Lambda_H)$$
$$= \phi(\Lambda_H V_H^H V_H U_H^H S U_H V_H^H V_H \Lambda_H)$$
$$= \phi(\Lambda_H U_H^H S U_H \Lambda_H)$$
$$= \phi(V_H \Lambda_H U_H^H S U_H \Lambda_H V_H^H)$$
$$= \phi(H^H S H). \qquad (3.23)$$
$$C(A \sigma B)C - (CAC)\sigma(CBC) = C\left[A \sigma B - C^{-1}\big((CAC)\sigma(CBC)\big)C^{-1}\right]C$$
Example 3.13. Consider the function tr φ Z −1/2 HQH H Z −1/2 . It
can be represented by
tr φ Z −1/2 HQH H Z −1/2 = tr 1σφ Z −1/2 HQH H Z −1/2
= tr Z −1 Z −1 σφ HQH H .
4 Application of Majorization in Wireless Communications
$$\kappa = E\left[\mathrm{vec}(H) \cdot \mathrm{vec}(H)^H\right]. \qquad (4.1)$$
$$\kappa = R_R \otimes R_T \qquad (4.2)$$
The channel matrix H for the case in which we have the Kronecker
assumption and correlated transmit and correlated receive antennas is
modeled as
$$H = R_R^{1/2} \cdot W \cdot R_T^{1/2} \qquad (4.3)$$
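The Kronecker structure in (4.2) can be sketched in a few lines. The correlation matrices below are made-up 2 × 2 examples with unit diagonal, and `kron` is a plain-Python helper written for this illustration.

```python
def kron(A, B):
    """Kronecker product of two matrices given as lists of rows."""
    return [[a * b for a in row_a for b in row_b]
            for row_a in A for row_b in B]

# Hypothetical receive and transmit correlation matrices (unit diagonal).
R_R = [[1.0, 0.5],
       [0.5, 1.0]]
R_T = [[1.0, 0.2],
       [0.2, 1.0]]

kappa = kron(R_R, R_T)      # 4 x 4 full correlation matrix under (4.2)
for row in kappa:
    print(row)
```

Each 2 × 2 block of the result is a scaled copy of the transmit correlation, which is exactly the restriction that the sum-of-Kronecker-products model relaxes.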
Fig. 4.1 Propagation models: Correlated transmit antennas at base station, uncorrelated
mobile with rich scattering, and key-hole channel.
In the case in which each receive antenna observes the same corre-
lation between the transmit antennas, i.e., the transmit correlation is
independent of the receive antenna and vice versa the receive correla-
tion is independent of the transmit antenna, the correlation model in
(4.1) simplifies to the model in (4.3). Note that the Kronecker model
arises not only in MIMO communications but also in the modeling of
electroencephalography (EEG) data. Methods to estimate the correla-
tion matrices under the Kronecker assumption are described in [154].
Note that the Kronecker model is a limited correlation model that
can only be applied successfully under certain conditions on the local
scattering at the transmitter and receiver [85, 103]. Therefore, a more
general model allows a sum of Kronecker products [11], i.e.,
$$\kappa = \sum_{k=1}^{n} R_k^R \otimes R_k^T. \qquad (4.4)$$
However, it turns out that even the model (4.4) cannot cover the complete
set of positive semi-definite correlation matrices. One counter
example is explicitly given here for the case $n_T = n_R = 2$:
$$\kappa = \begin{pmatrix} 3/4 & 0 & 0 & 3/8 \\ 0 & 1/4 & 1/8 & 0 \\ 0 & 1/8 & 1/8 & 0 \\ 3/8 & 0 & 0 & 3/8 \end{pmatrix}.$$
However, the RHS of (4.6) is the sum of the average path loss from
the transmit antenna i = 1, . . . , nT . In order to study purely the impact
of correlation on the achievable capacity separately, the average path
loss is kept fixed by applying the trace constraint on the correlation
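matrices $R_T^1$ and $R_T^2$.
We will say that a correlation matrix $R_T^1$ is more correlated than
$R_T^2$ with descending ordered eigenvalues $\lambda_1^{T,1} \ge \lambda_2^{T,1} \ge \cdots \ge \lambda_{n_T}^{T,1} \ge 0$
and $\lambda_1^{T,2} \ge \lambda_2^{T,2} \ge \cdots \ge \lambda_{n_T}^{T,2} \ge 0$ if
$$\sum_{k=1}^{m} \lambda_k^{T,1} \ge \sum_{k=1}^{m} \lambda_k^{T,2}, \qquad 1 \le m \le n_T - 1. \qquad (4.7)$$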
The measure of correlation is defined in a natural way: the larger the
first m eigenvalues of the correlation matrices are (with the trace con-
straint in (4.6)), the more correlated is the MIMO channel. As a result,
the most uncorrelated MIMO channel has equal eigenvalues, whereas
the most correlated MIMO channel has only one non-zero eigenvalue
which is given by λ1 = nT .
The following definition again provides the measure for the comparison
of two correlation matrices.
$$\rho_1 \ge \rho_2 \Longrightarrow \lambda(R(\rho_1)) \succ \lambda(R(\rho_2))$$
and R(ρ1 ) is more correlated than R(ρ2 ). The extreme cases are ρ = 0
which leads to completely uncorrelated R(0) = I and ρ = 1 which leads
to completely correlated R(1).
with $\tilde{n}_k$ complex iid with variance $\sigma_n^2$, because the normed matched
filter matrix is unitary. The matched filter that leads to (4.11) is matched
to an effective channel that takes into account the space–time code, not
the actual physical channel.
gain $n_T n_R$, i.e., $\mathrm{BER}_{wc} = E_{s_1}[f(n_T n_R s_1)]$. The best case performance
is achieved for completely uncorrelated transmit and receive antennas,
i.e., $\mathrm{BER}_{bc} = E_{s_{1,1}, \ldots, s_{n_T,n_R}}\left[f\left(\sum_{k=1}^{n_T} \sum_{l=1}^{n_R} s_{k,l}\right)\right] = E_v[f(v)]$ with
$\chi^2$ distributed v with $2 n_T n_R$ degrees of freedom. This is the maximum
diversity gain of the system. In [76], it is shown that the full diversity
gain is achieved as long as the channel correlation matrices have full
rank. If the correlation matrices have full rank, the spatial correlation
shifts the BER curves only to the right but does not change the slope.
In Figure 4.2, the average BER for BPSK modulation and a 2 × 2
MIMO system applying an Alamouti STC is shown for three correlation
scenarios.
The result can be interpreted in the following way: Since no CSI
is available at the transmitter, the spatial dimension is best used if
all spatial diversity is exploited and an OSTBC is used to achieve full
diversity. Therefore, the reduction in diversity due to correlated trans-
mit or receive antennas leads to a performance degradation. This fact
is shown in the corollary above.
Closed form expressions for the BER as well as an illustration can
be found in [76]. Equal gain combining and selection combining are
studied in [75].
Fig. 4.2 Average BER for Alamouti STC 2 × 2 and different spatial correlations.
Proof. Since f (x) = log(1 + ax) for a > 0 is a concave function, this
result follows directly from Lemma 2.16.
Corollary 4.4. For the ergodic capacities in MISO systems with dif-
ferent levels of correlation and different CSI at the transmitter, we have
the following inequalities:
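$$C_{opt}^{noCSI}(\psi) \le C_{opt}^{noCSI}(\mu_2) \le C_{opt}^{noCSI}(\mu_1) \le C_{opt}^{noCSI}(\chi)$$
$$= C_{opt}^{cfCSI}(\chi) \le C_{opt}^{cfCSI}(\mu_1) \le C_{opt}^{cfCSI}(\mu_2) \le C_{opt}^{cfCSI}(\psi)$$
$$= C_{opt}^{pCSI}(\psi) \le C_{opt}^{pCSI}(\mu_2) \le C_{opt}^{pCSI}(\mu_1) \le C_{opt}^{pCSI}(\chi). \qquad (4.16)$$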
$$P_{out}(\rho, R, \mu) = \Pr\left[\log\left(1 + \rho \sum_{k=1}^{n_T} \mu_k s_k\right) \le R\right].$$
The proof follows directly from Theorem 2.19. For more discussion and
illustrations, the interested reader is referred to [73].
OSTBC with the same properties as the Alamouti scheme like, e.g., a
remarkably simple maximum-likelihood decoding algorithm. The per-
formance of OSTBC with respect to mutual information was analyzed
(among others) for the uncorrelated Rayleigh fading case in [118, 6] and
for the more general case with different correlation scenarios and line
of sight (LOS) components in [98]. More information about ST codes
can be found in the books [86] and [109].
Consider again the standard MISO block-flat-fading channel model
given by $y = x^H h + n$ with complex $n_T \times 1$ transmit vector x, channel
vector h ($n_T \times 1$), circularly symmetric complex Gaussian noise n with
variance $\frac{\sigma_n^2}{2}$ per dimension. The inverse noise variance is denoted by
$\rho = \frac{1}{\sigma_n^2}$. The channel vector consists of complex Gaussian distributed
entries with zero mean and covariance matrix I, i.e., h ∼ CN (0, I).
The transmitter has no CSI and applies an OSTBC. For data stream
k the received signal after channel matched filtering is given by
yk = ||h||2 xk + nk . (4.17)
$$r_c(n_T) = \frac{\left\lfloor \frac{n_T + 1}{2} \right\rfloor + 1}{2 \left\lfloor \frac{n_T + 1}{2} \right\rfloor} \qquad (4.18)$$
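The code rate in (4.18) is easy to tabulate. The sketch below reads the garbled formula as ((⌊(n_T+1)/2⌋ + 1) / (2⌊(n_T+1)/2⌋)), which is an assumption about the reconstruction; the function name is illustrative.

```python
def ostbc_rate(n_t):
    """Rate r_c(n_T) = (floor((n_T+1)/2) + 1) / (2 * floor((n_T+1)/2))."""
    m = (n_t + 1) // 2
    return (m + 1) / (2 * m)

for n_t in range(1, 7):
    print(n_t, ostbc_rate(n_t))
```

Under this reading the rate is one for up to two transmit antennas (the Alamouti case) and drops to 3/4 for three and four antennas, so antennas come in pairs with equal rate.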
with the incomplete Gamma function Γ(n, x) [1]. The proof of the next
Theorem can be found in [77].
even to an odd number of antennas $\rho_{23}$ and $\rho_{45}$ are at a higher SNR
than the switching points from even to the next even number of antennas
$\rho_{24}$ and $\rho_{46}$, respectively.
y = Hx + n (4.21)
perfectly.
The channel matrix H for the case in which we have correlated
transmit and correlated receive antennas is modeled as in (4.3), i.e.,
$H = R_R^{1/2} \cdot W \cdot R_T^{1/2}$ with transmit correlation matrix $R_T = U_T D_T U_T^H$
and receive correlation matrix $R_R = U_R D_R U_R^H$. $U_T$ and $U_R$ are
the matrices with the eigenvectors of $R_T$ and $R_R$, respectively, and
$D_T$, $D_R$ are diagonal matrices with the eigenvalues of the matrix
$R_T$ and $R_R$, respectively, i.e., $D_T = \mathrm{diag}[\lambda_1^T, \ldots, \lambda_{n_T}^T]$ and
$D_R = \mathrm{diag}[\lambda_1^R, \ldots, \lambda_{n_R}^R]$. Without loss of generality, we assume that all
eigenvalues are ordered in decreasing order, i.e., $\lambda_1^T \ge \lambda_2^T \ge \cdots \ge \lambda_{n_T}^T$. The
random matrix W has zero-mean independent complex Gaussian identically
distributed entries, i.e., W ∼ CN (0, I).
The average performance measure will be defined in the next chap-
ter using matrix-monotone functions. Consider the following average
performance function and assume for the moment that φ is the mutual
information, i.e., φ(x) = log(1 + x). The characterization of a general
class of performance functions will be given in Section 5.1.4. Then the
average performance reads
$$\Phi(\lambda^T, \lambda^R) = E\,\mathrm{tr}\,\phi\left(\rho \sum_{k=1}^{n_T} \lambda_k^T \tilde{w}_k \tilde{w}_k^H\right)$$
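Corollary 4.5. For fixed receive correlation vector $\bar{\lambda}^R$ and fixed vector
$\lambda_0^T$ and for arbitrary vector $\lambda_1^T$ which majorizes vector $\lambda_0^T$, i.e., $\lambda_1^T \succ \lambda_0^T$,
it follows that $\Phi(\lambda_0^T, \bar{\lambda}^R) \ge \Phi(\lambda_1^T, \bar{\lambda}^R)$.
For fixed transmit correlation vector $\bar{\lambda}^T$ and fixed vector $\lambda_0^R$ and
for arbitrary vector $\lambda_1^R$ which majorizes vector $\lambda_0^R$, i.e., $\lambda_1^R \succ \lambda_0^R$, it
follows that $\Phi(\bar{\lambda}^T, \lambda_0^R) \ge \Phi(\bar{\lambda}^T, \lambda_1^R)$.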
Fig. 4.4 Average mutual information for 2 × 2 MIMO system as a function of transmit and
receive correlation.
$$\frac{E_b}{N_0}\,C(\mathrm{SNR}) = \mathrm{SNR}.$$
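At low SNR, the function $C\left(\frac{E_b}{N_0}\right)$ can be expressed as [143]
$$C\left(\frac{E_b}{N_0}\right) \approx \frac{S_0}{3\,\mathrm{dB}} \left(\left(\frac{E_b}{N_0}\right)_{\mathrm{dB}} - \left(\frac{E_b}{N_0}\right)_{\min,\,\mathrm{dB}}\right) \qquad (4.23)$$
with
$$\left(\frac{E_b}{N_0}\right)_{\min} = \frac{\log_e 2}{\dot{C}(0)} \quad \text{and} \quad S_0 = \frac{2\left[\dot{C}(0)\right]^2}{-\ddot{C}(0)}. \qquad (4.24)$$
The closer $\frac{E_b}{N_0}$ gets to $\left(\frac{E_b}{N_0}\right)_{\min}$ the better is the approximation in (4.23).
Note that the first and second derivatives in (4.24) are taken of the
common capacity function C(SNR).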
In [89, 142], the two performance measures in (4.24) were computed
for the MIMO channel without CSI at the transmitter and with perfect
CSI at the receiver. The minimum $\frac{E_b}{N_0}$ and the wideband slope are given
by [143, Thm. 13]
$$\left(\frac{E_b}{N_0}\right)_{\min}^{noCSI} = \frac{\log_e 2}{n_R}, \qquad (4.25)$$
$$S_0^{noCSI} = \frac{2 n_T^2 n_R^2}{n_T^2 \sum_{k=1}^{n_R} (\lambda_k^R)^2 + n_R^2 \sum_{k=1}^{n_T} (\lambda_k^T)^2}. \qquad (4.26)$$
Note that we focus on the transmitted $\frac{E_b}{N_0}$ as in [89].
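The two low-SNR measures for the uninformed transmitter can be evaluated directly from the eigenvalues of the correlation matrices. The sketch below assumes the reading of (4.25) and (4.26) as ln 2 / n_R and 2 n_T² n_R² / (n_T² Σ(λ_k^R)² + n_R² Σ(λ_k^T)²); function names and the 2 × 2 eigenvalue vectors are illustrative.

```python
import math

def ebn0_min_no_csi_db(n_r):
    """Minimum Eb/N0 in dB without transmitter CSI: 10 log10(ln 2 / n_R)."""
    return 10.0 * math.log10(math.log(2.0) / n_r)

def wideband_slope_no_csi(lam_t, lam_r):
    """Wideband slope S0 from transmit and receive correlation eigenvalues."""
    n_t, n_r = len(lam_t), len(lam_r)
    denom = (n_t ** 2 * sum(l * l for l in lam_r)
             + n_r ** 2 * sum(l * l for l in lam_t))
    return 2.0 * n_t ** 2 * n_r ** 2 / denom

uncorrelated = wideband_slope_no_csi([1.0, 1.0], [1.0, 1.0])
partially = wideband_slope_no_csi([1.5, 0.5], [1.0, 1.0])
fully = wideband_slope_no_csi([2.0, 0.0], [2.0, 0.0])
print(ebn0_min_no_csi_db(1), uncorrelated, partially, fully)
```

The minimum Eb/N0 depends only on the number of receive antennas, whereas the slope shrinks as the eigenvalue vectors become more spread out, from 2 n_T n_R / (n_T + n_R) in the uncorrelated case down to 1 in the completely correlated case.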
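For perfect CSI at the transmitter and the receiver, the $\left(\frac{E_b}{N_0}\right)_{\min}$ and
the wideband slope $S_0$ are given by
$$\left(\frac{E_b}{N_0}\right)_{\min}^{pCSI} = \frac{\log_e 2}{E\,\lambda_{\max}(H H^H)} \qquad (4.27)$$
$$S_0^{pCSI} = \frac{2\left(E\,\lambda_{\max}(H H^H)\right)^2}{E\left(\lambda_{\max}(H H^H)\right)^2}. \qquad (4.28)$$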
The impact of correlation on the performance metric in (4.27) is char-
acterized in the following theorem.
Theorem 4.4. With perfect CSI at the transmitter and receiver, the
minimum $\frac{E_b}{N_0}$ is Schur-concave with respect to the transmit and receive
correlation, i.e., for fixed receive correlation it holds
$$\lambda_1^T \succ \lambda_2^T \Longrightarrow \left(\frac{E_b}{N_0}\right)_{\min}^{pCSI}(\lambda_1^T) \le \left(\frac{E_b}{N_0}\right)_{\min}^{pCSI}(\lambda_2^T).$$
Proof. The minimum $\left(\frac{E_b}{N_0}\right)_{\min}$ and the wideband slope $S_0$ do not depend
on the eigenvectors of the transmit and receive correlation matrix since
the pdf of H is invariant against multiplication with a unitary matrix
from left and from right, i.e.,
$$E\,\lambda_{\max}(H H^H) = E\,\lambda_{\max}(R_T W R_R W^H) = E\,\lambda_{\max}(D_T W D_R W^H). \qquad (4.29)$$
Fix the receive correlation and express the expectation in (4.29) as a
function of the vector of eigenvalues in $D_T$
$$f(\lambda^T) = E\,\lambda_{\max}(\mathrm{diag}(\lambda^T) W D_R W^H). \qquad (4.30)$$
We have the following two observations:
$$= E\,\lambda_{\max}(\mathrm{diag}(\lambda^T) W D_R W^H).$$
The last equality follows from the fact that the pdfs of W and
ΠW are equal, because the permutation matrix Π is unitary.
(2) $f(\lambda^T)$ is convex with respect to $\lambda^T$. This holds even for each
realization W. Define Λ(t) = tΓ + (1 − t)Ψ. It holds
Using Theorem 2.15 in Subsection 2.2.2 we observe that the conditions
for Schur-convexity, i.e., convexity and symmetry, are fulfilled.
This completes the proof.
For covariance knowledge at the transmitter, the minimum $E_b/N_0$ and
the wideband slope $S_0$ are given by
\[
{\frac{E_b}{N_0}}_{\min}^{\mathrm{covCSI}} = \frac{\log_e 2}{n_R \lambda_1^T} \tag{4.32}
\]
\[
S_0^{\mathrm{covCSI}} = \frac{2 n_R^2}{\mathbb{E}\left(\sum_{k=1}^{n_R} \lambda_k^R w_k\right)^2}. \tag{4.33}
\]
The proof follows from Lemma 2.16. Bounds on the achievable performance
can be found in [70].
In Figure 4.5, the spectral efficiency over $E_b/N_0$ is shown for different
MIMO systems with uninformed transmitter and perfectly informed
Fig. 4.5 Spectral efficiency over $E_b/N_0$ for different MIMO systems and transmitter and
receiver correlation for uninformed transmitter. The solid lines are the $E_b/N_0$ and wideband
slope $S_0$ approximations, the symbols are the simulated results.
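Next, the inequalities for the wideband slope are presented. Denote
the completely correlated scenario, i.e., $\lambda_1^T = n_T$ and $\lambda_2^T = \lambda_3^T =
\cdots = \lambda_{n_T}^T = 0$ and $\lambda_1^R = n_R$ and $\lambda_2^R = \lambda_3^R = \cdots = \lambda_{n_R}^R = 0$, by $S_{0,cc}$ and
the completely uncorrelated case, i.e., $\lambda_1^T = \lambda_2^T = \cdots = \lambda_{n_T}^T = \lambda_1^R =
\lambda_2^R = \cdots = \lambda_{n_R}^R = 1$, as $S_{0,uc}$. Then the following inequalities hold
\[
S_{0,uc}^{\mathrm{nCSI}} = \frac{2 n_T n_R}{n_T + n_R} \ge S_0^{\mathrm{nCSI}} \ge 1 = S_{0,cc}^{\mathrm{nCSI}}
\]
\[
S_{0,uc}^{\mathrm{covCSI}} = \frac{2 n_R}{n_R + 1} \ge S_0^{\mathrm{covCSI}} \ge \frac{1}{2} = S_{0,cc}^{\mathrm{covCSI}}
\]
\[
S_{0,cc}^{\mathrm{pCSI}} = \frac{1}{2} \ge S_0^{\mathrm{pCSI}}.
\]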
allocation by
\[
p^*(\mathbf{h}) = \frac{P}{\mathbb{E}\left[\frac{1}{\|\mathbf{h}\|^2}\right]} \, \frac{1}{\|\mathbf{h}\|^2}. \tag{4.37}
\]
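The allocation in (4.37) inverts the channel gain while meeting an average power constraint. A minimal sketch follows, with hypothetical names; it assumes (this is not stated in the surrounding text) that the expectation is replaced by an empirical average over sampled gains, and that $\|h\|^2$ is a sum of n_t independent unit-mean exponential variables.

```python
import random


def channel_inversion_powers(gains, total_power):
    """Apply (4.37) with the expectation replaced by the sample mean of 1/gain."""
    inv_mean = sum(1.0 / g for g in gains) / len(gains)
    return [total_power / (inv_mean * g) for g in gains]


random.seed(0)
n_t = 4  # assumed number of transmit antennas for the illustration
# Assumed fading model: ||h||^2 is a sum of n_t unit-mean exponential terms.
gains = [sum(random.expovariate(1.0) for _ in range(n_t)) for _ in range(10000)]

powers = channel_inversion_powers(gains, total_power=2.0)
avg_power = sum(powers) / len(powers)
# Received power p*(h) * ||h||^2 for every channel realization.
received = [p * g for p, g in zip(powers, gains)]
```

By construction the average of the allocated powers equals the power budget, and the product of allocated power and channel gain is the same for every realization, which is what makes a constant rate possible in every fading state.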
µ1 µ2 =⇒ C d (ρ, P , µ1 ) ≥ C d (ρ, P , µ2 ).
The noise power at the receivers is $\sigma_k^2 = \frac{1}{\rho}$. The transmit power to noise
power is given by SNR = $P\rho$, which is called transmit SNR. The channels
are modeled by $\mathbf{h}_k = \mathbf{w}_k \mathbf{R}_k^{1/2}$ with correlation matrix $\mathbf{R}_k$ for
user $1 \le k \le K$. Denote the eigenvalues of $\mathbf{R}_k$ in decreasing order
$\lambda_1^k \ge \cdots \ge \lambda_{n_T}^k \ge 0$.
The following result is proven in [58] for the case where only linear
precoding is allowed at the base station and the base knows only the
norm of the channel vectors of all users.
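Corollary 4.6. The guaranteed MSE region without SIC shrinks with
increasing spatial correlation at the mobile terminals, i.e., from $\boldsymbol{\lambda}^k \succ \boldsymbol{\gamma}^k$
for $1 \le k \le K$, it follows $\mathrm{MSE}(\boldsymbol{\lambda}^1, \ldots, \boldsymbol{\lambda}^K) \subseteq \mathrm{MSE}(\boldsymbol{\gamma}^1, \ldots, \boldsymbol{\gamma}^K)$.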
In Figure 4.6, the zero-outage capacity region for two users and
two transmit antennas with symmetric correlation for different scenarios
is shown. Note that completely correlated transmit antennas lead
to zero-outage capacity. The uncorrelated scenario leads to $\mathbb{E}[1/\alpha_1] =
\mathbb{E}[1/\alpha_2] = 1$ whereas correlation $\lambda$ increases this value to
\[
\mathbb{E}[1/\alpha_1] = \mathbb{E}[1/\alpha_2] = \frac{\log(\lambda) - \log(2 - \lambda)}{2\lambda - 2}.
\]
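The closed-form expression above is easy to tabulate. The sketch below uses a hypothetical function name and treats $\lambda = 1$ as the limiting case, where the expression takes the value 1 stated for the uncorrelated scenario; the domain $0 < \lambda < 2$ is assumed.

```python
import math


def mean_inverse_gain(lam):
    """Evaluate (log(lam) - log(2 - lam)) / (2*lam - 2) for 0 < lam < 2."""
    if abs(lam - 1.0) < 1e-8:
        # Limit lam -> 1 (uncorrelated case); the quotient is 0/0 there.
        return 1.0
    return (math.log(lam) - math.log(2.0 - lam)) / (2.0 * lam - 2.0)


value_uncorrelated = mean_inverse_gain(1.0)
value_moderate = mean_inverse_gain(1.5)
value_strong = mean_inverse_gain(1.9)
```

For $\lambda = 1.9$, the correlated value used in Figure 4.6, the expression evaluates to $\log(19)/1.8 \approx 1.636$, compared with 1 in the uncorrelated case.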
Fig. 4.6 Zero-outage capacity region for MISO BC with two transmit antennas and two
users for different correlation scenarios λ = 1 and λ = 1.9.
Fig. 4.7 Interplay between different terms in cross-layer optimization. The arrows corre-
spond to some different types of relationships and interactions, e.g., (1) The user distribu-
tion influences the fading statistics and thereby the CSI. (2) The user distribution influences
the scheduling strategy because cell-edge users are treated in a different way than close-
to-the-base users. (3) The network utility function directly determines the optimal schedul-
ing strategy. (4) The value of the network utility function depends on the availability of
CSI. Note that there are many more relationships between the four terms.
Fig. 4.8 Fair comparison of user distributions with K = 8 users. (a) Symmetric scenario
c1 = (1, 1, . . . , 1). (b) One mobile moves to the base and another to the cell edge. The sum
of their fading variances stays constant. c2 = (1 + α, 1, 1, . . . , 1 − α). (c) All but one mobile
at the cell edge c3 = (8, 0, . . . , 0).
Remark 4.4. Note that the measure of user distribution and the measure
of spatial correlation can be combined for, e.g., multiuser MIMO
systems. The channel of a user k can be modeled for all $1 \le k \le K$
under the Kronecker model assumption from Section 4.1.1 as
\[
\mathbf{H}_k = c_k \mathbf{R}_{R,k}^{1/2} \mathbf{W}_k \mathbf{R}_{T,k}^{1/2}
\]
with normalized transmit and receive correlation matrix as well as normalized
random matrix, i.e., for all $1 \le k \le K$, $\operatorname{tr} \mathbf{R}_{T,k} = n_T$, $\operatorname{tr} \mathbf{R}_{R,k} = n_R$,
and $\mathbb{E}\mathbf{W}_k = 1$. Then, the long-term fading is captured by $c_k$, the
spatial correlation by $\mathbf{R}_T$ and $\mathbf{R}_R$, and the rich multi-path environment
by $\mathbf{W}$.
\[
y(t) = \sum_{k=1}^{K} h_k x_k(t) + n(t). \tag{4.40}
\]
Theorem 4.9. For perfect CSI at the mobiles, only the best user is
allowed to transmit at one time. The average sum rate in (4.41) is a
Schur-convex function w.r.t. the fading variance vector c.
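\[
C_{\mathrm{cfCSI}}(\rho, \mathbf{c}) = \max_{\substack{\sum_{i=1}^{K} p_i = 1 \\ p_i \ge 0 \;\forall\, 1 \le i \le K}} \mathbb{E} \log\left(1 + \rho \sum_{k=1}^{K} c_k p_k w_k\right). \tag{4.42}
\]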
Theorem 4.10. For mobiles which know the fading variances and perfect
CSI at the base, the average sum rate in (4.42) increases with more
spread out fading variances c, i.e., the average sum rate in (4.42) is a
Schur-convex function w.r.t. the fading variance vector c.
Finally, the ergodic sum capacity of the SISO MAC with perfect
CSI at the base and no CSI at the mobiles is given by
\[
C_{\mathrm{noCSI}}(\rho, \mathbf{c}) = \mathbb{E} \log\left(1 + \rho \sum_{k=1}^{K} c_k w_k\right). \tag{4.43}
\]
Theorem 4.11. For uninformed mobiles and perfect CSI at the base,
the average sum rate in (4.43) increases with less spread out fading
variances c, i.e., the average sum rate in (4.43) is a Schur-concave
function w.r.t. the fading variance vector c.
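The Schur-concavity in Theorem 4.11 can be illustrated numerically by estimating (4.43) for fading variance vectors that are ordered by majorization. The following Monte Carlo sketch is only an illustration under assumptions: the $w_k$ are taken to be independent unit-mean exponential variables, the function name and sample size are hypothetical, and a fixed seed is used so that all vectors see the same fading draws.

```python
import math
import random


def avg_sum_rate_no_csi(rho, c, n_samples=20000, seed=1):
    """Monte Carlo estimate of E log(1 + rho * sum_k c_k w_k) from (4.43).

    Assumption: the w_k are i.i.d. exponential with unit mean.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        s = sum(ck * rng.expovariate(1.0) for ck in c)
        total += math.log(1.0 + rho * s)
    return total / n_samples


# Three fading variance vectors with the same sum, ordered by majorization:
# (4,0,0,0) majorizes (2,1,1,0), which majorizes (1,1,1,1).
rate_equal = avg_sum_rate_no_csi(1.0, [1.0, 1.0, 1.0, 1.0])
rate_mid = avg_sum_rate_no_csi(1.0, [2.0, 1.0, 1.0, 0.0])
rate_spread = avg_sum_rate_no_csi(1.0, [4.0, 0.0, 0.0, 0.0])
```

In this experiment the estimated rate is largest for the equal vector and smallest for the most spread out vector, in line with the Schur-concavity stated in the theorem.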
We omit the time index for convenience. The statistics of the fading
channel coefficients hi are completely characterized by ci . The trans-
mit power directly corresponds to the variance of the transmit signals
Lemma 4.12. The sum rate with perfect CSI at the base station is
achieved by TDMA. The optimal power allocation is to transmit into
direction of the best user l with $\|h_l\|^2 > \|h_k\|^2$ for all $1 \le k \le K$ and
$l \ne k$. The ergodic sum capacity is then given by
For no CSI at the base, the most robust transmit strategy against
worst case user distribution is equal power allocation and the ergodic
sum rate2 is given by
\[
C_{\mathrm{noCSI}}(\rho, \mathbf{c}) = \mathbb{E} \log\left(1 + \rho \sum_{k=1}^{K} c_k w_k\right). \tag{4.46}
\]
Next, let us characterize the impact of the spread of the fading variances
on the ergodic sum capacity for the cases with perfect CSI and covariance
knowledge, and on the ergodic sum rate with no CSI at the base.
2 Since the optimal transmit strategy for no CSI is motivated by a compound channel
approach, we cannot talk about the sum capacity. Instead we use the term sum rate.
Theorem 4.13. Assume perfect CSI at the mobiles. For perfect CSI at
the base, the ergodic sum capacity in (4.44) is a Schur-convex function
w.r.t. the fading variance vector c. For a base which knows the fading
variances, the ergodic sum capacity in (4.45) is a Schur-convex function
w.r.t. the fading variance vector c. For an uninformed base station, the
ergodic sum rate in (4.46) is a Schur-concave function w.r.t. the fading
variance vector c.
The proof and illustrations can be found in [72]. The proof is based
on Lemma 2.16 and Theorem 2.17.
is the average sum rate. For single user systems with perfect CSI, it is
called ergodic capacity [95]. In multiuser systems with perfect CSI we
can call it ergodic sum capacity and it describes the overall performance
Pr[C(α) < R∗ ] = 0.
nel approach that equal power allocation across all users is optimal.
Furthermore, the impact of the user distribution on the outage sum
rate is characterized in the following theorem.
Theorem 4.14. Assume that the base station is uninformed and the
user distribution is according to c. For fixed transmission rate R and
for SNR $\rho < \underline{\rho} = \frac{2^R - 1}{2}$, the sum outage probability is a Schur-concave
function of the user distribution $c_1, \ldots, c_K$, i.e., a less equal distribution
of users decreases the sum outage probability. For SNR $\rho > \overline{\rho} = 2^R - 1$,
the sum outage probability is a Schur-convex function of the user distribution
$c_1, \ldots, c_K$, i.e., a less equal distribution of users increases the
sum outage probability.
Theorem 4.15. With perfect CSI at the base, the optimal scheduling
is TDMA and the outage probability is given by
Proof. In order to verify Schur’s condition, note that the outage prob-
ability can be written as
\[
P_{\mathrm{out}}^{\mathrm{pCSI}}(\mathbf{c}) = \prod_{k=1}^{K} \left(1 - \exp(-z/c_k)\right)
\]
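The product form above is the probability that the largest of K independent scaled fading gains stays below the threshold z. The sketch below compares the closed form with a simulation; it is an illustration under assumptions not stated in this excerpt, namely that the gain of user k is $c_k w_k$ with $w_k$ unit-mean exponential, that all $c_k$ are positive, and that z is a given positive threshold. All names are hypothetical.

```python
import math
import random


def outage_best_user(c, z):
    """Closed form: product over users of (1 - exp(-z / c_k)), all c_k > 0."""
    prob = 1.0
    for ck in c:
        prob *= 1.0 - math.exp(-z / ck)
    return prob


def outage_best_user_mc(c, z, n_samples=20000, seed=2):
    """Simulated probability that max_k c_k * w_k < z, with w_k ~ Exp(1) (assumed)."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_samples):
        best = max(ck * rng.expovariate(1.0) for ck in c)
        if best < z:
            hits += 1
    return hits / n_samples


c_example = [1.5, 1.0, 0.5]  # illustrative fading variances
closed_form = outage_best_user(c_example, z=1.0)
simulated = outage_best_user_mc(c_example, z=1.0)
```

With the assumed fading model, the simulated value agrees with the closed form up to Monte Carlo error.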
Table 4.1 Scaling of maximum throughput with perfect and no CSI at the base at SNR 0 dB.
CSI\M    1      2      3      5      10     20
pCSI     0.264  0.523  0.753  1.132  1.785  2.5265
noCSI    0.264  0.417  0.521  0.664  0.861  1.0521
Remark 4.5. From the proof of Corollary 4.7, it follows that there
must be at least two users in the cell, otherwise the delay limited sum
rate is zero.
5
Application of Matrix-Monotone Functions
in Wireless Communications
1 Since we fix the transmit covariance matrix, the term capacity can be confusing. Usually, the
capacity is the ultimate rate that is achieved by optimization of the transmitter including
the covariance matrix. However, we think of the linear precoding matrix Q as fixed and
talk about the capacity for this fixed precoding Q.
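\[
\begin{aligned}
&= \mathbb{E}_{\mathbf{H}} \log \prod_{k=1}^{n} \left(1 + \rho\lambda_k(\mathbf{H}\mathbf{Q}\mathbf{H}^H)\right) \\
&= \mathbb{E}_{\mathbf{H}} \sum_{k=1}^{n} \log\left(1 + \rho\lambda_k(\mathbf{H}\mathbf{Q}\mathbf{H}^H)\right) \\
&= \mathbb{E}_{\mathbf{H}} \operatorname{tr} \log\left(\mathbf{I} + \rho\mathbf{H}\mathbf{Q}\mathbf{H}^H\right).
\end{aligned} \tag{5.2}
\]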
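\[
\hat{\mathbf{x}} = \rho\mathbf{Q}\mathbf{H}^H \left(\tilde{\mathbf{Z}} + \rho\mathbf{H}\mathbf{Q}\mathbf{H}^H\right)^{-1} \mathbf{y}. \tag{5.4}
\]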
The average normalized sum MSE is defined as the trace of the error covariance
matrix of the estimation error in (5.5) [49, 65]
and its average over channel realizations is called average sum MSE.
Note that the MSE is convex in Q and concave in Z.
In SISO systems, the relationship between the rate, the SNR, and
the MSE is quite simple, i.e., C = log(1 + SNR) = log(1/MSE). In
MIMO systems, the connection is more complicated due to the spatial
dimension, e.g., the pairwise error probability (PEP) between X and
X̂ can be upper bounded by P (X → X̂) ≤ exp(−ρ||H(X − X̂)||2 ).
The connection between the performance measure mutual information
and MSE is highlighted in the next subsection.
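The SISO relation C = log(1 + SNR) = log(1/MSE) can be checked with a short simulation. The sketch below is an illustration under assumptions: a real-valued scalar model y = sqrt(SNR) x + n with unit-variance Gaussian x and n, and the linear MMSE estimator for that model. The names and the sample size are hypothetical.

```python
import math
import random


def siso_mmse(snr):
    """MMSE of a unit-variance Gaussian input at the given SNR: 1 / (1 + SNR)."""
    return 1.0 / (1.0 + snr)


def empirical_mse(snr, n_samples=20000, seed=3):
    """Simulate y = sqrt(snr) * x + n and apply the linear MMSE estimator."""
    rng = random.Random(seed)
    gain = math.sqrt(snr) / (1.0 + snr)  # LMMSE coefficient for this scalar model
    err = 0.0
    for _ in range(n_samples):
        x = rng.gauss(0.0, 1.0)
        y = math.sqrt(snr) * x + rng.gauss(0.0, 1.0)
        err += (x - gain * y) ** 2
    return err / n_samples


snr = 3.0
rate_from_snr = math.log(1.0 + snr)
rate_from_mse = math.log(1.0 / siso_mmse(snr))
mse_simulated = empirical_mse(snr)
```

Both rate expressions coincide, and the simulated error is close to 1/(1 + SNR) = 0.25 for the chosen SNR.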
Let us turn to the sum MSE expression. It can be written using the
matrix-valued function $\Phi_2(\mathbf{X}) = \mathbf{I} - \mathbf{X}[\mathbf{I} + \mathbf{X}]^{-1} = [\mathbf{I} + \mathbf{X}]^{-1}$ as
Proof. It has already been shown in Section 3.2 that $\log(1 + x)$ and
$\frac{1}{1+x}$ are matrix-monotone functions.
\[
\mathbf{y} = \sum_{k=1}^{K} \mathbf{H}_k \mathbf{x}_k + \mathbf{n} \tag{5.11}
\]
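with the receiver noise $\mathbf{n} \in \mathbb{C}^{n_R \times 1}$, which is AWGN, flat fading channel
matrices $\mathbf{H}_k \in \mathbb{C}^{n_R \times n_T}$, and transmit signals $\mathbf{x}_k \in \mathbb{C}^{n_T \times 1}$. We assume
uncorrelated noise with covariance $\sigma_n^2 \mathbf{I}_{n_R}$. The inverse noise power
is denoted by $\rho = \frac{1}{\sigma_n^2}$. Equation (5.11) can be rewritten in compact
form as
\[
\mathbf{y} = \mathbf{H}\mathbf{x} + \mathbf{n} \tag{5.12}
\]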
5.1.3.3 MIMO BC
Next, we study the downlink transmission from the base station to the
mobiles. The base station is equipped with $n_T$ transmit antennas and
each mobile has $n_R$ antennas. The channel matrices in the downlink
transmission correspond to the Hermitian channel matrices from the
uplink, i.e., $\mathbf{H}_i^{dl} = \mathbf{H}_i^H$ (reciprocity).
The received vector $\mathbf{y}_k$ at each mobile k can be written as
\[
\mathbf{y}_k = \mathbf{H}_k^H \mathbf{x}_k + \mathbf{H}_k^H \sum_{l=1,\, l \ne k}^{K} \mathbf{x}_l + \mathbf{n}_k \tag{5.16}
\]
The first part of Theorem 5.3 follows from Theorem 3.16. The
second part uses the optimality conditions that are described in
Appendix 6.2.3. The complete and detailed proof and illustrations for
the case Γ = I can be found in [20].
One proof can be found in [71]. The first part follows from invari-
ance properties of W and the second part follows from the optimality
conditions that are described in Appendix 6.2.3.
Remark 5.1. On the one hand, there is a closed form solution for
the optimal eigenvectors that solve (5.24), but on the other hand,
the optimal eigenvalues that solve (5.24) are only given in an indirect
form. However, the properties of the optimal eigenvalues are analyzed
based on the implicit characterization. Depending on the parameter ρ
in (5.24), the number of eigenvalues greater than zero is determined,
i.e., the higher ρ the more eigenvalues of Q are greater than zero. This
leads to the question where the transitions are, e.g., for which ρ there
is only one eigenvalue of Q greater than zero, i.e., Q has rank one.
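Corollary 5.1. The optimization problem (5.24) has a rank one solution
if and only if
\[
\lambda_2\, \mathbb{E}\left[\operatorname{tr} \mathbf{R}_R\, \phi'\!\left(\mathbf{R}_R^{1/2} \mathbf{w}_1 \mathbf{w}_1^H \mathbf{R}_R^{1/2}\right)\right]
\le \lambda_1\, \mathbb{E}\left[\mathbf{w}_1^H \mathbf{R}_R^{1/2}\, \phi'\!\left(\mathbf{R}_R^{1/2} \mathbf{w}_1 \mathbf{w}_1^H \mathbf{R}_R^{1/2}\right) \mathbf{R}_R^{1/2} \mathbf{w}_1\right].
\]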
[14, 145], expressions like (5.26) in which φ was the mutual information
were studied under different admissible sets Z and Q. For mutual
information, [106] studies a general game-theoretic framework for
min–max optimization.
The results in this section were stated in [17] without proofs.
Remark 5.2. Note that the argument of the RHS in (5.28) is inde-
pendent of the function φ.
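Proof. First, we prove that the minimax performance equals the minimax
performance of the expression
\[
\Phi_I^D = \inf_{\operatorname{tr} \boldsymbol{\Lambda}_Z \le n\sigma^2}\; \max_{\operatorname{tr} \boldsymbol{\Lambda}_Q \le P}\; \operatorname{tr} \phi\!\left(\boldsymbol{\Lambda}_Z^{-1/2} \boldsymbol{\Lambda}_H^{1/2} \boldsymbol{\Lambda}_Q \boldsymbol{\Lambda}_H^{1/2} \boldsymbol{\Lambda}_Z^{-1/2}\right). \tag{5.29}
\]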
The singular value decomposition of $\mathbf{H}$ is given by $\mathbf{H} = \mathbf{U}_H \boldsymbol{\Lambda}_H^{1/2} \mathbf{V}_H^H$.
At first, we show that for $\Phi_I \le \Phi_I^D$, we have
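\[
\begin{aligned}
&\max_{\operatorname{tr}(\mathbf{Q}) \le P} \operatorname{tr} \phi\!\left(\mathbf{Z}^{-1/2} \mathbf{H}\mathbf{Q}\mathbf{H}^H \mathbf{Z}^{-1/2}\right) \\
&\quad= \max_{\operatorname{tr}(\mathbf{Q}) \le P} \operatorname{tr} \phi\!\left(\mathbf{Z}^{-1/2} \mathbf{U}_H \boldsymbol{\Lambda}_H^{1/2} \mathbf{V}_H^H \mathbf{Q} \mathbf{V}_H \boldsymbol{\Lambda}_H^{1/2} \mathbf{U}_H^H \mathbf{Z}^{-1/2}\right) \\
&\quad= \max_{\operatorname{tr}(\mathbf{V}_H \mathbf{Q} \mathbf{V}_H^H) \le P} \operatorname{tr} \phi\!\left(\mathbf{Z}^{-1/2} \mathbf{U}_H \boldsymbol{\Lambda}_H^{1/2} \mathbf{Q} \boldsymbol{\Lambda}_H^{1/2} \mathbf{U}_H^H \mathbf{Z}^{-1/2}\right) \\
&\quad= \max_{\operatorname{tr}(\mathbf{Q}) \le P} \operatorname{tr} \phi\!\left(\mathbf{Z}^{-1/2} \mathbf{U}_H \boldsymbol{\Lambda}_H^{1/2} \mathbf{Q} \boldsymbol{\Lambda}_H^{1/2} \mathbf{U}_H^H \mathbf{Z}^{-1/2}\right).
\end{aligned}
\]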
Now, we choose $\hat{\mathbf{Z}} = \mathbf{U}_H \boldsymbol{\Lambda}_Z \mathbf{U}_H^H$ fixed, then it directly follows
\[
= \Phi_I^D. \tag{5.30}
\]
\[
\Phi_I \ge \Phi_I^D. \tag{5.34}
\]
Using Theorem 3.16, one can easily prove the following corollary.
Theorem 5.6. The value of (5.41) equals the value of (5.42), i.e.,
$\Phi_{III} = \Phi_{III}^D$.
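The KKT conditions (see (6.5) in Appendix 6.2.3) for the optimization
in (5.42) are derived via the Lagrangian
\[
\bar{L}(\mathbf{p}, \nu, \boldsymbol{\psi}) = \phi(\mathbf{H}^H \operatorname{diag}(\mathbf{p}) \mathbf{H}) - \nu\left(\sum_{k=1}^{n} p_k - P\right) + \sum_{k=1}^{n} p_k \psi_k
\]
as
\[
\begin{aligned}
\operatorname{tr} \operatorname{diag}(\mathbf{p}) &\le P \\
\operatorname{tr} \operatorname{diag}(\mathbf{p}) \operatorname{diag}(\boldsymbol{\psi}) &= 0 \\
\mathbf{h}_i\, \phi'(\mathbf{H}^H \operatorname{diag}(\mathbf{p}^*) \mathbf{H})\, \mathbf{h}_i^H - \nu + \psi_i &= 0.
\end{aligned} \tag{5.47}
\]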
The row vector $\mathbf{h}_i$ is the ith row vector of $\mathbf{H}$. Observe that the first
KKT condition in (5.47) is also fulfilled for all Z with fixed diagonal
entries. The KKT conditions in (5.47) correspond to the KKT conditions
in (5.46). This means that the value of $\Phi_{III}^D$ is equal to the value
of $\Phi_{IIIa}(\mathbf{Z})$ for
\[
\mathbf{Z}^* = \frac{1}{\lambda} \mathbf{H} \phi'(\mathbf{H}^H \mathbf{S}^* \mathbf{H}) \mathbf{H}^H + \frac{1}{\lambda} \boldsymbol{\Psi}. \tag{5.48}
\]
Note that the matrix Z ∗ has full rank regardless of the rank of S.
Furthermore, note that the matrix Z and the Lagrangian multiplier Ψ
to ensure positive semi-definiteness of S fulfill
tr S ∗ Ψ = 0
tr Z ∗ S ∗ = P.
\[
\lambda = \mathbf{h}_k\, \phi'(\mathbf{H}^H \operatorname{diag}(\mathbf{p}) \mathbf{H})\, \mathbf{h}_k^H \tag{5.49}
\]
for all k for which pk > 0. The columns k of the channel H that corre-
spond to pk = 0 can be omitted. For the minimax problem in ΦIII
this means that the effective channel can be reduced by canceling
all those columns k of H. Therefore, the rows of Z −1/2 that corre-
spond to these k do not influence the value of ΦIII . Especially the
kth eigenvalue of Z can be chosen arbitrarily according to the pos-
itiveness constraint and to the diagonal constraint. As a result, the
kth diagonal entry of Z can be smaller than σ 2 without increasing
the value of the minimax problem ΦIII . Finally, these kth diagonal
entries of Z can be “filled up” to σ 2 without decreasing the value
of ΦIII .
The next result shows that the optimal covariance matrices can
be found by iterative single-user performance optimization with col-
ored noise. This approach corresponds with the iterative waterfilling
approach in [164] for sum capacity optimization in which single-user
waterfilling is iteratively performed treating the other users as noise.
On the one hand, this approach provides insight into the structure of
the optimum transmit covariance matrices, and on the other hand, under
specific conditions this approach is computationally more efficient than
the joint optimization of the transmit covariance matrices. If the number
of users is large in comparison to the number of transmit antennas
of the users, the joint optimization is computationally more complex than
the iterative optimization of each user separately.
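The single-user step that iterative waterfilling repeats can be sketched for the special case of sum capacity. The example below is not the algorithm of [164] or [20]; it is the textbook waterfilling rule over parallel channels, and it assumes that the effective channel of one user (with the other users treated as noise) has already been reduced to positive scalar gains $g_k$. The function name and the gains are hypothetical.

```python
def waterfill(gains, total_power):
    """Textbook waterfilling: p_k = max(0, mu - 1/g_k) with sum_k p_k = total_power.

    Assumption: gains are the positive eigen-gains of one user's effective
    channel after whitening the interference from the other users.
    """
    inv = sorted(1.0 / g for g in gains)
    mu = 0.0
    # Try the largest active set first and shrink it until the water level
    # mu lies above the inverse gain of every active channel.
    for m in range(len(inv), 0, -1):
        mu = (total_power + sum(inv[:m])) / m
        if mu > inv[m - 1]:
            break
    return [max(0.0, mu - 1.0 / g) for g in gains]


example_gains = [2.0, 1.0, 0.1]
powers_high = waterfill(example_gains, total_power=1.0)
powers_low = waterfill(example_gains, total_power=0.2)
```

For the illustrative gains, a budget of 1.0 activates two channels with powers 0.75 and 0.25, while a budget of 0.2 puts all power on the strongest channel. This mirrors the observation that the number of active eigenvalues grows with the available power.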
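\[
\max\; \operatorname{tr} \phi\!\left(\mathbf{Z}_k + \rho \mathbf{H}_k \mathbf{Q}_k \mathbf{H}_k^H\right)
\quad \text{subject to } \operatorname{tr} \mathbf{Q}_k \le p_k \text{ and } \mathbf{Q}_k \succeq 0,\; 1 \le k \le K. \tag{5.57}
\]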
The proof is based again on the optimality conditions in (6.5) and can
be found in [20].
These conditions in (5.59) are fulfilled by the optimum power
allocation vector for fixed covariance matrices. Observe, that for all
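(1) For fixed $\hat{\rho}$, it is optimal to allocate the complete sum power
to user one, i.e., the user with the largest maximum channel
eigenvalue, if and only if the following condition is satisfied
\[
\lambda_{\max}\!\left(\mathbf{H}_2^H \phi'\!\left(\hat{\rho} \mathbf{H}_1 \mathbf{Q}_1^s(\hat{\rho}) \mathbf{H}_1^H\right) \mathbf{H}_2\right)
\le \lambda_{\max}\!\left(\mathbf{H}_1^H \phi'\!\left(\hat{\rho} \mathbf{H}_1 \mathbf{Q}_1^s(\hat{\rho}) \mathbf{H}_1^H\right) \mathbf{H}_1\right) \tag{5.60}
\]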
The proof is based on the optimality conditions (6.5) and can be found
in [20]. Furthermore, in [20], an algorithm based on alternating
optimization is developed, and its optimality at the fixed point and its
convergence are proved.
Fig. 5.3 Sum capacity optimization for MIMO MAC: Optimal power allocation and compar-
ison with suboptimal transmit strategy. (a) Single-user power level for user 1 and maximum
eigenvalue of channel matrix for user 2 as a function of the maximum transmit sum power.
(b) Sum capacity comparison between optimal transmit strategy compared to single-user
only strategy for multiple antenna MAC with two users and the channels in (5.63).
A = U ΛU H .
for x ∈ X .
Lemma 6.4 (Section 5.1.3 [24]). The dual function yields lower
bounds on the optimal value of the problem (6.2). For any $\boldsymbol{\lambda} \succeq 0$ and
any $\boldsymbol{\nu}$ we have
\[
g(\boldsymbol{\lambda}, \boldsymbol{\nu}) \le p^*.
\]
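Proof. Suppose $\bar{x}$ is a feasible point for the problem (6.2), i.e., $f_i(\bar{x}) \le 0$
and $h_i(\bar{x}) = 0$, and $\boldsymbol{\lambda} \succeq 0$. Obviously
\[
\sum_{k=1}^{m} \lambda_k f_k(\bar{x}) + \sum_{k=1}^{p} \nu_k h_k(\bar{x}) \le 0,
\]
since the terms in the first sum are nonpositive and terms in the second
sum are all zero. As a result,
\[
g(\boldsymbol{\lambda}, \boldsymbol{\nu}) = \inf_{x \in \mathcal{X}} L(x, \boldsymbol{\lambda}, \boldsymbol{\nu}) \le L(\bar{x}, \boldsymbol{\lambda}, \boldsymbol{\nu}) \le f_0(\bar{x}).
\]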
\[
d^* \le p^*.
\]
The difference $p^* - d^*$ between these two is the optimal duality gap, which is
always nonnegative.
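The lower-bound property of the dual function can be seen on a one-dimensional toy problem. The example below is an illustration chosen here, not one from the source: minimize $x^2$ subject to $1 - x \le 0$, whose optimal value is $p^* = 1$ at $x = 1$. The Lagrangian $x^2 + \lambda(1 - x)$ is minimized at $x = \lambda/2$, which gives the dual function $g(\lambda) = \lambda - \lambda^2/4$.

```python
def dual_function(lam):
    """g(lam) = inf_x [x^2 + lam * (1 - x)], attained at x = lam / 2."""
    return lam - lam * lam / 4.0


p_star = 1.0  # optimal value of: minimize x^2 subject to 1 - x <= 0

# Evaluate the dual function on a grid of nonnegative multipliers.
grid = [0.1 * i for i in range(0, 61)]
dual_values = [dual_function(lam) for lam in grid]
d_star_grid = max(dual_values)
duality_gap = p_star - d_star_grid
```

Every dual value on the grid stays below $p^* = 1$, and the largest one, at $\lambda = 2$, reaches it, so the duality gap of this convex toy problem is zero.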
Remark 6.4. Strong duality does not, in general, hold. Even convexity
of the primal problem is not sufficient. The conditions under which
strong duality holds are called constraint qualifications. One simple
constraint qualification is Slater’s condition: There is an x ∈ X s.t.
fi (x) < 0 for all i = 1, . . . , m and hi (x) = 0 for all i = 1, . . . , p. Slater’s
theorem says that strong duality holds, if Slater’s condition holds (and
the problem is convex).
we use Theorem 1 in [35]. One result in [35, Thm. 1] states, that (6.4)
is fulfilled, if X and Y are two compact Hausdorff spaces, f is for
every y ∈ Y lower semi-continuous, f is for every x ∈ X upper semi-
continuous, and f is convex on x and concave on y and if the sets X
and Y are convex, too. The following saddle-point interpretation can
also be applied
This also implies that the strong max–min property holds (and there-
fore the strong duality).
Part of the content of this book was presented during lectures at the
Technische Universität Berlin, Germany, within the course “Applied
Information Theory” from 2005–2007, at the Royal Institute of Technology,
Stockholm, Sweden, within the course “Advanced Digital Communications,”
and at the Beihang University, Beijing, China within the
course “Advanced Digital Communications” in 2007.
This work has been supported in part by the Bundesministerium
für Bildung und Forschung (BMBF) under Grant BU150 and in part
by the Swedish Research Foundation (Vetenskapsrådet) under Grant
623-2005-5359.
References
[27] G. Caire, G. Taricco, and E. Biglieri, “Optimum power control over fading
channels,” IEEE Transactions on Information Theory, vol. 45, no. 5, pp. 1468–
1489, July 1999.
[28] D. Chizhik, G. J. Foschini, M. J. Gans, and R. A. Valenzuela, “Keyholes,
correlations, and capacities of multielement transmit and receive antennas,”
IEEE Transactions on Wireless Communications, vol. 1, no. 2, pp. 361–368,
April 2002.
[29] D. Chizhik, G. J. Foschini, and R. A. Valenzuela, “Capacities of multielement
transmit and receive antennas,” IEE Electronics Letters, vol. 36, pp. 1099–
1100, June 2000.
[30] C.-N. Chuah, D. N. C. Tse, and J. M. Kahn, “Capacity scaling in MIMO
wireless systems under correlated fading,” IEEE Transactions on Information
Theory, vol. 48, no. 3, pp. 637–650, March 2002.
[31] T. M. Cover, “Broadcast channels,” IEEE Transactions on Information The-
ory, vol. 18, no. 1, pp. 2–14, January 1972.
[32] T. M. Cover, “Comments on broadcast channels,” IEEE Transactions on
Information Theory, vol. 44, no. 6, pp. 2524–2530, October 1998.
[33] T. M. Cover and J. A. Thomas, Elements of Information Theory. Wiley
& Sons, 1991.
[34] W. F. Donoghue Jr, Monotone Matrix Functions and Analytic Continuation.
Springer-Verlag, 1974.
[35] K. Fan, “Minimax theorems,” Proceedings of the National Academy of Sciences, vol. 39,
pp. 42–47, 1953.
[36] M. Fiedler, “Bounds for the determinant of the sum of hermitian matrices,”
Proceedings of the American Mathematical Society, vol. 30, no. 1, pp. 27–31,
September 1971.
[37] G. J. Foschini and M. J. Gans, “On limits of wireless communications in a
fading environment when using multiple antennas,” Wireless Personal Com-
munications, vol. 6, pp. 311–335, 1998.
[38] D. Gerlach and A. Paulraj, “Adaptive transmitting antenna methods for mul-
tipath environments,” Global Telecommunications Conference, vol. 1, pp. 425–
429, December 1994.
[39] A. Goel and A. Meyerson, “Simultaneous optimization via approximate
majorization for concave profits or convex costs,” Algorithmica, vol. 44,
pp. 301–323, 2006.
[40] A. J. Goldsmith, S. A. Jafar, N. Jindal, and S. Vishwanath, “Capacity limits
of MIMO channels,” IEEE Journal on Selected Areas in Communications,
vol. 21, no. 5, pp. 684–702, June 2003.
[41] J.-C. Guey, M. P. Fitz, M. R. Bell, and W.-Y. Kuo, “Signal design for
transmitter diversity wireless communication systems over Rayleigh fading
channels,” in 1996 IEEE Vehicular Technology Conference, pp. 136–140,
Atlanta, GA, 1996.
[42] D. Guo, Gaussian Channels: Information, Estimation and Multiuser Detec-
tion. PhD thesis, Princeton University, 2004.
[43] D. Guo, S. Shamai (Shitz), and S. Verdú, “Mutual information and minimum
mean-square error in Gaussian channels,” IEEE Transactions on Information
Theory, vol. 51, no. 4, pp. 1261–1282, April 2005.
[44] S. Hanly and D. Tse, “Multiaccess fading channels: Part II: Delay-limited
capacities,” IEEE Transactions on Information Theory, vol. 44, no. 7,
pp. 2816–2831, November 1998.
[45] F. Hansen, G. Ji, and J. Tomiyama, “Gaps between classes of matrix monotone
functions,” Bulletin of the London Mathematical Society, vol. 36, no. 1, pp. 53–58,
2004.
[46] F. Hansen and G. K. Pedersen, “Jensen’s inequality for operators and Löwner’s
theorem,” Mathematische Annalen, vol. 258, pp. 229–241, 1982.
[47] F. Hansen and G. K. Pedersen, “Perturbation formulas for traces on C*-algebras,”
Publication of the Research Institute for Mathematical Sciences,
Kyoto University, vol. 31, pp. 169–178, 1995.
[48] G. Hardy, J. E. Littlewood, and G. Pólya, Inequalities. Cambridge Mathemat-
ical Library, Second ed., 1952.
[49] T. Haustein and H. Boche, “On optimal power allocation and bit-loading
strategies for the MIMO transmitter with channel knowledge,” Proceedings of
IEEE ICASSP 2003, vol. IV, pp. 405–409, 2003.
[50] R. A. Horn and C. R. Johnson, Matrix Analysis. Cambridge University Press,
1985.
[51] R. A. Horn and C. R. Johnson, Topics in Matrix Analysis. Cambridge Uni-
versity Press, 1991.
[52] M. Horodecki, P. Horodecki, and R. Horodecki, “Separability of mixed states:
Necessary and sufficient conditions,” Physics Letters A, vol. 223, pp. 1–8, 1996.
[53] M. T. Ivrlac and J. A. Nossek, “Quantifying diversity and correlation in
Rayleigh fading MIMO communication systems,” Proceedings of IEEE ISSPIT,
vol. 1, pp. 158–161, 2003.
[54] S. Jafar and A. Goldsmith, “Multiple antenna capacity in correlated Rayleigh
fading with channel covariance information,” IEEE Transactions on Wireless
Communications, vol. 4, no. 3, pp. 990–997, May 2005.
[55] S. A. Jafar and A. Goldsmith, “On optimality of beamforming for multi-
ple antenna systems with imperfect feedback,” International Symposium on
Information Theory, p. 321, 2001.
[56] S. A. Jafar and A. Goldsmith, “Vector MAC capacity region with covari-
ance feedback,” IEEE International Symposium on Information Theory, p. 54,
2001.
[57] G. Jöngren, M. Skoglund, and B. Ottersten, “Combining beamforming and
orthogonal space-time block coding,” IEEE Transactions on Information The-
ory, vol. 48, no. 3, pp. 611–627, March 2002.
[58] E. Jorswieck, B. Ottersten, A. Sezgin, and A. Paulraj, “Guaranteed perfor-
mance region in fading orthogonal space-time coded broadcast channels,” Pro-
ceedings of IEEE ISIT, 2007.
[59] E. A. Jorswieck, Unified approach for optimisation of single-user and
multi-user multiple-input multiple-output wireless systems. PhD thesis,
Technical University of Berlin, Germany, September 2004. Available online:
http://edocs.tu-berlin.de/diss/2004/jorswieck eduard.htm.
[60] E. A. Jorswieck, Transmission strategies for the MIMO MAC, ch. 21, pp. 423–
442, Hindawi Publishing Corporation, 2005.
[93] A. W. Marshall and F. Proschan, “An inequality for convex functions involving
majorization,” Journal of Mathematical Analysis Applications, 1965.
[94] T. L. Marzetta and B. M. Hochwald, “Capacity of a mobile multiple-antenna
communication link in Rayleigh flat fading,” IEEE Transactions on Informa-
tion Theory, vol. 45, no. 1, pp. 139–157, January 1999.
[95] R. J. McEliece and W. E. Stark, “Channels with block interference,” IEEE
Transactions on Information Theory, vol. 30, no. 1, pp. 44–53, January
1984.
[96] R. U. Nabar, H. Bölcskei, V. Erceg, D. Gesbert, and A. J. Paulraj, “Perfor-
mance of multiantenna signaling techniques in the presence of polarization
diversity,” IEEE Transactions on Signal Processing, vol. 50, pp. 2553–2562,
2002.
[97] R. U. Nabar, H. Bölcskei, and A. J. Paulraj, “Outage properties of space-time
block codes in correlated Rayleigh or Ricean fading environments,” IEEE
ICASSP, pp. 2381–2384, May 2002.
[98] R. U. Nabar, H. Bölcskei, and A. J. Paulraj, “Diversity and outage performance
in Ricean MIMO channels,” IEEE Transactions on Wireless Communications,
vol. 4, no. 5, pp. 2519–2532, September 2005.
[99] A. Narula, M. J. Lopez, M. D. Trott, and G. W. Wornell, “Efficient use of
side information in multiple-antenna data transmission over fading channels,”
IEEE Journal on Selected Areas in Communications, vol. 16, no. 8, pp. 1423–
1436, October 1998.
[100] A. Narula, M. J. Trott, and G. W. Wornell, “Performance limits of coded
diversity methods for transmitter antenna arrays,” IEEE Transactions on
Information Theory, vol. 45, no. 7, pp. 2418–2433, November 1999.
[101] M. A. Nielsen, “Conditions for a class of entanglement transformations,” Phys-
ical Review Letters, vol. 83, pp. 436–439, 1999.
[102] H. Özcelik, Indoor MIMO Channel Models. PhD thesis, Technische Universität
Wien, 2004.
[103] H. Özcelik, M. Herdin, W. Weichselberger, J. Wallace, and E. Bonek, “Deficiencies
of the Kronecker MIMO radio channel model,” IEE Electronics Letters,
vol. 39, no. 16, pp. 1209–1210, 2003.
[104] D. P. Palomar, A unified framework for communications through MIMO chan-
nels. PhD thesis, Universitat Politécnica de Catalunya, 2003.
[105] D. P. Palomar, J. M. Cioffi, and M. A. Lagunas, “Joint TX-RX beamforming
design for multicarrier MIMO channels: A unified framework for convex opti-
mization,” IEEE Transactions on Signal Processing, vol. 51, no. 9, pp. 2381–
2401, September 2003.
[106] D. P. Palomar, J. M. Cioffi, and M. A. Lagunas, “Uniform power allocation in
MIMO channels: A game-theoretic approach,” IEEE Transactions on Infor-
mation Theory, vol. 49, no. 7, pp. 1707–1727, July 2003.
[107] D. P. Palomar and Y. Jiang, “MIMO transceiver design via majorization the-
ory,” Foundations and Trends in Communications and Information Theory,
vol. 3, no. 4–5, pp. 331–551, 2007.
[108] D. P. Palomar and S. Verdú, “Gradient of mutual information in linear vector
Gaussian channels,” IEEE Transactions on Information Theory, vol. 52, no. 1,
pp. 141–154, January 2006.