6, JUNE 2013
Abstract: We consider precoder design for maximizing the weighted sum rate (WSR) of successive zero-forcing dirty paper coding (SZF-DPC). For this problem, the existing precoder designs often assume a sum power constraint (SPC) and rely on the singular value decomposition (SVD). The SVD-based designs are known to be optimal but require high complexity. We first propose a low-complexity optimal precoder design for SZF-DPC under SPC, using the QR decomposition. Then, we propose an efficient numerical algorithm to find the optimal precoders subject to per-antenna power constraints (PAPCs). To this end, the precoder design for PAPCs is formulated as an optimization problem with a rank constraint on the covariance matrices. A well-known approach to solve this problem is to relax the rank constraints and solve the relaxed problem. Interestingly, for SZF-DPC, we are able to prove that the rank relaxation is tight. Consequently, the optimal precoder design for PAPCs is computed by solving the relaxed problem, for which we propose a customized interior-point method that exhibits a superlinear convergence rate. Two suboptimal precoder designs are also presented and compared to the optimal ones. We also show that the proposed numerical method is applicable for finding the optimal precoders for block diagonalization scheme.

Index Terms: MIMO systems, broadcast channels, dirty paper coding, multiuser multi-antenna communication, zero-forcing.

Manuscript received February 1, 2013; revised April 5, 2013. The editor coordinating the review of this paper and approving it for publication was D. Gunduz.
L.-N. Tran and M. Juntti are with the Centre for Wireless Communications and the Department of Communications Engineering, University of Oulu, Finland (e-mail: {ltran, markku.juntti}@ee.oulu.fi). L.-N. Tran was with the Signal Processing Laboratory, ACCESS Linnaeus Center, KTH Royal Institute of Technology, SE-100 44 Stockholm, Sweden.
M. Bengtsson and B. Ottersten are with the Signal Processing Laboratory, ACCESS Linnaeus Center, KTH Royal Institute of Technology, SE-100 44 Stockholm, Sweden (e-mail: {mats.bengtsson, bjorn.ottersten}@ee.kth.se). B. Ottersten is also with the Interdisciplinary Center for Security, Reliability and Trust (SnT), University of Luxembourg, L-1359 Luxembourg-Kirchberg, Luxembourg (e-mail: bjorn.ottersten@uni.lu).
This research has been supported by Tekes, the Finnish Funding Agency for Technology and Innovation, Nokia Siemens Networks, Renesas Mobile Europe, Elektrobit, Xilinx, Academy of Finland, and by the European Research Council under the European Community's Seventh Framework Programme (FP7/2007-2013) / ERC grant agreement no. 228044. Parts of this paper were presented at the IEEE International Conference on Communications, Ottawa, Canada, June, 2012 [1], [2].
Digital Object Identifier 10.1109/TCOMM.2013.043013.130100
0090-6778/13$31.00 2013 IEEE

I. INTRODUCTION

Over the last decade, multiple-input multiple-output (MIMO) transmission techniques have drawn a lot of attention due to their capability of boosting the channel capacity without the need for additional bandwidth or power [3]-[5]. An important scenario is the transmission from a multi-antenna transmitter, e.g., a BS in cellular networks, to several receivers, possibly with multiple antennas. This scenario is referred to as the MIMO broadcast channel (BC) (alternatively, downlink channel). It is now well known that dirty paper coding (DPC) achieves the capacity region of the MIMO BC [6]. Although DPC is the optimal transmission scheme for MIMO BCs, finding the transmit covariances that achieve a point in the capacity region is generally of high complexity. For example, to find the sum-capacity point, several numerical algorithms were proposed, e.g., in [7]-[9], which are basically based on the iterative water-filling algorithm. Thus, it is of great interest to develop simplified DPC-based precoding techniques, where optimal precoders are easy to find.

Successive zero-forcing dirty paper coding (SZF-DPC), introduced in [10] for single-antenna receivers, and later generalized in [11] for multiple-antenna receivers, combines the zero-forcing (ZF) technique with DPC. In the SZF-DPC scheme, for the $k$th user, the interference caused by users 1 to $k-1$ is canceled by DPC, and that caused by users $k+1$ to $K$ is eliminated by the ZF technique, where $K$ is the number of users. To be specific, let $\mathbf{W}_k$ be the precoder of the $k$th user. Then, the ZF constraints impose that $\mathbf{H}_j \mathbf{W}_k = \mathbf{0}$ for all $j < k$, where $\mathbf{H}_j$ is the channel matrix between the BS and the $j$th user. In this way, SZF-DPC decomposes a MIMO BC into a group of parallel interference-free point-to-point channels, which simplifies the problem of precoder design. It was shown in [10], [11] that SZF-DPC performs very close to DPC, and that the optimal precoders can be computed analytically. Thus, SZF-DPC can also be used as a benchmark for comparison purposes.

It is convenient to consider a cascade structure for $\mathbf{W}_k$, i.e., $\mathbf{W}_k = \mathbf{B}_k \mathbf{D}_k$, where $\mathbf{B}_k$ is designed to satisfy the ZF constraints, and $\mathbf{D}_k$ is optimized under the power constraints. The precoder design methods for SZF-DPC differ in how $\mathbf{B}_k$ is calculated. Since $\mathbf{W}_k$ must lie in $\mathcal{N}(\bar{\mathbf{H}}_k)$ to satisfy the zero-interference constraints, where $\bar{\mathbf{H}}_k = [\mathbf{H}_1^H \; \mathbf{H}_2^H \; \cdots \; \mathbf{H}_{k-1}^H]^H$, and $\mathcal{N}(\mathbf{H})$ denotes the null space of $\mathbf{H}$, it is optimal to design $\mathbf{B}_k$ as an orthonormal basis of $\mathcal{N}(\bar{\mathbf{H}}_k)$. This fact was exploited in [11], where $\mathbf{B}_k$ is an orthonormal basis of $\mathcal{N}(\bar{\mathbf{H}}_k)$, which is obtained from the singular value decomposition (SVD) of $\bar{\mathbf{H}}_k$. However, finding a basis of $\mathcal{N}(\bar{\mathbf{H}}_k)$ via SVD is computationally costly, since the size of $\bar{\mathbf{H}}_k$ grows with the user index. We note that the SVD-based design in [11] needs to calculate a series of SVDs to find the precoders for all users.
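The sequential null-space computation attributed to [11] above can be illustrated with a short NumPy sketch. It is only an illustration under stated assumptions: random complex channels, the natural user ordering, and a hypothetical helper name `null_space_basis_svd`; none of these come from the paper, and no basis is needed for the first user because it has no ZF constraint.

```python
import numpy as np

def null_space_basis_svd(H_bar, tol=1e-10):
    """Return an orthonormal basis (as columns) of the null space of H_bar.

    Hypothetical helper: uses a full SVD and keeps the right singular
    vectors that belong to (numerically) zero singular values.
    """
    _, s, Vh = np.linalg.svd(H_bar, full_matrices=True)
    rank = int(np.sum(s > tol * s[0])) if s.size else 0
    return Vh[rank:].conj().T

# Assumed toy setup: N transmit antennas, three users with two antennas each.
rng = np.random.default_rng(0)
N, n = 6, [2, 2, 2]
H = [rng.standard_normal((nk, N)) + 1j * rng.standard_normal((nk, N)) for nk in n]

B = []
for k in range(len(H)):
    if k == 0:
        # User 1 has no ZF constraint, so any unitary basis works.
        B.append(np.eye(N, dtype=complex))
    else:
        # One SVD per user, applied to the stacked channels of users 1..k-1.
        H_bar = np.vstack(H[:k])
        B.append(null_space_basis_svd(H_bar))
```

Each pass through the loop factorizes a taller stacked matrix, which is the growth in cost with the user index that the text describes.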
TRAN et al.: WEIGHTED SUM RATE MAXIMIZATION FOR MIMO BROADCAST CHANNELS USING DIRTY PAPER CODING AND ZERO-FORCING . . . 2363
Herein, we propose a low-complexity precoder design for SZF-DPC, using only a single QR decomposition (QRD). First, we note that it is computationally cheaper to calculate the null space of a matrix using a QRD instead of an SVD. Thus, a natural way to reduce the complexity of the SVD-based method is to find a basis of $\mathcal{N}(\bar{\mathbf{H}}_k)$ using the QRD, instead of the SVD. This approach is referred to as the natural QRD-based design (NQRD-based design). Still, the NQRD-based design computes several QR decompositions. As one of our main contributions, we propose a precoder design that results from applying a QRD to a matrix composed of the channel matrices of all users. More explicitly, only a single QRD is performed to find all $\mathbf{B}_k$'s, instead of separately computing $\mathcal{N}(\bar{\mathbf{H}}_k)$ to find $\mathbf{B}_k$ for each $k$ as in the SVD-based design. Thus, the proposed method requires much lower complexity, compared to the SVD- and NQRD-based designs. Even though the columns of $\mathbf{B}_k$ in the proposed method do not span $\mathcal{N}(\bar{\mathbf{H}}_k)$, we will prove that it is also an optimal precoder design for SZF-DPC. Particularly, the proposed method reduces to the QRD-based design in [10] for multiple-input single-output (MISO) BCs, meaning that the QRD-based design in [10] is optimal for MISO BCs.

The precoder designs for SZF-DPC mentioned above assume a sum-power constraint (SPC), for which the optimal precoders admit a water-filling solution. In practice, individual per-antenna power constraints (PAPCs) are more useful than the SPC, since each antenna is equipped with its own power amplifier. We notice that the precoder designs for channel inversion or block diagonalization with PAPCs in [12], [13] are applicable to SZF-DPC since SZF-DPC can be viewed as a relaxation of these schemes. Particularly, a numerical algorithm based on a dual decomposition method was proposed in [13], using a subgradient method to find the optimal precoders for block diagonalized systems.

Since subgradient methods in general show slow convergence, a second contribution of this paper is to propose a more efficient solution. To the best of our knowledge, no analytical form for the optimal precoder design for SZF-DPC with PAPCs has been reported. Hence, we resort to numerical algorithms. To this end, we first formulate the precoder design as a rank-constrained optimization problem, to which the rank relaxation technique is a popular approach [14]. Basically, this technique drops the rank constraints to form a relaxed problem, which is (very often) convex and, thus, easier to solve. In general, however, such an approach may yield suboptimal solutions to the original problem since the rank constraints are not guaranteed when solving the relaxed problem. Interestingly, due to the special structure of SZF-DPC, we are able to show that all optimal solutions of the relaxed problem always satisfy the rank constraints. In other words, the relaxation is tight, and both the original and the relaxed problems are equivalent. Then, we propose a numerical algorithm to solve the relaxed problem, based on the barrier method. As a part of the proposed algorithm, we recognize that all the power constraints must be active at the optimum. This facilitates finding the optimal solutions numerically since equality constraints are easier to cope with. By numerical examples, we demonstrate that the proposed algorithm achieves a remarkably fast convergence rate. In addition, we present two suboptimal precoder designs which have lower computational complexity, and perform very close to the optimal one.

As mentioned before, the precoder design for SZF-DPC is closely related to that for block diagonalization (BD) [15], [16]. Of the two precoding schemes, BD suffers from a stricter ZF condition than SZF-DPC. More explicitly, the ZF constraints for BD force the precoder of a user to lie in the intersection of null spaces of all other users' channel matrices. Subsequently, in contrast to SZF-DPC, not all the antennas in the BD transmit scheme will use full power, since the number of degrees of freedom could be too low. In other words, some of the PAPCs are nonbinding. In spite of this difference, we show that the proposed precoder design method for SZF-DPC can be slightly modified to solve the problem of the precoder design for BD, leading to a numerical method that converges faster than the existing design using the subgradient method.

The remainder of the paper is organized as follows. In Section II, we briefly review the system model and the precoder design for SZF-DPC schemes. Section III deals with the precoder design with the SPC. In this section, we present an optimal precoder design, and analyze the computational complexity. The precoder design with PAPCs is addressed in Section IV, where we present a specialized numerical algorithm to find the optimal precoders. We also introduce two suboptimal designs which have lower computational complexity, but perform close to the optimal one. In Section V, we address a precoder design for BD under PAPCs, extending the proposed algorithm for SZF-DPC. Numerical results are given in Section VI, followed by some conclusions in Section VII.

Notation: Standard notations are used in this paper. Bold lower and upper case letters represent vectors and matrices, respectively; $\mathbf{H}^H$ and $\mathbf{H}^T$ are Hermitian and normal transpose of $\mathbf{H}$, respectively; $\|\mathbf{H}\|_F$ and $|\mathbf{H}|$ are the Frobenius norm and determinant of $\mathbf{H}$, respectively. $\mathbf{I}_M$ represents an $M \times M$ identity matrix. $\mathcal{R}(\mathbf{H})$ and $\mathcal{N}(\mathbf{H})$ denote the column space and the null space of $\mathbf{H}$, respectively. $\mathrm{diag}(\mathbf{x})$, where $\mathbf{x}$ is a vector, denotes a diagonal matrix with elements $\mathbf{x}$; $\mathrm{diag}(\mathbf{H})$, where $\mathbf{H}$ is a square matrix, denotes a vector of its diagonal elements. $[\mathbf{x}]_i$ is the $i$th entry of vector $\mathbf{x}$; $[\mathbf{H}]_{i,j}$ is the entry at the $i$th row and $j$th column of $\mathbf{H}$.

II. MIMO BCs AND SZF-DPC

Consider a single-cell MIMO BC with a base station (BS) and $K$ multiple antenna users. The channel between the BS and the $k$th user is generally modeled by a matrix $\mathbf{H}_k \in \mathbb{C}^{n_k \times N}$, where $N$ and $n_k \ge 1$ are the number of antennas at the BS and the $k$th user, respectively. The received signal at the $k$th user is given by

$$\mathbf{y}_k = \mathbf{H}_k \mathbf{x}_k + \sum_{j \ne k} \mathbf{H}_k \mathbf{x}_j + \mathbf{n}_k \qquad (1)$$

where $\mathbf{x}_k \in \mathbb{C}^{N \times 1}$ denotes the transmitted signals for the $k$th user, and $\mathbf{n}_k \in \mathbb{C}^{n_k \times 1}$ is assumed to be a complex-Gaussian noise vector with zero mean and covariance matrix $\mathbf{I}_{n_k}$. We can further write

$$\mathbf{x}_k = \mathbf{W}_k \mathbf{s}_k \qquad (2)$$
2364 IEEE TRANSACTIONS ON COMMUNICATIONS, VOL. 61, NO. 6, JUNE 2013
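where $\mathbf{s}_k \in \mathbb{C}^{L_k \times 1}$, $L_k \le \min(N, n_k)$ with $E[\mathbf{s}_k \mathbf{s}_k^H] = \mathbf{I}$, and $\mathbf{W}_k \in \mathbb{C}^{N \times L_k}$ are the vector of transmitted symbols and the precoder of the $k$th user, respectively. Accordingly, (1) can be rewritten as

$$\mathbf{y}_k = \mathbf{H}_k \mathbf{W}_k \mathbf{s}_k + \mathbf{H}_k \sum_{j<k} \mathbf{W}_j \mathbf{s}_j + \mathbf{H}_k \sum_{j>k} \mathbf{W}_j \mathbf{s}_j + \mathbf{n}_k. \qquad (3)$$

It is well known that DPC is a capacity achieving transmission strategy for MIMO BCs. For the $k$th user, the BS views the interference term $\sum_{j<k} \mathbf{H}_k \mathbf{W}_j \mathbf{s}_j$ as known non-causally, so it can be perfectly eliminated using DPC. As a result, the resulting data rate of the $k$th user is given by

$$R_k^{\mathrm{DPC}} = \log \frac{\left|\mathbf{I} + \sum_{j \ge k} \mathbf{H}_k \mathbf{S}_j \mathbf{H}_k^H\right|}{\left|\mathbf{I} + \sum_{j > k} \mathbf{H}_k \mathbf{S}_j \mathbf{H}_k^H\right|} \qquad (4)$$

where $\mathbf{S}_j = \mathbf{W}_j \mathbf{W}_j^H$ is the transmit covariance matrix for the $j$th user. Since (4) is non-convex with respect to $\mathbf{S}_j$, it is generally difficult to deal with. For the sum rate maximization problem for MIMO BCs under a total power constraint, optimal $\mathbf{S}_j$'s cannot be found analytically, and iterative numerical algorithms are involved [7]-[9].

To overcome the difficulty in finding optimal covariance matrices in DPC, the authors in [11] proposed SZF-DPC, which admits a closed-form solution for optimal precoders. In fact, SZF-DPC is a generalization of the zero-forcing DPC (ZF-DPC) in [10], devised for single-antenna receivers. For SZF-DPC, the interference term $\sum_{j<k} \mathbf{H}_k \mathbf{W}_j \mathbf{s}_j$ in (3) is canceled by DPC, and the interference term $\sum_{j>k} \mathbf{H}_k \mathbf{W}_j \mathbf{s}_j$ is eliminated by designing $\mathbf{W}_j$ such that

$$\mathbf{H}_k \mathbf{W}_j = \mathbf{0} \quad \text{for all } j > k. \qquad (5)$$

Accordingly, the resulting data rate of the $k$th user for SZF-DPC is given by

$$R_k^{\mathrm{SZF\text{-}DPC}} = \log \left|\mathbf{I} + \mathbf{H}_k \mathbf{S}_k \mathbf{H}_k^H\right|. \qquad (6)$$

The goal of the precoder design for SZF-DPC schemes is to find $\mathbf{W}_k$'s that maximize a performance measure under the ZF constraints in (5), and additional constraints on the transmit power. In this paper, we consider the precoder design for SZF-DPC to maximize the WSR under a sum power and per-antenna power constraints.

III. PRECODER DESIGNS FOR SZF-DPC WITH SPC

In this section, we address the maximization problem for SZF-DPC under a SPC, which is mathematically formulated as

$$\begin{aligned} \underset{\mathbf{W}_k}{\text{maximize}} \quad & \sum_{k=1}^{K} \omega_k \log \left|\mathbf{I} + \mathbf{H}_k \mathbf{W}_k \mathbf{W}_k^H \mathbf{H}_k^H\right| \\ \text{subject to} \quad & \mathbf{H}_j \mathbf{W}_k = \mathbf{0}, \ \forall j < k \\ & \sum_{k=1}^{K} \mathrm{tr}(\mathbf{W}_k \mathbf{W}_k^H) \le P \end{aligned} \qquad (7)$$

... the outer boundary of the achievable rate region of SZF-DPC. In [11], an approach to find the optimal precoders $\mathbf{W}_k$'s for (7) was proposed using the SVD, which is described next.

A. SVD-based Design

To solve (7), we can write $\mathbf{W}_k = \bar{\mathbf{B}}_k \bar{\mathbf{D}}_k$, where $\bar{\mathbf{B}}_k$ is designed to remove the interference, and $\bar{\mathbf{D}}_k$ is adjusted to maximize the WSR under some power constraints. Obviously, the ZF constraints in (7) mean that $\bar{\mathbf{B}}_k$ must lie in $\mathcal{N}(\bar{\mathbf{H}}_k)$, where

$$\bar{\mathbf{H}}_k = [\mathbf{H}_1^T \; \mathbf{H}_2^T \; \cdots \; \mathbf{H}_{k-1}^T]^T \in \mathbb{C}^{\left(\sum_{i=1}^{k-1} n_i\right) \times N}. \qquad (8)$$

Like DPC, the user ordering also affects the achievable sum rate of SZF-DPC [10], [11]. Note that a different user ordering results in a different $\bar{\mathbf{H}}_k$ in (8), and accordingly different $\mathcal{N}(\bar{\mathbf{H}}_k)$. More specifically, the singular values of the effective channel for the $k$th user are changed with a new user ordering, providing higher or lower sum rate. Obviously, the best sum rate can be obtained by searching all possible combinations of user ordering. For the sake of simplicity, we assume a natural user ordering in this paper. In [11], $\bar{\mathbf{B}}_k$ is chosen to be an orthonormal basis of $\mathcal{N}(\bar{\mathbf{H}}_k)$, which can be found by an SVD of $\bar{\mathbf{H}}_k$. To be specific, denote the full-size SVD of $\bar{\mathbf{H}}_k$ as

$$\bar{\mathbf{H}}_k = \bar{\mathbf{U}}_k \bar{\mathbf{\Sigma}}_k [\bar{\mathbf{V}}_k \; \bar{\mathbf{B}}_k]^H \qquad (9)$$

where $\bar{\mathbf{\Sigma}}_k$ has the same dimensions as $\bar{\mathbf{H}}_k$. Then columns of $\bar{\mathbf{B}}_k \in \mathbb{C}^{N \times \bar{n}_k}$, $\bar{n}_k = N - \sum_{i=1}^{k-1} n_i$, form an orthonormal basis of $\mathcal{N}(\bar{\mathbf{H}}_k)$. The condition for the BS to support all users is that $\mathcal{N}(\bar{\mathbf{H}}_k)$ has a dimension larger than zero, for all $k$. Assuming the rows of all $\mathbf{H}_k$'s to be linearly independent, this requirement is equivalent to $N > \sum_{i=1}^{K-1} n_i$. When the number of users is large, a user scheduling algorithm is needed to choose a set of users that satisfies the above condition and can exploit the multiuser diversity gain [17]. In this paper, we assume $N > \sum_{i=1}^{K-1} n_i$ and focus on the precoder design. Since $\mathrm{tr}(\mathbf{W}_k \mathbf{W}_k^H) = \mathrm{tr}(\bar{\mathbf{B}}_k \bar{\mathbf{D}}_k \bar{\mathbf{D}}_k^H \bar{\mathbf{B}}_k^H) = \mathrm{tr}(\bar{\mathbf{D}}_k \bar{\mathbf{D}}_k^H)$, the maximization problem in (7) reduces to

$$\begin{aligned} \underset{\bar{\mathbf{D}}_k}{\text{maximize}} \quad & \sum_{k=1}^{K} \omega_k \log \left|\mathbf{I} + \tilde{\mathbf{H}}_k \bar{\mathbf{D}}_k \bar{\mathbf{D}}_k^H \tilde{\mathbf{H}}_k^H\right| \\ \text{subject to} \quad & \sum_{k=1}^{K} \mathrm{tr}(\bar{\mathbf{D}}_k \bar{\mathbf{D}}_k^H) \le P \end{aligned} \qquad (10)$$

where $\tilde{\mathbf{H}}_k = \mathbf{H}_k \bar{\mathbf{B}}_k$. Then, $\bar{\mathbf{D}}_k$ can be easily found with water-filling over non-zero singular values of $\tilde{\mathbf{H}}_k$. More specifically, define a compact SVD of $\tilde{\mathbf{H}}_k$ as

$$\tilde{\mathbf{H}}_k = \mathbf{H}_k \bar{\mathbf{B}}_k = \tilde{\mathbf{U}}_k \tilde{\mathbf{\Sigma}}_k \tilde{\mathbf{V}}_k^H \in \mathbb{C}^{n_k \times \bar{n}_k} \qquad (11)$$

where $\tilde{\mathbf{\Sigma}}_k$ is an $L_k \times L_k$ diagonal matrix that contains all non-zero singular values of $\tilde{\mathbf{H}}_k$, $L_k = \min(n_k, \bar{n}_k)$, and $\tilde{\mathbf{V}}_k$ contains the $L_k$ singular vectors of $\tilde{\mathbf{H}}_k$. To maximize the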
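Water-filling over the non-zero singular values in (10)-(11) can be illustrated with a small sketch. It assumes that, after the SVD in (11), the problem reduces to allocating powers over scalar streams whose gains are the squared singular values, each stream carrying the weight of its user. The function name `weighted_water_filling` and the bisection on the water level are illustrative choices, not taken from the paper.

```python
import numpy as np

def weighted_water_filling(gains, weights, P, iters=200):
    """Maximize sum_i w_i * log(1 + g_i * p_i) s.t. sum_i p_i <= P, p_i >= 0.

    The KKT conditions give p_i = max(w_i / nu - 1 / g_i, 0); the multiplier
    nu > 0 is found by bisection so that the total power equals P.
    """
    gains = np.asarray(gains, dtype=float)
    weights = np.asarray(weights, dtype=float)
    lo, hi = 0.0, float(np.max(weights * gains))  # at nu = hi every p_i is 0
    for _ in range(iters):
        nu = 0.5 * (lo + hi)
        p = np.maximum(weights / nu - 1.0 / gains, 0.0)
        if p.sum() > P:
            lo = nu   # too much power used: raise the multiplier
        else:
            hi = nu   # feasible: try a smaller multiplier
    return np.maximum(weights / hi - 1.0 / gains, 0.0)

# Usage: three streams with equal weights and unit total power.
p = weighted_water_filling([4.0, 1.0, 0.25], [1.0, 1.0, 1.0], 1.0)
```

With equal weights this reduces to classical water-filling; the weakest stream in the usage example receives no power because its inverse gain lies above the water level.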
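The set of optimal $\{\mathbf{\Lambda}_k\}$ to (12) can be easily calculated using the water-filling algorithm. The resulting precoder for the $k$th user is given by $\mathbf{W}_k = \bar{\mathbf{B}}_k \tilde{\mathbf{V}}_k \mathbf{\Lambda}_k^{1/2}$. Since the columns of $\bar{\mathbf{B}}_k$ span $\mathcal{N}(\bar{\mathbf{H}}_k)$, it is not difficult to see that the SVD-based method is optimal for (7). However, this method employs SVD to find the null space of a $(\sum_{i=1}^{k-1} n_i) \times N$ matrix for the $k$th user, which is computationally costly.

A simple method to reduce the complexity of the SVD-based design is to replace the SVD by the QRD, which has lower and deterministic complexity. Specifically, applying a QRD to $\bar{\mathbf{H}}_k$ gives

$$\bar{\mathbf{H}}_k = [\bar{\mathbf{L}}_k \; \mathbf{0}][\bar{\mathbf{Q}}_k \; \hat{\mathbf{B}}_k]^H \qquad (13)$$

where $\bar{\mathbf{L}}_k \in \mathbb{C}^{\sum_{i=1}^{k-1} n_i \times \sum_{i=1}^{k-1} n_i}$ is a lower triangular matrix, $\bar{\mathbf{Q}}_k \in \mathbb{C}^{N \times \sum_{i=1}^{k-1} n_i}$ contains an orthonormal basis of $\mathcal{R}(\bar{\mathbf{H}}_k^H)$, and $\hat{\mathbf{B}}_k \in \mathbb{C}^{N \times \bar{n}_k}$ forms an orthonormal basis of $\mathcal{N}(\bar{\mathbf{H}}_k)$, i.e., $\bar{\mathbf{H}}_k \hat{\mathbf{B}}_k = \mathbf{0}$, and $\hat{\mathbf{B}}_k^H \hat{\mathbf{B}}_k = \mathbf{I}$. This simple method is referred to as the natural QRD (NQRD)-based design in the sequel. Note that $\hat{\mathbf{B}}_k$ in (13) and $\bar{\mathbf{B}}_k$ in (9) are both orthonormal bases of $\mathcal{N}(\bar{\mathbf{H}}_k)$, and, thus, the NQRD- and SVD-based precoder designs are equivalent.

Apparently, the NQRD-based design can lower the complexity of the SVD-based method, but its complexity is still high because, like the SVD-based design, the QRD is sequentially applied to a matrix of large dimensions. In the following, we propose another optimal precoder design, which is derived directly from a single QRD of $\mathbf{H}$ in (14).

B. Generalized QRD-based Design (GQRD-based design)

The precoding matrix $\bar{\mathbf{B}}_k$ in the NQRD- and SVD-based designs is found to be a basis of $\mathcal{N}(\bar{\mathbf{H}}_k)$ for each $k$. Due to the concatenated structure of $\bar{\mathbf{H}}_k$, the complexity of these methods increases with $k$. Note that $\mathrm{rank}(\bar{\mathbf{B}}_k) = \dim(\mathcal{N}(\bar{\mathbf{H}}_k)) = \bar{n}_k$. Since $\mathrm{rank}(\mathbf{H}_k \bar{\mathbf{B}}_k) = \mathrm{rank}(\mathbf{H}_k) = n_k \le \mathrm{rank}(\bar{\mathbf{B}}_k)$, we can reduce the dimension of $\bar{\mathbf{B}}_k$ as long as its singular values are aligned with those of $\mathbf{H}_k$. This fact suggests that the precoding matrix need not be a basis of $\mathcal{N}(\bar{\mathbf{H}}_k)$ to be an optimal design. In what follows, we propose a precoder design based on a single QRD, where the columns of the precoding matrix do not span $\mathcal{N}(\bar{\mathbf{H}}_k)$. For notational convenience, let us stack the channel matrix of all users in a matrix $\mathbf{H}$ defined as

$$\mathbf{H} = [\mathbf{H}_1^H \; \mathbf{H}_2^H \; \cdots \; \mathbf{H}_K^H]^H \in \mathbb{C}^{n_R \times N} \qquad (14)$$

where $n_R = \sum_{k=1}^{K} n_k$ is the total number of receive antennas, and all precoders in a matrix $\mathbf{W}$ given by

$$\mathbf{W} = [\mathbf{W}_1 \; \mathbf{W}_2 \; \cdots \; \mathbf{W}_K] \in \mathbb{C}^{N \times L_R} \qquad (15)$$

where $L_R = \sum_{k=1}^{K} L_k$ is the total number of data streams that the BS is able to transmit to all users in the system. The ZF constraints in (5) can be equivalently expressed as

$$\mathbf{H}\mathbf{W} = \begin{bmatrix} \hat{\mathbf{H}}_1 & & & \\ \times & \hat{\mathbf{H}}_2 & & \\ \vdots & \ddots & \ddots & \\ \times & \cdots & \times & \hat{\mathbf{H}}_K \end{bmatrix} \qquad (16)$$

which is a lower block triangular matrix, where $\hat{\mathbf{H}}_k = \mathbf{H}_k \mathbf{W}_k \in \mathbb{C}^{n_k \times L_k}$ is the effective channel matrix of the $k$th user. In the generalized QRD (GQRD)-based design, we further force $\mathbf{H}\mathbf{W}$ to be a lower triangular matrix. Specifically, consider a QRD of $\mathbf{H}$ given in (14) as

$$\mathbf{H} = \mathbf{L}\mathbf{Q} \qquad (17)$$

and partition $\mathbf{L}$ into

$$\mathbf{L} = \begin{bmatrix} \mathbf{L}_1 & & & \\ \times & \mathbf{L}_2 & & \\ \vdots & \ddots & \ddots & \\ \times & \cdots & \times & \mathbf{L}_K \end{bmatrix} \qquad (18)$$

where $\times$ stands for an arbitrary block, $\mathbf{L}_k \in \mathbb{C}^{n_k \times n_k}$ is a lower triangular matrix, and $\mathbf{Q}$ into

$$\mathbf{Q} = [\mathbf{B}_1 \; \mathbf{B}_2 \; \cdots \; \mathbf{B}_K]^H \qquad (19)$$

where $\mathbf{B}_k \in \mathbb{C}^{N \times n_k}$ satisfies $\mathbf{H}_j \mathbf{B}_k = \mathbf{0}$, $\forall j < k$, i.e., $\bar{\mathbf{H}}_k \mathbf{B}_k = \mathbf{0}$, and $\mathbf{B}_k^H \mathbf{B}_k = \mathbf{I}_{n_k}$. Note that the columns of $\mathbf{B}_k$ do not form a basis of $\mathcal{N}(\bar{\mathbf{H}}_k)$. Let

$$\check{\mathbf{B}}_k = [\mathbf{B}_k \; \mathbf{B}_{k+1} \; \cdots \; \mathbf{B}_K] \in \mathbb{C}^{N \times \bar{n}_k}. \qquad (20)$$

Clearly, the columns of $\check{\mathbf{B}}_k$ in (20) form an orthogonal basis of $\mathcal{N}(\bar{\mathbf{H}}_k)$, and thus the precoder for user $k$ can always be written as $\mathbf{W}_k = \check{\mathbf{B}}_k \mathbf{D}_k$. In this way, an orthogonal basis for each $\mathcal{N}(\bar{\mathbf{H}}_k)$ is computed by a single QRD of $\mathbf{H}$ in (14). To gain further insights into the structure of optimal precoders under SPC, let us analyze the effective channel of the $k$th user, which is given by

$$\mathbf{H}_k \mathbf{W}_k = \mathbf{H}_k \check{\mathbf{B}}_k \mathbf{D}_k = [\mathbf{L}_k \; \mathbf{0}] \begin{bmatrix} \mathbf{D}_{k,1} \\ \mathbf{D}_{k,0} \end{bmatrix} = \mathbf{L}_k \mathbf{D}_{k,1} \qquad (21)$$

where $\mathbf{D}_{k,1}$ contains the top $n_k$ rows of $\mathbf{D}_k$ and $\mathbf{D}_{k,0}$ the remaining $(\bar{n}_k - n_k)$ rows. From (21), we can easily see that the cost function of (7) stays the same no matter how $\mathbf{D}_{k,0}$ is chosen. We now show that it is optimal to set $\mathbf{D}_{k,0} = \mathbf{0}$, which follows from

$$\begin{aligned} \mathrm{tr}(\mathbf{W}_k \mathbf{W}_k^H) &= \mathrm{tr}(\check{\mathbf{B}}_k \mathbf{D}_k \mathbf{D}_k^H \check{\mathbf{B}}_k^H) = \mathrm{tr}(\check{\mathbf{B}}_k^H \check{\mathbf{B}}_k \mathbf{D}_k \mathbf{D}_k^H) \\ &= \mathrm{tr}(\mathbf{D}_k \mathbf{D}_k^H) = \mathrm{tr}(\mathbf{D}_{k,1} \mathbf{D}_{k,1}^H) + \mathrm{tr}(\mathbf{D}_{k,0} \mathbf{D}_{k,0}^H) \\ &\ge \mathrm{tr}(\mathbf{D}_{k,1} \mathbf{D}_{k,1}^H) \end{aligned} \qquad (22)$$

with equality if and only if $\mathbf{D}_{k,0} = \mathbf{0}$. In fact, we have proved the following theorem.

Theorem 1. The optimal precoders $\mathbf{W}_k$'s for SZF-DPC under SPC are of the form $\mathbf{W}_k = \mathbf{B}_k \mathbf{D}_{k,1}$, where $\mathbf{D}_{k,1}$ is the solution to the following problem

$$\begin{aligned} \underset{\mathbf{D}_{k,1}}{\text{maximize}} \quad & \sum_{k=1}^{K} \omega_k \log \left|\mathbf{I} + \mathbf{L}_k \mathbf{D}_{k,1} \mathbf{D}_{k,1}^H \mathbf{L}_k^H\right| \\ \text{subject to} \quad & \sum_{k=1}^{K} \mathrm{tr}(\mathbf{D}_{k,1} \mathbf{D}_{k,1}^H) \le P. \end{aligned} \qquad (23)$$

Theorem 1 leads to the GQRD-based design, which is summarized in Algorithm 1.

Remark 1. As a special case when $n_k = 1$ for all $k$, i.e., single-antenna receivers, the GQRD-based method is the same as the precoder design based on QRD proposed in [10]. Indeed, this design is widely used in current works [18]-[20] without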
4: Optimal precoder for user $k$ is found as $\mathbf{W}_k = \mathbf{B}_k \mathbf{D}_{k,1}$.
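The single-QRD construction behind the GQRD-based design can be checked numerically. The sketch below is an illustration rather than the authors' implementation. It assumes random complex channels with $n_R \le N$, and obtains the factorization $H = LQ$ of (17) from NumPy's QR decomposition of $H^H$. All variable names are illustrative.

```python
import numpy as np

# Assumed toy setup: N transmit antennas, three users with two antennas each.
rng = np.random.default_rng(1)
N, n = 6, [2, 2, 2]
K = len(n)
H_list = [rng.standard_normal((nk, N)) + 1j * rng.standard_normal((nk, N)) for nk in n]
H = np.vstack(H_list)                      # stacked channel, n_R x N

# QR of H^H gives H^H = Q1 R, hence H = R^H Q1^H = L Q with L lower triangular.
Q1, R = np.linalg.qr(H.conj().T)           # reduced mode: Q1 is N x n_R
L = R.conj().T

# Partition: user k gets the k-th block of columns of Q1 and the k-th
# diagonal block of L.
offsets = np.concatenate(([0], np.cumsum(n)))
B = [Q1[:, offsets[k]:offsets[k + 1]] for k in range(K)]
L_diag = [L[offsets[k]:offsets[k + 1], offsets[k]:offsets[k + 1]] for k in range(K)]

# ZF check: largest leakage |H_j B_k| over all pairs with j < k.
max_leak = max(np.abs(H_list[j] @ B[k]).max() for k in range(K) for j in range(k))
```

A single factorization yields every user's block at once, and each effective channel $H_k B_k$ equals the lower triangular block $L_k$, which is what Theorem 1 exploits.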
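Now problem (28) is a convex program, and thus can be solved by numerical optimization tools, for example SDPT3 [29]. Regarding the relaxation technique, an important question to ask is whether the optimal solution to the relaxation problem is also optimal to the original problem. Interestingly, we show that (27) and (28) are equivalent, and thus the optimal solution to (28) is also optimal to (27). The proof is an immediate consequence of the following lemma.

Lemma 1. The optimal solutions $\mathbf{\Phi}_k$ to (28) satisfy $\mathrm{rank}(\mathbf{\Phi}_k) \le L_k = \min(n_k, \bar{n}_k)$.

Proof: Please refer to Appendix A.

Although problem (28) can be solved by a general purpose optimization package, developing a specialized algorithm, if possible, that exploits the problem structure is always of great interest. Herein, we present a customized interior-point algorithm to solve (28), using the barrier method. Before proceeding, it is worth pointing out that a two-stage iterative algorithm was proposed for the precoder design of block diagonalization scheme in [13] using a subgradient method, which can be applied to solve (28). More explicitly, consider the partial Lagrangian function of (28), which is given by

$$L(\mathbf{\Phi}_k, \boldsymbol{\lambda}) = \sum_{k=1}^{K} \omega_k \log \left|\mathbf{I} + \tilde{\mathbf{H}}_k \mathbf{\Phi}_k \tilde{\mathbf{H}}_k^H\right| - \sum_{n=1}^{N} \lambda_n \left( \sum_{k=1}^{K} [\bar{\mathbf{B}}_k \mathbf{\Phi}_k \bar{\mathbf{B}}_k^H]_{n,n} - P_n \right) \qquad (29)$$

where $\lambda_n$ is the dual variable associated with the power constraint on the $n$th antenna. Since strong duality holds for (28), its optimal solution can be found via the following convex-concave optimization problem

$$\underset{\boldsymbol{\lambda} \ge 0}{\text{minimize}} \ \underset{\mathbf{\Phi}_k \succeq 0}{\text{maximize}} \ L(\mathbf{\Phi}_k, \boldsymbol{\lambda}) = \underset{\boldsymbol{\lambda} \ge 0}{\text{minimize}} \ g(\boldsymbol{\lambda}) \qquad (30)$$

where $g(\boldsymbol{\lambda}) \triangleq \max_{\mathbf{\Phi}_k \succeq 0} L(\mathbf{\Phi}_k, \boldsymbol{\lambda})$ is the dual function of problem (28). The two-stage iterative algorithm in [13] works as follows. For fixed $\boldsymbol{\lambda}$, the set of covariance matrices $\{\mathbf{\Phi}_k\}$ that maximizes $L(\mathbf{\Phi}_k, \lambda_n)$ can be obtained by the water-filling algorithm. Next, for a set of given $\{\mathbf{\Phi}_k\}$, the iterative algorithm updates $\boldsymbol{\lambda}$ to minimize the dual function $g(\boldsymbol{\lambda})$ based on the subgradient method. Generally, however, the subgradient method converges slowly to the optimum. It is known that for a minimax optimization problem, the infeasible-start Newton's method [30] that solves the maximization and the minimization at the same time has a faster convergence rate [28], [30], [31]. In the following, we propose a numerical algorithm to solve (28), which exhibits a better convergence behavior.

First, we observe that the constraints in (28) are active at the optimum. As proof, suppose the $i$th constraint is inactive, i.e., ...

... can be replaced by equality ones, which facilitates a Newton-type method. In other words, (28) is equivalent to

$$\begin{aligned} \text{maximize} \quad & \sum_{k=1}^{K} \omega_k \log \left|\mathbf{I} + \tilde{\mathbf{H}}_k \mathbf{\Phi}_k \tilde{\mathbf{H}}_k^H\right| \\ \text{subject to} \quad & \sum_{k=1}^{K} [\bar{\mathbf{B}}_k \mathbf{\Phi}_k \bar{\mathbf{B}}_k^H]_{n,n} = P_n, \ \forall n \\ & \mathbf{\Phi}_k \succeq 0, \ \forall k. \end{aligned} \qquad (31)$$

The proposed algorithm is based on the barrier method [30] to solve (31). As a standard step, we define a modified objective function

$$f(t, \{\mathbf{\Phi}_k\}) = -\left( t \sum_{k=1}^{K} \omega_k \log \left|\mathbf{I} + \tilde{\mathbf{H}}_k \mathbf{\Phi}_k \tilde{\mathbf{H}}_k^H\right| + \sum_{k=1}^{K} \log |\mathbf{\Phi}_k| \right) \qquad (32)$$

where $\log |\mathbf{\Phi}_k|$ is the logarithmic barrier function to account for the positive semidefinite constraint $\mathbf{\Phi}_k \succeq 0$, and $t > 0$ is a parameter that controls the logarithmic barrier terms. For mathematical convenience, let

$$\mathbf{A}_k^{(n)} = \bar{\mathbf{B}}_k^H \, \mathrm{diag}(\underbrace{0, \ldots, 0}_{n-1}, 1, \underbrace{0, \ldots, 0}_{N-n}) \, \bar{\mathbf{B}}_k \in \mathbb{C}^{\bar{n}_k \times \bar{n}_k} \qquad (33)$$

and consider a standard equality constrained minimization problem

$$\begin{aligned} \text{minimize} \quad & f(\{\mathbf{\Phi}_k\}, t) \\ \text{subject to} \quad & \sum_{k=1}^{K} \mathrm{tr}[\mathbf{A}_k^{(n)} \mathbf{\Phi}_k] = P_n, \ \forall n. \end{aligned} \qquad (34)$$

The general idea of a barrier method is that, for a fixed $t$, we find the optimal solutions $\{\mathbf{\Phi}_k^\star(t)\}$ to (34) (which is known as the centering step), and increase $t$ until the algorithm converges. In this paper, we employ the infeasible start Newton method to find the optimal solutions to (34). The purpose of using the infeasible start Newton method is to simplify the initialization of $\{\mathbf{\Phi}_k\}$ that satisfy the equality constraints. We start with the optimality conditions (i.e., KKT conditions) for (34), which are given by

$$-t\omega_k \tilde{\mathbf{H}}_k^H (\mathbf{I} + \tilde{\mathbf{H}}_k \mathbf{\Phi}_k \tilde{\mathbf{H}}_k^H)^{-1} \tilde{\mathbf{H}}_k - \mathbf{\Phi}_k^{-1} + \sum_{n=1}^{N} \lambda_n \mathbf{A}_k^{(n)} = \mathbf{0}, \ \forall k \qquad (35)$$

$$\sum_{k=1}^{K} \mathrm{tr}[\mathbf{A}_k^{(n)} \mathbf{\Phi}_k] = P_n, \ \forall n \qquad (36)$$

where $\{\lambda_n\}$ are the dual variables. In (35) we have used the fact that the gradient of $\log |\mathbf{I} + \tilde{\mathbf{H}}_k \mathbf{\Phi}_k \tilde{\mathbf{H}}_k^H|$ with respect to $\mathbf{\Phi}_k$ is given by $\nabla_{\mathbf{\Phi}_k} \log |\mathbf{I} + \tilde{\mathbf{H}}_k \mathbf{\Phi}_k \tilde{\mathbf{H}}_k^H| = \tilde{\mathbf{H}}_k^H (\mathbf{I} + \tilde{\mathbf{H}}_k \mathbf{\Phi}_k \tilde{\mathbf{H}}_k^H)^{-1} \tilde{\mathbf{H}}_k$. The main effort in a Newton method is to calculate the Newton step. To do this, we replace $\mathbf{\Phi}_k$ by $\mathbf{\Phi}_k + \Delta\mathbf{\Phi}_k$ and $\lambda_n$ by $\lambda_n + \Delta\lambda_n$ in the KKT conditions, which yields the KKT system for the Newton step

$$-t\omega_k \tilde{\mathbf{H}}_k^H (\mathbf{I} + \tilde{\mathbf{H}}_k \mathbf{\Phi}_k \tilde{\mathbf{H}}_k^H + \tilde{\mathbf{H}}_k \Delta\mathbf{\Phi}_k \tilde{\mathbf{H}}_k^H)^{-1} \tilde{\mathbf{H}}_k \ \ldots$$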
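The log-determinant gradient used in the KKT condition (35) can be sanity-checked numerically. The sketch below is an illustration under simplifying assumptions: real-valued matrices instead of complex ones, a random positive definite covariance, and a symmetric perturbation direction. It compares the closed-form directional derivative $\mathrm{tr}(G D)$, with $G = H^T (I + H \Phi H^T)^{-1} H$, against a central finite difference. All names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(2)
n_rx, n_dim = 2, 4
H = rng.standard_normal((n_rx, n_dim))          # stands in for the effective channel

X = rng.standard_normal((n_dim, n_dim))
Phi = X @ X.T + np.eye(n_dim)                    # positive definite covariance

D = rng.standard_normal((n_dim, n_dim))
D = 0.5 * (D + D.T)                              # symmetric perturbation direction

def logdet_rate(P):
    """log|I + H P H^T| evaluated in a numerically stable way."""
    return np.linalg.slogdet(np.eye(n_rx) + H @ P @ H.T)[1]

# Closed-form gradient: H^T (I + H Phi H^T)^{-1} H.
G = H.T @ np.linalg.solve(np.eye(n_rx) + H @ Phi @ H.T, H)
analytic = np.trace(G @ D)

# Central finite difference along D.
eps = 1e-6
finite_diff = (logdet_rate(Phi + eps * D) - logdet_rate(Phi - eps * D)) / (2 * eps)
```

The two quantities should agree to within the finite-difference error, which supports the gradient expression before it is linearized to obtain the Newton step.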
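Denote $\mathbf{\Psi}_k = \mathbf{I} + \tilde{\mathbf{H}}_k \mathbf{\Phi}_k \tilde{\mathbf{H}}_k^H$. Since $\mathbf{\Psi}_k$ is invertible, and $(\mathbf{A} + \mathbf{B})^{-1} \approx \mathbf{A}^{-1} - \mathbf{A}^{-1} \mathbf{B} \mathbf{A}^{-1}$ for small $\mathbf{B}$, we can write (37) as

$$t\omega_k \tilde{\mathbf{H}}_k^H (\mathbf{\Psi}_k^{-1} - \mathbf{\Psi}_k^{-1} \tilde{\mathbf{H}}_k \Delta\mathbf{\Phi}_k \tilde{\mathbf{H}}_k^H \mathbf{\Psi}_k^{-1}) \tilde{\mathbf{H}}_k + (\mathbf{\Phi}_k^{-1} - \mathbf{\Phi}_k^{-1} \Delta\mathbf{\Phi}_k \mathbf{\Phi}_k^{-1}) - \sum_{n=1}^{N} (\lambda_n + \Delta\lambda_n) \mathbf{A}_k^{(n)} = \mathbf{0}, \ \forall k \qquad (39)$$

or equivalently as

$$t\omega_k \mathbf{\Phi}_k \check{\mathbf{H}}_k \Delta\mathbf{\Phi}_k \check{\mathbf{H}}_k \mathbf{\Phi}_k + \Delta\mathbf{\Phi}_k = t\omega_k \mathbf{\Phi}_k \check{\mathbf{H}}_k \mathbf{\Phi}_k + \mathbf{\Phi}_k - \sum_{n=1}^{N} (\lambda_n + \Delta\lambda_n) \mathbf{\Phi}_k \mathbf{A}_k^{(n)} \mathbf{\Phi}_k, \ \forall k \qquad (40)$$

where $\check{\mathbf{H}}_k = \tilde{\mathbf{H}}_k^H \mathbf{\Psi}_k^{-1} \tilde{\mathbf{H}}_k$. One possibility to find $\{\Delta\mathbf{\Phi}_k\}$ and $\{\Delta\lambda_n\}$ is to vectorize $\Delta\mathbf{\Phi}_k$ as a vector of length $\bar{n}_k(\bar{n}_k + 1)/2$ for each $k$, transform (38) and (40) into a form of a linear system of $(N + \sum_{k=1}^{K} \bar{n}_k(\bar{n}_k + 1)/2)$ variables, and use a generic method to solve the resulting linear system. However, the complexity of such a method is $O(N^6)$. In this paper, we present a low-complexity method to find $\{\Delta\mathbf{\Phi}_k\}$ and $\{\Delta\lambda_n\}$, using block elimination [30, Section 10.4]. Specifically, we express $\Delta\mathbf{\Phi}_k$ as

$$\Delta\mathbf{\Phi}_k = \Delta\mathbf{\Phi}_k^{(0)} + \sum_{i=1}^{N} \Delta\lambda_i \Delta\mathbf{\Phi}_k^{(i)}. \qquad (41)$$

Substituting (41) into (40) yields a system of $(N+1)$ discrete-time Sylvester equations

$$\begin{aligned} t\omega_k \mathbf{\Phi}_k \check{\mathbf{H}}_k \Delta\mathbf{\Phi}_k^{(0)} \check{\mathbf{H}}_k \mathbf{\Phi}_k + \Delta\mathbf{\Phi}_k^{(0)} &= t\omega_k \mathbf{\Phi}_k \check{\mathbf{H}}_k \mathbf{\Phi}_k + \mathbf{\Phi}_k - \sum_{n=1}^{N} \lambda_n \mathbf{\Phi}_k \mathbf{A}_k^{(n)} \mathbf{\Phi}_k \\ t\omega_k \mathbf{\Phi}_k \check{\mathbf{H}}_k \Delta\mathbf{\Phi}_k^{(i)} \check{\mathbf{H}}_k \mathbf{\Phi}_k + \Delta\mathbf{\Phi}_k^{(i)} &= -\mathbf{\Phi}_k \mathbf{A}_k^{(i)} \mathbf{\Phi}_k, \quad i = 1, \ldots, N. \end{aligned} \qquad (42)$$

Numerical methods to solve the discrete-time Sylvester equations in (42) with complexity $O(\bar{n}_k^3)$ can be found e.g. in [32]. We note that the solution to each discrete-time Sylvester equation in (42) is a Hermitian matrix, and thus $\Delta\mathbf{\Phi}_k$ in (41) is Hermitian as well. To compute the Newton step for the dual variable $\boldsymbol{\lambda}$, we plug $\Delta\mathbf{\Phi}_k$ from (41) into (38), which results in a linear system

$$\mathbf{\Gamma} \Delta\boldsymbol{\lambda} = \boldsymbol{\gamma} \qquad (43)$$

where $[\mathbf{\Gamma}]_{i,j} = \sum_{k=1}^{K} \mathrm{tr}(\mathbf{A}_k^{(i)} \Delta\mathbf{\Phi}_k^{(j)})$ for $i, j = 1, 2, \ldots, N$, and $[\boldsymbol{\gamma}]_i = -\sum_{k=1}^{K} \mathrm{tr}(\mathbf{A}_k^{(i)} \Delta\mathbf{\Phi}_k^{(0)}) + P_i - \sum_{k=1}^{K} \mathrm{tr}(\mathbf{A}_k^{(i)} \mathbf{\Phi}_k)$. We define the residual norm at $\{\mathbf{\Phi}_k\}$ and $\boldsymbol{\lambda}$, which is used in the backtracking line search, as

$$r(\{\mathbf{\Phi}_k\}, \boldsymbol{\lambda}) = \sum_{k=1}^{K} \left\| -t\omega_k \tilde{\mathbf{H}}_k^H (\mathbf{I} + \tilde{\mathbf{H}}_k \mathbf{\Phi}_k \tilde{\mathbf{H}}_k^H)^{-1} \tilde{\mathbf{H}}_k - \mathbf{\Phi}_k^{-1} + \sum_{n=1}^{N} \lambda_n \mathbf{A}_k^{(n)} \right\|_F + \|\mathbf{v}\|_2 \qquad (44)$$

where $v_i = P_i - \sum_{k=1}^{K} \mathrm{tr}[\mathbf{A}_k^{(i)} \mathbf{\Phi}_k]$ for $i = 1, 2, \ldots, N$. The proposed algorithm to solve (28) is summarized in Algorithm 2. The proposed algorithm can be easily modified into a primal-dual interior-point method which often shows a faster convergence rate. However, we skip the details for the sake of simplicity.

Algorithm 2 The proposed numerical algorithm to solve (28)
Initialization: $\mathbf{\Phi}_k = \mathbf{I}_{\bar{n}_k}$, $\boldsymbol{\lambda} = \mathbf{0}$, $t = t_0$, $\alpha$ and $\beta$, and tolerance $\epsilon > 0$
1: repeat {Outer iteration}
2: repeat {Inner iteration (centering step)}
3: Compute the Newton step $\Delta\mathbf{\Phi}_k$ and dual step $\Delta\boldsymbol{\lambda}$ from (41) and (43), respectively
4: Backtracking line search:
5: $s = 1$
6: while $r(\{\mathbf{\Phi}_k\} + s\{\Delta\mathbf{\Phi}_k\}, \boldsymbol{\lambda} + s\Delta\boldsymbol{\lambda}) > (1 - \alpha s)\, r(\{\mathbf{\Phi}_k\}, \boldsymbol{\lambda})$ or $\mathbf{\Phi}_k + s\Delta\mathbf{\Phi}_k \nsucc 0$ for some $k$ do
7: $s = \beta s$
8: end while
9: Update primal and dual variables: $\mathbf{\Phi}_k = \mathbf{\Phi}_k + s\Delta\mathbf{\Phi}_k$; $\boldsymbol{\lambda} = \boldsymbol{\lambda} + s\Delta\boldsymbol{\lambda}$
10: until $r(\{\mathbf{\Phi}_k\}, \boldsymbol{\lambda}) < \epsilon$
11: Increase $t$: $t = \mu t$
12: until $t$ is sufficiently large to tolerate the duality gap.

Remark 2. Here, we provide a rough comparison of the computational cost of an iteration for the proposed algorithm and the two-stage iterative method. Note that, to update $\mathbf{\Phi}_k$ in each iteration of the two-stage iterative method, presented in [13], we need to compute the singular value decomposition (SVD) of an $n_k \times \bar{n}_k$ matrix, and the inverse of an $\bar{n}_k \times \bar{n}_k$ matrix, of which the complexity is $O(\bar{n}_k^2 n_k)$ and $O(\bar{n}_k^3)$, respectively. As mentioned earlier, the complexity of solving (42) is reduced to $O(\bar{n}_k^3)$. That is to say, the complexity of each iteration of the two-stage iterative method is of the same order as that of Algorithm 2. Moreover, Algorithm 2 is actually a customized interior-point method using the barrier method, and thus it shows a superlinear convergence rate as we show in the numerical result section. As a result, for the proposed algorithm, less computation effort is required to obtain the optimal precoders.

B. Suboptimal Designs

In practice, it is always interesting to find suboptimal designs that can achieve a significant fraction of the optimal performance, but require lower complexity. In this subsection we propose two suboptimal precoder designs, and briefly discuss their computational complexity, compared to (28).

1) QRD-based PAPC Precoder Design: The first suboptimal design, which is referred to as QRD-based PAPC design, is derived from the GQRD-based design for the SPC. Specifically, let $\mathbf{Q}_k \in \mathbb{C}^{N \times n_k}$ be the result of applying the QRD to $\mathbf{H}$ as given in (19). By Theorem 1, it holds that $\mathbf{H}_k \mathbf{Q}_k \mathbf{Q}_k^H \mathbf{H}_k^H = \mathbf{H}_k \bar{\mathbf{B}}_k \bar{\mathbf{B}}_k^H \mathbf{H}_k^H$. This equality suggests that precoders based on a linear combination of $\mathbf{Q}_k$ are expected to work well. In this way, the transmit covariance matrices of the first suboptimal design are given by

$$\mathbf{S}_k = \mathbf{Q}_k \mathbf{\Theta}_k \mathbf{Q}_k^H \qquad (45)$$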
where k Cnk nk is the solution to the following problem Cnk Lk , and problem (48) is equivalent to
K H H K H H H
maximize k=1 k log |I + H k Qk k Qk H k | maximize
Sk k=1 k log |I + H k k T k T k k H k |
k
K H
(46) K H H
subject to k=1 [Qk k Qk ]n,n Pn , n. subject to k=1 [k T k T k k ]n,n Pn , n
(50)
We note that problem formulation in (46) is analogous to that We can further rewrite (50) as
in (31). Hence, Algorithm (2) is used to compute optimal
K H
Qk in (46). Recall that the dimension of each k in (46) maximize k=1 k log |I + H k k H k |
k 0
is nk (nk + 1)/2, which is smaller than n nk + 1)/2 for
k ( K (51)
each optimal k in (28). Thus, solving (46) requires much subject to [ ]
k=1 k k P , n
k n,n n
n1 + n = 0, n (57) 101
K
Proposed method
100
[k k H
Two-stage iterative method
k ]n,n + n = Pn , n (58)
k=1
101
and the resulting KKT system to find the Newton direction is
tk k H
k H k k + k = tk k H k k 103
N
(n)
+ k (n + n )k Ak k , k (59) 104
n=1
105
n2 n + n = n1
n , n (60)
K K 106
(n)
(n)
20 40 60 80 100
tr[Ak k ] + n = Pn tr[Ak k ] n n (61) Iteration
k=1 k=1
Fig. 2. Convergence behavior, N = 6, nk = 2 for k = 1, 2, 3.
The Newton step $\Delta\tilde{\Phi}_k$ is computed analogously by a system of $(N + 1)$ discrete-time Sylvester equations as in (42). Plugging (60) into (61), and using (41), we can find the Newton step for the dual variables as the solution to the following

$$\Theta \, \Delta\nu = \theta \qquad (62)$$

where $[\Theta]_{i,j} = \sum_{k=1}^{K} \operatorname{tr}(A_k^{(i)} \tilde{\Phi}_k^{(j)}) - s_i^2 \delta_{i,j}$ for $i, j = 1, 2, \ldots, N$, where $\delta_{i,j}$ denotes the Kronecker delta function, i.e., $\delta_{i,j} = 1$ if $i = j$ and $\delta_{i,j} = 0$ otherwise, and $[\theta]_i = P_i - \sum_{k=1}^{K} \operatorname{tr}(A_k^{(i)} \tilde{\Phi}_k + A_k^{(i)} \tilde{\Phi}_k^{(0)}) + s_i^2 \nu_i - 2 s_i$. The Newton direction for the slack variables is computed using (60), namely as

$$\Delta s_n = -s_n^2 (\Delta\nu_n + \nu_n) + s_n. \qquad (63)$$

Remark 3. We have treated the cases of sum-power and per-antenna power constraints separately, and their specific properties are exploited to arrive at a computationally efficient algorithm for each case. However, it could also be practical to consider both types of power constraints simultaneously. For example, in addition to PAPCs due to the physical limitation of the power amplifiers, we can impose a SPC on transmitted data to meet, e.g., a regulatory body requirement for health factors or to reduce the overall interference situation [31]. We note that the proposed algorithms for SZF-DPC and BD schemes presented in the paper can be easily modified to handle SPC and PAPCs simultaneously. For such a case, not all the constraints will be binding, in general. However, we can introduce some slack variables to convert all constraints to equality ones, as done in (53). Then, the steps from (54) to (63) can be slightly changed to solve the new problem.

VI. NUMERICAL RESULTS

In this section, we provide numerical examples to demonstrate the results in this paper. In the first numerical experiment, we compare the convergence rate of Algorithm 2 and the two-stage iterative method in [13]. A MIMO BC with $N = 6$ transmit antennas and $K = 3$ users, each with 2 receive antennas, is simulated. The tolerance (for each centering step) is set to $\epsilon = 10^{-5}$. The barrier method parameters $t_0$ and $\mu$ are set to 50 and 1, respectively. The effects of the initial value of $\mu$ and $t_0$ are discussed in [30]. The backtracking line search parameters in Algorithm 2 are $\alpha = 0.01$ and $\beta = 0.5$. Fig. 2 plots the error in the sum rate versus the number of iterations for a random realization of channel matrices. For the first iterations, the two-stage iterative method converges faster than the proposed method to the optimal solution. However, the proposed method performs better when precoders approach the optimum. Simulation results obtained with other randomly generated channel matrices illustrate the same convergence behavior of the two methods. Recall that a subgradient need not be a descent direction. Thus, an iteration can even decrease the objective function. Moreover, the convergence rate of subgradient methods relies strongly on the problem size. For example, we observe that the two-stage iterative method fails to converge within three thousand iterations when $N = 16$ and $K = 8$, while Algorithm 2 still converges to the optimal solution within tens of iterations. In conclusion, the proposed numerical algorithm demonstrates a better convergence rate than the two-stage iterative algorithm in [13]. It is worth mentioning again that, since the proposed algorithm and the two-stage iterative method are alternatives to each other to find the optimal solution of the resulting convex problem given in (28), they will converge to the same optimal objective value, i.e., the proposed algorithm and the two-stage iterative approach yield the same sum-rate performance. Thus, it is sufficient to provide the sum rate offered by the proposed algorithm in Figs. 3 and 4 to follow.

In Fig. 3, we plot the average sum rate, i.e., $w_k = 1$ for all $k$, of optimal and suboptimal precoder design methods for SZF-DPC schemes as a function of $P$, the total transmit power. The resulting power constraint for each antenna (when considering the PAPCs) is $P_n = P/N$ for all $n$. A quasi-static fading model is used in our simulation. The channel for user $k$ is generated as $H_k = d_k \tilde{H}_k$, where $d_k$ is a given parameter to capture the path loss and the entries of $\tilde{H}_k$ follow zero mean and unit variance complex Gaussian random variables for each snapshot. Fig. 3 considers a scenario with $d_k = 1$, $N = 8$, $K = 4$, and $n_k = 2$ for $k = 1, 2, \ldots, 4$. That is, we ignore the effect of path loss and simply consider small scale fading in Fig. 3. We can see that the optimal precoder designs with PAPCs yield a slightly lower sum rate than those
[Figure 3 plot. Horizontal axis: Total transmit power, P (dB). Legend entry recovered: Rescaled SPC Precoder Design.]
Fig. 3. Sum rate comparison of optimal and suboptimal designs for SZF-DPC schemes with $d_k = 1$, $N = 8$, $K = 4$, $n_k = 2$, $k = 1, 2, \ldots, 4$.

which is shown to be optimal and has much lower complexity than the existing method using the SVD. For PAPCs, the precoder design for SZF-DPC is first formulated as a rank-constrained optimization problem. Then we consider a relaxed problem, which is obtained by dropping the rank constraints. Exploiting the special features of SZF-DPC, we prove that the relaxed and original problems are equivalent. More explicitly, we show that optimal solutions of the relaxed problem always satisfy the rank constraints. Next, we propose an efficient numerical method based on a barrier method to solve the relaxed problem. The proposed numerical method is shown to have a superior convergence behavior, compared with the two-stage iterative method based on the dual subgradient method. In addition, we illustrate that the proposed precoder design for SZF-DPC can be slightly modified to solve the problem of precoder design for BD.
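A barrier method of the kind summarized above typically runs a backtracking line search inside each centering step. The sketch below is a generic textbook version and not the authors' implementation; the parameter names `alpha` and `beta` are assumed, with defaults set to the values 0.01 and 0.5 reported for the line search in Section VI, and `max_halvings` is a hypothetical safeguard.

```python
import numpy as np

def backtracking_step(f, grad_fx, x, dx, alpha=0.01, beta=0.5, max_halvings=50):
    """Return a step size t satisfying the sufficient-decrease condition
    f(x + t*dx) <= f(x) + alpha * t * <grad f(x), dx>.

    f should return np.inf outside its domain (e.g. when a barrier term is
    undefined), so that infeasible trial points are rejected automatically.
    """
    t = 1.0
    fx = f(x)
    slope = float(np.real(np.vdot(grad_fx, dx)))  # directional derivative
    for _ in range(max_halvings):
        if f(x + t * dx) <= fx + alpha * t * slope:
            break
        t *= beta  # shrink the step and try again
    return t
```

For example, with `f = lambda x: float(x @ x)`, the point `x = np.array([1.0, 1.0])`, and the steepest-descent direction `dx = -2 * x`, the full step overshoots and the search settles on `t = 0.5`.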
[Figure plot. Vertical axis: Average sum rate (b/s/Hz). Legend: Sum-power constraint; Per-antenna optimal design; QRD-based PAPC precoder design; Rescaled SPC Precoder Design.]

APPENDIX

A. Proof of Lemma 1

In this appendix, we prove that the rank of optimal solutions to (28) is less than or equal to $L_k$. The proof follows similar arguments as in [23], [33]. We begin by forming the Lagrangian function of (28), which is given by

$$L(\bar{\Phi}_k, \lambda, \Psi_k) = \sum_{k=1}^{K} w_k \log \left| I + \bar{H}_k \bar{\Phi}_k \bar{H}_k^H \right| - \sum_{n=1}^{N} \lambda_n \left[ \sum_{k=1}^{K} \operatorname{tr}(\bar{\Phi}_k A_k^{(n)}) - P_n \right] + \sum_{k=1}^{K} \operatorname{tr}(\bar{\Phi}_k \Psi_k) \qquad (64)$$
By contradiction, suppose $\lambda_i = 0$ for some $1 \le i \le N$. We construct a set of $\bar{\Phi}_k$ such that $\bar{\Phi}_1 = \operatorname{diag}(\underbrace{0, \ldots, 0}_{i-1}, \gamma, \underbrace{0, \ldots, 0}_{N-i})$, and $\bar{\Phi}_k = 0$ for $2 \le k \le K$. Then, the objective function in (65) becomes

$$L(\bar{\Phi}_k, \lambda, \Psi_k) = w_1 \log(1 + \gamma \|[\bar{H}_1]_i\|_2^2) + \operatorname{tr}(\bar{\Phi}_1 \Psi_1) \qquad (69)$$

where $[\bar{H}_1]_i$ is the $i$th column of $\bar{H}_1$. We can see that the objective function in (69) is unbounded above if $\gamma \to \infty$. Since we are only interested in the case where $g(\lambda, \Psi_k)$ is finite, we conclude that $\lambda_i > 0$ for all $1 \le i \le N$. Thus, $\Lambda$ must be positive definite, and $\Omega_k$ is invertible. It follows from (67) that $\operatorname{rank}(\bar{\Phi}_k) \le \operatorname{rank}(\bar{H}_k) = L_k$, which completes the proof.

B. Proof of Lemma 2

In this appendix, we slightly modify the proof of Lemma 1 to show that the optimal solutions to problem (52) satisfy $\operatorname{rank}(\tilde{\Phi}_k) \le \min(n_k, \tilde{n}_k)$. Let $\{\lambda_n\}$ be dual variables associated with the PAPCs in (52). Following the same derivations from (64) to (67) as in Appendix A, we have

$$w_k \tilde{H}_k^H (I + \tilde{H}_k \tilde{\Phi}_k \tilde{H}_k^H)^{-1} \tilde{H}_k \tilde{\Phi}_k = \Omega_k \tilde{\Phi}_k \qquad (70)$$

where $\tilde{\Phi}_k$ and $\tilde{H}_k$ are defined in (49)-(52), $\Lambda = \operatorname{diag}(\lambda_1, \lambda_2, \ldots, \lambda_N)$, and $\Omega_k$ is now defined as $\Omega_k = \tilde{V}_k^H \Lambda \tilde{V}_k$. Unlike SZF-DPC, where we are able to prove that $\lambda_n > 0$ for all $n$, the dual variables associated with the PAPCs for BD are not necessarily positive. That is to say, not all the power constraints are necessarily tight at the optimum. However, $\Omega_k$ in (70) is still invertible. To see this, let $x$ be a vector lying in $\mathcal{N}(\Omega_k)$, i.e., $\Omega_k x = 0$. Due to the assumption of independence among the $H_k$'s, it is guaranteed with probability one that $\tilde{H}_k x = H_k \tilde{V}_k x \neq 0$. Let $\tilde{\Phi}_k = \gamma x x^H$. Then, $\lim_{\gamma \to \infty} L(\tilde{\Phi}_k, \lambda, \Psi_k) = \infty$, i.e., the objective function in (68)

[9] M. Kobayashi and G. Caire, "An iterative water-filling algorithm for maximum weighted sum-rate of Gaussian MIMO-BC," IEEE J. Sel. Areas Commun., vol. 24, no. 8, pp. 1640-1646, Aug. 2006.
[10] G. Caire and S. Shamai, "On the achievable throughput of a multi-antenna Gaussian broadcast channel," IEEE Trans. Inf. Theory, vol. 49, no. 7, pp. 1691-1706, Jul. 2003.
[11] A. Dabbagh and D. Love, "Precoding for multiple antenna Gaussian broadcast channels with successive zero-forcing," IEEE Trans. Signal Process., vol. 55, no. 7, pp. 3837-3850, Jul. 2007.
[12] A. Wiesel, Y. Eldar, and S. Shamai, "Zero-forcing precoding and generalized inverses," IEEE Trans. Signal Process., vol. 56, no. 9, pp. 4409-4418, 2008.
[13] R. Zhang, "Cooperative multi-cell block diagonalization with per-base-station power constraints," IEEE J. Sel. Areas Commun., vol. 28, no. 9, pp. 1435-1445, Dec. 2010.
[14] Z. Q. Luo, W. K. Ma, A.-C. So, Y. Ye, and S. Zhang, "Semidefinite relaxation of quadratic optimization problems," IEEE Signal Process. Mag., vol. 27, no. 3, pp. 20-34, May 2010.
[15] Q. Spencer, A. Swindlehurst, and M. Haardt, "Zero-forcing methods for downlink spatial multiplexing in multiuser MIMO channels," IEEE Trans. Signal Process., vol. 52, no. 2, pp. 461-471, Feb. 2004.
[16] L.-N. Tran, M. Juntti, and E.-K. Hong, "On the precoder design for block diagonalized MIMO broadcast channels," IEEE Commun. Lett., vol. 16, no. 8, pp. 1165-1168, Aug. 2012.
[17] L.-N. Tran and E.-K. Hong, "Multiuser diversity for successive zero-forcing dirty paper coding: greedy scheduling algorithms and asymptotic performance analysis," IEEE Trans. Signal Process., vol. 58, no. 6, pp. 3411-3416, Jun. 2010.
[18] J. Jiang, R. Buehrer, and W. Tranter, "Greedy scheduling performance for a zero-forcing dirty-paper coded system," IEEE Trans. Commun., vol. 54, no. 5, pp. 789-793, May 2006.
[19] Z. Tu and R. Blum, "Multiuser diversity for a dirty paper approach," IEEE Commun. Lett., vol. 7, no. 8, pp. 370-372, Aug. 2003.
[20] M. Maddah-Ali, M. Sadrabadi, and A. Khandani, "Broadcast in MIMO systems based on a generalized QR decomposition: signaling and performance analysis," IEEE Trans. Inf. Theory, vol. 54, no. 3, pp. 1124-1138, Mar. 2008.
[21] L.-N. Tran, M. Juntti, M. Bengtsson, and B. Ottersten, "Beamformer designs for zero-forcing dirty paper coding," in Proc. 2011 International Conference on Wireless Communications and Signal Processing, Nov. 2011, pp. 1-5, invited paper.
[22] L.-N. Tran, M. Juntti, M. Bengtsson, and B. Ottersten, "On the optimality of beamformer design for zero-forcing DPC with QR decomposition," in Proc. 2012 IEEE ICC, pp. 2536-2541.
[23] L.-N. Tran, M. Juntti, M. Bengtsson, and B. Ottersten, "Beamformer designs for MISO broadcast channels with zero-forcing dirty paper coding," IEEE Trans. Wireless Commun., vol. 12, no. 3, pp. 1173-1185,
Le-Nam Tran received the B.S. degree in Electrical Engineering from Ho Chi Minh National University of Technology, Vietnam in 2003, and M.S. and Ph.D. degrees in Radio Engineering from Kyung Hee University, Republic of Korea, in 2006 and 2009, respectively. In 2009, he joined the Department of Electrical Engineering, Kyung Hee University, Republic of Korea, as a lecturer. From September 2010 to July 2011, he was a postdoc fellow at the Signal Processing Laboratory, ACCESS Linnaeus Centre, KTH Royal Institute of Technology, Sweden. Since August 2011, he has been with Centre for Wireless Communications and Department of Communications Engineering, University of Oulu, Finland. His current research interests include multiuser MIMO systems, energy efficient communications, and full duplex transmission. He received the Best Paper Award from IITA in August 2005.

Mats Bengtsson (M'00-SM'06) received the M.S. degree in computer science from Linköping University, Linköping, Sweden, in 1991 and the Tech. Lic. and Ph.D. degrees in electrical engineering from the Royal Institute of Technology (KTH), Stockholm, Sweden, in 1997 and 2000, respectively. From 1991 to 1995, he was with Ericsson Telecom AB Karlstad. He currently holds a position as Associate Professor at the Signal Processing Laboratory, School of Electrical Engineering, KTH. His research interests include statistical signal processing and its applications to communications, multi-antenna processing, cooperative communication, radio resource management, and propagation channel modelling. Dr. Bengtsson served as Associate Editor for the IEEE Transactions on Signal Processing 2007-2009 and was a member of the IEEE SPCOM Technical Committee 2007-2012.