
Channel Estimation for Reconfigurable Intelligent Surface Aided Multi-User MIMO Systems
Jie Chen, Student Member, IEEE, Ying-Chang Liang, Fellow, IEEE, Hei Victor Cheng, Member, IEEE, and
Wei Yu, Fellow, IEEE

J. Chen and Y.-C. Liang are with the Center for Intelligent Networking and Communications (CINC), University of Electronic Science and Technology of China (UESTC), Chengdu 611731, China (e-mails: chenjie.ay@gmail.com, liangyc@ieee.org). H. Cheng and W. Yu are with the Electrical and Computer Engineering Department, University of Toronto, Toronto, ON M5S 3G4, Canada (e-mails: hei.cheng@utoronto.ca, weiyu@ece.utoronto.ca).

Abstract—Channel acquisition is one of the main challenges for the deployment of reconfigurable intelligent surface (RIS) aided communication systems. This is because the RIS has a large number of reflective elements, which are passive devices without active transmitting/receiving and signal-processing abilities. In this paper, we study uplink channel estimation for the RIS aided multi-user multiple-input multiple-output (MIMO) system. Specifically, we propose a novel channel estimation protocol for this system to estimate the cascaded channel, which consists of the channels from the base station (BS) to the RIS and from the RIS to the user. Further, we recognize that the cascaded channels are typically sparse, which allows us to formulate the channel estimation problem as a sparse channel matrix recovery problem using the compressive sensing (CS) technique, with which we can achieve robust channel estimation with limited training overhead. In particular, the sparse channel matrices of the cascaded channels of all users share a common row-column-block sparsity structure due to the common channel between the BS and the RIS. By exploiting this common sparsity, we further propose a two-step procedure based multi-user joint channel estimator. In the first step, exploiting the common column-block sparsity, we project the signal onto the common column subspace to reduce complexity, quantization error, and noise level. In the second step, exploiting the common row-block sparsity, we use all the projected signals to formulate a multi-user joint sparse matrix recovery problem, and we propose an iterative approach to solve this non-convex problem efficiently. Moreover, the optimization of the training reflection sequences at the RIS is studied to improve the estimation performance.

Index Terms—Reconfigurable intelligent surface, common row-column sparsity, multi-user joint channel estimation, compressive sensing.

I. INTRODUCTION

Reconfigurable intelligent surface (RIS) is a promising technique to achieve high spectrum and energy efficiency [1]-[4]. Specifically, a RIS is a uniform planar array with a large number of reflective elements, each of which can induce a phase shift on the incident signal and reflect it passively. Hence, by adaptively adjusting the phase shift matrix of the RIS, one can enhance the transmission quality of the intended incident-reflection signal [5], which is also called passive beamforming [6]. Compared with traditional amplify-and-forward (AF) relay beamforming techniques [7], [8], the RIS can reconfigure the reflective coefficients in real time and reflect the incident signal passively without additional energy consumption [9]. Besides, the RIS can be equipped with a large number of reflective elements to achieve a high array/passive beamforming gain without requiring much hardware cost [10].

Due to the above promising advantages, RIS has been introduced into various wireless communication systems. In particular, the key design issue of the RIS aided wireless communication system is to jointly optimize the beamformer at the transceiver and the phase shift matrix induced by the RIS to achieve various objectives [11]-[19]. Specifically, for downlink multi-user multiple-input multiple-output (MIMO) systems [11]-[16], energy-efficiency maximization was studied in [11] subject to individual Quality-of-Service (QoS) constraints. In [12], maximization of the minimum signal-to-interference-plus-noise ratio (SINR) subject to a maximum power constraint was studied by considering both rank-one and full-rank channel matrices between the base station (BS) and the RIS. Then, weighted sum-rate maximization problems were studied in [13] for a single-cell scenario and in [14] for a multi-cell scenario, respectively. Moreover, the downlink achievable rate maximization problem was studied in [19] for a wideband orthogonal frequency division multiplexing (OFDM) system. Besides, channel capacity optimization problems were studied in [16] with a single RIS and in [15] with multiple RISs, respectively, and were then extended to the millimeter-wave (mmWave) environment in [17], [18].

However, the above studies [11]-[19] focus on the joint design of the beamformer at the BS and the phase shift matrix induced by the RIS under the assumption that channel state information (CSI) is perfectly known at the BS, which is not practical for the RIS aided wireless communication system. Compared with traditional active-device (e.g., AF relay) aided communication systems, channel estimation in the RIS aided system is quite a challenging problem. This is because in an active-device aided communication system, the CSI can be estimated by enabling the active devices to send training sequences. However, the RIS is a passive device with a large number of passive reflective elements, which cannot perform active transmitting/receiving and signal processing. Thus, the CSI with such a large number of unknown parameters can only be estimated at the active BS or users, which makes the channel estimation quite difficult [4]. This motivates us to find innovative channel estimation methods to tackle these new challenges.

Recently, there has been some literature investigating channel estimation for RIS aided single-user communication systems [20]-[25]. Specifically, the binary reflection method was proposed in [20], [21], where the RIS turns on each reflective element successively while keeping the rest of the reflective elements closed. The BS then successively estimates the cascaded channel, which consists of the channels from the BS to one typical reflective element and from this element to the users. In [22], a minimum variance unbiased channel estimator was proposed by turning on all reflective elements in the entire training period, where the optimal phase shift matrix induced by the RIS was shown to be a discrete Fourier transform matrix. This method was further extended in [23], [24], where the authors assume that the surface can be divided into multiple sub-surfaces, each consisting of adjacent reflective elements sharing a common reflection coefficient. However, the training overhead of the above methods in [20]-[24] scales with the number of reflective elements (or sub-surfaces), which causes intractable training overhead and degrades the spectrum efficiency. In [25], some active elements are randomly deployed at the RIS to perform channel estimation, and the full CSI is then recovered from the CSI estimated at the active elements by exploiting channel sparsity. This method can indeed reduce the training overhead, but it also increases the hardware cost and complexity due to the deployment of active elements.

Motivated by the above reasons, in this paper we study channel estimation for the RIS aided multi-user MIMO communication system operated in time division duplex (TDD) mode. To highlight the main contributions, we summarize the paper as follows:

• We propose a novel uplink channel estimation protocol and apply the compressive sensing (CS) technique to estimate the cascaded channels of the RIS aided multi-user MIMO system with limited training overhead. Specifically, we first investigate the sparsity representation of the cascaded channels. Since the BS and the RIS are usually mounted at a height, there are only limited scatterers around the BS and the RIS. This indicates that the cascaded channel has only a few angle-of-arrival (AoA) and angle-of-departure (AoD) array steering vectors, and thus it can be represented by a row-column-block sparse channel matrix. This specific sparsity structure is quite different from that of conventional (mmWave) MIMO communication systems, whose sparse channel matrix usually exhibits only row-block sparsity, because there are limited scatterers at the BS but rich scatterers at the receivers [26]-[28].

• We further find that the cascaded channels of all users have a common row-column-block sparsity structure due to the common channel from the BS to the RIS. However, the conventional CS-based channel estimators, i.e., single measurement vector (SMV) [29] and multiple measurement vectors (MMV) [26]-[28], recover the sparse channel matrix of each user individually without considering this specific common row-column-block sparsity. Hence, these estimators usually require more training overhead to guarantee recovery performance when the sparsity level (the number of spatial paths) increases.

• To avoid the drawbacks of the conventional estimators, we apply the common row-column-block sparsity to jointly estimate the cascaded channels. Specifically, if we recover the sparse matrix by considering the common row-column-block sparsity simultaneously, we need to quantize the AoA and the AoD with high resolutions to reduce the quantization errors caused by the discrete grid of AoAs/AoDs, which leads to intractable computational complexity. To deal with this issue, we propose the following two-step procedure based multi-user joint channel estimator. In the first step, we exploit the common column-block sparsity to estimate the common subspace spanned by the AoD array steering vectors. We then project the received signals onto this common AoD subspace, which reduces the number of zero columns of the original sparse matrix and transforms it into a row-block sparse matrix. Hence, this procedure achieves lower complexity due to fewer unknown columns in the sparse matrix, lower quantization error because the AoD is not quantized, and higher SNR because the noise on the null space of the common AoD subspace is suppressed. In the second step, we exploit the common row-block sparsity to formulate an MMV-based multi-user joint sparse matrix recovery problem. Since the received signals of all users are used to recover the sparse matrix jointly, a better recovery performance can be achieved.

• Since the optimization variables are coupled in the formulated multi-user joint sparse matrix recovery problem, which is non-convex and hard to solve, we propose an approach based on the principles of alternating optimization and the iterative reweighted algorithm to solve it efficiently. Besides, we analyze the convergence of the proposed algorithm. Moreover, we design a training reflection coefficient sequence optimization method based on minimizing the mutual coherence of the equivalent dictionary. Finally, simulation results validate the effectiveness of the proposed estimation scheme.

The rest of this paper is organized as follows. Section II presents the system model and channel estimation protocol for the RIS aided multi-user MIMO system. Section III investigates the common sparsity of the cascaded channels and shows the drawbacks of the conventional CS-based techniques. Section IV studies the two-step procedure based multi-user joint channel estimator and Section V provides its detailed solution. Section VI studies the training reflection coefficient optimization method. Section VII provides simulation results to validate the effectiveness of the proposed scheme. Finally, Section VIII concludes the paper.

Notations: Scalars, vectors, and matrices are denoted by lowercase, bold lowercase, and bold uppercase letters, i.e., a, a, and A, respectively. (·)^T, (·)^H, Tr(·), and rank(·) denote transpose, conjugate transpose, trace, and rank, respectively. A† is the Moore-Penrose pseudoinverse of A. [a]_i and A_{i,j} denote the i-th component of vector a and the (i, j)-th component of matrix A, respectively. A_{:,i} and A_{:,Ω} denote the i-th column of A and the sub-matrix consisting of the columns of A with indices in the set Ω. I_M denotes the M-by-M identity matrix and 1_{M×N} denotes the M-by-N all-ones matrix. span(A) is the space spanned by the columns of A. Besides, CN(µ, σ²) denotes the circularly symmetric complex Gaussian (CSCG) distribution with mean µ and variance σ².

Fig. 1: A RIS aided multi-user MIMO system consisting of one BS with M antennas, one RIS with L reflective elements, and K single-antenna users. (F: channel response from the BS to the RIS; h_k^H: channel response from the RIS to U_k.)

Fig. 2: Channel estimation protocol and frame structure.

II. SYSTEM MODEL

A. Cascaded Channel Model

As shown in Fig. 1, we consider a block-fading multi-user MIMO system operated in TDD mode, where a BS, with the assistance of a RIS, serves K users. The BS and the RIS are equipped with M antennas and L reflective elements, respectively, while each user is equipped with a single antenna. The users are denoted by U_1, ..., U_K. The channel responses from the BS to the RIS and from the RIS to U_k are denoted by F ∈ C^{L×M} and h_k^H ∈ C^{1×L}, respectively. The reflective channel at the RIS is usually referred to as the dyadic backscatter channel, where each reflective element combines all the received signals and then transmits them to U_k by reflection, acting as a point source. Thus, the reflection coefficient channel matrix [13] is given by V_o = diag(v) ∈ C^{L×L} with v = [v_1, v_2, ..., v_L]^T ∈ C^{L×1}, where v_l = e^{jϑ_l} is the reconfigurable reflection coefficient of the l-th reflective element.

Next, the whole channel response from the BS to U_k through the RIS is denoted by h_k^H V_o F ∈ C^{1×M}. Hence, the transmission quality of the intended incident-reflection signal can be enhanced by adaptively adjusting V_o at the RIS. However, the joint design of the beamformer at the BS and the reflection coefficient matrix V_o at the RIS for downlink data transmission requires the CSI of both h_k^H and F. Moreover, the RIS has a large number of reflective elements, which are passive devices without active transmitting/receiving and signal processing abilities. Hence, it is quite a challenging problem to achieve channel estimation in the studied system.

Therefore, we propose the following method to tackle these new challenges. Specifically, since V_o = diag(v) and h_k^H diag(v) = v^T diag(h_k^H), it is straightforward to see that the joint beamformer and reflection coefficient design depends only on the CSI of the following cascaded channel [4], [10]:

    G_k = diag(h_k^H) F ∈ C^{L×M}.    (1)

In the subsequent parts, we propose a novel uplink channel estimation protocol for the studied system and use the least square (LS) method [22] to estimate the cascaded channel G_k.

B. Channel Estimation Protocol

For TDD systems, the CSI of the downlink channel can be obtained by estimating the CSI of the uplink channel due to channel reciprocity. Since L ≥ M ≥ K holds in most wireless communication systems, in this paper we propose the following uplink channel estimation protocol.

To begin with, note that the whole channel response from the BS to U_k can be expressed as v^T diag(h_k^H) F = v^T G_k ∈ C^{1×M}. Hence, in order to separate v and G_k from the received training signals, we need to obtain enough individual observations with different reflection coefficient vectors v. Moreover, each user needs to transmit an orthogonal pilot sequence so that G_k can be separated for each user without interference.

Inspired by the above characteristics, we propose the estimation protocol shown in Fig. 2. Specifically, the frame structure is divided into two phases, i.e., Phase-I for uplink channel estimation and Phase-II for downlink data transmission. In this paper, we only focus on the uplink channel estimation of G_k in Phase-I. In particular, Phase-I consists of B sub-frames and each sub-frame consists of T symbol durations (T ≥ K). Specifically, the RIS keeps T copies of the training reflection coefficient vector v_b = [v_{b,1}, v_{b,2}, ..., v_{b,L}]^H ∈ C^{L×1} in the b-th sub-frame, and then adjusts the reflection coefficients so that they differ across sub-frames. U_k transmits B copies of the k-th orthogonal pilot sequence over the B sub-frames, where each pilot sequence has length T, i.e., s_k^H = [s_{k,1}, s_{k,2}, ..., s_{k,T}] ∈ C^{1×T} with s_{k_1}^H s_{k_2} = 0 for 1 ≤ k_1, k_2 ≤ K and k_1 ≠ k_2.

Specifically, in the b-th sub-frame, the T received pilot signals at the BS, i.e., Y_b ∈ C^{M×T}, can be written as

    Y_b = Σ_{k=1}^{K} F^H diag(v_b) h_k s_k^H + U_b  (a)=  Σ_{k=1}^{K} G_k^H v_b s_k^H + U_b,    (2)

where (a) follows from G_k = diag(h_k^H) F. Note that s_k^H s_k = PT is the transmit energy constraint for the training sequence of U_k, and P is the transmit power of each user. U_b ∈ C^{M×T} is the received Gaussian noise, assumed to satisfy U_b ~ CN(0, δ² I_M).
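To make the protocol concrete, the following Python sketch (numpy only; the dimensions, DFT-based pilots, and random unit-modulus training vectors are illustrative assumptions, not the paper's settings) generates the received pilot blocks Y_b of (2) and despreads them per user, anticipating the correlation step that is formalized in (3)-(4) of the next subsection.

import numpy as np

rng = np.random.default_rng(0)
M, L, K, T, B, P, sigma2 = 8, 32, 4, 4, 40, 1.0, 0.1          # illustrative sizes only

F  = (rng.standard_normal((L, M)) + 1j*rng.standard_normal((L, M))) / np.sqrt(2)   # BS -> RIS
hk = (rng.standard_normal((K, L)) + 1j*rng.standard_normal((K, L))) / np.sqrt(2)   # RIS -> U_k
Gk = np.stack([np.diag(hk[k].conj()) @ F for k in range(K)])                        # cascaded channels, eq. (1)

S  = np.sqrt(P) * np.exp(2j*np.pi*np.outer(np.arange(K), np.arange(T)) / T)         # orthogonal pilots, s_k^H s_k = P*T
V  = np.exp(2j*np.pi*rng.random((L, B)))                                            # unit-modulus training vectors v_b

Y = np.empty((B, M, T), dtype=complex)
for b in range(B):
    U_b = np.sqrt(sigma2/2) * (rng.standard_normal((M, T)) + 1j*rng.standard_normal((M, T)))
    Y[b] = sum(Gk[k].conj().T @ V[:, b:b+1] @ S[k:k+1, :] for k in range(K)) + U_b   # eq. (2)

# per-user despreading: y_tilde[k,:,b] = (1/(P*T)) * Y_b s_k = G_k^H v_b + noise, cf. (3)
Y_tilde = np.einsum('bmt,kt->kmb', Y, S.conj()) / (P*T)
print(Y_tilde.shape)   # (K, M, B)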

C. Conventional LS Estimator

With the above estimation protocol, we can apply the conventional LS estimator to estimate the cascaded channel. Specifically, since all s_k^H in (2) are orthogonal pilot sequences, we have

    ỹ_{b,k} ≜ (1/(PT)) Y_b s_k = G_k^H v_b + u_{b,k},    (3)

where u_{b,k} = (1/(PT)) U_b s_k ∈ C^{M×1}. Let Ỹ_k = [ỹ_{1,k}, ỹ_{2,k}, ..., ỹ_{B,k}] ∈ C^{M×B}, V = [v_1, v_2, ..., v_B] ∈ C^{L×B}, and Ũ_k = [u_{1,k}, u_{2,k}, ..., u_{B,k}] ∈ C^{M×B}. Then, we can rewrite (3) in the following matrix form:

    Ỹ_k = G_k^H V + Ũ_k.    (4)

Using the conventional LS channel estimator [30], the cascaded channel is estimated by

    Ĝ_k = (Ỹ_k V†)^H,    (5)

where V† = V^H (V V^H)^{-1}.

It is worth noting that the above LS estimator in (5) requires B ≥ L, which causes intractable training overhead when the RIS is equipped with a large number of reflective elements. This motivates us to investigate efficient channel estimation schemes that reduce the training overhead.

III. CASCADED CHANNEL SPARSITY MODEL

In this section, we recognize that the cascaded channels are typically sparse, which motivates us to apply CS-based techniques to achieve robust channel estimation with limited training overhead. Specifically, we first investigate the sparsity representation of the cascaded channel for an individual user. Then, we analyze the specific sparsity structure of the cascaded channel and the drawbacks of the conventional CS-based techniques in the studied scenario. Finally, we study the common sparsity structure that will be used to design the multi-user joint channel estimator.

A. Individual User Channel Sparsity Representation

In this part, we investigate the sparsity representation of the cascaded channel in the RIS aided communication system. Assume that the BS and the RIS are each equipped with a uniform linear array (ULA). By applying the physical propagation structure of the wireless channel [26], the channels h_k and F are given by

    F = sqrt(LM/N_f) Σ_{p=1}^{N_f} α_p a_L((2ϖ/ρ) sin(φ_p^AoA)) a_M^H((2ϖ/ρ) sin(φ_p^AoD)),    (6)

    h_k = sqrt(L/N_{h,k}) Σ_{q=1}^{N_{h,k}} β_{k,q} a_L((2ϖ/ρ) sin(ϕ_{k,q})),    (7)

respectively, where α_p and β_{k,q} denote the complex gains of the p-th spatial path between the BS and the RIS and of the q-th spatial path between the RIS and U_k, respectively. φ_p^AoD and φ_p^AoA are the p-th AoD from the BS and the p-th AoA to the RIS, respectively, and ϕ_{k,q} is the q-th AoD from the RIS to U_k. N_f is the number of spatial paths between the BS and the RIS, and N_{h,k} is the number of spatial paths between the RIS and U_k. ϖ is the antenna spacing and ρ is the carrier wavelength, and we set ϖ/ρ = 1/2 for simplicity. a_X(ϕ) ∈ C^{X×1} is the array steering vector, i.e.,

    a_X(ϕ) = (1/√X) [1, e^{jπϕ}, ..., e^{jπϕ(X−1)}]^H.    (8)

From (6) and (7), G_k = diag(h_k^H) F can be rewritten as

    G_k = sqrt(L²M/(N_f N_{h,k})) Σ_{p=1}^{N_f} Σ_{q=1}^{N_{h,k}} α_p β_{k,q} a_L(sin(ϕ_{k,q}) − sin(φ_p^AoA)) a_M^H(sin(φ_p^AoD)).    (9)

Note that a_L(sin(ϕ_{k,q}) − sin(φ_p^AoA)) is the array steering vector of the (p, q)-th cascaded angle at the RIS/user side, which is simply called the cascaded AoA in the following for brevity.

To design the CS-based channel estimator, we approximate the cascaded channel in (9) by using the virtual angular domain (VAD) representation, i.e.,

    G_k = A_R X_k A_T^H,    (10)

where A_R ∈ C^{L×G_r} and A_T ∈ C^{M×G_t} are the dictionary matrices for the angular domain with angular resolutions G_r and G_t, respectively, i.e., the columns of A_R and A_T are the array steering vectors corresponding to one specific cascaded AoA at the RIS/user side and one specific AoD at the BS side, respectively,¹ i.e.,

    A_R = [a_L(−1), a_L(−1 + 2/G_r), ..., a_L(1 − 2/G_r)],    (11)
    A_T = [a_M(−1), a_M(−1 + 2/G_t), ..., a_M(1 − 2/G_t)].    (12)

Besides, X_k ∈ C^{G_r×G_t} is the angular-domain sparse channel matrix, whose non-zero (i, j)-th component corresponds to the complex gain on the channel consisting of the i-th cascaded AoA array steering vector at the RIS/user side and the j-th AoD array steering vector at the BS side.

¹Although sin(ϕ_{k,q}) − sin(φ_p^AoA) ∈ [−2, 2], we only need to quantize sin(ϕ_{k,q}) − sin(φ_p^AoA) over the domain [−1, 1] due to the 2-periodicity of the term e^{−jπ(sin(ϕ_{k,q})−sin(φ_p^AoA))}.
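As an illustration of the VAD representation, the following sketch (grid sizes and the placed entries of X_k are arbitrary assumptions for demonstration) builds the steering vector of (8) and the dictionaries A_R and A_T of (11)-(12), and synthesizes an on-grid channel through (10) with a row-column-block sparse X_k.

import numpy as np

def steering(X, phi):
    """Array steering vector a_X(phi) of (8)."""
    return np.exp(-1j*np.pi*phi*np.arange(X)) / np.sqrt(X)

def dictionary(X, G):
    """Dictionary with G uniformly spaced grid points on [-1, 1), cf. (11)-(12)."""
    grid = -1 + 2*np.arange(G)/G
    return np.stack([steering(X, g) for g in grid], axis=1), grid

L, M, Gr, Gt = 32, 8, 64, 16
A_R, grid_r = dictionary(L, Gr)
A_T, grid_t = dictionary(M, Gt)

# an on-grid toy channel with one AoD column and two cascaded-AoA rows
X_k = np.zeros((Gr, Gt), dtype=complex)
X_k[5, 3]  = 1.2 - 0.4j
X_k[20, 3] = -0.7 + 0.9j
G_k = A_R @ X_k @ A_T.conj().T                     # VAD synthesis, eq. (10)
print(np.count_nonzero(np.abs(X_k).sum(axis=1)), "non-zero rows,",
      np.count_nonzero(np.abs(X_k).sum(axis=0)), "non-zero column")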

Fig. 3: Two-step procedure based multi-user joint channel estimation (first step: common AoD subspace estimation and signal projection; second step: multi-user joint channel estimation).
B. Sparsity Structure Analysis and Related CS-based Techniques

In this part, we analyze the sparsity structure of the cascaded channel and show the drawbacks of applying the conventional CS-based techniques to recover the sparse channel matrix X_k in RIS aided communications.

1) Sparsity Structure Analysis: Since both the BS and the RIS are usually mounted at a height to assist the communications between the BS and the users, there are only limited scatterers around the BS and the RIS. This indicates that there are only a few AoDs and cascaded AoAs, i.e., both N_f and N_{h,k} should be small. Hence, X_k has a row-column-block sparsity structure, i.e., only a few column/row vectors of X_k are non-zero, as shown in Fig. 3. This specific structure of the RIS aided communication system is quite different from that of the conventional (mmWave) MIMO communication system, whose sparse channel matrix usually has only row-block sparsity. This is because there are only limited scatterers at the BS but rich scatterers at the users in conventional (mmWave) communications [26]-[28]. In that case, the sparse channel matrix contains only a few AoD array steering vectors, but each of them corresponds to all AoA array steering vectors, so the matrix has a row-block sparsity structure.

2) Related CS-based Techniques: From (4) and the sparsity representation of the cascaded channels in (10), the received signal can be rewritten as

    Ỹ_k^H = V^H G_k + Ũ_k^H = V^H A_R X_k A_T^H + Ũ_k^H.    (13)

With (13), we can formulate the channel estimation problem as a sparse channel matrix X_k recovery problem using CS-based techniques. One straightforward approach is to ignore the block sparsity structure of the sparse channel matrix and to vectorize the signal Ỹ_k^H and the sparse channel matrix X_k directly. The problem can then be formulated as a conventional SMV-based recovery problem, which can be solved efficiently by applying the orthogonal matching pursuit (OMP) algorithm [27].

However, the above SMV-based channel estimator has the following disadvantages: 1) it usually requires more training overhead to guarantee the estimation performance when the sparsity level increases, since it ignores the block sparsity structure; and 2) it needs to discretize both the AoAs and the AoDs on a grid due to the VAD representation in (10). Hence, in order to reduce the quantization errors of the AoA and the AoD, both G_r and G_t should be much larger than the number of antennas at the BS or the number of reflective elements at the RIS, which introduces intractable computational complexity.

Instead of ignoring the block sparsity entirely, a better method is to ignore only the column-block sparsity and use X̃_k = X_k A_T^H ∈ C^{G_r×M} as a new sparse channel matrix. Then, X̃_k has a row-block sparsity structure and the problem can be formulated as a conventional MMV-based recovery problem [26]-[28], which can be solved efficiently by applying the simultaneous OMP (SOMP) algorithm [26], [27], as shown in Algorithm 1. In particular, this estimator only needs to recover the AoA array steering vectors with X̃_k, so the dimension of the optimization variables and the quantization error of the AoD representation are both reduced. Since, as mentioned above, row-block sparsity usually exists in the sparse channel matrix of conventional (mmWave) MIMO communications, the MMV-based channel estimator is usually adopted in those scenarios.

Algorithm 1 MMV-based Channel Estimator using SOMP
Input: Ỹ_k^H = V^H A_R X̃_k + Ũ_k^H, 1 ≤ k ≤ K; D = V^H A_R; and ǫ = MBδ²/(PT).
Output: Ĝ_k^mmv, 1 ≤ k ≤ K.
1: for k = 1 to K do
2:   Initialize the residual R_k^(0) = Ỹ_k^H, the index set Ω_k^(0) = ∅, and the iteration counter t = 0.
3:   repeat
4:     Set t = t + 1 and estimate the support i_k^(t) = arg max_{i ∉ Ω_k^(t−1)} ||(D_{:,i})^H R_k^(t−1)||_2².
5:     Update the index set Ω_k^(t) = Ω_k^(t−1) ∪ {i_k^(t)}.
6:     Update the residual R_k^(t) = (I − D_{:,Ω_k^(t)} (D_{:,Ω_k^(t)})†) Ỹ_k^H.
7:   until ||R_k^(t)||_2² ≤ ǫ.
8:   Denote the estimated index set Ω̂_k = Ω_k^(t). The channel is estimated by Ĝ_k^mmv = A_R X̃̂_k, where X̃̂_k is zero except for the rows indexed by Ω̂_k, which are given by (D_{:,Ω̂_k})† Ỹ_k^H.
9: end for

However, applying the above conventional SMV- and MMV-based estimators to the RIS aided multi-user communication system usually leads to a performance loss, because the specific row-column-block sparsity structure is ignored. This motivates us to redesign a channel estimator for the studied system that takes advantage of both the row- and the column-block sparsity.
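A compact Python sketch of the SOMP recovery used in Algorithm 1 is given below (a minimal implementation under the stated model; the max_atoms cap is an added safeguard and the variable names are illustrative, not the paper's).

import numpy as np

def somp(Y, D, eps, max_atoms=None):
    """Simultaneous OMP for Y ~= D @ X_tilde with a row-sparse X_tilde (cf. Algorithm 1).
    Y   : (B, M) measurements, here Ytilde_k^H
    D   : (B, Gr) equivalent dictionary, D = V^H A_R
    eps : stopping threshold on the squared residual norm
    """
    max_atoms = max_atoms or D.shape[1]
    R, support = Y.copy(), []
    X_s = np.zeros((0, Y.shape[1]), dtype=complex)
    while np.linalg.norm(R)**2 > eps and len(support) < max_atoms:
        corr = np.linalg.norm(D.conj().T @ R, axis=1)      # per-atom correlation energy
        corr[support] = 0.0                                 # do not reselect chosen atoms
        support.append(int(np.argmax(corr)))                # step 4: support update
        Ds = D[:, support]
        X_s = np.linalg.pinv(Ds) @ Y                        # LS coefficients on the support
        R = Y - Ds @ X_s                                    # step 6: residual update
    X_tilde = np.zeros((D.shape[1], Y.shape[1]), dtype=complex)
    X_tilde[support, :] = X_s
    return X_tilde, support

# usage: X_tilde_k, supp = somp(Ytilde_k_H, V.conj().T @ A_R, eps=M*B*sigma2/(P*T))
#        G_hat_k = A_R @ X_tilde_k                           # step 8 of Algorithm 1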

C. Common Channel Sparsity Representation of Multiple Users

It is worth noting that the cascaded channels have a common block sparsity structure brought by the common channel between the BS and the RIS. Therefore, another main issue of the above conventional estimators is that they recover the sparse channel matrix of each user individually without considering this common sparsity.

Hence, in this subsection, we investigate the common block sparsity, which, together with the row-column-block sparsity, will be applied to design a multi-user joint channel estimator in Section IV.

1) Common Column-Block Sparsity due to Common Scatterers at the BS: The cascaded channels share a common channel whose spatial paths departing from the BS toward the RIS go through the same scatterers. Hence, the AoD array steering vectors of all cascaded channels should be identical. Specifically, let the AoD index set corresponding to the non-zero columns of X_k be Ω_{D,k}, where each element is associated with a typical AoD array steering vector at the BS side. Then, from (1) and (9), we have

    Ω_{D,1} = Ω_{D,2} = ... = Ω_{D,K} = Ω_D,    (14)

where Ω_D is defined as the common AoD support index set.

2) Common Row-Block Sparsity due to the Scaling Property: Since the cascaded AoAs of each user are different, there is no straightforward common row-block sparsity. Hence, in this part, we rewrite the VAD representation of G_k to establish the common row-block sparsity.

First, we face the following challenge in establishing the common row-block sparsity.

Challenge 1: Denoting the VAD representation of F by F = A_R X_F A_T^H, we have G_k = diag(h_k^H) A_R X_F A_T^H. Then, if h_k^H were known, we could jointly estimate the common sparse channel matrix X_F shared by all cascaded channels. However, it is challenging to obtain h_k^H since there exists an ambiguity: we cannot separate diag(h_k^H) and F from G_k.

To deal with Challenge 1 and obtain the common row-column-block sparsity representation, we propose the following joint scaling property. Specifically, we observe that the cascaded channels through one arbitrary reflective element of all users are the common channel scaled by different scalars, i.e., G_k^{l,:} = [h_k^H]_l F^{l,:}. Hence, each element in the l-th row of G_{k_1} divided by the corresponding element in the l-th row of G_{k_2} gives the same scaling factor [h_{k_1}^H]_l / [h_{k_2}^H]_l, i.e.,

    [h_{k_1}^H]_l / [h_{k_2}^H]_l = G_{k_1}^{l,1} / G_{k_2}^{l,1} = G_{k_1}^{l,2} / G_{k_2}^{l,2} = ... = G_{k_1}^{l,M} / G_{k_2}^{l,M}.    (15)

This is called the joint scaling property, and it implies that all cascaded channels can be represented by one arbitrary cascaded channel with different scalars. For example, we can use the cascaded channel of U_1, i.e., G_1, to represent the cascaded channels of the remaining users, i.e.,

    G_k = α_k G,    (16)

where the subscript "1" of G_1 is dropped for notational simplicity, i.e., G = G_1, and

    α_k = diag([h_k^H]_1/[h_1^H]_1, [h_k^H]_2/[h_1^H]_2, ..., [h_k^H]_L/[h_1^H]_L) ∈ C^{L×L}    (17)

is referred to as the scaling matrix. From (16), all cascaded channels share the common channel part G, which means that all cascaded channels have the following common row-block sparse VAD representation:

    G_k = α_k A_R X A_T^H,    (18)

where the subscript "1" of X_1 is dropped for notational simplicity, i.e., X = X_1.

From (18), we know that all G_k have the common sparse matrix X, and we have thus established the common row-block sparsity. Note that the main reason we use α_k G, instead of diag(h_k^H) F, to establish the common row-block sparsity is that (15) can be used to separate α_k and G from G_k, which resolves Challenge 1, whereas the ambiguity cannot be resolved using the term diag(h_k^H) F. This fact is important for the initialization of the alternating optimization algorithm that solves the multi-user joint channel estimator designed in the next section.
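The joint scaling property (15)-(17) is easy to check numerically; the sketch below (illustrative dimensions, noiseless channels) builds K cascaded channels from a shared F, recovers each scaling matrix α_k from the cascaded channels alone via row-wise ratios, and verifies (16).

import numpy as np

rng = np.random.default_rng(1)
L, M, K = 16, 8, 3
F = (rng.standard_normal((L, M)) + 1j*rng.standard_normal((L, M))) / np.sqrt(2)
h = (rng.standard_normal((K, L)) + 1j*rng.standard_normal((K, L))) / np.sqrt(2)
G = np.stack([np.diag(h[k].conj()) @ F for k in range(K)])    # G_k = diag(h_k^H) F

# scaling matrix of (17), alpha_k = diag([h_k^H]_l / [h_1^H]_l), recovered here
# from the cascaded channels only by averaging the row-wise ratios of (15)
alpha = [np.diag(np.mean(G[k] / G[0], axis=1)) for k in range(K)]

for k in range(K):
    assert np.allclose(G[k], alpha[k] @ G[0])                 # eq. (16): G_k = alpha_k G_1
print("joint scaling property verified for", K, "users")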

IV. TWO-STEP PROCEDURE BASED MULTI-USER JOINT CHANNEL ESTIMATOR

A. Overview of the Two-Step Procedure

To begin with, it is worth noting that if we applied the common row-block sparsity and the common column-block sparsity simultaneously to formulate a sparse channel matrix recovery problem, we would need to quantize the AoA and the AoD simultaneously with high resolutions to reduce the quantization errors caused by the VAD representation, which leads to intractable complexity.

To deal with this issue, we propose the following two-step procedure based multi-user joint channel estimator, which exploits the common column-block and row-block sparsity in turn.

• In the first step, we exploit the common column-block sparsity to jointly estimate the common subspace spanned by the AoD array steering vectors. This is because, as shown in Section IV-B, it is sufficient to represent G_k using an arbitrary basis of N_f vectors for the subspace spanned by the AoD array steering vectors. We then project the received signals onto the common AoD subspace, which reduces the number of zero columns of X_k from G_t to N_f. This projection reduces the number of columns of the sparse matrix (lower complexity), avoids quantizing the AoD (lower quantization error), and achieves a higher SNR by suppressing the noise on the null space of the common AoD subspace.

• In the second step, we exploit the common row-block sparsity to design an MMV-based multi-user joint sparse matrix recovery problem, in which we recover the cascaded AoA array steering vectors together with the common sparse channel matrix and the scaling matrices α_k jointly. Since the received signals of all users are used to recover the sparse matrix jointly, a better recovery performance can be achieved due to the increased number of measurements.

B. First Step: Subspace Estimation and Signal Projection

In this subsection, we propose the common AoD subspace estimation method and project the signals onto this subspace.

To begin with, denote the common AoD subspace by span(A_T^{:,Ω_D}), which is the linear span of the N_f AoD array steering vectors, i.e., A_T^{:,Ω_D} ∈ C^{M×N_f}, where Ω_D is the common AoD index set defined in (14). Note that if we applied CS-based methods to estimate the exact N_f AoD array steering vectors A_T^{:,Ω_D} over a discrete grid of AoDs, and then projected the signals onto the subspace spanned by the estimated AoD array steering vectors, there would be both quantization error and estimation error, and both would degrade the estimation performance in the second step.

In fact, we do not need to estimate the exact N_f AoD array steering vectors A_T^{:,Ω_D} to represent the common AoD subspace; we only need to estimate N_f vectors S_k ∈ C^{M×N_f} whose linear span includes the common AoD subspace, i.e., span(A_T^{:,Ω_D}) ⊆ span(S_k). Then, it is clear that S_k is sufficient to represent A_T^{:,Ω_D} in the sparse representation of G_k, i.e.,

    G_k = A_R X_k A_T^H = A_R X_k^{:,Ω_D} (A_T^{:,Ω_D})^H  (a)=  A_R X_k^{:,Ω_D} M^H S_k^H  (b)=  A_R X̄_k S_k^H,    (19)

where in (a) we use the fact that there exists a matrix M ∈ C^{N_f×N_f} such that A_T^{:,Ω_D} = S_k M whenever span(A_T^{:,Ω_D}) ⊆ span(S_k), and in (b) we define X̄_k = X_k^{:,Ω_D} M^H ∈ C^{G_r×N_f}, which is a row-block sparse matrix with N_f columns. Specifically, S_k can be estimated by the eigenvalue decomposition of the signal covariance matrix without requiring a discrete grid of AoDs, and we do not need to estimate the linear transform matrix M in A_T^{:,Ω_D} = S_k M, because only the combined row-block sparse matrix X̄_k has to be estimated in the second step.

Therefore, we can reduce the number of zero columns by projecting the signal onto the estimated subspace span(S_k), without estimating the exact AoD array steering vectors A_T^{:,Ω_D}.

1) Subspace Estimation: In this part, we propose the following lemmas to estimate the common AoD subspace.

Lemma 4.1: Denote Y = [Y_1, Y_2, ..., Y_B] ∈ C^{M×BT}. Then, Ĉ = (1/(BT)) Y Y^H is a sufficient statistic for estimating the AoD subspace span(A_T^{:,Ω_D}) as K → ∞.

Proof: Please refer to Appendix A-A.

Lemma 4.2: For K → ∞, by maximizing the likelihood function of Ĉ for a given N_f, the optimal subspace estimate is the linear span of the eigenvectors corresponding to the N_f largest eigenvalues of Ĉ, i.e.,

    S_k = [S^{:,1}, S^{:,2}, ..., S^{:,N_f}],    (20)

where S Θ S^H is the eigenvalue decomposition of Ĉ, and Θ = diag([θ_1, θ_2, ..., θ_M]) ∈ C^{M×M} is the eigenvalue matrix with the eigenvalues θ_m sorted in decreasing order of magnitude.

Proof: Please refer to Appendix A-B.

Note that the decision on the number of AoD array steering vectors, i.e., N_f, is a model selection problem, which can be addressed by information theoretic criteria [31]. There are several well-known decision rules, such as the likelihood ratio [32], Akaike's information criterion (AIC), and the minimum description length (MDL) [33]. In this paper, we adopt the MDL scheme [33] to estimate N_f, i.e.,

    N̂_f = arg min_n  { − log [ ( (Π_{i=n+1}^{M} θ_i)^{1/(M−n)} ) / ( (1/(M−n)) Σ_{i=n+1}^{M} θ_i ) ]^{(M−n)BT} + 2n(2M − n) }.    (21)

2) Signal Projection: By applying (19), we can project the received signals onto the common AoD subspace span(S_k), i.e.,

    Ȳ_k^H ≜ Ỹ_k^H (S_k^H)† = V^H A_R X̄_k S_k^H (S_k^H)† + Ũ_k^H (S_k^H)†  (a)=  V^H A_R X̄_k + Ũ_k^H (S_k^H)†,    (22)

where (a) is due to S_k^H (S_k^H)† = I_{N_f}. Comparing (13) and (22), the noise on the null space of span(S_k) has been removed, and the noise power is reduced from E(||Ũ_k||_F²) = MBδ²/(PT) to E(||Ũ_k^H (S_k^H)†||_F²) = N_f Bδ²/(PT).
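The first step of the proposed estimator reduces to an eigenvalue decomposition plus a model-order test. A minimal numpy sketch is given below; the MDL penalty follows (21) as printed above, all sizes are illustrative, and the small eigenvalue floor is an added numerical safeguard.

import numpy as np

def estimate_aod_subspace(Y_blocks):
    """First step of Algorithm 2: sample covariance, EVD, and MDL order estimate, cf. (20)-(21).
    Y_blocks: (B, M, T) received pilot blocks Y_b."""
    B, M, T = Y_blocks.shape
    Y = Y_blocks.transpose(1, 0, 2).reshape(M, B*T)             # Y = [Y_1, ..., Y_B]
    C_hat = Y @ Y.conj().T / (B*T)                               # Lemma 4.1
    theta, S = np.linalg.eigh(C_hat)                             # ascending eigenvalues
    theta, S = theta[::-1], S[:, ::-1]                           # sort in decreasing order
    mdl = []
    for n in range(M - 1):
        tail = np.maximum(theta[n:], 1e-12)
        ratio = np.exp(np.mean(np.log(tail))) / np.mean(tail)    # geometric / arithmetic mean
        mdl.append(-(M - n)*B*T*np.log(ratio) + 2*n*(2*M - n))   # cf. (21)
    N_f_hat = int(np.argmin(mdl))
    return S[:, :N_f_hat], theta, N_f_hat

# signal projection, cf. (22):
# S_k, _, N_f_hat = estimate_aod_subspace(Y)
# Ybar_k_H = Ytilde_k_H @ np.linalg.pinv(S_k.conj().T)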

C. Second Step: Multi-User Joint Sparse Matrix Recovery

Specifically, applying the common row-block sparsity (16), we can rewrite (22) as

    Ȳ_k^H = V^H α_k G (S_k^H)† + Ũ_k^H (S_k^H)† = V^H α_k A_R X̄ + Ũ_k^H (S_k^H)†.    (23)

In order to estimate α_k and the true cascaded AoAs associated with the sparse channel matrix X̄ from (23), we can formulate the following MMV-based multi-user joint sparse matrix recovery problem:

    min_{X̄, α_k}  || diag(X̄ X̄^H) ||_0 = Σ_{i=1}^{G_r} || x̄_i^H x̄_i ||_0
    s.t.  || Ȳ_k^H − V^H α_k A_R X̄ ||_2² ≤ ε̄,  1 ≤ k ≤ K,    (24)

where x̄_i^H is the i-th row vector of X̄, and ε̄ ≥ N_f Bδ²/(PT) is a tolerance upper bound related to the combined noise.

Denote the solutions of problem (24) by X̄̂ and α̂_k, respectively. Then, the cascaded channels can be estimated by

    Ĝ_k^mjce = α̂_k A_R X̄̂ S_k^H.    (25)

Finally, the detailed steps of the above two-step procedure based multi-user joint channel estimation are summarized in Algorithm 2.

Algorithm 2 Two-Step Procedure (Subspace) based Multi-User Joint Channel Estimation (S-MJCE)
Input: Ỹ_k^H, 1 ≤ k ≤ K.
Output: Ĝ_k^mjce, 1 ≤ k ≤ K.
1: First Step: Subspace Estimation and Signal Projection
   • Subspace estimation:
     Compute the covariance matrix Ĉ = (1/(BT)) Y Y^H.
     Apply the eigenvalue decomposition Ĉ = S Θ S^H.
     Estimate the number of AoD vectors, denoted by N̂_f, via the MDL criterion (21).
     According to Lemma 4.2, estimate the AoD subspace by the first N̂_f columns of S as in (20).
   • Signal projection:
     Project the signal as Ȳ_k^H = Ỹ_k^H (S_k^H)† = V^H A_R X̄_k + Ũ_k^H (S_k^H)†, cf. (22).
2: Second Step: Multi-User Joint Channel Estimation (MJCE)
   Solve problem (24) by Algorithm 3; the cascaded channels are then estimated by Ĝ_k^mjce = α̂_k A_R X̄̂ S_k^H.

However, Algorithm 2 requires solving the non-convex problem (24), in which the optimization variables X̄ and α_k are coupled. Hence, the conventional OMP/SOMP methods cannot be applied to the 1-norm relaxation of problem (24). In the next section, we develop an algorithm based on the principles of alternating optimization and the iterative reweighted algorithm to solve this problem efficiently.

V. SOLUTION TO THE MULTI-USER JOINT SPARSE MATRIX RECOVERY PROBLEM

A. Algorithm Development

To deal with the coupled optimization variables X̄ and α_k, we develop an alternating optimization method to solve the multi-user joint sparse matrix recovery problem (24). From [34], [35], both theoretical and experimental results have shown that the log-sum penalty function is superior to the 1-norm penalty function for sparse signal recovery. Hence, in this section, we relax the 0-norm function into the log-sum function as an alternative sparsity-promoting function, i.e.,

    min_{X̄, α_k}  Q(X̄) = Σ_{i=1}^{G_r} log(x̄_i^H x̄_i + ς)
    s.t.  || Ȳ_k^H − V^H α_k A_R X̄ ||_2² ≤ ε̄,  1 ≤ k ≤ K,    (26)

where ς > 0 is a small positive parameter that keeps the function well defined; the choice of ς is discussed in [36].

Then, by introducing non-negative penalty factors λ_k, we reformulate problem (26) as the following unconstrained optimization problem:

    min_{X̄, α_k}  L(X̄, α_k) = Q(X̄) + Σ_{k=1}^{K} λ_k || Ȳ_k^H − V^H α_k A_R X̄ ||_2².    (27)

Note that λ_k is the penalty factor balancing the tradeoff between data fitting and the sparsity of the solution. The value of λ_k should be related to the signal-to-noise ratio (SNR) and can be set as λ_k = PTd/(δ² log G_r) [36]-[38], where d is a constant scaling factor. Since the received signal of each user has the same SNR, we set λ = λ_1 = λ_2 = ... = λ_K in the following.

Then, to further deal with the coupled optimization variables, we apply alternating optimization to decouple problem (27) into the following two unconstrained subproblems:

    min_{α_k} L(X̄, α_k),    (28)

which optimizes α_k for a given X̄, and

    min_{X̄} L(X̄, α_k),    (29)

which optimizes X̄ for given α_k. We then alternately solve (28) and (29) until the objective function converges.

In the following subsections, we first develop algorithms to obtain the solutions of problem (28) and problem (29), respectively. We then provide the convergence analysis and the initialization method of the proposed algorithm.

B. Optimization of α_k

In this subsection, we provide the optimal solution of (28) for a fixed X̄. To begin with, problem (28) can be equivalently reformulated as the following K individual optimization problems:

    min_{α_k} || Ȳ_k^H − V^H α_k A_R X̄ ||_2².    (30)

Then, let z̄_k = vec(Ȳ_k^H) ∈ C^{BN_f×1} and H = (A_R X̄)^T ⊗ V^H ∈ C^{BN_f×L²}. Note that there are only L non-zero elements in the vector vec(α_k), with the index set

    Ω_α = { i + (i − 1)L | i ∈ {1, 2, ..., L} },    (31)

where |Ω_α| = L. Hence, problem (30) can be rewritten as

    min_{α_k} || z̄_k − H_{:,Ω_α} [vec(α_k)]_{Ω_α} ||_2².    (32)

Since problem (32) is a convex optimization problem, the optimal solution can be obtained by setting the first derivative of the objective to zero, which gives

    α_k = diag( (H_{:,Ω_α})† z̄_k ),    (33)

where (H_{:,Ω_α})† = ((H_{:,Ω_α})^H H_{:,Ω_α})^{-1} (H_{:,Ω_α})^H. Note that the matrix H_{:,Ω_α} must have full column rank L for (33) to hold, which requires rank(X̄) ≥ L/B.
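A direct numpy transcription of the α_k update (30)-(33) is given below (a sketch under the stated dimensions; instead of forming the full Kronecker product H, it builds only the L useful columns H_{:,Ω_α} as vectorized rank-one terms; B·N_f ≥ L is assumed, as required by the rank condition above).

import numpy as np

def update_alpha(Ybar_H, V, A_R, X_bar):
    """Closed-form update of the diagonal scaling matrix alpha_k, cf. (30)-(33).
    Ybar_H : (B, Nf) projected measurements of one user
    V      : (L, B) training reflection matrix
    A_R    : (L, Gr) cascaded-AoA dictionary
    X_bar  : (Gr, Nf) current common sparse matrix estimate
    """
    L = V.shape[0]
    W = A_R @ X_bar                                    # (L, Nf)
    Vh = V.conj().T                                    # (B, L) = V^H
    # column i of H_{:,Omega_alpha}: vec of the rank-one term V^H_{:,i} W_{i,:}
    H_cols = np.stack([np.outer(Vh[:, i], W[i, :]).flatten(order='F') for i in range(L)], axis=1)
    z = Ybar_H.flatten(order='F')                      # z_k = vec(Ybar_k^H)
    a, *_ = np.linalg.lstsq(H_cols, z, rcond=None)     # LS solution, cf. (33)
    return np.diag(a)

# usage inside the outer loop of Algorithm 3:
# alpha_k = update_alpha(Ybar_k_H, V, A_R, X_bar)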

C. Iterative Reweighted Algorithm for Optimizing X̄

In this subsection, we find the solution of (29) for fixed α_k. In fact, L(X̄, α_k) is a non-convex function, which is still hard to solve and may require intractable complexity to find the optimal solution. To find an efficient solution of the non-convex problem (29), we apply the iterative reweighted technique and solve it in an iterative manner. The basic principle of this method is to iteratively approximate the non-convex objective function by a convex function with reweighted coefficients and to solve the approximated convex problem.

Specifically, in the t-th iteration of the iterative reweighted algorithm, an upper-bound approximation of Q(X̄) around X̄(t) is given by

    Q(X̄) = Σ_{i=1}^{G_r} log(x̄_i^H x̄_i + ς)
         ≤ Σ_{i=1}^{G_r} [ (x̄_i^H x̄_i + ς)/(x̄_{i,t}^H x̄_{i,t} + ς) + log(x̄_{i,t}^H x̄_{i,t} + ς) − 1 ]
         ≜ Q^ub(X̄, X̄(t)),    (34)

where x̄_{i,t}^H is the i-th row vector of the constant matrix X̄(t), and 1/(x̄_{i,t}^H x̄_{i,t} + ς) can be regarded as the reweighting coefficient applied to the objective in the t-th iteration. This inequality can be proved by examining the first derivative and is omitted here for brevity. Then, the upper-bound approximation of L(X̄, α_k) in the t-th iteration of the iterative reweighted algorithm is given by

    L(X̄, α_k) ≤ Q^ub(X̄, X̄(t)) + λ Σ_{k=1}^{K} || Ȳ_k^H − V^H α_k A_R X̄ ||_2²  ≜ L^ub(X̄, α_k; X̄(t)).    (35)

Note that equality holds in (35) if X̄ = X̄(t).

Next, minimizing L^ub(X̄, α_k; X̄(t)) is equivalent to solving the following problem:

    min_{X̄} Tr(X̄^H Λ X̄) + λ Σ_{k=1}^{K} || Ȳ_k^H − V^H α_k A_R X̄ ||_2²,    (36)

where Λ ∈ C^{G_r×G_r} is a diagonal matrix, i.e.,

    Λ = diag( 1/(x̄_{1,t}^H x̄_{1,t} + ς), ..., 1/(x̄_{G_r,t}^H x̄_{G_r,t} + ς) ).    (37)

It is straightforward to see that problem (36) is a convex optimization problem, and the optimal solution can be obtained by setting the first derivative of the objective to zero. The optimal solution is then given by

    X̄ = ( Λ/λ + Σ_{k=1}^{K} (V^H α_k A_R)^H V^H α_k A_R )^{-1} Σ_{k=1}^{K} (V^H α_k A_R)^H Ȳ_k^H.    (38)

D. Convergence and Initialization Analysis

1) Convergence Analysis: In this part, we analyze the convergence of the proposed iterative reweighted based alternating optimization algorithm, which is summarized in Algorithm 3 with a double-loop structure. Specifically, the inner loop (steps 4 to 6) optimizes X̄, and the outer loop (steps 2 to 8) alternately optimizes X̄ and α_k. In the following, we show that L(X̄^(r+1), α_k^(r+1)) ≤ L(X̄^(r), α_k^(r)), where r is the outer-loop iteration index, which guarantees the convergence of Algorithm 3 [39].

Specifically, the proof is as follows:

    L(X̄^(r), α_k^(r))  (a)=  L^ub(X̄^(r), α_k^(r); X̄(0))
      (b)≥  min_{X̄} L^ub(X̄, α_k^(r); X̄(0)) = L^ub(X̄(1), α_k^(r); X̄(0)) ≥ ...
      (c)≥  min_{X̄} L^ub(X̄, α_k^(r); X̄(t−1)) = L^ub(X̄(t), α_k^(r); X̄(t−1))
      (d)≥  L(X̄(t), α_k^(r))  (e)=  L(X̄^(r+1), α_k^(r))
      (f)≥  min_{α_k} L(X̄^(r+1), α_k) = L(X̄^(r+1), α_k^(r+1)),    (39)

where in (a) we initialize X̄^(r) = X̄(0), so that equality holds in (35); in (b) and (c) we use the fact that X̄(t) is the optimal solution of (36); in (d) we use (35); in (e) we update X̄^(r+1) as X̄(t); and in (f) we use the fact that α_k^(r+1) is the optimal solution of (30). Therefore, L(X̄^(r+1), α_k^(r+1)) ≤ L(X̄^(r), α_k^(r)), which guarantees the convergence.

Algorithm 3 Iterative Reweighted based Alternating Optimization
1: Initialize X̄^(0) as 1_{G_r×N̂_f}, α_k^(0) as in (41), and the outer-loop iteration counter r = 0.
2: repeat
3:   Set α_k = α_k^(r) and X̄(0) = X̄^(r); initialize the inner-loop iteration counter t = 0.
4:   repeat
5:     Given X̄(t), calculate X̄(t + 1) based on (38), and then set t = t + 1.
6:   until L^ub(X̄, α_k; X̄(t)) converges.
7:   Set r = r + 1 and update X̄^(r) = X̄(t); then, given X̄ = X̄^(r), calculate α_k^(r) based on (33).
8: until L(X̄^(r), α_k^(r)) converges.

2) Initialization Analysis: It is worth noting that the initialization in Algorithm 3 is important for the successful recovery of the sparse channel matrix, especially the initialization of α_k. This is because only when the initialized α_k is as close as possible to the true value do all the cascaded channels share a similar sparse channel matrix, so that they can be jointly recovered based on (16). Otherwise, there is no X̄ that makes the equality in (16) hold.

In fact, we can initialize α_k by first estimating the cascaded channels G_k individually, e.g., using the SMV method, using the MMV method, or solving the single-user version of (24). Then, we use the estimated Ĝ_k to initialize α_k according to the scaling property (15), i.e.,

    α_k^{l,l} = [h_k^H]_l/[h_1^H]_l ≈ Ĝ_k^{l,1}/Ĝ_1^{l,1} ≈ Ĝ_k^{l,2}/Ĝ_1^{l,2} ≈ ... ≈ Ĝ_k^{l,M}/Ĝ_1^{l,M},  ∀k, ∀l.    (40)

Thus, we can initialize α_k as

    [α_k^(0)]_{i,l} = (1/M) Σ_{m=1}^{M} Ĝ_k^{l,m}/Ĝ_1^{l,m}  if 1 ≤ i = l ≤ L,  and 0 otherwise.    (41)
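For completeness, a numpy sketch of the inner X̄ update (37)-(38) and of the double-loop structure of Algorithm 3 is given below; it reuses update_alpha from the sketch after (33), and fixed iteration counts replace the convergence tests purely for illustration.

import numpy as np

def update_X_bar(Ybar_H_all, alphas, V, A_R, X_prev, lam, zeta=1e-9):
    """One inner iteration: reweighting matrix (37) and closed-form solution (38)."""
    Gr = A_R.shape[1]
    w = 1.0 / (np.sum(np.abs(X_prev)**2, axis=1) + zeta)          # diagonal of Lambda, cf. (37)
    lhs = np.diag(w) / lam
    rhs = np.zeros((Gr, Ybar_H_all[0].shape[1]), dtype=complex)
    for Ybar_H, alpha in zip(Ybar_H_all, alphas):
        Phi = V.conj().T @ alpha @ A_R                             # V^H alpha_k A_R
        lhs = lhs + Phi.conj().T @ Phi
        rhs = rhs + Phi.conj().T @ Ybar_H
    return np.linalg.solve(lhs, rhs)                               # eq. (38)

def mjce(Ybar_H_all, alpha_init, V, A_R, Nf, lam, outer=10, inner=10):
    """Sketch of Algorithm 3: alternate reweighted X_bar updates (38) and alpha_k updates (33)."""
    alphas = [a.copy() for a in alpha_init]
    X_bar = np.ones((A_R.shape[1], Nf), dtype=complex)             # X_bar^(0) = 1_{Gr x Nf}
    for _ in range(outer):
        for _ in range(inner):
            X_bar = update_X_bar(Ybar_H_all, alphas, V, A_R, X_bar, lam)
        alphas = [update_alpha(Y, V, A_R, X_bar) for Y in Ybar_H_all]   # update_alpha: sketch after (33)
    return X_bar, alphas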

VI. TRAINING REFLECTION COEFFICIENT OPTIMIZATION

In this section, we optimize the training reflection coefficient sequence at the RIS to improve the estimation performance. We are motivated by the fact that a better recovery performance can be achieved if the mutual coherence of the equivalent dictionary is smaller [40]. Specifically, denoting the equivalent dictionary by D = V^H A_R from (13), the corresponding mutual coherence µ(D) can be written as

    µ(D) = max_{i≠j, 1≤i,j≤G_r}  |(D_{:,i})^H D_{:,j}| / ( ||D_{:,i}||_2 ||D_{:,j}||_2 ).    (42)

This fact implies that the columns of D should be as orthogonal as possible. Equivalently, we need to design D such that D^H D is as close as possible to an identity matrix, i.e.,

    D^H D = A_R^H V V^H A_R ≈ B I_{G_r},    (43)

where B on the right-hand side is used for normalization.

The solution of problem (43) has been investigated in [41]. However, since the RIS only induces a phase shift on the incident signal without changing its amplitude, the reflection coefficients should satisfy the following constraint [42]:

    |V_{b,l}| = 1,  ∀b, ∀l.    (44)

Thus, the method in [41] cannot be applied directly to find the solution V of (43). In the following, we modify the method in [41] to solve problem (43) subject to constraint (44). Firstly, we transform (43) into the following equation:

    A_R A_R^H V V^H A_R A_R^H ≈ B A_R A_R^H.    (45)

Let U_R Ξ U_R^H be the eigenvalue decomposition of A_R A_R^H, and Ξ = diag(γ_1, γ_2, ..., γ_L) ∈ C^{L×L} be the eigenvalue matrix with the eigenvalues γ_l sorted in decreasing order of magnitude. Then, we have the following optimization problem:

    min_V || B Ξ − Ξ U_R^H V V^H U_R Ξ ||_2²   s.t.  |V_{b,l}| = 1, ∀b, ∀l.    (46)

Then, define Q = [q_1, q_2, ..., q_B] = Ξ U_R^H V and E_b = B Ξ − Σ_{i≠b}^{B} q_i q_i^H. Thus, (46) can be rewritten as

    min_V || E_b − q_b q_b^H ||_2²   s.t.  |V_{b,l}| = 1, ∀b, ∀l.    (47)

Let U_{E,b} Ψ_b U_{E,b}^H be the eigenvalue decomposition of E_b, and Ψ_b = diag(ξ_{b,1}, ξ_{b,2}, ..., ξ_{b,L}) ∈ C^{L×L} be the eigenvalue matrix with the eigenvalues ξ_{b,l} sorted in decreasing order of magnitude. Then, we can set q_b = sqrt(ξ_{b,1}) u_{E,b}, where u_{E,b} is the first column of U_{E,b}, corresponding to the largest eigenvalue. Thus, the largest error term in (46) is eliminated.

Next, denote q_b = [γ_1 q̃_{b,1}, γ_2 q̃_{b,2}, ..., γ_L q̃_{b,L}]^T ∈ C^{L×1}, where q̃_{b,l} is the l-th component of q̃_b. Then, we can recover q̃_b. Note that since the matrix E_b is in general not full-rank, we only need to update the components of q̃_b corresponding to the positive eigenvalues.

By considering (44), we further need to solve the following projection problem:

    min_{v_b} || U_R^H v_b − q̃_b ||_2²   s.t.  |V_{b,l}| = 1, ∀l.    (48)

From [10], the optimal solution of (48) is given by

    v_b = e^{j∠(U_R q̃_b)}.    (49)

Then, we substitute v_b into (46) and repeat the above steps B times to compute v_1, v_2, ..., v_B successively. Finally, we obtain the optimized training reflection coefficients V.
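The sequential design of (45)-(49) alternates a rank-one eigenvector step with a unit-modulus projection. The sketch below is one possible reading of that procedure (the number of outer passes, the eigenvalue floor, and the random initialization are assumptions added here); it also reports the resulting mutual coherence (42).

import numpy as np

def mutual_coherence(D):
    """mu(D) of (42): largest normalized inner product between distinct columns."""
    Dn = D / np.linalg.norm(D, axis=0, keepdims=True)
    Gram = np.abs(Dn.conj().T @ Dn)
    np.fill_diagonal(Gram, 0.0)
    return Gram.max()

def design_training(A_R, B, passes=5, seed=0):
    """Unit-modulus training matrix V (L x B) reducing mu(V^H A_R), cf. (45)-(49)."""
    rng = np.random.default_rng(seed)
    L = A_R.shape[0]
    V = np.exp(2j*np.pi*rng.random((L, B)))
    gamma, U_R = np.linalg.eigh(A_R @ A_R.conj().T)               # A_R A_R^H = U_R Xi U_R^H
    gamma, U_R = gamma[::-1], U_R[:, ::-1]
    for _ in range(passes):
        Q = np.diag(gamma) @ U_R.conj().T @ V                     # Q = Xi U_R^H V
        for b in range(B):
            E_b = B*np.diag(gamma) - sum(np.outer(Q[:, i], Q[:, i].conj())
                                         for i in range(B) if i != b)
            xi, U_E = np.linalg.eigh((E_b + E_b.conj().T)/2)
            q_b = np.sqrt(max(xi[-1], 0.0)) * U_E[:, -1]          # rank-one fit of E_b, cf. (47)
            q_tilde = q_b / np.where(gamma > 1e-12, gamma, 1.0)   # undo the Xi scaling
            V[:, b] = np.exp(1j*np.angle(U_R @ q_tilde))          # projection (48)-(49)
            Q[:, b] = np.diag(gamma) @ U_R.conj().T @ V[:, b]
    return V

# usage: V = design_training(A_R, B=40); print(mutual_coherence(V.conj().T @ A_R))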

VII. SIMULATION RESULTS

In this section, we present simulation results to validate the effectiveness of the proposed algorithm. In the simulations, we assume that α_p and β_{k,q} follow a complex Gaussian distribution with unit power. φ_p^AoA, φ_p^AoD, and ϕ_{k,q} are continuous and uniformly distributed over [0, 2π). The parameters ς, d, and G_r are set to 10^{-9}, 0.1, and 512, respectively. In addition, the noise power δ² is normalized to one, and the transmit power P (dB) is expressed relative to the noise power. Specifically, we adopt the normalized mean square error (NMSE) as the performance metric [26], which is given by

    NMSE = E( ||Ĝ_k − G_k||_2² / ||G_k||_2² ).    (50)

In the following simulations, we compare the NMSE performance of the channel estimation schemes listed below. Each result is obtained over 500 Monte Carlo trials.

• LS: The channels are estimated by using the estimation protocol in Fig. 2 and the LS estimator in (5) with the optimal training sequences studied in [22].
• Binary Reflection [20]: The channels are estimated one by one by turning on only one reflective element and keeping the rest of the reflective elements closed.
• MMV: The channels are estimated by formulating an MMV problem solved with the SOMP algorithm, as shown in Algorithm 1.
• S-MMV: The channels are estimated by first projecting the received signals onto the common subspace and then using the MMV estimator.
• S-SMV: The channels are estimated by first projecting the received signals onto the common subspace and then formulating an SMV problem solved with the OMP algorithm [26].
• S-MJCE: The channels are estimated by the two-step procedure (subspace) based multi-user joint channel estimation (MJCE) given in Algorithm 2.
• S-Genie-aided LS: The cascaded channels are estimated by assuming that the BS knows the exact angles of the cascaded AoAs at the RIS/user side and of the AoDs at the BS side, and by projecting the received signals onto the known common AoD subspace with an LS solution; this is a performance upper bound that cannot be achieved in practice.

Fig. 4: Effect of training overhead B on the NMSE: P = 10 dB, K = T = 4, M = L = 128, N_f = 8, and N_{h_k} = 1.

Fig. 4 shows the impact of the training overhead B on the NMSE. Firstly, the estimation performance of all schemes improves as the training overhead B increases, because better recovery and estimation accuracy can be achieved with a larger number of measurements. Next, the performance of the binary reflection method is much worse than that of the other baselines. This is because in each transmit symbol duration the power of the training reflection coefficients at the RIS is 1 for this method, while it is L for the other schemes. Next, the growing performance gap between MMV and S-MMV implies that the subspace projection contributes significantly to the estimation performance when B is large. Moreover, the shrinking performance gap between S-SMV and S-MMV shows that the performance loss from ignoring the block sparsity is significant when B is small. Finally, the proposed method outperforms the other baseline schemes significantly and achieves an estimation performance close to that of the Genie-aided LS method. This validates the effectiveness of the proposed estimator.

Fig. 5: Effect of the number of scatterers N_f between the BS and the RIS on the NMSE: P = 10 dB, B = 30, K = T = 4, M = L = 128, and N_{h_k} = 1.

Fig. 5 shows the impact of the number of scatterers between the BS and the RIS on the NMSE. Firstly, the performance of all estimation schemes degrades as the number of scatterers grows, because the number of unknown parameters to be estimated increases with the number of scatterers. Also, the performance gaps among these methods decrease as the number of scatterers grows, because the estimation performance becomes limited by the number of measurements and the SNR. Besides, the proposed method outperforms the other baseline schemes especially when the number of scatterers is small, which also validates the effectiveness of the proposed estimator.

Fig. 6: Effect of the transmit power on the NMSE for different system parameters: K = T = 4, M = L = 128, N_f = 8, and N_{h_k} = 1.

Fig. 6 shows the impact of the transmit power on the NMSE. The performance of all estimation schemes improves as the transmit power grows, because better recovery/estimation performance can be achieved with a higher SNR. Also, the performance gaps between the proposed estimator and the other baseline schemes increase as the transmit power grows, especially when the training overhead is small. This implies that our method is superior in reducing the training overhead, which further demonstrates the effectiveness of the proposed estimator.

Fig. 7: Effect of λ = PTd/(δ² log G_r) on the NMSE for different system parameters: K = T = 4, L = 128, N_f = 8, and N_{h_k} = 1.

Fig. 7 shows the impact of the penalty factor λ on the NMSE. The NMSE of the proposed estimator first decreases and then increases as λ grows. This is because the penalty factor balances the tradeoff between data fitting and the sparsity of the solution. Hence, a smaller λ yields a sparser matrix but does not fit the data well, while a larger λ fits the data well but loses sparsity. Both regimes degrade the recovery performance of the proposed estimator.

Hence, by taking advantage of such a sparsity structure, we propose a two-step procedure based multi-user joint channel estimator. Moreover, the optimization of the training reflection coefficient sequences at the RIS is studied to improve the estimation performance. Finally, the simulation results validate the effectiveness of the proposed estimator.

APPENDIX A

A. Derivation of Sufficient Statistics

To begin with, for notational simplicity we define A = A_{:,\Omega_D^T}, C_b = \sum_{k=1}^{K} A\left(X_k^H A_R^H v_b\right)\left(X_k^H A_R^H v_b\right)^H A^H + \delta^2 I_M, and P_{b,t} = \sum_{k=1}^{K} A X_k^H A_R^H v_b [s_k]_t. Then, if K \to \infty, we have

P_{b,t} = 0, \quad \text{and} \quad C_b = \xi A A^H + \delta^2 I_M,   (51)

where \xi = PKL. Then, from (2), the likelihood function is given by

P(Y \mid A) = \frac{\exp\left(-\sum_{b=1}^{B}\sum_{t=1}^{T}\left(Y_b^{:,t} - A P_{b,t}\right)^H C_b^{-1}\left(Y_b^{:,t} - A P_{b,t}\right)\right)}{\pi^{MBT}\prod_{b=1}^{B}\left(\det C_b\right)^{T}} \overset{(a)}{=} \frac{\exp\left(-BT\,\mathrm{Tr}\left(\left(\xi A A^H + \delta^2 I_M\right)^{-1}\hat{C}\right)\right)}{\pi^{MBT}\left(\det\left(\xi A A^H + \delta^2 I_M\right)\right)^{BT}},   (52)

where in (a) we use (51) for K \to \infty.
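For completeness, the algebra behind step (a) is as follows. Here we read \hat{C} as the sample covariance \hat{C} = \frac{1}{BT}\sum_{b=1}^{B}\sum_{t=1}^{T} Y_b^{:,t}\left(Y_b^{:,t}\right)^H, which is the definition consistent with (52) (the symbol itself is introduced in the main text, so this is our reading rather than a new definition). With P_{b,t} = 0 and C_b = \xi A A^H + \delta^2 I_M \triangleq C for all b,

\sum_{b=1}^{B}\sum_{t=1}^{T}\left(Y_b^{:,t}\right)^H C^{-1} Y_b^{:,t} = \mathrm{Tr}\left(C^{-1}\sum_{b=1}^{B}\sum_{t=1}^{T} Y_b^{:,t}\left(Y_b^{:,t}\right)^H\right) = BT\,\mathrm{Tr}\left(C^{-1}\hat{C}\right), \quad \text{and} \quad \prod_{b=1}^{B}\left(\det C_b\right)^T = \left(\det C\right)^{BT}.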
Therefore, it is clear that the likelihood function depends on the data D only via \hat{C}. According to the Fisher–Neyman factorization theorem [43], [44], it follows that \hat{C} is a sufficient statistic for estimating the subspace spanned by A.
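As a minimal illustration (not code from the paper), the sufficient statistic can be formed directly from the received pilot blocks; the list-of-blocks interface and the 1/(BT) normalization follow the reading of \hat{C} above and are our assumptions:

```python
import numpy as np

def sample_covariance(Y_blocks):
    """Form C_hat from B received pilot blocks, each an M x T complex matrix."""
    M = Y_blocks[0].shape[0]
    C_hat = np.zeros((M, M), dtype=complex)
    for Y_b in Y_blocks:                 # accumulate over training blocks b = 1..B
        C_hat += Y_b @ Y_b.conj().T      # sums the outer products over the T columns of Y_b
    return C_hat / (len(Y_blocks) * Y_blocks[0].shape[1])   # divide by B*T
```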
B. Derivation of Basis Optimization for Subspace Estimation

Since \hat{C} is a sufficient statistic, we can design an optimal maximum-likelihood (ML) estimator by minimizing the negative log-likelihood function, i.e.,

\min_{A} \; \mathrm{Tr}\left(\left(\xi A A^H + \delta^2 I_M\right)^{-1}\hat{C}\right) + \ln\det\left(\xi A A^H + \delta^2 I_M\right).   (53)

Let W_k \Delta W_k^H be the eigenvalue decomposition of \xi A A^H, where \Delta = \mathrm{diag}(\Delta_1, \Delta_2, \cdots, \Delta_{N_f}) is the eigenvalue matrix with the eigenvalues \Delta_n ordered in decreasing magnitude. It is straightforward to see that \mathrm{span}(A) is equal to the linear span of W_k \in \mathbb{C}^{M \times N_f}, i.e.,

\mathrm{span}(A) = \mathrm{span}(W_k).   (54)

Further, denoting by W_\perp the remaining (M - N_f) orthonormal basis vectors, we have

\left(\xi A A^H + \delta^2 I_M\right)^{-1} = W_k \tilde{\Delta} W_k^H + \frac{1}{\delta^2} W_\perp W_\perp^H,   (55)

where \tilde{\Delta} = \mathrm{diag}\left(\frac{1}{\Delta_1 + \delta^2}, \frac{1}{\Delta_2 + \delta^2}, \cdots, \frac{1}{\Delta_{N_f} + \delta^2}\right) \in \mathbb{C}^{N_f \times N_f}.

As aforementioned, we only need to estimate the subspace W_k without requiring the exact AoD vectors in A. Hence, substituting (55) into (53), we have the following equivalent problem of (53), i.e.,

\min_{W_k} \; \mathrm{Tr}\left(\tilde{\Delta} W_k^H S \Theta S^H W_k\right),   (56)

where S \Theta S^H denotes the eigenvalue decomposition of \hat{C}. Since \left[W_k \; W_\perp\right]^H S S^H \left[W_k \; W_\perp\right] = I_M and S^H \left[W_k \; W_\perp\right]\left[W_k \; W_\perp\right]^H S = I_M, problem (56) is equivalent to the following problem

\min_{a_{n,m} \ge 0} \; \sum_{n=1}^{N_f}\sum_{m=1}^{M} \frac{\theta_m}{\Delta_n + \delta^2}\, a_{n,m}   (57)
\mathrm{s.t.} \; \sum_{m=1}^{M} a_{n,m} = 1, \; 1 \le n \le N_f,
\qquad \sum_{n=1}^{N_f} a_{n,m} \le 1, \; 1 \le m \le M,

where a_{n,m} = \left|\left(W_k^{:,n}\right)^H S^{:,m}\right|^2. This is a convex problem, which can be solved by applying the Lagrangian method. Then, the optimal solution for a_{n,m} is given by

a_{n,m} = \begin{cases} 1, & 1 \le n = m \le N_f, \\ 0, & \text{otherwise}. \end{cases}   (58)

Note that a_{n,m} = 1 means W_k^{:,n} = S^{:,m}, which indicates that the optimal solution is W_k = S_k \triangleq \left[S^{:,1}, S^{:,2}, \cdots, S^{:,N_f}\right].
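A minimal sketch of the resulting subspace estimator follows, assuming S and Θ are the eigenvectors and decreasingly ordered eigenvalues of \hat{C} as used in (56)–(58); the function and variable names and the tiny synthetic test are ours, not the paper's.

```python
import numpy as np

def estimate_subspace(C_hat, N_f):
    """Return the N_f principal eigenvectors of the Hermitian statistic C_hat,
    i.e., W_k = S_k = [S^{:,1}, ..., S^{:,N_f}] as implied by (58)."""
    eigvals, eigvecs = np.linalg.eigh(C_hat)   # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1]          # reorder to descending
    S = eigvecs[:, order]                      # columns of S sorted by eigenvalue
    return S[:, :N_f]

# Tiny usage example with a synthetic rank-2 covariance plus a noise floor:
rng = np.random.default_rng(1)
M, N_f = 8, 2
A = rng.standard_normal((M, N_f)) + 1j * rng.standard_normal((M, N_f))
C_hat = A @ A.conj().T + 0.01 * np.eye(M)
W_k = estimate_subspace(C_hat, N_f)
print(np.allclose(W_k.conj().T @ W_k, np.eye(N_f)))   # orthonormal basis -> True
```

Because only the span of the N_f principal eigenvectors matters, any orthonormal basis of that span is an equally valid estimate of W_k.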
REFERENCES

[1] H. Yang, X. Cao, F. Yang, J. Gao, S. Xu, M. Li, X. Chen, Y. Zhao, Y. Zheng, and S. Li, “A programmable metasurface with dynamic polarization, scattering and focusing control,” Scientific Reports, vol. 6, p. 35692, 2016.
[2] E. Basar, M. Di Renzo, J. de Rosny, M. Debbah, M.-S. Alouini, and R. Zhang, “Wireless communications through reconfigurable intelligent surfaces,” arXiv preprint arXiv:1906.09490, 2019.
[3] M. Di Renzo, M. Debbah, D.-T. Phan-Huy, A. Zappone, M.-S. Alouini, C. Yuen, V. Sciancalepore, G. C. Alexandropoulos, J. Hoydis, H. Gacanin et al., “Smart radio environments empowered by reconfigurable AI meta-surfaces: An idea whose time has come,” EURASIP Journal on Wireless Communications and Networking, vol. 2019, no. 1, pp. 1–20, 2019.
[4] Y.-C. Liang, R. Long, Q. Zhang, J. Chen, H. V. Cheng, and H. Guo, “Large intelligent surface/antennas (LISA): Making reflective radios smart,” J. Commun. Inf. Netw., vol. 4, no. 2, Jun. 2019.
[5] J. Zhao, “A survey of reconfigurable intelligent surfaces: Towards 6G wireless communication networks with massive MIMO 2.0,” arXiv preprint arXiv:1907.04789, 2019.
[6] Q. Wu and R. Zhang, “Intelligent reflecting surface enhanced wireless network via joint active and passive beamforming,” IEEE Trans. Wireless Commun., to be published, doi: 10.1109/TWC.2019.2936025, 2019.
[7] R. Zhang, Y.-C. Liang, C. C. Chai, and S. Cui, “Optimal beamforming for two-way multi-antenna relay channel with analogue network coding,” IEEE J. Sel. Areas Commun., vol. 27, no. 5, pp. 699–712, 2009.
[8] K. Ntontin, M. Di Renzo, J. Song, F. Lazarakis, J. de Rosny, D.-T. Phan-Huy, O. Simeone, R. Zhang, M. Debbah, G. Lerosey et al., “Reconfigurable intelligent surfaces vs. relaying: Differences, similarities, and performance comparison,” arXiv preprint arXiv:1908.08747, 2019.
[9] Q. Wu and R. Zhang, “Towards smart and reconfigurable environment: Intelligent reflecting surface aided wireless network,” arXiv preprint arXiv:1905.00152, 2019.
[10] J. Chen, Y.-C. Liang, Y. Pei, and H. Guo, “Intelligent reflecting surface: A programmable wireless environment for physical layer security,” IEEE Access, vol. 7, pp. 82599–82612, Jun. 2019.
[11] C. Huang, A. Zappone, G. C. Alexandropoulos, M. Debbah, and C. Yuen, “Reconfigurable intelligent surfaces for energy efficiency in wireless communication,” IEEE Trans. Wireless Commun., vol. 18, no. 8, pp. 4157–4170, 2019.
[12] Q.-U.-A. Nadeem, A. Kammoun, A. Chaaban, M. Debbah, and M.-S. Alouini, “Large intelligent surface assisted MIMO communications,” arXiv preprint arXiv:1903.08127, 2019.
[13] H. Guo, Y.-C. Liang, J. Chen, and E. G. Larsson, “Weighted sum-rate optimization for intelligent reflecting surface enhanced wireless networks,” arXiv preprint arXiv:1905.07920, 2019.
[14] C. Pan, H. Ren, K. Wang, W. Xu, M. Elkashlan, A. Nallanathan, and L. Hanzo, “Multicell MIMO communications relying on intelligent reflecting surface,” arXiv preprint arXiv:1907.10864, 2019.
[15] M. Jung, W. Saad, M. Debbah, and C. S. Hong, “On the optimality of reconfigurable intelligent surfaces (RISs): Passive beamforming, modulation, and resource allocation,” arXiv preprint arXiv:1910.00968, 2019.
[16] G. Zhou, C. Pan, H. Ren, K. Wang, W. Xu, and A. Nallanathan, “Intelligent reflecting surface aided multigroup multicast MISO communication systems,” arXiv preprint arXiv:1909.04606, 2019.
[17] P. Wang, J. Fang, X. Yuan, Z. Chen, H. Duan, and H. Li, “Intelligent reflecting surface-assisted millimeter wave communications: Joint active and passive precoding design,” arXiv preprint arXiv:1908.10734, 2019.
[18] N. S. Perović, M. Di Renzo, and M. F. Flanagan, “Channel capacity optimization using reconfigurable intelligent surfaces in indoor mmWave environments,” arXiv preprint arXiv:1910.14310, 2019.
[19] Y. Yang, S. Zhang, and R. Zhang, “IRS-enhanced OFDM: Power allocation and passive array optimization,” arXiv preprint arXiv:1905.00604, 2019.
[20] D. Mishra and H. Johansson, “Channel estimation and low-complexity beamforming design for passive intelligent surface assisted MISO wireless energy transfer,” in IEEE ICASSP, 2019, pp. 4659–4663.
[21] Z.-Q. He and X. Yuan, “Cascaded channel estimation for large intelligent metasurface assisted massive MIMO,” arXiv preprint arXiv:1905.07948, 2019.
[22] T. L. Jensen and E. De Carvalho, “On optimal channel estimation scheme for intelligent reflecting surfaces based on a minimum variance unbiased estimator,” arXiv preprint arXiv:1909.09440, 2019.
[23] B. Zheng and R. Zhang, “Intelligent reflecting surface-enhanced OFDM: Channel estimation and reflection optimization,” arXiv preprint arXiv:1909.03272, 2019.
[24] C. You, B. Zheng, and R. Zhang, “Intelligent reflecting surface with discrete phase shifts: Channel estimation and passive beamforming,” arXiv preprint arXiv:1911.03916, 2019.
[25] A. Taha, M. Alrabeiah, and A. Alkhateeb, “Enabling large intelligent surfaces with compressive sensing and deep learning,” arXiv preprint arXiv:1904.10136, 2019.
[26] C.-R. Tsai, Y.-H. Liu, and A.-Y. Wu, “Efficient compressive channel estimation for millimeter-wave large-scale antenna systems,” IEEE Trans. Signal Process., vol. 66, no. 9, pp. 2414–2428, 2018.
[27] X. Rao and V. K. Lau, “Distributed compressive CSIT estimation and feedback for FDD multi-user massive MIMO systems,” IEEE Trans. Signal Process., vol. 62, no. 12, pp. 3261–3271, 2014.
[28] Y. Ding and B. D. Rao, “Dictionary learning-based sparse channel representation and estimation for FDD massive MIMO systems,” IEEE Trans. Wireless Commun., vol. 17, no. 8, pp. 5437–5451, 2018.
[29] J. A. Tropp and A. C. Gilbert, “Signal recovery from random measurements via orthogonal matching pursuit,” IEEE Trans. Inf. Theory, vol. 53, no. 12, pp. 4655–4666, 2007.
[30] J. Chen, L. Zhang, and Y.-C. Liang, “Exploiting Gaussian mixture model clustering for full-duplex transceiver design,” IEEE Trans. Commun., vol. 67, no. 8, pp. 5802–5816, May 2019.
[31] H. Akaike, “Information theory and an extension of the maximum likelihood principle,” in Selected Papers of Hirotugu Akaike. Springer, 1998, pp. 199–213.
[32] A. Paulraj, B. Ottersten, R. Roy, A. Swindlehurst, G. Xu, and T. Kailath, “Subspace methods for directions-of-arrival estimation,” Handbook of Statistics, vol. 10, pp. 693–739, 1993.
[33] M. Wax and T. Kailath, “Detection of signals by information theoretic criteria,” IEEE Trans. Acoust., Speech, Signal Process., vol. 33, no. 2, pp. 387–392, 1985.
[34] E. J. Candes, M. B. Wakin, and S. P. Boyd, “Enhancing sparsity by reweighted ℓ1 minimization,” J. Fourier Anal. Appl., vol. 14, no. 5-6, pp. 877–905, Dec. 2008.
[35] D. Wipf and S. Nagarajan, “Iterative reweighted ℓ1 and ℓ2 methods for finding sparse solutions,” IEEE J. Sel. Topics Signal Process., vol. 4, no. 2, pp. 317–329, Apr. 2010.
[36] J. Fang, F. Wang, Y. Shen, H. Li, and R. S. Blum, “Super-resolution compressed sensing for line spectral estimation: An iterative reweighted approach,” IEEE Trans. Signal Process., vol. 64, no. 18, pp. 4649–4662, Sep. 2016.
[37] R. E. Carrillo and K. E. Barner, “Iteratively re-weighted least squares for sparse signal reconstruction from noisy measurements,” in IEEE Annual Conference on Information Sciences and Systems, 2009, pp. 448–453.
[38] M. E. Tipping, “Sparse Bayesian learning and the relevance vector machine,” Journal of Machine Learning Research, vol. 1, pp. 211–244, 2001.
[39] J. Chen, L. Zhang, Y.-C. Liang, X. Kang, and R. Zhang, “Resource allocation for wireless-powered IoT networks with short packet communication,” IEEE Trans. Wireless Commun., vol. 18, no. 2, pp. 1447–1461, Feb. 2019.
[40] M. Elad, “Optimized projections for compressed sensing,” IEEE Trans. Signal Process., vol. 55, no. 12, pp. 5695–5702, Dec. 2007.
[41] J. M. Duarte-Carvajalino and G. Sapiro, “Learning to sense sparse signals: Simultaneous sensing matrix and sparsifying dictionary optimization,” IEEE Trans. Image Process., vol. 18, no. 7, pp. 1395–1408, Jul. 2009.
[42] S. Abeywickrama, R. Zhang, and C. Yuen, “Intelligent reflecting surface: Practical phase shift model and beamforming optimization,” arXiv preprint arXiv:1907.06002, 2019.
[43] S. M. Kay, Fundamentals of Statistical Signal Processing. Prentice Hall PTR, 1993.
[44] S. Haghighatshoar and G. Caire, “Massive MIMO channel subspace estimation from low-dimensional projections,” IEEE Trans. Signal Process., vol. 65, no. 2, pp. 303–318, 2016.
