Learning To Reflect and To Beamform For Intelligent Reflecting Surface With Implicit Channel Estimation

Tao Jiang, Graduate Student Member, IEEE, Hei Victor Cheng, Member, IEEE, and Wei Yu, Fellow, IEEE
arXiv:2009.14404v4 [eess.SP] 9 Jun 2021

Manuscript submitted on December 25, 2020, revised on March 15, 2021, and accepted on April 19, 2021 in IEEE Journal on Selected Areas in Communications. The materials in this paper have been presented in part at the IEEE Global Communications Conference (Globecom), December 2020 [1]. This work was supported by Huawei Technologies Canada and by Natural Sciences and Engineering Research Council (NSERC) via the Canada Research Chairs program. The authors are with The Edward S. Rogers Sr. Department of Electrical and Computer Engineering, University of Toronto, Toronto, ON M5S 3G4, Canada (e-mails: {taoca.jiang@mail.utoronto.ca, hei.cheng@utoronto.ca, weiyu@ece.utoronto.ca}).

Abstract—Intelligent reflecting surface (IRS), which consists of a large number of tunable reflective elements, is capable of enhancing the wireless propagation environment in a cellular network by intelligently reflecting the electromagnetic waves from the base-station (BS) toward the users. The optimal tuning of the phase shifters at the IRS is, however, a challenging problem, because the passive nature of the reflective elements makes it difficult to directly measure the channels between the IRS, the BS, and the users. Instead of following the traditional paradigm of first estimating the channels then optimizing the system parameters, this paper advocates a machine learning approach capable of directly optimizing both the beamformers at the BS and the reflective coefficients at the IRS based on a system objective. This is achieved by using a deep neural network to parameterize the mapping from the received pilots (plus any additional information, such as the user locations) to an optimized system configuration, and by adopting a permutation invariant/equivariant graph neural network (GNN) architecture to capture the interactions among the different users in the cellular network. Simulation results show that the proposed implicit channel estimation based approach is generalizable, can be interpreted, and can efficiently learn to maximize a sum-rate or minimum-rate objective from far fewer pilots than the traditional explicit channel estimation based approaches.

Index Terms—Intelligent reflecting surface, channel estimation, beamforming, deep learning, graph neural network.

I. INTRODUCTION

Conventional physical-layer communication system design has always been based on the paradigm of first modeling the channel, then optimizing the transmitter and the receiver design parameters according to the channel model. While this conventional approach has served the communication engineers well for many practical communication scenarios, the recent emergence of new communication modalities involving passive reflectors for which channel estimation may not be straightforward to perform has motivated the need for new approaches. This paper investigates the design of reflective patterns for intelligent reflecting surfaces (IRS), also known as reconfigurable intelligent surfaces, composed of a large number of tunable reflective elements. We advocate the use of machine learning techniques to bypass explicit channel estimation and to directly design the beamforming and reflective patterns to optimize a system-wide objective.

The promise of IRS stems from its ability to manipulate incident electromagnetic waves toward the intended directions by controlling the phase responses of the passive elements [2], [3]. By intelligently adjusting these phases, in effect modifying the signal propagation environment, the IRS can enhance the communication channel between the transmitter and the receiver. In addition, due to its passive structure, the IRS requires very little energy to produce the desired phase shifts for signal reflection. Further, it can be flexibly integrated into various objects (e.g., walls, ceilings), thus enabling a smooth deployment of IRS in existing wireless communication networks [2]. As a result, a wide range of applications for IRS have been explored in the literature, e.g., for improving network coverage [4], boosting wireless spectral efficiency [5], [6], reducing power consumption of data transmission [7]–[9], enhancing over-the-air computation performance [10] and enabling secure wireless communications [11].

This paper tackles the problem of how to optimally tune the IRS elements for capacity enhancement in a multiuser cellular network. A conventional design would have followed the approach of first estimating the channel, then optimizing the design parameters. This is, however, not necessarily the best approach for designing an IRS system, due to the following reasons. First, the passive elements in the IRS have no ability to perform active signal transmission and reception, so the channels to and from an IRS can only be estimated indirectly. Second, an IRS typically has a large number of reflective elements, so the number of channel parameters to estimate can be very large. Third, conventional channel estimation is always with respect to some artificial criterion (such as the mean squared error), which does not necessarily correspond to the ultimate system objective. Finally, even if the channel is perfectly known, the optimization of the beamformers and phase shifts is a high-dimensional nonconvex problem, so finding an optimal solution remains a difficult numerical problem.

The main idea of this paper is that a data-driven approach can be used to overcome some of these difficulties. We show that by adopting a graph neural network (GNN) architecture that models the interaction between the IRS and the multiple users in the system, it is possible to directly learn the mapping from the received pilots to a desired set of beamformers at the base-station (BS) and a desired reflective pattern at the IRS for maximizing a system-wide objective, such as the sum rate or the minimum rate across the multiple users. The GNN structure has the properties that a permutation of the
ordering of the users induces the same permutation on the BS beamformers but keeps the IRS reflective pattern fixed (known as permutation equivariant and permutation invariant properties, respectively). Such an architecture allows the possibility of generalizability to networks with an arbitrary number of users. Numerical results show that the proposed approach produces solutions that can be easily interpreted. Overall, this paper shows that by bypassing the explicit channel estimation phase altogether, a machine learning approach can achieve a significantly higher transmission rate than the conventional channel estimation based approach, especially in the limited pilot length regime.

A. Related Works

Many of the existing works on the optimization of IRS assume that perfect channel state information (CSI) is available at the BS. Based on the perfect CSI assumption, joint optimization of the IRS reflection and BS beamforming can be carried out for different network objectives, e.g., minimizing the energy consumption [7], maximizing the spectral efficiency [5], or maximizing the minimum rate [12]. In practice, CSI needs to be estimated. Due to the passive nature of the IRS elements, directly estimating the CSI for IRS is not feasible. Instead, CSI estimation needs to be carried out either at the BS or at the users based on the end-to-end reflected signals. In this direction, [13] proposes to solve the channel estimation problem based on a binary reflection method by turning on the IRS elements one at a time, while [14] proposes a channel estimation method based on parallel factor decomposition for estimating the BS-IRS channel and the individual IRS-user channels. However, as the number of elements in an IRS is typically quite large (in order to achieve higher beamforming gain [15]), these channel estimation methods typically require a large pilot overhead. To address this issue, [16] proposes to group the IRS elements into sub-surfaces, but at a cost of reduced beamforming capability. More recently, [17] proposes a compressed sensing based channel estimation method for the multiuser IRS-aided system, which reduces the training overhead significantly but requires the assumption of channel sparsity. Further, [18] proposes to reduce the training overhead by exploiting the common reflective channels among all the users. All these works fall into the paradigm of first estimating the channels from the received pilot signals, then solving the reflection optimization problems based on the estimated channels.

Recently, data-driven approaches have been introduced to address the challenges either in channel estimation or beamforming [19]. For the channel estimation problem, [20] proposes a deep denoising neural network to enhance the performance of the model-based compressive channel estimation for mmWave IRS systems. The authors of [21] propose a convolutional neural network to estimate both direct and cascaded channels from the received pilot signals through end-to-end training. Given perfect channels, the beamforming problem has been investigated from the perspective of the data-driven approach in [22]–[24]. In particular, [22] proposes a multi-layer fully connected neural network to learn the phase shifts for a single-user system to reduce the algorithm run-time complexity of optimizing the IRS. Furthermore, deep reinforcement learning is leveraged to optimize the phase shifts for the single-user system in [23], and for the multiuser case in [24].

This paper proposes to use a data-driven approach for optimizing the IRS. The proposed approach is motivated by the success of using deep learning to optimize wireless communication systems without explicitly estimating the channel [25]–[27]. In particular, [25] shows that the beamforming vectors learned from the received pilot signals can approach the achievable rate of the optimal beamforming vectors for highly-mobile mmWave systems. Further, [26] shows that based on the geographical locations of the users, the deep learning approach is able to learn the optimal scheduling without channel estimation. Location information is also utilized in [27] to configure the IRS for indoor signal focusing using a deep learning approach. In this paper, we primarily use the received pilots as the input to the neural network, because the received pilot signal contains rich information about both the large-scale and small-scale fading, but we also show that incorporating the location information can further reduce the required pilot length.

B. Main Contributions

This paper casts the problem of designing the beamforming and reflective patterns in an IRS system as a variational optimization problem whose optimization variables are functionals, i.e., mappings from the received pilots to the phase shifts at IRS and beamforming matrix at the BS. We propose to parameterize this mapping using a neural network and to train the neural network based on the training data to directly maximize a network utility function.

While a fully connected neural network has already been shown to be able to significantly reduce pilot length for sum-rate maximization in the conference version of this paper [1], a further contribution of this journal paper is that we propose a GNN architecture to better model the interference among the different users in the network. The proposed GNN is permutation invariant/equivariant across the users and therefore provides better scalability and generalization ability. For example, while a fully connected neural network would not have been able to generalize when the number of users in the network changes (except by re-training a new neural network), a GNN structure can easily have shared parameters across the components for different users, thereby achieving generalizability. It is worth noting that GNNs have been proposed to solve radio resource allocation problems in [28]–[30], but these prior works all require perfect CSI and are not designed for IRS systems.

A key benefit of the data-driven approach for communication system design is that it can easily incorporate different types of data as inputs to the neural network. In this paper, we propose to incorporate the locations of the users, so that the neural network can focus on learning the small-scale fading component of the wireless channels. The numerical results show that this significantly improves the utility maximization
performance. We remark that incorporating such heterogeneous information is not easy to do in the conventional model-based approach.

This paper also shows that the same machine learning framework can be used to design systems under different overall network objectives. In particular, this paper shows the superior performance of the GNN for both the sum-rate maximization and the minimum-rate maximization problems in terms of reducing the length of pilots required to achieve a target performance level.

A crucial requirement for the eventual adoption of the machine learning approach to system-level optimization is the interpretability of its solutions. Toward this end, this paper analyzes the beamforming patterns and the reflective patterns that the GNN learns from training data. We show visually that the deep learning approach is indeed performing the task of focusing the electromagnetic waves toward the directions of the target users, hence providing a level of confidence that the proposed approach is capable of finding an optimized solution in the overall highly nonconvex optimization landscape.

To summarize, the main novelties and the key findings of this paper are as follows:
1) A deep neural network is used to directly map the received pilots to the optimized phase-shifts at the IRS and optimized beamforming matrix at the BS. By bypassing the explicit channel estimation stage, the proposed machine learning framework is able to utilize pilots more efficiently than the conventional channel estimation based approaches.
2) A GNN architecture is used to map the received pilots to the beamformers at the BS and the reflective pattern at the IRS. The GNN is permutation invariant for the reflective pattern and permutation equivariant for the user beamformers, thereby allowing generalizability to systems with different numbers of users.
3) The proposed neural network is able to incorporate heterogeneous inputs such as the location information of the users to further improve the system performance.
4) The beamforming patterns obtained from the machine learning approach are visualized, showing that the proposed GNN is able to learn the correct beam focusing patterns for both the IRS and the BS.
5) Numerical simulations show that the proposed deep learning framework outperforms conventional model-based approaches for both the sum-rate maximization and the minimum-rate maximization problems. Further, the proposed GNN generalizes well to different signal-to-noise ratios (SNRs).

We remark that a parallel and independent work [31] has also proposed the idea of using the received pilot signals to learn the configuration of the phase shifts. However, [31] only considers a single-user IRS system. Moreover, their method first uses an alternating optimization approach to find an optimized set of beamforming matrix and phase shifts, then uses a fully connected neural network to learn the optimized solution in a supervised fashion. In comparison, the method proposed in this paper is more direct. Further, our method is designed for the multiuser setting and can be more easily generalized to systems with different numbers of users.

C. Organization of the Paper and Notations

The rest of the paper is organized as follows. Section II describes the system model and problem formulation. Section III describes the uplink pilot transmission and the conventional channel estimation strategy. Section IV describes the proposed deep learning framework and the GNN architecture for the IRS system. Sections V and VI provide simulation results for the proposed scheme for both sum-rate and min-rate objectives. Section VII interprets the results by visualizing the beamforming and reflective patterns. Finally, conclusions are drawn in Section VIII.

The notations used in this paper are as follows. Lower case letters are used to denote scalars. Lower case bold-faced letters are used to denote column vectors. Upper case bold-faced letters are used to denote matrices. We use $[\cdot]_i$ to denote an element of a vector and $(\cdot)^{\top}$ and $(\cdot)^{\mathsf{H}}$ to denote transpose and Hermitian transpose of matrices. We use $\mathcal{CN}(\cdot, \cdot)$ to denote a complex Gaussian distribution, and $\mathbb{E}[\cdot]$ to denote expectation.

II. PROBLEM FORMULATION

A. System Model

Consider an IRS assisted downlink multiuser MISO system where $K$ single-antenna users are served by a BS with $M$ antennas. An IRS equipped with an array of $N$ passive reflective elements is deployed to aid the transmission between the BS and users. The BS controls the IRS through an IRS controller, which is capable of adjusting the phases of the array elements in order to reflect the incident signals to desired directions, as shown in Fig. 1.

Fig. 1. IRS assisted multiuser MISO system.

We use $\mathbf{G} \in \mathbb{C}^{M \times N}$ to denote the uplink channel matrix from the IRS to the BS, $\mathbf{h}_k^{\mathrm{r}} \in \mathbb{C}^{N}$ to denote the uplink channel vector from user $k$ to the IRS, and $\mathbf{h}_k^{\mathrm{d}} \in \mathbb{C}^{M}$ to denote the uplink channel vector from user $k$ to the BS. We assume channel reciprocity, so that the channel matrices in the downlink direction are the transpose of the uplink channels.
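As a minimal sketch of the channel objects just defined, the following pure-Python snippet builds the IRS-to-BS matrix, the user-to-IRS vector and the user-to-BS vector for small hypothetical sizes, and forms the overall uplink path from one user to the BS (direct link plus the link reflected by the IRS). The i.i.d. complex Gaussian draws, the sizes M = 4 and N = 6, and all function names are illustrative assumptions rather than the channel model or code used in the paper; the unit-modulus vector simply stands in for the phase adjustment performed by the IRS controller.

```python
import cmath
import random


def complex_gaussian(rng):
    """Draw one CN(0, 1) sample (assumed fading model, for illustration only)."""
    scale = 2 ** -0.5
    return complex(rng.gauss(0.0, scale), rng.gauss(0.0, scale))


def effective_uplink_channel(G, h_r, h_d, omega):
    """Return h_d + G diag(v) h_r, where v_i = exp(j * omega_i).

    G     : M x N uplink channel matrix from the IRS to the BS (list of rows)
    h_r   : length-N uplink channel vector from the user to the IRS
    h_d   : length-M uplink channel vector from the user to the BS
    omega : length-N list of IRS phase shifts in radians
    """
    v = [cmath.exp(1j * w) for w in omega]  # unit-modulus reflection coefficients
    reflected = [
        sum(G[m][n] * v[n] * h_r[n] for n in range(len(h_r)))
        for m in range(len(G))
    ]
    return [direct + refl for direct, refl in zip(h_d, reflected)]


rng = random.Random(0)
M, N = 4, 6  # hypothetical numbers of BS antennas and IRS elements
G = [[complex_gaussian(rng) for _ in range(N)] for _ in range(M)]
h_r = [complex_gaussian(rng) for _ in range(N)]
h_d = [complex_gaussian(rng) for _ in range(M)]
omega = [rng.uniform(-cmath.pi, cmath.pi) for _ in range(N)]

h_eff = effective_uplink_channel(G, h_r, h_d, omega)
print(len(h_eff))  # one complex coefficient per BS antenna
```

Under the reciprocity assumption stated above, the downlink channel toward the same user would be the transpose of this uplink quantity.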
Let $s_k \in \mathbb{C}$ be the symbol to be transmitted from the BS to user $k$. The BS uses a beamforming strategy to transmit $s_k$ through a beamforming vector $\mathbf{w}_k \in \mathbb{C}^{M}$, where the beamformers must satisfy a power constraint $\sum_{k=1}^{K} \|\mathbf{w}_k\|_2^2 \le P_d$. Let $\mathbf{v} = [e^{j\omega_1}, e^{j\omega_2}, \cdots, e^{j\omega_N}]^{\top}$ be the reflection coefficients at the IRS, where $\omega_i \in [-\pi, \pi)$ is the phase shift of the $i$-th element. Then, the received signal $r_k$ at the user $k$ is given by

$$r_k = \sum_{j=1}^{K} \left(\mathbf{h}_k^{\mathrm{d}} + \mathbf{G}\,\mathrm{diag}(\mathbf{v})\,\mathbf{h}_k^{\mathrm{r}}\right)^{\top} \mathbf{w}_j s_j + n_k \tag{1}$$

$$\phantom{r_k} = \sum_{j=1}^{K} \left(\mathbf{h}_k^{\mathrm{d}} + \mathbf{A}_k \mathbf{v}\right)^{\top} \mathbf{w}_j s_j + n_k, \tag{2}$$

where $\mathbf{A}_k = \mathbf{G}\,\mathrm{diag}(\mathbf{h}_k^{\mathrm{r}}) \in \mathbb{C}^{M \times N}$ is the cascaded channel between the user $k$ and the BS through reflection at the IRS, and $n_k \sim \mathcal{CN}(0, \sigma_0^2)$ is the additive white Gaussian noise.

A block-fading model is assumed in which the channel coefficients remain constant during a coherence block, but change independently from block to block. An achievable rate $R_k$ for user $k$ can be computed as

$$R_k = \log\left(1 + \frac{\left|(\mathbf{h}_k^{\mathrm{d}} + \mathbf{A}_k \mathbf{v})^{\top} \mathbf{w}_k\right|^2}{\sum_{i=1, i \neq k}^{K} \left|(\mathbf{h}_k^{\mathrm{d}} + \mathbf{A}_k \mathbf{v})^{\top} \mathbf{w}_i\right|^2 + \sigma_0^2}\right), \tag{3}$$

where multiuser interference is treated as noise. The beamforming vectors $\mathbf{w}_k$'s at the BS and phase shifts $\mathbf{v}$ at the IRS can be jointly optimized to maximize a network utility $U(R_1, \ldots, R_K)$, which is a function of the achievable rates of all the users. Common utility functions include the sum rate $\sum_{k=1}^{K} R_k$, and the minimum rate $\min_k R_k$.

To optimize the transmit beamformers at the BS and the reflection coefficients at the IRS, the knowledge of the cascaded channel matrix $\mathbf{A}_k$ and the channel vector $\mathbf{h}_k^{\mathrm{d}}$ for $k = 1, \cdots, K$ is needed either explicitly or implicitly. Toward this end, a pilot transmission phase is used before the data transmission phase to gain knowledge of $\mathbf{A}_k$'s and $\mathbf{h}_k^{\mathrm{d}}$'s. By uplink-downlink channel reciprocity, channel estimation can take place in the uplink [13]. Specifically, we use an uplink pilot transmission phase, in which the user $k$ sends a pilot sequence, $x_k(\ell)$, $\ell = 1, \cdots, L$, which is reflected through the IRS and received at the BS. The received signal $\mathbf{y}(\ell)$ at the BS in the time slot $\ell$ can be expressed as

$$\mathbf{y}(\ell) = \sum_{k=1}^{K} \left(\mathbf{h}_k^{\mathrm{d}} + \mathbf{G}\,\mathrm{diag}(\mathbf{v}(\ell))\,\mathbf{h}_k^{\mathrm{r}}\right) x_k(\ell) + \mathbf{n}(\ell) \tag{4}$$

$$\phantom{\mathbf{y}(\ell)} = \sum_{k=1}^{K} \left(\mathbf{h}_k^{\mathrm{d}} + \mathbf{A}_k \mathbf{v}(\ell)\right) x_k(\ell) + \mathbf{n}(\ell), \tag{5}$$

where $\mathrm{diag}(\mathbf{v}(\ell))$ is a diagonal matrix with $\mathbf{v}(\ell)$, the phase shifts of IRS in time slot $\ell$ of the pilot phase, on its diagonal, and $\mathbf{n}(\ell) \sim \mathcal{CN}(\mathbf{0}, \sigma_1^2 \mathbf{I})$ is the additive Gaussian noise. Note that there are $(M + N)K + MN$ unknown channel coefficients in $\mathbf{G}$, $\mathbf{h}_k^{\mathrm{d}}$'s and $\mathbf{h}_k^{\mathrm{r}}$'s, for $k = 1, \cdots, K$. Since the number of elements $N$ in a typical IRS is generally large (possibly in the hundreds), it is challenging to estimate the channels when the pilot length is short.

B. Problem Formulation

The main idea of this paper is that since the final goal is to optimize the rates $R_k$'s, instead of explicitly estimating all the channel coefficients as an intermediary step, we can exploit the pilot phase more efficiently by mapping the received pilots directly to the optimized transmission strategy for utility maximization, in effect, bypassing channel estimation.

To this end, we propose to design the optimal beamforming vectors $\mathbf{w}_k$'s and the reflection phase shifts $\mathbf{v}$ based directly on the received pilots $\mathbf{Y} = [\mathbf{y}(1), \mathbf{y}(2), \cdots, \mathbf{y}(L)] \in \mathbb{C}^{M \times L}$. Specifically, given the matrix $\mathbf{Y}$, our goal is to solve the following optimization problem

$$\begin{aligned} \underset{(\mathbf{W}, \mathbf{v}) = g(\mathbf{Y})}{\text{maximize}} \quad & \mathbb{E}\left[U\left(R_1(\mathbf{v}, \mathbf{W}), \ldots, R_K(\mathbf{v}, \mathbf{W})\right)\right] \\ \text{subject to} \quad & \sum_{k} \|\mathbf{w}_k\|^2 \le P_d, \\ & |v_i| = 1, \quad i = 1, 2, \cdots, N, \end{aligned} \tag{6}$$

where $\mathbf{W} = [\mathbf{w}_1, \cdots, \mathbf{w}_K]$ is the beamforming matrix at the BS, $g(\cdot)$ is a function that maps the received pilots to the beamforming matrix $\mathbf{W}$ and the phase shift vector $\mathbf{v}$, and $U(\cdot)$ is the network utility function. The expectation here is over the random channel realizations and the noise in the uplink pilot transmission phase. In the rest of the paper, we use the sum-rate maximization and the max-min fairness problem as examples to illustrate our idea, but the proposed method can be extended to other utility functions as well.

Solving problem (6) is computationally challenging, because it is a variational optimization problem with a nonconvex objective function. To tackle this problem, we propose to parameterize the mapping function $g(\cdot)$ by a deep neural network, and to learn the parameters of the neural network from data. This is motivated by the universal approximation property of the neural networks [32]. Before presenting the proposed approach, we first discuss the uplink pilot transmission stage and the conventional approach of channel estimation followed by network utility maximization for solving the problem (6).

III. UPLINK PILOT TRANSMISSION AND CONVENTIONAL CHANNEL ESTIMATION

In this section, we describe the conventional approach to solving the problem (6), which consists of an uplink channel estimation phase and a downlink utility maximization phase. Given the estimated channels, the downlink utility maximization problem is well investigated. For instance, the downlink sum-rate maximization problem can be solved using the algorithm proposed in [5], and the minimum rate maximization problem is studied in [12]. Below we focus on the uplink pilot transmission and the channel estimation step.

A. Uplink Pilot Transmission

We adopt the pilot transmission strategy proposed in [17] to design the uplink pilots and the phase shifts at the IRS in the pilot phase. In particular, the $L$ training slots are divided into $\tau$ sub-frames, each of which consists of $L_0 = K$ symbols (i.e., $L = \tau L_0$). The users simultaneously send their
pilot sequences $\mathbf{x}_k^{\mathsf{H}} = [x_k(1), x_k(2), \cdots, x_k(L_0)]$ of length $L_0$ to the BS, repeated over $\tau$ sub-frames. The pilot sequences of all users are designed to be orthogonal to each other so that they can be decorrelated at the BS, i.e., $\mathbf{x}_{k_1}^{\mathsf{H}} \mathbf{x}_{k_2} = 0$ if $k_1 \neq k_2$ and $\mathbf{x}_{k_i}^{\mathsf{H}} \mathbf{x}_{k_i} = L_0 P_u$ where $P_u$ is the uplink pilot transmission power. Meanwhile, the IRS keeps the phase shifts fixed within each sub-frame, but uses different phase shifts in different sub-frames so that both the users-to-IRS and the IRS-to-BS channels can be measured.

The BS decorrelates the received pilots in each sub-frame by matching the pilot sequence for each user. Let $\bar{\mathbf{Y}}(t) = [\mathbf{y}((t-1)L_0 + 1), \cdots, \mathbf{y}(tL_0)]$ denote the received pilots in sub-frame $t$. Let $\bar{\mathbf{v}}(t)$ be the phase shifts at the IRS in sub-frame $t$. Then, $\bar{\mathbf{Y}}(t)$ can be expressed as:

$$\bar{\mathbf{Y}}(t) = \sum_{k=1}^{K} \left(\mathbf{h}_k^{\mathrm{d}} + \mathbf{A}_k \bar{\mathbf{v}}(t)\right) \mathbf{x}_k^{\mathsf{H}} + \bar{\mathbf{N}}(t), \quad t = 1, \cdots, \tau, \tag{7}$$

where $\bar{\mathbf{N}}(t)$ is a noise matrix with each column independently distributed as $\mathcal{CN}(\mathbf{0}, \sigma_1^2 \mathbf{I})$. By the orthogonality of the pilots, we can form $\bar{\mathbf{y}}_k(t) \in \mathbb{C}^{M}$, i.e., the contribution from user $k$ at the $t$-th sub-frame, as given in [17]:

$$\bar{\mathbf{y}}_k(t) = \frac{1}{L_0} \bar{\mathbf{Y}}(t) \mathbf{x}_k = \mathbf{h}_k^{\mathrm{d}} + \mathbf{A}_k \bar{\mathbf{v}}(t) + \bar{\mathbf{n}}(t) \tag{8}$$

$$\phantom{\bar{\mathbf{y}}_k(t)} \triangleq \mathbf{F}_k \mathbf{q}(t) + \bar{\mathbf{n}}(t), \tag{9}$$

where $\bar{\mathbf{n}}(t) = \frac{1}{L_0} \bar{\mathbf{N}}(t) \mathbf{x}_k$, and the combined channel matrix is defined as $\mathbf{F}_k \triangleq [\mathbf{h}_k^{\mathrm{d}}, \mathbf{A}_k]$ and the combined phase shifts is defined as $\mathbf{q}(t) \triangleq [1, \bar{\mathbf{v}}(t)^{\top}]^{\top}$. Recall that we have $\tau$ sub-frames in total. Then, by denoting $\tilde{\mathbf{Y}}_k = [\bar{\mathbf{y}}_k(1), \cdots, \bar{\mathbf{y}}_k(\tau)]$ as a matrix of received pilots across $\tau$ sub-frames, we have

$$\tilde{\mathbf{Y}}_k = \mathbf{F}_k \mathbf{Q} + \tilde{\mathbf{N}}, \tag{10}$$

where $\mathbf{Q} = [\mathbf{q}(1), \cdots, \mathbf{q}(\tau)]$ and $\tilde{\mathbf{N}} = [\bar{\mathbf{n}}(1), \cdots, \bar{\mathbf{n}}(\tau)]$.

The channel estimation problem aims to estimate the combined matrix $\mathbf{F}_k$ for $k = 1, \ldots, K$. Typically, to ensure that the matrix $\mathbf{Q}$ is full rank so that $\mathbf{F}_k$ can be recovered successfully, we need at least $\tau = N + 1$, i.e., a total of $(N + 1)K$ pilot symbols are needed. When $\tau = N + 1$, one choice of $\mathbf{Q}$ is a DFT matrix as suggested in [16]. For comparison purposes, we also consider the more general case where $\tau \neq N + 1$. In this case, $\mathbf{Q}$ is not a square matrix. One way to construct $\mathbf{Q}$ is to first form a $d \times d$ DFT matrix $\mathbf{Q}_0$ with $d = \max(\tau, N + 1)$, then truncate $\mathbf{Q}_0$ to the first $\tau$ columns or the first $N + 1$ rows. Another possibility is to independently construct vectors $\bar{\mathbf{v}}(t)$, $t = 1, \cdots, \tau$, randomly. Specifically, the phase of $[\bar{\mathbf{v}}(t)]_i$ can be constructed by drawing from a uniform random variable in $[-\pi, \pi)$. In this paper, we use the second approach to construct $\mathbf{Q}$ if $\tau < N + 1$, and use the first approach in other cases. These are empirically good choices.

B. Conventional Channel Estimation

To estimate the channel matrix $\mathbf{F}_k$ from (10), we can use the minimum mean-squared error (MMSE) estimator, obtained by solving the following problem

$$\underset{f(\cdot)}{\text{minimize}} \quad \mathbb{E}\left[\left\|f(\tilde{\mathbf{Y}}_k) - \mathbf{F}_k\right\|_F^2\right]. \tag{11}$$

The optimal solution to problem (11) is given by

$$f(\tilde{\mathbf{Y}}_k) = \mathbb{E}\left[\mathbf{F}_k \mid \tilde{\mathbf{Y}}_k\right]. \tag{12}$$

However, for general channel fading distributions, the optimal solution is computationally intensive to implement. A low-complexity approach is to constrain the estimator $f(\cdot)$ to be linear, which results in the linear MMSE (LMMSE) method. When the rows of $\mathbf{F}_k$ and the rows of $\tilde{\mathbf{N}}$ are i.i.d., the LMMSE estimator is as follows [33]

$$\hat{\mathbf{F}}_k = \left(\tilde{\mathbf{Y}}_k - \mathbb{E}[\tilde{\mathbf{Y}}_k]\right) \left(\mathbb{E}\left[(\tilde{\mathbf{Y}}_k - \mathbb{E}[\tilde{\mathbf{Y}}_k])^{\mathsf{H}} (\tilde{\mathbf{Y}}_k - \mathbb{E}[\tilde{\mathbf{Y}}_k])\right]\right)^{-1} \mathbb{E}\left[(\tilde{\mathbf{Y}}_k - \mathbb{E}[\tilde{\mathbf{Y}}_k])^{\mathsf{H}} (\mathbf{F}_k - \mathbb{E}[\mathbf{F}_k])\right] + \mathbb{E}[\mathbf{F}_k]. \tag{13}$$

The estimates of $\mathbf{h}_k^{\mathrm{d}}$, $\mathbf{A}_k$ can then be obtained from $\hat{\mathbf{F}}_k$. Note that the LMMSE estimator is an optimal solution to (11) only if the unknown $\mathbf{F}_k$ is Gaussian distributed. We note that this LMMSE channel estimation method for multiuser MISO system is also proposed in [12].

IV. PROPOSED DEEP LEARNING FRAMEWORK

The conventional channel estimation approach aims to solve (11), i.e., to recover entries of $\mathbf{F}_k$ given the received pilots $\tilde{\mathbf{Y}}_k$ using a mean squared error metric. However, recovering the channels is not the goal. Our ultimate objective is to maximize the network utility as in (6). The main idea of this paper is to bypass explicit channel estimation and to solve problem (6) directly.

Specifically, we aim to use a neural network to represent the mapping function $g(\cdot)$ in problem (6), and to pursue a data-driven approach to train the neural network so that it mimics the optimal mapping from the received pilot signals to the beamformers and the phase shifts for network utility maximization. This overall framework is depicted in Fig. 2(a). In this section, we describe the neural network architecture suited for this task.

A. Graphical Representation of Users and IRS

A central task in a multiuser cellular network is the management of the interference between the users. Toward this end, the beamformers at the BS and the phase shifts at the IRS must be coordinated so that the mutual interference is minimized. This paper proposes to use a neural network architecture, called GNN, which is based on a graph representation of the beamformers for the users and the phase shifts at IRS, to capture the multiuser interference. The graph consists of $K + 1$ nodes as shown in Fig. 2(b). The IRS is represented by node 0 and the $K$ users are represented by nodes 1 to $K$. A representation vector, denoted as $\mathbf{z}_k$, $k = 0, 1, \cdots, K$, is associated with each node. The goal is to encode all the useful information about each corresponding node in these representation vectors. The representation vectors are updated layer by layer in a GNN, taking into account all the representation vectors in the previous layer as input. After multiple layers, the representation vector of each node would contain sufficient information for designing the beamforming vectors and the reflection coefficients. Specifically, the GNN is trained in such a way that the phase shifts of the IRS can be
(a) (b)
Fig. 2. (a) Proposed deep learning framework for directly designing the beamformers and phase shifts based on the received pilots; (b) Graph representation
of the network. The node representation vector z0 corresponds to the IRS, and z1 , · · · , zK correspond to the users.
subsequently obtained from z0 , and the beamforming matrix aggregation and combination layers, and a final layer that
at the BS can be obtained from z1 , · · · , zK . includes normalization. It takes the input feature from the user
As compared to a fully connected neural network, the GNN node of the graph as the initial value of the representation
more naturally captures the interactions between the users and vector, denoted as zk0 , then updating them through the D layers
the IRS. By explicitly embedding these interactions into the to produce zkd , d = 1, · · · , D, and finally a linear layer to
neural network architecture, the GNN is better able to learn a mechanism for reducing the interference between users. In particular, the update of each user node is a function of all its neighboring user nodes and the IRS node, which enables the GNN to learn to avoid interference. The update of the IRS node is a function of all the user nodes, which enables the GNN to learn to configure the phase shifts to spatially separate the channels for all the users.

A useful feature of the GNN is that it is able to capture the permutation invariant and permutation equivariant properties [28], [30] of the network utility maximization problem (6). That is, if we permute the index labels of the users in the problem, the neural network should output the same set of beamforming vectors $w_k$ with permuted indices and the same phase shifts $v$. Here, permutation invariance means that the phase shifts $v$ are independent of the ordering of the user channels, and permutation equivariance means that if the user channels are permuted, the beamforming vectors $w_k$ are permuted in the same way. These properties are not easy to learn in a fully connected neural network, but are naturally embedded in the GNN.

By tailoring to the problem structure, the GNN also reduces the model complexity as compared to the fully connected neural network. More importantly, the parameters of the GNN can be tied across the users, so that it can be easily generalized to scenarios with different numbers of users. This is in contrast to fully connected neural networks, whose parameters need to scale with the number of users, which makes it difficult to generalize.

B. GNN Architecture

We now describe the proposed GNN architecture and the training process. The overall GNN aims to learn the graph representation vector $z_k$ through an initialization layer, $D$ updating layers, and a final linear layer to produce $z_k^{D+1}$, which is mapped to the beamformer matrix $W$ and the phase shifts $v$ via normalization. The overall architecture is shown in Fig. 3.

1) Initialization Layer: The initialization layer takes the input features from the user nodes as the initial representation vector $z_k^0$, $k = 1, \cdots, K$, then trains one layer of the neural network to produce $z_k^1$, $k = 0, \cdots, K$, for the subsequent layers. Note that the IRS node does not have input features, because only the users transmit pilots.

The input features from the user nodes are simply the received pilots, i.e., the vectorized form of the matrix $\tilde{Y}_k$ with real and imaginary components separated:

$$ z_k^0 = [\mathrm{vec}(\Re\{\tilde{Y}_k\})^\top, \mathrm{vec}(\Im\{\tilde{Y}_k\})^\top]^\top. \tag{14} $$

As mentioned earlier, it is easy for the neural network to incorporate additional useful information about each user into the input feature vector $z_k^0$. For example, if the locations of the users are available, the input feature vector $z_k^0$ can be

$$ z_k^0 = [\mathrm{vec}(\Re\{\tilde{Y}_k\})^\top, \mathrm{vec}(\Im\{\tilde{Y}_k\})^\top, l_k^\top]^\top, \tag{15} $$

where $l_k$ is the three-dimensional vector denoting the coordinates of the location of the user $k$.

Given the input feature vector $z_k^0$, we use a layer of fully connected neural networks, denoted as $f_w^0(\cdot)$, to produce $z_k^1$ for the user nodes, i.e.,

$$ z_k^1 = f_w^0(z_k^0), \quad k = 1, \cdots, K. \tag{16} $$

For the IRS node, we take inputs from all the user nodes and process them using a permutation invariant function $\psi_0(\cdot)$ first, then a fully connected neural network $f_v^0(\cdot)$ as

$$ z_0^1 = f_v^0\big(\psi_0(z_1^0, \cdots, z_K^0)\big). \tag{17} $$
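The initialization layer described above can be illustrated with a small sketch. It is not the authors' implementation: the helper names `input_features` and `mean_aggregate`, the toy sizes, and the random pilot values are all assumptions made for illustration. The sketch builds the feature vector of (14) and (15) from a complex pilot matrix, applies an element-wise mean over the users as the permutation invariant function, and checks that relabeling the users leaves the IRS-node input unchanged. The fully connected layers $f_w^0$ and $f_v^0$ are omitted.

```python
import random


def input_features(pilots, location=None):
    """Build z_k^0 as in (14), or (15) when a location is supplied.

    `pilots` is a list of rows holding the complex pilot matrix of one user.
    """
    n_rows, n_cols = len(pilots), len(pilots[0])
    # vec(.) stacks the columns of the matrix into one long vector.
    vec = [pilots[r][c] for c in range(n_cols) for r in range(n_rows)]
    features = [x.real for x in vec] + [x.imag for x in vec]
    if location is not None:
        features += list(location)  # optional 3-D user coordinates l_k
    return features


def mean_aggregate(user_features):
    """Element-wise mean over users: a permutation invariant aggregation."""
    num_users = len(user_features)
    return [sum(column) / num_users for column in zip(*user_features)]


random.seed(0)
M, L, K = 2, 3, 4  # toy sizes (hypothetical), chosen to keep the example small
pilots = [
    [[complex(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(L)]
     for _ in range(M)]
    for _ in range(K)
]
z0 = [input_features(Y_k) for Y_k in pilots]
irs_input = mean_aggregate(z0)
irs_input_permuted = mean_aggregate(z0[::-1])  # relabel the users

# The aggregate fed to the IRS node does not depend on the user ordering
# (up to floating-point rounding).
max_gap = max(abs(a - b) for a, b in zip(irs_input, irs_input_permuted))
print(len(z0[0]), max_gap < 1e-12)
```

Because the mean is taken element by element, the same aggregation works for any number of users, which is what allows the learned layers to be reused when users are added or removed.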
Fig. 3. Overall graph neural network architecture with an initialization layer, D updating layers, and a final normalization layer.
(a) The IRS node. (b) The user node k.
Fig. 4. Aggregation and combination operations of the d-th layer for (a) the IRS node, and (b) the user nodes.
In the implementation, we choose $\psi_0$ as the element-wise mean function, i.e.,

$$ [\psi_0(z_1^0, \cdots, z_K^0)]_i = \frac{1}{K} \sum_{k=1}^{K} [z_k^0]_i. \tag{18} $$

This is a reasonable choice, because all the pilots are reflected through the IRS, so all the received pilots contain an equal amount of information about the IRS.

The representation vectors $(z_0^1, z_1^1, \cdots, z_K^1)$ now contain features about the IRS and the user channels, respectively, and are subsequently passed to the $D$ updating layers of the GNN in order to eventually produce the IRS reflective coefficients $v$ from $z_0^D$ and the user beamformers $w_k$ from $z_k^D$.

2) Updating Layers: The update of the representation vector $z_k$ in the $d$-th layer is based on combining its previous representation and the aggregation of the representations from its neighboring nodes. In a general GNN, this is given by [34]

$$ z_k^d = f_{\mathrm{combine}}^d\Big(z_k^{d-1}, f_{\mathrm{aggregate}}^d\big(\{z_j^{d-1}\}_{j \in \mathcal{N}(k)}\big)\Big), \tag{19} $$

where $\mathcal{N}(k)$ denotes the set of neighboring nodes of the node $k$, and $f_{\mathrm{aggregate}}^d(\cdot)$ and $f_{\mathrm{combine}}^d(\cdot)$ are the aggregation function and the combining function of the layer $d$, respectively.

A key step in designing the GNN is to choose a suitable aggregation function $f_{\mathrm{aggregate}}^d(\cdot)$ and combining function $f_{\mathrm{combine}}^d(\cdot)$ in (19) so that the GNN is scalable and generalizes well. An efficient implementation of $f_{\mathrm{aggregate}}^d(\cdot)$ takes the following form [30], [35]:

$$ f_{\mathrm{aggregate}}^d\big(\{z_j^{d-1}\}_{j \in \mathcal{N}(k)}\big) = \psi\big(\{f_{\mathrm{nn}}^d(z_j^{d-1})\}_{j \in \mathcal{N}(k)}\big), \tag{20} $$

where $\psi(\cdot)$ is a function that is invariant to the permutation of the inputs, e.g., element-wise max-pooling or element-wise
mean pooling, and $f_{\mathrm{nn}}^d(\cdot)$ is a fully connected neural network. In addition, the combining function $f_{\mathrm{combine}}^d(\cdot)$ in (19) can also be implemented by a fully connected neural network for complicated optimization problems [30], [35].

We adopt this framework to design the GNN for solving the problem (6). We should note that our problem requires permutation equivariance with respect to the user nodes and permutation invariance with respect to the IRS node, so the aggregation and combination operations for the IRS and the user nodes are designed differently.

For the IRS node, we derive $z_0^d$, the node representation vector in the $d$-th layer, from the previous layer as follows:

$$ z_0^d = f_2^d\Big(f_0^d(z_0^{d-1}), \psi_0\big(f_1^d(z_1^{d-1}), \cdots, f_1^d(z_K^{d-1})\big)\Big), \tag{21} $$

where $f_0^d(\cdot)$, $f_1^d(\cdot)$ and $f_2^d(\cdot)$ are fully connected neural networks, and the aggregation function $\psi_0(\cdot)$ is chosen to be the element-wise mean function over the users as in (18), which performs well empirically and corresponds to the fact that the IRS reflective pattern needs to serve all users.

For the user nodes, we recognize that aggregation should be with respect to all the other user nodes excluding the IRS node and propose the update equation for the node representation vectors $z_k^d$ for $k = 1, \cdots, K$ in the $d$-th layer to take the form

$$ z_k^d = f_4^d\Big(f_0^d(z_0^{d-1}), z_k^{d-1}, \psi_1\big(\{f_3^d(z_j^{d-1})\}_{\forall j \neq 0, j \neq k}\big)\Big), \tag{22} $$

where $f_3^d(\cdot)$, $f_4^d(\cdot)$ are fully connected neural networks and $\psi_1(\cdot)$ is chosen to be the element-wise max-pooling function,

$$ [\psi_1(z_1, \cdots, z_K)]_i = \max([z_1]_i, \cdots, [z_K]_i), \tag{23} $$

which performs well empirically and corresponds to the fact that multiuser interference is typically dominated by the strongest user.

The update of the node representation vectors in (21) and (22) is shown in Fig. 4(a) and Fig. 4(b), respectively. We remark that there are many different design choices for the aggregation and combination operations. There is no general theory on how to choose these permutation invariant functions. In most works in the literature, the GNN architectures are designed based on empirical trials. In the simulation section of this paper, we shall see that the adopted GNN framework and the choice of the permutation invariant functions can achieve excellent performance.

3) Normalization Layer: After $D$ update layers, the representation vectors produced by the GNN, i.e., $z_k^D$, $k = 0, \cdots, K$, are passed to a normalization layer to produce the reflective coefficients $v \in \mathbb{C}^N$ and the beamforming matrix $W \in \mathbb{C}^{M \times K}$, while ensuring that the unit modulus constraints on $v$ and the total power constraint on $W$ are satisfied.

To this end, we first take $z_0^D$ as the input to a linear layer $f_v^{D+1}(\cdot)$ with $2N$ fully connected units as follows:

$$ z_0^{D+1} = f_v^{D+1}(z_0^D) \in \mathbb{R}^{2N}. \tag{24} $$

Then a normalization layer outputs the real and imaginary components of the reflection coefficients $v$ as follows:

$$ Z_v = [z_0^{D+1}(1 : N), z_0^{D+1}(N + 1 : 2N)] \in \mathbb{R}^{N \times 2}, \tag{25} $$

$$ v_i = \frac{[Z_v]_{i1}}{\sqrt{[Z_v]_{i1}^2 + [Z_v]_{i2}^2}} + j \frac{[Z_v]_{i2}}{\sqrt{[Z_v]_{i1}^2 + [Z_v]_{i2}^2}}, \quad \forall i, \tag{26} $$

where $[Z]_{ik}$ denotes the element in the $i$-th row and $k$-th column of the matrix $Z$, and the notation $z(i_1 : i_2)$ denotes the subvector of $z$ indexed from $i_1$ to $i_2$.

Similarly, to output the beamforming matrix, we first pass $z_k^D$ through a linear layer $f_w^{D+1}(\cdot)$ with $2M$ units

$$ z_k^{D+1} = f_w^{D+1}(z_k^D) \in \mathbb{R}^{2M}, \quad k = 1, \cdots, K, \tag{27} $$

then use the following normalization steps to produce the beamforming matrix $W$:

$$ Z_w = [z_1^{D+1}, \cdots, z_K^{D+1}] \in \mathbb{R}^{2M \times K}, \tag{28} $$

$$ Z_w = \sqrt{P_d} \, \frac{Z_w}{\|Z_w\|_F}, \tag{29} $$

$$ W = Z_w(1 : M, :) + j Z_w(M + 1 : 2M, :), \tag{30} $$

where the notation $Z(i_1 : i_2, :)$ denotes the submatrix of $Z$ constructed by taking the rows of $Z$ indexed from $i_1$ to $i_2$.

Note that throughout the GNN architecture, we use the same $f_w^0(\cdot)$, $f_0^d(\cdot)$, $f_1^d(\cdot)$, $f_2^d(\cdot)$, $f_3^d(\cdot)$, $f_4^d(\cdot)$, and $f_w^{D+1}(\cdot)$ to update the node representation vectors for all the user nodes. This allows the GNN to be generalizable to an IRS network with an arbitrary number of users. The learned combining and aggregation operations (21) and (22) are independent of the number of users. If we increase or decrease the number of users in the system, we only need to increase or decrease the number of nodes in the graph; the same learned combining and aggregation operations can still be used without having to re-train the neural network.

C. Neural Network Training

Since the existing deep learning software packages do not support complex-valued operations, to compute the network utility during the training phase, we rewrite the achievable rate $R_k$ as a function of the real and imaginary parts of $w_k$ and $v$ as follows:

$$ R_k = \log\left(1 + \frac{\|\gamma_k\|^2}{\sum_{i=1, i \neq k}^{K} \|\gamma_i\|^2 + \sigma_0^2}\right), \tag{31} $$

where

$$ \gamma_i = \begin{bmatrix} \Re\{w_i^\top\} & -\Im\{w_i^\top\} \\ \Im\{w_i^\top\} & \Re\{w_i^\top\} \end{bmatrix} \cdot \left( \begin{bmatrix} \Re\{h_k^d\} \\ \Im\{h_k^d\} \end{bmatrix} + \begin{bmatrix} \Re\{A_k\} & -\Im\{A_k\} \\ \Im\{A_k\} & \Re\{A_k\} \end{bmatrix} \begin{bmatrix} \Re\{v\} \\ \Im\{v\} \end{bmatrix} \right). \tag{32} $$

Given this real representation of $R_k$, the loss function of the GNN can be expressed as $-\mathbb{E}[U(R_1(v, W), \ldots, R_K(v, W))]$. Note that we need CSI to generate training samples and to compute the network utility, but once trained, the operation of the neural network does not require CSI. We remark that the training of the
Fig. 5. Simulation layout of IRS assisted communication system.
neural network is done offline, so it does not affect the run-time complexity of the proposed approach.

During training, the neural network learns to adjust its weights to maximize the network utility, i.e., the objective function of problem (6), in an unsupervised manner, using the stochastic gradient descent method. The updates of the neural network parameters, including the computation of the corresponding gradients, can be automatically implemented using any standard numerical deep learning software package.

The overall end-to-end training allows us to jointly design the beamforming matrix and phase shifts from the received pilots directly. As shown in the simulation results in the next two sections, the proposed deep learning method is able to solve problem (6) more efficiently, in the sense that it needs fewer pilots to achieve the same performance as the conventional approach of separate channel estimation and network utility maximization.

V. PERFORMANCE FOR SUM-RATE MAXIMIZATION

In this section, we evaluate the performance of the proposed deep learning method for the problem of sum-rate maximization in comparison to the channel estimation based approach. Specifically, the sum-rate maximization problem is given as

$$
\begin{aligned}
\underset{(W, v) = g(Y)}{\text{maximize}} \quad & \mathbb{E}\Big[\sum_k R_k(v, W)\Big] \\
\text{subject to} \quad & \sum_k \|w_k\|^2 \le P_d, \\
& |v_i| = 1, \quad i = 1, 2, \cdots, N,
\end{aligned} \tag{33}
$$

so the loss function can be set as $-\mathbb{E}[\sum_k R_k(v, W)]$.

A. Simulation Setting

We consider an IRS assisted multiuser MISO communication system as illustrated in Fig. 5, consisting of a BS with 8 antennas and an IRS with 100 passive elements. As shown in Fig. 5, the $(x, y, z)$-coordinates of the BS and the IRS locations in meters are (100,100, 0) and (0, 0, 0), respectively. There are 3 users uniformly distributed in a rectangular area $[5, 35] \times [-35, 35]$ in the $(x, y)$-plane with $z = -20$ as shown in Fig. 5. We assume that the IRS is equipped with a uniform rectangular array placed on the $(y, z)$-plane in a $10 \times 10$ configuration. The BS antennas have a uniform linear array configuration parallel to the $x$-axis.

We assume that the direct channels from the BS to the users follow Rayleigh fading, i.e.,

$$ h_k^d = \beta_{0,k} \tilde{h}_k^d, \tag{34} $$

where $\tilde{h}_k^d \sim \mathcal{CN}(0, I)$, and $\beta_{0,k}$ denotes the path-loss of the direct link between the BS and the user $k$, modeled (in dB) as $32.6 + 36.7 \log(d_k^{BU})$, where $d_k^{BU}$ is the distance of the direct link from the BS to the user $k$. We assume that the IRS is deployed at a location where a line-of-sight channel exists between the IRS and the users/BS, so we model the channels $h_k^r$ between the IRS and the user $k$ and the channel $G$ between the BS and the IRS as Rician fading channels:

$$ h_k^r = \beta_{1,k} \left( \sqrt{\frac{\varepsilon}{1 + \varepsilon}} \, \tilde{h}_k^{r,\mathrm{LOS}} + \sqrt{\frac{1}{1 + \varepsilon}} \, \tilde{h}_k^{r,\mathrm{NLOS}} \right), \tag{35} $$

$$ G = \beta_2 \left( \sqrt{\frac{\varepsilon}{1 + \varepsilon}} \, \tilde{G}^{\mathrm{LOS}} + \sqrt{\frac{1}{1 + \varepsilon}} \, \tilde{G}^{\mathrm{NLOS}} \right), \tag{36} $$

where the superscript LOS represents the line-of-sight part of the channel and the superscript NLOS represents the non-line-of-sight part, $\varepsilon$ is the Rician factor which is set to be 10 in simulations, and $\beta_{1,k}$, $\beta_2$ are the path-loss from the IRS to the user $k$ and the path-loss from the BS to the IRS, respectively. The entries of $\tilde{G}^{\mathrm{NLOS}}$ and $\tilde{h}_k^{r,\mathrm{NLOS}}$ are modeled as i.i.d. standard Gaussian distributions, i.e., $[\tilde{G}^{\mathrm{NLOS}}]_{ij} \sim \mathcal{CN}(0, 1)$ and $[\tilde{h}_k^{r,\mathrm{NLOS}}]_i \sim \mathcal{CN}(0, 1)$. The path-losses (in dB) of the BS-IRS link and the IRS-user link are modeled as $30 + 22 \log(d^{BI})$ and $30 + 22 \log(d_k^{IU})$, respectively, where $d^{BI}$ is the distance between the BS and the IRS, and $d_k^{IU}$ is the distance between the IRS and the user $k$ [5], [7]. The uplink pilot transmit power and the downlink data transmit power are set to be 15dBm and 20dBm, and the uplink noise power is −100dBm and the downlink noise power is −85dBm unless otherwise stated.

The line-of-sight part of the channel $h_k^r$ is a function of the IRS/user locations. Specifically, let $\phi_{3,k}^*$, $\theta_{3,k}^*$ denote the azimuth and elevation angles of arrival from the user $k$ to the
IRS, as shown in Fig. 5. Then the $n$-th element of the IRS steering vector $a_{\mathrm{IRS}}(\phi_{3,k}^*, \theta_{3,k}^*)$ can be expressed as [36]

$$ [a_{\mathrm{IRS}}(\phi_{3,k}^*, \theta_{3,k}^*)]_n = e^{j \frac{2\pi d_{\mathrm{IRS}}}{\lambda_c} \{ i_1(n) \sin(\phi_{3,k}^*) \cos(\theta_{3,k}^*) + i_2(n) \sin(\theta_{3,k}^*) \}}, \tag{37} $$

where $d_{\mathrm{IRS}}$ is the distance between two adjacent IRS elements, $\lambda_c$ is the carrier wavelength, $i_1(n) = \mathrm{mod}(n - 1, 10)$, and $i_2(n) = \lfloor (n - 1)/10 \rfloor$. In the simulations, we assume $\frac{2 d_{\mathrm{IRS}}}{\lambda_c} = 1$ without loss of generality. Thus, the line-of-sight channel $\tilde{h}_k^{r,\mathrm{LOS}}$ can be written as

$$ \tilde{h}_k^{r,\mathrm{LOS}} = a_{\mathrm{IRS}}(\phi_{3,k}^*, \theta_{3,k}^*). \tag{38} $$

Let $(x_k, y_k, z_k)$ denote the location of the user $k$ and $(x^{\mathrm{IRS}}, y^{\mathrm{IRS}}, z^{\mathrm{IRS}})$ denote the location of the IRS. Then we have

$$ \sin(\phi_{3,k}^*) \cos(\theta_{3,k}^*) = \frac{y_k - y^{\mathrm{IRS}}}{d_k^{IU}}, \tag{39} $$

$$ \sin(\theta_{3,k}^*) = \frac{z_k - z^{\mathrm{IRS}}}{d_k^{IU}}, \tag{40} $$

where $d_k^{IU}$ is the distance between the IRS and the user $k$. Similarly, let $\phi_1^*$, $\theta_1^*$ denote the azimuth and elevation angles of arrival (AoA) to the BS; then the steering vector of the BS is given by

$$ a_{\mathrm{BS}}(\phi_1^*, \theta_1^*) = [1, \cdots, e^{j \frac{2\pi (M - 1) d_{\mathrm{BS}}}{\lambda_c} \cos(\phi_1^*) \cos(\theta_1^*)}], \tag{41} $$

where $d_{\mathrm{BS}}$ is the distance between two adjacent BS antennas, and we assume $\frac{2 d_{\mathrm{BS}}}{\lambda_c} = 1$. Let $\phi_2^*$, $\theta_2^*$ denote the azimuth and elevation angles of departure (AoD) from the IRS to the BS; then we can write

$$ \tilde{G}^{\mathrm{LOS}} = a_{\mathrm{BS}}(\phi_1^*, \theta_1^*) \, a_{\mathrm{IRS}}(\phi_2^*, \theta_2^*)^H. \tag{42} $$

Given the location of the BS $(x^{\mathrm{BS}}, y^{\mathrm{BS}}, z^{\mathrm{BS}})$ and the location of the IRS $(x^{\mathrm{IRS}}, y^{\mathrm{IRS}}, z^{\mathrm{IRS}})$, we have

$$ \cos(\phi_1^*) \cos(\theta_1^*) = \frac{x^{\mathrm{IRS}} - x^{\mathrm{BS}}}{d^{BI}}, \tag{43} $$

$$ \sin(\phi_2^*) \cos(\theta_2^*) = \frac{y^{\mathrm{BS}} - y^{\mathrm{IRS}}}{d^{BI}}, \tag{44} $$

$$ \sin(\theta_2^*) = \frac{z^{\mathrm{BS}} - z^{\mathrm{IRS}}}{d^{BI}}, \tag{45} $$

where $d^{BI}$ is the distance between the IRS and the BS.

B. Neural Network Training and Testing

We use a 2-layer GNN as proposed in Section IV, i.e., with $D = 2$. The parameters of the fully connected neural networks $f_w^0$, $f_v^0$ and $f_k^d$ with $k = 0, \cdots, 4$, $d = 1, 2$ are summarized in Table I.

TABLE I
ARCHITECTURE OF THE FULLY CONNECTED NEURAL NETWORKS.
Name                                      Size                  Activation Function
$f_w^0$                                   2Mτ × 1024 × 512      relu
$f_v^0$                                   512 × 1024 × 512      relu
$f_0^1, f_1^1, f_2^1, f_3^1, f_4^1$       512 × 512 × 512       relu
$f_0^2, f_1^2, f_2^2, f_3^2, f_4^2$       512 × 512 × 512       relu

We implement the proposed network using the deep learning library TensorFlow [37]. The neural network is trained using the Adam optimizer [38] with an initial learning rate $10^{-3}$, and the learning rate is decreased after 300 iterations by a factor of 0.98. At each training epoch, we iterate 100 times to update the parameters of the neural network, and 1024 training samples are used to compute the gradients in each iteration. We terminate the training process if the loss function does not decrease on the validation data set over 10 consecutive training epochs.

In the testing stage, we compare the neural network with the following benchmarks:

• Benchmark 1. Perfect CSI with BCD: Given perfect CSI, the sum-rate maximization problem is solved by the block coordinate descent (BCD) algorithm proposed in [5]. We stop the BCD algorithm when the increase in sum rate between two consecutive iterations is below $10^{-3}$.

• Benchmark 2. LMMSE channel estimation with BCD: We first estimate the channels using the LMMSE estimator developed in Section III, then perform sum-rate maximization using the BCD algorithm [5]. The required statistics for the LMMSE estimator are computed from 10,000 channel realizations.

• Benchmark 3. Deep learning based channel estimation with BCD: To understand the benefit of implicit vs. explicit channel estimation, we implement a neural network to explicitly estimate all the channels, followed by BCD for designing the phase shifts and the beamformers, and compare its performance with the implicit channel estimation strategy proposed in this paper. Specifically, the neural network architecture is almost the same as the proposed GNN for the sum-rate maximization problem, except that the graph node $k = 0$ is removed since we only have $K$ channel matrices to be estimated. For the graph node $k \neq 0$, the last layer representation vector $z_k^D$ is fed to a linear layer of size $2M(N + 1)$ to output the vectorized channel matrix estimation $\hat{F}_k$. The inputs to the GNN for the channel estimation are also $\tilde{Y}_1, \cdots, \tilde{Y}_K$, and the loss function for channel estimation is the averaged MSE of the channels over all the users, i.e., $\frac{1}{K} \sum_k \mathbb{E}[\|\hat{F}_k - F_k\|_F^2]$. The implementation of this neural network is the same as the proposed GNN for solving the sum-rate maximization problem (33) but with a different initial learning rate $10^{-4}$.

C. Numerical Results

We first illustrate the impact of uplink pilot length on the downlink sum rate in Fig. 6(a). All the points are averaged over 1000 channel realizations. As a trivial baseline, we also show the performance of the system with random phase shifts, while the beamforming matrix $W$ is optimized using the WMMSE algorithm [39] assuming perfect CSI, to see how much gain we can obtain by jointly optimizing the phase shifts and the beamforming matrix.

From Fig. 6(a), using only 30 pilots, the proposed deep learning approach without location information can achieve
(a) Sum rate vs. pilot length. (b) Testing sum rate vs. training epochs.
Fig. 6. Performance of the proposed GNN for the IRS system with $M = 8$, $N = 100$, $K = 3$, $P_d = 20$dBm, and $P_u = 15$dBm.
(a) Sum rate vs. downlink transmit power for $K = 3$ and $P_u = 15$dBm. (b) Sum rate vs. uplink pilot transmit power for $K = 3$, $P_d = 20$dBm. (c) Sum rate vs. number of users for $P_u = 15$dBm and $P_d = 25$dBm.
Fig. 7. Generalization performance of the GNN in an IRS system with $M = 8$, $N = 100$ and $L = 25K$, where $K$ is the number of users.
about 96% of the sum rate in Benchmark 1 which assumes perfect CSI. The deep learning approach using 15 pilots still achieves better performance than Benchmark 2 with 120 pilots. For Benchmark 2, 303 pilot symbols (that is, $K(N + 1) = 3 \times 101$) are needed for perfect channel reconstruction in the noiseless case. These observations show that the deep learning approach, which directly optimizes the system objective based on the pilots received, can reduce the pilot training overhead significantly.

Further, Fig. 6(a) shows that incorporating the user location information into the input of the proposed GNN can further improve the sum rate when the pilot length is not sufficiently large, as compared to the results without location information. But as the pilot length increases, the improvement becomes marginal. This is because the received pilot signals implicitly contain the location information of the users, so it can be learned by the neural network if the pilot length is sufficiently large.

Fig. 6(a) illustrates that Benchmark 3 (deep learning based channel estimation with BCD) can outperform the conventional LMMSE based channel estimation approach. However, the proposed deep learning approach for directly maximizing the sum rate achieves even better performance, which shows that the neural network for direct rate maximization is able to extract more pertinent information from the received pilots than the neural network for explicit estimation of the channel matrix $\hat{F}_k$. Note also that the dimension of the output of the neural network for solving the sum-rate maximization problem is much smaller than the output of the neural network needed for channel estimation. This is another reason that it is advantageous to maximize the sum rate directly instead of recovering the entries of the channel matrix first.

Next, we show how much data is needed to train the proposed neural network. In Fig. 6(b), we plot the sum rate evaluated on testing data against the training epochs for the deep learning approach without location information. Recall that we sample 102,400 training data in each training epoch. It can be seen from Fig. 6(b) that 10 training epochs are sufficient to achieve a satisfactory performance (greater than 90% of the sum rate achieved with 80 training epochs).
TABLE II
SUM RATE (bps/Hz) WITH $N = 100$, $K = 3$, $P_d = 25$dBm, $P_u = 15$dBm.
M     L     Deep Learning   LMMSE+BCD   Gain   Perfect CSI
8     45    7.45            5.83        28%    8.5
8     75    7.81            6.59        18%    8.5
16    45    9.02            7.76        16%    11.6
16    75    9.70            8.86        9%     11.6

To evaluate the performance of the GNN when the BS is equipped with a larger number of antennas, we train the same GNN under the setting $N = 100$, $K = 3$, $P_d = 25$dBm, $P_u = 15$dBm, for both $M = 8$ and $M = 16$. As can be seen from Table II, the proposed GNN always outperforms the benchmark of LMMSE with BCD. As expected, the gain is larger when the pilot length $L$ is smaller. But we also observe that the gain is smaller when the number of antennas $M$ increases from 8 to 16. This is because the problem dimension increases as the number of BS antennas increases. We would need to increase the number of parameters in the GNN to further improve its performance.

D. Generalizability

We now test the trained neural network on different system parameters to show its generalization capability.

1) Generalization to different downlink transmit powers: We fix the pilot length $L = 75$, and train two neural networks under the settings with downlink transmit power 15dBm and 25dBm, respectively. Then the trained neural networks are tested under different downlink transmit powers. The results are shown in Fig. 7(a). For comparison, we also plot the performance of the neural network with the same training and testing downlink transmit power. All the points are averaged over 1000 channel realizations. From Fig. 7(a), it can be observed that the proposed deep learning approach can significantly outperform Benchmark 2 under the different downlink transmit powers. Furthermore, training the neural network at either 15dBm or 25dBm (but testing it at different powers) only incurs small losses. This suggests that the proposed neural network generalizes well to different downlink transmit powers.

2) Generalization to different uplink pilot transmit powers: In this scenario, we fix the pilot length $L = 75$ and the downlink transmit power $P_d = 20$dBm. We first train a neural network under the setting with uplink pilot transmit power $P_u = 15$dBm. We then change the pilot transmit power to generate the testing data. The results are shown in Fig. 7(b). We also plot the performance of the neural network with the same training and testing uplink transmit power for comparison. All the points are averaged over 1000 channel realizations. In Fig. 7(b), training the neural network at a fixed pilot power of 15dBm (but testing at different pilot powers) achieves almost identical performance as training and testing at the same powers, which implies that the proposed neural network generalizes well under different uplink pilot transmit powers. Besides, the proposed deep learning approach is quite robust to the pilot transmit power variation, while we observe a significant performance degradation of Benchmark 2 as the uplink transmit power decreases.

Fig. 8. Empirical cumulative distribution function of the minimum user rate with $M = 4$, $N = 20$, $K = 3$, $L = 75$, $P_d = 20$dBm, and $P_u = 15$dBm.

3) Generalization to different number of users: In this simulation, the pilot length is set to be $L = 25K$, where $K$ is the number of users. The downlink transmit power is set as $P_d = 25$dBm. We train the proposed neural network with $K = 6$ and test its performance on the settings with different numbers of users. The results are shown in Fig. 7(c). It can be observed that the proposed GNN is able to generalize to different numbers of users, and it always outperforms the explicit channel estimation Benchmark 2.

VI. PERFORMANCE FOR MAXIMIZING MINIMUM RATE

The previous section shows the benefits of bypassing explicit channel estimation for the sum-rate maximization problem. But the sum-rate objective does not provide fairness across the users, which is typically required in practical wireless communication systems. In this section, we consider the problem of maximizing the minimum user rate, i.e., the max-min problem, as formulated below:

$$
\begin{aligned}
\underset{(W, v) = g(Y)}{\text{maximize}} \quad & \mathbb{E}\Big[\min_k R_k(v, W)\Big] \\
\text{subject to} \quad & \sum_k \|w_k\|^2 \le P_d, \\
& |v_i| = 1, \quad i = 1, 2, \cdots, N.
\end{aligned} \tag{46}
$$

For the max-min problem (46), we use the same neural network architecture as in the sum-rate maximization problem, but the loss function is now $-\mathbb{E}[\min_k R_k(v, W)]$. We should note that the minimum user rate function is differentiable with respect to $W$ and $v$ almost everywhere, except at the points where $R_k = R_j$ for $k \neq j$; thus the gradient based back-propagation method can still be applied for training the proposed neural network.

The simulation setting is that of an IRS assisted wireless communication system in which a BS with $M = 4$ and an IRS with $N = 20$ serve $K = 3$ users distributed in the rectangular area of $[5, 15] \times [-15, 15]$ in the $(x, y)$-plane with $z = -20$.
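To make the difference between the sum-rate loss and the max-min loss of (46) concrete, the following sketch evaluates both for a single channel realization. It is a minimal illustration under stated assumptions, not the authors' training code: the names `user_rates`, `min_rate_loss` and `bottleneck_user` are hypothetical, the effective channel values are made up, a base-2 logarithm is assumed so that rates are in bps/Hz, and the expectation over channel realizations (a batch average in practice) is omitted.

```python
import math


def user_rates(g, noise_power):
    """Per-user rates from effective channels.

    g[k][i] is the complex scalar seen by user k from beam i, which in the
    paper's notation would be (h_k^d + A_k v)^T w_i.
    """
    num_users = len(g)
    rates = []
    for k in range(num_users):
        signal = abs(g[k][k]) ** 2
        interference = sum(abs(g[k][i]) ** 2 for i in range(num_users) if i != k)
        # Base-2 logarithm is an assumption, chosen so rates are in bps/Hz.
        rates.append(math.log2(1.0 + signal / (interference + noise_power)))
    return rates


def sum_rate_loss(rates):
    return -sum(rates)


def min_rate_loss(rates):
    return -min(rates)


def bottleneck_user(rates):
    """Index of the user whose rate attains the minimum."""
    return min(range(len(rates)), key=lambda k: rates[k])


g = [[2 + 0j, 0.5j], [0.1 + 0j, 1 + 1j]]  # made-up effective channels
rates = user_rates(g, noise_power=1.0)
print(rates, sum_rate_loss(rates), min_rate_loss(rates), bottleneck_user(rates))
```

Away from ties between user rates, the gradient of the min-rate loss flows only through the bottleneck user's rate, which is why back-propagation still applies almost everywhere.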
TABLE III
GENERALIZABILITY TO NUMBER OF USERS: MINIMUM RATE (bps/Hz) WITH $M = 4$, $N = 20$, $P_d = 20$dBm, AND $P_u = 15$dBm.
L          K     Deep learning   LMMSE+BCD   Perfect CSI
L = 5K     2     0.587           0.529       0.786
L = 5K     3     0.386           0.335       0.496
L = 5K     4     0.274           0.240       0.351
L = 25K    2     0.726           0.620       0.786
L = 25K    3     0.466           0.395       0.496
L = 25K    4     0.315           0.284       0.351

The locations of the BS and the IRS remain the same as in Fig. 5. The downlink transmit power is 20dBm, and the uplink pilot length is 75. The training parameters are the same as in the sum-rate maximization problem.

For the baseline with the LMMSE channel estimator, the max-min problem is solved using the BCD algorithm between the beamformer and the reflective coefficient as proposed in [12], in which semidefinite relaxation (SDR) is used for designing the reflective coefficients. The complexity of SDR is high; this is why we choose a simulation setting with $N = 20$.

In Fig. 8, we plot the empirical cumulative distribution function of the minimum user rate over 1000 channel realizations. As can be seen from Fig. 8, the proposed deep learning method outperforms the baselines with either the LMMSE or the deep learning channel estimation. This shows that the proposed deep learning framework can also be applied to the max-min problem.

In Table III, we test the generalization capability of the trained GNN for settings with different numbers of users. The GNN is trained with 3 users, but is also tested in the case with 2 users and 4 users. It can be observed that the deep learning approach always achieves 87%-90% of the perfect CSI benchmark for $L = 25K$ (where $K$ is the number of users) in all cases, which shows that the GNN generalizes well to settings with different numbers of users. We also observe that the deep learning approach with $5K$ pilot length can achieve over 94% of the minimum rate achieved by the conventional LMMSE approach with $25K$ pilot length, which shows the remarkable pilot length reduction achieved by the proposed GNN for the max-min problem.

VII. INTERPRETATION OF SOLUTIONS FROM GNN

To interpret the solutions obtained by the GNN, in this section, we visually verify that the learned IRS indeed reflects the signal in the desired directions. We train the proposed neural network architecture for both a single-user case and a multiuser case, and use the array response as a way to illustrate qualitatively the correctness of the beamforming pattern. Let $\phi_1$, $\phi_2$, $\phi_3$ denote the azimuth angle of arrival from the IRS to the BS, the azimuth angle of departure from the IRS to the BS, and the azimuth angle of arrival from the user to the IRS, respectively. The corresponding elevation angle parameters are denoted by $\theta_1$, $\theta_2$, $\theta_3$, respectively.

The array response at the IRS is a function of the incident angle and the reflection angle of the wireless signals, which is defined as

$$
\begin{aligned}
f_i(\phi_2, \theta_2, \phi_3, \theta_3) &= |a_{\mathrm{IRS}}(\phi_2, \theta_2)^H \mathrm{diag}(v) \, a_{\mathrm{IRS}}(\phi_3, \theta_3)| \\
&= |a_{\mathrm{IRS}}(\phi_2, \theta_2)^H \mathrm{diag}(a_{\mathrm{IRS}}(\phi_3, \theta_3)) \, v| \\
&= |v^H \tilde{a}_{\mathrm{IRS}}(\phi_2, \theta_2, \phi_3, \theta_3)|,
\end{aligned} \tag{47}
$$

where

$$ [\tilde{a}_{\mathrm{IRS}}(\phi_2, \theta_2, \phi_3, \theta_3)]_n = e^{j\pi \{ i_1(n) (\sin(\phi_3) \cos(\theta_3) - \sin(\phi_2) \cos(\theta_2)) + i_2(n) (\sin(\theta_3) - \sin(\theta_2)) \}}, $$

and $i_1(n) = \mathrm{mod}(n - 1, 10)$ and $i_2(n) = \lfloor (n - 1)/10 \rfloor$. Similarly, the array response at the BS is a function of the angles $\phi_1$, $\theta_1$ given by

$$ f_b(\phi_1, \theta_1) = |a_{\mathrm{BS}}(\phi_1, \theta_1)^H w|, \tag{48} $$

where

$$ [a_{\mathrm{BS}}(\phi_1, \theta_1)]_m = e^{j\pi (m - 1) (\cos(\phi_1) \cos(\theta_1))}. \tag{49} $$

As there exists a line-of-sight component in the channel between the IRS and the BS/user, we expect the learned beamforming vector $w$ and the learned reflection coefficients $v$ to match the angles of the line-of-sight channels, so that the SNR of the user can be maximized.

In the numerical simulation, the pilot length is set to be $L = 25K$. The locations of the BS and the IRS are $(100, -100, 0)$ and $(0, 0, 0)$ respectively, so that $\phi_1^* = 2.356$, $\theta_1^* = 0$ and $\phi_2^* = -0.785$, $\theta_2^* = 0$.

We first examine the single-user case, in which the user is located at $(30, 20, -20)$, so that $\phi_3^* = 0.588$ and $\theta_3^* = -0.506$. In Fig. 9(a), we plot the learned array responses of the BS as a function of $\phi_1$ for the case $N = 100$ and $M = 8$. It shows that indeed the learned beamforming vector at the BS focuses energy in the direction of the IRS. In Fig. 9(b), we plot the learned array responses of the IRS as a function of $\phi_3$ and $\theta_3$ (with fixed $\phi_2 = \phi_2^*$ and $\theta_2 = \theta_2^*$ according to the BS-to-IRS direction). It is observed that indeed the IRS array response is maximized when $\phi_3 \approx \phi_3^*$ and $\theta_3 \approx \theta_3^*$. This shows that the learned configurations of the IRS indeed reflect the signal in the correct user direction.

Moreover, we investigate the impact of the number of elements $N$ at the IRS on the reflective pattern. For the user at location $(30, 20, -20)$, the array responses of the IRS with $N = 30$ and $N = 50$ elements (placed as a rectangular array with 10 elements horizontally) are shown in Fig. 9(d) and Fig. 9(c), respectively. Combined with Fig. 9(b) with $N = 100$, it is clear that the array response focuses better as the number of elements at the IRS increases.

Next, we examine a multiuser case in which the neural network is trained to maximize the minimum rate over 3 users located at $(5, -12, -20)$, $(5, 0, -20)$ and $(5, 12, -20)$, so that $(\phi_3^*, \theta_3^*) = (-1.176, -0.994), (0, -0.980), (1.176, -0.994)$ for the three users respectively. Fig. 10 shows the BS and IRS array responses learned by the GNN. From Fig. 10(b), we indeed see three peaks matching the angles $\phi_3^*$ and $\theta_3^*$ corresponding to the three users. In addition, we observe that the peak corresponding to the first user is weaker than the other two users, but we can see from Fig. 10(a) that this user has the
[Figure 9, four panels: (a) Array response of BS, M = 8. (b) Array response of IRS, N = 100. (c) Array response of IRS, N = 50. (d) Array response of IRS, N = 30.]
Fig. 9. Array response of the BS and the IRS obtained from GNN for the single-user case: (a) and (b) are for N = 100 and M = 8; (c) and (d) are for
N = 50, 30 and M = 8. The optimal φ∗1 = 2.356, φ∗3 = 0.588, θ∗3 = −0.506.
[Figure 10, two panels: (a) Array response of the BS, with one curve each for User 1, User 2 and User 3. (b) Array response of the IRS.]
Fig. 10. Array response of the BS and the IRS obtained from GNN for maximizing the minimum rate over three users (i.e., K = 3), with N = 100, M = 8.
The optimal (φ∗3, θ∗3) = (−1.176, −0.994), (0, −0.980), (1.176, −0.994) respectively for the three users. The azimuth angle of departure from BS to IRS
is φ∗1 = 2.356.
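The user angles quoted in the caption of Fig. 10 can be sanity-checked against the user coordinates (5, −12, −20) and (5, 12, −20). The sketch below is illustrative only and rests on assumptions that are not stated in this section: the IRS is taken to sit at the origin, azimuth is taken as atan2(y, x) in the x-y plane, and elevation as arcsin(z/d) measured from that plane. The function name `angles_from_irs` is hypothetical. Under the same assumptions the middle position (5, 0, −20) evaluates to an elevation of about −1.326 rather than the quoted −0.980, so it is left out of the example.

```python
import math


def angles_from_irs(user_xyz, irs_xyz=(0.0, 0.0, 0.0)):
    """Return (azimuth, elevation) in radians of a user as seen from the IRS.

    Assumed convention (hypothetical, not taken from this section):
    azimuth is measured in the x-y plane from the x-axis, and elevation
    is measured from the x-y plane. The default IRS position at the
    origin is likewise an assumption.
    """
    dx, dy, dz = (u - r for u, r in zip(user_xyz, irs_xyz))
    distance = math.sqrt(dx * dx + dy * dy + dz * dz)
    azimuth = math.atan2(dy, dx)
    elevation = math.asin(dz / distance)
    return azimuth, elevation


# First and third user positions from the three-user experiment.
for label, position in [("user 1", (5, -12, -20)), ("user 3", (5, 12, -20))]:
    phi3, theta3 = angles_from_irs(position)
    print(f"{label}: phi3 = {phi3:.3f}, theta3 = {theta3:.3f}")
```

With these assumptions the two positions give (−1.176, −0.994) and (1.176, −0.994), matching the first and third angle pairs in the caption.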
strongest array response at the BS. Therefore, the GNN indeed learns to jointly optimize the
phase-shifts and the beamforming matrix to ensure fairness in this scenario of maximizing the
minimum rate across the three users. The weaker IRS array response is compensated by a stronger
BS array response.
We also note from Fig. 10(a) that the responses of the BS are maximized at different angles for
different users. This shows that the BS beamformers have learned to differentiate the three users.
Note that the combined BS and IRS array responses need to minimize the interference among the
users, not just maximize each user's direct channel. The Rayleigh channel components from the BS
to the users also affect the optimum BS and IRS array response patterns.

VIII. CONCLUSION
Conventional communication system design always involves obtaining accurate CSI first, then
designing the optimal transmission scheme according to the CSI. This design strategy is not
practical for IRS due to the large number of passive reflective elements involved. This paper
proposes an approach that learns to configure the IRS and beamforming at the BS to maximize the
system utility function directly based on the received pilots, in effect bypassing explicit channel
estimation. This is accomplished by a generalizable graph neural network architecture that unveils
the direct mapping from the received pilots to the desired IRS configuration and the desired
per-user beamformers at the BS. Simulation results show that the trained neural network produces
interpretable results and can efficiently learn to solve utility maximization problems using far
fewer pilots than the conventional approach.
15
[7] Q. Wu and R. Zhang, “Intelligent reflecting surface enhanced wireless [29] M. Lee, G. Yu, and G. Y. Li, “Graph embedding based wireless link
network via joint active and passive beamforming,” IEEE Trans. Wireless scheduling with few training samples,” IEEE Trans. Wireless Commun.,
Commun., vol. 18, no. 11, pp. 5394–5409, Nov. 2019. vol. 20, no. 4, pp. 2282 – 2294, Apr. 2021.
[8] C. Huang, A. Zappone, G. C. Alexandropoulos, M. Debbah, and [30] Y. Shen, Y. Shi, J. Zhang, and K. B. Letaief, “Graph neural networks for
C. Yuen, “Reconfigurable intelligent surfaces for energy efficiency in scalable radio resource management: Architecture design and theoretical
wireless communication,” IEEE Trans. Wireless Commun., vol. 18, no. 8, analysis,” IEEE J. Sel. Areas Commun., vol. 39, no. 1, pp. 101–115, Jan.
pp. 4157–4170, Aug. 2019. 2020.
[9] J. Zhu, Y. Huang, J. Wang, K. Navaie, and Z. Ding, “Power efficient [31] Ö. Özdogan and E. Björnson, “Deep learning-based phase reconfigura-
IRS-assisted NOMA,” IEEE Trans. Commun., vol. 69, no. 2, pp. 900– tion for intelligent reflecting surfaces,” in Proc. IEEE Asilomar Conf.
913, Feb. 2021. Signals, Syst., Comput., Nov. 2020.
[10] T. Jiang and Y. Shi, “Over-the-air computation via intelligent reflecting [32] K. Hornik, M. Stinchcombe, and H. White, “Multilayer feedforward
surfaces,” in Proc. IEEE Global Commun. Conf. (GLOBECOM), Dec. networks are universal approximators,” Neural Netw., vol. 2, no. 5, pp.
2019. 359–366, 1989.
[11] X. Yu, D. Xu, Y. Sun, D. W. K. Ng, and R. Schober, “Robust and secure [33] E. Björnson and B. Ottersten, “A framework for training-based esti-
wireless communications via intelligent reflecting surfaces,” IEEE J. Sel. mation in arbitrarily correlated Rician MIMO channels with Rician
Areas Commun., vol. 38, no. 11, pp. 2637–2652, Nov. 2020. disturbance,” IEEE Trans. Signal Process., vol. 58, no. 3, pp. 1807–
1820, Mar. 2009.
[12] H. Alwazani, A. Kammoun, A. Chaaban, M. Debbah, and M.-S. Alouini,
[34] K. Xu, W. Hu, J. Leskovec, and S. Jegelka, “How powerful are graph
“Intelligent reflecting surface-assisted multi-user MISO communication:
neural networks?” in Proc. Int. Conf. Learning Representation, May
Channel estimation and beamforming design,” IEEE Open J. Commun.
2019.
Soc., vol. 1, pp. 661–680, May 2020.
[35] M. Fey and J. E. Lenssen, “Fast graph representation learning with
[13] D. Mishra and H. Johansson, “Channel estimation and low-complexity pytorch geometric,” in Proc. Int. Conf. Learning Representation Work-
beamforming design for passive intelligent surface assisted MISO wire- shops, May 2019.
less energy transfer,” in Proc. IEEE Int. Acoustics, Speech and Signal [36] E. Björnson and L. Sanguinetti, “Rayleigh fading modeling and channel
Process. (ICASSP), Apr. 2019, pp. 4659–4663. hardening for reconfigurable intelligent surfaces,” IEEE Wireless Com-
[14] L. Wei, C. Huang, G. C. Alexandropoulos, C. Yuen, Z. Zhang, and mun. Lett., vol. 10, no. 4, pp. 830–834, April 2021.
M. Debbah, “Channel estimation for RIS-empowered multi-user MISO [37] M. Abadi et al., “TensorFlow: Large-scale machine learning on
wireless communications,” IEEE Trans. Commun., Early Access, 2021. heterogeneous systems,” 2015, software available from tensorflow.org.
[15] E. Björnson, Ö. Özdogan, and E. G. Larsson, “Intelligent reflecting [Online]. Available: https://www.tensorflow.org/
surface vs. decode-and-forward: How large surfaces are needed to beat [38] D. P. Kingma and J. Ba, “Adam: A method for stochastic optimization,”
relaying?” IEEE Wireless Commun. Lett., Feb. 2019. in Proc. Int. Conf. Learn. Represent., May 2015. [Online]. Available:
[16] B. Zheng and R. Zhang, “Intelligent reflecting surface-enhanced OFDM: https://arxiv.org/abs/1412.6980
Channel estimation and reflection optimization,” IEEE Wireless Com- [39] Q. Shi, M. Razaviyayn, Z.-Q. Luo, and C. He, “An iteratively weighted
mun. Lett., vol. 9, no. 4, pp. 518–522, Apr. 2020. mmse approach to distributed sum-utility maximization for a MIMO
[17] J. Chen, Y.-C. Liang, H. V. Cheng, and W. Yu, “Channel estimation interfering broadcast channel,” IEEE Trans. Signal Process., vol. 59,
for reconfigurable intelligent surface aided multi-user mmwave MIMO no. 9, pp. 4331–4340, Sept. 2011.
systems,” to appear in IEEE Trans. Wireless Commun., 2021. [Online].
Available: https://arxiv.org/abs/1912.03619
[18] Z. Wang, L. Liu, and S. Cui, “Channel estimation for intelligent reflect-
ing surface assisted multiuser communications: Framework, algorithms,
and analysis,” IEEE Trans. Wireless Commun., vol. 19, no. 10, pp. 6607–
6620, Oct. 2020.
[19] A. M. Elbir and K. V. Mishra, “A survey of deep learning
architectures for intelligent reflecting surfaces,” 2020. [Online].
Available: https://arxiv.org/abs/2009.02540
[20] S. Liu, Z. Gao, J. Zhang, M. Di Renzo, and M.-S. Alouini, “Deep
denoising neural network assisted compressive channel estimation for
mmwave intelligent reflecting surfaces,” IEEE Trans. Veh. Technol.,
vol. 69, no. 8, pp. 9223–9228, Aug. 2020.
[21] A. M. Elbir, A. Papazafeiropoulos, P. Kourtessis, and S. Chatzinotas,
“Deep channel learning for large intelligent surfaces aided mm-wave
massive MIMO systems,” IEEE Wireless Commun. Lett., vol. 9, no. 9,
pp. 1447–1451, Sep. 2020.
[22] J. Gao, C. Zhong, X. Chen, H. Lin, and Z. Zhang, “Unsupervised
learning for passive beamforming,” IEEE Wireless Commun. Lett.,
vol. 24, no. 5, pp. 1052–1056, May 2020.
[23] K. Feng, Q. Wang, X. Li, and C.-K. Wen, “Deep reinforcement learning
based intelligent reflecting surface optimization for MISO communica-
tion systems,” IEEE Wireless Commun. Lett., vol. 9, no. 5, pp. 745–749,
May 2020.
[24] C. Huang, R. Mo, and C. Yuen, “Reconfigurable intelligent surface as-
sisted multiuser MISO systems exploiting deep reinforcement learning,”
IEEE J. Sel. Areas Commun., vol. 38, no. 8, pp. 1839–1850, Aug. 2020.
[25] A. Alkhateeb, S. Alex, P. Varkey, Y. Li, Q. Qu, and D. Tujkovic, “Deep
learning coordinated beamforming for highly-mobile millimeter wave
systems,” IEEE Access, vol. 6, pp. 37 328–37 348, June 2018.
[26] W. Cui, K. Shen, and W. Yu, “Spatial deep learning for wireless
scheduling,” IEEE J. Sel. Areas Commun., vol. 37, no. 6, pp. 1248–
1261, June 2019.
[27] C. Huang, G. C. Alexandropoulos, C. Yuen, and M. Debbah, “Indoor
signal focusing with deep learning designed reconfigurable intelligent
surfaces,” in Proc. IEEE Int. Workshop Signal Process. Adv. Wireless
Commun. (SPAWC), July 2019.
[28] M. Eisen and A. Ribeiro, “Optimal wireless resource allocation with
random edge graph neural networks,” IEEE Trans. Signal Process.,
vol. 68, pp. 2977–2991, Apr. 2020.
